Why Medical Studies Are Often Wrong


Aug. 7, 2005 -- How many times have you heard people exclaim something like, "First they tell us this is good or bad for us, and then they tell us just the opposite"?

In case you need more confirmation of the "iffy-ness" of many health studies, Dr. John Ioannidis, a researcher at the University of Ioannina in Greece, writing in the Journal of the American Medical Association, recently analyzed 45 well-publicized studies from major journals appearing between 1990 and 2003. His conclusion: the results of approximately one-third of these studies were flatly contradicted or significantly weakened by later work.

There's the well-known story of hormone replacement therapy, which was supposed to protect against heart disease and other maladies, but apparently does not. A good part of the apparent effect may have been the result of attributing the well-being of upper middle class health-conscious women to the hormones.

Another bit of health folklore that "everybody knows" but that has turned out to be unfounded is vitamin E's protective effect against cardiac problems. Not so, says a recent large study.

And how about red wine, tea, fruits and vegetables? Surely the anti-oxidant effect of these wondrous nutrients can't be doubted. Even here, however, the effect appears to be more modest than pinot noir lovers, among others, had thought.

And certainly many lung patients who inhale nitric oxide and swear by its efficacy will be surprised to learn that a larger study does not show any beneficial effect.

A common procedure to remove fat from neck arteries, prescription drugs used by millions of people, the herb echinacea … The examples extend beyond those in the JAMA article and go on and on, but the general point is that a single health study by itself cannot be taken as indubitable. The totality of the available evidence, appropriately weighted, is what counts, and this balanced appraisal is difficult to fit into a news article, much less into a catchy headline.

One obvious problem is that studies vary in size and quality. Some are well-designed, others are not, yet most media reports give all of them the same status -- the medical variant of "astronomers say one thing, astrologers another, so let's hear from both." Margins of error, low correlations, or very large ones that mask confounding variables seldom make it into the lede of news stories, whereas "X will cure you" or "Y will kill you" always seem to.
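To see how much study size alone matters, here is a minimal sketch of the usual textbook approximation for the 95 percent margin of error of a reported proportion. The function name, the 50 percent response rate and the sample sizes are illustrative assumptions, not figures from any study mentioned here.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a proportion p observed in n subjects.

    Uses the normal approximation z * sqrt(p * (1 - p) / n); the z value of
    1.96 corresponds to a 95% confidence level.
    """
    return z * math.sqrt(p * (1 - p) / n)

# Hypothetical example: the same 50% response rate seen in studies of different sizes.
for n in (50, 500, 5000):
    print(n, round(margin_of_error(0.5, n), 3))
```

The margin shrinks only with the square root of the sample size, so a study 100 times larger is about 10 times more precise, and a small study can easily be off by more than 10 percentage points.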

Another issue is that many health studies rely on self-reporting, which is notoriously unreliable. The average number of sex partners reported by heterosexual males, for example, is almost always considerably larger than the average number reported by heterosexual females. Yet every heterosexual pairing adds one partner to a man's tally and one to a woman's, so in a population with roughly equal numbers of each the two averages should be equal. Certainly if these numbers are so out of whack, it's hard to put too much credence in sex surveys as a whole. Similar bias results if people are asked whether their incessant drinking of green tea has lessened their angina.
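The claim that the two averages should match is simple bookkeeping, and a small simulation makes it concrete. This is only a sketch under stated assumptions: a closed population with equal numbers of men and women, and randomly chosen pairings. The function name and all the numbers are hypothetical.

```python
import random

def mean_partner_counts(n_men, n_women, n_pairs, seed=0):
    """Draw n_pairs distinct (man, woman) pairings at random and return
    the mean number of partners per man and per woman."""
    rng = random.Random(seed)
    all_pairs = [(m, w) for m in range(n_men) for w in range(n_women)]
    pairs = rng.sample(all_pairs, n_pairs)  # distinct pairings, no repeats

    men_counts = [0] * n_men
    women_counts = [0] * n_women
    for m, w in pairs:
        # Each pairing adds exactly one partner to one man and one woman.
        men_counts[m] += 1
        women_counts[w] += 1

    return sum(men_counts) / n_men, sum(women_counts) / n_women

men_mean, women_mean = mean_partner_counts(n_men=100, n_women=100, n_pairs=350)
print(men_mean, women_mean)
```

However the pairings are distributed among individuals, both totals equal the number of pairings, so with equal group sizes the means cannot differ. Medians can differ, but the averages that surveys report cannot, unless someone is misreporting.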

And the evaluation of all studies must contend with wishful thinking: people naturally want to believe in the value of new treatments, sometimes so much that their critical faculties are dulled or extinguished altogether. For an extreme example, consider the studies on the purported effectiveness of prayer.

In the other direction, people often over-react to bad news and fall subject to the "tyranny of the anecdote." For example, TV viewers see parents keening about the unfortunate effect of some vaccine on their child and give little weight to the hundreds of thousands of children who've benefited from the same vaccine.

A distinction from statistics is marginally relevant. We're said to commit a Type I error when we reject a truth and a Type II error when we accept a falsehood. In listening to news reports people often have an inclination to suspend their initial disbelief in order to be cheered and thereby risk making a Type II error. In evaluating medical claims, however, researchers generally have an opposite inclination to suspend their initial belief in order not to be beguiled and thereby risk making a Type I error. There is, of course, no way to always avoid both types of error, and we have different error thresholds in different endeavors.
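The trade-off between the two errors can be sketched with a toy model. Everything in it is an assumption made for illustration: the evidence for a claim is boiled down to a single score, true claims score around 1 and false claims around 0 with the same bell-shaped spread, and a claim is accepted when its score exceeds a chosen threshold. The labels follow this column's wording (rejecting a truth, accepting a falsehood), whereas textbooks usually define the two errors relative to a null hypothesis.

```python
from statistics import NormalDist

def error_rates(threshold, true_mean=1.0, false_mean=0.0, sd=1.0):
    """Return (P(reject a true claim), P(accept a false claim)) when a claim
    is accepted only if its evidence score exceeds the threshold.

    The score distributions are hypothetical normal curves, not real data.
    """
    reject_truth = NormalDist(true_mean, sd).cdf(threshold)
    accept_falsehood = 1 - NormalDist(false_mean, sd).cdf(threshold)
    return reject_truth, accept_falsehood

# A credulous listener uses a low threshold; a skeptical researcher a high one.
for threshold in (0.0, 0.5, 1.0, 1.5):
    reject_truth, accept_falsehood = error_rates(threshold)
    print(threshold, round(reject_truth, 3), round(accept_falsehood, 3))
```

Raising the threshold lowers the chance of accepting a falsehood but raises the chance of rejecting a truth, and lowering it does the reverse. With a fixed amount of evidence, neither setting drives both to zero.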

Moreover, the questions health studies address are often subtly different, so seemingly contradictory or confirmatory results are difficult to compare and evaluate. Also sobering is the realization, acknowledged by the JAMA author Ioannidis, that there's no conclusive proof that the results of later studies will not also be rescinded or modified.

So what should you conclude about, say, a small new study that flavonoids in dark chocolate help lower blood pressure? It's your call, but realize that how credible you find this chocolate study may say more about your psychology than about the biochemistry of chocolate.

As I've written before (although with a different number), it's been conclusively established that 43.58871563% of all statistics are made up on the spot.

-- Professor of mathematics at Temple University, John Allen Paulos is the author of best-selling books, including "Innumeracy" and "A Mathematician Plays the Stock Market." His "Who's Counting?" column on ABCNews.com appears the first weekend of every month.
