Peter Austin wasn't a bit surprised when he found evidence of what appears to be a strong link between specific astrological signs and life-threatening medical problems. Is it possible that our health is really determined by the zodiac, or the sign under which each of us was born?
Well, no, despite the fact that Austin's research seems to prove that a Taurus is more likely to suffer a broken neck, a Pisces more likely to have heart failure, and a Virgo more likely to vomit during pregnancy. I'm a Gemini, so I should be drunk right now, because the study, which encompassed the health records of more than 10 million Canadians, shows that a Gemini is very likely to suffer from alcohol dependency.
That's precisely what Austin expected to find, but he says it's all rubbish.
"Taking this study too seriously can be hazardous to your health," says Austin, a statistician with the Institute for Clinical Evaluative Sciences in Toronto. Austin and three other scientists weren't trying to demonstrate the power of astrology in their study. They were trying to prove just how easy it is to show something is right, when in fact it's dead wrong.
Especially when the evidence comes from something called data mining, a current rage not only in science but in the business and social worlds as well. High-speed computers have made it possible for researchers to crunch enormous numbers, massaging mountains of data that would have been impossible to analyze just a few years ago.
That's been a good thing in some ways, because it has helped researchers spot trends in everything from politics to the stock market to long-range weather patterns. But it's probably also why you get advertisements for stuff you don't want, and why sometimes it rains when it's supposed to be sunny. According to Austin and his colleagues, data mining is fraught with peril.
"Results from data mining should be treated with skepticism," Austin says.
Anyone who looks for patterns long enough, he says, is probably going to find them, even if the patterns aren't real.
For his study, Austin had access to the medical records of more than 10 million adults in Ontario, Canada's most populous province. The records do not show names, but they do show when each person was born, and why each was admitted to the hospital. The birth date made it possible to determine the astrological sign for all 10 million.
The subjects were randomly divided into two groups of slightly more than 5 million each.
Beginning with the first group, the scientists identified, for each astrological sign, two medical conditions "for which residents born under one sign had a significantly higher probability of hospitalization compared to the residents born under the remaining 11 signs," Austin says. That made 24 apparent associations in all.
Out of that emerged a clear picture showing that a Libra has an increased risk of fracturing a pelvis, and a Scorpio is more likely to suffer leukemia, and so on. After all, 5 million Canadians can't be wrong, and Austin is a very reputable statistician.
Which is why he turned to the second group. If the findings in the first group were correct, he should find the same thing in the second sample, since both samples were random. But he didn't. Twenty-two of the "discoveries" evaporated, because the same pattern did not occur.
Only two remained: Leo with a tendency for intestinal bleeding, and Sagittarius with a risk of breaking the upper arm bone. But with so many associations tested at once, two matches are well within what chance alone would produce.
"All 24 disappear," Austin says.
So there is no connection between astrological signs and health, despite the fact that a huge sample indicated there was. The second sample, called the validation sample, blew it away.
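The two-group check can be sketched in a few lines of Python. This is a rough simulation, not Austin's actual analysis: the population size, the number of conditions, the hospitalization rate and the choice of a two-proportion z-test are all assumptions made for illustration, since the article does not say which test the researchers used. Everyone in the simulation has the same risk regardless of sign, so any "link" the first group turns up is pure chance.

```python
import math
import random

SIGNS = 12          # zodiac signs
CONDITIONS = 40     # assumed number of diagnoses screened (illustrative)
PEOPLE = 24_000     # far fewer than the study's 10 million, to keep this fast
BASE_RATE = 0.05    # assumed: same hospitalization chance for every sign


def one_sided_p(hits_a, n_a, hits_b, n_b):
    """Two-proportion z-test: p-value for 'group A has the higher rate'."""
    pooled = (hits_a + hits_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0
    z = (hits_a / n_a - hits_b / n_b) / se
    return 0.5 * math.erfc(z / math.sqrt(2))  # upper tail of the normal curve


def simulate(seed=0):
    rng = random.Random(seed)
    # counts[half][sign][condition] and totals[half][sign]
    counts = [[[0] * CONDITIONS for _ in range(SIGNS)] for _ in range(2)]
    totals = [[0] * SIGNS for _ in range(2)]
    for _ in range(PEOPLE):
        half = rng.randrange(2)      # 0 = first group, 1 = validation group
        sign = rng.randrange(SIGNS)  # sign is assigned independently of health
        totals[half][sign] += 1
        for cond in range(CONDITIONS):
            if rng.random() < BASE_RATE:
                counts[half][sign][cond] += 1

    def p_value(half, sign, cond):
        # Compare one sign against the other 11 signs within the same group.
        hits = counts[half][sign][cond]
        all_hits = sum(counts[half][s][cond] for s in range(SIGNS))
        n = totals[half][sign]
        n_all = sum(totals[half])
        return one_sided_p(hits, n, all_hits - hits, n_all - n)

    # Mine the first group: keep each sign's two most "significant" conditions.
    picks = []
    for sign in range(SIGNS):
        ranked = sorted(range(CONDITIONS), key=lambda c: p_value(0, sign, c))
        picks += [(sign, cond) for cond in ranked[:2]]

    # Count the picks that look significant in the first group, then see how
    # many of those also look significant in the validation group.
    found = sum(p_value(0, s, c) < 0.05 for s, c in picks)
    survived = sum(
        p_value(0, s, c) < 0.05 and p_value(1, s, c) < 0.05 for s, c in picks
    )
    return picks, found, survived


picks, found, survived = simulate()
print(f"{found} of {len(picks)} mined associations look significant; "
      f"{survived} also hold up in the validation group")
```

Changing the seed changes which sign gets blamed for which condition, which is the point: the first group will nearly always offer up something that looks like a finding.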
Austin says this isn't just a problem for scientists. We all engage in data mining from time to time.
"If you look at enough clouds, eventually one of them is going to look like a dog chasing a cat," he says. "It's not really a dog chasing a cat. It's a random, atmospheric pattern. But if you look at enough clouds you're going to see a dog and a cat.
"We tend to impose patterns," he adds. "People do it with their horoscopes. They remember the one that came true, but they don't remember all the times that it didn't. They conveniently forget all the times that there wasn't a pattern."
Austin presented his study during the recent meeting of the American Association for the Advancement of Science in San Francisco, and he had a word of advice for his fellow scientists.
Don't believe your own findings until they are validated by someone else using a different method.
And as a statistician, he warns that asking too many questions of the data increases the risk of error. In seeking 24 reasons for hospitalization he was asking 24 questions. That gave him "a 71 percent chance of mistakenly concluding one association that doesn't exist," he says.
That huge error rate is what allowed him to disregard Leo and Sagittarius, the only two that showed up in the second sample.
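The 71 percent figure can be reproduced with one line of arithmetic, under two assumptions the article does not spell out: that each of the 24 questions was tested at the conventional 5 percent significance level, and that the tests are independent of one another. The function name below is made up for this sketch.

```python
def chance_of_false_alarm(num_tests: int, alpha: float = 0.05) -> float:
    """Chance that at least one of several independent tests is a false positive.

    Assumes each test uses the same significance level `alpha` (0.05 here,
    an assumption) and that no real associations exist.
    """
    # Each test avoids a false positive with probability (1 - alpha);
    # all of them do so with probability (1 - alpha) ** num_tests.
    return 1.0 - (1.0 - alpha) ** num_tests


rate = chance_of_false_alarm(24)
print(f"{rate:.1%}")  # 70.8%, which rounds to the 71 percent Austin cites
```

Ask a single question and the same formula gives 5 percent. The risk climbs with every extra question put to the data.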
Personally, I'm glad. As a Gemini, I don't want to wake up every morning with a hangover.