Statistical correlation and transitivity are perhaps off-putting words to some.

They're not hard to understand, however. Two quantities -- say, people's heights and weights -- are positively correlated if whenever one of them goes up or down, the other tends to do so as well. Two quantities -- say, people's longevity and smoking rates -- are negatively correlated if whenever one of them goes up or down, the other tends to do the opposite.

And transitivity? A relation is transitive if whenever it holds between X and Y and between Y and Z, it also holds between X and Z. Most people assume that correlation is transitive. That is, they think that if a quantity X correlates positively with another quantity Y, and Y correlates positively with a third quantity Z, then X correlates positively with Z. But this can be a mistake. In many situations X may correlate negatively with Z.

A 2001 article in The American Statistician provides a good counterexample from baseball. Examining the batting records of New York Yankees players with more than 300 at-bats in the previous year, the authors (Langford, Schwertman, and Owens) found that the number of triples hit by a player correlated positively with the number of base hits he had, which in turn correlated positively with the number of home runs he hit. Yet the number of triples a player hit correlated negatively with the number of home runs he hit.

Stated differently, players who got a lot of triples generally got a lot of hits of all kinds, and players who got a lot of hits also tended to hit a lot of home runs.

The reason triples and home runs were nevertheless negatively correlated is that players who hit a lot of triples were usually lithe and fast, traits that do not lend themselves to home run hitting, and players who homered a lot were generally big and slow, traits that do not lend themselves to hitting a lot of triples.
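The same pattern is easy to reproduce with a few made-up numbers. The sketch below is only an illustration: the six "players" and their totals are invented, not taken from the Yankees data, and it assumes for simplicity that every player has the same number of other hits. The helper `pearson` is a hypothetical name for a plain implementation of the usual correlation coefficient.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / sqrt(var_a * var_b)

# Invented season totals for six imaginary players (not real data).
triples = [1, 2, 3, 4, 5, 6]
home_runs = [4, 6, 5, 1, 3, 2]

# Assumption for illustration: each player has the same number of other hits,
# so total hits vary only through triples and home runs.
other_hits = 100
hits = [t + h + other_hits for t, h in zip(triples, home_runs)]

r_triples_hits = pearson(triples, hits)
r_hits_homers = pearson(hits, home_runs)
r_triples_homers = pearson(triples, home_runs)

print(f"triples vs. hits:      {r_triples_hits:+.2f}")
print(f"hits vs. home runs:    {r_hits_homers:+.2f}")
print(f"triples vs. home runs: {r_triples_homers:+.2f}")
```

With these invented totals the first two correlations come out positive (about +0.41 each) and the third negative (about -0.66), mirroring the baseball finding in miniature.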

In general, even if a quantity X correlates positively with another quantity Y, and Y correlates positively with a third quantity Z, we can't conclude that X correlates positively with Z. Transitivity may not hold.
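For readers who want to know how strong the two correlations must be before the third is forced to be positive, there is a standard bound, stated here in notation that is not in the original column: r_XY, r_YZ, and r_XZ stand for the three correlation coefficients.

```latex
\[
r_{XY}\,r_{YZ} - \sqrt{\bigl(1 - r_{XY}^{2}\bigr)\bigl(1 - r_{YZ}^{2}\bigr)}
\;\le\; r_{XZ} \;\le\;
r_{XY}\,r_{YZ} + \sqrt{\bigl(1 - r_{XY}^{2}\bigr)\bigl(1 - r_{YZ}^{2}\bigr)}
\]
% When r_XY and r_YZ are both positive, the lower bound is above zero
% exactly when r_XY^2 + r_YZ^2 > 1.
```

If both correlations are 0.8, the lower bound is 0.64 - 0.36 = 0.28, so X and Z must correlate positively. If both are 0.6, the lower bound is 0.36 - 0.64 = -0.28, which leaves room for a negative correlation between X and Z.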

Still, we often uncritically assume that correlation is transitive, particularly in medicine. If, for example, generally good health is positively correlated with personal income, which in turn is positively correlated with a certain health practice, say the taking of certain expensive vitamin and mineral supplements, we might conclude there is a positive correlation between good health and the ingestion of these supplements. Again, not necessarily so.

Furthermore, the more links there are between two quantities, the more likely transitivity is to fail. Once again, the problem is that many people unconsciously reason that if U and V are positively correlated, and so are V and W, W and X, and X and Y, then U and Y must be as well. It might be helpful to think of correlations as akin to distances. If U and V are less than a mile apart, and V and W are also within a mile of each other, it doesn't follow that U and W are less than a mile apart. They could be nearly two miles apart.
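The distance analogy can be made concrete. One way to picture it, sketched below under assumptions of my own choosing, is to treat each quantity as an arrow and each correlation as the cosine of the angle between two arrows. Small angles add up along a chain, just as short distances do. The 40-degree step, the three-observation data sets, and the names `quantity` and `pearson` are all hypothetical choices made for illustration.

```python
from math import cos, sin, radians, sqrt

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / sqrt(var_a * var_b)

# Two patterns over three observations: each sums to zero, has unit length,
# and is perpendicular to the other.
e1 = [1 / sqrt(2), -1 / sqrt(2), 0.0]
e2 = [1 / sqrt(6), 1 / sqrt(6), -2 / sqrt(6)]

STEP_DEGREES = 40  # assumed angle between neighbouring quantities

def quantity(k):
    """Data set rotated k steps away from the first one."""
    angle = radians(k * STEP_DEGREES)
    return [cos(angle) * p + sin(angle) * q for p, q in zip(e1, e2)]

names = ["U", "V", "W", "X", "Y"]
chain = {name: quantity(k) for k, name in enumerate(names)}

# Correlation of each link in the chain: U-V, V-W, W-X, X-Y.
neighbour_r = [pearson(chain[a], chain[b]) for a, b in zip(names, names[1:])]

# Correlation of U with each later quantity.
from_u = {name: pearson(chain["U"], chain[name]) for name in names[1:]}

for (a, b), r in zip(zip(names, names[1:]), neighbour_r):
    print(f"{a} vs. {b}: {r:+.2f}")
for name, r in from_u.items():
    print(f"U vs. {name}: {r:+.2f}")
```

In this construction every link in the chain has the same positive correlation, yet the correlation between U and each later quantity shrinks with every step and turns negative by the time the chain reaches X and Y.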
