Who's Counting: Non-Transitivity in Baseball, Medicine, Gambling and Politics
A mathematician makes sense of statistics.
Dec. 5, 2010 -- Statistical correlation and transitivity are perhaps off-putting words to some.
They're not hard to understand, however. Two quantities -- say, people's heights and weights -- are positively correlated if whenever one of them goes up or down, the other tends to do so as well. Two quantities -- say, people's longevity and smoking rates -- are negatively correlated if whenever one of them goes up or down, the other tends to do the opposite.
And transitivity? A relation is transitive if whenever it holds between X and Y and between Y and Z, it also holds between X and Z. Most people assume that correlation is transitive. That is, they think that if a quantity X correlates positively with another quantity Y, and Y correlates positively with a third quantity Z, then X correlates positively with Z. But this can be a mistake. In many situations X may correlate negatively with Z.
A 2001 article in The American Statistician provides a good counterexample from baseball. Examining the batting records of New York Yankees players with more than 300 at-bats in the previous year, the authors (Langford, Schwertman, and Owens) found that the number of triples hit by a player correlated positively with the number of base hits he had, which in turn correlated positively with the number of home runs he hit. Yet the number of triples a player hit correlated negatively with the number of home runs he hit.
Stated differently, players who got a lot of triples generally got a lot of hits of all kinds, and players who got a lot of hits also tended to hit a lot of home runs.
The reason triples and home runs were nevertheless negatively correlated is that players who hit a lot of triples were usually lithe and fast, traits that do not lend themselves to home run hitting, and players who homered a lot were generally big and slow, traits that do not lend themselves to hitting a lot of triples.
In general, even if a quantity X correlates positively with another quantity Y, and Y correlates positively with a third quantity Z, we can't conclude that X correlates positively with Z. Transitivity may not hold.
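For readers who want to check this for themselves, here is a minimal sketch in Python. The numbers are made up for illustration and are not taken from the study cited above. The helper name pearson is likewise just a label chosen here. Y is built as the sum of X and Z, so it rises with each of them, while X and Z are arranged to move in roughly opposite directions.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / sqrt(var_a * var_b)

# Made-up illustrative values (an assumption, not real batting data).
x = [1, 2, 3, 4, 5, 6]
z = [4, 6, 2, 5, 1, 3]                 # tends to be low when x is high
y = [xi + zi for xi, zi in zip(x, z)]  # Y depends on both X and Z

r_xy = pearson(x, y)
r_yz = pearson(y, z)
r_xz = pearson(x, z)

print(f"corr(X, Y) = {r_xy:+.2f}")
print(f"corr(Y, Z) = {r_yz:+.2f}")
print(f"corr(X, Z) = {r_xz:+.2f}")
```

Run as written, the first two coefficients should come out positive and the third negative, mirroring the pattern of triples, hits and home runs: Y plays the role of total hits, which both kinds of player pile up, while X and Z play the roles of the two traits that rarely go together.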