DFS Roundtable: Most misused stats

By DFS EXPERTS

August 27, 2015, 1:53 PM

When creating lineups each day for DFS baseball, there are a multitude of stats to consider for each player. Which deserve more weight than others? Which seemingly crucial stat may not be as important as you think? The answers could be the difference in whether you win that night.

Derek Carty, Renee Miller and Todd Zola got together to discuss this and more in our latest MLB roundtable discussion.

Derek Carty: Now you've gone and done it. Get ready for some nerd talk.

People often seem to treat wOBA as gospel, as a perfectly predictive stat that is some sort of godsend. It's not. wOBA was created in the sabermetric community as a descriptive stat to calculate the exact impact that the stats a player has put up have on run scoring. The weights given to each stat are based on on-field run scoring, but for the purposes of DFS, they are completely arbitrary. On DraftKings a home run is worth 4.6 times as much as a single, but in wOBA, it's worth just 2.4 times as much. Steals aren't incorporated into wOBA at all.
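
As a rough illustration of that weighting gap, here is a minimal sketch. Every number in it is an assumption that the discussion does not state: DraftKings' classic hitter scoring (3 points for a single, 10 for a home run, plus 2 each for the run and RBI that even a solo home run brings) and approximate wOBA linear weights of 0.89 for a single and 2.10 for a home run, which shift a little from season to season. Under those assumptions the two ratios land close to the ones quoted above.

```python
# Assumed DraftKings classic MLB hitter scoring (not taken from the article).
DK_SINGLE = 3.0
DK_HOME_RUN = 10.0
DK_RUN = 2.0
DK_RBI = 2.0

# Assumed approximate wOBA linear weights; the exact values vary by season.
WOBA_SINGLE = 0.89
WOBA_HOME_RUN = 2.10

# A solo home run also credits the batter with one run and one RBI.
dk_solo_home_run = DK_HOME_RUN + DK_RUN + DK_RBI

dk_ratio = dk_solo_home_run / DK_SINGLE
woba_ratio = WOBA_HOME_RUN / WOBA_SINGLE

print(f"DraftKings HR-to-single ratio: {dk_ratio:.2f}")
print(f"wOBA HR-to-single ratio:       {woba_ratio:.2f}")
```

The gap between the two ratios is the point: a stat weighted for real-world run scoring undervalues power relative to how a DFS site pays for it.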

wOBA gives the illusion of being predictive because it's fairly stable from year to year -- much more so than something like BABIP. However, this is misleading because it incorporates more stable stats like strikeout rate and walk rate into its formula, but it still also includes BABIP along with all of its variability. It's just masked by the more stable stats. As a result, it can be misleading if people don't understand what they're looking at or how to use it, or if they take it as gospel. First and foremost wOBA is descriptive, and trying to use it predictively is at best simplistic and at worst like trying to fit a square peg into a round hole.

Renee Miller: No stat is perfectly predictive, obviously, but wOBA and OPS are useful even if the weighting isn't a linear translation from the stat to the DFS score -- more is more, in the end.

Carty: I wrote some pieces a while back for Baseball Prospectus that looked at the stabilization points of various stats, but as I cautioned then, it's important to remember that reaching these points doesn't mean we can throw out everything that's come before them. They're one piece of the puzzle and ideally should be used to determine how much confidence we have in a given sample size of a given player for a particular stat and should be combined with past data that extends past the "stabilization point" using a proper weighting scheme to tie it all together.

Todd Zola: wOBA (and RC+) are decent eyeball tests to ballpark possible good matchups. I don't think people realize wOBA is non-park corrected, so the same wOBA from a Padre is more impressive than from a Rockie. RC+ is corrected. It's all about context and how they're applied.

I've used the in-season skills stabilization in several ESPN Insider pieces. One year, I infamously predicted doom and gloom for Jay Bruce, which I caught all sorts of reader flak for, including from a couple of Twitter trolls. When Bruce was terrible all season, oddly, no one said anything.

Carty: And this is one of the other (many) reasons why I'm not huge on wOBA for predictive purposes. If we're accepting that various stats "stabilize" at different intervals, why are we okay with throwing them all together into one catch-all stat like wOBA (even if it was weighted for DraftKings scoring instead of arbitrarily) and calling it a day? If certain stats take a long time to stabilize (and certain ones do), then why are we accepting them at face value and rolling them into the wOBA calculation along with the ones that take less time to stabilize -- implicitly treating them as if they stabilize at the same rate and can be trusted at the same confidence level? I get that wOBA is meant to be more of a shortcut and meant to be accessible to and usable by the masses, and maybe I'm just the nerd pushing his glasses up the bridge of his nose and throwing off the curve for the rest of the class, but I've never really been one for shortcuts.

Miller: Just wait until you have kids, Derek. You'll change your tune about shortcuts! Kidding aside, for The Bat, considering and weighting individual stats to arrive at the most accurate projection possible is what makes your system so good. It is a brilliant model, if anyone isn't familiar with it. In that case, doing it right, with no shortcuts, is absolutely the way to go. I would need a lot of data to convince me that my considering and differentially weighting individual stats for every player alongside the day's contextual factors for every article I write allows me to ascertain who has the combination of talent and matchup I want better than my using a catch-all stat like wOBA or OPS.

Zola: Looks like we're stacking against wOBA. Here are a few other deficiencies.

There's no component for steals, so anyone with a significant portion of their DFS potential coming from the stolen base isn't judged properly. Since wOBA is a rate stat, its impact is different from different parts of the order. Finally, the underlying principle of wOBA is it does a decent job of predicting run production. An isolated player can have a really good wOBA but be surrounded by lesser hitters, while another player can have the same wOBA but be in a better lineup. Excluding correction for parks, the latter hitter has a better chance at DFS points from runs and RBI.

Miller: One statistic that I personally never use in my DFS research is batter versus pitcher (BvP). I believe that certain batters can do well (or poorly) against certain pitchers, like I believe that hot streaks exist. But I do think the statistic is frequently misused. The most common reason given by those who eschew BvP is its small sample size. Very few batters have faced a pitcher enough to accumulate meaningful, predictive data. Going beyond that, however, my problem with BvP is that it is often presented without any context. If you tell me that left-handed hitter Smith is 6-for-13 with two home runs off right-handed pitcher Jones as evidence that I should roster Smith against Jones tonight, I'm left with a lot of questions. That sounds good in a vacuum, but what is Smith's career or season OPS/wOBA/ISO versus right-handed pitchers? How has Jones typically performed vs. other left-handed batters? Does the BvP trend buck the larger trend? If so, and the sample size is a heck of a lot bigger than 13 at bats, maybe it's worth considering as a tie-breaker between similarly situated, similarly priced players. If BvP agrees with the overall trend -- Jones is a particularly weak righty and gets knocked around by pretty much all lefty bats -- why use the tiny sample over the much larger sample?

Carty: This brings up something that bothers me quite a bit. People will often quote lefty/righty (i.e. platoon) splits for players, but they'll do it without any understanding of the statistical significance of these stats. It's firmly established that, on the league level, lefties hit righties better than they hit lefties (and vice-versa), but people extend this assumption onto the player level and cite individual player platoon splits -- which can be a big no-no. On the player level, platoon splits can be incredibly noisy, and this is true for no players more so than for right-handed hitters, which make up the biggest chunk of the player population.

That is to say, there is so much variance in individual player platoon splits that, unless we have a massive sample size to deal with (or at least utilize proper statistical techniques, like regressing the splits, and even that is tricky to do properly), they simply can't be trusted on their face. Rather than using the actual platoon splits of a player who's been in the league for three or five or possibly more years, we would be more accurate simply assuming that the player has a league average platoon split -- that, say, right-handed batter X hits left-handed pitching as well as any random right-handed batter does rather than assuming he hits them to whatever crazy extent his career platoon splits say he does.

Miller: There's a lot of interesting stuff here. At the end of the day what we're trying to do is narrow the field of options down to players in the best positions to score for us. Derek, is what you're saying about platoon splits a case of just sliding the scale for great hitters? We know that not all righty batters hit lefty pitching equally well (your last sentence). So if we assume that all right-handed batters hit all left-handed pitchers on average maybe 40 percent better than they hit right-handed pitchers regardless of their baseline skill (could be .100 vRHP vs. .140 vLHP or .300 vRHP vs. .420 vLHP), then a crazy high average or wOBA or OPS doesn't reflect a better platoon split as much as it reflects a superior hitter overall? That certainly makes sense, but it also highlights an opportunity to use a (probably) expensive player when he's at his best...which is what we want. The fact that not all pitchers are equivalent in terms of splits factors in too as we search for the players that stand out from the crowd.

Carty: So theoretically that would make sense if all hitters were, say, 40 percent better against opposite-handed pitchers. +40 percent of a baseline .400 wOBA (.160 points) is bigger than +40 percent of a baseline .250 wOBA (.100 points). It's not really multiplicative like that, though. It's a little more complicated than we should probably get into in this forum, but it may be easier to think of it as additive. All right-handers are a static .020 points of wOBA better against left-handed pitching, rather than 40 percent better. Otherwise, by this same logic, the implication would be that elite hitters are relatively worse than weak hitters are in situations without the platoon advantage, which isn't the case. It's actually quite the opposite. Elite hitters, in general, tend to have smaller true platoon splits, and weaker hitters tend to have bigger ones, which is why they're often cast into platoon roles because they aren't good enough to play full-time. Mike Trout is better against lefties, but he's still really good against righties. The same isn't necessarily true of Jonny Gomes.
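
The difference between the two ways of thinking about the platoon edge can be sketched in a few lines. This is only an illustration using the figures from the exchange above (the invented 40 percent boost, the static .020 points, and baselines of .400 and .250 wOBA); the function names are hypothetical and this is not how any projection system is actually built.

```python
def multiplicative_boost(base_woba, pct=0.40):
    """Platoon edge as a percentage of the hitter's baseline wOBA."""
    return base_woba * pct


def additive_boost(base_woba, points=0.020):
    """Platoon edge as the same fixed number of wOBA points for everyone."""
    return points


for base in (0.400, 0.250):
    mult = multiplicative_boost(base)
    add = additive_boost(base)
    print(f"baseline {base:.3f}: multiplicative +{mult:.3f}, additive +{add:.3f}")
```

Under the multiplicative view the elite hitter's edge grows with his baseline, while under the additive view both hitters gain the same number of points, which is closer to the simplification described above.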

Miller: Yes, Derek, that reply makes total sense. I was reacting to this: "we would be more accurate simply assuming that the player has a league average platoon split," and I purely invented the 40 percent. It sounded to me like you were saying that the argument against citing platoon splits was that all players had them, e.g. it doesn't matter how well Josh Donaldson hits lefties (he has a crazy high wOBA) because all right-handed batters hit left-handed pitchers better. I think in your reply, you actually make a good case for citing the splits: it's a great place to find value in the Jonny Gomeses and Scott Van Slykes (and Mr. Tuttle's favorite Ryan Raburn). I'm still confused how we are more accurate in assessing a player's value on any given night if we assume he has a league average split rather than looking at his own true split in the context of the pitcher, park, etc. Last thought ... the split only really matters in terms of whether you're getting a discount because of it. At the end of the day, we're just looking for strong performers in whatever situation they're in (good numbers RHB facing a LHP; I don't care really what his numbers are versus RHP unless there's a very predictable bullpen/substitution situation).

Carty: Right, so the best way is to look at a player's "true split." But that's a very different thing than his actual split. When I say true split, I mean his split once it has been properly treated statistically -- regressed, weighted based on how old the data being used is, etc. This is the best way to do it, but obviously too complex for the average player. For the average player, it's much better to simply assume a league average split rather than use the actual split. The reason is how much variance there is in splits.

Think of it this way. It's common knowledge now, even outside of sabermetric circles, that a pitcher's BABIP has lots of variability. When Jorge de la Rosa posts a .263 BABIP in 2014, we're not assuming that his true talent BABIP is .263 or anywhere near it. We're still going to assume it's closer to league average of around .295, and indeed, in 2015 his BABIP is .289. It's the same concept with platoon splits, but nobody realizes it or thinks about it that way. They're too noisy to trust at face value. Josh Donaldson has one of the most extreme platoon splits in baseball (both actual and true platoon split). This season, there's a .079 point gap between his wOBA vs. RHP and vs. LHP. When accounting for the variance of platoon splits, though, his true platoon split gap is something like .028 points. Ultimately, if we were to create a flow chart of the different ways to approach platoon splits according to their accuracy, it would look something like this: True Split > Career Split (10-15 year veteran) > League Average Split > Career Split > 2015 Split
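
To see how heavily that kind of adjustment leans on the league average, one can back out the weight implied by the Donaldson figures. This is a sketch under a loud assumption: the league average split for right-handed hitters is taken to be the .020 points used as a round number earlier in the discussion, and the regression is treated as a simple weighted average, which is a simplification of whatever a full projection system does.

```python
OBSERVED_SPLIT = 0.079   # Donaldson's 2015 wOBA gap, from the discussion
TRUE_SPLIT = 0.028       # his estimated true gap, from the discussion
LEAGUE_SPLIT = 0.020     # ASSUMPTION: illustrative league average gap


def implied_weight(observed, true, league):
    """Weight on the observed split if true = w*observed + (1-w)*league."""
    return (true - league) / (observed - league)


weight_on_observed = implied_weight(OBSERVED_SPLIT, TRUE_SPLIT, LEAGUE_SPLIT)
print(f"weight on the observed split: {weight_on_observed:.1%}")
print(f"weight on the league average: {1 - weight_on_observed:.1%}")
```

With these inputs most of the weight falls on the league average rather than on the player's own numbers, which is why assuming a league average split is a reasonable fallback when the full treatment isn't available.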

Zola: I'm with Derek on the misuse of small sample size split data -- most often applied to a younger player or when an established player is displaying a split different than their career norm. I use the guidelines discussed in The Book: Playing the Percentages in Baseball (Tom M. Tango, Mitchel G. Lichtman and Andrew E. Dolphin) that say a right-handed hitter needs about 2,000 plate appearances versus a lefty while a left-handed hitter needs 1,000 left-on-left plate appearances before they own their splits and can be considered actionable. A full-time player sees a southpaw about 200 times a season, which means it takes roughly 10 years for a right-hander's splits to be considered real and about five years for lefty-swingers. As Derek discusses, the career data can be regressed to global expectation, which is something I build into my DFS projection system.
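
A minimal sketch of what regressing a split can look like follows. It assumes the plate appearance thresholds above act as the point where a hitter's own split and the league average split get equal weight, which is one common reading and not necessarily how either projection system here works. The sample hitter and the .020 league split are hypothetical illustration values.

```python
THRESHOLD_PA = {"R": 2000, "L": 1000}   # PA vs. LHP, from the guidelines above
PA_VS_LHP_PER_SEASON = 200              # full-time player, from the discussion


def seasons_to_threshold(bats):
    """Full seasons needed before a hitter 'owns' his split."""
    return THRESHOLD_PA[bats] / PA_VS_LHP_PER_SEASON


def regress_split(observed, league, pa, bats):
    """Shrink an observed split toward league average.

    ASSUMPTION: the weight on the observed split is pa / (pa + threshold),
    so a hitter exactly at the threshold is weighted 50/50.
    """
    k = THRESHOLD_PA[bats]
    weight = pa / (pa + k)
    return weight * observed + (1 - weight) * league


print(seasons_to_threshold("R"), seasons_to_threshold("L"))

# Hypothetical right-handed hitter: .060 observed gap over 600 PA vs. LHP,
# regressed toward an assumed .020 league average gap.
print(round(regress_split(0.060, 0.020, 600, "R"), 3))
```

The fewer plate appearances a hitter has against lefties, the closer his regressed split sits to the league figure, which matches the ordering of approaches given earlier in the discussion.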

Carty: It's also worth noting that, for these reasons of variance, "reverse splits" are almost never a real thing. That is to say, a right-handed hitter who goes against expectation and hits right-handed pitchers better than he hits left-handed pitchers should not be expected to continue to do so. What we're seeing is variance, the same as if a pitcher posted a .195 BABIP.

Zola: To address Renee's comment about Ryan Raburn, or perhaps someone like Scott Van Slyke -- right-handed platoon hitters with a reputation for raking against lefties -- their splits on the surface aren't reliable, but when they're regressed, the resultant number is a bit better than average. So now it comes down to the pricing algorithm and how many reverse platoon splits they have to bring their production down. To be honest, the reason Raburn and Van Slyke are such great DFS choices isn't their talent versus lefties but rather that their price is calculated based on outcomes relative to hitters with more chances and not based on their rate of performance.

Here's an example where I'm comfortable using split data. Switch-hitter Neil Walker has a career .350 wOBA left-on-right and a .292 mark right-on-left. League average in both instances is about .320 so he's +/- 30 points. For his career, he's faced southpaws about 660 times, not the requisite 1,000 but well over half. I'm comfortable saying not to use Walker against lefties but consider him against righties. Determining bang-for-the-buck for a switch-hitter can be tricky, so it will come down to price, but I am fine saying to fade Walker against a left-hander.

My final point with respect to splits is sometimes they're blindly used to jump a less skilled player with the platoon edge over a more skilled hitter without it. The most important factor is the hitter-pitcher matchup. Platoon boosts are nice -- and sometimes indeed make the lesser hitter more worthy. I just think that often right-on-right and sometimes left-on-left opportunities are filtered out too quickly. Not to mention, once the starting pitcher is taken out, hitters often lose the advantage.