Projection models aren’t perfect and can’t foresee everything that will happen in 2020.
The baseball world was sent into a frenzy last week when Baseball Prospectus released their annual PECOTA projections. The internet was set ablaze with hot takes about how the projection for the Sox couldn’t possibly be correct and that there was an inherent bias. Surely with the roster turnover this winter, the team is going to win more than PECOTA’s projected 82 games, right?
First, let’s go directly to Baseball Prospectus’ site to understand what PECOTA is:
Stands for Player Empirical Comparison and Optimization Test Algorithm. PECOTA is BP’s proprietary system that projects player performance based on comparison with historical player-seasons. There are three elements to PECOTA:
1) Major-league equivalencies, to allow us to use minor-league stats to project how a player will perform in the majors;
2) Baseline forecasts, which use weighted averages and regression to the mean to produce an estimate of a player’s true talent level;
3) A career-path adjustment, which incorporates information about how comparable players’ stats changed over time.
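To make the second element a little more concrete, here is a rough sketch of what "weighted averages and regression to the mean" can look like in practice. This is not PECOTA's actual formula, which is proprietary. The 5/4/3 season weights, the 1,200 plate appearances of league-average padding, and the function name `baseline_forecast` are all assumptions chosen purely for illustration.

```python
def baseline_forecast(seasons, league_rate, weights=(5, 4, 3), regression_pa=1200):
    """Estimate a player's true-talent rate (e.g. on-base percentage).

    seasons: list of (rate, plate_appearances) tuples, most recent first.
    league_rate: league-average value of the same rate stat.
    weights, regression_pa: illustrative assumptions, not PECOTA's real values.
    """
    weighted_events = 0.0
    weighted_pa = 0.0
    # Weighted average: recent seasons count more than older ones.
    for (rate, pa), weight in zip(seasons, weights):
        weighted_events += weight * rate * pa
        weighted_pa += weight * pa
    # Regression to the mean: blend in a fixed chunk of league-average
    # playing time, which pulls small samples harder toward the league rate.
    return (weighted_events + regression_pa * league_rate) / (weighted_pa + regression_pa)


# Hypothetical hitter: three seasons of 600 plate appearances each.
seasons = [(0.400, 600), (0.350, 600), (0.300, 600)]
print(round(baseline_forecast(seasons, league_rate=0.320), 3))
```

The takeaway is that the forecast lands between the player's own recent numbers and the league average, which is why a breakout season on its own rarely moves a baseline as much as fans expect.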
So, now that we know what PECOTA is, let’s try to have an intelligent conversation about it and what its projections mean. There is one key word that I believe many people aren’t taking into account when discussing this algorithm and its projections: the first word of the second bullet point, “baseline,” is frequently being overlooked, in my opinion. Whether it’s PECOTA, Dan Szymborski’s ZiPS, Steamer, or any other projection system out there, we need to remember that these are baseline projections. They serve as a foundation for the likeliest outcomes that we can see in 2020.
All of the various projection systems out there are constantly being tweaked to improve accuracy because they still have flaws. So for everyone decrying an 82-80 projection from PECOTA or an 85-win projection from ZiPS, remember that as good as these systems are, they still can’t predict the impact a mechanical change will have on a particular player, e.g. Lucas Giolito in 2019. (ZiPS team win projections haven’t been officially released, but creator Dan Szymborski has indicated that, as of now, he believes in about an 85-win projection for the 2020 White Sox.) They can’t predict a player adjusting his swing plane in the offseason to create more loft with his swing. All the systems can do is go off of historically similar players and try to project how current players will perform based on past trends.
White Sox Twittersphere has been littered with people screaming that this team’s young core has considerable upside, enough to exceed these low-to-mid 80s win projections, and that is very true. However, these systems can’t assume that, just because the upside exists, all or a large number of these players will tap into it. As we’ve seen with Yoan Moncada, Lucas Giolito, and Eloy Jimenez, even top prospects oozing with superstar upside don’t always tap into it immediately. So, it’s rather unreasonable to expect an algorithm to make that assumption.
In all honesty, I think the projections to this point serve as an accurate baseline for what the 2020 White Sox could be. Again, there’s that word baseline. I’ve gone on record as saying I believe the White Sox have enough upside talent to win the AL Central in 2020 (there’s another one of my famously vague opinions, I know). But I can’t guarantee that they will reach enough of that upside. I can’t tell you if Dylan Cease will figure out his home run problem, or if Reynaldo Lopez will finally provide above-average production to warrant staying in the rotation. I don’t know if Michael Kopech will come back from Tommy John surgery and emerge as a presence at the top of the rotation. And I don’t know if Luis Robert and Nick Madrigal will immediately acclimate to the highest level of competition in the sport and make meaningful impacts on this team.
There are too many opportunities for variance with this roster to feel confident that they will win the division in 2020. That’s exactly what the various projection systems are forecasting for this team.
The elongated top of the Sox distribution curve illustrates a wider range of outcomes for this team than for the Twins and Indians. So the models are literally telling you that there’s a greater chance they could be wrong about the Sox in 2020.
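To put a rough number on that idea, here is a small sketch that treats a team's win total as a bell curve centered on its projection. The normal-curve shape, the spreads of 9 and 6 wins, and the 8-win margin are all made-up assumptions for illustration; they are not PECOTA's published figures. Only the 82-win projection comes from the system itself.

```python
from statistics import NormalDist


def miss_probability(projected_wins, spread, margin):
    """Chance a team finishes more than `margin` wins away from its projection.

    Assumes win totals follow a normal curve centered on the projection,
    with `spread` as the standard deviation (a hypothetical value).
    """
    dist = NormalDist(mu=projected_wins, sigma=spread)
    within = dist.cdf(projected_wins + margin) - dist.cdf(projected_wins - margin)
    return 1.0 - within


# Same 82-win projection, two hypothetical spreads.
wide = miss_probability(82, spread=9, margin=8)
narrow = miss_probability(82, spread=6, margin=8)
print(f"wide curve: {wide:.2f}, narrow curve: {narrow:.2f}")
```

Under these assumptions, the team with the wider curve is noticeably more likely to land well above or well below its projected total, which is the whole point of a flatter distribution.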
Here’s another thing about projection systems released in February: they can’t predict someone getting hurt during Spring Training or at any other point. Just a few days ago, we received news that Indians ace Mike Clevinger will have surgery to repair a torn meniscus in his knee that will sideline him for 6-8 weeks. Injury recovery is not a perfect science as everyone’s body recovers differently, so what if Clevinger doesn’t see the mound until mid-June? Do we think that would drastically impact the Indians’ performance in 2020? How confident are you that an Indians team without Clevinger for a sizable portion of the season will win its projected 86 games?
The Minnesota Twins, the team PECOTA projects to win the AL Central in 2020, have a roster filled with players who have had varying levels of injury concerns in recent years. Byron Buxton has played at least 100 Major League games in a season one time since initially being called up in 2015. Max Kepler missed almost a month with shoulder issues, and Nelson Cruz is 39 and dealt with oblique issues in 2019. What happens to that team if Jose Berrios misses any time in 2020?
I guess what I’m saying is that these projection models aren’t perfect and can’t foresee everything that will happen in 2020. The Sox could drastically outperform their projections and I wouldn’t be the least bit surprised. They could also underperform them. Critical players in the division could miss significant amounts of time, altering the state of the division, which is something an algorithm can’t predict.
Take the projections as a baseline, because that’s what they are meant to be. This team can still win the division; it just isn’t the likeliest statistical outcome, and that’s alright. 2020 looks like it will be a fun one at 35th/Shields, so quit projecting, sit back, relax, strap it down, and enjoy the ride.