Monday, November 29, 2010
2011 CAIRO Projections v0.1
Here is the first set of CAIRO projections for 2011. CAIRO is based on Tangotiger’s Marcel, which uses the basic principles that most projection systems should use. Those basic principles are:
1) Use a weighted average of a player’s recent performance as the primary basis of their projection
2) Account for the fact that players tend to regress towards some mean going forward
3) Account for a player’s age when projecting them. Older players usually decline, while younger players usually improve.
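To make those three principles concrete, here is a rough Python sketch of a Marcel-style projection for a single rate stat (say, home runs per plate appearance). The function name, the 5/4/3 season weights, the 1200 plate appearances of regression and the age factors are my shorthand for how Marcel is usually described for hitters, so treat them as assumptions rather than an exact copy of the real thing.

```python
def marcel_style_rate(seasons, league_rate, age,
                      weights=(5, 4, 3), regression_pa=1200, peak_age=29):
    """Project a per-plate-appearance rate for one player.

    seasons: list of (events, plate_appearances), most recent season first.
    All default parameter values are illustrative assumptions.
    """
    # Principle 1: weighted average of recent performance.
    weighted_events = sum(w * ev for w, (ev, _) in zip(weights, seasons))
    weighted_pa = sum(w * pa for w, (_, pa) in zip(weights, seasons))

    # Principle 2: regress towards a mean by mixing in a fixed number of
    # plate appearances at the league-average rate.
    regressed = (weighted_events + regression_pa * league_rate) / (
        weighted_pa + regression_pa
    )

    # Principle 3: adjust for age. Younger players get a boost,
    # older players get a penalty.
    if age < peak_age:
        age_factor = 1 + (peak_age - age) * 0.006
    else:
        age_factor = 1 - (age - peak_age) * 0.003
    return regressed * age_factor


# Made-up player: (HR, PA) for the last three seasons, league HR rate of 3%.
recent = [(20, 600), (18, 550), (15, 500)]
print(round(marcel_style_rate(recent, league_rate=0.03, age=29), 4))
```

The less playing time a player has, the more the league-average plate appearances dominate the result, which is the whole point of the regression step.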
Despite its lack of complexity, Marcel’s shown itself to be about as accurate as even the best projection systems in most instances.
So why create another projection system? There are things Marcel doesn’t account for which I feel are important.
1) Minor league performance is ignored by Marcel. So CAIRO uses MLEs (major league equivalencies) as part of its projection.
2) Marcel doesn’t account for park or league. A pitcher who pitched in San Diego for three years is going to have much better raw stats than a pitcher who pitched in Colorado even if he’s not actually a better pitcher. Similarly, it doesn’t account for the fact that AL pitchers have to face a DH while NL pitchers get to sail by against people who shouldn’t even be in the batter’s box. So CAIRO accounts for park and league (and the defensive support that a pitcher gets).
3) Marcel does an overall aging for batters and pitchers, but component stats don’t age that way. Some things improve and some things get worse at many different ages. For example, a player’s walk rate tends to increase through their mid 30s, while triples tend to peak in their early 20s. So CAIRO ages a player’s component stats individually.
4) Marcel regresses everyone towards league average for its regression towards the mean. The truth is it is more accurate to identify a specific mean for a player and regress him towards that. We shouldn’t necessarily regress a 37 year old SS towards league average, we should also account for the fact that he’s 37 AND a SS. So a player’s mean in CAIRO incorporates age and position as well.
5) Marcel uses three years of data, but research shows four years of data (properly weighted) tends to be better. So CAIRO includes 2007-2010 data in the 2011 projections.
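As a rough illustration of points 3 through 5, here is a Python sketch of projecting one component stat from four seasons, regressing it towards a player-specific mean and aging that component on its own. This is not CAIRO's actual code: the weights, the regression amount, the age multiplier and the two target rates are all made-up values picked to show the mechanics.

```python
# Hypothetical weights for the four most recent seasons, most recent first.
WEIGHTS = (8, 5, 4, 3)


def project_component(seasons, target_rate, regression_pa, age_multiplier=1.0):
    """Project one component rate (e.g. walks per plate appearance).

    seasons: list of (events, plate_appearances), most recent season first.
    target_rate: the mean to regress towards, e.g. the typical rate for
        players of the same age and position (assumed input).
    age_multiplier: aging adjustment for this component only (assumed input).
    """
    events = sum(w * ev for w, (ev, _) in zip(WEIGHTS, seasons))
    pa = sum(w * n for w, (_, n) in zip(WEIGHTS, seasons))
    regressed = (events + regression_pa * target_rate) / (pa + regression_pa)
    return regressed * age_multiplier


# Made-up walk totals and plate appearances for four seasons.
walks = [(60, 600), (55, 600), (50, 600), (45, 600)]

# Same player, two different means: league average vs. an age/position group.
vs_league = project_component(walks, target_rate=0.08, regression_pa=1000)
vs_group = project_component(walks, target_rate=0.10, regression_pa=1000)
print(round(vs_league, 4), round(vs_group, 4))
```

The only thing that differs between the two calls is the mean, so the gap between the two results shows how much the choice of mean matters for a given amount of playing time.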
In addition to all that, I prefer to do my own analysis to better understand stuff rather than just spouting off someone else’s numbers, and I like the fact that I can run my own projections whenever I want and not have to wait for someone else to publish theirs.
Also, if you create your own projection system you can rig it to make your favorite team look better. Then you can pretend they’re going to win more games than they are, and apparently ignore the fact that you know that your entire system is built on a lie that’s not going to hold up.
But I digress.
A few numbers have changed from the projections that I posted for the Yankees because of a couple of minor errors I found. This was mostly on the defense and pitching side of things, but I don’t think it changed anyone’s value by more than a handful of runs.
So here they are. I should eventually add some other stuff like platoon splits, depth charts and projected standings. I’m sure there are still some bugs in here, so if anyone sees anything wonky or has any questions feel free to let me know.
Update: version 0.6 now available.