Tuesday, April 3, 2012
The 2012 MLB Projection Blowout
Although Opening Day technically happened last week, I decided to hold off on running my standings projections until I had a better idea of how rosters would look as we approach the start of the full MLB schedule.
The idea behind this series of posts is to try and project how the 2012 MLB season might look given what we think we may know right now. I’ve been doing a version of this since 2005, and you can see the results by looking at the following links.
2005
2006
2007
2008 Pt 1
2008 Pt 2
2009 AL
2009 NL
2010 AL
2010 NL
2011 AL
2011 NL
A quick look at the previous seasons shows that the results are hit and miss. Projections don’t pretend to be omniscient, so they can only tell us so much about how things play out. Hence the following disclaimers.
1) Projection systems are inherently limited in their accuracy, particularly for pitchers. We can get a rough idea of how most players will perform by looking at their past histories and how similar players have performed, and factoring in aging and regression, but abilities/talent can change in ways that can’t be forecasted.
2) Playing time distribution in these simulations will not match actual 2012 playing time. I used the rosters and depth charts available at the absolutely awesome MLB Depth Charts plus whatever I’ve read over the offseason as my guide to set these up as realistically as possible, but it’s a possible source of error. Rosters were set up to have 35-40 or so active players per team, and to get a reasonable amount of playing time from the bench and extra pitchers, to more closely model reality. Basically, no players are set to play more than 90% of the time, starting catchers are restricted to at most about 75% of the games, and I’ve made sure teams get a non-trivial amount of starts from their 6-8 starters. The healthier a team is in 2012, the more likely they will be to exceed these projections, and vice versa.
3) We cannot predict injuries and/or roster changes. These simulations do try to adjust projected playing time based on past health issues, so someone like Erik Bedard is not expected to make 30 starts. I’ve also included random injuries which may lead to some of the outlying results you see, but there’s no way to account for all the fluctuations that will happen with rosters this season.
4) These are NOT my predictions. These are projections based on running a computer simulation hundreds of thousands of times with projection data that is inherently limited. If your favorite team doesn’t project well, don’t blame me, blame the computers and spreadsheets that projected them. I guess you can blame me for the CAIRO results if you want. Otherwise, you can take heart in the 2006 Tigers projecting to win 80, the 2010 Giants projecting to go 81-81 or the 2011 Diamondbacks projecting to win 73 games. These are not meant to tell you how the season is going to play out. I prefer to think of them more as a starting point for discussion, with a range of something like 10 wins in either direction based on how things actually end up playing out. You can look at them and argue about why you think some teams will be better or worse.
5) Since this is all automated, I don’t break ties. When a tie happens, I simply award each tied team an equal share of the division title or wild card, which is why you may see some funny decimal places in the standings that follow.
6) These are the averages of hundreds of thousands of simulated seasons, so the results will tend to regress towards the mean. The final standings will not look like this, because they only play the season once. If the first place team in a division projects to win 85 games, it doesn’t mean 85 wins will win the division, but I’ll get into that in more detail further down in this post.
7) Even if you knew exactly what every player would do, and exactly how much they’d play, you would not get the standings right. A few one-run games or a disparate performance in more crucial situations can cause any team to over/under achieve what their stats say they should have done. So if that’s true, you have to figure that since we have no idea what any individual player will do or how much they’ll play, the margin of error on these is massive.
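To make disclaimer 5 a bit more concrete, here is a minimal Python sketch of how splitting ties can produce fractional title shares. This is not the code behind these simulations; the function name `division_title_shares`, the three-team division and the win totals are all made up for illustration, and an equal split among tied teams is assumed.

```python
from collections import defaultdict


def division_title_shares(final_wins):
    """Split one division title equally among the teams tied for most wins.

    Hypothetical helper: assumes ties are shared equally, not broken.
    """
    best = max(final_wins.values())
    tied = [team for team, wins in final_wins.items() if wins == best]
    return {team: 1.0 / len(tied) for team in tied}


# Three made-up simulated seasons for a made-up three-team division.
seasons = [
    {"A": 95, "B": 90, "C": 81},
    {"A": 88, "B": 88, "C": 80},  # A and B tie, so each gets half a title
    {"A": 84, "B": 91, "C": 79},
]

# Accumulate title shares over every simulated season.
totals = defaultdict(float)
for season in seasons:
    for team, share in division_title_shares(season).items():
        totals[team] += share

# Fraction of simulated seasons in which each team "won" the division.
division_pct = {team: totals[team] / len(seasons) for team in ("A", "B", "C")}
print(division_pct)
```

Over enough seasons, those half and third shares are where the odd decimals in the percentages come from.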
OK, so now that the disclaimers are out of the way, onto the projected standings. Wins and losses are rounded to the nearest whole number, so if the total W-L doesn’t add up to 2430-2430, that’s why.
There’s too much stuff to fit it all into one post, so I’ve created a separate post for each projection system. I will use this post to show the results of the aggregate/average of all the projections.
This year, I’m using five different projection systems. You can click on each of the links below to get some more information about each system and to see how their projected standings look.
CAIRO, which is my own projection system.
Tangotiger’s Marcel.
The Hardball Times’s Oliver.
Baseball Prospectus’s PECOTA.
Dan Szymborski’s ZiPS.
I should note that the Marcel projections used here were generated using Python code provided by Jeff Sackmann and are not the “official” projections, although they should be almost identical. I’ll also mention that Oliver, PECOTA and ZiPS have their own projected standings so these should not be considered the official version of those forecasts. Playing time distribution, run environments and park factors may cause some divergence between what those forecasts say and what mine say. When in doubt, go with the official ones.
OK, enough of this palaver…let’s get this show on the road.
| Div | Team | W | L | RF | RA | Div | WC 1 | WC 2 | PS% | W 1 Std |
| AL East | NYA | 94 | 68 | 836 | 711 | 51.7% | 20.5% | 10.2% | 82.5% | 84 - 104 |
| AL East | BOS | 91 | 71 | 829 | 734 | 26.0% | 23.9% | 14.3% | 64.2% | 81 - 101 |
| AL East | TAM | 89 | 73 | 765 | 683 | 19.0% | 22.0% | 13.5% | 54.5% | 79 - 99 |
| AL East | TOR | 81 | 81 | 774 | 775 | 3.3% | 5.9% | 6.2% | 15.4% | 71 - 91 |
| AL East | BAL | 70 | 92 | 713 | 819 | 0.1% | 0.1% | 0.3% | 0.5% | 60 - 80 |
| Div | Team | W | L | RF | RA | Div | WC 1 | WC 2 | PS% | W 1 Std |
| AL Central | DET | 86 | 76 | 784 | 736 | 53.5% | 1.1% | 11.0% | 65.6% | 76 - 96 |
| AL Central | CLE | 82 | 80 | 767 | 751 | 30.7% | 1.6% | 7.8% | 40.1% | 72 - 92 |
| AL Central | CHA | 76 | 86 | 707 | 772 | 8.0% | 1.0% | 2.8% | 11.8% | 66 - 86 |
| AL Central | KC | 75 | 87 | 705 | 765 | 5.6% | 0.4% | 1.6% | 7.7% | 65 - 85 |
| AL Central | MIN | 71 | 91 | 731 | 822 | 2.1% | 0.1% | 0.6% | 2.9% | 61 - 81 |
| Div | Team | W | L | RF | RA | Div | WC 1 | WC 2 | PS% | W 1 Std |
| AL West | TEX | 91 | 71 | 807 | 702 | 52.2% | 9.2% | 13.8% | 75.2% | 81 - 101 |
| AL West | LAA | 90 | 72 | 741 | 660 | 43.2% | 12.5% | 14.8% | 70.5% | 80 - 100 |
| AL West | OAK | 76 | 86 | 707 | 756 | 2.7% | 0.9% | 2.0% | 5.6% | 66 - 86 |
| AL West | SEA | 75 | 87 | 682 | 741 | 2.0% | 0.6% | 1.6% | 4.2% | 65 - 85 |
| AL | WC1 | 92 | ||||||||
| AL | WC2 | 89 | ||||||||
| Div | Team | W | L | RF | RA | Div | WC 1 | WC 2 | PS% | W 1 Std |
| NL East | PHI | 89 | 73 | 698 | 629 | 41.6% | 14.7% | 10.1% | 66.3% | 79 - 99 |
| NL East | ATL | 87 | 75 | 714 | 666 | 27.9% | 16.4% | 10.3% | 54.6% | 77 - 97 |
| NL East | WAS | 84 | 78 | 683 | 657 | 14.4% | 11.4% | 8.8% | 34.5% | 74 - 94 |
| NL East | MIA | 84 | 78 | 707 | 681 | 15.5% | 10.2% | 8.4% | 34.0% | 74 - 94 |
| NL East | NYN | 73 | 89 | 683 | 752 | 0.7% | 0.7% | 1.1% | 2.5% | 63 - 83 |
| Div | Team | W | L | RF | RA | Div | WC 1 | WC 2 | PS% | W 1 Std |
| NL Central | CIN | 87 | 75 | 715 | 661 | 40.9% | 9.6% | 10.5% | 60.9% | 77 - 97 |
| NL Central | STL | 87 | 75 | 731 | 679 | 35.2% | 11.3% | 10.6% | 57.1% | 77 - 97 |
| NL Central | MIL | 84 | 78 | 699 | 664 | 22.5% | 9.6% | 9.5% | 41.6% | 74 - 94 |
| NL Central | PIT | 72 | 90 | 668 | 752 | 0.9% | 0.4% | 0.8% | 2.1% | 62 - 82 |
| NL Central | CHN | 71 | 91 | 656 | 750 | 0.5% | 0.3% | 0.6% | 1.4% | 61 - 81 |
| NL Central | HOU | 64 | 98 | 604 | 756 | 0.0% | 0.0% | 0.1% | 0.2% | 54 - 74 |
| Div | Team | W | L | RF | RA | Div | WC 1 | WC 2 | PS% | W 1 Std |
| NL West | SF | 84 | 78 | 672 | 647 | 32.9% | 4.9% | 9.2% | 47.0% | 74 - 94 |
| NL West | ARI | 84 | 78 | 693 | 673 | 31.3% | 4.6% | 9.0% | 44.9% | 74 - 94 |
| NL West | COL | 83 | 79 | 747 | 731 | 27.3% | 4.1% | 7.6% | 39.0% | 73 - 93 |
| NL West | SD | 76 | 86 | 647 | 688 | 4.7% | 1.1% | 2.1% | 7.9% | 66 - 86 |
| NL West | LAN | 75 | 87 | 641 | 690 | 3.8% | 0.9% | 2.0% | 6.7% | 65 - 85 |
| NL | WC1 | 90 | ||||||||
| NL | WC2 | 87 |
Div: Percentage of times team won division
WC 1: Percentage of times team won first wild card
WC 2: Percentage of times team won second wild card
PS%: Total percentage team qualified for the postseason (DIV + WC1 + WC2)
W 1 Std: Wins within one standard deviation
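As a quick sanity check on how two of those columns relate to the others, here is a small Python sketch. The helper names are hypothetical, and the 10-win standard deviation is only inferred from the table above (every range shown is the projected wins plus or minus 10); it is not a stated parameter of the simulations.

```python
def postseason_pct(div, wc1, wc2):
    """PS% as defined in the legend: Div + WC 1 + WC 2 (all in percent)."""
    return round(div + wc1 + wc2, 1)


def one_std_range(mean_wins, std_wins=10):
    """Wins within one standard deviation.

    std_wins=10 is an assumption read off the table, not a published value.
    """
    return (mean_wins - std_wins, mean_wins + std_wins)


# Detroit's row from the table: Div 53.5%, WC 1 1.1%, WC 2 11.0%, 86 wins.
det_ps = postseason_pct(53.5, 1.1, 11.0)
det_range = one_std_range(86)
print(det_ps, det_range)
```

Because each column is rounded separately, a few rows (the Yankees, for example) can be off by a tenth of a percent when you add the rounded components.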
As noted earlier, this is NOT saying that you can win the NL West by winning 84 games. It’s saying that the teams that finished in first most frequently in that division averaged 84 wins over hundreds of thousands of seasons. Here are the average win totals for each spot in each division.
| Div (Place): avg W | Div (Place): avg W |
| AL East (1 ): 98 | NL East (1 ): 94 |
| AL East (2 ): 92 | NL East (2 ): 88 |
| AL East (3 ): 87 | NL East (3 ): 84 |
| AL East (4 ): 80 | NL East (4 ): 79 |
| AL East (5 ): 69 | NL East (5 ): 72 |
| Div (Place): avg W | Div (Place): avg W |
| AL Central (1 ): 89 | NL Central (1 ): 92 |
| AL Central (2 ): 83 | NL Central (2 ): 86 |
| AL Central (3 ): 78 | NL Central (3 ): 81 |
| AL Central (4 ): 73 | NL Central (4 ): 75 |
| AL Central (5 ): 68 | NL Central (5 ): 69 |
| | NL Central (6 ): 62 |
| Div (Place): avg W | Div (Place): avg W |
| AL West (1 ): 95 | NL West (1 ): 90 |
| AL West (2 ): 87 | NL West (2 ): 85 |
| AL West (3 ): 78 | NL West (3 ): 80 |
| AL West (4 ): 71 | NL West (4 ): 76 |
| | NL West (5 ): 70 |
The first AL Wild Card team won an average of 92 games, and the second won an average of 89. For the NL, those averages were 90 and 87.
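If it seems odd that first place averages more wins than any single team projects for, a toy simulation shows why: in any one season, somebody usually runs hot. The Python sketch below is a deliberate simplification and not how the standings above were generated. It draws each team's wins independently from a normal distribution around its projected total from the NL West table, with an assumed 10-win standard deviation, and ignores the schedule, injuries and the fact that one team's win is another team's loss. The function name, seed and number of seasons are arbitrary choices for illustration.

```python
import random


def average_wins_by_place(team_means, std_wins=10.0, n_seasons=20000, seed=2012):
    """Average win total of the 1st, 2nd, ... place finisher.

    Toy model: each team's wins are an independent normal draw around its
    projected mean. std_wins, n_seasons and seed are illustrative assumptions.
    """
    rng = random.Random(seed)
    totals = [0.0] * len(team_means)
    for _ in range(n_seasons):
        # Simulate one season, then rank the win totals from best to worst.
        season = sorted(
            (rng.gauss(mean, std_wins) for mean in team_means.values()),
            reverse=True,
        )
        for place, wins in enumerate(season):
            totals[place] += wins
    return [total / n_seasons for total in totals]


# Projected NL West win totals from the standings table above.
nl_west = {"SF": 84, "ARI": 84, "COL": 83, "SD": 76, "LAN": 75}
by_place = average_wins_by_place(nl_west)
for place, wins in enumerate(by_place, start=1):
    print(f"NL West ({place}): {wins:.1f}")
```

The first-place average comes out above 84 even though no team projects for more than 84, which is the same effect you see in the tables above. The exact numbers will not match, since the real simulations play out actual games rather than drawing from a bell curve.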
Here are the AL and NL pie charts for the average projected division winners.
Nothing really stands out to me, although I think you can find plenty of things to quibble with on the margins. I am a bit surprised that the gap between Detroit and Cleveland is only four games, although it is the biggest gap between the first-place and second-place teams in any division. The NL Central and West look like strong three-team races, and the AL West looks like a tossup between Texas and the Angels.
And for fans of anyone but the Astros, I guess you can say “at least we’re not the Astros.”