Pythagoras and Baseball

One neat little tool in baseball is the Pythagorean theorem. No, not for figuring out how far second base is from home plate (that’s just over 127 feet), but for figuring out what a team’s won-loss record “should” be based on its run totals, both scored and allowed. In its simplest form, the “Pythagorean winning percentage” is runs scored squared divided by the sum of runs scored squared and runs allowed squared. (Experimentation has revealed that an exponent of 1.83, rather than 2, works best.)

So, say a team has allowed 40 runs while scoring 50. With the 1.83 exponent, this projects them to a winning percentage of .601. If they’ve played ten games, that would mean they should have won about 6 of them. (I’m keeping the numbers simple here to start.) If they’ve won 5, they’ve been slightly unlucky; if they’ve won 3, they’ve been very unlucky. Over the course of a season, teams will usually end up within about 2-3 games of their expected win totals.
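
The arithmetic above is easy to check with a few lines of Python. This is only a sketch: the function names `pythagorean_pct` and `expected_wins` are my own, not anything from Baseball Reference, and the default exponent of 1.83 is the one mentioned earlier.

```python
def pythagorean_pct(runs_scored, runs_allowed, exponent=1.83):
    """Expected winning percentage from runs scored and runs allowed."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def expected_wins(runs_scored, runs_allowed, games, exponent=1.83):
    """Expected wins over a number of games (hypothetical helper)."""
    return games * pythagorean_pct(runs_scored, runs_allowed, exponent)


# The example team: 50 runs scored, 40 allowed, 10 games played.
print(f"{pythagorean_pct(50, 40):.3f}")              # 0.601
print(f"{pythagorean_pct(50, 40, exponent=2):.3f}")  # 0.610 with plain squares
print(f"{expected_wins(50, 40, games=10):.1f}")      # 6.0
```

Note that the plain squared version gives .610 for this team, so the .601 figure depends on using 1.83.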

Where this gets really interesting (to me, anyway) is that the Pythagorean winning percentage is actually a better predictor of future success than the actual winning record. That is, teams tend to drift toward their Pythagorean winning percentage. In the example above, the team that only won 3 games despite a .601 Pythagorean percentage will, over time, tend to play like a .600 team, not the .300 team their actual record suggests.

An example of this is this year’s Seattle Mariners. They had a winning percentage of .543 last year and tied for second in the American League wild card race. But as Baseball Reference’s 2007 standings page shows us, they were extremely lucky last year, winning 8 games more than Pythagoras told us they should have while putting up a sub-.500 Pythagorean winning percentage. They became a trendy pre-season pick to make the playoffs this year, but currently have the worst winning percentage in the American League. (Ironically, they are severely underperforming their Pythagorean percentage this year, winning six games fewer than expected.)
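
As a rough consistency check on the Mariners numbers, here is the same arithmetic in Python. It uses only the figures quoted above (.543 and 8 extra wins) plus the assumption of a 162-game schedule; because of rounding, the result may differ by a game from what the standings page itself lists.

```python
GAMES = 162  # assumption: a full regular-season schedule

actual_pct = 0.543        # last year's winning percentage, from the text
wins_above_expected = 8   # "luck", from the text

actual_wins = round(actual_pct * GAMES)
pythag_wins = actual_wins - wins_above_expected
pythag_pct = pythag_wins / GAMES

print(actual_wins, pythag_wins, f"{pythag_pct:.3f}")  # 88 80 0.494
```

So a team that finished 14 games over .500 works out to roughly a .494 team by run differential, which is the "sub-.500" part.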

Notable performance differences so far this year:

American League

* Tampa Bay has won six more games than it “should” have while Boston is dead-on its expected winning percentage. If both teams matched their expectations, Boston would be in first place in the AL East.
* The Los Angeles Angels of Anaheim (my least favorite team name ever) have actually won 9 games more than expected, although they’d still be running away with their division even if there were no variance from expectations. (Although had they not been so lucky, would the A’s have traded 40% of their starting rotation?) They’re good, but not as good as they look.

National League

* The Phillies would be in first place by a game and a half if they weren’t two games off their expected record.
* The Cubs, already owners of the best record in baseball (that just doesn’t make sense), have actually been unlucky, with two fewer wins than expected.
* The Diamondbacks, leading the West with a .511 winning percentage, should actually have a .534 winning percentage. Still not great for a division leader, but less embarrassing.

Another interesting thing I noticed on this page relates to winning percentages against winning teams. You often hear in postseason previews that Team X had a losing record against teams with winning percentages over .500. Of course they did; that’s why those teams are over .500. As of today, four teams (out of sixteen) in the NL (the Mets, Marlins, Cubs and Brewers) have won over half their games against teams with winning records. (The Cardinals are exactly at .500.) Meanwhile, in the American League only the Yankees, Rays and Angels are winning against winning teams. So, if this is a typical year (and I don’t care enough to do further research), we can expect about a quarter of the teams to be over .500 against winning teams, which sounds about right to me.
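
The "about a quarter" estimate is just a head count, sketched below. The NL figures come from the paragraph above; the 14-team size of the American League is an assumption I am adding, since the text only gives the NL total.

```python
nl_over_500, nl_teams = 4, 16   # Mets, Marlins, Cubs, Brewers
al_over_500, al_teams = 3, 14   # AL size of 14 is an assumption, not from the text

share = (nl_over_500 + al_over_500) / (nl_teams + al_teams)
print(f"{share:.2f}")  # 0.23
```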

I’ll have to spend more time on Baseball Reference. I never knew this page existed, so who knows what else I’m missing?