Where Have All the Good Teams Gone?

Even before the addition of the second Wild Card, baseball’s postseason was not structured to reward the league’s best team with a championship.  It is the nature of baseball that a 162-game season says far more about a team’s abilities than a best-of-five or best-of-seven (or one-game!) series.  Most of the teams that have recently won championships, most notably the 2006 and 2011 Cardinals, have little claim to the title of Best Team in Baseball other than the rings they wear.

One convenient narrative to describe the 2012 postseason to date is that four up-and-coming teams, whose preseason expectations varied from last place to fringe contenders, were exiled by four usual suspects, each of whom has played in a League Championship Series in the past two seasons.  If this is true, why does it feel like all of the good teams have been knocked out of the playoffs, leaving two weeks to determine which mediocre team can take advantage of bracket chaos and back into a championship?

Who were the best teams in Major League Baseball in 2012?  We could answer this question a number of ways.

By win-loss record, the best teams were:

Washington, 98-64

Cincinnati, 97-65

NY Yankees, 95-67

The first two of those teams, of course, were vanquished in the NLDS by opponents with inferior regular season records, and the third seems on the verge of collapse.  Furthermore, just one of the three 94-win teams (San Francisco, Atlanta, and Oakland) is still playing, and they’ve certainly found themselves in an underdog role in their LCS as well.

If we adjust win-loss records for quality of competition, the teams in the AL East (Yankees and Orioles) and AL West (A’s and Rangers) look like the best teams in baseball, and again, only the Yankees are left standing, on one 787-year-old leg (an even 800 if we swap out Nunez for Jeter).

If we look a little deeper, at each team’s pythagorean record, an expected record based on runs scored and prevented, the best teams were:

Washington, 98-64

Tampa Bay, 96-66

NY Yankees, 96-66

Now we’ve got a team that’s been eliminated, a team that didn’t even make the playoffs, and the geriatric unit that keeps coming in third.  To be fair, the Cardinals had the fourth best pythag, at 94-68, and their run differential probably says more about their true talent than their win-loss record does.
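
For anyone who wants to see the arithmetic behind a pythagorean record, here is a minimal Python sketch.  It assumes the classic exponent of 2 (other versions of the formula use slightly different exponents), and both the function name and the 700/600 run totals are hypothetical, chosen only for illustration rather than taken from any team's actual 2012 numbers.

```python
def pythagorean_record(runs_scored, runs_allowed, games=162, exponent=2.0):
    """Return (expected_wins, expected_losses) from runs scored and allowed."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    win_pct = rs / (rs + ra)          # expected winning percentage
    wins = round(win_pct * games)     # round to a whole-number record
    return wins, games - wins

# Hypothetical team: 700 runs scored, 600 allowed over 162 games.
print(pythagorean_record(700, 600))  # prints (93, 69)
```

The point of the exercise is that the record depends only on the season's run totals, not on how those runs were distributed across individual games.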

Peel back one more layer, and we can look at the components that make up runs scored and runs prevented, namely hitting, baserunning, fielding, and pitching.  According to fangraphs, the best squads this season in terms of team WAR were:

1. Cardinals, 52.3

2. Yankees, 51.1

3. Rangers, 50.4

4. Brewers, 50.3

5. Nationals, 50.1

6. Diamondbacks, 47.6

7. Braves, 47.5

8. Angels, 47.4

9. Reds, 46.9

10. Tigers, 45.9

12. Giants, 44.6

14. Athletics, 41.8

20. Orioles, 31.9

There’s not much separation among the top five (which somehow includes the Brewers), but now we’ve got two teams that are still alive at the top.  This makes sense, since these are the teams fangraphs tells us displayed the most talent on the field all year, but of course, it’s counterintuitive to suggest that disparities in talent show up more in a short series than they do over 162 games.
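
To put a rough number on how little a short series can reveal, here is a small Python sketch.  It assumes every game is an independent event with a fixed win probability for the better team, which ignores home field, pitching matchups, and everything else that makes October interesting.  The function name and the 55% figure are illustrative assumptions, not estimates for any real team.

```python
from math import comb

def series_win_prob(p, best_of):
    """Probability of winning a best-of-N series with per-game win
    probability p, assuming independent games."""
    need = best_of // 2 + 1   # wins required to clinch
    q = 1.0 - p
    # Sum over the number of losses (0..need-1) taken before the clinching win.
    return sum(comb(need - 1 + k, k) * p**need * q**k for k in range(need))

for n in (1, 5, 7):
    print(n, round(series_win_prob(0.55, n), 3))
```

Under these simplified assumptions, a team that wins 55% of its games takes a best-of-five about 59% of the time and a best-of-seven about 61% of the time, so the weaker side still advances roughly four times in ten.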

A few things strike me here, starting with my motivation for writing this piece.  All season, I was certain the Rangers were the best team in baseball.  They won the pennant in the tougher league each of the past two years.  They had a star at practically every position Michael Young didn’t play.  They had a deep rotation, a great bullpen, and, for most of the season, the best record in the American League, despite playing in perhaps its most competitive division.  Let’s look at the components of their team WAR (from the Value tab of fangraphs’ leaderboards):

52.5 batting runs above replacement (6th in MLB)

1.1 baserunning runs above replacement (16th)

7.1 fielding runs above replacement (12th)

18.0 starting pitcher WAR (3rd)

5.9 relief pitcher WAR (7th)

That’s the profile of a good team- maybe the type that wins 93 games and almost wins its division- but not necessarily the powerhouse I thought they were.  Nevertheless, their 93-win season in baseball’s toughest division was rewarded with one game against a far weaker team by just about any measure, and now they’re watching the playoffs from home.

Fangraphs tells us that the best hitting team in baseball, and the best baserunning team and the second-best fielding team, was the Trouts Angels, whose 37.4 position player WAR were 11% better than the runner-up Brewers.  Despite what looked like a dominant rotation on paper, the Angels’ pitching wasn’t good enough to lead them to the postseason, as they finished four games behind the two Wild Card teams.

After the Tigers and Rangers, the best pitching team was the Rays, with 23.2 WAR.  Conversely to the Angels, Tampa couldn’t hit enough to reach the postseason.  Certainly, the Rangers, Angels, and Rays would have been in any conversation about the best teams in baseball at any point during the season, but due to their own shortcomings and the mysterious magic of the Orioles and Athletics, they didn’t get a chance to test their luck in October.

What does that leave us with?  Well, for one, the Cardinals, who may be the best team in baseball despite their 88-74 record, worst among all playoff qualifiers (and worse than the Rays and Angels, to boot).  St. Louis can hit (NL-best 107 wRC+).  They can pitch (their 3.47 starter FIP was .01 behind the league-best Nationals, and that doesn’t include much from Chris Carpenter).  And as they did last year, they can hit you with eight playoff-caliber relievers (six of them had ERAs below 3 and FIPs below 3.5).  The defense and baserunning are suspect, but I’m not sure how much that matters when the umpires are giving them infield fly calls on 225-foot fly balls (I’m sorry, I had to).  Keep in mind also that the last two champion Cardinals teams also had the worst record of all playoff qualifiers, and this team seems even more poised to add another trophy.

It also leaves us with the Yankees, the best AL team according to two of the three measures above.  Much of the Yankees’ regular season value was tied up in their offense.  But the team that batted .265/.337/.453 this year (the latter two of those numbers led the AL) hasn’t shown up in October, when they’ve hit .205/.277/.326.  Their pitching has yielded an even more feeble .213/.255/.303 line, which has kept them alive, but much of that was against the impotent Orioles (.247/.311/.417 all season).  It’s easy to think the Rangers would have stomped all over this Yankee team, but baseball isn’t that easy, and the numbers don’t vouch for Texas’s superiority.

The Tigers were just the 10th best team in total WAR, but that doesn’t count more than half of Anibal Sanchez’s and Omar Infante’s contributions, since they were acquired just before the July trade deadline.  It also includes a lot of subpar work from the Tigers’ bench, which was among the worst in the game (just nine Tigers position players finished the season above replacement level, while ten were below), and from Rick Porcello (4.59 ERA) and a handful of sub-replacement-level spot starters.  In the postseason, the Tigers will only start Justin Verlander, Doug Fister, Anibal Sanchez, and Max Scherzer, each of whom has dominated in October, and can lean on Austin Jackson, Miguel Cabrera, and Prince Fielder without having to turn to their bench too often.

Finally, the Giants rank twelfth in team WAR and are basically an average team across the board.  They play in the most extreme pitcher’s park in the National League, which tends to paint the picture of a dominant pitching staff and a flaccid lineup, but adjusting for park effects, they had an average offense this year (99 wRC+), an average rotation (3.73 ERA, 3.82 FIP, 5th and 6th in the NL, respectively), and a decent bullpen (3.56, 3.68).  The short series limits their exposure to Barry Zito, and seems to have helped Tim Lincecum, who’s given up one run in ten playoff innings out of the bullpen.  Still, this is not much better than an average team, with Madison Bumgarner possibly running out of gas and Ryan Vogelsong never inspiring much fear despite another impressive ERA (3.37) this year.

Clearly, the title of this post is a bit of an overreaction to the dismal Yankees-Orioles series and to the many errors and passed balls we’ve seen in every series.  The Cardinals may have been the best team in baseball this year (though I might still try to argue for the Rangers) and certainly look like the best team right now.  The Yankees may look like extras from “Cocoon” but they can still pitch, and the first 162 games of this season may tell us more about what their offense is capable of later this week than the last seven.  The Tigers are loaded with star power and their pitching is on fire at the moment.  And the Giants, well, baseball is unpredictable.  And they might have the best player still playing in Buster Posey.

If only the umps were playoff-caliber…


Where Have All the Good Teams Gone? — 82 Comments

  1. One other variable that also has to be kept in mind is that teams are not static over a six-month-long season. Injuries, trades, rookies, new discoveries — just because a team is at the top of the standings (wins, WAR, pythag or whatever you choose) over the full 162-game slog does not necessarily mean it’s the most talented at the end of the season when championships are decided. Those 162-game stats are strong evidence of who the strongest clubs are as of October, but not definitive evidence.

    • Good point, birtelcom. I tried to touch on that with the difference between the regular season Yankees and the October version, and by mentioning Detroit’s midseason acquisitions, but there’s much more to it than that. I thought about presenting September splits to find the best late-season teams (Baltimore would finally be in the conversation), but it felt counter to my original point to segregate months and try to draw conclusions from smaller samples.

  2. “If only the umps were playoff caliber”

    Another call tonight that would have been reversed on review. At least the umpires, both today and yesterday, were in the right position to make a correct call. But that only makes their failure to do so the more frustrating.

  3. I think all the teams that are left are good teams. That doesn’t mean that they are the actual best four teams, but that’s the bargain MLB has made with increasing rounds of playoffs. The true talent will show up across 162 games. In a short series, however, anything is possible, and we now have the shortest of all short series: A one-game playoff!

    I think the difference now is there is not a great difference between many of the teams that make the postseason. There is no super team. Any team that makes the postseason can win the World Series. It just comes down to the luck of timing, but you still have to be good to get there.

  4. Nice argument, Bryan, however, I think somewhat flawed by the use of “best”.

    Baseball is a zero-sum game (essentially). There is a winner and a loser (barring very rare exceptions). Repeat 162 times. The best team is the one who has won more than the rest. So, this year, the best team is Washington. Easy. They have had what it takes (players, depth, management, and all the rest) to win more than everyone else.

    In the old days, they would play against the best AL team, the NY Geriatrics, to determine the “best of the best”, which was what the World Series was originally meant to be (leaving aside the Negro Leagues, et al).

    Now, to be the best-of-the-best, the best team has to play against less good teams on an equal footing. So we are back to one-off games and silly series. All very exciting but, as you have rightly pointed out, not a good way of determining the “best”.

    Unless you mean that the team that can win the most against the other best teams really is the best-of-the-best, no matter how average they are? Maybe… 😉

    • Thanks for the comment, Mark. I disagree with the assertion that the best team is the one who has won more than the rest. With a balanced schedule, you could make that argument, but when the Nationals play 36 games against the Mets and Marlins, while the A’s play 38 (I think) against the Rangers and Angels, Washington winning four more games doesn’t make them better. More successful, maybe, but not better.

  5. At some point several years ago (I think in response to NFL football arguments) I concluded that all the chatter about the so called best team and whether the “best team” won, misses the point. Teams play to win championships not to be acclaimed as the “best team” (except for college football without a play-off).

    The regular season goal is to get yourself in position to play for a championship whether that takes 88 wins in your division or 100. If you are left standing at the end, then you have earned the right to be called the champion, and that is enough for me.

  6. When a sport has as many teams as MLB does, it’s inevitable that often the “best” regular season teams don’t win the World Series. We could throw in the 2001 Mariners or the 1954 Indians as examples. What I object to is the watering down of the playoffs, coupled with an unbalanced schedule, which can lead to unfair enrichment of teams who were mediocre in the regular season. It cheapens the 162 game effort.

  7. My understanding of “best” was formed before divisional play, and was (and is), simple: the team that wins the most games. Like Mark @4. The simplicity of this was screwed up first by divisional play, which introduced unbalanced seasons where differential schedules contributed to a mix of same/different division opponents (though initially still symmetrical), which led to two contenders for the pennant whose win differentials were based on asymmetrical bases, with an (initially) five-game series having the power to undo 162 games of effort. The schedule introduced an element of true luck – one I think even Jim Bouldin would grant. But at least the divisions still resembled the old leagues in rewarding teams in the same way: most wins meant best team.

    With the expansion of luck through the crap-shoot schedules of interleague play, the introduction of more and smaller divisions, which increased the leverage of a team’s division placement (having nothing to do with earned merit), and finally the wild card and the luck of how the rules would pick a team’s initial postseason opponent . . . and finally the second wild card, the obstacles to a team’s success on the basis of a consistent ability to win ballgames have been replaced with the challenge to win them at the right time. So we now have the criterion that Bill @5 suggests: the goal is a strategy to win just enough at each step to advance to the next, a goal that used to be confined solely to the World Series, which used unfailingly to match the winningest teams of two leagues with sharply distinct identities.

    The current system is what it is – it’s certainly great for media entertainment: there used to be as few as four postseason games, now a child can grow up and go off to college during a postseason stretch – but it has left the concept of the best team up for grabs.

    • Perfectly stated epm. Just to clarify though–I would say that the divisions and unbalanced schedules introduced an element of *randomness*, not luck, into the process of determining the league champ. May seem like a quibble but the semantics are in fact important.

  8. Sorry, but I’m not buying it, especially the idea that the Cardinals are somehow the “best” team in baseball.

    Far far too much is made of the relevancy of the pythagorean expectation (PE) and WAR in these kinds of discussions. The logic of the PE goes like this: the cumulative run differential (RD) is a better indication of a team’s true ability than is its won/loss (WL) record. As a flat statement, that’s logical nonsense, because for example, it’s obvious that a team can create a significant fraction of its RD by simply “piling on”, i.e. scoring useless runs above what is needed to win any given game (or being the recipient thereof, e.g. in the case of the Orioles this year), while also winning games by close margins (Orioles again, but not just them). It is not uncommon, for example, especially in the first third of a season, when many teams are still trying to figure out lineups and rotations and middle relief sequences, to have a team fall behind by several runs, to which it responds by bringing in its lower quality middle relief guys instead of wasting its better pitchers in games it is unlikely to win.

    And the result of this: even more runs are scored, i.e. it’s a positive feedback cycle, a piling on by the winning team. And it really tells you very little about the ability of either team to win baseball games over the course of a 162 game season, as some of those bad pitchers get optioned down to the minors, released, traded and what have you, and others take their place as the team stabilizes its roster. And there are also the moves that occur at the trade deadline of course.

    Sure, if you win every one of your 162 games by a score of 4-1, then your RD is an indicator of how “good” of a team you are, and the PE will predict your actual record correctly, but so what, that doesn’t tell you anything you didn’t already know from the W-L record itself. And WAR is similar in these regards, because WAR, in the end, is based on run differential, just as the PE is.

    The best way to evaluate which teams are the best is to simply normalize their season records to account for divisional and league strength differences, and just leave it at that. The playoffs tell you very little.

    • But Jim, if we can say team A had the best season, though they are not World Champions, does it matter?

      Who was the better team: the undefeated Patriots (until the Super Bowl) or the Giants? We can argue that point and the argument will be based upon what collection of criteria that we use, and whether we can reach agreement on that. But the Giants are the Champions of that season and I have no bigger problem with that than I do the Cardinals’ recent baseball titles, though I was not pulling for any of those teams to win.

      • Bill, why don’t they take the top 10 finishers, or all the single stage winners, of the Tour de France and have them race one more day, at the end, to see who the real “World Champion” is?

        Answer: because they *already decided that* in the previous three weeks of racing.

        • Jim- because they all might OD if it came down to one day?

          Just kidding, but they race to what the rules are. Would you favor a baseball playoff system where game 7 of the World Series was akin to the ceremonial last day of the Tour de France?

          • Bill I favor a playoff system that is radical and will never ever be adopted.

            As for the Tour, the ceremonial finish has been challenged at times when someone or other thought they had a chance of winning it on the last day, which can only happen if it’s really close after the penultimate stage. But Greg Lemond took advantage of the 1989 Tour ending with an individual time trial, to accomplish what is still one of the greatest American sporting accomplishments ever.

          • I was at that Tour finish and Greg LeMond also took advantage of being the first Tour contestant to ever use a true aero time trial bike to win it. Fignon, while not treating the time trial as ceremonial, was nonetheless waving at the crowd as he left the starting gate with his normal racing bike, as was customary. LeMond did not break his aero riding position on that aero bike the entire time.

            While it was indeed an awesome achievement coming back from a shotgun wound and major surgery, and while LeMond was absolutely flying through Paris, cheered on by a considerable number of Americans, had Fignon ridden the same type of high-tech bike I doubt LeMond would have overtaken him.

          • so, let me guess… you conclude from this that Lemond was lucky to have chosen to use an aerodynamically superior bike and riding position, right?

          • Jim,

            You seem to have a problem with the word “luck.” Nowhere do I mention it in reference to what LeMond did. The Tour obviously has some luck involved (you have to avoid crashes, etc.), but not nearly as much as baseball, which is one reason why individual riders can dominate it for several consecutive years despite the hundred-plus other competitors entered in each one.

            As far as LeMond’s victory goes, before the Tour he got approval to use his aero bike for time trials. He was the only rider to request such permission, and Tour officials debated at length about whether they should allow him to. The accepted practice at the time was for riders to use the same type of bike for every etappe, and in fact the French public and most Tour aficionados were hostile at the time to what has now become standard practice (i.e. using such specialized bikes rather than only regular road racing bikes). To the extent that Tour officials interpreted rules in his favor and disregarded long-established custom, Greg might indeed have gotten a little lucky :-)

          • tag, the lengths to which you will go to defend your “luck” theory can only be described as breathtaking and all-encompassing.

            I will remind you here that Lemond’s time trial run remains the second fastest ever, notwithstanding the improvements in conditioning and technology and extravagant use of PEDs by the cycling community over the intervening 23 years. Maybe his bike gained him a few extra seconds, I don’t know and neither do you. But I do know that Fignon should not have been waving to his adoring French crowd as he exited the ramp.

          • Jim,

            What I was discussing in that post had nothing to do with luck (it was only mentioned tongue in cheek). What I was discussing, which is clear from any unbiased reading of it, was high tech and the advantage it gave LeMond. And there are many things we do know about the improvements that advanced aerodynamic design and related lightweight materials can provide. (If you’ve ever ridden one of these bikes, you probably wouldn’t bother contesting the point.)

            The UCI recognizes two hour records in bike racing. The UCI Hour Record restricts competitors to roughly the same equipment that the great Belgian rider Eddy Merckx used when he established the first modern record, disallowing time trial helmets, disc or tri-spoke wheels, aerodynamic bars and monocoque frames. The association also recognizes the Best Human Effort or UCI “Absolute” Record, which permits all the postmodern gadgetry.

            Merckx set his record in 1972 (49.431). Chris Boardman, among several others, broke Merckx’s record using then-current highest tech. In July 1993 he went 52.270 km/h with a triathlon handlebar/carbon airfoil tubing frame/carbon four-spoke wheels. In September 1996 he went 56.375 using a Graeme Obree “superman” handlebar/carbon monocoque aero frame/five-spoke front & rear disk wheels. Even discounting for potentially better training and PEDs within this relatively short timeframe, the pace of improvement employing this cutting-edge equipment is clear.

            In 2000, after all records benefitting from such advanced tech had been downgraded by the UCI to Best Human Effort, Boardman went after Merckx’s 28-year-old UCI Hour Record on a traditional bike and rode 49.441 km, topping Merckx by 10 m, an improvement of 0.02%.

            LeMond’s Tour time trial was obviously not conducted on a track nor did it last an hour. He traveled 24.5 km. He made up 58 seconds and won by eight seconds. In November 1989 Bicycling Magazine, employing wind-tunnel data, estimated that LeMond may have gained up to one minute on Fignon through the use of his new aerobars plus 16 more seconds by wearing his aero helmet with its elongated tail section (Fignon, bless his Gallic heart, rode bare-headed with his ponytail fluttering in the breeze: it was later bemoaned in the French sporting press that all Laurent had had to do to secure the victory was cut off his trademark ponytail to cut down on drag). Fignon for his part may have gained a five-second advantage by using a disk front wheel while LeMond used a 24-spoke bladed radially spoked front wheel. LeMond averaged 54.55 km/h. Fignon finished third in the final trial at 53.59. It’s hard to estimate how much they both benefited from the moderate tailwind that prevailed over much of the course.

            I followed that race from start to finish and love Greg LeMond. (He’s a friend of my friend Stefan Mutter, the excellent Swiss cyclist.) The man pulled off the ride of his life at the perfect time. I’m certainly not downgrading his achievement (again, he’d recovered from a shotgun wound and his team absolutely sucked, providing him no help at all); I’m just putting it in the proper context. The tech (which he’d also used in a previous, much longer time trial to help him gain a minute on Fignon), and Fignon’s very French habit of sticking to custom, made a clear difference, Laurent’s exuberant wave and all.

            As far as the doping goes, Merckx was doing it back when he set his records and…well, I assume the PEDs and the protocols get better all the time.

          • Not the place for this discussion, but Lemond’s abilities are the stuff of legend. Just as importantly, he has been instrumental in exposing the doping issue via his direct confrontations of Landis and Armstrong, with Armstrong now in complete free-fall with the USADA report and the events of the last week.

            Accordingly, he is without question one of the most important figures in the sport and also one of the greatest American athletes. But I’ll stop there.

          • The funny thing is that there’s no one to even give Lance’s vacated Tour wins to: Ullrich, Pantani, Zülle, Basso – they were all taking PEDs. I tend to give Lance and all the other bikers a lot more slack than I do Bonds and the other baseball players. Riding the way they do is inhuman.

            BTW, Ullrich is a great guy. He used to train in the Black Forest near me and you could ride with him – provided you could keep up, which I never managed to do for long. He used to party and get pretty fat in the offseason, and his trainer would send him out to do a 300 km ride with only liters of water to keep him going. He was always cursing about it.

    • Jim, I know that the pythag expectation is a pet peeve of yours. But the thing is, it works. That is, pythagorean expectation wins are more stable year-to-year than actual wins. If one looks at random Team X and the only things one knows about them are their 2012 actual wins and their 2012 pythag expectation wins, one will have in the long run more success betting that in 2013 their actual wins will look more like their 2012 pythag than their 2012 actual wins. This fact proves out across MLB history.

      It’s hard for me to come up with an explanation for such a result other than that the pythag expectation is providing a better estimate of a team’s underlying talent base than actual wins. This is not a value judgment: the goal of baseball is still to win games and championships, not to come out on top in a season-long net runs scored competition. And of course in all these matters we are always talking about probabilities, not certainties. In any particular case, a team might perform the next season closer to its past year’s actual wins than its past year’s pythag. But overall, in terms of probabilities, it is just a fact that the pythag expectation is measuring something real and predictive. In the long run, the way to have the best chance of winning the most games is to maximize your runs scored and minimize your runs allowed. The repeatable ability of a team to time its runs scored and runs allowed so as to do better than its overall runs scored and runs allowed would suggest is at best very unusual or very limited, with most such variations suggesting a random phenomenon and not a repeatable talent. Again, not meant as a judgment, just a factual observation.

      • The way (I think) that Bill James explained why Pythag works was that every team, however good or bad, will win and lose 30 games. The difference between teams, then, is what they do in the other 100 games that are “up for grabs”. This mitigates the potential Pythag distortion by meaningless runs in blowout games.

        • Doug, Let’s never forget the Cleveland Spiders of ’99 – even had they reached their Pythag Potential they would have had only 26 wins in a 154 game season. As it was, even a 162 expansion season couldn’t have brought them up to James’s theoretical minimum.

          birtelcom seems to me to be focusing on the right distinction: best talent vs. best record. Sometimes an underperforming team is a group of talented players who have bad seasons; sometimes it’s a group of players living up to their talent who somehow can’t win enough ballgames – just as there are teams of so-so players who somehow win more than “their share.” These are among the most interesting teams, and what makes baseball distinctive is that the nature of the game (revolving pitchers, long season, the precision difficulties of hitting, etc.) underlies the old chestnut that any team can beat any other “on any given day” – the odds based on talent are intrinsically unreliable. The Pythagorean numbers give us the odds based on talent, but the nature of baseball is that the odds are upset by other factors with a regularity unusual among sports (which is, of course, built into real-life betting lines). That frequency is neither luck nor random: it’s intrinsic to the game – though within that unusual space of unpredictability, there are factors that may be more [wind, pebbles, umpires thinking about supper] or less [manager decisions, player distraction] random, that is, beyond the control of player talent to influence.

          • I think the ’99 Spiders are an outlier – not a team fielded with the intention of trying to win. James’ remarks about the 30-30-100 breakdown were meant to be commentary on modern teams which are trying to be as competitive as they can be. Sometimes (Browns selloffs to pay bills, 1950s KC shuttle, Finley’s attempted fire sale, some other fire sales – Marlins, etc.) that point is challenged, but I don’t think the competitive line has been crossed as it was with the Spiders.

          • epm,

            Actually, pro baseball and football are fairly similar in how much elements other than pure skill play a role in deciding the outcome. Basically, any sport – any activity – with a tendency toward regression to the mean built into it will exhibit this quality.

          • Doug, I was just being facetious about the Spiders, a team that’s always fascinated me. Sorry to create a distraction (I guess I thought you’d simply roll your eyes and move on). James’ point is clearly well taken and you cited it appropriately.

            tag, I’m not sure you’re correct. I wasn’t just making this up, I was engaging in the critical work of repeating hearsay. Nevertheless, the basis that I remember hearing them say was cogent enough: apart from the intuitive factors I cited, the relative compression of season percentage records and the long-term results of W-L outcomes by opposing teams at the extremes were the data that I’ve seen cited more than once – unfortunately I have no recollection where, but I’ll look.

            I’m not quite sure I understand your last sentence, since what seems at issue is precisely differentials of degree in the tendency towards such regression, and what I was describing were “built in” factors that determine the degree in baseball (not all degrees are alike, as many Ivy League graduates are glad to point out). We certainly don’t see NFL/NBA type percentage extremes, where sub-.200 and super-.800 teams are not uncommon, in non-Spiders baseball.

          • epm,

            Basketball doesn’t exhibit nearly as much regression to the mean at a team level as baseball does. Jordan’s Bulls can and did play .850 ball over 82 games, and the worst team can and does play .200 ball. Neither regresses much at all. That worst team simply won’t beat a Jordan’s Bulls-type team in a seven-game series because skill plays a highly dominant role in the outcomes of basketball games.

            But the worst team in MLB can take a seven-game series from the best team. The odds may not be very good but they are not negligible.

            And in football, because only one game is played, the “better”/more talented team frequently loses (ask those near-perfect Patriots or the Packers last year). Turnovers, which no team truly controls, play a decisive role in the outcome of many games, as does injury unluck.
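The seven-game claim above can be made concrete with a short calculation. This is a minimal sketch that assumes every game is an independent coin flip with a fixed single-game win probability; the function name and the 0.35 figure are illustrative assumptions, not numbers from the thread.

```python
from math import comb

def series_win_probability(p_game, wins_needed=4):
    """Chance of taking a series when each game is won independently
    with probability p_game (best-of-seven when wins_needed=4)."""
    q = 1.0 - p_game
    # The series ends on the team's final win; before that win the
    # opponent can have anywhere from 0 to wins_needed - 1 victories.
    return sum(
        comb(wins_needed - 1 + losses, losses) * p_game ** wins_needed * q ** losses
        for losses in range(wins_needed)
    )

# Hypothetical underdog that wins 35% of single games against the favorite.
underdog_chance = series_win_probability(0.35)
print(f"Underdog wins a best-of-seven about {underdog_chance:.0%} of the time")
```

Under these assumptions an underdog that wins barely a third of the individual games still takes the series roughly one time in five, which is the sense in which the odds are "not negligible."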

          • tag, Well we agree on basketball, where you stick to a season basis (which is what I was using). But you switch to playoffs for baseball and football, where there is no chance of a worst/best match-up. I don’t get the comparative logic. The fact that you have “near-perfect” season-record Packers/Patriots in the first place is the point I’m making. Even the 1906 Cubs lost almost a quarter of their games, and every other modern team has lost more.

      • Good comment birtelcom, strong points. Do you have any references to James’ (or others’) work on this? I’ve read a little on it but the details were lacking and I’d like to see the specifics of his, or anyone’s, approach and algorithm.

        • Jim,

          There are a lot of articles about this, and all kinds of refinements of James’s initial concept, including versions that adjust for blowout victories (i.e. once the winning percentage in a game reaches some high-90% figure the additional runs scored/allowed are no longer factored in).

          I even think PE’s been looked at by advanced mathematicians:

          “Initially the correlation between the formula and actual winning percentage was simply an experimental observation. In 2003, Hein Hundal provided an inexact derivation of the formula and showed that the Pythagorean exponent was approximately 2/(σ√π) where σ was the standard deviation of runs scored by all teams divided by the average number of runs scored. In 2006, Professor Steven J. Miller provided a statistical derivation of the formula under some assumptions about baseball games: if runs for each team follow a Weibull distribution and the runs scored and allowed per game are statistically independent, then the formula gives the probability of winning.”

          I think other mathematicians have looked at it as well.
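For readers who want the arithmetic, here is a minimal sketch of the Pythagorean expectation in its classic form, with an exponent of 2, plus the exponent approximation exactly as it is stated in the quote above. The function names and the run totals in the example are illustrative assumptions, not figures from any team.

```python
import math

def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage from runs scored and runs allowed."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)

def expected_wins(runs_scored, runs_allowed, games=162, exponent=2.0):
    """Expected wins over a season of the given length."""
    return games * pythagorean_win_pct(runs_scored, runs_allowed, exponent)

def hundal_exponent(sigma):
    """Exponent approximation 2 / (sigma * sqrt(pi)) as given in the quote,
    where sigma is the standard deviation of runs divided by the mean."""
    return 2.0 / (sigma * math.sqrt(math.pi))

# Hypothetical team: 800 runs scored, 600 allowed.
print(round(pythagorean_win_pct(800, 600), 3))
print(round(expected_wins(800, 600), 1))
```

The refinements mentioned above (capping blowouts, fitting the exponent) all change the inputs or the exponent rather than the basic shape of the formula.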

      • Exactly. Heading into 2013, a team like the Baltimore Orioles should be expected to decline, unless they are able to improve their team in other areas. I would not count on two straight years of historic winning percentages in one-run games. In defense of the Orioles, they did substantially better in run differential as the season progressed. A full year of Machado, working in Bundy next year, and hopefully some key moves in the off season can keep them as a very competitive team, but the Orioles would be making a mistake if they ignored what their pythag shows.

    • I’m also not buying the notion that the Orioles are “lucky” because they won such a large number and proportion of one-run games.

      Throughout most of baseball history, relief staffs have been, at varying stages, practically nonexistent, ill-defined, or well-defined but a) poorly staffed or b) improperly utilized. For these reasons, managers have not had much control over allowing runs in the late innings of close games; thus, luck would play a significant role in who wins these games.

      Nowadays, relief staffs are more well-defined than ever and most teams are adopting advanced statistics to some extent or another. I’m not sure if teams are utilizing their pitching resources in the best way possible, but I don’t think it’s a coincidence that the Orioles had one of the best bullpens in baseball and a historically good record in one-run games. I believe that in the future, managers will figure out how to optimize their pitching resources in high-leverage situations, and there will be a strengthening of the relationship between record in one- and two-run games and quality of relief pitchers; this relationship may be becoming stronger already.

      • There is some connection between bullpen performance and one-run record, even though Tampa had the best bullpen in MLB but a 21-27 mark in 1RG.

        But since reliever performance is itself volatile, one-run records are not very predictive.

        And it’s hard to believe that teams are making progress towards optimal use of their bullpen resources. You still see almost all the best relievers being used as closers, i.e., pitching in save situations without regard to leverage.

        • I guess, then, one could say the O’s were lucky – but lucky more in terms of having a really good bullpen than winning so many one-run games.

          As advanced statistics, and people who talk about advanced statistics, become more accepted within the baseball community, I expect that managers will figure out how to use relief pitchers more effectively. (What would be the best way to go about this? If it’s a 3-2 game, should you bring your closer in in the 7th and see if you can get more than three outs from him? Or, is it better to risk it with your second-best guy and wait until the 8th?)

          Now that I think about it, I’m not sure what effect this would have on the correlation between one-run performance and quality of relievers. Because relief pitcher performance is pretty unpredictable (I should’ve realized this after looking at Fernando Rodney’s numbers), one-run performance probably will continue to not be a good predictor of future performance for any given team.

          If teams learn to use their pitching resources more effectively – which I think they will as advanced statistics, and people who talk about advanced statistics, get more exposure – then the correlation between bullpen performance and close-game performance should be separable into two factors: first, the quality of the relief staff; and second, the effectiveness of managerial tactics.

          • “What would be the best way to go about [finding the most effective use of ace relievers]?”

            A good starting place would be the study of this issue that Bill James ran in his Historical Baseball Abstract.

            I actually think the knowledge has already seeped into management, at least most front offices. But two obstacles remain before that knowledge will be widely applied: The players themselves will tend to resist until the better measures of relief performance pay off at contract time. And there is a widespread (and I would say self-fulfilling) belief among players and coaches that “the last 3 outs are different.”

      • Every time the Orioles won another one-run game or extra-inning game, the notion that their streak was purely luck faded a bit, but that doesn’t mean we can dismiss the role of luck entirely. Baseball is so full of luck- a ball down the line kicking off the chalk, an infielder being positioned just right to catch a scorching liner, an umpire giving a pitcher a few inches off the plate on a full-count pitch- that almost every win a team earns is the result of some luck. It follows that the closer the score of a game, the more luck was involved in the win. It’s impossible to win 29 of 38 one-run games without a healthy dose of luck. It’s also nearly impossible to do that without a strong bullpen, a savvy manager, some timely hitting, healthy players in the right positions, a good clubhouse environment…

        Crediting those wins entirely to the bullpen seems just as lazy as crediting them entirely to luck.
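One rough way to size the "healthy dose of luck" is to ask how often a team that is a pure coin flip in one-run games would go 29-9 or better. The sketch below assumes independent games and a 50% chance in each, which is a simplifying assumption and not a claim about the Orioles' true talent in close games.

```python
from math import comb

def prob_at_least(wins, games, p=0.5):
    """Binomial tail: chance of at least `wins` successes in `games` trials."""
    return sum(
        comb(games, k) * p ** k * (1 - p) ** (games - k)
        for k in range(wins, games + 1)
    )

# 29 or more wins in 38 one-run games for a coin-flip team.
tail = prob_at_least(29, 38)
print(f"{tail:.5f}")
```

A small tail probability says a true .500 team would rarely do this, but it cannot say how much of the gap is bullpen, managing, or timely hitting; raising `p` to reflect a good bullpen shows how quickly the result stops looking freakish.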

        • A serious complication in analyzing one-run records is that, unlike most other performance splits — home/road, lefty/righty, day/night, etc. — a one-run decision doesn’t exist as such until the last out. There are many different ways for a game to wind up with a one-run margin, and not all of them center on the things we tend to associate with one-run success.

          I don’t mean to denigrate the Orioles. But an argument for the role of luck in their one-run record is boosted by the fact that their bullpen K rate ranked next-to-last in the AL. Yes, they got the outs at crucial times — but over a longer period, they likely would not be as effective.

          • This is a very important point. There is a huge performance range between nearly blowing a four-run lead and erasing a four-run deficit, but both can result in a one-run win. So while it was interesting to know that the Orioles’ record in one-run games this season was historic, I think their extra-inning record was far more impressive and significant. Beyond going 16-2 in such games, it means that they won or tied 57 of 59 sudden-death innings. (That doesn’t include games tied after eight innings that were decided in the ninth; I didn’t do the research on that.)

  9. What about SRS? Simple Rating System = Run Differential + Strength of Schedule, expressed as the number of runs better (or worse) the team is than the average ML team. It’s on the front page of baseball-reference to see the full list.

    NYY (1.1)
    TBR (0.9)
    OAK (0.8)
    TEX (0.8)
    LAA (0.7)

    WSN (0.7)
    ATL (0.5)
    STL (0.4)
    CIN (0.2)
    SFG (0.2)
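For anyone curious how a rating of the form "run differential plus strength of schedule" can be computed, here is a minimal sketch of one common iterative approach: each team's rating is its average run margin plus the average rating of its opponents, repeated until the numbers settle. This is an illustration on made-up results, not a statement of how baseball-reference computes the figures above, and the simple iteration is only shown converging for this small balanced schedule.

```python
def simple_ratings(games, iterations=100):
    """games: list of (team_a, team_b, margin_for_a). Returns {team: rating}."""
    teams = sorted({t for a, b, _ in games for t in (a, b)})
    margins = {t: [] for t in teams}
    opponents = {t: [] for t in teams}
    for a, b, margin in games:
        margins[a].append(margin)
        opponents[a].append(b)
        margins[b].append(-margin)
        opponents[b].append(a)
    avg_margin = {t: sum(m) / len(m) for t, m in margins.items()}

    # Start from raw run differential, then fold in opponent strength.
    ratings = dict(avg_margin)
    for _ in range(iterations):
        ratings = {
            t: avg_margin[t]
            + sum(ratings[o] for o in opponents[t]) / len(opponents[t])
            for t in teams
        }
    return ratings

# Hypothetical three-team round robin: A beats B by 3, B beats C by 1, A beats C by 2.
demo = simple_ratings([("A", "B", 3), ("B", "C", 1), ("A", "C", 2)])
for team, rating in sorted(demo.items(), key=lambda kv: -kv[1]):
    print(team, round(rating, 2))
```

The ratings come out in runs per game relative to an average team, which is why gaps of a few tenths can look large next to one's subjective sense of the teams.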

    • So, by SRS, all the right NL teams made the playoffs, but a different story in the AL.

      But, not sure how to interpret the magnitude of those SRS numbers. The difference between those SRS numbers seems much larger than the subjective difference in quality between those teams.

    • In this case, the Cardinals get knocked down a few pegs for accumulating those WAR and that run differential against weaker competition. Seems fair to me.

      My post didn’t intend to crown one team evaluation system (or one team) as the best- only to question my gut feeling that the Rangers were the best team in baseball and the four remaining teams don’t seem all that good. Thanks for throwing out an alternative evaluation method.

  10. Off topic, but I need perspective and advice from my HHS friends. Reader’s poll. Tonight, Debate, Yankees-Tigers, or my daughter’s PTA, to be followed by Debate or Yankees-Tigers? Split screen is not realistic, although I suppose I could listen and then stare with the sound off. Baseball radio, with John Sterling, is not something that will help me hold up under the strain. But we are talking about three of my favorite things, child, baseball, politics. I think child comes first, but as to the rest, what say you all?

    • Record the debate, and catch the end of the game after the PTA.

      I never record games – too hard to avoid finding out what happens. Even if you don’t know how it turned out, you still know you can find out without watching the recording, which makes watching the recording not really feel like watching live (at least for me).

  11. Baseball, like pro football, has a great deal of luck built into its very core. Which is why a tournament / playoff format is not the best idea for it. When skill plays a highly dominant role in a sport or other activity, it’s fine to determine champions based on tournaments and playoffs. Garry Kasparov will annihilate every challenger (except for machines) for 15 straight years. Usain Bolt will capture back-to-back 100 and 200 gold medals in the Olympics and dominate a dozen Diamond League meets in between. Roger Federer will win six straight matches in a major and nine out of 12 majors while staying No.1 for almost six consecutive years. And the NBA’s best team will almost always take home the hardware every season. Baseball is a marathon, and running a 400-meter race after the marathon regular season to determine the champion is, um, less than ideal.

  12. This is a great, thought-provoking post, Bryan. It pretty much demands a statistical and philosophical debate on how to define “the best teams”, which HHS commenters have risen to.

    Bryan’s careful and fascinating consideration of the many ways to measure “best teams” statistically appears to have re-ignited the great ongoing HHS debate on Pythagorean vs. Actual Records, which now looks like a triangular debate:
    Pythagorean Wins vs. Actual Wins vs. Postseason Wins.

    This is a great time to have this debate, before the World Series, which is harder to argue with using statistics than a regular season record. And on that note, after 162 games, 2012 did not appear to be a regular season with any truly dominant teams in it, however you measure it.

    And to anyone who got dragged into a thread started by me being incompetent in the previous post, I can only apologize and promise to copy edit better.

    • Thanks for the kind words, Kid. I’m not sure this is a linear debate or a triangular one. There are a lot of ways to evaluate a team, as commenters have pointed out above. W-L has its value, so do pythag and team WAR and SRS and isolating recent numbers. There may even be an argument for the eye test here. Whatever team I’m rooting for, I don’t want them up one run against the Cardinals in the bottom of the ninth, not because any stat says Pete Kozma can hit, but because I start feeling like I’ve seen this movie before.

  13. Doesn’t anyone else actually enjoy watching the games without knowing how they are going to turn out? I get the feeling sometimes that folks think it’s just a matter of getting the perfect format, and once we do that we can accurately pick the outcome in advance because the “best” team will win.

    Why have play-offs at all?

  14. I’m always torn on arguments like this. Sometimes it feels like people want to just have the hitters take batting practice and give them mathematical scores. Have the pitchers throw at a target and mathematically chart their velocity, accuracy, and pitch movement. Then simply put in the numbers and see which team has the better players. You get where I’m going? It’s like the inputs to the game’s results become more important than the result. Don’t play a playoff, just have a computer run a few trillion permutations and tell us who would win.

    I guess I like looking into the infinite statistical depth of baseball but at the same time keeping in mind some days you’re just going to get beat by Raul Ibanez and that’s all there is to it.

  15. The only place where I’m really willing to concede that randomness has a substantial role in baseball game outcomes on a regular or quasi-common basis, is that hitters are usually not skilled enough to place balls specifically where the fielders are not, resulting in a degree of unpredictability in whether a batted ball will result in a base hit or not. Pretty much every other type of example I’ve seen given as supporting the supposedly important role of luck in baseball is some sort of rare case scenario that is not important. So over a game, or a few games, this can be important, but like all stochastic events, over a large sample size, even it is not. I’m only talking here about what goes on on the field.

    • In contrast, I generally think of the “three lucks” of baseball: (1) As you point out, the luck related to where balls fall in the field of play (2) the degree to which hits get bunched within innings to create runs, and (3) the degree to which runs get bunched within games to create wins. The second of those three is simply a result of the fact that when you put together a sequence of guys who each get on base 30% to 40% of the time, sometimes their times on base will get bunched together in an inning and create runs, and sometimes they will occur more spread out and thus be less productive in terms of run scoring. At least to some degree the timing of that bunching is random (or “luck”, if you will): just because you have a lineup of .280 hitters, that doesn’t mean that 28% of at bats every inning will be hits. Sometimes there will be three hits in one inning and a run or two will score, sometimes there will be one hit in each of three innings and no runs will score. This can be controlled to some degree by things like how a manager arranges his batting order, but much of this bunching is just random.

      My third “luck” is that one team can average 3 runs a game and another 5 runs a game, but sometimes when those two teams play a two-game series just as a matter of randomness the three-run-a-game team will score 6 in one game and none in the other, while the five-run-a-game team hits its exact run scoring average in both games, leading to a series split. Based on this two-game sample these two teams look equally skilled but they are not. Maybe a team can control that allocation of runs across games to some degree, but at least some of it remains random.
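The two-game example in the third "luck" can be written out directly. This is just the arithmetic from the paragraph above, with an assumed helper name; it is a worked illustration, not a simulation.

```python
def series_record(runs_a, runs_b):
    """Return (wins for A, wins for B) given each team's runs by game."""
    wins_a = sum(a > b for a, b in zip(runs_a, runs_b))
    wins_b = sum(b > a for a, b in zip(runs_a, runs_b))
    return wins_a, wins_b

three_run_team = [6, 0]   # averages 3 runs a game
five_run_team = [5, 5]    # averages 5 runs a game

print(series_record(three_run_team, five_run_team))
print(sum(three_run_team) / 2, sum(five_run_team) / 2)
```

The run averages still differ by two a game, yet the win column shows a split, which is exactly the gap between run totals and win totals that the paragraph describes.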

      • birtelcom, This doesn’t seem right to me. Maybe I’m missing something, but I see your model as mechanistic and static, whereas games are balances of fluid dispositions, reactions, efforts, and choices. What you’re calling luck is what I see as an interplay of skills.

        We all know it’s very hard for a batter to place the ball where he wants it, yet when even I hit fungos, I can get the ball pretty much to the kid I want to give fielding practice to in the manner I want the ball to go. Why can’t I do that in a game? Largely because of the pitcher’s skill – it’s skill that makes hitting hard, just not necessarily the batter’s. And hitters address this issue. Hitters study pitchers, consider game situations, and make constant adjustments in their swings that express trade-offs between maximizing their ability to place the ball and maximizing their power (of course, some find the odds better always to go with one option). They succeed to varying degrees as a product of their skills and choices and the pitcher’s. The luck of a fielder being where you hit the ball is a product of the limits of your skill and choices, the fielder’s skills, and the manager’s skill in positioning. It’s not random like the location of a meteor hitting the earth. (But wind, light rain, air temperature – to the degree that players can’t anticipate them, are matters of luck, truly out of the control of the people playing the game.)

        When your .280 lineup bunches hits, it’s generally because they’re getting leverage on a particular pitcher on a particular day – they’ll do that more often against less skilled pitchers or highly skilled pitchers whose abilities are dulled on a bad night, situations that are mitigated by a manager’s skill in deciding who’ll pitch at what point, which applies equally to the ways that specific hitters’ dispositional skills are deployed in high-leverage situational contexts. When an error lets a hitter on base it’s rarely luck, it’s a product of the limits of a fielder’s skill in that situation – it may have been “luck” in relation to the hitter, but not the fielder (unless, for example, a pebble turned the batted ball into a skill-proof play – that’s luck), so it’s not luck from an observer’s perspective. An example of something much more like luck is when a pitcher overpowers a hitter whose awkward swing results in an unfieldable dribbler – I’d call that luck: a result counter to the skill inputs – but that sort of play is rare in a high leverage situation.

        So my feeling is that outside of a pretty narrow range, anything you might call “luck” can be better expressed as a confluence of skill outcomes.

    • Jim,

      I think that’s the whole point. Achieving hits and scoring runs can’t really be controlled to a great extent, at least not nearly to the same extent that scoring can be controlled in, say, basketball. And other than through strikeouts, pitchers have highly imperfect control over how they record outs.

      My understanding of things is that the speed at which a sport (or any other activity) regresses to the mean basically tells you how much luck is involved in it. In fact, skill can basically be thought of, in any sport where such is the case, as a brake on this inexorable process of regression. I don’t know. With your math skills I imagine you could derive, given enough data, how quickly this process occurs in baseball. My hypothesis is that it’s rather quick and that therefore a good deal of luck is involved.

    • No, I don’t agree with the thrust of either of these comments, and indeed I feel like it would require several pages of writing to explain myself fully on the suite of topics involved, and having to get at least some work done today, which up to now I have decidedly not, it’s unfortunately going to have to wait for another time.

      • Jim,

        I’m very interested in reading what you come up with. Before you devote any real time to it, however, let me lay out a few concepts I’ve seen over the years. Most come from guys I occasionally work for. They’re math Ph.Ds and stats geeks who have a small company that seeks to identify fund and portfolio managers around the world who can consistently beat benchmarks and index funds (they’re looking for Warren Buffett in 1963). They basically get paid for separating skill from luck in evaluating fund management performance. In giving presentations, they make many analogies to sports and reference lots of studies about sports performance to engage and hold the attention of their audience members, mostly pension fund managers, who think of themselves as sophisticated investors but are really not, and who like me are not trained in statistical methodology. This is a sort of cobbled-together précis from stuff I’ve written/edited/read for them over the years:

        The outcomes of many activities, including sports, business and investing, are a combination of skill and luck. People usually recognize that skill and luck play a role in results but often have a poor sense of the relative contribution of each. Properly disentangling skill and luck contributes to better thinking about most day-to-day outcomes and enhances decision making.

        First, let’s define the terms. Skill is the capacity to use your ability, knowledge, experience, etc. to effectively and readily execute or perform. Skill can be considered either a process or a series of actions to achieve a specific goal. Luck comprises all extrinsic events or circumstances that operate for or against an individual/team. Luck, in this sense, is above and beyond skill. It’s a distribution with an average of zero and tends to be transitory.

        A good way of thinking about skill and luck is to place activities along a continuum that has all skill/no luck on one side and no skill/all luck on the other. (I’ve mentioned this before: the spectrum extends from, say, chess to roulette.) Most activities fall in between these extremes and combine both. Placing activities along this continuum requires thinking carefully about the forces that shape outcomes.

        To wit: the outcomes of any activity that combines skill and luck exhibit reversion to the mean (this is axiomatic). An extreme outcome (good or bad) will be followed by an outcome with an expected value closer to the mean. Reversion to the mean is a tricky concept, and the relative contributions of skill and luck shed light on its significance for various activities.

        Probably the biggest difficulty in assessing this relative contribution is that in most cases we can only observe outcomes. Outcomes at the extremes of the skill/luck spectrum present no problems because you know what you’re getting. But for the other activities (like baseball) that blend skill and luck, making a skill/luck assessment requires you to tidy up the jumble of outcomes.

        In thinking about the problem it’s helpful to imagine two urns, one for skill and one for luck. Each urn contains cards marked with numbers that follow a distribution. In simple form, a mean and standard deviation specify the distribution, and the luck urn will always have a mean of zero.

        Take a head-to-head matchup. Both participants – either individuals or teams – draw one number from a skill urn and one from a luck urn, and then add them together. The player with the higher number wins. The players then return the numbers back to the urn, draw again, and decide the next outcome.

        Consider again the skill/luck spectrum. You can mirror any point along it by varying the mean and standard deviation of the numbers in the urns. For instance, the luck urn in an all-skill activity has a mean and standard deviation of zero. Since all participants draw only zeros from that urn, only the skill number determines competitive outcomes. At the other extreme, all-luck activities have a skill urn with a zero mean and standard deviation, and only luck matters.

        Now lots of factors shift activities toward the luck side of the continuum. Two big ones are sample size and competitive parity. In cases where luck is normally distributed, the larger the sample the better we can observe skill. With decent sample sizes, you can start to place sports along the skill/luck spectrum by combining pure skill and pure luck distributions in a proportion that matches empirical results (usually this is done with a season as the sample size because you have to avoid the possibility of the skills themselves eroding and/or rosters changing too much in the case of teams). Such analyses start with three distributions: what would happen if luck determined the outcome of each game (your basic binomial model), what would happen if skill determined each game (a higher-skilled team always beats a lower-skilled one), and what actually did happen. Often cited is Brian Burke’s Advanced NFL Stats study of luck’s contribution to the win-loss record of NFL teams. He concluded it exceeds 50%. Tom Tango did a similar study of basketball. It came in around 11%. (These studies are easy to look up and I’m not defending them; they are one type of approach.)
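The "basic binomial model" for the pure-luck case can be sketched in a few lines. The idea, under the assumption that skill and luck are independent so their variances add, is to compare the spread of winning percentages a league of coin-flip teams would show with the spread actually observed. The function names and the 0.070 observed spread are hypothetical, and this is not presented as the exact method of the studies named above.

```python
import math

def binomial_luck_sd(n_games, p=0.5):
    """Standard deviation of winning percentage if every game is a coin flip."""
    return math.sqrt(p * (1 - p) / n_games)

def luck_share_of_variance(observed_sd, n_games):
    """Fraction of the observed variance in winning percentage that the
    pure-luck model alone would produce (capped at 1)."""
    luck_var = binomial_luck_sd(n_games) ** 2
    return min(1.0, luck_var / observed_sd ** 2)

# Hypothetical league whose teams' winning percentages have an SD of .070.
for label, games in [("16-game season", 16), ("82-game season", 82), ("162-game season", 162)]:
    print(label, round(binomial_luck_sd(games), 4))
print(round(luck_share_of_variance(0.070, 162), 2))
```

The shorter the season, the wider the pure-luck spread, which is one reason sample size pushes an activity toward the luck end of the continuum.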

        Various people, including a few math and stats profs, have done analyses of baseball that relate to this. They look at the correlations of different baseball stats from season to season. In one study (sorry I don’t have it to hand) the R2’s were less than 15% for BA and hitting singles. Unsurprisingly, strikeout rate correlates most highly year to year and is a good indicator of skill (R2 close to 70%). Homers had an R2 of about .5 and OBP was only about .25.

        The urn model also suggests another method of study on the basis of reversion to the mean (which I suggested in a previous post). The rapidity of mean reversion gives you clues about the contributions of skill and luck. Lots of skill makes outcomes “stickier” because good or bad luck is not enough to sway results (there’s no reversion to the mean in chess or checkers: Garry Kasparov and Marion Tinsley kicked their opponents’ asses over and over and over again for years, in the case of Tinsley for decades (though both did lose to machines)). The less skill there is the more the luck distribution takes over, and reversion to the mean tends to be rapid.

        I haven’t yet seen a good study according to these (and other) methods which even quasi-definitely places baseball on the skill/luck spectrum. Quick-and-dirty work by the guys I know suggests it’s around 40%. Take that for what you will, as well as the rest of this.
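The two-urn picture above lends itself to a quick simulation. This is a minimal sketch under stated simplifications: each side's skill is held fixed instead of being redrawn from a skill urn every game, luck is drawn from a normal distribution with mean zero, and all names and parameter values are illustrative assumptions.

```python
import random

def simulate_matchups(skill_a, skill_b, luck_sd, n_games=10_000, seed=0):
    """Fraction of games won by side A when each side's score is its
    fixed skill plus an independent draw from a zero-mean luck urn."""
    rng = random.Random(seed)
    wins_a = 0
    for _ in range(n_games):
        score_a = skill_a + rng.gauss(0.0, luck_sd)
        score_b = skill_b + rng.gauss(0.0, luck_sd)
        if score_a > score_b:
            wins_a += 1
    return wins_a / n_games

# Same one-point skill edge, increasingly wide luck urn.
for luck_sd in (0.0, 0.5, 5.0):
    print(luck_sd, simulate_matchups(1.0, 0.0, luck_sd))
```

With an empty luck urn the better side never loses; as the luck spread grows relative to the skill gap, the better side's winning percentage slides toward .500, which is the compression of records discussed earlier in the thread.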

        • Hi tag, just now seeing your essay, but definitely need to do some *actual work* for a change. Will read and respond as soon as I can, looks well considered on first glance.

          Also, I definitely think we should not discuss cycling at length on this blog, but I just encourage you to read all the USADA report including the affidavits from former team members regarding Armstrong. He is clearly 100% guilty, just like Bonds, Clemens and the rest of ’em.

          • Won’t discuss the cycling but I’ve seen the reports and interviews with his team members. Lance is obviously guilty, no question.

        • Hi tag,

          Just read this twice. It’s not entirely clear to me what your main point is here. A lot of words were expended to describe the basic idea that observational data have signal and noise mixed in various degrees. This is easily summarized simply by a linear, stochastic model y = a + bx + epsilon, where epsilon is a random variable with defined properties.

          Yes, things that are more stochastic are less predictable, for any given sample size, than things that are more deterministic.

          Luck is not a term that statisticians or scientists use, almost ever. It’s a colloquial term. It has vague and varying meanings, which is why it’s not used. Random variation

          I’m about 100% sure now that you are confusing sampling error (random variation, “chance”) with “luck”. They are not the same thing. I repeat, THEY ARE NOT THE SAME THING. This gets at what epm and I are arguing: you can have random variation due to temporal variations in display of skill, and THIS IS NOT THE SAME AS LUCK.
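The linear stochastic model mentioned above, y = a + bx + epsilon, is easy to simulate. The sketch below generates data from that model with normally distributed epsilon and reports R², the share of variance in y explained by x; the coefficients, sample size, and noise levels are arbitrary illustrative choices.

```python
import random

def simulate_linear(a, b, noise_sd, n=2000, seed=1):
    """Draw n points from y = a + b*x + epsilon, epsilon ~ Normal(0, noise_sd)."""
    rng = random.Random(seed)
    xs = [rng.uniform(0.0, 1.0) for _ in range(n)]
    ys = [a + b * x + rng.gauss(0.0, noise_sd) for x in xs]
    return xs, ys

def r_squared(xs, ys):
    """Squared Pearson correlation between xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy * sxy / (sxx * syy)

for noise_sd in (0.0, 0.3, 1.0):
    xs, ys = simulate_linear(a=0.0, b=1.0, noise_sd=noise_sd)
    print(noise_sd, round(r_squared(xs, ys), 3))
```

The model itself is silent on what epsilon is made of: sampling error, day-to-day variation in how skill is displayed, or anything else left out of x all land in the same term, which is the distinction being argued over here.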

  16. Bryan,

    Another factor that you might want to consider is the uneven playing field. You have alluded to it in your post, but there is a large inequity in payrolls amongst the MLB teams each year that has a huge impact on players, rosters, ability to get through the season.

    So, you could consider the “best” team the one providing best “Bang for Buck”. I’d like to define that measure as

    B4B = (Wins Above Avg * Bucks per Avg Win) / (Bucks per Win)

    where

    Bucks per Win = Team Payroll / Season Wins

    This year the average payroll was $98m, the average wins was 81, so the Avg Bucks per Win was $1.2m.

    Using this formula, the “best” teams in MLB 2012 were

    Oakland 26.7
    Washington 24.8
    Cincinnati 22.8
    Atlanta 17.8
    Baltimore 16.6

    The three worst were

    Astros -28.5
    Rockies -16.9
    Cubs -16.7

    I think that there is still some work to do :-)
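The B4B formula above translates directly into code. This is a minimal sketch using the league averages quoted in the comment (81 wins, $98m payroll); the function name is an assumption, and the team in the example is hypothetical since individual payrolls are not listed here.

```python
def bang_for_buck(wins, payroll_m, avg_wins=81.0, avg_payroll_m=98.0):
    """B4B = (wins above average * bucks per average win) / (bucks per win).
    Payroll figures are in millions of dollars."""
    bucks_per_avg_win = avg_payroll_m / avg_wins
    bucks_per_win = payroll_m / wins
    return (wins - avg_wins) * bucks_per_avg_win / bucks_per_win

# Hypothetical team: 90 wins on a $60m payroll.
print(round(bang_for_buck(90, 60.0), 1))
```

Because the numerator is wins above average, every sub-.500 team scores negative no matter how little it spends, and a cheap bad team is penalized more heavily than an expensive one with the same record.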

    • Mark, these numbers are interesting, but they speak more to efficiency than talent. Maybe the GM who wins the most with the smallest budget is the best, but the Yankees are no worse a team because they make so much money.

      Also, it might be worth tinkering with your formula until the Red Sox show up on the worst list.

      • Economic modeling is not my forte, Bryan, but I’ll give it a proper go this time :-)

        Let’s say we have an efficient market and that payroll is a predictor for Wins: The more money, the better the players, the more wins. Historically (1963 on, less strike years) the max win average is 102 (the min is 58). And further, let’s assume that the team that spends most will get the most wins and the team that spends the least will get the lowest number of wins.

        Doing the numbers on this year, the cost for a win above minimum was around $3m. This leads to an expected cost for the wins that a team actually achieved (for example, the Royals were 72-90, 14 wins above the historic min, thus the expected cost is 42.7m). We can then compare that to what they actually spent above min (5.7m in the Royals case) to get a Bang for Buck (B4B) figure (37.1 for the Royals).

        Doing this for the MLB teams, the best B4B 2012 are

        Oakland 109.8
        Washington 96.0
        Cincinnati 92.1
        Tampa 88.7
        Atlanta 81.8

        The worst five are

        Boston -84.4
        Philly -49.1
        Yankees -29.8
        Marlins -29.3
        Cubs -23.8

        Interestingly, using this analysis, most teams provided “value”. The only additional “negative value” teams were the Astros, Twins, Angels and Rockies.

        Fun stuff!

        • This looks better, but I have one issue. Why set the minimum wins at the yearly average minimum? In that case, you’re going to have teams every now and then paying money for negative wins. Seems like the baseline should be 40 to 45- somewhere in that replacement level range where a team full of players making the league minimum would be expected to fall. Then, of course, the minimum cost of a team would be 25 times the league minimum salary, so $12 million this year.

          I’d guess your best and worst teams wouldn’t change much, but I wonder if you’d still see most teams providing positive value.

  17. Fascinating discussion. I personally don’t believe there’s a “best” team left in the playoffs right now. To believe so one would have to buy into the concept that one team is so ostensibly better than the others right now that it’s actually favored to win a seven-game series over its opponent by a significant margin, and I don’t believe this to be true.

    • I’d rather say “there’s no favorite” than “there’s no best team”. Just because the margin of superiority is less than the random variation inherent in a short series doesn’t mean one team isn’t better. That’s true of just about any regular season game as well, but some teams are far better than others throughout the season.

      • Bryan, logically, if “the margin of superiority is less than the random variation inherent in a short series” then there’s no way of determining “the best team.” :-) Which is why, much as I like to watch the games, I’ve never placed much store in the postseason for sorting out such things.

      • My point was that it’s very debatable who the “best team left” is. I don’t think it’s clearly the Cardinals. I don’t think they have any clear advantage going forward over the Giants or either of the AL teams.

        I think you could play these semifinals and finals four times and come up with four different winners.

        • I agree that the Cards aren’t “clearly” the best team, but I do think they’re the best. If we agree that randomness accounts for at least the first quarter of the possible outcomes of a baseball game (based on the theory that a replacement-level team will win 40-45 games), and that there are no bad teams creating huge mismatches in the postseason, there’s probably at most a 60% chance that one team beats another in a playoff game (call it Verlander’s Tigers vs. the wounded Yankees). I think the Cards have at least a 55% chance of beating San Francisco in any given game. Play that out over a seven-game series, and there’s at least a 60% chance the Cards win (I’ve figured out that formula before, but I don’t know it off the top of my head; I’m sure someone here does). I think that’s enough to call them favorites. If St. Louis plays the Tigers, their chances might be down to 52 or 53% (based on a significant offensive advantage, a small disadvantage in the rotation, and an advantage in the bullpen), which of course means anything could happen, but it’s still worth noting they’re the favorites.
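
          The series formula mentioned above can be sketched in a few lines of Python. This assumes every game is an independent coin flip with the same per-game probability, which ignores home field and pitching matchups; the function name is just illustrative.

```python
from math import comb


def series_win_probability(p, wins_needed=4):
    """Chance of taking a series when each game is won independently with probability p."""
    q = 1.0 - p
    # The series ends on the winner's final victory. Before that game the winner
    # has lost k games (k = 0 .. wins_needed - 1), arranged in any order among
    # the first wins_needed - 1 + k games.
    return sum(comb(wins_needed - 1 + k, k) * p**wins_needed * q**k
               for k in range(wins_needed))


for p in (0.52, 0.55, 0.60):
    print(f"per-game {p:.0%} -> best-of-seven {series_win_probability(p):.1%}")
```

          Under those assumptions a 55% per-game edge works out to roughly a 61% chance of winning a best-of-seven, in line with the “at least 60%” estimate, while a 52% edge is only about 54%.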

          • “small disadvantage in the rotation” relative to the Tigers right now? I don’t think so. Nobody’s starting staff is doing remotely what the Tigers’ is right now. Valverde has apparently worked out his mechanical issues from what I’m hearing (though he certainly does have to prove it), and Coke has been close to spectacular. Dotel and Smyly have both pitched very well and Alburquerque is fairly lights out.

            I honestly don’t think any team has much of a chance against the Tigers right now.

          • Bryan, why are the Cards favored 55-45% over the Giants in any given game? Is this number based on a Vegas betting line? Arrived at after a quick calculation derived from PE? Plucked from thin air (or a darker place)? I’m not saying you’re wrong, and you’re certainly free to posit the Cards’ superiority, but it just seems like you’re making up this number.

          • Tag, you’re absolutely right that I pulled 55% out of thin air. Basically, the Cards are a far better hitting team (107 wRC+ to 99) with a better rotation ERA (3.62 to 3.73) and FIP (3.47 to 3.82), compiled in a higher-run environment. My point was that even if randomness has more impact on potential outcomes of this series than one team’s superiority, it’s not a 50/50 proposition if one team is demonstrably better. And I think the Cardinals are.

            And Jim, it’s a fair point that any advantage for St. Louis’s growing October mystique has to be offset by the Tigers’ rotation’s recent dominance. To test my lazy assertion using regular season numbers:

            Verlander 2.64 ERA, 2.94 FIP
            Fister 3.45, 3.42
            Sanchez 3.86, 3.53
            Scherzer 3.74, 3.27
            Avg 3.42, 3.29

            Lohse 2.86, 3.51
            Wainwright 3.94, 3.10
            Carpenter 3.71, 4.09 (SSS)
            Lynn 3.78, 3.49
            Avg 3.57, 3.55 (though Carpenter’s weighted equally here- I’m not being much less lazy)
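
            The averages above are simple unweighted means, which a short Python sketch can reproduce. The optional weights argument is a hypothetical hook for weighting by innings pitched; no innings totals are given here, so only the equal-weight version is run.

```python
def rotation_average(values, weights=None):
    """Mean of a stat across starters; pass weights (e.g. innings) for a weighted mean."""
    if weights is None:
        weights = [1.0] * len(values)   # equal weighting, as in the lists above
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# (ERA, FIP) pairs as listed above.
tigers = [(2.64, 2.94), (3.45, 3.42), (3.86, 3.53), (3.74, 3.27)]
cards = [(2.86, 3.51), (3.94, 3.10), (3.71, 4.09), (3.78, 3.49)]

for name, staff in (("Tigers", tigers), ("Cardinals", cards)):
    era = rotation_average([e for e, _ in staff])
    fip = rotation_average([f for _, f in staff])
    print(f"{name}: ERA {era:.2f}, FIP {fip:.2f}")
```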

            Park factors and quality of competition favor Detroit’s rotation a little, as does the possibility of Verlander going twice. So I’ll give you a reasonably significant advantage for Detroit.

            The Cards’ offense was better, but probably not by as much as I suggested (107 wRC+ to 105; most of their WAR difference is defensive). They match up pretty well.

            For what it’s worth, I’m rooting pretty hard for the Giants and Tigers, and my New England pessimism, combined with the Cardinals’ recent string of ninth-inning October comebacks, probably makes them seem more imposing than they actually are.

  18. What are the Yanks going to do with A-Rod? He’s owed $114 million over 5 years, which is too big to buy out or to trade unless they eat most of it. Keeping him around with the possibility of a further decline could be detrimental to the team. Has this already been discussed and I missed it?

    • PP: It was discussed a little but not much. And here’s the thing. It’s not $114 million. It’s more than that. He gets $6 million bonuses for passing Mays, Ruth, Aaron and Bonds in home runs. He’s got a good shot at passing Mays next year. And he could eventually catch Ruth. So that’s an additional $6-12 million.

      I’ve seen some odd talk about the Dodgers being interested in him. Makes no sense at all partially because of their current obligations but also because he’s being used more and more at DH (for obvious reasons). So I don’t know. I think they’re stuck with him.

      • Actually, the whole Yankees offseason should be interesting to watch. They have lots of free agents to decide what to do with: Swisher, Martin, Kuroda, Chavez, Ibanez, Andruw Jones. Pettitte is also a free agent, and there’s the obvious question of whether he’ll return. Soriano has an opt-out clause in his contract, so he could also choose to test free agency. Granderson and Cano are both on team options for 2013, though presumably they’ll both be picked up. Then you have the players trying to return from injuries: Rivera, Pineda, and Jeter (plus Gardner to a certain extent). And on top of all that, there’s the “what do you do about Alex” question. Yep, should be fun to watch!
