Why AL/NL WARs Differ in a Given Year (Hint: it’s more obvious than I thought)

Recently I made the “shocking” discovery that the AL and NL don’t have the same season WAR totals (on a per-team basis), even before interleague play. Of course I wondered why that is. After much verbal head-scratching on my part, Ed very kindly pointed out that the obvious answer I had been rejecting was, indeed, the answer:

WAR formulas are intentionally tweaked to reflect the relative strength of the leagues; otherwise, player WAR could not be used for any meaningful comparison across different leagues or seasons.

Here’s the explanation that appears on B-R:

[T]he leagues are not always equal in their quality levels as evidenced by things like inter-league play and also player performances when shifting leagues. Taking these differences into account [we] assign slightly different multipliers to the leagues, but centered on 20 for 162 game seasons and 19 for 154 game seasons. One example of this is the post-war integration. The National League integrated far more quickly than the American League and was a higher quality league until the 1970’s.

I still don’t know just how the difference in leagues is assessed. But as an exercise, I tried to see if league strength could be seen in the players who moved between the leagues during the period 1946-68 — the post-war, pre-expansion era, which is where I first noticed the large difference in league WAR. As it turned out, almost all of those who spent roughly 2 full seasons in each league looked better in the AL, relative to that league.

There were only 12 players who had 1,000+ PAs in both the AL and the NL during 1946-68. This list shows their OPS+ in each league for those years only:

  1. Chico Fernandez: AL 72, NL 61, +9 points of OPS+ in the AL
  2. Frank Bolling: AL 91, NL 79, +12
  3. Eddie Bressoud: AL 109, NL 81, +28
  4. Gino Cimoli: AL 84, NL 85, -1
  5. Tito Francona: AL 110, NL 92, +18
  6. Bill Bruton: AL 97, NL 95, +2
  7. Jackie Brandt: AL 103, NL 97, +6
  8. Harvey Kuenn: AL 112, NL 98, +14
  9. Roy Sievers: AL 127, NL 107, +20
  10. Dick Stuart: AL 119, NL 116, +3
  11. Frank Howard: AL 148, NL 125, +23
  12. Frank Robinson: AL 182, NL 150, +32

Average: +14 points of OPS+ in the AL.

And the ERA+ for the 23 pitchers with at least 429 innings* in each league during 1946-68:

  1. Jim Bunning: AL 116, NL 129, -13 points of ERA+ in the AL
  2. John Buzhardt: AL 100, NL 94, +6
  3. Gene Conley: AL 90, NL 107, -17
  4. Moe Drabowsky: AL 105, NL 94, +11
  5. Jack Fisher: AL 99, NL 84, +15
  6. Ron Kline: AL 118, NL 97, +21
  7. Mike McCormick: AL 94, NL 100, -6
  8. Cal McLish: AL 107, NL 91, +16
  9. Don McMahon: AL 141, NL 111, +30
  10. Stu Miller: AL 145, NL 107, +38
  11. Billy O’Dell: AL 128, NL 102, +26
  12. Claude Osteen: AL 111, NL 101, +10
  13. Milt Pappas: AL 113, NL 98, +15
  14. Juan Pizarro: AL 115, NL 88, +27
  15. Robin Roberts: AL 115, NL 113, +2
  16. Johnny Sain: AL 103, NL 109, -6
  17. Johnny Schmitz: AL 112, NL 106, +6
  18. Bob Shaw: AL 103, NL 107, -4
  19. Gerry Staley: AL 140, NL 100, +40
  20. Hoyt Wilhelm: AL 162, NL 130, +32
  21. Stan Williams: AL 111, NL 106, +5
  22. Jim Wilson: AL 98, NL 86, +12
  23. Al Worthington: AL 141, NL 102, +39

Average: +13 points of ERA+ in the AL.

These are small samples, but the consistency and the size of the difference strongly suggest that the NL had a significantly higher level of competition in this period. Eleven of the 12 hitters and 18 of the 23 pitchers had higher “+” ratings in the AL.
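
As a quick check on the averages and counts, here is a minimal Python sketch that recomputes them from the AL-minus-NL differences exactly as printed in the two lists. The variable and function names are made up for illustration.

```python
from statistics import mean

# AL-minus-NL differences, copied as printed in the two lists above.
hitter_diffs = [9, 12, 28, -1, 18, 2, 6, 14, 20, 3, 23, 32]
pitcher_diffs = [-13, 6, -17, 11, 15, 21, -6, 16, 30, 38, 26, 10,
                 15, 27, 2, -6, 6, -4, 40, 32, 5, 12, 39]


def summarize(diffs):
    """Return (mean difference, number of players who rated higher in the AL)."""
    return mean(diffs), sum(d > 0 for d in diffs)


hitter_mean, hitters_better_in_al = summarize(hitter_diffs)
pitcher_mean, pitchers_better_in_al = summarize(pitcher_diffs)

print(f"Hitters:  {hitter_mean:+.1f} OPS+ on average, "
      f"{hitters_better_in_al} of {len(hitter_diffs)} higher in the AL")
print(f"Pitchers: {pitcher_mean:+.1f} ERA+ on average, "
      f"{pitchers_better_in_al} of {len(pitcher_diffs)} higher in the AL")
```

These are unweighted averages: each player counts once, regardless of how many PAs or innings he had in either league.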

This doesn’t explain why the AL won 13 of those 23 World Series, but we all know that a 7-game series isn’t as telling as a full season.

___________________

* Why 429 IP? I only wanted to use one page of P-I results, and 429 IP was the total for the 200th guy in the NL in that span.

36 thoughts on “Why AL/NL WARs Differ in a Given Year (Hint: it’s more obvious than I thought)”

  1. Mike L

    “This doesn’t explain why the AL won 13 of those 23 World Series” The Yankees won ten out of fifteen times. The rest of the AL was 3-5.

    1. e pluribus munu

      Which suggests that for much of that period, the distribution of talent was probably less balanced in the AL – though not as different from the NL as the simple tally of Yankee pennants might indicate.

  2. Bells

    Is OPS+, then, not adjusted for league? Baseball-reference seems to suggest it’s not… copy/pasted explanation of the measure:

    Statistic Description: OPS+ 100*[OBP/lg OBP + SLG/lg SLG – 1] Adjusted to the player’s ballpark(s)

    Just checking, though, as I’m not really familiar with how these things are calculated. Why would OPS+ be only league-relevant, but WAR be adjusted for minor differences in league strength? Perhaps it’s simple, but my brain isn’t working so well today.
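
For illustration, the quoted OPS+ formula can be sketched in a few lines of Python. The function name and the sample numbers are hypothetical, and the sketch assumes the league OBP and SLG passed in have already been park-adjusted, as the B-R description implies.

```python
def ops_plus(obp, slg, lg_obp, lg_slg):
    """OPS+ per the quoted formula: 100 * (OBP/lgOBP + SLG/lgSLG - 1).

    lg_obp and lg_slg are assumed to be the park-adjusted averages of the
    player's OWN league, so the result is relative to that league only.
    """
    return 100 * (obp / lg_obp + slg / lg_slg - 1)


# Hypothetical league averages, not real data.
LG_OBP, LG_SLG = 0.320, 0.400

# A hitter who exactly matches his league's averages scores 100,
# no matter how strong or weak that league is.
print(round(ops_plus(0.320, 0.400, LG_OBP, LG_SLG)))

# A hypothetical hitter 25% above his league in both components.
print(round(ops_plus(0.400, 0.500, LG_OBP, LG_SLG)))
```

Nothing in the formula refers to the other league, which is why a player's OPS+ can shift when he changes leagues even if his underlying ability stays the same.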

      1. John Autin Post author

        b, I interpret that as saying OPS+ is adjusted for the offensive context of the league as compared to historic norms, not as compared to the other league in a given year.

        If OPS+ were adjusted for the strength of one league relative to its opposite number, then what I noted in the post about players with significant time in both leagues would be meaningless — it would not support the theory that the NL was stronger in that period.

          1. Bells

            No, thanks bstar, I had the same interpretation as you and was like ‘wait, wouldn’t that make John A’s comparison tautological’? But I figured I must be wrong (John is usually pretty rigorous), and am glad that I am, because otherwise the mystery would still be there.

  3. Forrest

    Interesting. So now that we’re about to have seasons where there’s an interleague game EVERY DAY, will WAR stop being tweaked to reflect the league the player’s in?

    1. John Autin Post author

      Forrest, I think I’m missing your point. Why would a change in number(?) and timing of future interleague games alter the rationale for weighting WAR according to league strength?

      1. e pluribus munu

        Wouldn’t any change in the number of games alter the formula for weighting? After all, if interleague games were 50% of all games, the context discrepancy would presumably have shrunk to zero.

        1. John Autin Post author

          True statement, e, but Forrest asked if they would/should stop tweaking the formula entirely. I said the intended changes wouldn’t alter the rationale for weighting WAR — not that they shouldn’t alter the specific weights used.

          1. e pluribus munu

            Right, JA, and another instance of my natural talent for making statements that are both trivially true and irrelevant. In this case I thought “every day” might have gotten confused with the concept of 50/50 balance. I craftily avoided clarity in my response.

    2. kds

      Won’t make that much difference. From 252 inter-league games to 300 out of 2430 for the full schedule.

      Let’s compare Mays in the NL to Mantle in the AL. I’ve chosen 1955-1962 to get Mantle’s best years; Mays’ best 8 consecutive years were 1958-1965, so this is a little unfair to him. Mays had a few more PA, being healthier, and is just ahead in brWAR, 69.2 to 68.1. Mantle was a considerably better hitter, 512 to 392 in Rbat, and Mays’ advantage in defense still leaves him 42 RAA behind, 553 to 511. This leaves Mays behind in WAA, 54.9 to 51.6. So how is Mays ahead in WAR? The difference between replacement level and average was 22 runs/162 games in the NL, but only 18 runs/162 games in the AL. The average AL player was about 4 runs per season worse than the average NL player.
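
The Mays/Mantle arithmetic can be laid out as a small Python sketch. It uses only the WAR and WAA figures quoted in the comment; treating WAR minus WAA as the "replacement" credit is the reading the comment implies, and the names are made up for illustration.

```python
# 1955-1962 totals as quoted in the comment above.
mays = {"WAR": 69.2, "WAA": 51.6}
mantle = {"WAR": 68.1, "WAA": 54.9}


def replacement_wins(player):
    """Wins credited for being average rather than replacement level (WAR - WAA)."""
    return player["WAR"] - player["WAA"]


mays_rep = replacement_wins(mays)
mantle_rep = replacement_wins(mantle)

print(f"Mays   replacement credit: {mays_rep:.1f} wins")
print(f"Mantle replacement credit: {mantle_rep:.1f} wins")
print(f"Mantle's WAA lead:         {mantle['WAA'] - mays['WAA']:.1f} wins")
print(f"Mays's replacement edge:   {mays_rep - mantle_rep:.1f} wins")
print(f"Mays's WAR lead:           {mays['WAR'] - mantle['WAR']:.1f} wins")
```

Mays's larger replacement credit more than offsets Mantle's lead in WAA, and that is the whole of his WAR edge. Part of the extra credit comes from the 4 runs per 162 games in league adjustment and part from his additional playing time.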

  4. no statistician but

    JA:

    As usual, I have to quibble a little. The pattern of several of the position players here—Bolling, Francona, Kuenn, Sievers—shows a lower OPS+ in the NL at least in part because they had entered into their declining years as performers when they switched leagues. Stuart’s two big years in the AL don’t match his two best in the NL, and it’s only his clunker 1962 season that draws down his NL numbers. Jackie Brandt spent his prime years in the AL and his early and late years in the NL. The two big dogs at the end of the list are the only ones that genuinely support the thesis, unless you count Bressoud, and I can’t help thinking that his numbers, in spite of the supposed park adjustments, were helped by playing in Fenway.

    The pitching list is too daunting for me to tackle in the time I have, but Bunning and Conley, showing negative figures, spent their declining years in the NL.

    It’s not that I didn’t think then and don’t think now that the NL was somewhat stronger in that era—it certainly had more big name stars—but something that seems to have gotten lost over time is that the style of play in the two leagues was far different. I don’t know how this can be factored in or out.

  5. Doug

    I think comment #11 from kds explains the league differences more succinctly (and with greater precision) than what might be inferred from a grab-bag of test case players with selection difficulties such as those identified by nsb in comment #10.

    Would be interesting to see the league-wide numbers kds mentions on a year-by-year basis, to see if they correspond to the perception that the AL has had a superior level of play in recent seasons.

    1. John Autin Post author

      Doug, I grant you that 12 position players hardly make up a valid test of AL/NL strength over a 23-year period. But I’m not sure how to do that test any better. Using a threshold lower than 1,000 PAs would bring in more players, but each would be more subject not only to randomness, but to the specific problem that players of that era were far more likely to change leagues after a poor year or two.

      The difficulties of such a study leave me wondering again at how B-R (S.F.) has arrived at the estimate of league strength. The only two factors mentioned by name are “inter-league play and also player performances when shifting leagues,” and of course there was no interleague play in that era.

      And the estimated differences are sometimes large. For example, in 1956, the NL is rated 20% better:
      – 1956 NL, 141.0(pos) + 101.5(pit) = 242.5 WAR
      – 1956 AL, 119.8(pos) + 82.7(pit) = 202.5 WAR.

      I don’t actually doubt that there’s a valid methodology behind that. But I’d love to know what it is.
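
The 20% figure follows directly from the four 1956 WAR totals quoted in the comment, as this short Python sketch shows. The variable names are illustrative only.

```python
# 1956 league WAR totals as quoted above: position players + pitchers.
nl_war = 141.0 + 101.5
al_war = 119.8 + 82.7

# How much larger the NL total is, as a fraction of the AL total.
nl_edge = nl_war / al_war - 1

print(f"NL total: {nl_war:.1f} WAR")
print(f"AL total: {al_war:.1f} WAR")
print(f"NL edge:  {nl_edge:.1%}")
```

Both leagues had eight teams in 1956, so the same ratio holds on a per-team basis.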

      1. Doug

        John, This was the kds quote that intrigued me.

        The difference between replacement level and average was 22 runs/162 games in the NL, but only 18 runs/162 games in the AL. The average AL player was about 4 runs per season worse than the average NL player.

        It sounds like kds has looked at WAR minus WAA on a league-wide (all players) basis, and then converted to RBAT. Subtracting one from the other may not be entirely legitimate statistically, but I liked the approach conceptually, based on the notion that replacement level should be about the same for both leagues (i.e. the same minor league call-up players), but average would be higher in the league with more top-echelon talent. What I didn’t know was whether (or where) WAR and WAA could be found already calculated for an entire league.

      2. Bells

        That’s still what bothers me too, John. The 1956 differences you cite are large, especially considering that a) the leagues were closed circuits save for a 4-7 game playoff series, and b) the number of samples of players switching, although telling a pretty clear story from your original post’s analysis, is really still small. A simple ‘sign test’ of binomial probability would certainly show it unlikely to randomly get 11 out of 12 position players to be better in one league, and without doing the math, 18 out of 23 pitchers being better in the AL is unlikely too, possibly to a significant degree. So I’m sure the basis is sound for that conclusion. Still, I’d like to see the numbers.
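
The sign test mentioned above is easy to run. This is a minimal sketch using only the Python standard library. It assumes a one-sided test in which, if the leagues were equal, each player would be equally likely to rate higher in either one; the function name is made up for illustration.

```python
from math import comb


def sign_test_p(successes, n):
    """One-sided binomial tail: P(X >= successes) for X ~ Binomial(n, 0.5)."""
    return sum(comb(n, k) for k in range(successes, n + 1)) / 2 ** n


p_hitters = sign_test_p(11, 12)    # 11 of 12 hitters rated higher in the AL
p_pitchers = sign_test_p(18, 23)   # 18 of 23 pitchers rated higher in the AL

print(f"Hitters:  p = {p_hitters:.4f}")
print(f"Pitchers: p = {p_pitchers:.4f}")
```

Both tail probabilities come out well under 1%, so splits this lopsided would be unlikely by chance alone. The test says nothing about selection effects, such as players switching leagues during their decline years, which is the objection raised elsewhere in the thread.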

    2. John Autin Post author

      “Would be interesting to see the league-wide numbers kds mentions on a year-by-year basis, to see if they correspond to the perception that the AL has had a superior level of play in recent seasons.”

      It would indeed, but I don’t know how to convert the WAR numbers to the replacement-level adjustments kds cites.

      On the WAR level, the AL has been rated 20%-21% stronger each of the last 5 years, both players and pitchers.

        1. John Autin Post author

          Ed, that’s a good find. Still, I’m not sure that the phrase means what you’re saying. It could refer to the varying quality of competition faced within one’s own league.

          For example, a hitter on the ’93 Braves did not bat against his own historically great pitching staff, but did get 13 games against the historically high-yielding expansion Rockies, who allowed almost 6 R/G.

          BTW, Atlanta swept that series, 13-0, averaging 8.2 R/G. David Justice hit .392/1.269 with 6 HRs; Jeff Blauser hit .426/1.269 with 4 HRs and 15 RBI (15 and 73 for the year), etc.

          1. Ed

            Perhaps, John, though as far as I’m aware the place in which bWAR “Varies Replacement Level” is between the leagues. I’ve never seen anything that would suggest that players within the same league are assigned different replacement levels based on not facing their own teammates. They may make adjustments for that, but making adjustments is different than “varying replacement level”.

          2. John Autin Post author

            Ed, you’re prob’ly right. I’m starting to get out of my depth in the replacement-level discussion.

  6. Ed

    So nice to be mentioned in a HHS post! 🙂

    As for the World Series, I have a few different thoughts.

    1) One thing to look at would be to total the batting and pitching WAR for each team in the WS and see how often the team with the higher WAR won. As commenters 1, 2 and 15 have noted, just because league A is stronger than league B doesn’t mean that the top team from league A is stronger than the top team from league B.

    2) Another thought is this…one thing we know about the playoffs is that teams tend to use fewer players than they do during the regular season. So while the NL may have been a deeper league overall, during the WS, when the teams are just relying on their very, very best that may not matter as much. It’s possible that the NL was deeper than the AL due to earlier integration but not necessarily stronger among the top players.

    1. Doug

      The fact that the Yankees so utterly dominated the AL during this period would also support the premise of a lower quality of play in the AL. It is easier to come out on top consistently when playing against lesser opponents.

      1. no statistician but

        Doug:

        The Dodgers and Giants together did a fair job of dominating the NL in that era, with 13 titles (Dodgers with 10) to the Yankees’ 15, so I’m not sure your argument is that strong. As for the rather startling 20% difference between leagues in 1956 cited by JA @ #18, I suspect it’s a crock, but I have no way to substantiate my suspicion. The NL had a three-team race that went down to the wire, and the Yankees pulled away from the pack in early July, but in terms of competition within the leagues, the rest of the NL finished under .500, while there was only a 6-game gap between 2nd and 5th place in the AL. My point is that, away from the leaders, how do you evaluate strength: the NL’s big 3, little 5, vs. the AL’s big 1, medium 4, little 3?

        As for the dominance of Black players in the NL, that would be far more telling if there had actually been a lot of them. In 1956 their numbers were just beginning to grow from those on the three pioneer teams, Dodgers, Giants, and Indians to the rest of the teams in both leagues.

        Take a look at the rosters, if you don’t believe me.

        1. John Autin Post author

          nsb — “not a lot” is a reasonable count of the number of African-American & Latino stars in 1956. “A lot more than the AL” is another way to describe it.

          A Mays here, an Aaron there, a Banks here, a Frank Robby there — pretty soon you’re talking about real talent.

          Here’s my completely amateur count, by team, of 1956 players who would not have been allowed to play in 1946. Players before the slash are regulars and SPs; after the slash are reserves. (?) marks players about whom I’m not certain they’d have been kept out during segregation.

          NATIONAL LEAGUE

          Dodgers — Jackie Robinson, Roy Campanella, Jim Gilliam, Sandy Amoros, Don Newcombe / Charley Neal, Chico Fernandez

          Giants — Willie Mays, Bill White, Ruben Gomez / Hank Thompson

          Braves — Henry Aaron, Bill Bruton / Wes Covington, Felix Mantilla

          Cubs — Ernie Banks, Gene Baker, Monte Irvin / Solly Drake

          Reds — Frank Robinson / Joe Black, George Crowe, Bob Thurman, Curt Flood, Chuck Harmon

          Pirates — Roberto Clemente / Curt Roberts

          Cardinals — / Charley Peete

          Phillies — None.

          AMERICAN LEAGUE

          Yankees — Elston Howard

          Indians — Bobby Avila, Chico Carrasquel, Al Smith

          White Sox — Minnie Minoso, Larry Doby, Luis Aparicio(?) / Earl Battey, Connie Johnson

          Orioles — Connie Johnson / Charlie Beamon

          Senators — Jose Valdivielso, Camilo Pascual(?) / Carlos Paula

          Athletics — Vic Power, Hector Lopez, Harry Simpson / Jose Santiago

          Red Sox — None

          Tigers — None

          I think you’re right that it doesn’t account for the full talent difference between the leagues. But it’s not trivial, either.

          1. Richard Chester

            Why is Chico Carrasquel on that list? His uncle Alex already had 8 years in the ML prior to his (Chico’s) arrival.

          2. John Autin Post author

            Richard, I’m sure I whiffed on a few. I was just looking at the picture and birthplace. You have to admit there weren’t many players from Venezuela before integration — in fact, Alex Carrasquel was the first, and the only Venezuelan with more than 5 games before Chico.

            I simply didn’t know about Alex.

  7. kds

    If you go to a player’s “value” section, where the WAR computations are shown, and highlight a part of the player’s career, you will get not only the totals of each column for that period, but also averages per year and per 162 games. So you can just look at Rrep per 162 games played. That is what I did for Mays/Mantle. For pitchers it is a bit more difficult, since there isn’t an Rrep column. But you do have RAA and RAR, so the difference should be Rrep. (You have to be a little careful, since different pitchers can have different numbers of IP per season.) I looked at Whitey Ford and Warren Spahn over the same 8-year period I used for Mantle/Mays. Per 9 IP, Spahn had just about exactly a difference of 1 run between RAA and RAR. For Ford this was about 5/6 of a run per 9 IP. Extend that to 24*9=216 IP and you get a 4-run difference between replacement level and average for those pitchers in the different leagues over those years.

    Technical note: I applied the “WAAadj” to RAA even though it is shown as after WAA on the way to WAR. This adjustment is much more important for relief pitchers because it takes into account whether they pitched in high-leverage situations or not. It is also used to balance WAA; without the adjustment the 2012 NL total pitching WAA was 12.0 and the AL was 15.3. (Do pitchers come from Lake Wobegon?) Almost all regular starters have a yearly WAAadj of -0.1, or about 1 run. Relievers who pitch well in high-leverage situations can have a significant positive WAAadj.
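
The Ford/Spahn arithmetic can be written out as a short Python sketch. The per-9-IP gaps are the approximate figures given in the comment, and the 216-inning workload (24 nine-inning games) is the comment's own scaling choice; the names are illustrative.

```python
# Approximate (RAR - RAA) per 9 IP over the 8-year span, as stated above.
spahn_gap_per_9 = 1.0      # Warren Spahn, NL
ford_gap_per_9 = 5 / 6     # Whitey Ford, AL

# Scale to a common workload of 24 nine-inning games.
games = 24
innings = games * 9

league_gap_runs = (spahn_gap_per_9 - ford_gap_per_9) * games

print(f"Workload: {innings} IP")
print(f"NL-minus-AL replacement gap: {league_gap_runs:.1f} runs")
```

This matches the 4 runs per 162 games found for position players, so the hitter and pitcher figures tell the same story about how the two leagues were rated.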

  8. kds

    Now that I’ve done this work, I reread the explanations at B-ref of how they figure WAR. In the section on position players, down near the bottom, there is a table showing the # of replacement runs for a full time player for each league in each year, 1871-2012. Interestingly, the NL has a substantial lead in 1946, the year before the start of integration. The advantage for the NL continues to 1969.

