Who are the best starting pitchers of the past 60+ years? One way to answer that question is using RE24, the measure of how much a pitcher reduces his opponents’ run expectancy with each batter faced.
Starting from each of the 24 base-out states (ranging from nobody on, nobody out to bases loaded, two out), there is an expected number of runs a team will score in the remainder of that inning, based on average hitters facing average pitchers. With the result of each plate appearance, a pitcher is credited with the resulting change in run expectancy (which can be positive or negative) less any runs allowed.
RE24, then, tells you how many runs a pitcher saved or cost his team relative to the average pitcher in the same base-out situations. Over the course of a career, the batters each pitcher faces will collectively approximate an average batter, allowing some reasonable basis for comparing different pitchers (with the possibly large caveat that RE24 does not adjust for park factors, team defense or other factors).
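For anyone who wants to see the bookkeeping, here is a minimal Python sketch of the per-plate-appearance credit described above. The run-expectancy numbers are made-up placeholders (not actual league values), and the state labels and function names are my own, chosen only for illustration.

```python
# Illustrative run expectancies keyed by (bases, outs). These are made-up
# placeholder numbers for the sketch, NOT actual league values.
RUN_EXP = {
    ("---", 0): 0.50,  # bases empty, nobody out
    ("1--", 0): 0.90,  # runner on first, nobody out
    ("---", 2): 0.10,  # bases empty, two out
}

END_OF_INNING = None  # three outs: no further runs expected


def run_exp(state):
    """Expected runs for the rest of the inning from this base-out state."""
    return 0.0 if state is END_OF_INNING else RUN_EXP[state]


def pitcher_credit(before, after, runs_scored):
    """Drop in run expectancy, less any runs that scored on the play."""
    return run_exp(before) - run_exp(after) - runs_scored


def re24(plate_appearances):
    """Sum the per-plate-appearance credits."""
    return sum(pitcher_credit(b, a, r) for b, a, r in plate_appearances)


# One hypothetical inning: single, double play, solo homer, strikeout.
inning = [
    (("---", 0), ("1--", 0), 0),
    (("1--", 0), ("---", 2), 0),
    (("---", 2), ("---", 2), 1),
    (("---", 2), END_OF_INNING, 0),
]
inning_re24 = re24(inning)
```

Over a complete inning the intermediate terms cancel, so the total is simply the starting run expectancy minus the runs allowed (with these placeholder values, 0.50 minus 1 run).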
After the jump, the top 50 since 1950.
I’m using the metric RE24/9 to show the number of runs per 9 innings that a pitcher was better than the average pitcher in the base-out states that the pitcher faced. To qualify, a pitcher must have compiled 2000 IP since 1950. Only seasons since 1950 are counted (the data back to 1950 are mostly complete, with some data back to 1945, and scant data prior to that). Of the 220 qualifying pitchers, these are the top 50 in RE24/9.
[table id=162 /]
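A small sketch of the rate calculation and the qualifying cutoff, for those following along at home. One assumption worth flagging: innings totals are written in the usual box-score style, where the digit after the point counts outs, so "5282.1" means 5282 1/3 innings. The function names and the example pitcher are hypothetical.

```python
def innings(ip_text):
    """Convert box-score innings such as "5282.1" to a number of innings.

    The digit after the point counts outs (thirds of an inning), so
    "5282.1" means 5282 1/3 innings, not 5282.1.
    """
    whole, _, outs = ip_text.partition(".")
    outs = int(outs or 0)
    if outs not in (0, 1, 2):
        raise ValueError(f"unexpected innings notation: {ip_text!r}")
    return int(whole) + outs / 3


def re24_per_9(re24_total, ip_text):
    """Runs better than average per nine innings."""
    return 9 * re24_total / innings(ip_text)


def qualifies(ip_text, minimum=2000):
    """Apply the 2000 IP cutoff used for this list."""
    return innings(ip_text) >= minimum


# Hypothetical pitcher: +300 career RE24 over 2700 innings.
rate = re24_per_9(300, "2700.0")
```

In this made-up case the pitcher was one run per nine innings better than average.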
The column labeled boLI stands for base-out leverage index, which is a measure of how much leverage was associated with each plate appearance. What is meant by leverage? Essentially, this is a measure of how much variability there is in the run expectancies that could result from a given PA. Without getting too technical, from a base-out state with m baserunners and n outs, a PA can obviously only result in base-out states with a maximum of m+1 baserunners and a minimum of n outs. A weighted average of the differences (absolute values) in run expectancies between the current base-out state and each of the possible subsequent states (with the weights corresponding to the empirical probability for each transition) yields the boLI for that plate appearance.
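To make that weighted average concrete, here is a rough Python sketch. It follows only the description above (absolute change in run expectancy, weighted by transition probability); any further scaling that a published boLI figure may apply is not covered, and the probabilities and run expectancies in the example are invented for illustration.

```python
def raw_bo_li(current_re, transitions):
    """Probability-weighted mean absolute change in run expectancy.

    transitions: iterable of (probability, next_state_run_expectancy) pairs
    covering the states reachable from the current base-out state.
    """
    transitions = list(transitions)
    total_p = sum(p for p, _ in transitions)
    if total_p <= 0:
        raise ValueError("transition probabilities must sum to a positive number")
    weighted = sum(p * abs(next_re - current_re) for p, next_re in transitions)
    return weighted / total_p


# Hypothetical two-outcome example: bases empty, nobody out (RE 0.50);
# 70% of the time an out (RE 0.27), 30% a runner reaches first (RE 0.90).
example = raw_bo_li(0.50, [(0.7, 0.27), (0.3, 0.90)])
```

The wider the spread of run expectancies a plate appearance can lead to, the larger this number gets, which is why it climbs with runners on base.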
Intuitively, boLIs are higher with more runners on base and lower if fewer. Thus, the pitchers with lower boLI numbers are those better at keeping men off base, while pitchers with higher numbers are less adept at this. By dividing boLI into RE24, RE24 is normalized (or “de-leveraged”) by showing how well pitchers did relative only to the range of outcomes possible from their base-out states. If I’ve confused the heck out of you, this explanation may be easier to grasp.
Ranking by RE24boLI/9 (boLI divided into RE24/9), those same fifty pitchers look like this.
[table id=167 /]
And, combining the two measures, here are our 50 pitchers, ordered by the sum of their rankings in the previous two lists.
[table id=168 /]
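The combined ordering is just a sum of two ranks. A minimal sketch, assuming three hypothetical pitchers A, B and C with invented numbers (none of these values come from the tables), and assuming ties in the rank sum are broken alphabetically, which is my choice for the sketch:

```python
def rank_desc(values):
    """Map each name to its 1-based rank, highest value first."""
    ordered = sorted(values, key=values.get, reverse=True)
    return {name: i + 1 for i, name in enumerate(ordered)}


def combined_order(re24_9, re24boli_9):
    """Order names by the sum of their ranks in the two lists."""
    r1 = rank_desc(re24_9)
    r2 = rank_desc(re24boli_9)
    # Ties in the rank sum are broken alphabetically (an arbitrary choice).
    return sorted(r1, key=lambda name: (r1[name] + r2[name], name))


# Invented figures for three hypothetical pitchers.
re24_9 = {"A": 0.90, "B": 0.80, "C": 0.85}
boli = {"A": 1.05, "B": 0.90, "C": 1.10}

# De-leveraged rate: boLI divided into RE24/9.
re24boli_9 = {name: re24_9[name] / boli[name] for name in re24_9}

order = combined_order(re24_9, re24boli_9)
```

Here B is last of the three by RE24/9 but first after de-leveraging, which is enough to move him ahead of C in the combined order.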
Finally, here are some other notable pitchers and their rankings (out of 220) in both RE24 and RE24boLI.
Player | RE24/9 | RE24 Rk | RE24 boLI/9 | RE24 boLI Rk | IP |
Don Sutton | 0.512 | 55 | 0.696 | 38 | 5282.1 |
Gaylord Perry | 0.518 | 53 | 0.633 | 43 | 5350.0 |
Nolan Ryan | 0.471 | 58 | 0.636 | 42 | 5386.0 |
Fergie Jenkins | 0.517 | 54 | 0.590 | 46 | 4500.2 |
Jon Matlack | 0.498 | 56 | 0.601 | 45 | 2363.0 |
Jim Bunning | 0.527 | 52 | 0.520 | 58 | 3760.1 |
David Wells | 0.464 | 61 | 0.561 | 52 | 3439.0 |
Steve Carlton | 0.475 | 57 | 0.523 | 57 | 5217.2 |
Orel Hershiser | 0.434 | 68 | 0.565 | 50 | 3130.1 |
Dean Chance | 0.469 | 59 | 0.515 | 59 | 2147.1 |
Freddy Garcia | 0.530 | 51 | 0.452 | 70 | 2264.0 |
Mark Gubicza | 0.436 | 67 | 0.541 | 56 | 2223.1 |
Frank Viola | 0.463 | 62 | 0.458 | 67 | 2836.1 |
Vida Blue | 0.440 | 63 | 0.434 | 74 | 3343.1 |
Rick Reuschel | 0.405 | 73 | 0.462 | 64 | 3548.1 |
Jerry Koosman | 0.406 | 72 | 0.425 | 76 | 3839.1 |
Milt Pappas | 0.412 | 71 | 0.412 | 77 | 3186.0 |
Frank Lary | 0.433 | 69 | 0.404 | 81 | 2162.1 |
Mel Stottlemyre | 0.437 | 65 | 0.368 | 86 | 2661.1 |
John Lackey | 0.350 | 79 | 0.410 | 78 | 2065.1 |
Catfish Hunter | 0.387 | 74 | 0.373 | 85 | 3449.1 |
Tommy John | 0.344 | 81 | 0.354 | 89 | 4710.1 |
Jack Morris | 0.284 | 94 | 0.352 | 90 | 3824.0 |
Phil Niekro | 0.310 | 90 | 0.290 | 107 | 5404.0 |
Kenny Rogers | 0.234 | 108 | 0.321 | 97 | 3302.2 |
Johnny Podres | 0.321 | 89 | 0.223 | 118 | 2265.0 |
Mike Flanagan | 0.226 | 111 | 0.222 | 119 | 2770.0 |
A.J. Burnett | 0.243 | 103 | 0.198 | 127 | 2353.2 |
Frank Tanana | 0.254 | 100 | 0.173 | 135 | 4188.1 |
Harvey Haddix | 0.214 | 116 | 0.189 | 133 | 2235.0 |
Lew Burdette | 0.169 | 133 | 0.224 | 117 | 3067.1 |
Mickey Lolich | 0.201 | 121 | 0.179 | 134 | 3638.1 |
Fernando Valenzuela | 0.187 | 127 | 0.169 | 139 | 2930.0 |
Scott McGregor | 0.174 | 132 | 0.172 | 136 | 2140.2 |
Wilbur Wood | 0.186 | 128 | 0.165 | 140 | 2684.0 |
Charlie Hough | 0.163 | 134 | 0.133 | 148 | 3801.1 |
Dave Stewart | 0.123 | 146 | 0.075 | 159 | 2629.2 |
Jim Kaat | 0.055 | 161 | 0.097 | 153 | 4530.1 |
Jerry Reuss | -0.027 | 176 | 0.084 | 154 | 3669.2 |
Rick Sutcliffe | 0.008 | 173 | 0.073 | 160 | 2697.2 |
Tim Wakefield | 0.053 | 164 | 0.030 | 171 | 3226.1 |
Jim Clancy | -0.092 | 189 | 0.020 | 172 | 2517.1 |
Ryan Dempster | -0.044 | 181 | -0.128 | 190 | 2387.0 |
Livan Hernandez | -0.117 | 197 | -0.214 | 207 | 3189.0 |
Bill James talks about pitching “families” in the NBJHBA, groups of hurlers with similar defining characteristics (e.g., the Robin Roberts ‘family,’ which includes Fergie Jenkins and Catfish Hunter: RH guys with “good fastballs and a strong commitment to strike zone,” lots of innings, high SO/BB ratio, lots of HRs).
James is only concerned with his top 100 pitchers in the commentary, and he has to admit that some few defy this categorization and are one of a kind. Among the latter non-group is Whitey Ford, and it seems to me that your breakdown here supports that evaluation. His lines here look like nobody else’s, and the disparity of rankings (5th in the first chart and 16th in the second), while still placing him far higher than recent WAR-based devaluations have, indicates in particular his uniqueness.
What’s really surprising is how much better Ford shows up than the other starters from his own era, in the presence of Spahn and Pierce. It’s been suggested a couple of times at HHS in other posts that Pierce was as good as or better than Ford, and he did have a couple of excellent seasons, but any close analysis puts Billy a full measure behind Whitey. That being said, Ford was probably never the best pitcher in the league over a full season, but from 1953-1965, 13 years, he was certainly the best pitcher in the AL year after year, even if he did pitch in Yankee Stadium with good fielding behind him.
His WAR is hurt by Stengel’s 5 and 1/2 man rotation that kept his innings down, plus two years lost to the military. An interesting fact I stumbled across: among pitchers with 100 or more IPs in 1950, rookie Whitey’s 2/3 season of mixed starts and relief produced the lowest ERA, 2.83 to leader Early Wynn’s 3.20, and the highest ERA+, 153, to Ned Garver’s 146. Missing 1951 and 1952 in the service? Consider the latest ready-to-go-from-the-start player, Mike Trout. With the draft in place, we’d all be wondering now, instead of having confirmation, about how his career was going to progress.
Whitey also benefited from not having to pitch to the Yankees.
True, but that’s countered by Stengel’s pattern of saving Ford to face the toughest non-Yankee opponents.
During Ford’s tenure with the Yankees (1950-1967) he had the fourth highest W-L% (.606) in the ML vs. teams with a .500+ winning percentage and the fourth lowest ERA (2.83). That’s for pitchers with a minimum of 40 victories against .500+ teams.
RC:
Who were those three other pitchers? Actually, I know that one was Koufax, who is kind of the anti-Ford, considering his cluelessness for so long, plus the huge numbers of innings he put up once he got on top of his game.
@5
He was behind Ed Lopat (.667), Koufax (.651) and Marichal (.629) in W-L% and behind Wilhelm (2.63), Koufax (2.75) and Bob Veale (2.77) in ERA.
RC:
When you consider the full careers of Lopat, Marichal, and Veale, not just the overlap with Whitey’s, their numbers drop below his. Wilhelm, of course, was mainly a reliever, starting only 50-some games, over half of them in 1959, and while he did excellently as a starter, I’m not sure he belongs in the discussion. That leaves only Koufax, really.
In game four of the 1963 WS, Koufax won over Ford 2-1, a 3-base error by Joe Pepitone allowing Jim Gilliam to make it to third on a ground ball to the infield, from where he scored on a sac fly to break the 1-1 tie. The other runs were HRs by Frank Howard and Mickey Mantle. A classic pitcher’s duel.
Question:
Has there ever been a starting nine comprised entirely of free agents?
If the New York (A) team actually trades Brett Gardner, and age once and for all catches up with Jeter – the Yanx will be 100% mercenary.
Just a glance at the famously mercenary 1997 Florida Marlins shows that most of their infield (Johnson, Conine, Castillo, Renteria) and the back of their rotation (Saunders, Rapp, Hernandez) and closer (Nen) were all home grown. Before looking I could only think of 4 of them (I didn’t realize Conine came in the expansion draft from KC).
And of course, Soriano is technically homegrown…
But quite the shift. As much as the NYA team is perceived as just throwing money at free agents, they’ve been Drafted up the middle since, well:
Catcher
Russell Martin was perceived as a transition to a homegrown player.
Montero. Romine. Sanchez.
Didn’t happen. But before that, it was 1997 when the last FA Catcher squatted back there. That guy is now the manager.
Second Base
2005 – 2013 Cano
2004 – 2004 FA Miguel Cairo
2001 – 2003 Soriano
…traded-for Knoblauch
Shortstop
Jeter since 2006.
FA Tony Fernandez in ’05.
And the story is ugly for awhile before that.
CenterField
2010 – 2013 Gardner (when healthy)
2006 – 2009 Melky
1991 – 2005 Bernie
1989 – 1992 Roberto Kelly
…Claudell and Rickey
Greatest Yankee second baseman of all time?
By WAR
53.8 Willie Randolph
48.3 Tony Lazzeri (w/a season’s worth of 3B)
45.2 Rob Cano
37.6 Joe Gordon
26.3 Snuffy (1/2 season at 3B)
14.6 Horace Clarke
14.3 Jimmy Williams
13.1 Del Pratt
12.5 Aaron Ward
9.9 Steve Sax
8.3 Bobby Richardson
6.5 Knobs
5.8 Billy Martin
Gil McDougald had 40.6 WAR, and played
599 2B
508 3B
284 SS
I meant of course, Jeter, 1996.
@20
Hard to believe Clarke came in ahead of Richardson. Bobby had 7 AS selections to Clarke’s none.
I ran a PI search for best OPS+ Against totals, 1948-2013, for all pitchers with at least 250 starts over that period (295 pitchers meet those criteria). The best numbers:
61 Pedro Martinez
68 Roger Clemens
70 Sandy Koufax
71 Randy Johnson
72 Bob Rush
74 Justin Verlander and Johan Santana
75 Greg Maddux
76 Roy Halladay and John Smoltz
77 Nolan Ryan, Curt Schilling, Bob Gibson, Ned Garver
Interesting to see Bob Rush in there.
Thanks for mentioning Rush. I had never heard of him.
In his best 5-year stretch (1952-56), Rush had a 117 ERA+, good for 4th best in the NL (min. 1000 IP). His OPS was 2nd best, behind only Spahn. Not bad for pitching at Wrigley.
Sorry but any measure that puts Tom Seaver as the 17th best pitcher just since 1950- not to mention Gibson at 25, Spahn at 28, Marichal at 25, Roberts at 50 and Carlton & Niekro out of the top 50 altogether- seems to be of pretty limited if not entirely questionable value. I sort of understand about Spahn since it leaves out a few of his best seasons but otherwise this just seems to punish guys for pitching thru the heart of the lineup a 4th or even 5th time.
Hartvig:
What I think this approach does is to look at player evaluation from a particular and meaningful perspective that highlights an important element of play deep within the game. It isn’t a complete measure by any means, but it brings to light a useful supplemental way of reckoning performance. I agree that it’s crazy to rank Seaver, for instance, that low or Ford and Wilhelm that high, but it is a helpful corrective to moderate the flat, take-no-prisoners complacency of WAR-only-ness.
I agree with you, Hartvig. This was just presenting some numbers, with a provocative title to start some lively debate.
If today’s starters were used like the guys you mentioned, or they were used like today’s starters, I suspect the numbers would look rather different.
That is actually a good idea for a future post, using the Split-Finder, to see how much the pitchers from the 1950s to 1970s were hurt by staying in games late and, thus, how much today’s starters are helped in their rate stats.
Interesting list… but rating by a qualitative metric? Innings matter!
What if you did:
(RE24/9) X (IP) = ????
would that number mean anything??
Estimating this for Pedro, Rocket, Unit, and Mad Dog:
-Clemens vaults well ahead of Martinez
-Maddux is a bit ahead of Martinez
-Randy Johnson is a little behind Martinez
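On what (RE24/9) X (IP) would mean: since RE24/9 is 9 times RE24 divided by IP, multiplying it by IP just gives back nine times the career RE24 total, so ranking by that product is the same as ranking by raw career RE24. A quick Python sketch, using the rounded RE24/9 and IP figures for Sutton and Ryan from the "other notable pitchers" table in the post (the function names are mine, and the totals are approximate because the rates are rounded):

```python
def innings(ip_text):
    """Box-score innings: the digit after the point counts outs (thirds)."""
    whole, _, outs = ip_text.partition(".")
    return int(whole) + int(outs or 0) / 3


def total_re24(re24_per_9, ip_text):
    """Recover the career RE24 total from the per-nine rate and innings."""
    return re24_per_9 * innings(ip_text) / 9


# Rates and innings as listed in the post's table (rounded there).
sutton = total_re24(0.512, "5282.1")
ryan = total_re24(0.471, "5386.0")
```

That works out to roughly 300 runs better than average for Sutton and roughly 282 for Ryan, so a counting version rewards their long careers in a way the rate version does not.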
In addition to the problem of ignoring IP, another major problem is that when working with a raw runs-saved value like this, there’s a bias toward high offense eras. It’s easier to be farther below the average when the average is higher. Thus, the pitchers from the high offense ’90s and ’00s dominate the list far more than they should.
It’s an interesting idea, but it seems like it needs a lot more work before it gets to a state where it produces a list of “The top 50 pitchers” that would pass the smell test.
I did a study covering the same period (my lifetime). I titled it SMS, simple-minded s***. I took each individual start & subtracted runs allowed (no negative #s). I then did a roto-type scoring, 30 pts for 1st, 29 for 2nd… both for quantitative and qualitative (pts/GS). For instance in ’50, Spahn led in both categories and therefore got 60 pts for the year; second were R. Roberts and Preacher Roe with 53 each… Just for laughs, my top 10… Clemens, Maddux, Seaver, Spahn, Blyleven, G. Perry, R. Johnson, Ryan, Carlton, Glavine.
I subtracted runs allowed from innings pitched***
Interesting thought.
(IP – RA) / IP
You might even make it:
(IP – p*RA) / IP
where p is a factor between 0 and 1 to reduce RA to a crude estimate of the number of innings in which runs are scored. For a SWAG, I’m going to suggest p=0.75.
The ratio would then be % of IP in which the pitcher didn’t allow any runs, something worth knowing, I think, in evaluating pitcher effectiveness.
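A tiny Python sketch of that ratio, with p=0.75 as the suggested guess. The function name and the 200 IP / 80 RA season are hypothetical, chosen only to show the arithmetic.

```python
def scoreless_share(ip, ra, p=0.75):
    """Crude estimate of the share of innings with no runs allowed.

    p scales runs allowed down to an estimated count of innings in which
    at least one run scored; 0.75 is only the guess suggested above.
    """
    if ip <= 0:
        raise ValueError("innings pitched must be positive")
    if not 0 <= p <= 1:
        raise ValueError("p must be between 0 and 1")
    return (ip - p * ra) / ip


# Hypothetical season: 200 innings, 80 runs allowed.
share = scoreless_share(200, 80)         # uses p = 0.75
share_p1 = scoreless_share(200, 80, p=1)  # the plain (IP - RA) / IP version
```

With these made-up numbers the estimate is 70% scoreless innings, against 60% if every run is assumed to come in its own inning.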
Raise your virtual hand if you’d take Johan Santana over Tom Seaver.
Minor snark aside, interesting approach, although the fact that it is weighted very heavily with pitchers from the recent offensive explosion period is telling. It is probably not helpful in comparing pitchers from different eras.
Doug, your post is linked to on both Tom Tango’s site and Baseball Think Factory. That provocative post title did indeed provoke some serious notice in the baseball blogosphere.
The commenters on BaseballThinkFactory take things too literally.
Tango had it right with his comment that “That list was simply a list, and it should be treated as a list.”
I agree, Doug. Most of the criticism at BTF seems to go not to the substance of your post but to the inference the critics think they can make between the lists in your post and your title. If your title replaced “Pitchers” with “RE24/9s”, much of the criticism disappears, I think.
But, if I did that, the post wouldn’t have been linked, and there’d be like 6 comments here.
This is more fun. 🙂
I agree with both of those points!
This is making me vacillate a bit on my Niekro vote.
Very interesting approach, Doug.
I do wonder if RE24 “opportunities” are affected by the scoring context. The run expectancy for any given base/out situation was much lower in 1968 (MLB avg. 3.42 R/G) than it was in 2000 (5.14 R/G), so the same out recorded in 2000 is worth more RE24 than it was in 1968.
Granted, that same out was also harder to obtain in 2000, but my gut still says it is easier for the same good pitcher to compile high RE24 in a high-scoring era than in a low-scoring era.
To test my theory, I tallied all qualified pitcher-seasons with RE24/9 of at least 1.20, then adjusted for the number of teams each year. (So the tallies for 1950-60 were multiplied by 30/16, etc.)
Then I totaled that adjusted tally for two periods — the low-scoring years 1963-76, and the high-scoring years 1994-2007. The results:
— 1963-76: 177 seasons, about 13 per year
— 1994-07: 277 seasons, about 20 per year
A variation on this method: How many pitchers in each of those periods compiled 1,000 IP with at least 1.00 RE24/9?
— 1963-76: 7 pitchers
— 1994-07: 20 pitchers
Even after adjusting the earlier period upward to compensate for the difference in league size, it still comes out less than 10 pitchers, so a 2-to-1 edge for the high-run period.
This bias could help explain why the top 7 on your list all spent a good chunk of their careers in the highest-scoring era of the study period.
… Or have I misunderstood everything? Either way, let’s not allow my qualms to get in the way of appreciating Mike Mussina!
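The league-size adjustment described in that comment (tallies for 16-team seasons multiplied by 30/16, and so on) can be sketched as below. The yearly tallies are invented for illustration; only the 16-team count for the early 1950s and the 30-team baseline come from the comment itself.

```python
def adjusted_total(tallies_by_year, teams_by_year, baseline_teams=30):
    """Scale each year's tally to a 30-team league, then add them up."""
    return sum(
        tally * baseline_teams / teams_by_year[year]
        for year, tally in tallies_by_year.items()
    )


# Invented tallies of qualifying pitcher-seasons for two 16-team years.
tallies = {1950: 8, 1951: 4}
teams = {1950: 16, 1951: 16}

total = adjusted_total(tallies, teams)
```

Dividing an adjusted total like this by the number of years in the period gives the kind of per-year figure quoted in the comment.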
I believe you’ve got it John.
The best use of RE24 in a career context is to look at pitchers who are close contemporaries. Here are the leaders by decade for pitchers with 1500 IP and 200 RE24 in a decade.
Generated 12/9/2013.
Randy Johnson takes the prize as the only pitcher to do this in two 0-9 decades. Note that Whitey Ford almost matched Robin Roberts in the 1950s, with only a bit more than half as many innings.
What he said. : -)