Perhaps the most fascinating of this year’s award debates is the National League Cy Young race. Johnny Cueto, RA Dickey, Gio Gonzalez, and Clayton Kershaw all have compelling cases, and if voters are looking for dominance over accumulated value, Aroldis Chapman, Craig Kimbrel, Kris Medlen, and Stephen Strasburg are worth discussing as well.
A quick look at FanGraphs’ pitching WAR leaderboard suggests that Cliff Lee may have a place in this conversation too. Lee ranks third in fWAR, just 0.6 wins behind Kershaw and 0.5 behind Gonzalez. Baseball-Reference ranks Lee eighth, bunched up with five other pitchers behind Kershaw, Cueto, and Dickey at the top. After the jump, I’ll examine 12 candidates based on some key numbers.
[table id=70 /]
I included wins in the chart, not because they say anything about a pitcher’s talent or award-worthiness, but because they’ve had an overwhelming impact on past votes and because Lee’s six wins will probably keep him off most voters’ five-man ballots. My goal here is not to pick my Cy Young winner (I’ll let John tackle that in a future post), but to assess just how realistic it would be to put a starting pitcher with six wins at or near the top of a ballot.
Several numbers from the above chart jump off the page, and a few of them belong to Kimbrel. In just 62 2/3 innings, Kimbrel was as dominant as any pitcher has ever been in any role. I wouldn’t hold it against a voter who threw Kimbrel a first-place vote on the basis of his sub-1 FIP, near-1 ERA, and 8+ strikeouts for every walk, but I think it’s nearly impossible to be the most valuable pitcher in the game without pitching 100 innings.
Similarly, Medlen’s 1.57 ERA, accumulated mostly as a starter, is a remarkable feat, but he only threw about twice as many innings as Kimbrel and less than 60% as many as Dickey.
Lee, on the other hand, struck out 7.39 times as many batters as he walked over 211 innings. If not for a bit of a home run problem (he gave up 26, though many came in a park that inflates homers by about 9%), he may have been the most dominant starting pitcher in baseball this year. Normalize his home run/fly ball rate, as xFIP does, and he leads all qualified pitchers at 3.06. Even with all those homers counting 13 times in his FIP (you can find the FIP formula here), he trails only Gonzalez, Kershaw, and Adam Wainwright (who pitched fewer than 200 innings with an ERA near four) among qualified pitchers. Gonzalez owes much of his success to minimizing home runs, giving up just nine. Bring four just-enough homers back on the field for Lee and he’s the best pitcher in the NL from a fielding-independent standpoint.
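For readers who want to see why those homers weigh so heavily, here is a minimal sketch of the standard FIP calculation. The function names are mine, and the league constant of 3.10 is an assumption (the real constant shifts a little every season), so treat the output as illustrative rather than as anyone's official number. The second helper only uses figures from the paragraph above: four fewer home runs over 211 innings.

```python
def fip(hr, bb, hbp, k, ip, constant=3.10):
    """Fielding Independent Pitching.

    hr, bb, hbp, k: home runs, walks, hit batters, strikeouts allowed.
    ip: innings pitched as a true decimal (62 2/3 innings is 62.667, not 62.2).
    constant: league-specific scaling term; 3.10 is an assumed typical value.
    """
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


def fip_change_from_homers(hr_removed, ip):
    """Change in FIP if `hr_removed` home runs had stayed in the park.

    Each homer counts 13 times in the numerator, so the constant cancels out.
    """
    return -13 * hr_removed / ip


# Four "just-enough" homers turned into non-homers over Lee's 211 innings.
delta = fip_change_from_homers(4, 211)
print(round(delta, 2))  # -0.25
```

Because the constant drops out of the difference, the quarter-run improvement holds no matter which season's constant you plug in.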
I think it’s important that voters (or just debaters) establish a framework in evaluating candidates and are consistent in staying within the framework. Otherwise, a single pitcher’s narrative may cause the voter to neglect another pitcher’s stronger candidacy. If a voter values run prevention and high volume of innings, Kershaw or Dickey is the pick. A voter who prefers dominance over a shorter timeframe would take a closer look at Kimbrel and Medlen. A FIP loyalist would likely vote for Gonzalez, or Kershaw, whose FIP was a tick behind Gio’s in 28 more innings.
The framework within which one can justify Lee as the best pitcher in the National League this season would look something like this:
-Wins are a team statistic and do little to measure a pitcher’s contribution to his team’s success.
-Striking out hitters and avoiding walks are the two most important things a pitcher can do.
-BABiP shouldn’t be held against a pitcher, since it’s more a measure of defense and randomness.
-Home runs are also random to some extent, since they’re affected by park dimensions and weather factors and fluctuate more than strikeouts and walks.
-Relievers don’t throw enough innings to deserve the Cy Young, but additional volume shouldn’t be a deciding factor if several candidates throw 200+ innings.
I listed the five points above in descending order of reasonableness. The last two are, shall we say, less than scientific, and it seems unlikely that a voter of any ilk would consistently apply these five tenets to any Cy Young consideration. Any voter who really does put Lee at the top of his ballot is probably making a somewhat outlandish statement about the worthlessness of pitcher wins.
Remove the last line from the framework, though, and Lee’s probably second to Kershaw, who threw almost 17 more innings. Remove the last two lines and your framework is essentially fWAR, where Lee finished third. I’m not saying I would, but it would be perfectly reasonable for a voter to place Cliff Lee second, third, or fourth on a Cy Young ballot.
If only he had the grit to bear down and win games, he might have added another trophy to his mantel this year.
Does fWAR make adjustments for batters based on BABiP?
No, but fWAR ignores BABiP, since it’s based only on fielding-independent outcomes. The difference between a FIP-based WAR and a run prevention-based WAR is essentially BABiP and the ability to strand/pick off runners.
Bryan, I’m seeing 4.2 rWAR for Cole Hamels, tied with Lee for #7. Your table has 5.2. Typo? Or am I misinterpreting?
http://www.baseball-reference.com/leagues/NL/2012-pitching-leaders.shtml
Yup, that’s a typo. Fixed it above. Thanks.
“Perhaps the most fascinating of this year’s award debates is the National League Cy Young race.”
In the sense that watching paint dry is fascinating, one presumes.
We’re onto you now Bryan.
Jim, I notice that your reactions to awards debates are not binomially distributed.* 🙂
[* Reveals ignorance of the term’s meaning.]
It’s more that my sarcasm recognition sensors have been engaged John!
Normalizing abnormally high HR rates* might be better for predicting future performance than not doing so, but I don’t like the idea of using a statistic that does so to justify awards voting. He gave up the HRs and I have to assign him the responsibility for doing so, no matter how improbable it is that that large a percentage of fly balls would clear the wall.
*I’m more than willing to consider park factors, but a 9% increase for half his games is an extra 1-2 HRs.
FIP and xFIP for the Cy Young?
“Striking out hitters and avoiding walks are the two most important things a pitcher can do.”
While striking out hitters and avoiding walks are certain to help in the endeavor, it flies in the face of logic to suggest these two traits are more important than run prevention.
Run prevention is the pitcher’s ONLY job. It’s quite irrelevant how he achieves it. That’s why he’s out there, to prevent runs, not pad his strikeout total or keep his walk rate low.
I better stop right there.
Right. And there is a strong correlation between men on base and runs scored. So it can be argued that the pitcher’s real job is to stop guys from getting on base, and however he does that is good.
It is somewhat perverse that we rate a 9-pitch, 3-K inning as being “better” than a 3-pitch ground-out inning (a la the game-score approach). But isn’t that just the pitching equivalent of hitting a home run?
If a pitcher loads the bases every inning and then pitches out of the jam each time, achieving an ERA of 0.00, how “good” of a pitcher is he?
Baserunner prevention does not equal run prevention. Run prevention equals run prevention.
And the same argument applies to run differential, winning games and “pythagorean expectation” I might add. The team’s job is to *win games*, not accumulate more runs than opponents over some collection of games. If they lose 10 games 6-2 but win 20 others 3-2, they’re +10 on the former metric but -20 on the latter. And from such data some believe the necessary conclusion is that they won some number of their games by “luck”. Others, who shall remain unnamed, dispute this interpretation.
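To make that hypothetical concrete, here is a rough sketch that tallies the 30 games described above and compares the actual record with a Pythagorean expectation. The exponent of 2 is the classic textbook choice and is my assumption; other exponents are in common use, and the variable names are only for illustration.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage from season run totals."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


# (runs scored, runs allowed) per game: ten 6-2 losses and twenty 3-2 wins.
games = [(2, 6)] * 10 + [(3, 2)] * 20

runs_scored = sum(scored for scored, _ in games)
runs_allowed = sum(allowed for _, allowed in games)
wins = sum(1 for scored, allowed in games if scored > allowed)
losses = len(games) - wins

run_diff = runs_scored - runs_allowed
expected_pct = pythagorean_win_pct(runs_scored, runs_allowed)

print(wins - losses)           # 10 games over .500
print(run_diff)                # -20 runs
print(round(expected_pct, 3))  # 0.39
```

The same 30 games read as a 20-10 team by record and as roughly a 12-18 team by run totals, which is exactly the gap the two sides of this argument interpret differently.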
I’m not asking Cliff to bear down and win any games. But what about bearing down with men on base?
Cliff Lee:
– Bases empty: .239/.664
– Men on: .285/.737
– RISP: .258/.721
R.A. Dickey:
– Bases empty: .227/.637
– Men on: .226/.644
– RISP: .177/.526
I second Evan’s position on normalizing HR/FB rates in this context, and I would extend that to performance with men on. Dickey’s fantastic RISP numbers may not be repeatable, but this year, they really happened. Guys do have years when they’re just “in the zone” in certain situations.
And a dark little secret on Lee: His career BA with RISP is 11 points higher than with bases empty, after adjusting for sac flies — whereas the league BA with RISP is actually a little lower than with bases empty, after the same adjustment.
So in my view, a good bit of the discrepancy between Lee’s excellent SO/BB data and his less excellent ERA this year is due to a high BA with men aboard. And that, in turn, is at least partly due to his lower SO% in those situations. With bases empty, Lee fanned 26.6% of all batters this year; but with men on, that dropped to 20.5%.
Some other little things:
– GDPs: Lee’s GDP rate plummeted with 2+ men on. With a man on 1st only, he got 18 DPs in 139 PAs, or 13%. But with 2+ men on, 3 DPs in 71 PAs, or 4%. (Sorry, I don’t have the actual DPopps for those situations.) Dickey’s DP rate went up from 10% with 1 on to 12% with 2+ on; he got 9 of his 25 DPs with 2+ men on.
– If normalizing the HR/FB rate benefits Lee, it seems like it should benefit Dickey, too. Lee allowed 23 HRs on 233 fly balls (per B-R), or 9.9%. Dickey allowed 24 on 217 fly balls, or 11.1%. And their road data are nearly identical, right down the line — BA, SLG, ERA, WHIP, whatever.
– Team defense in general: I can’t see the Mets as a significantly better defense than the Phillies. NYM allowed a .295 BAbip, PHI .303. Total Zone rates NYM as average, PHI slightly below — but DRS has NYM way below average, PHI slightly below.
I’ll admit that I am not very adept with xFIP and the like, so I’ll keep an open mind to counterpoints. But right now, I’m not loving Cliff for this particular award.
From watching the games I felt Lee’s issues w/ men on and RISP were the big difference this year for him. Also, for all his Ks he seemed to have trouble finishing guys once he got to two strikes – extra foul balls.
The Phillies’ miserable bullpen led to Lee pitching a lot of extra innings, and to batters he might not have faced had the Phils had better options.
Looking through Lee’s game log, he was definitely the victim of several blown saves and poor run support.
Assuming I’ve counted correctly, Lee made 21 quality starts with a record of 6-4 and 11 no-decisions. He picked up 0 cheap wins. Also, the Phillies went 4-9 in one-run games in which Lee started (21-18 with other starters).
Ed, that’s of interest as regards Lee’s W-L record. But his QS data don’t advance his CYA case at all — he was 11th in QS% with 21 out of 30 starts.
And if we look at the 18 NL pitchers with 20+ QS, Lee’s performance in those games is unremarkable — 7th in ERA, 5th in IP/G, 3rd in SO/9, 9th in WHIP.
John – I wasn’t trying to make a Cy Young case. Just trying to make sense of his W-L record.
“I’m not loving Cliff Lee for this particular award”. I’m not either. My ballot would probably be something like Kershaw-Dickey-Cueto-Gonzalez and someone from the Lee, Medlen, Kimbrel group. What I was trying to establish is whether there’s a reasonable framework that supports Lee’s candidacy despite his low win total and good-but-not-great ERA. I concluded that there is such a framework, but it’s a little suspect. John, you’ve laid out a great case against Lee’s candidacy, and I support a framework focusing on run prevention and situational pitching that rejects his candidacy.
Now, in defense of the pro-Lee framework, the high batting average Lee allowed with men on board is still a function, to some extent, of the defense behind him. If a voter is willing to put the lion’s share of the blame on the defense, since Lee did his part in striking batters out (20.5% K rate is still above league average) and not walking them, he can reasonably conclude that Lee did his part in limiting runs better than Dickey did.
I, too, am troubled by using FIP-type stats as a basis for awards, for the same reason I wouldn’t award a star-of-the-game designation to a great hitter who went 0 for 4 in a game on three screaming line-outs and a blast to deep right caught at the top of the wall by a speedy, leaping defender. Sure, we know that those four shots will likely be hits in most situations, and on those other days, this hitter can be the star of the game. We can also take those nearly-hits as further evidence that this guy is indeed a great hitter who will likely score many runs for his team in the future. But he can’t be the star of the game on this particular day, because the probabilities didn’t go his way this day. The game is won or lost on a particular day based on whether the probabilities happen to pan out that day, not on the probabilities themselves. I know that is the opposite of the way sabermetrics teaches us to think about evaluating talent going forward; there we need to look not at small-sample results but at large-scale probabilities. But for retrospective evaluation, such as the granting of awards, we have to repress our sabermetric instinct to look at probabilities and we have to look instead at results.
That being said, there is an aspect of FIP analysis that is appropriate in award granting, and that is trying if possible to peel out the part of a pitcher’s results that really is attributable to his fielders rather than to his own pitching. Yes, we want to grant awards to a pitcher based on the pitcher’s results, not his probabilities, but it should be, to the extent possible, the pitcher’s results, not somebody else’s results. So if we can document the degree to which team defense biased a pitcher’s results, I’m OK with backing that part of the results out. But not the luck part. The luck part, for award purposes, has to stay in.
I could give a small edge to guys who did worse than expected due to luck, or vice versa. Though for something like BA with RISP, for good pitchers, is the sample size big enough in a year to make those results meaningful? There are many more BBIP, right? And I have heard that in a single year that can vary randomly.
So especially for a pitcher, & one who is very good, should we assign any meaning to variations like RISP for a single year?
“should we assign any meaning to variations like RISP for a single year?”
Mike, where does the question of assigning meaning come into the Cy Young Award discussion?
Suppose a pitcher had a great ERA helped by a fantastic RISP split, linked to a major spike in SO% with RISP.
That may not be meaningful in the sense of repeatable, representative of his talent level. But I would hardly call it luck, either; we’re not talking about escaping jams with a bunch of at’em balls, but with strikeouts. Whatever the true underlying cause might have been, he did it.
I wouldn’t ask that such a pitcher be given extra credit for his RISP split — only that his over-all performance not be downgraded because of what some might call a fluke.
I have a couple of problems with this posting. First off: No Matt Cain mention…at all? Look at his numbers for the season: he’s top 5 in almost every category, and he also has the best game score of all time and another score of 96. He anchored a team that saw Tim Lincecum somehow fall off planet Earth and that lost its best hitter at the time (Cabrera), plus Sandoval for two months. Again, I don’t think he’s a top 3 candidate, but Dickey didn’t pitch a meaningful game after June.
Define the Cy Young winner based on team performance. Does the pitcher get a lot of run support? Is he pitching under pressure (chasing the playoffs)? Lee’s season is remarkable, but the Phillies were hardly in the pennant chase all year. Kershaw, Cueto, Cain and Gonzalez were, and they are the aces of their respective teams. For 99% of MVP votes we look at the team’s overall record and how it finished, so why is Cain overlooked here?
Jeff, Cain had a great season and has a strong case to be ranked ahead of some of the pitchers I included. His WAR does suffer some from pitching in the most pitcher-friendly park in the NL (see here). If I’m filling out an actual ballot, I’ll take a long look at Cain’s numbers, but in the context of the framework that supports Lee’s candidacy, I’m not sure Cain’s 3.40 FIP and 3.82 xFIP are relevant.
As to your point about team performance, I don’t see that as much more than a tiebreaker. If I couldn’t separate Cueto from Kershaw and Dickey, I might give him a hat tip for pitching in a pennant race, but I don’t generally hold teammates’ performance against an individual player. I think the most valuable pitcher in the league is the one that improved his team’s chances to win games more than anyone else. I choose to measure that with a combination of fWAR and rWAR.
Also, why is Cain’s WAR so much lower than other pitchers’ considering the season he had? It couldn’t completely be the ballpark factor because Kershaw plays in LA. Just wondering.
Jeff, don’t assume that all pitchers’ parks are the same.
The 3-year park factor for Dodger Stadium is 96. For SF, it’s 88, which I believe is the lowest in MLB.
Cain did have an extreme home/road split this year: 2.03 ERA at home, with 7 HRs; 3.56 on the road, with 14 HRs.
Cain has given up 16 fewer hits in 25 more innings. His WHIP is .14 better than Miley. And Cain is in the Playoffs, Miley isn’t.
Jeff, I asked the same question about Cain’s WAR totals: have a look at the discussion that followed. http://www.highheatstats.com/2012/10/wednesday-game-notes-such-sweet-sorrow-edition/#comment-40134
Without getting into the Cy Young debate, Lee is only the 2nd pitcher ever to have 6 or fewer wins and WAR greater than 4 while qualifying for the ERA title and making at least 60% starts. The other pitcher was:
Joey Hamilton (1995) 6-9 4.4 WAR
Interesting that he and Lee had exactly the same W-L record.
Two other pitchers did it while making fewer than 60% starts:
Stu Miller (1958) 6-9, 4.7 WAR
Terry Forster (1973) 6-11 4.4 WAR
I assume the prototype here is Nolan Ryan’s 1987 season, when he went 8-16 for the Astros. Ryan also led the league in ERA, ERA+, SO and SO/BB. He also finished 5th in the Cy Young voting and WAR for pitchers.
I just cannot see Lee doing anywhere near that well. He’s 7th in WAR for pitchers, 9th in ERA, 8th in ERA+. That leaves us with a stat (FIP) that most likely a majority of SABRmetric-friendly sportswriters could only explain in the vaguest of terms, and an outstanding SO/BB ratio. Thing is that just 2 seasons ago Lee produced the second greatest SO/BB season in baseball history alongside slightly better peripheral numbers & still managed to get only 3% of the Cy Young votes. I’d be surprised to see him do even that well this year.
I am all for the FIP and BABIP numbers when evaluating how good a pitcher is going forward (in other words predicting how well he will do next year) because it gets rid of the noise created by luck and other changeable factors.
That being said, at the end of the year, when determining who had the better year, I believe in evaluating a pitcher on what they actually did, not on what they might have done if their defense was better or their luck was better. I agree with normalizing stats for ballparks, because 5 runs given up in Coors isn’t the same as 5 runs given up at PacBell, but trying to normalize for defense and luck ignores what actually happened on the field. I don’t see how that is relevant to evaluating what actually occurred. My two cents.
Brent, I understand where you’re coming from, but I’d like to point you toward birtelcom’s second paragraph in comment #8 above. To the extent that BABiP is the result of random placement of batted balls, it’s appropriate to charge the pitcher with those results because there’s really no one else to blame. But there’s a portion of BABiP that relies on defense, and that shouldn’t be held against a pitcher. FIP is more than what would have happened under different circumstances. It’s a measure of what *did* happen between a pitcher and a batter. Once a ball is put in play, the pitcher is less a part of the interaction, which is between the batted ball and the fielders. Run prevention is a team outcome; FIP is a pitcher outcome.
It’s like WAR, which specifically accounts for “luck” and defense, which a pitcher can’t control. Verlander has a 7+ WAR…Why? Price and Weaver are 6.4 and 3.7 respectively. 3.7 for Weaver…? Why so low?
I remember a story about Greg Maddux telling a teammate that he was going to throw a certain pitch and get a certain batter to pop foul to 3rd base. And then he did just that.
I mention this because, in his 4 CYA seasons, Maddux significantly outperformed his FIP. Starting with ’92, his last year in Chicago, here are his actual ERA and his FIP, and the difference:
1992 – 2.18, 2.58, -0.40
1993 – 2.36, 2.85, -0.49
1994 – 1.56, 2.39, -0.83
1995 – 1.63, 2.26, -0.63
Granted, he led the NL in FIP all 4 years, so I’m obviously not using this to argue against his CYAs. But I wonder if the FIP proponents think of Maddux in this run more as the guy with the 1.98 combined ERA, or as the guy with the 2.54 FIP?