Building a Better Pitcher Wins Metric

Dr. Doom provides us with another new metric to measure wins contributed by pitchers. Not wins above replacement, just wins, plain and simple. More after the jump.

Greetings, everyone!

I’m glad you all enjoyed my last series on what I called a new version of Pitcher WAR, but was really just a way of re-framing individualized W-L records. Well, here I am today with yet another new approach.

You probably know by now from my many posts and comments that I adore messing around with baseball numbers and learning more from them. The individualized W-L records I mentioned before are something I’ve been doing for years. At least half a decade, I would guess. But this post is about something I’ve only been horsing around with for a month or so, so enjoy!

Yet again, we’re going to use a pitcher’s ERA+ to figure out a lot about him. This time, though, we’re going to use his actual ERA, as well. When you see a pitcher’s basic stats on Baseball-Reference, you see his ERA and his ERA+. Those are two separate pieces of information. However, they also tell you a third piece of information: a pitcher’s expected ERA. This is obvious. If Bret Saberhagen had a 180 ERA+ in 1989, that tells us that the expected ERA for a league-average pitcher given Bret Saberhagen’s parks would be 80% higher than Saberhagen’s actual ERA of 2.16. In other words, an average pitcher would’ve had an ERA of 1.8 times 2.16, which is 3.89.

Now, we get to even easier math territory. The average pitcher’s 3.89 minus Saberhagen’s 2.16 is 1.73, so every nine innings, Saberhagen saved his team 1.73 runs, or .192/inning.

We also know how many innings Saberhagen pitched in 1989: 262⅓. If we multiply the rate of run-saving (.192/inning) by the number of innings (262⅓), we can say that Saberhagen saved a total of 50.368 runs in the course of the year.

OK, that’s all well and good. But we can go another step. And this is where things get interesting, in my opinion. We know that Runs are a bad currency. They’re a bad currency because a run in the 1930 Baker Bowl and a run in 1966 Dodger Stadium are not worth the same thing. That’s why SO many stats use “Wins” instead of “Runs.” Of course, one could convert to wins, then convert back to a run number that would please people more. I’ve often thought that would be a good idea. Alas, no one’s really doing that, and it’s easy enough to do if you really care that much. But for my purposes today, we’ll just use Wins as our currency.

We will recall that we expected an ERA of 3.89. That means that we can take the total number of Runs Saberhagen saved (50.368) and divide by the Expected ERA (3.89) to get a number of Wins: 12.95. Let’s do a little bit of rounding and call it 13 wins. Bret Saberhagen was worth 13 wins in 1989.
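For anyone who wants to run the numbers themselves, here is a minimal Python sketch of the steps above. The function and key names are my own, not anything from Baseball-Reference, and it assumes innings are typed in box-score notation, where "262.1" means 262⅓.

```python
from fractions import Fraction


def innings(ip_notation: str) -> Fraction:
    """Convert box-score innings ("262.1" means 262 1/3) to a true fraction."""
    whole, _, thirds = ip_notation.partition(".")
    return Fraction(int(whole)) + Fraction(int(thirds or 0), 3)


def pitcher_wins(era: float, era_plus: int, ip_notation: str) -> dict:
    """Follow the post's steps and keep the intermediate values for inspection."""
    # Expected ERA of a league-average pitcher in the same parks.
    expected_era = era * era_plus / 100
    # Runs saved per inning relative to that average pitcher.
    saved_per_inning = (expected_era - era) / 9
    # Total runs saved over the season.
    runs_saved = saved_per_inning * float(innings(ip_notation))
    # Convert runs to wins by dividing by the run environment (expected ERA).
    wins = runs_saved / expected_era
    return {
        "expected_era": expected_era,
        "saved_per_inning": saved_per_inning,
        "runs_saved": runs_saved,
        "wins": wins,
    }


# Bret Saberhagen, 1989: 2.16 ERA, 180 ERA+, 262 1/3 innings.
saberhagen = pitcher_wins(2.16, 180, "262.1")
print(round(saberhagen["runs_saved"], 3), round(saberhagen["wins"], 2))
```

Because the sketch does not round between steps, the last decimal can differ slightly from a hand calculation that rounds along the way; for Saberhagen it still lands at about 12.95 wins.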

Right now (8/23), Chris Sale is cruising to the AL lead in ERA+. His is 220, with Trevor Bauer at 199 and Blake Snell at 196. How do we value these? By this method, Snell remains in third place, with 7.6 wins. Sale and Bauer, though, are closer, with Bauer actually sneaking into the lead, due to his having pitched 20 more innings than Sale.

This kind of counting stat may be a little more satisfying to some of you all on here. I’m not trying to just inundate you with random thoughts, but I thought you all might be interested. I’ll leave you with some great all-time seasons, and you can see how they stack up to one another (listed with actual W-L, ERA+, and then the assigned W by this formula):

Bob Gibson, 1968 – 22-9, 258; 20.7
Steve Carlton, 1972 – 27-10, 182; 17.3
Dwight Gooden, 1985 – 24-4, 229; 17.3
Sandy Koufax, 1966 – 27-9, 190; 17.0
Roger Clemens, 1997 – 21-7, 222; 16.1
Pedro Martinez, 2000 – 18-6, 291; 15.8
Tom Seaver, 1971 – 20-10, 194; 15.4
Greg Maddux, 1994 – 16-6, 271; 14.2
Randy Johnson, 2002 – 24-5, 195; 14.1
Jake Arrieta, 2015 – 22-6, 215; 13.6
Justin Verlander, 2011 – 24-5, 172; 11.7

Seeing Gibson all those lightyears ahead really gives you a renewed appreciation for that ’68 season, doesn’t it? Gibson had the 3rd-highest ERA+ in the group, as well as the 3rd-highest innings pitched. Yet, the two in combination make his 1968 perhaps the best season in the Liveball Era (and not as far away from many great Deadball seasons as you might think, actually).

Thanks again for bearing with me through another arithmetic-heavy post. Hope you enjoy when I write these once in a while. Anyway, friends, what do you think? I look forward to your comments/criticisms below!

For those interested, a spreadsheet can be downloaded here containing Dr. Doom’s WAR metric (nWAR) from his previous post and his new Wins metric (nWAA) introduced in this post (note that calculations for both are based on FanGraphs ERA+ metric). The spreadsheet contains all 50+ IP seasons since 1961 and includes a pivot table for displaying season-by-season results for any pitcher from that period.


107 Comments on "Building a Better Pitcher Wins Metric"

Voomo Zanzibar
Guest
Most Wins, with WAR a higher number than Win Total:
6 / 6.2 … Ted Abernathy
5 / 5.5 … Dan Quiz
And a bunch of guys with 4. Almost all relievers. Here’s the list of pitchers with at least 10 Starts:
4 / 4.1 … Eddie Smith (23 of 38 appearances were Starts)
3 / 3.9 … Hoyt Wilhelm (10 of 39)
2 / 2.4 … Rod Nichols (16 of 31)
2 / 2.3 … Tom Hausman (10 of 19)
2 / 2.3 … Ross Baumgarten (23 of 24)
2 / 2.3 … Pascual Perez (14 of 14)
Right… Read more »
Bob Eno (epm)
Guest
More fun stuff, Doom — I like the way these new stats shuffle the basic numbers in ways that offer new vantage points for us to think about famous and non-famous pitching seasons. In a moment I’m going to point to an issue with the concept of “wins” in this formulation that I think is fun/puzzling to think about. But I think I’d better say first that I’m actually “e pluribus munu” (epm), writing under a pseudonym — I’ve gotten really, really tired of using my HHS screen name, which is sort of silly . . . well, very silly… Read more »
Bob Eno (epm)
Guest

Oh! Wait a minute: DWins is surely net on W-L, that is, wins over .500. So an average pitcher in deGrom’s shoes would have about 1.5 DWins, not -5, and, in nWAR, would be 9.5-9.5. Problem solved? It would be just the sort of basic conceptual error I make most frequently and embarrassingly. Or have I just misconstrued it all in a way that just has the appearance of plausibility?

Dr. Doom
Guest
Bob, (It feels SUPER weird to address you as such) The relationship here isn’t directly “to” anything. It’s closest to the “above .500” thought, as a pitcher who is worse than average will come out with a negative number. What you’re really counting is runs; it’s ALWAYS runs. The only reason I translate it to “games” is to make the units work out, so that we don’t conclude that Pedro’s 2000 was better than Gibson’s ’68. Pedro saved 80.1 R by this measure in a 5.06 R/G environment*; Gibson saved 59.9 R in a 2.89 R/G environment. If we DON’T… Read more »
Bob Eno (epm)
Guest
Doom, I wonder whether your concept isn’t better expressed in terms of “Games,” as in “games behind” (or ahead), rather than “Wins.” For example, the Mets with deGrom are 16 games behind. With an average pitcher in his place, we’d expect them to be 29 games behind — his performance has probably raised them 13 games in the standings. Trevor Bauer (AL leader in DWins) is a key to why Cleveland is 12 games ahead. With a league-average guy they’re projected as only 3 games ahead, but Bauer has boosted them 9 more games ahead. It’s still not real world,… Read more »
no statistician but
Guest
I suspect this stat isn’t supposed to measure career results, but I started working out career results for some prominent pitchers anyway, and here they are, within 1 or so, since I averaged a little.
Cy Young 817
Roger Clemens 731
Walter Johnson 670
Kid Nichols 663.5
Lefty Grove 643.5
Greg Maddux 562
Randy Johnson 528
Pete Alexander 519
Christy Mathewson 410
Tom Seaver 409
Kevin Brown 365
Carl Hubbell 355
Warren Spahn 342
Bob Gibson 341
Mike Mussina 340
Steve Carlton 336
Curt Schilling 336
Bert Blyleven 331
John Smoltz 321
Whitey Ford 320.5
Jim Palmer 316
Gaylord Perry… Read more »
no statistician but
Guest

On reflection, I think I must have done the calculation wrong, since my numbers appear far too high, compared to the ones in Doom’s post. So just ignore the above, since I can’t delete it.

Bob Eno (epm)
Guest
A tangent, if Doom will forgive it. There’s another interesting article on FiveThirtyEight.com that relates to the changes we’re seeing in MLB play. This one focuses on how the A’s have managed to put together a surprise winner. The analysis engages a pretty wide variety of relevant stats, ranging from Statcast data (in a general way) to advanced stats like pitcher WPA. (It also includes a graph more unfriendly to the human eye than any I’ve seen.) Without specifically addressing the issue, the article gives one good overview of the way that hitters and pitchers are adjusting their games in… Read more »
Dr. Doom
Guest
I understand and agree with some of your points. I would like to propose two arguments, though: 1. MLB could do a lot to fix this problem. The biggest one I know is one a friend from college always talked about: handle thickness. Maple is a very supple wood, and basically all bats are made of it. To get the “whip” effect they have, the handles are SUPER thin. Gradually expanding the thickness of those handles would really go a long way toward driving away some of the power hitting we see. They could also (very slightly) deaden the balls.… Read more »
Bob Eno (epm)
Guest

Both points well taken, Doom.

no statistician but
Guest
Here’s a second attempt at applying this formula to career stats. If I’ve got it wrong this time, I think I’m at least closer. I’ve added in a couple of prominent relievers and a few more starters. If I’ve missed any significant starting pitcher in the live ball era (one with a rating of over 60, let’s say) I’d be surprised, but not absolutely amazed.
Cy Young 224.94
Walter Johnson 209.50
Roger Clemens 164.24
Kid Nichols 151.76
Pete Alexander 150.00
Lefty Grove 141.78
Christy Mathewson 141.33
Greg Maddux 134.51
Randy Johnson 119.21
Tom Seaver 112.79
Mordecai Brown 98.71
Pedro Martinez 99.04
Warren… Read more »
Dr. Doom
Guest
This is what I have (well, I didn’t round until the end, so we’re within rounding errors of one another… doesn’t matter). Except, you have Pedro (correctly) with more wins than Three-Finger Brown, but listed below him for some reason. Typo, I guess. 🙂 A couple of players you left off: 1. In COG discussions, Tiant v Reuschel has been deliberated ad nauseam. Unsurprisingly, they come out nearly the same here: Tiant at 47.6, Reuschel at 48.4 (again, I’m using my numbers, not yours, so we’re a little off from one another, but not enough to substantively change things; it’s… Read more »
no statistician but
Guest
Doom: I left out active pitchers for the simple reason that pitching stats tend to decline toward the end of a career, and depending on how far the player prolongs it, the more those negative years will drag down the final reckoning. Also, you are right about the third innings computation (below). I just took the .1 or .2 as read. Mariano’s high figure to me simply indicates that this isn’t the best stat with which to compare modern closers and starters of previous eras. Robin Roberts: he had a 6-year mid-career slump (or five with one outlier), which sinks his total. Finally, the… Read more »
Dr. Doom
Guest

The Mariano thing is primarily because this is comparing to average. If you compared to replacement, the other guys would move WAY ahead of him. Most of a player’s value, as Bill James is fond of saying, is in BEING average. That means that we’re losing a lot of value for thousands of innings pitched, which would push, say, Nolan Ryan, light years ahead of Mariano.

Bob Eno (epm)
Guest

Thanks for doing this, nsb. I felt bad when you had to withdraw your first tabulation, because so much work went down the tubes. And here you’ve done it again. It bears out the fun in Doom’s new number.

Among active players, I’d guess that Kershaw was leading, and his total would be 86.38, good for 20th place on your list. (Although Ed Walsh, at least, could be added higher up, at 103.77, according to my figures.)

Dr. Doom
Guest
nsb, I did some deeper digging this morning. Wow, did you pick a good threshold for “significant.” I checked 8 prominent pitchers this morning, and five of them came in between 55.9-59.6 without going over 60. Those pitchers were David Cone, Max Scherzer, Zack Greinke, CC Sabathia, and Johan Santana. Santana was the one at 59.6, which is awful close. Some of them will likely go over that threshold in the next couple of years, barring falling off a cliff. I did find three players over 60. Bob already mentioned Kershaw. The other two over 60 that I found were… Read more »
Bob Eno (epm)
Guest
Since Doom and nsb have created career lists for Doom’s new stats, I thought I’d work up a career list for a very different kind of aggregate: pitcher WPA. My goal in doing this is pretty simple. Doom’s (ERA+*IP)-based stat is terrific for its simplicity and the way it provides several valid bases of comparing pitchers by season or career. But there are a number of different perspectives on value, pWAR being an obvious one that lacks nWAR-based transparency and simplicity, but adds critical components nWAR leaves out (UER and DefEff being two important ones). pWAR, like nWAR, leaves out… Read more »
no statistician but
Guest
Bob: Some not-so-random comments: Although there are a few surprises, what strikes me about these various lists is their relative sameness: the same guys over and over. What we’re doing is looking for different ways to evaluate pitching effectiveness, and every version, within reason, gives the same names, in slightly different order, true, but for a majority of them the variation in place is relatively small. Omitting the pre-1925 crowd and relievers, you see Clemens, Grove, Martinez, R. Johnson, Maddux, Spahn, Seaver, Ford, Palmer, Hubbell, Gibson, and a gaggle of others who alternate in the remaining top twenty spots. And when you… Read more »
Bob Eno (epm)
Guest
Responding to your second point first, I agree: Ford’s pWAR is indeed a problem (and, if I recall, it interfered with his easy election to the CoG). Your comment on Ford (which I think is completely justified — notwithstanding that, as a Brooklyn fan, he represented the Dark Side to my eyes) leads me to note that although we have no WPA data for him, Three-Finger Brown is similarly pummeled by pWAR, principally, I think, based on DefEff figures for the Cubbies behind him in the field, Gil McDougalds all. You’ll recall how skeptical I was of that calculation, despite… Read more »
Mike L
Guest

NSB and Bob, nothing would make me happier than a re-argument of Ford’s worth. Ford’s WAR is either a product of some weird glitch in calculation, or a function of some deeper insight into his skills. The problem, if you take the latter position, is that it seems to boil down to “don’t look at anything else he did as reflected by any other stats, traditional or otherwise, our secret sauce says, eh”.
Thank you for letting me rant.

Dr. Doom
Guest
So, the question with Ford (when trying to assess his value) is, “How do you separate him from Mickey Mantle and Gil McDougald?” If we want an accurate way of assessing his effectiveness, that’s the question to answer. One way of addressing this question, then, would be, “How does Ford do at the things his fielders have NO control over?” Those things are in the FIP calculation. If we use Ford’s FIP+ in the calculation I introduced in the previous three-part series, we see him with a record of 199-153 (actual record: 236-106; the formula using ERA+ record: 225-127). His… Read more »
Mike L
Guest
Doom, I understand the argument. Interesting sidebar to this: Eddie Lopat’s Yankee numbers, which went from his age 30-37 seasons. 113-59, ERA 3.19, ERA+ 121, FIP 3.60, total WAR 17.5, maximum WAR in any Yankee season, 3.7. Someone on HHS, several years ago, tried to reverse-engineer what Ford (or any pitcher) would have to do to excel in WAR with the Mantle/McDougald axis, and it was ridiculous. I’d make an argument that the dampening and smoothing effect may have a disproportionate impact on a handful of pitchers, and would like to see a spreadsheet that tries to identify and compare them.
Dr. Doom
Guest
Replying to Mike L, regarding “what Ford would have to do to excel.” Ford did excel. But I’m not sure exactly what “excel” means here, so I’ll try a few different thresholds. We could get him to 60 WAR (or darn close) just by giving him credit for his service time in Korea. That gets him at least 5 WAR, I would think. But assuming we’re not doing that, and assuming that Ford Ford had a career 2.75 ERA. To get to 60, he would’ve needed a 2.59 ERA. To get to 70, he would’ve needed a 2.36 ERA. To… Read more »
Mike L
Guest

So, let’s get back to Ford.

To get to 60, he would’ve needed a 2.59 ERA.
To get to 70, he would’ve needed a 2.36 ERA.
To get to 80, he would’ve needed a 2.14 ERA.

Among pitchers whose careers didn’t fall mostly in the Dead Ball Era, there are zero starting pitchers with an ERA of 2.14 or better, one (Clayton Kershaw, still active) with an ERA of 2.36 or better, and that’s it….

no statistician but
Guest
Doom: 1) Maybe you realize this, but all you’re response does is confirm Mike L’s observation. 2) What are sabermetric stats, exactly, if they don’t include RE24, ERA+, WPA, adjusted pitching runs, adjusted pitching wins, WPA/LI, REW, and even the reckonings you yourself have been working out? As opposed, it seems, to FIP, which apparently is an advanced stat and takes precedence over any others except WAR? 3) Further, when you say that it’s telling that Fangraphs and B-R agree about WAR, all you’re saying is that they go by the same rules to reach their conclusions, meaning that those… Read more »
Bob Eno (epm)
Guest
On this issue, I tend to get simpleminded and resort to a “divide all people into two types” approach. In this case, the people are good pitchers, and the two types are good power pitchers and good finesse pitchers. Of course, it’s a spectrum, but I’m being simpleminded. Lopat, whom Mike points to as someone pWAR treats poorly, like Ford, was clearly near the far-end of finesse pitchers. Ford could strike people out, but his K-rate was modest, and he was closer to a Lopat than to a Koufax. (Mordecai Brown was on the finesse end of the spectrum as… Read more »
Mike L
Guest

Bob, you are right about Ford’s K rate, but that rate was also reflective of the era he pitched in. He was in the top ten seven times (with rates that modern eyes would say were almost disqualifying).

Bob Eno (epm)
Guest

That’s a good point, Mike. nsb’s post below reminds us that the gaudy K numbers we’ve become used to were rarities in most of baseball history.

Mike L
Guest
Bob, I’d never argue that the power pitcher doesn’t have natural advantages over the “feel” guy, but I guess what troubles me about the broader argument are two underlying biases. The first is that we seem to be discounting the net result (outs and runs) based on how that result was achieved. The second is I’m not sure we are adequately adjusting for managerial approach. Up until relatively recently, we asked starters to go deep, and the phrase “six-inning pitcher” was a pejorative. To do that, most (except the freaks of nature) had to pace themselves. And batters were coached… Read more »
no statistician but
Guest

Damn, I wrote ‘you’re’ for ‘your’ again. I must be sic.

no statistician but
Guest
Bob from nsb: I just accidentally deleted a long response to your comment, which I’ll now try to reproduce. What I said, basically, was that I thought your remarks were insightful, especially when you talked about the power pitcher’s desire to strike out the side on nine pitches for nine innings vs. the finesse pitcher’s desire to force 27 groundouts. My distinct impression is that deep within the perspective that drives pitching WAR is the overtly expressed or unarticulated assumption that there exists an ideal pitching performance against which all real pitching performances are to be measured. Further, my impression… Read more »
Bob Eno (epm)
Guest
nsb, I think you’re definitely right, that the unstated ideal of pitching is the power pitcher. But I’m not sure the reason is because strike outs are the safest outs. I think it’s simply an expression of the way we value athletes. We cheer for the little guy who surprises us by turning limited natural advantages into a winning performance, but we’re in awe of the athlete who possesses super-normal physical gifts and puts them on display to good effect. I recall once reading a comment about Nolan Ryan, whom I saw as a player who wasted extraordinary talent by… Read more »
Dr. Doom
Guest
Sorry for sounding like I was trying to end discussion; it’s not my intention. I don’t even think I was defending the WAR ranking of Ford at all. On the contrary, I just think it’s interesting how things shake out. I don’t think there’s some mystifying secret formula that says “If player name = Ford, Whitey, subtract 25% of WAR.” I think it’s more that he pitched in unusual circumstances, was a very good pitcher, and got better results than other pitchers of similar ability due to the team on which he played. That’s basically what WAR says. It says… Read more »
Mike L
Guest
Doctor Doom, let me clarify what I was saying, without any intended snark. The purpose of pitching is contextualized run prevention which then results in a higher probability of winning games. “Science” (sorry, can’t think of a better phrase) tells us that almost nothing bad can happen with a strike out, but ultimately, most outs are just outs. FIP emphasizes Ks over other outs. No one is questioning that pitchers who have better-fielding teams behind them will have greater success than those who have a team of older Jeters and Dick Stuarts. But quality pitchers take advantage of what’s offered… Read more »
Bob Eno (epm)
Guest

You know, nsb, I replied to your comment somehow thinking you were Doom; that’s why you’re a third-person in my comment and I ended by attributing Doom’s stats to you. I guess it’s because you started by calling me by my name, which only Doom has done here, although there are actually other people in life who refer to me by my name, and I don’t think they’re all Doom. Doom said using my name was weird — looks like I’m going to have to get used to it here too.

no statistician but
Guest
Since Base-Out Runs Saved has a similar aim, here is a somewhat revised career listing of pitchers who are ranked in the RE24 top 20, ordered from top to bottom by Doom’s Ws, with each pitcher’s Base-Out Runs Saved rank at the right:
1. Roger Clemens 164.24, RE24 rank 1
2. Lefty Grove 141.78, RE24 rank 2
3. Greg Maddux 134.51, RE24 rank 3
4. Randy Johnson 119.21, RE24 rank 5
5. Tom Seaver 112.79, RE24 rank 6
6. Pedro Martinez 99.04, RE24 rank 4
7. Warren Spahn 94.03, RE24 rank 7
8. Carl Hubbell 92.76, RE24 rank 19
9. Bob Gibson 92.38, RE24 rank 14
10. Jim Palmer 88.22, RE24 rank 8
11. Whitey Ford 87.48, RE24 rank 10
(Clayton Kershaw 86.38, RE24 rank 12)
13. Bert Blyleven 84.66, RE24 rank 17
17. Bob Feller 77.12, RE24 rank 13
18. Curt Schilling 76.51, RE24 rank 11
20.… Read more »
no statistician but
Guest
To change the subject to the present season, the BoSox have lost six out of eight, four of the losses coming at the hands of the Rays, and I’d say the Mariners’ 116-win season is now safely out of reach. To break that record Boston would have to finish with a 27-3 flourish. In the NL the Cards have put up a 19-5 record in August which is getting them a lot of coverage, but the Cubs haven’t been too shabby either, 15-8, and hold a 4 game lead in the NL Central, which also happens to be the biggest… Read more »
Doug
Guest

I know I wouldn’t have been thinking about Martinez as a Triple Crown threat. But, as I write this, he’s one HR shy of leading in all of the TC categories, so, yeah, he’s right there. But, he’s coming off a red hot August (except in the HR department), so chances are he won’t be able to carry that to the house (he had a very similar August in 2016, but followed that with only a .736 OPS and 3 HR in Sept).

Dr. Doom
Guest
The biggest impediments to a Martinez Triple Crown would not be a Martinez slump, I don’t think. I think the bigger thing facing him is actually the competition. In Batting Average, he’s only a hair ahead of his teammate Mookie (.337-.336), with the guy who won three of the last four AL batting crowns close behind (Altuve, .332). (I don’t think the previous batting crowns make Altuve more likely to win, for the record; I’m just pointing out that it’s not like it’s some fluke player due for a major regression in the final month; in fact, his current average,… Read more »
Dr. Doom
Guest
I want to jump in to the conversation above about strikeouts’ relative value to other outs. Part of this comes down to thinking. Strikeouts are not more valuable than other outs. In fact, as a group, they’re probably LESS valuable, since other outs include double plays. On the other hand, they also include sac flies and sac bunts, which are surely better than strikeouts; maybe it’s a wash, but I think the double plays are so much more common than the other two (nearly 2:1 in 2018; as of this morning, 2842 GDP, 1672 sac bunts + sac flies). The… Read more »
Bob Eno (epm)
Guest
Doom, I think you’re agreeing with nsb, no? He wrote about why strikeouts are valuable: “Because even letting the batters make contact has the potential to unleash all kinds of unpredictable results, not just hits, but errors and risk taking.” That’s your point too, so perhaps you’re disagreeing with me. But I don’t disagree at all. I think it’s true, but I believe that’s not why people admire top power pitchers. I think there are some points in your generally good analysis that are open to question. For example, you say that “people are overstating the ability for pitchers to… Read more »
Dr. Doom
Guest
I’ve tried like 10 times to write this post. Here are 7 pitchers who weren’t really strikeout guys: Jim Kaat, Greg Maddux, Jim Palmer, Tom Glavine, Tim Hudson, Mike Fiers, Matt Cain. You can quibble with some of these, if you’d like, saying they have too many strikeouts, but I think it’s overall a decent group. This group pitched 23,981.1 innings. They had a BABIP of .282 (I calculated this myself and left out sac flies). They had a (modified) SLG of .446 (I modified SLG; instead of doing it over all AB, I took away AB that ended in… Read more »
Bob Eno (epm)
Guest
Great research, Doom. I appreciate your doing it. It makes your point very well. I think that to include my point, we’d need to add BB rates, which, if they are higher for power pitchers, will balance to some degree the increased number of BiP that finesse pitchers allow (and the increased number of batters faced). Here’s what I get as a per 9IP figure for each group (I’ve included both Ks and BBs): Finesse pitchers: 5.72 K/9; 2.54 BB/9. Power pitchers: 8.74 K/9; 3.71 BB/9. This implies that BiP out rates per 9IP would be: Finessers: 21.28 Powerers: 18.26… Read more »
Dr. Doom
Guest
Bob, I’m going to issue a slight math correction. Hopefully, you’re not too shamed. You’ve made an error I myself have made in doing similar situations. Hopefully, I can explain; let me know if I’m not making sense. So, yes, the “power” group averaged 8.74 K/9 and the “finesse” group averaged 5.72 K/9. Because we’re using “per 9 innings” math, that means that some outs are accounted for. That means, not that finessers face 21.28 BiP and powerers face 18.26; rather that that’s how many OUTS they need to get after the strikeouts are accounted for. So that means, they… Read more »
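To make the distinction in this correction concrete, here is a small Python sketch. It separates the outs a pitcher still needs on balls in play (27 minus K/9) from the number of balls in play it takes to collect them. The second function is a loud simplification of my own, not something from the thread: it assumes every ball in play is either a hit or exactly one out, so errors, double plays, and home runs are ignored. The .282 BABIP is the finesse-group figure quoted earlier in the thread.

```python
OUTS_PER_NINE = 27  # a nine-inning game requires 27 outs


def bip_outs_per_nine(k_per_nine: float) -> float:
    """Outs that must still come on balls in play after strikeouts."""
    return OUTS_PER_NINE - k_per_nine


def bip_faced_per_nine(k_per_nine: float, babip: float) -> float:
    """Balls in play needed to record those outs.

    Simplifying assumption: each ball in play is a hit with probability
    `babip`, otherwise exactly one out (no errors, no double plays).
    """
    return bip_outs_per_nine(k_per_nine) / (1.0 - babip)


finesse_outs = bip_outs_per_nine(5.72)         # outs needed, finesse group
power_outs = bip_outs_per_nine(8.74)           # outs needed, power group
finesse_bip = bip_faced_per_nine(5.72, 0.282)  # balls in play, finesse group
print(finesse_outs, power_outs, round(finesse_bip, 2))
```

Under that assumption the finesse group needs several more balls in play than it needs outs, which is the gap between "outs required" and "balls in play faced" that the correction is about.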
Bob Eno (epm)
Guest

Great reply, Doom. Lots of stuff here, and it’s all interesting. It looks as though I’m going to have to grant your general point about power pitchers, though in terms of the pWAR argument, I think Mike L captures the issue above by asking why similar results should be evaluated differently based on how the results were achieved.

Once again, I’m just passing through — I’ll log on this evening to try to digest your ideas more fully and give you a better response.

Bob Eno (epm)
Guest
PS to Doom: When I was a teacher, I’d do the same thing you did: try to save face for students by saying I’d made the same error they did. Much of the time I was making that up (although I certainly committed my share or more), and I’m bearing that in mind in reading your reply. As soon as I made it past the intro sentences, I realized my error, just as in the Mordecai Brown case (except, I have to tell you now that time has passed, in that case I’d actually raced back into my house… Read more »
Dr. Doom
Guest
Haha, yes, the face-saving thing is sometimes a part. But I’ll also admit that juggling these things is difficult. I HAVE, in point of fact, made that error when I first learned about Bill James doing some era-adjustments in the New Bill James Historical Baseball Abstract. Sometimes, you have to adjust based on PAs, sometimes based on Outs… it was all very confusing and I made my share of errors when I first tried. Now, admittedly, it’s been over a decade, but I really DID mean it when I said I’d made the same (or a similar) error. As for… Read more »
no statistician but
Guest
Some to me interesting facts on the pitching stat K/9: By my count, of the top 500 qualifying seasons for this stat, 352 have occurred in the new millennium, say 70%. Of those same 500 or the remaining 148, take your pick, one occurred in 1884 in the Union Association, which existed for that one year and is a major league INO—produced by a pitcher named Hugh Daly. Trivia mavens please take note. Of the remaining 499 (or 147) exactly 6 occurred prior to 1960. Of those six, just two made it past the magic strikeout per inning mark of… Read more »
Dr. Doom
Guest
I, too, dislike the general trend of baseball; I think everyone does. There’s an interesting article from 5 years ago on Beyond the Box Score about why, if you’d like to dig into the numbers on it. Basically, it comes down to the fact that strikeouts are great for pitchers, and not so bad for batters. The batters are not disincentivized to avoid strikeouts (sorry about the triple negative there; “there’s no reason to avoid strikeouts,” is what I’m saying), but the pitchers are gunning for them more than ever. Eventually, as nsb and I discussed above, I think MLB… Read more »
Bob Eno (epm)
Guest
That’s a really good article, Doom. (The comments help too.) I wonder whether, five years later, it continues to be true that it is the called-strike rate that is rising, rather than the swinging strike. One takeaway for me is that the rising K-rate is largely being generated by batters’ strategic approach, rather than by the increase in pitcher skills (although the 2013 article may not fully reflect the way pitching roles have continued to change to maximize velocity). If this is the case, then it may not be that there has been an increase in power pitching so much… Read more »
Dr. Doom
Guest
The question of Whitey Ford and Greg Maddux, and whether what they were doing was a true skill or whether they happened to be the example of flipping a coin a million times and finding two sets of ten where it came up heads ten times consecutively is an interesting one. To study it, I would propose this. We look, not just at year-to-year correlation with the pitcher himself, but with his TEAM. In other words, yeah; these guys were better than others at controlling balls in play, and there’s a high year-to-year correlation. But here’s the question: were they… Read more »
Dr. Doom
Guest
I effed up the discussion of the process (the paragraph after the parenthetical). I was just re-reading and realized that I explained it wrong. Let me do it correctly here:
1. Start with the total innings pitched. Multiply by 3. Subtract strikeouts. This is the total number of BiP outs.
2. Start with the number of hits. Subtract the number of HR. This is the total number of BiP hits.
3. Add BiP Hits + BiP Outs. This is the BiP Total.
4. Divide BiP Outs by BiP Total. This is the Defensive Efficiency.
Again, for the Yankees, 1950 & 1953-1967, DefEff… Read more »
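The four corrected steps translate directly into code. This is a minimal Python sketch with a function name of my own choosing; the sample line is made up purely for illustration and is not real Yankees data. It also assumes innings are passed as true innings (262⅓ as 262.333…), not box-score ".1/.2" notation.

```python
def defensive_efficiency(innings_pitched: float, strikeouts: int,
                         hits: int, home_runs: int) -> float:
    """Share of balls in play turned into outs, per the four steps above."""
    # Step 1: outs recorded on balls in play.
    bip_outs = innings_pitched * 3 - strikeouts
    # Step 2: hits that stayed in the park.
    bip_hits = hits - home_runs
    # Step 3: total balls in play.
    bip_total = bip_hits + bip_outs
    # Step 4: outs divided by total balls in play.
    return bip_outs / bip_total


# Hypothetical staff line: 1450 IP, 900 K, 1300 H, 120 HR.
example = defensive_efficiency(1450, 900, 1300, 120)
print(round(example, 3))
```

Like the steps it follows, this version counts every non-strikeout out as a ball-in-play out and ignores errors, so it is a quick approximation rather than an official Defensive Efficiency figure.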
Dr. Doom
Guest
I got to thinking about Maddux and Ford some more, and wanted to check something out. In the interest of open discussion (and unbiased data), I thought I should share this. Perhaps one might argue, “Well, SURE, over the course of their careers, these two players regressed to average, but SURELY they were better in their prime.” In Maddux’s prime, 1991-1998, he was consistently better than his team, beating them every year. (That’s actually true for 1988-1990, as well, but the biggest contrast comes when we start in 1991; there’s no reason to winnow down to 1994-1998, as he beats… Read more »
Bob Eno (epm)
Guest
Doom, I still have some stats I want to find time to calculate in response to your posts here, but I have to say that you’re clearly winning this argument, and doing it with convincing calculations, rather than rhetoric. I don’t think I mind that Ford may not benefit from the verdict of sabermetric injustice that Mike, nsb (a little indirectly), and I were advocating for — as you know, I’m a Brooklyn/Koufax fan and arguing on the side of a Yankee finesse pitcher from the era I most care about is not a natural stance for me. But I… Read more »
Dr. Doom
Guest

“But I really do much prefer winning arguments to losing them.” Haha, likewise. And thanks for your kind words; I appreciate it. I’m curious what you were thinking about calculating; I might like to try something myself.

Bob Eno (epm)
Guest
I had several ideas, Doom, but the main one was to check out Ford’s BABIP stats relative to strength-of-schedule issues. As you know, Stengel pitched Ford disproportionately against strong opponents. Although strength of schedule is part of the pWAR calculus, I thought perhaps some of the inconsistency of Ford’s BABIP performance vs. the Yankee staff as a whole might be revealed on closer analysis of that parameter. I also thought I’d do a game-log analysis of Yankee error rates when Ford was pitching, at least for some selected years, but that’s quite a time investment, and I wanted to spend… Read more »
Dr. Doom
Guest

Wow… those are some VERY weedy issues. I will leave you to it. I think that you’re getting to the point where you’re desperate to prove SOME part of your point. But I think, again, that we’re getting to a point where we’re forgetting: I think Whitey Ford is an excellent pitcher, too. I just don’t think WAR is mis-assessing his value – or that, if it is, it’s by an amount that doesn’t concern me.

Bob Eno (epm)
Guest
Not desperate enough to actually try, Doom. And if I do try, it’ll be as a relaxation after my deadline work is done. Anyway, there’s more to it than just trying to salvage an argument. The BABIP issue and the relative degrees of control between batter and pitcher that you raised via a link to an article online are really interesting to me. (You remember: it’s a facet of the luck/chance issue I’m so dogmatic about.) Ford is a way to explore a little further and get the feel of those data (I’ve never paid that much attention to them… Read more »
Dr. Doom
Guest
Not exhaustive, but I checked BABIPs for the 1952, 1955, and 1958 seasons. In 1952, the top three scoring (non-Yankee) teams had BABIPs of .282, .278, .269. The bottom three teams had BABIPs of .276, .262, and .258. In 1955, the top three scoring (non-Yankee) teams had BABIPs of .279, .285, .285. The bottom three teams had BABIPs of .283, .271, and .271. In 1958, the top three scoring (non-Yankee) teams had BABIPs of .279, .278, .283. The bottom three teams had BABIPs of .278, .261, and .263. 1955 seems to be the outlier of the three (randomly-chosen) seasons, in that… Read more »
Mike L
Guest

Doom, one very quick note to your very comprehensive comment: You said “The thing I would say about that is #1, you’re not required to compare pitchers across eras.”
But we do compare pitchers across eras, and that’s exactly what WAR is all about for many people–an all-encompassing single number to be used both in-era and across eras. It’s the measurement used by JAWS and I suspect, if we return to our Circle of Greats discussions, part of every thread.

Dr. Doom
Guest

Yeah, WAR can compare pitchers across eras. But WAR is measuring each season in its context. So while WAR can be used across eras, it’s not actually “punishing” pitchers relative to their peers. I didn’t mean WAR – I meant directly comparing K rates

Mike L
Guest

Just for the hell of it, I decided to look at other pitchers who were fellow rotation-mates with Ford. Allie Reynolds pitched 8 years for the Yankees after coming over from Cleveland. 131-60, ERA of 3.30, ERA+ of 115, FIP of 3.64, total WAR 19.6. Two years jump out: 1951, 3.06 ERA, 126 ERA+, FIP 3.47, fewest hits per nine, led league in shutouts, and WAR of 3.7 (also 3rd in MVP voting) and 1952, league-leading 2.06 ERA, 161 ERA+, FIP 2.89, led league in K’s (with 160!) and 4.7 WAR.

Voomo Zanzibar
Guest

An historical season in progress for a Denver player…
Lowest ERA in a qualifying season for a Rockie:

2.88 … Ubaldo
2.90 … Kyle Freeland (2018…)

3.47 … Ubaldo
3.47 … Jhoulys
3.49 … Jorge de la Rosa
3.62 … Jhoulys
_______________

Highest WAR:

7.5 … Ubaldo
6.9 … Freeland (2018…)

5.9 … Pedro Astacio
5.7 … Jhoulys
5.6 … Joe Kennedy
___________________

no statistician but
Guest
A two unrelated remarks: On baseball bats: is it really true that maple bats have completely replaced ash? If so, how has that impacted in itself on batters going for the long ball? On the Ford issue—which is really the issue of how reliable pWAR is in assessing pitchers from baseball’s middle ages (1893-1920 or so), renaissance (1920-1968 or so), and modern era (1968-1995 or so), as opposed to the post modern era of sabermetrics and TTO: Currently, as Doom remarks, most pitchers are on board the strikeout ship as the ideal craft, but that was hardly the case for… Read more »
no statistician but
Guest

As in “A one, and a two . . .” Non-baseball question for old timers: Who was famous for using this phrase in the 1950s?

Richard Chester
Guest

I never watched his show but was it Lawrence Welk?

no statistician but
Guest

“Somebody turn off the bubble machine.”

A line, actually, from the Stan Freberg parody of the Lawrence Welk show. In those bad old days of limited telecast, out in the central Illinois boonies we suffered the choice of either Lawrence Welk or the Gale Storm sitcom—had to look this up. Kitsch and treacle. My parents preferred Welk. My brother and I generally found something else to do.

Mike L
Guest

Gale Storm? My Little Margie? As for Lawrence Welk, the correct pronunciation, according to my grandmother, a huge fan, was Lahryrince Velk.

Bob Eno (epm)
Guest
nsb, You and I are on the same side in this argument. I think your major point here is, as usual, correct. But I think you’re overstating the case. There were always power pitchers with 250+ K seasons who served as alternative models: Rube Waddell, Walter Johnson, Dazzy Vance, Lefty Grove, Bob Feller, Herb Score, far too many to mention from 1961 on. That other pitchers didn’t follow their example was likely due to three factors: (1) just as you say, the basic concept of the way baseball was to be played made Ks exceptional and BiP the norm; (2)… Read more »
Dr. Doom
Guest
Mike L, way up the thread, posed a question about how the various WAR-systems are accounting for shifts. For a while, baseball-reference was just throwing away the data on shifts. It came about because of Brett Lawrie, if I recall correctly. It was, and remains, the single most significant accomplishment of Brett Lawrie, who was a VERY touted prospect in the Brewers’ system I was initially sorry to see go. Well… I’m not saying Shaun Marcum was such a great pickup, but Lawrie was really no loss. Anyway, at one point, maybe a third of the way through a season… Read more »
Doug
Guest
Bob Eno (epm)
Guest
To add to Doom’s references, FiveThirtyEight.com has a new article titled, “Baseball Positions Are Starting To Lose Their Meaning” that picks up a related issue: the trend of positioning players not by fielding position skills but in order to maximize lineup strength. On the shift itself, The Fielding Bible III (2012) has several good articles (one of the people commenting on Tango’s blog apparently wrote one of them, though I’m not spotting his name in the book) — Tango’s field diagram looks like a simplified version of many in that book, though its purpose is different. (Just to make things… Read more »
no statistician but
Guest
I’d like to turn the following remarks on Whitey Ford in a different direction, although some statistics as well as some assumptions are necessarily part of the construct. The most basic assumption, to take those first, is that the New York Yankees from 1949 through 1964 were the dominant team in the American League. The second is that they dominated that league and baseball generally in a way that hasn’t been seen before or since, matched in professional sports only by the Boston Celtics’ long run from the mid-1950s through the late 1960s in pro basketball. The third is that… Read more »
Richard Chester
Guest
Here’s a couple of old posts of mine about Ford. Of Ford’s 438 career starts 203 or 46.3% were against teams with greater than a .500 winning percentage. The Yankees were in the first division for 88% of his career starts, meaning that 43% of the teams (not counting the Yankees) during the 8-team league years were above .500 and 44% of the teams (not counting the Yankees) during the 10-team years were above .500. So it looks like he really was not held back for the better teams that much. He was held back from games at Fenway Park.… Read more »
Doug
Guest

The other professional team with a similar dynasty was the Montreal Canadiens, with 15 championships in 24 seasons (1956-79), including 10 in 15 years (1965-79). The first 7 required wins in two post-season series, and the next 8 needed three.

no statistician but
Guest

Is hockey a sport? Thought about the Canadiens, actually, but the NHL was only a six team, one division league through about 1967, and I was thinking the Maple Leafs were pretty strong in those days, too. It’s hard for me to take seriously a sport played on ice with half its US teams in climes that never see snow.

Dr. Doom
Guest
nsb, “Aught,” not “ought,” in this context. Not to be a pedant, but your scare quotes indicated to me that you really weren’t sure about the word, so I figured, “Why not let him know?” Furthermore, there are some very specious arguments I’d like to point out here. I’m not sure I understand the point of any of your first five points. Yes, the Yankees were dominant. Okay. I’m not sure what that has to do with Whitey Ford. He played for them, yes, but that doesn’t make him one of the best pitchers of all-time any more than it makes… Read more »
no statistician but
Guest
Hey, Doom: Chill out. Seriously. I didn’t have you or your arguments in mind at all while I was writing that comment. My aim was simply to point out that a couple of knocks on Ford’s credibility (not from you, I don’t think)—that he was a product of Yankee Stadium and that his record of success was entirely dependent on the team’s record—weren’t consistent with at least these particular facts as I interpreted them. Plus, you can search all the past posts of HHS since I’ve been making a pest of myself, and you won’t find me saying either overtly… Read more »
Mike L
Guest

LOL.
Maybe we can all agree on a new stat….WARY

Dr. Doom
Guest
Not that I don’t love being told to “chill out. Seriously,” but I’m not sure what I’ve said here to upset you so. Sorry for whatever it was; I’m not some raving lunatic – I don’t think. But I suppose most raving lunatics wouldn’t describe themselves as such. Also, I feel fairly certain that no post on this website has ever caused me to lose sleep, so I don’t see any reason for your concern in that regard. My two-year-old who still doesn’t sleep through the night, most of the time, though is a different story. (Sigh) C’est la vie.… Read more »
no statistician but
Guest
Mid-1960s, off-campus dumps away from metropolises at the low end, yeah, but it was actually seven dollars a week for a room, $60 a month for a two-person apartment, so $30/person. The cockroaches lived rent free. In graduate school as a follow up in a similarly sized town the year before I got married, I rented a room in a converted house for $35/month with hotplate privileges. I also had a toaster oven, which ranked me above the other three guys on the floor. There was a communal refrigerator in the hall for the four rooms, this being prior to… Read more »
Bob Eno (epm)
Guest
Obviously, nsb went to college when a dollar was a dollar and men lit matches with their fingernails. We live in last times now: students vape, nsb’s old library has computer clusters where bookshelves once stood, and old timers recount tales of balls in play to grandkids who’ve never seen one. . . . I have a question. In my understanding of park factor calculation, the complexity is such that no attempt is made to adjust the basic R/G data to exclude games pitched by individual pitchers. For example, the Stadium park factor for pitchers during Ford’s tenure there averaged… Read more »
Mike L
Guest

I don’t see Ford as an inner-circle Hall of Famer. I do see him as better than 53.5 BWAR for his career, and a 99th-ranking among starters by JAWS.
I agree completely that if his career WAR was 65-70 this entire discussion wouldn’t have taken place–but maybe that’s a good thing since it’s causing us to look more deeply into what devalues him (and, by extension, other Yankee pitchers of that Dynasty era or other pitchers). Perhaps WAR is an accurate measure of their true worth, perhaps not.

Bob Eno (epm)
Guest
I don’t know whether Richard’s response to nsb is the ideal way to measure how Ford was used. For one thing, the notion that Ford was “leveraged” (scheduled to pitch irregularly to increase starts against tough opponents) only applies to the Stengel years (1950, 1953-60), and that’s where we should be looking for that issue. The second point is that above/below-.500 is not a very accurate way to measure. I’ve poked around through some of Ford’s seasons: 1953-56 and 1958-60 under Stengel (leaving out his rookie debut in 1950 and the year he was injured, which could skew the figures),… Read more »
Bob Eno (epm)
Guest

By the way, I should have noted that I borrowed both the term “leveraging” and the methodology for calculating it from Chris Jaffe’s book on managers.

Bob Eno (epm)
Guest
More on Ford. I was thinking about some points Doom made about Whitey Ford in relation to his relatively low career WAR number: that he missed two prime years to military service and that he retired a bit early. In addition, Stengel’s leveraging of Ford over a period of 8 prime seasons reduced his IP, and thus his opportunity to earn WAR. So I decided to see how Ford would rank if we calculated career WAR on a rate basis: that is, WAR/9IP. It raised his rank, but not as much as I expected. Ford ranks #85 in straight WAR… Read more »
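The rate-basis idea above (career WAR per 9 IP, then re-ranking) can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the helper names `war_per_9` and `rank_by_rate` are hypothetical, and the two pitchers in the usage example are placeholders with made-up totals, not real career lines.

```python
def war_per_9(war: float, innings: float) -> float:
    """Career WAR expressed per nine innings pitched."""
    if innings <= 0:
        raise ValueError("innings must be positive")
    return war * 9 / innings


def rank_by_rate(careers: dict[str, tuple[float, float]]) -> list[str]:
    """Order pitchers from best to worst WAR/9IP.

    `careers` maps a name to (career WAR, career innings).
    """
    return sorted(careers,
                  key=lambda name: war_per_9(*careers[name]),
                  reverse=True)


# Placeholder careers (made-up numbers, for illustration only).
sample = {
    "Pitcher A": (50.0, 3000.0),  # more total WAR, lower rate
    "Pitcher B": (40.0, 2000.0),  # less total WAR, higher rate
}
print(rank_by_rate(sample))
```

A rate stat like this favors pitchers with shorter careers or lighter workloads, so an innings minimum is usually needed before the ranking means much.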
no statistician but
Guest
Bob: Of your top 20, ten were peaking late-Eighties onward, corresponding with the drop in complete games. So I don’t think the stat tells too much that isn’t arguable. Sorry, because it’s a good idea. As for trying to raise Ford in the WAR list, I’m not sure that that is the way to go at this point anyway. WAR is what it is. The sensible way to look at not just Ford but all players in terms of trying to compare them is to view WAR as an important stat, even the most important stat, but to look at… Read more »
Bob Eno (epm)
Guest
nsb: 1st par.: good point. I hadn’t thought about that. Perhaps adding a factor reflecting IP/GS might make the rate stat more useful. (I’ve implicitly done that by eliminating Rivera). I have no disagreement in what you say in the second paragraph either, both in terms of Ford and general principles. Since the CoG draws the line at 1900, more or less, Ford’s rank by traditional pWAR is much higher (also eliminating players too recent for CoG): I think about #64. On the rate list I think he’d be about #53. Either way, you’re right that the CoG process has… Read more »
no statistician but
Guest

Adam Darowski’s Hall of Stats is kind of idiosyncratic (Eddie Cicotte, Babe Adams) and includes a number of pre-1900 pitchers, but Ford comes in the low 60s by his reckoning (The listing isn’t numbered, and pitchers and position players are thrown together. I’ve gotten three slightly different rankings in three tries with my ancient eyesight, and I’m not trying again).

In Darowski’s Hall of Consensus, Ford jumps into the top forties among the pitchers. As usual, I can’t get the link to work, but it’s

http://www.hallofstats.com/consensus

no statistician but
Guest

Holy cow! as Harry Caray used to say.

Bob Eno (epm)
Guest
nsb, I think the “Hall of Consensus” is a little too skewed for use. For example, to rank among the top group (true consensus), you have to be in the Hall itself, which rules out players like Clemens, Mussina, Schilling . . . It is true that everyone considers Ford to be Hallworthy, regardless of how their Hall is configured. But I don’t think that’s a ranking matter. I don’t actually see a ranking for the Hall of Consensus, and I’m guessing you just used the Hall of Stats list and knocked off everyone who wasn’t a consensus choice. That… Read more »
no statistician but
Guest

Bob:

I wasn’t touting the rankings, just refreshing people’s memories that they exist. Multiple views for comparison are often useful, a platitude for the ages.

Bob Eno (epm)
Guest

At our age, if you can’t dish out platitudes at will, you shouldn’t qualify for Medicare.

Kenton
Guest

Hall of Stats has Ford as 76th among pitchers. (Click on “positions” then “pitchers”.) His hall score is 106, meaning he did 6% more than the 228th (or whatever number of people elected as MLB players is), by the formulae used by HoS, which start with bWAR.

Bob Eno (epm)
Guest

Actually, he looks like #66 to me, Kenton. (Perhaps a typo?)

Doug
Guest

Here’s a quick little quiz. The Red Sox, Rockies, Astros, Royals, Marlins, Athletics and Nationals are the only teams so far this season to record what pitching accomplishment? Hint: each team has done it only once.

no statistician but
Guest

Without looking I’d say it might be complete game shutouts.

Doug
Guest

On the right track, but neither CGs nor shutouts are part of it.

Bob Eno (epm)
Guest
This is a comment in praise of Doom’s nWAR (not so much DWins, but I don’t mean to knock that stat in any way). I was just reading an essay by Bill James in the 2015 version of The Fielding Bible. (The same essay apparently appeared on James’s blog years ago, so some of you may have already seen it; I hadn’t.) In the essay James has one of those brilliant insights that make him so terrific. He tells us that traditional batting and pitching stats are not just numbers, we read them as a language. A regular fan knows… Read more »
Dr. Doom
Guest

Thanks for your kind words, Bob!

As a parenthetical, on my personal “nWAR” spreadsheet, I also include FIP, because I think it’s a useful comparison to see how different pitchers look, and I include a balanced 50/50 FIP+ and ERA+ ranking. That helps me to see how a pitcher might be viewed differently, and translates those “numbers” into “language,” as you (and James) so elegantly put it. Thanks, and I’m glad you had fun with this stuff!
