I’ve added a few more games to the top of this post, above the solid line.
Brewers 5, @Twins 3: Just when it seemed the Crew would waste a leadoff double in the 9th and perhaps skid to a 4th straight loss against weak AL Central foes, Martin Maldonado’s 2-out, 2-and-2, 2-run HR gave them a lead, and John Axford (working a 3rd straight day) converted his first 1-2-3 save since May 23.
- Maldonado, called up when Jonathan Lucroy was hurt in late May (and hitting .198 in the PCL at the time), has a .902 OPS, 4 HRs and 12 RBI in 49 ABs, including 3 big-impact blows in the last 8 days; he has 3 of the 24 WPA marks of .35 or better that have been achieved in MLB in that time. In high-leverage situations, he’s 5 for 11 with 3 HRs, 10 RBI and a 1.727 OPS. Brewer catchers have a combined .300/.388/.521 slash line.
- All Milwaukee’s runs came on two swings, including Ryan Braun’s 3-run HR. He has hit in 10 straight games.
- Minny’s Trevor Plouffe (2 solo shots) is on a home run binge: 13 HRs in 88 PAs over 22 games. But his 14 HRs this year have plated just 16 runs.
- DH Ryan Doumit struck out in all 4 trips, thrice with a man aboard.
Diamondbacks 5, @Angels 0: Just like the old days, Trevor Cahill mastered the Halos on 3 hits over 7 IP, and won a 3rd straight start for the first time in over a year. Aaron Hill redeemed his earlier ABs (flyout with RISP and GIDP) with a 2-out, 3-run HR in the 6th, after an IBB to Miguel Montero. Nos. 1-5 in Anaheim…’s order went 0 for 19 with a walk, and only 2 Angels got as far as 2nd base, both with 2 outs, as they were blanked for the 9th time this year.
- Reliever Brad Ziegler induced a GIDP for the 8th time in 27 chances, a 30% rate (MLB rate 11%). The DPs are tied for 12th in the majors, and among those with at least 5 DPs, only Cliff Lee has a higher DP%.
- Arizona, 7-3 in interleague games, has won 9 of 12 and slithered into hailing distance of the wild-card discussion. The Angels are 1-2 with 4 total runs since some bloggart predicted they would win the AL West.
@Blue Jays 3, Phillies 0: Toronto starter Drew Hutchison left in the 1st, their 3rd SP in 5 days lost to an in-game injury. But former starter Carlos Villanueva stepped up with 4 scoreless innings, Brett Lawrie had 2 doubles (including the game’s first RBI), and the Jays scored 2 unearned runs after a Mike Fontenot throwing error in the 4th, sending the Phils to their 10th loss in 13 games.
- Cliff Lee isn’t the only Philly pitcher getting stiffed by the offense. They’ve scored 1 run total in Vance Worley’s 3 losses. Worley and Lee have a combined mark of 3-6, 3.01 in 20 starts.
- Toronto leads 19-18 in the all-time series between the current and former Blue Jays (see sidebar).
Royals 3, @Cardinals 2: If there’s such a thing as a walk-off win on the road, this was it. KC won their 4th straight, each by the slimmest of margins. The Cards are 6-11 in one-run games, and 5 games below their Pythagorean record.
- Vin Mazzaro, perhaps best remembered for this epic adventure, delivered a half-dozen goose eggs for the 2nd time in 3 June outings.
- Alex Gordon hit his 20th double, out of 61 total hits (32.8%). Only 6 qualified batters have ever had a full-season 2B rate that high. The runaway leader in that regard is Eric Hinske, with 45 two-baggers out of 109 hits in 2003 (41.3%). Gordon is just 6th among current qualifiers in 2B rate, a group led by Joey Votto (28 of 78); Votto is on pace for a record 72 doubles.
- Milestone moment marred: Carlos Beltran bagged his 300th steal to become the 8th member of the 300 HR/300 SB fraternity (with a nifty 87% career success rate), reaching 2nd base as the tying run with 2 out — and then got picked off to end the inning.
- Matt Holliday (0-4) has 1 RBI in 11 games this month and is hitting .200 with RISP this year.
- St. Loo leads the all-time series, 38-29 … or 41-33, if you want to ruffle some feathers.
@Rangers 6, Astros 2: Throwing 69% strikes, a personal best, Yu Darvish notched 11 Ks against the Astros’ porous bats, including 8 of his last 9 outs in a solid 8 innings. He walked 2, including a leadoff in the 3rd that came around for the game’s first run, but none after that. The big 5th inning had none of the expected Texas thunder: After an error and a HBP, 5 runs were scored on 5 straight singles.
- Despite some hiccups, Darvish has matched the Rangers’ record with 8 wins in his first 13 games, set by Matt Harrison in 2008. His countryman Kazuhisa Ishii has the best career-opening run this century, winning his first 6 starts and 10 of his first 12 in 2002.
- Josh Hamilton got a rest after snapping his 8-game “vitamin” hit streak the day before. The AL’s Player of the Month for April & May is hitting .196 in June, with a HR, 5 RBI and 3 runs in 13 games (Rangers 6-7).
- Houston’s 146 Ks in 13 June games are 23 more than any other NL club.
- Texas has dominated this series 17-5 since 2007, and leads 40-30 all-time.
@Indians 2, Pirates 0: Justin Masterson worked out of a 7th-inning jam, retiring Alex Presley with 2 out and the go-ahead run on 2nd, to earn his 3rd win. Pittsburgh had just 5 hits and 9 baserunners, and went 0-6 with RISP. Carlos Santana’s 2-out, full-count double tallied the only run off James McDonald, who didn’t have his best control: 54 strikes in 101 pitches, though just 2 walks.
- Win, lose or draw, McDonald’s numbers look the same. In 3 losses, he has a 2.29 ERA and 1.02 WHIP; in 5 no-decisions, 2.57 and 0.96; in 5 wins, 2.14 and 0.89.
- Git ‘er done: Nothing about Chris Perez yells “star!” except the bottom line: MLB-best 21 saves, none blown since Opening Day. And maybe this: Perez and Craig Kimbrel are the only full-time closers who have not allowed a home run.
- Andrew McCutchen was 0 for 3 with a walk, and moved the tying runner to 3rd with 1 out in the 6th; his WPA was slightly negative at -0.055. The Pirates, who still have no other hitters with OPS+ of at least 100, are 4-12 when his WPA is -0.05 or lower.
- The Bucs have scored a run or less 19 times, tops in MLB.
________________________________________
@Cubs 3, Red Sox 0: The latest scoreless streak belongs to Ryan Dempster, 22 innings in his last 3 games, allowing 11 hits and 3 walks.
- Dempster won his 3rd straight start, evening his record at 3-3 and trimming his ERA to 2.11 in 81 IP. That would tie the lowest qualifying mark by a Cub since 1933, when Lon Warneke had a 2.00 ERA. The other 2.11 was by Dick Ellsworth in 1963. Since then, their only sub-2.40 was by Greg Maddux in 1992, his first of 4 straight Cy Young seasons (and last year with Chicago).
- Not enough? Dempster, an .097 hitter (56-579) — 4th-worst among actives with 300 PAs — went 2 for 2 (half the Cubs’ hit total) and legged out his 2nd career triple.
- The Cubs’ last 3-start scoreless streak in one season was by (I kid you not) Ken Holtzman, in 1969, when he tossed 6 shutouts, including a no-hitter (not in the streak). Holtzman also had a 3-shutout streak in ’68. And that’s it for the Cubs in the expansion era.
Giants 4, @Mariners 2 (9th): Ryan Vogelsong allowed 2 runs in 7 IP, raising his ERA to 2.29.
- If Vogelsong and Matt Cain (2.18) keep this up, they’ll be the first Giants teammates with ERA+ of at least 150 since Christy Mathewson and Jeff Tesreau exactly 100 years ago, and the first with ERA of 2.30 or lower since Carl Hubbell and Prince Hal Schumacher in 1933.
- Not to be outdone, Melky Cabrera (4-1-2-2, HR, .365 BA) would be the only Giant besides Barry Bonds to bat .360+ since Bill Terry and Freddie Lindstrom both did it in 1930.
@Athletics 10, Padres 2 (9th): Brandon Moss homered for the 4th straight game, which no A’s hitter had done since Jack Cust in 2007. The franchise record since 1918 (covering all 3 cities) is 6 games by Frank Thomas in 2006. Only 3 others have done 5: Matt Stairs (1998), Dave Kingman (1986) and George Alusik (1962). Alusik, bought from Detroit in May of that year, would hit just 13 more HRs after that streak, finishing with 23 in his 5-year career.
@Dodgers 7, White Sox 6: Sailing along with a 5-1 lead and a 5-start win streak, Chris Sale opened a porthole in the 6th by walking the leadoff man, and destiny’s darlings swamped him for 5 runs. The Sox tied it up again, but then … um … aw, hell, I don’t know what to make of this play-by-play line:
“J Loney scored, E Herrera to second on fielder’s indifference, E Herrera to third on wild pitch by M Thornton.”
- Say again? Maybe this will help.
- Adam Dunn seized the MLB lead outright with his 23rd HR. Yes, yes, relax — he also walked and whiffed. But the game ended with Dunn on deck.
Yankees 7, @Nationals 2: About time someone stuck up for the AL East! With both teams riding 6-game win streaks against their Eastern opposites, New York pulled away with a big 7th inning against Brad Lidge, and assailed some of their own demons with a couple of bases-loaded hits and an overall 4 for 8 with RISP.
- With 2 outs and 2 strikes in the 9th, New York was in danger of winning without a home run for the first time this year, but Curtis Granderson drilled his 20th HR and became the first player with 20+ HRs in each of the past 6 years. Washington’s winning streak ended at 6.
Rockies 12, @Tigers 4: How do you snap an 8-game losing streak? How about 8 runs in the 10th? Colorado had the biggest eruption in the 9th or later this year, and the biggest in extra innings since 2009-08-16, when the Angels dropped 9 on Baltimore. Jose Valverde, whose error helped trigger the onslaught, was charged with 6 runs, tying the worst game of his career.
- First Rox road game scoring more than 8 runs and the 2nd with more than 6. They began the night with an average of 3.5 R/G away, 6.3 at home.
Reds 7, @Mets 3: Jay Bruce ran out the second inside-the-park home run in Citi Field history, after Pistol Pete Bay wiped out in a valiant effort to catch or corral a deep drive and accidentally slid head-first into the LF wall. Bay wobbled off the field and may have another concussion. The previous Citi ITPHR was hit in 2009 by Angel Pagan against Pedro Martinez, in his first game in the new park after calling Shea home for 4 seasons.
- Brandon Phillips homered in his 3rd straight game, a new personal best. His 40 RBI are 1 behind Dan Uggla among all second basemen.
- Kirk Nieuwenhuis homered for the 2nd straight game, after sitting out for the first time since he was called up for the 2nd game of the year. He has appeared in 63 of their 65 games. The franchise record for a first-year player is 151 games by Rey Ordonez in 1996.
- Ike Davis hit in his 6th straight game. The 9-for-16 barrage has raised his BA from .158 to .191 — a season high.
_______________
I’m too tired to organize these, so….
Three hitters began Friday’s games with 3-game HR streaks: Trevor Plouffe of the Twins, Oakland’s Brandon Moss (who hit 4 HRs in a Coors series), and Brandon Belt, the first Giant in almost 3 years to do it at home. Belt hadn’t hit a HR all year, 141 PAs, and began the series with as many or more career HRs in Coors Field (3 in 6 games) and the Marlins’ old park (2 in 1 game) as he had on his home ground (2 in 55 games). Those 3 in Denver also were in consecutive games. So Belt, with 12 career HRs in 114 games, has two 3-game HR streaks, each within a single series.
Carlos Pena walked and scored in his 2 PAs against Carlos Zambrano Friday — just as he did in his only prior encounter with Big Z, last Saturday. Four career PAs, 4 walks, 4 runs. For good measure, Pena walked and scored in his third trip Friday, against Chad Gaudin. Three “disaster starts” in his last 5 outings have multiplied Big Z’s ERA from 1.96 to 3.92. Pena has 4 hits in 37 ABs this month, no HRs — but he’s still scored 9 runs in 13 games, thanks to 14 walks.
James McDonald (1 R in 6 IP) has not allowed more than 3 runs in any of his 13 starts, the longest streak this year. He’s 3 shy of the Pirates’ live-ball record, set by Vern Law in 1959 (the year before he won the CYA).
I loved that triple by Dempster. Why? Because it illustrated just how hard baseball is. We see, night after night, the best players in the world making the game look easy. But there is a great player (Adrian Gonzalez), playing out of position in RF… and he looked very much like a beer-league softball player taking the wrong angle and flopping to the ground with a caught-in-between effort. Thanks Adrian!
Of course, Jason Bay didn’t look much better, and he’s an actual outfielder.
*Just teasing
Good point, Voomo, and always worth remembering.
Holy cow, how long does it take you to look all this stuff up?
Interesting that none of Jackson, McGwire, Canseco, Tejada or Giambi are on that list of five or more straight homer games.
Carlos Pena ended up with a pretty unique line that you don’t see very often:
AB R H RBI
1 4 0 0
Since 1918 it’s happened 9 other times by Darren Lewis, John Jaha, Harry Hooper, Jim Gilliam, Charlie Gehringer, Russ Derry, Barry Bonds, Dick Bartell and Floyd Baker.
Thanks, Richard. One of the pleasures of searching for matching events is running across players like Russ Derry and George Alusik, who had decent but brief careers, and whom I’d never heard of.
Russ Derry, a wartime replacement in MLB, was a 4-time HR champ in the minors, 2 before and 2 after his big-league days. If you look too briefly at his ’45 season with the Yanks, 13 HRs seems no big deal; but he did that in 253 ABs, on a team that had just one guy with more than 10 — Nick Etten, 18 HRs in 565 ABs, which was 2nd in the AL.
Derry struck out a lot for the day and had a low BA, but with walks and power he was a plus hitter in that league. But they sold him to the A’s after that, and he was soon back in the minors for good. He had several good years in AAA after that, with HR titles in 1949-50, averaging 36 HRs, 122 walks and a 1.000 OPS — just the kind of offensive game that was being featured in the majors in that era, but he never got another shot.
John: Russ Derry hit 3 GS in his first 47 games. Is there any way to see if that was a record at the time (1945)? I know Shane Spencer beat that.
Never mind, I did it. It was a record at the time.
Actually, after 4 PA’s, Carlos Pena had 4 R and 4 BB. If he had not gotten a 5th PA, I believe he would have been only the third major leaguer ever to walk 4 times in 4 PA’s and score all 4 times.
Paul E: You are correct, Joe Morgan (7-27-73) and Rickey Henderson (7-29-89) are the only players to do that.
Since 1918, Pena is the only first baseman to score 4+ runs without a hit.
Except for catcher and pitcher, every other position is represented in that list.
19 different catchers had 3 runs without a hit; Ray Schalk did it twice. Surprised not to see Tenace or Tettleton in that bunch.
7 pitchers have scored 3 times without a hit. The only one since 1983 is Joel Pineiro in this game.
Can somebody explain the Elo player rating system algorithm to me, because the following explanation at BR.com makes no sense:
“All players have an initial rating of 1500 points. These ratings are then updated by randomly selecting pairs of players and having them “play” each other. We start by computing the win probabilities for each player (let’s call them A and B):
P(A wins) = 1 / (1 + 10^((RB - RA) / 400))
P(B wins) = 1 / (1 + 10^((RA - RB) / 400))
where RA = rating for A and RB = rating for B
After the winner has been determined, the ratings of the two players are adjusted. If A wins the match then the new ratings are:
RA_new = RA + K * P(B wins)
RB_new = RB - K * P(B wins)”
If everybody starts with 1500 points this will never work, because RB-RA and RA-RB will both always be zero for every matchup, and therefore the probabilities will always be equal (at 1/2 for everyone).
By “play each other” it means that the site picks two players and asks the user which one is better. It presents the user with some statistics to make the comparison but, other than that, it is completely subjective.
I still don’t get it. So the user is supposed to estimate P(B wins) and P(A wins), or just makes a choice of who will win? If the latter, then I still don’t see how the algorithm proceeds with that information.
Later in their explanation is the following sentence: “Before opening this up to the public we simulated several 100,000 matchups in order to give the players more realistic starting ratings.”
OK, so every player *doesn’t* really start with 1500 points? And if they did at the beginning of these simulations, then my original question remains.
My recollection is they did the simulations using WAR to start – so they wouldn’t have the user making a choice between Babe Ruth and Shooty Babitt. But the choice is A or B, which one is the better player. I believe it works more or less how ratings among chess players are done.
That makes sense. Obviously they had to initialize the thing with *some* type of difference between players, and WAR would be the logical stat to choose. I guess they just badly messed up their explanation.
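On the question raised above about everyone starting at 1500, a minimal sketch of the quoted update rule may help. It assumes K = 32 (the quoted explanation never gives K), the function names are made up for illustration, and the B-wins branch is simply the mirror image of the quoted A-wins rule. The point it shows is that equal ratings do give 1/2 probabilities, yet the ratings still separate, because the winner is decided by the user’s pick and not by the probabilities.

```python
# Sketch of the Elo update quoted above. K = 32 is an assumed value; the
# quoted explanation does not say what K the site uses.
K = 32.0


def win_prob(r_self, r_other):
    """P(self wins) = 1 / (1 + 10^((r_other - r_self) / 400))."""
    return 1.0 / (1.0 + 10 ** ((r_other - r_self) / 400.0))


def update(r_a, r_b, a_wins, k=K):
    """Return (new_r_a, new_r_b) after one matchup.

    The A-wins branch is the quoted rule. The B-wins branch is its mirror
    image, which is an assumption since the quote does not spell it out.
    """
    p_a = win_prob(r_a, r_b)
    p_b = win_prob(r_b, r_a)
    if a_wins:
        return r_a + k * p_b, r_b - k * p_b
    return r_a - k * p_a, r_b + k * p_a


# Two players both starting at 1500: the probabilities are 1/2 each, but the
# user's pick still moves the ratings apart by K/2 in each direction.
first = update(1500.0, 1500.0, a_wins=True)

# A second matchup from the new ratings: A is now the favorite, so another
# win is worth less than the first one was.
second = update(first[0], first[1], a_wins=True)
```

With these assumed values, the first pick moves the pair from 1500/1500 to 1516/1484, and the second win earns A a little less than 16 points. The total number of points in the pool never changes.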
Tangent: In a discussion of low batting crouches during last night’s Mets broadcast, Ron Darling mentioned Shooty Babitt.
Here’s what puzzles me: How does Ron Darling know Shooty Babitt’s batting stance? They were never in the same league, never really even at the same level of baseball at the same time. They grew up thousands of miles apart. Darling played college ball, Babitt didn’t.
Darling mentioned Babitt and Rickey Henderson as being low targets in the same lineup, so he was probably talking about 1981, Shooty’s lone year in the majors. But when and how would Darling have seen those games? Ron was pretty busy in ’81, pitching for Yale and then starting his pro career. And there weren’t MLB games all over cable TV back then.
Spring training, maybe? The Rangers and A’s both train in Arizona now, so I assume they did so back then.
These are the things that keep me up nights….
All of Shooty’s baseball cards (Topps, Donruss, and Fleer) have him standing around. No batting stance.
Maybe Darling just remembered that he was 5’8″ and lumped him in with Rickey. Or maybe somehow he knew that Babitt tried to adopt Rickey’s style.
Can we just send Ron Darling an email?
Weird that Darling and I would both bring up Shooty Babitt. I wasn’t watching that part of the Mets game so it is a complete coincidence.
I thought maybe Babitt was an instructor of some kind with the A’s when Darling was there. I looked up his Wikipedia page and it doesn’t mention anything about that, but he has been an advance scout for the Mets since 2008 so their paths have probably crossed in the last few years. It also mentioned that Babitt’s son was picked in the 10th round of the recent draft by the Dodgers.
You guys have awe-inspiring knowledge of Shooty Babitt. These are the posts that keep me humble. I’d thought perhaps Evan had invented a clever name, and now I have to question whether I’m even HHSworthy. . . .
It’s a popularity contest. They should make you post your age and, if say Mickey Mantle retired before you were born, you forfeit your inalienable right to judge him against anyone – his contemporaries and those you have seen.
The ELO rater gives sluggers who were mediocre or poor fielders a bad name.
Or very possibly a self-confirming exercise of the validity of WAR.
I’ve watched the Phillies versus the Toronto Energizer Bunny, Brett Lawrie, this weekend. Playing the game like a hockey player on a 40-second shift, he does some seriously reckless and “unwise” (kind & understated for stupid) things out there. He doesn’t take a walk and, so far, is a piss-poor base stealer. He’s on pace for 36 EXB hits and 32 BB’s…He’s only a rookie, but I think his talent pales in comparison to Trout and Harper. That being said, “WAR” loves him….I dunno.
It must be me.
You’ll probably appreciate this piece Paul:
http://www.beyondtheboxscore.com/2012/6/16/3085251/be-wary-of-war-a-cautionary-tale
Really good article, Jim. Embedded is a link to a BProspectus article which does a great job of explaining exactly how Brett Lawrie’s dWAR number is so high this year at B-Ref. Normally I wouldn’t steal a link from a link that you just posted, Jim, but since we’ve all been a little baffled by Lawrie’s numbers this year, I’m going to let others see it in case they don’t click on the link in your article:
http://www.baseballprospectus.com/article.php?articleid=17183#commentMessage
I just can’t resist. Let me summarize what this article is saying. DRS’ +/- rating system works this way: If a player makes a play that is not made by 60% of other fielders at his position, he is given a +0.6 rating. If he doesn’t make that same play that 40% of fielders do, he is given a -0.4.
So since Lawrie has been placed in short right field, a place where you will never find other third basemen (except maybe a Pirates 3Bman, since they are reportedly using a similar shift), when Brett makes a play in short right he is credited with making a play that 99% of all other 3Bmen would not make (utterly disregarding the fact that most third basemen are, you know, standing near the third base bag). So Lawrie’s +/- rating on this play is very near a full point, or 0.99. That’s where the faulty inflation of his numbers is occurring. Sure, he’s penalized for bunts down the third base line with the shift on, but obviously far fewer batters are bunting against the shift than hitting into the teeth of it.
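The crediting rule summarized above can be put in a few lines. This is only a sketch of the per-play +/- idea as described in this comment, not the actual DRS computation (which is not public), and the function name is hypothetical.

```python
def plus_minus_credit(p_made, made):
    """Credit for one play under the +/- rule as summarized above.

    p_made is the share of fielders at the position who make the play
    (0.0 to 1.0). Making the play earns 1 - p_made; missing it costs p_made.
    This is a simplification, not the real DRS formula.
    """
    if not 0.0 <= p_made <= 1.0:
        raise ValueError("p_made must be between 0 and 1")
    return (1.0 - p_made) if made else -p_made


# An ordinary play that 40% of fielders at the position make.
ordinary_made = plus_minus_credit(0.40, made=True)
ordinary_missed = plus_minus_credit(0.40, made=False)

# A ball fielded in short right by a shifted third baseman: almost no other
# third baseman "makes" it, because almost none of them are standing there.
shifted_made = plus_minus_credit(0.01, made=True)
```

The ordinary play is worth +0.6 or -0.4, while the shifted play is worth +0.99. One routine grounder into the shift is therefore credited like a near-impossible play.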
Thought I ought to check in here, being this site’s #1 heretic regarding the sanctity of fielding WAR.
My comment this time: If Lawrie’s current dWAR were an isolated anomaly, that would indicate merely that the stat formulators hadn’t allowed for the particular shift being used. But 1) it isn’t an isolated anomaly, as prior discussions here have shown. Similar things have been cited time and again, just none so obvious and exaggerated. 2) There’s no way the stat formulators can adjust their numbers to allow for a shift that will be credible. That’s just an opinion from a non-statistician, of course, but some things are too obvious to need figures to back them up.
“Some circumstantial evidence is very strong, as when you find a trout in the milk.” —Thoreau
Further thoughts the next morning:
Not having seen anyone trying to defend dWAR in this circumstance, at least here, I have to assume that silence means assent. But will this gigantic pin prick the bubble, or will people go on citing dWAR as if it has the force of truth behind it?
And will the situation generate at least a little skepticism about the less ludicrous but still suspect results of the pitching WAR formulation?
This is all really interesting stuff.
It took me a while to find a definition/explanation of DRS (Defensive Runs Saved), but even that did not provide the exact formula or algorithm, assumedly because it’s proprietary? Apparently the +/- that bstar discusses above is only one component of DRS, but there’s no detail provided about how the different components are integrated to create the final DRS score.
(http://www2.baseballinfosolutions.com/innovations/defensive-runs-saved/)
I’m also wondering how often teams like the Jays and Rays put these extreme shifts on and whether such data is available?
I also found this quote at the article I linked to above:
“Defensive measures will not be perfectly accurate until Fielding F/X data is released to the public.” Apparently this refers to the raw data (collected by Baseball Info Solutions?)
Jim, Hit F/X=Fielding F/X. It’s the batted ball version of Pitch F/X, which you can see on every thrown pitch over at MLB.com when you’re watching gameday over there. Exact trajectory, angle, velocity, etc. That kind of stuff for every batted ball. Apparently MLB clubs already have this at their disposal but are not releasing it to the public yet. It should do a better job of helping to quantify dWAR and also BABIP, where we will finally be able to see, for example, if certain pitchers are indeed inducing weaker contact or not.
no stat, this has never, to me, been about a “dWAR is the truth/trash the whole system” type of argument. It’s your choice to see it that way, if you want. There are more examples of questionable dWAR numbers in the first article that Jim linked to. In fact, that’s the gist of the entire piece. But the author cautions those who seemingly want to put dWAR’s inexactness front and center and use it to argue that WAR itself is the problem:
“…The critiques of sabermetrics always begin and end their arguments with the imperfection of defensive stats. However, to say that WAR is a flawed (or dare I say useless) statistic, because of the imperfection of its defensive component, is wrong and a complete and utter cop out. Defense has to be a component of WAR. WAR attempts to put a single number to everything that occurs on the baseball field. Defense is a large component of the game; thus, it needs to be included in WAR. The day of Field F/X has yet to come (and it may never); thus, sabermetricians cannot sit around and wait for that day. They must keep trekking on by evaluating defense to the best of their ability with the information that is available…”
bstar:
I think you misinterpret my complaint, or the major part of it:
Read rightly and you will see that I’m calling for a skeptical approach to complex statistics. Far too many posts here cite WAR as if it were the ultimate truth. It isn’t. It may be a useful tool, or it may be a club to beat your opponent in an argument over the head with, as if there were no court of appeal beyond the holy writ of the sabermetrics of the moment.
As for wanting to junk WAR, that would be a vain pursuit. Stopping people from accepting it as something that it isn’t seems to get people’s backs up, too, and I’m about ready to give up trying.
OK, that’s fine. It just seemed, from @42 and @43, that you were looking for some sort of response from someone, so I offered one. Who here is taking WAR as the ultimate truth and refusing to view some seemingly odd numbers with at least a grain of skepticism? I just don’t really see that on this site necessarily.
I didn’t really hear anyone not looking for some sort of explanation as to why Brett Lawrie’s defensive numbers look like they will end up as the single-season best of all-time, and perhaps we may have uncovered why. To me, that’s a positive thing.
Great, thanks for that explanation bstar, very helpful. I assumed it was raw data of some kind. So apparently, MLB (or individual MLB teams?) collect this data in real time, and Baseball Info Solutions also creates their independent version of it by watching game telecasts and replays. I guess.
If the different versions of dWAR described were easily available, I’d do some simple linear regressions between them. The article I linked to only evaluated the top ten players; better to evaluate all regular position players. But alas, I don’t see that the data are easily obtained for any of the three methods.
Jim, thanks. In the comments section of the WAR article you linked, I can’t really tell if you agreed that taking an average of the three defensive metrics listed is a good approach. You mention a “robust mean” which uses pair similarities. Can you explain that further?
I’m pretty good at stats; I was studying to be an actuary at one point ’til real life intervened, but that was over twenty years ago. So if you could go as slow as possible I think I could understand it.
Do you think a mean, median, or “robust mean” would be the best approach if one were to try and take an aggregate look at defensive numbers? I don’t understand what the author means when he says that by taking the average, ‘the differences would be lost in translation’. Isn’t this what every average is doing, losing the differences in translation? What does he mean by this?
bstar, somebody over there was wondering “why not just average the three versions of WAR” (I guess there are actually four?). I was responding to that. The basic idea is that if you have one version that is way out of line with the other two, don’t average the three; down-weight the one that is most out-lying. Of course, there’s no guarantee that that one is the “wrong” one; this is a purely statistical approach that assumes that an outlying measure is more likely to be in error than others that cluster near each other. In this case however, the Prospectus piece that you linked to gives good reason to believe that the rWAR computation is indeed more strongly biased than the other two versions, so it’s a reasonable solution.
BTW that BP piece was really good!
Man am I ever scatterbrained right now. To continue:
“Do you think a mean, median, or “robust mean” would be the best approach if one were to try and take an aggregate look at defensive numbers? I don’t understand what the author means when he says that by taking the average, ‘the differences would be lost in translation’. Isn’t this what every average is doing, losing the differences in translation? What does he mean by this?”
I’m not exactly sure what he means, but I have a decent idea. Taking the mean of any set of numbers will tend to bring the true signal out from the noise–that’s a big reason for so computing. But this is only true when the “noise” around that mean is randomly distributed (and a couple other assumptions also hold, e.g. independence and constant variance of those deviations from the mean). If these conditions are not met, taking the mean becomes correspondingly less accurate, i.e. biased. In that case you need to compute some other metric of central tendency, of which there are a number of options, like the median, or a robust mean that down-weights the measure lying furthest from the cluster formed by the rest of the group.
My opinion is that in this case, a robust mean would be best, because a median can also have real problems when sample sizes are small. However, you have to come up with a weighting scheme for a robust mean, and there’s potentially subjectivity in so doing.
Got it, I think. 🙂 So you’re sort of saying that the average of 20 for this set of numbers {3,4,2,4,87} doesn’t give us a great idea about these values since 87 is so out there, right? So you would down-weight the 87 accordingly to get a better representation of those numbers. Is that the point of a “robust mean”? If it is, I don’t really see why that’s any better than a simple average because as you pointed out we don’t really know that the outlying number is in any way “less correct” than the others. But thanks for the explanation, Jim.
And for what it’s worth, I do think side-by-side looks at the three (or four, whatever it is) most popular defensive metrics could work because they are all based on runs saved, if I’m not mistaken.
Yes, you got the point of a robust mean; it’s just a way of dealing with suspected outliers. Of course, unless one has independent information, one often does not know whether an identified outlier is actually valid data or not–a *very* common occurrence. Given this fact, the assumption of ~normally distributed data is made and aberrant data are identified based on the very low probability (e.g. p < .00001) of high deviations from the mean, like say 5 standard deviations or more. In this case, however, we have legitimate reason to believe that one of the three metrics (DRS) is indeed biased, as detailed in that BP article. That is, its numbers are (apparently) heavily biased whenever balls are hit to fielders who are playing way out of their normal location; those plays (apparently) have a much larger influence than do other plays, hence biasing the DRS metric. Or at least, that's their argument.
“So you’re sort of saying that the average of 20 for this set of numbers {3,4,2,4,87} doesn’t give us a great idea about these values since 87 is so out there, right?”
Right. Assuming we know nothing whatsoever about the process that generated that set of five numbers, the likelihood that they would have that distribution, given the kinds of distributions typically found in the real world, is very low. And it’s the 87 that makes it so, because it’s about 85 standard deviations from the mean of the other four values. Assuming the data are even only very approximately normally distributed, that’s an astronomically low probability. Hence we conclude: bad data point.
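For anyone who wants to check the numbers in this exchange, here is a rough sketch using bstar’s example set. The function names are invented for illustration, and the weighting scheme (one fixed weight of 0.1 on the value furthest from the median) is an arbitrary stand-in for the subjective choice mentioned above, not a standard estimator.

```python
from statistics import mean, median, stdev


def outlier_z(values, idx):
    """How many sample standard deviations values[idx] sits from the mean
    of the remaining values."""
    rest = values[:idx] + values[idx + 1:]
    return (values[idx] - mean(rest)) / stdev(rest)


def robust_mean(values, outlier_weight=0.1):
    """Weighted mean that down-weights the single value furthest from the
    median. The 0.1 weight is an arbitrary, illustrative choice."""
    center = median(values)
    far_idx = max(range(len(values)), key=lambda i: abs(values[i] - center))
    weights = [outlier_weight if i == far_idx else 1.0
               for i in range(len(values))]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


data = [3, 4, 2, 4, 87]
plain = mean(data)       # ordinary average, pulled up by the 87
middle = median(data)    # ignores how far out the 87 is
robust = robust_mean(data)
z = outlier_z(data, 4)   # distance of the 87 from the other four, in SDs
```

On this set the plain mean is 20, the median is 4, and the down-weighted mean comes out near 5.3. Using the sample standard deviation of the other four values, the 87 sits roughly 87 standard deviations out, the same ballpark as the figure quoted above.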
The Royals/Cards game ended with a run of the mill Tyler Greene SB(2B)/Adv to 3B – E2/out at home 4-5-2 play:
http://mlb.mlb.com/video/play.jsp?content_id=22319555&topic_id=11493214
When I watched this play on an MLB Gameday live look-in I wondered if Greene was safe (it looks to me like he was probably out on account of an elevated foot despite a questionable tag), but on looking at the highlight I wonder if he would’ve been safe had he not been slowed slightly by the positioning of the 3B umpire.
Only one error is given on the play (on the catcher’s throw to 2nd).
And there is no Caught Stealing on the play at home.
So what is the box-score lingo for why the runner was attempting to get from 3rd to Home? It would have been an error on the throw to 3rd had he been safe at home. But since he was out, is it a Fielder’s Choice (as in “the fielder Chose to throw it ten feet wide of 3rd base to induce the runner to attempt the third leg of a 270 foot sprint”)?
Voomo, I believe the P-B-P phrase is “runner out advancing.”
Evan, thanks for that link. FWIW, I thought Greene was out, thanks to an amazing perfect throw by Moustakas.
Dunn now has more RBI than Hits. That has only been done once in a qualifying season, by Mark McGwire in 1999. He also leads the league in home runs, walks, and strikeouts, which as far as I can tell hasn’t been done since Dale Murphy in 1985.
He has also earned a -1.2 dWAR
…playing only 24 games in the field
…and he has not committed an error !
If you can’t get to the baseball you never have an opportunity to commit an error.
But don’t the announcers always say that the most difficult play for a fielder is the ball hit right at them?
Which is comparable to Rafael Palmeiro, who, when he won a Gold Glove in 1999 while playing 28 games in the field, had a -1.1 dWAR.
“The franchise record since 1918 (covering all 3 cities)…”
So, the A’s are the only franchise to move, in two hops, from east to middle to west across the country I guess.
That’s a .403 BABIP for Melky Cabrera, trailing only Joey Votto at .420….both Mike Trout and Kirk Nieuwenhuis are top ten in BABIP in MLB as well.
Ugh. Just a note on one of today’s (SAT) games.
As a Yankees fan I am glad to have won, but really Joe? Really?
You asked the guy with the 7.66 ERA,
who had pitched 1.2 innings since May 21st,
to protect the lead in extras,
on the road,
for two innings?
Before going to your closer?
And Garcia makes you look like a genius by being perfect?
FWIW, Voomo meant “protect the tie” there in reference to Freddy Garcia, the guy with the 7.66 ERA.
Yes, JA, thank you.
And why would Garcia HIT for himself with a 2 run lead and men on????
No more batters? Hey, how about Sabathia? 61 career OPS+.
C’mon, think outside the freakin box!
Well, Voomo, when you’re playing with house money, as Girardi certainly was after the home plate ump utterly blew the call at the play at the plate in the Nats’ half of the eighth, you perhaps get the idea that Lady Luck is on your side, and you’re even willing to roll the dice with Freddy Garcia. And the dice stayed smokin’ for him.
To Jim & bstar (36/38/39):
Thanks for “peeling the onion”. That being said, Lawrie is only 22 years old and, who knows, could be the next Ryne Sandberg or Adrian Beltre.
On Yu Darvish and his 8th win of the season: he and Wei-Yin Chen (7-2, going for his 8th win tonight, 6/18) are hot on the trail of Daisuke Matsuzaka’s 15-win rookie campaign in 2007, in which he set the rookie record for wins in a season by an Asian-born player.
My bad, Wei-Yin Chen earned his 7th on Sunday.