Notes on 2012 MLB averages

Just some quick notes on MLB seasonal averages:

  • Run scoring is up slightly from last year (4.33 R per game, from 4.28), but is still the 2nd-lowest value since 1992.
  • Strikeouts have reached another all-time high, which has been true for 5 years running. So far in 2012 they stand at 7.5 per game, up a whopping 5% from the record rate last year of 7.1.
  • Meanwhile walks are down to 3.06 per game, the lowest value since 1968!
  • As you might imagine, the 2.43 K/BB ratio is astronomically high. That’s up nearly 6% from last year’s ratio, which itself was an all-time record.
  • Hits are at 8.65 per game, the lowest value since 1989.
  • Attendance stands at 31,381 per game, the highest since 2008 and a pinch higher than the pre-strike level of 31,256 in 1994.
  • Intentional walks are down to just 0.21 per game. That’s the lowest level in recorded history, which goes back only to 1955. I presume that as run scoring goes down, managers are increasingly reluctant to put more runners on base. Sabermetrics has come a long way in the last 20 years to show just how much the chances of scoring increase when any batter is walked.
  • Interestingly, sacrifice hits have also dropped way down, to just 0.30 per game. That’s also the lowest level since they’ve been recorded, which is back to 1954. Managers have also learned that giving up an out in exchange for a base advance is worth far less than was thought for most of the 20th century.
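
As a quick check on the arithmetic behind the year-over-year comparisons above, here is a minimal Python sketch. It uses only the rounded per-game figures quoted in the bullets; the helper name `pct_change` is made up for illustration. Because the inputs are rounded, the results can differ slightly from the figures in the post (the K/BB ratio, for instance, comes out a bit above 2.43).

```python
def pct_change(old, new):
    """Percentage change from old to new."""
    return (new - old) / old * 100.0

# Rounded per-game figures quoted in the bullets above.
runs_change = pct_change(4.28, 4.33)       # runs per game, 2011 -> 2012
strikeout_change = pct_change(7.1, 7.5)    # strikeouts per game, 2011 -> 2012
k_per_bb = 7.5 / 3.06                      # strikeouts divided by walks, 2012

print(f"Runs per game change:       {runs_change:.1f}%")
print(f"Strikeouts per game change: {strikeout_change:.1f}%")
print(f"K/BB from rounded inputs:   {k_per_bb:.2f}")
```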

31 Comments
MikeD
11 years ago

It’ll be interesting to see how the decreasing offensive environment unfolds in the coming years. My gut is we are in a transition period, with elements of the high-offense “old guard” still present, while elements of the “new guard” begin to come more into play. The age of defensive and speed players may return. Perhaps we’ll have another Charlie Lau-style hitting guru who will stress a more contact-oriented, less swing-for-the-fences approach. I’m not predicting a true 1968, Year-of-the-Pitcher season, but I think that within the next five to seven years, it’s possible we’ll approach early-1970s levels of hitting.

Nash Bruce
11 years ago
Reply to  MikeD

Mike, I agree…..baseball now, it looks to be the best of all worlds, the sabermteric reality, accompanied by the non-steroid reality. Things are looking up!

Nash Bruce
11 years ago
Reply to  Nash Bruce

ugh, edit button plz- *sabermetric 😉

no statistician but
11 years ago
Reply to  Nash Bruce

Sabermetric reality, the ultimate oxymoron.

no statistician but
11 years ago

Oops.

Irony alert.

Ed
11 years ago

The Phillies have by far the best K/BB ratio in baseball at 3.23. And yet they’re 11th in the NL in ERA.

Paul E
11 years ago
Reply to  Ed

Ed:
That “pound the strike zone” mentality can, at times, get a pitcher like Lee, Hamels, or even Halladay in trouble.
When asked, “What’s the worst pitch a pitcher can throw a hitter?”, Sal Maglie supposedly replied, “A strike.”

Jim Bouldin
11 years ago

I posted yesterday that my analysis regarding success in one-run (1-r) games was wrong, but it was a brief message and people may have missed it. Here’s why it was wrong, and also a new analysis to replace it. I wish I could say it was just a minor gaffe, but it wasn’t; it was a major analytical mistake that changes the whole picture. So it needs to be clarified. The problem is that I forgot to account for the expected deviation from a .500 record (for each team, over the time period of interest) that would be due…

Jim Bouldin
11 years ago

Continuing: The problem comes in when you have a larger number of teams. I alluded to this in one of my explanations to John, where I mentioned that I had to take some extra steps to try to approximate a situation in which there is a single “experimental unit” being evaluated, like a single coin being flipped repeatedly, or a two-team league in which all games are between those teams only. The problem is, I did this wrong. When you have, say, 30 teams, it’s more complicated, in a way that I hope I can convey, but which is tricky.…

Jim Bouldin
11 years ago

Continuing: What this means in terms of simulating the situation is that you have to estimate how many times *any individual team* would be expected to go over (or under) .500 by a certain amount given a defined number of games played. This is complicated by the fact that the number of 1-r games played varies from team to team, especially as the time span of interest increases, so that factor has to be included also. It also means that you cannot compute the relative probabilities of chance versus skill in being the likely cause of the overall results, using the number…

Jim Bouldin
11 years ago

Continuing: Here is the R code for the analysis:

    # simulation and games played vectors:
    sims = rep(NA, 1000000)
    r1 = c(799, 938, 939, 871, 974, 929, 918, 894, 912, 892,
           860, 936, 917, 896, 891, 874, 866, 953, 630, 652,
           904, 816, 953, 958, 884, 786, 858, 964, 900, 874)

    # game trials: pick a team's 1-r game count at random, then flip that many fair coins
    for (i in seq_along(sims)) {
      games = sample(r1, 1)
      sims[i] = rbinom(1, games, 0.5) / games
    }

    # expected 10 percent quantiles:
    quantile(sims, probs = seq(from = 0, to = 1, by = 0.1))

r1 is the vector of 1-r games played over those two decades (the two low values are the 1998 expansion teams; among the non-expansion teams, the Dodgers were high at 974 and the Tigers were low at 786). So what I’m doing there is saying, if I run a million individual games…
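
For readers who do not use R, the same idea can be sketched in Python using only the standard library. This is a rough translation rather than the commenter's own code. The function names are invented for illustration, the number of simulations is cut to 20,000 to keep it fast, the seed is arbitrary, and the cut points are read off the sorted values directly instead of using the interpolation that R's `quantile` applies by default, so results will differ slightly in the third decimal place.

```python
import random

# 1-run game counts per team, copied from the r1 vector in the comment above.
R1_GAMES = [799, 938, 939, 871, 974, 929, 918, 894, 912, 892,
            860, 936, 917, 896, 891, 874, 866, 953, 630, 652,
            904, 816, 953, 958, 884, 786, 858, 964, 900, 874]


def simulate_win_pcts(games_per_team, n_sims, rng):
    """Simulate win percentages for a coin-flip team over a randomly chosen game count."""
    pcts = []
    for _ in range(n_sims):
        games = rng.choice(games_per_team)
        # Counting the set bits among `games` random bits gives a Binomial(games, 0.5) draw.
        wins = bin(rng.getrandbits(games)).count("1")
        pcts.append(wins / games)
    return pcts


def decile_cut_points(values):
    """Return 11 cut points (min, 10%, ..., 90%, max) by order-statistic lookup."""
    ordered = sorted(values)
    last = len(ordered) - 1
    return [ordered[round(last * k / 10)] for k in range(11)]


rng = random.Random(2012)  # arbitrary seed, only for repeatability
cuts = decile_cut_points(simulate_win_pcts(R1_GAMES, 20_000, rng))
print([f"{c:.3f}" for c in cuts])
```

The inner cut points should land near the ones quoted in the next comment (roughly 0.478 at the 10% mark and 0.522 at the 90% mark), while the minimum and maximum depend heavily on how many simulations are run.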

Jim Bouldin
11 years ago

First, these are the actual results, 1992-2011 (hope this formats right):

Rk  Tm    G    W    L    W%
 1  NYY  799  443  353  0.557
 2  SFG  938  495  442  0.528
 3  CAL  939  494  445  0.526
 4  OAK  871  451  420  0.518
 5  LAD  974  503  471  0.516
 6  ATL  929  479  449  0.516
 7  CIN  918  472  443  0.516
 8  MIN  894  460  433  0.515
 9  FLA  912  465  447  0.510
10  CHW  892  453  436  0.510
11  BOS  860  439  421  0.510
12  SDP  936  475  461  0.507
13  PHI  917  460  457  0.502
14  HOU  896  449  446…

Jim Bouldin
11 years ago

Now, the quantile (bin) cut points for ten groups and the number of teams expected and observed in each (expect 3 in each, since 10 x 3 = 30 teams total).

Min    10%    20%    30%    40%    50%    60%    70%    80%    90%    Max
0.405  0.478  0.486  0.491  0.496  0.500  0.504  0.509  0.514  0.522  0.581

Team  Win%   Obs.  Exp.
NYY   0.557  1     1
SFG   0.528  1     1
CAL   0.526  1     1
OAK   0.518  2     2
LAD   0.516  2     2
ATL   0.516  2     2
CIN   0.516  2     3
MIN   0.515  2     3
FLA   0.51   3     3
CHW   0.51   3     4
BOS   0.51   3…
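
The binning step can be sketched in Python as follows. It assumes, from the visible rows of the table, that bins are numbered from the top (1 is the highest decile) and that the expected bin simply follows rank, three teams per bin. Only the teams shown before the comment is cut off are included, and the function names are made up for illustration.

```python
from bisect import bisect_right

# Cut points (min, 10%, ..., 90%, max) as quoted in the comment above.
CUTS = [0.405, 0.478, 0.486, 0.491, 0.496, 0.500, 0.504, 0.509, 0.514, 0.522, 0.581]

# Teams visible before the comment is truncated, in rank order.
TEAMS = [("NYY", 0.557), ("SFG", 0.528), ("CAL", 0.526), ("OAK", 0.518),
         ("LAD", 0.516), ("ATL", 0.516), ("CIN", 0.516), ("MIN", 0.515),
         ("FLA", 0.510), ("CHW", 0.510), ("BOS", 0.510)]


def observed_bin(win_pct, cuts=CUTS):
    """Decile bin of a win percentage, counted from the top (assumed convention)."""
    interior = cuts[1:-1]  # the nine cut points between the ten bins
    return 10 - bisect_right(interior, win_pct)


def expected_bin(rank, teams_per_bin=3):
    """Bin a team would occupy if ranks filled the bins evenly, three per bin."""
    return (rank - 1) // teams_per_bin + 1


for rank, (team, pct) in enumerate(TEAMS, start=1):
    print(f"{team}  {pct:.3f}  obs={observed_bin(pct)}  exp={expected_bin(rank)}")
```

A team whose observed bin is smaller than its expected bin (CIN, for example) sits higher in the simulated coin-flip distribution than its rank alone would suggest.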

Jim Bouldin
11 years ago

And finally, wrapping up: The distribution of teams into the ten bins follows expectation reasonably closely, although not perfectly (for example, there is an excess of teams in quantiles 2, 6 and 10, and a deficiency in quantiles 4 and 7-9, such that the bottom end of the distribution in particular is skewed). But overall, it is not too bad at all, giving some definite evidence that many games are in fact won by chance, especially by “middle of the pack” teams. However, if we look at the individual teams on the two ends of the distribution…

Ed
11 years ago
Reply to  Jim Bouldin

I’m sure a good chunk of the Yankees’ success in one-run games can be attributed to one person in particular.

Jim Bouldin
11 years ago
Reply to  Ed

I’m pretty sure that’s a big ol’ can of worms Ed. You have to develop the lead first before Rivera can protect it.

Jim Bouldin
11 years ago

One last result:

The top 12 teams on that list all fall within the upper third of the expected distribution (expect 10), while only the bottom 7 or 8 teams fall within the lower third (expect 10 again). So, the distribution overall is shifted somewhat in the “successful” direction. The Royals by themselves seem to compensate for a good chunk of this shift by their bad record.

e pluribus munu
11 years ago

Jim, once again you’ve been very clear, but I’m not sure your analysis fits the question – at least the question I’d thought had been posed – which was whether particular teams, like the 2012 O’s, may be more skilled in one-run games than in other games, in which case we might expect good outcomes to continue as the season went on, or whether it was just a matter of luck, in which case we could expect normal W-L results from now on. For example, you show that the Yankees win so many one-run games that it’s reasonable to think…

Jim Bouldin
11 years ago

epm, right, this analysis does not address the question of whether the Orioles have been skilled vs. lucky in going 19-6 in 1-r games to date or whether that rate will continue, at least not directly. The time period is short and the sample small. But I can do the same kind of analysis as the above, on this year’s data alone, to get some insight into it, and probably will. The point of the posts here was only to correct earlier mistakes on how to approach the more general question, because conceptual approach is by far the most important…

e pluribus munu
11 years ago
Reply to  Jim Bouldin

I may have mistaken the question of the earlier thread by reading my own expectations into it, Jim. Perhaps the issue was not the one I was responding to. I actually did mean more skilled in 1-r games than other games. The issue for me was whether there were reasons other than chance why a team like the O’s so outperformed their overall record in 1-r games. It’s actually an issue I’ve always found interesting, having to do with the dynamic of close games (of which there are, of course, many different types, including 1-r games that only become close…

Jim Bouldin
11 years ago

OK, thanks for clarifying, epm, and for the good words as well. I also like the idea of looking at success/failure rates in the different types of 1-r games you mention. Another helpful thing to look at would be trends in 1-r success between, say, teams’ first and second halves of seasons, where the rosters/lineups are typically less variable than they are between years. I’d like to look at that in more depth when I get some time. That would get more at the original issue of how likely the Orioles are to more or less maintain their current pace. The issue…

brp
11 years ago

In general it would be helpful to show the overall W/L% for all teams in that 1992-2011 timeframe next to the 1-run game percentages. I would imagine, for example, the Pirates’ .461 percentage is significantly higher than their overall winning percentage in that 20 year span. Or, even better, the teams’ % in just the non-one-run games, if that’s possible.

Not sure what we’re finding out here even with that data, but it’s interesting all the same.

87 Cards
11 years ago

“• Attendance stands at 31,381 per game, the highest since 2008 and a pinch higher than pre-strike level of 31,256 in 1994” … Does any reader know what percentage of total seating capacity an average of 31,381 represents?

Richard Chester
11 years ago

According to Wikipedia the average seating capacity of all ML parks is 44,234. The percentage of capacity for 2012 is 70.94%.

Richard Chester
11 years ago
Reply to  Andy

I see from your new blog that the percentage is 72.9%. I calculated the average capacity (43,250) from that ESPN attendance spreadsheet and saw that the number I got from Wikipedia is incorrect. I also found that the 2012 average attendance is 31,532. By using the correct values I got the same percentage that you did.
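
The two capacity figures in this exchange are simple ratios, which a short Python sketch can confirm. It uses only the numbers quoted in the two comments above; the helper name `pct_of_capacity` is made up for illustration.

```python
def pct_of_capacity(avg_attendance, avg_capacity):
    """Average attendance as a percentage of average seating capacity."""
    return avg_attendance / avg_capacity * 100.0

# First estimate: attendance from the post, capacity from Wikipedia.
first_estimate = pct_of_capacity(31_381, 44_234)

# Revised estimate: both figures from the ESPN attendance spreadsheet.
revised_estimate = pct_of_capacity(31_532, 43_250)

print(f"First estimate:   {first_estimate:.2f}%")
print(f"Revised estimate: {revised_estimate:.1f}%")
```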

Jim Bouldin
11 years ago

Somebody (Lincoln?) said something like “Being right for the wrong reasons doesn’t count”. Very true. But I still feel a little vindicated after running the analysis described above on the 2012 data alone. Here are the results, where column p refers to the probability of winning the observed number of games, for teams with winning records, and of losing the observed number of games, for teams with losing records:

Rk  Tm   W   L   Win%   p
1   BAL  20  6   0.769  0.0028
2   CLE  15  6   0.714  0.0133
3   ATL  16  8   0.667  0.0320
4   PIT  23  16  0.59   0.1668
5…
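
One simple way to attach a probability to a record like these is the exact binomial upper tail for a coin-flip team, sketched below in Python. This is an illustration under stated assumptions, not the commenter's method. The p column above comes from the simulation approach described earlier and its exact definition is cut off, so these values need not match it. The function name `upper_tail` and the choice of "at least this many wins" are assumptions.

```python
from math import comb


def upper_tail(wins, games, p=0.5):
    """P(X >= wins) for X ~ Binomial(games, p): chance of at least this many wins."""
    return sum(comb(games, k) * p**k * (1 - p)**(games - k)
               for k in range(wins, games + 1))


# Records visible in the table above before it is cut off.
RECORDS = [("BAL", 20, 6), ("CLE", 15, 6), ("ATL", 16, 8), ("PIT", 23, 16)]

for team, won, lost in RECORDS:
    print(f"{team}: P(at least {won} wins in {won + lost}) = {upper_tail(won, won + lost):.4f}")
```

For a losing record, the same function applied to the loss count gives the matching lower-tail figure, since wins and losses are symmetric when p is 0.5.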

no statistician but
11 years ago
Reply to  Jim Bouldin

“The last temptation is the greatest treason:
To do the right thing for the wrong reason.”

from Murder in the Cathedral, by T.S. Eliot

Jim Bouldin
11 years ago

Oh I like that!

Jim Bouldin
11 years ago

Would be really nice to have a way to properly format data tables…