This came up at lunch: “Over a season, how many more plate appearances does the leadoff batter have, compared to the ninth-place batter?”
Bill made an insightful (if not completely correct) estimate, based on the premise that the leadoff spot gets one more PA than the nine spot every game and therefore 162 more PAs in a 162-game season. The flaw is that if the last out is made by the ninth batter, then the leadoff and nine spots have the same number of PAs. If you assume that each spot has an equal chance of making the last out (1/9 = 11.11%), then leadoff will have one more PA in 8/9 of games: 162*8/9 = 144. That turns out to be a very good estimate, as we shall see.
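To make the one-PA gap concrete, here is a minimal Python sketch that splits a team's plate appearances in a single game across the nine lineup slots. The function names are made up for illustration, and the sketch assumes the order simply cycles from slot 1 through slot 9 with nobody batting out of turn.

```python
def pa_by_slot(team_pa):
    """Split a team's PAs in one game across lineup slots 1..9.

    Assumes the batting order cycles 1, 2, ..., 9, 1, 2, ... with no skips.
    """
    full_trips, extra = divmod(team_pa, 9)
    # Every slot bats once per full trip through the order; the first
    # `extra` slots get one more PA on the final, partial trip.
    return [full_trips + (1 if slot <= extra else 0) for slot in range(1, 10)]


def leadoff_minus_ninth(team_pa):
    """PA gap between the leadoff slot and the ninth slot in one game."""
    slots = pa_by_slot(team_pa)
    return slots[0] - slots[8]


print(pa_by_slot(38))           # [5, 5, 4, 4, 4, 4, 4, 4, 4]
print(leadoff_minus_ninth(38))  # 1
print(leadoff_minus_ninth(36))  # 0, because the ninth slot batted last
print(162 * 8 / 9)              # 144.0
```

The gap is 1 unless the team total is a multiple of 9, which is exactly the case where the ninth slot has the team's last PA.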
Of course, I had to dig into the Retrosheet data to see for myself. I knew I wanted to count PAs, because that would tell me the total PAs for the team for each spot in the batting order. I used the box-score proofing formula to calculate PAs from the Retrosheet Game Log data.
AB + BB + SAC + HBP + INT = R + PO + LOB
The explanation from Wikipedia:
In a baseball game, the number of plate appearances for each team must be equal to the number of batters put out, scored, and left on base. A box score is in balance (or proved) when the total of the team’s times at bat, bases on balls received, hit batters, sacrifice bunts, sacrifice flies and batters awarded first base because of interference or obstruction equals the total of that team’s runs, players left on base and the opposing team’s putouts. In other words, the box score is accounting for the number of batters and what became of them (scored, left on base, or put out). If a box score is unbalanced, then there is a logical contradiction and thus an error somewhere in the box score.
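As a rough illustration of the proofing check, the sketch below computes both sides of the equation for one team's line in one game. The dictionary keys mirror the abbreviations in the formula above and are not Retrosheet's actual column names; the sample numbers are invented, not taken from a real game. SAC is assumed to cover both sacrifice bunts and sacrifice flies, and PO is the opposing team's putouts.

```python
def pa_from_batting(line):
    """Left side of the proof: AB + BB + SAC + HBP + INT."""
    return line["AB"] + line["BB"] + line["SAC"] + line["HBP"] + line["INT"]


def pa_from_outcomes(line):
    """Right side of the proof: R + PO + LOB (PO = opponent's putouts)."""
    return line["R"] + line["PO"] + line["LOB"]


def proves_out(line):
    """True when the box score line balances."""
    return pa_from_batting(line) == pa_from_outcomes(line)


# Invented example line for a visiting team that batted nine full innings.
sample = {"AB": 33, "BB": 3, "SAC": 1, "HBP": 0, "INT": 0,
          "R": 4, "PO": 27, "LOB": 6}

print(pa_from_batting(sample))  # 37
print(proves_out(sample))       # True
```

When the two sides agree, either one can serve as the team's PA count for the game.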
As was discussed at lunch, a batter who is at the plate when the inning ends elsewhere (pickoff, caught stealing, etc.) is not charged with an official PA. Scott passed along the following from Wikipedia. (By the way, I can find no definition of plate appearance in the official baseball rules.)
A batter is not charged with a plate appearance if, while batting, a preceding runner is put out on the basepaths for the third out in a way other than by the batter putting the ball into play (i.e., picked off, caught stealing).
That reminds me. When I refer to the batter who made the last out, what I really mean is the batter who has an official plate appearance when the last out is made.
Anyway, getting back to the proofing equation, I calculated both sides of the equation and verified that they matched. (There were 7 games in the last 20 seasons that did not prove out. The couple I checked were games called by rain, but I haven’t investigated why these didn’t prove out.) The data from 1974-2013 included 176,121 games. The answer for the last 20 seasons is that the ninth-place batter will have 142.4 fewer plate appearances than the leadoff batter. The summary appears below. Congrats to Tom, who came closest with his guess of 140! (His estimate was based on 162*8/9, but I failed to understand that at lunch.)
Scott points out that Bill James also came up with 142 using data from 2006 only.
Still, 142.4 is a bit lower than the “theoretical” estimate of 144. I’d explain that by saying that the leadoff hitter gets “cheated out” of a few PAs, due to the higher likelihood that the bottom of the order will make the last out for their team. The same data used to calculate the number of PAs can also be used to count the number of times a particular slot in the order makes the last out of the game for their team. If this were completely random, the chance for each would be 1/9=11.11%.
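The gap between 142.4 and 144 can be turned into an implied probability. This is a back-of-the-envelope sketch, not part of the original analysis: it assumes a 162-game season and treats the gap as games times the chance that someone other than the ninth slot has the team's last PA. The function name is made up for illustration.

```python
def implied_ninth_share(observed_gap, games=162):
    """Share of games in which the ninth slot must have the team's last PA
    for the leadoff-minus-ninth gap to equal `observed_gap`.

    Assumes gap = games * (1 - share), i.e. the gap is 1 in every game
    except those that end on the ninth slot.
    """
    return 1 - observed_gap / games


print(round(implied_ninth_share(144.0) * 100, 2))  # 11.11
print(round(implied_ninth_share(142.4) * 100, 2))  # 12.1
```

Under those assumptions, the ninth slot would have the last PA in roughly 12.1% of games rather than the 11.11% a uniform split predicts.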
The graph clearly shows the average propensity of each slot to make an out, with the 3 & 4 batters being the best hitters. (The sabermetricians will point out that clubs are leaving runs on the table by not putting their best hitters in the first two spots.)
Going back to the numbers in the table, we see that the average PAs by the leadoff spot is 761. That has actually been surpassed eight times by single players. The record is 778 by Jimmy Rollins in his 2007 MVP season, during which he batted leadoff and played all 162 games for a good offensive team.
For fun, I looked at the histogram of team PAs per game. The mean is 38.4, and the mode is 36.