SABR has a useful page for its Official Scoring Research Committee. There are downloadable files, including a booklet entitled “Official Scoring in the Big Leagues.” The newsletters are interesting too.
Thought about this ordered dice system for some reason. The idea is to use multiple, uniformly colored, six-sided dice to produce a number of values, not all of which have the same probability. For example, with three dice, one can order the die values in ascending order, like 123, 256, 255, 224, 333, etc. This yields 56 possible values with the following probabilities.
20 values of probability 6/216 = 2.78% (no dice identical)
30 values of probability 3/216 = 1.39% (two dice identical)
6 values of probability 1/216 = 0.46% (three dice identical)
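Those counts are easy to double-check by brute force. Here is a minimal Python sketch that rolls all 216 combinations and sorts each one; the variable names are my own, not from any game.

```python
from collections import Counter
from itertools import product

# Roll three six-sided dice in every possible way (6**3 = 216 outcomes)
# and sort each roll, so that only the ascending order of the values counts.
counts = Counter(tuple(sorted(roll)) for roll in product(range(1, 7), repeat=3))

# Group the distinct sorted values by how many raw rolls produce each one.
by_weight = Counter(counts.values())

print(len(counts))  # number of distinct ordered values
for ways, n_values in sorted(by_weight.items(), reverse=True):
    print(f"{n_values} values of probability {ways}/216 = {ways / 216:.2%}")
```

Changing `repeat=3` to another number of dice should give the same kind of breakdown for bigger systems.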
This system is actually used in an old tabletop baseball game, “Be A Manager.” My brother found a group of guys playing in a bar several years ago.
I had a random thought about the differences between minor leagues in terms of being hitter-friendly or pitcher-friendly. I’ve often read qualifications of individual performances, for example, “he’s hitting well, especially since that’s a pitchers’ league,” or “his ERA is not bad, considering that he’s in a hitter-friendly league.” So I decided to go to the stats. I chose to compute the averages of the last five complete regular seasons, 2013-2017. But which stats to use? Runs per game? ERA? Batting average? I decided to compile OPS and ERA as the measurements for hitting and pitching, respectively. I knew that the two would be highly correlated, and that was indeed the case. I really didn’t see anything interesting by considering both stats together, so I simply sorted the leagues by OPS. The data appears in the table below.
I was surprised to see the huge difference between the top and the bottom: 126 points of OPS, 1.59 earned runs! The next surprise was that the leagues don’t cluster much by level. The Rookie leagues are all over the map.
I had a few ideas to explain the differences, then the Commish suggested a few others. Here’s a list of possible explanations.
- Elevation. The Pioneer and Pacific Coast League parks are generally at higher elevations, which helps the hitters.
- Big Spring Training Parks. The Florida State League teams play in the Spring Training parks, which are big. The same probably goes for the Gulf Coast League, even though those are back fields.
- Wood Bats. Hitters in the Short-Season A leagues may be at a disadvantage, because some of the hitters are using wood bats regularly for the first time.
- Windy Florida. Maybe windy conditions are tough on the hitters in the Florida State League and GCL.
This analysis didn’t turn up much of interest. Although I’m not a fan of the DH rule, I had some ideas that the use of the DH had probably changed from its MLB inception in 1973 to the present day. I figured that the early DHs were the ageing sluggers like Cepeda & Oliva, and that the modern game uses a more mix-and-match approach to the DH. Nope.
I looked at regular-season starting lineups from the Retrosheet Event Files. I limited the analysis to American League lineups, because I wanted to focus on teams that used the DH most/all of the time. I included AL lineups in inter-league games when the DH was used.
I looked at the lineup slot occupied by the DH to see how that changed over the years. The table below shows the slots used for each season, 1973 through 2017. Cells are colored like a heat map, with red for the maximum and blue for the minimum.
I’m surprised how variable the data is from season to season. For example, in 1992 the DH led off 209 times (9.2%), and the following season the number was down to 32 (1.4%). Undoubtedly there were a couple of DHs in ’92 that led off regularly and did not do so in ’93. Still, the variation at all batting-order slots is greater than I had expected. Maybe there’s a bit more consistency in the last ten years or so, but I didn’t do a numerical analysis of this.
Note that the 9 spot is the only lineup slot that ever went an entire season without a starting DH. That happened in 1975 and 1997.
Of course, it’s clear that the DH is usually slotted in the heart of the lineup, and that hasn’t changed through history. The totals for all seasons are shown in the chart below. It’s no surprise to me that cleanup is the most common DH slot.
The other thing I looked at is how often a team used a single player as DH through the season. I looked at the number of games started by the most-used DH on each team. The team with the most starts by one DH is plotted for each season, as is the team with the fewest starts by one DH. The mean plotted is the average over all teams of the starts by each team’s DH leader. For example, in 1973 Orlando Cepeda started 142 games at DH for Boston (the max), while Kansas City had seven players with ten or more starts at DH, of whom Hal McRae had the most (33, the min).
The 1981 and 1994 seasons were shortened by strikes, so keep that in mind when looking at the data for those seasons.
There’s not much variation over history. I expected to see a decline in the max, but I don’t see it.
The coolest tidbit from this otherwise dull analysis was noticing that the maxima during 1978 & 1979 were 162, meaning that in each of those seasons one player started every regular-season game at DH. That turned out to be Rusty Staub for the 1978 Tigers, and Willie Horton for the 1979 Mariners. Because of inter-league play, this record will likely never be broken!
The power rankings are based on the run value of players relative to a replacement player. Replacement-player values are based on post-draft free agents at each position, but these pre-season ratings are based on replacement levels from the 2017 ABL season. The scale is set to zero for replacement players and 100 for an arbitrary “superstar” level. Run values are adjusted to expected game participation of full-time regular position players, starters, and relievers. Run values do not take into account the following ratings: injury, jump, steal, speed, hold, catcher throw, outfield throw, and double-play turn. Run values are based on an average of all current ABL parks.
The values of keeps are summed without regard to position. For example, if four keeps for one team can play only first base, all four are still counted.
The estimated value for draft picks is calculated differently from last season. Last season I used an average value from each round, taking into account the last few drafts. This time I assumed that the first pick would take the highest-value free agent and that each later pick would take the next highest-value free agent. This is not ideal, as value will tend to fall from the highest picks to lower picks, but at least it accounts for the order of picks within each round.
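The pick-value assumption boils down to "overall pick k gets the k-th best free agent." Here is a rough Python sketch of that idea; the function name, the pool values, and the pick numbers are all made up for illustration and are not actual ABL ratings.

```python
def estimate_pick_values(free_agent_values, overall_picks):
    """Map each overall pick number to the value of the free agent taken there.

    Assumes every pick takes the highest-value free agent still available,
    so overall pick k gets the k-th best value in the pool.
    """
    ranked = sorted(free_agent_values, reverse=True)
    return {pick: ranked[pick - 1] for pick in overall_picks if 1 <= pick <= len(ranked)}


# Hypothetical pool values and pick numbers, for illustration only.
pool = [12.0, 55.0, 31.0, 8.0, 40.0, 27.0]
my_picks = [2, 5]
pick_values = estimate_pick_values(pool, my_picks)
draft_value = sum(pick_values.values())
print(pick_values, draft_value)
```

Picks beyond the end of the pool are simply skipped in this sketch, which is one possible choice among several.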
The family have a couple of tabletop games from the 70s that use funky, six-sided, wooden dice. (Superstar Baseball has a selection of all-time MLB greats, while Bowl Bound has college football teams from the 60’s & 70’s.) There are three dice: one black and two white. There are no pips on the dice; instead, numerals are printed on the sides. It’s a bit like Strat-o-Matic. The black-die value is multiplied by ten, and the two white dice are added to the total. So, for example, a black 2 and white 3 & 4 represent a value of 27. The faces of the dice are marked as follows:
BLACK: 1, 2, 2, 3, 3, 3
WHITE1: 0, 1, 2, 3, 4, 5
WHITE2: 0, 0, 1, 2, 3, 4
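To see how lopsided the results are, all 216 face combinations can be tallied. This is a small Python sketch using the face markings listed above; the function and variable names are just mine.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Face markings as listed above.
BLACK = [1, 2, 2, 3, 3, 3]
WHITE1 = [0, 1, 2, 3, 4, 5]
WHITE2 = [0, 0, 1, 2, 3, 4]


def roll_value(black, white1, white2):
    """Black die times ten, plus the two white dice."""
    return 10 * black + white1 + white2


# Count how many of the 6 * 6 * 6 = 216 face combinations give each total.
totals = Counter(roll_value(b, w1, w2) for b, w1, w2 in product(BLACK, WHITE1, WHITE2))
probabilities = {value: Fraction(ways, 216) for value, ways in sorted(totals.items())}

for value, prob in probabilities.items():
    print(value, prob, f"{float(prob):.2%}")
```

The example from the text checks out: `roll_value(2, 3, 4)` gives 27.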
The rating system is based on the run value of a player at a particular position, relative to a replacement player. Replacement-player values are based on post-draft free agents at each position. The scale is set to zero for replacement players and 100 for an arbitrary “superstar” level. Run values are adjusted to expected game participation of regular position players, starters, and relievers. Run values do not take into account the following ratings: injury, jump, steal, speed, hold, catcher throw, outfield throw, and double-play turn. Run values are based on an average of all current ABL parks.
Team power rankings are calculated by adding the run values of 19 players on each team:
- 8 position players chosen for maximum value as a group (Platoons are not considered.)
- 2 additional position players, which represent DH and bench strength
- 5 most valuable starters
- 4 most valuable relievers
The post-draft power rankings are depicted in the chart below.
Syracuse has the strongest position players, Orlando has the weakest. La Jolla has the strongest starting rotation, Ocracoke has the weakest. Chesapeake Bay has the strongest bullpen, Mudville has the weakest.
The power rankings are a simple measurement of team strength and may not accurately predict win/loss records.
The most significant feature of this year’s draft was the lack of starting pitching. Relief pitching was good at the top, but poor in the middle. Batters in the draft pool were stronger than normal. Titusville’s picks went very much according to plan for the first five rounds or so, before the normal confusion set in.
ROUND/OVERALL  PLAYER             POSITION
 1/  6         Steven Wright      starter
 2/ 16         Kevin Kiermaier    CF
 3/ 26         Tyler Thornburg    reliever
 4/ 36         Evan Longoria      3B
 5/ 46         Ervin Santana      starter
 6/ 56         Miguel Gonzalez    starter
 7/ 65         Adam Rosales       IF
 8/ 74         Michael Lorenzen   reliever
 9/ 84         George Kontos      reliever
10/ 93         Jett Bandy         C
11/102         Curtis Granderson  OF
12/111         Matt Holliday      1B/LF
13/120         Fernando Abad      LH reliever
14/129         Domingo Santana    OF
15/138         Jackie Bradley     CF
16/146         Jake Barrett       reliever
The draft pool was quite weak this year, despite the fact that two extra pool teams were left in upon the Gangsta’s last-minute withdrawal. Titusville lacked a second-round pick this year, having traded it away for Yasmani Grandal.
ROUND/OVERALL  PLAYER             POSITION
 1/  5         Chris Sale         starter
 3/ 21         Will Harris        reliever
 4/ 29         Clay Buchholz      starter
 5/ 37         Kelly Johnson      infield/outfield
 6/ 46         Enrique Hernandez  outfield/infield
 7/ 55         Geovany Soto       catcher
 8/ 63         JJ Hoover          reliever
 9/ 72         Shawn Tolleson     reliever
10/ 81         Travis Shaw        infielder
11/ 90         Colby Rasmus       outfielder
12/ 99         Eduardo Nuñez      infielder
13/108         Brandon Maurer     reliever
14/116         Adam Lind          first baseman
15/124         Matt Thornton      reliever
16/132         Scott Kazmir       starter
17/140         Nick Martinez      starter
18/148         Kyle Hendricks     starter
The list doesn’t look too impressive, but it seems to nicely complement the strong returning crew, which includes Harper, Cole, Cespedes, Encarnacion, and Machado.
AL teams have the DH, so their rosters will have fewer pitchers than the NL, right? NL teams have more temptation to pinch-hit for pitchers, so they need more relievers on the roster, right?
As of May 1st, the average NL roster has 12.3 pitchers, while the average AL roster has 11.9. So, the average NL roster has 0.4 of a pitcher more. Simple histograms appear below.
           14
     NNNNN 13 AA
NNNNNNNNNN 12 AAAAAAAAAAA
           11 A
           10 A
           09
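As a sanity check on the averages, here is a quick Python sketch. It assumes my reading of the histogram as back-to-back counts, with the N marks to the left of each pitcher total and the A marks to the right: 5 NL teams at 13 and 10 at 12, and 2 AL teams at 13, 11 at 12, and 1 each at 11 and 10.

```python
# Teams per roster size, as read from the histogram (an assumption about its layout).
nl = {13: 5, 12: 10}
al = {13: 2, 12: 11, 11: 1, 10: 1}


def mean_pitchers(hist):
    """Average pitchers per roster from a {pitchers: number_of_teams} histogram."""
    teams = sum(hist.values())
    return sum(pitchers * n for pitchers, n in hist.items()) / teams


print(round(mean_pitchers(nl), 1))  # 12.3
print(round(mean_pitchers(al), 1))  # 11.9
print(round(mean_pitchers(nl) - mean_pitchers(al), 1))  # 0.4
```

Both leagues come out to 15 teams, and the averages match the 12.3 and 11.9 quoted above.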
BTW, it’s the Angels who have only ten pitchers.
Small sample size. Maybe I’ll gather more data later in the season.
Recently a Binghamton Mets fan commented that AA is a high level of ball for a community the size of Binghamton. Is that true? Let’s take a look at the populations of the metro areas in the Eastern League. The whole metro-area thing is not an exact science, but I think most of the EL cities are reasonably represented. I used a list on Wikipedia that has 2012 population figures. New Hampshire is represented by Manchester, and New Britain is represented by Hartford. The only real choice for Bowie is Washington. Bowie is a bit of an anomaly in the Eastern League, as it’s the only location that is very close to an MLB city. The metro-area populations are shown in the chart below. Bowie is not included, because Washington’s nine million population is off the charts.
I’m surprised that New Britain/Hartford is the largest (apart from Bowie/DC). Anyway, the fan was correct: only Altoona has a lower population than Binghamton. I’ve been to Altoona, and not only is it fairly small, but it’s also pretty isolated. I can’t imagine that many people make the trek from Pittsburgh or State College. It’s a fairly new site for organized baseball (1999), and their attendance is relatively strong. Well done, Altoona!
Speaking of attendance, the average EL 2014 home attendance is shown in the chart below.
Binghamton’s place in the cellar may play a large role in the possible demise of the franchise, but it’s been fun while it’s lasted!
I expect to go from first to worst this season, as there wasn’t a good return from last year’s championship squad. In fact, I kept only 13, so had two extra picks, including the last of the entire draft. The strategy was to invest in young position players that might take major strides in 2015. Pitching was relegated to the later rounds. I was hoping to snag Starlin Castro and Mookie Betts, but LBI was clever enough to pounce before me.
ROUND/OVERALL  PLAYER             POSITION
 1/ 10         Carlos Gomez       CF
 2/ 21         Marcell Ozuna      OF
 3/ 32         Ian Desmond        SS
 4/ 41         Travis d’Arnaud    C
 6/ 61         Tyson Ross         SP
 7/ 71         Rougned Odor       2B
 8/ 81         Eduardo Escobar    IF
 9/ 91         Oswaldo Arcia      RF
10/101         Casey Fien         RP
11/111         Danny Salazar      SP
12/121         Scott Atchison     RP
13/131         Dustin Ackley      LF
14/141         Yovani Gallardo    SP
15/151         Jonathan Broxton   RP
16/161         Jared Hughes       RP
17/171         David Buchanan     SP
18/181         Justin Wilson      RP
During the 2014 ABL season everyone noticed the increase in pitcher cards with the R symbol. I wrote about it in the 2014 ABL Yearbook. Now that my 2015 card data is in the computer, it’s a good opportunity to see if the Rs are still as numerous. I did a simple count of the pitchers in recent seasons that have each symbol. Starters and relievers are all grouped together. The data is from only the pitchers with ABL eligibility; not all Triple Play cards are represented. I don’t think I’ve missed too many eligible players over the last few seasons, but the first couple of seasons considered here are probably missing a few, especially for the 2008 season. The years listed refer to the ABL season, so the 2015 data is from the 2014 Triple Play cards that we’ll be using in the upcoming 2015 ABL season. OK, enough of the fine-print bullshit, let’s go to the graphs.
Well, it looks like 2014 was a blip for the R symbol. The frequency has dropped down to the previous level.
The H symbol continues to occur infrequently. (To the relief of all ABL managers!) It’s interesting that the level of the H symbol seems to follow that of the R from year to year. I didn’t notice that before, probably because the yearbook study weighted the symbols by how many innings were pitched in the ABL, and nobody likes to give an H pitcher a lot of innings. In 2014, when the R frequency doubled, the H frequency doubled too, from 4.5% to 9.5%! In 2015 it’s back down to 4.5%.
The L symbol is back with a vengeance! Lots of shorts have the L this year, and it looks like every single qualified closer has one. In the yearbook I speculated that the combination of B & L might be constant. It sure doesn’t look like that in 2015. This season should see more walks than ever before erased from batter cards, because the frequency of Bs is up too.
And finally, the F symbol (found on relievers’ cards only) has not fluctuated much over the years.
In summary, compared to last season, expect fewer homers & deeps to be re-rolled, and expect to lose more walks off the batters’ cards.
Commish & I were discussing the standards for official scorers giving errors. Should the same standard be applied regardless of the level, or should the standards be higher at the higher levels?
Commish made the excellent point that throwing errors (especially to first) are going to be automatic and are not really subject to any subjective standard. Since these types of errors are obviously made more frequently at the lower levels, we expect the number of errors to go up as the level goes down.
So, I can’t answer my original question with stats, but I still thought it would be interesting to look at the fielding percentages at the different levels of OB. I used 2013 stats and excluded leagues south of the border.
The trend is clear. Actually, it’s clearer than I expected! When you get down to A ball, errors are twice as likely compared to the Bigs.
The worst thing about going abroad during baseball season is missing baseball. So when I had to go to Sweden on business, I had resigned myself to a break from baseball. The weather in Stockholm was crap anyway. But I did a little search for the hell of it and found that there’s an active Baseball and Softball Association, and that there was a “match” scheduled for Saturday in the top division, the Elitserien. Well, the rain was coming down pretty good Saturday morning, so I figured it would be rained out for sure. Then I checked the web site, and the game was on, as evidenced by live scoring. So I ran to the subway and took the number 17 to Skarpnäck, a suburb south of town. There’s a big public sport park (idrottsplats) with football fields, tennis courts, a baseball field, and a softball field.
I got there during the fourth inning with Stockholm leading 1-0. They were playing Sölvesborg from Blekinge in the south of Sweden. The rain had recently stopped, and the field was in quite good shape. It was a nicely manicured field with an outfield fence (didn’t notice any distances) and a chain-link backstop. They even had lights, but the guy said they weren’t very powerful. The dugouts were rather improvised, but there was a nice set of bleachers behind home plate, as well as a little press box where the guys were scoring the game on the internet. I asked them for a roster, but they didn’t have any spare copies. They had a speaker set up in the bleachers, which they used to announce the batters and even play walk-up music for the home team. (Unlike in Binghamton, the volume was tolerable.) Of course, they played TMOTTBG during the seventh-inning stretch. Not surprisingly, no one was singing. It is Sweden, after all.
There were about a dozen folks in the stands for the whole game, and a few others wandered by to watch from time to time. I talked to one guy who was wearing a Stockholm team jacket and cap. He told me about the Stockholm club, which has quite a few teams: seniors, juniors, cadets, and a few more kids’ teams. Saturday was a double header for the senior team, and they had some of the kids’ games scheduled for Sunday. All the Stockholm players are native Swedes, but the manager is an American guy named Trevor Rooper. (Nope, he’s not in baseball-reference!) Most of the guys looked like they were in their teens and 20s, but there were a few older guys too. There was a girls’ softball game going on next to the baseball field, but their game was over by the time I wandered over for a closer look.
I talked to a group of folks watching their first game. They live in the area and walk by the sports fields all the time, but this was their first time checking it out up close. I had to explain a sac fly and an intentional walk. They asked me to let them know when they should root on their team!
The level of play was probably a bit below high-school level. There was not much solid contact. Saw lots of weak ground balls and strikeouts. The pitching was much better, however, at least in terms of control. I think I saw only two walks. It looked like mostly BP-type fastballs, but I did see Stockholm’s pitcher throw a couple of off-speed pitches. Fielding was hit-and-miss. I really noticed it during the throws. Not much zip at all. I think even I could have stolen second off the catchers!
The equipment looked top-notch. No names on the back, but a few of the Stockholm guys had a URL (www.frozenrope.se), which was a baseball supply store that looks to have gone under since. I was surprised that they used wooden bats. Not many balls were fouled out of play (lack of contact), and they did not spark frenzied scrambles as in the States. One of the kids would just retrieve it and give it back. Two umps worked the game. Everything I heard them call was in English.
The final was 2-0 Stockholm, which was an unusually low-scoring game. [BOX SCORE] I had to leave after the first game, but Stockholm took the second game 20-5. Mercy rule. Stockholm is the best team in the country. They’ve won something like four of the last five national club championships.
So, who knows, maybe baseball in Sweden is where golf was 30 years ago. They’ve got a long way to go before they can think about the World Baseball Classic, but you never know.
A few guys have mentioned that there are a lot more R symbols out there this season. Commish & I were talking about it and speculated about how the symbols are calculated. I guessed that the R & H symbols depend solely on how many home runs a pitcher gave up with runners on base relative to the total number of homers he surrendered.
I collected some stats from Baseball Reference to see how they compared to the symbols. I initially selected the 43 starters currently on the ABL active rosters. I later added some H-symbol starters from Taxi Teams and the free-agent pool, because the H symbols were underrepresented. I didn’t look at any relievers, but I don’t expect they would have rules different from the starters. I looked at the 2013 MLB stats and the TPB cards we’re using for the 2014 ABL season. In B-R you can find the relevant stats under the “Splits” menu in the “Standard Pitching” section on the particular pitcher’s page. Scroll down to the “Bases Occupied” table. Strasburg’s stats are shown below: 7 homers with the bases empty, 9 with runners on.
I noticed some patterns and figured out an easy rule that predicted all the actual symbols. It’s best understood by looking at the grid shown below. There are two measurements that figure in. The first is the number of homers hit with runners on base divided by the total number of homers. Call this HRonbase. My initial thought was that the symbols would depend on this number only. The average value of this measurement in my sample is 40%. The second measurement is the overall home-run rate: the total number of homers surrendered divided by the batters faced. Call this one HRt. The average value in my sample is 2.2%. So here’s the table showing how the combination of these measurements determines the symbol:
When the overall home-run rate is greater than 2%, the symbols act like I expected them to. If the percentage of home runs with runners on is large, the guy gets an H. If that percentage is small, he gets an R. But it’s a different story when the overall home-run rate is less than 2%. In that case, it doesn’t matter what the stats are for on-base and bases empty; the guy gets an R, period. The clearest example is Henderson Alvarez, who had guys on base every time a home run was hit against him. But that was only two homers in 418 plate appearances, a very low rate of 0.48%. That low rate earned him an R, despite the fact that he gave up zero solo shots.
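The rule described above can be sketched in a few lines of Python. The 2% cutoff comes from the text, but the cutoffs for a "large" or "small" on-base share live in the grid and are not restated here, so `low_cut` and `high_cut` are left as required arguments. The values passed in the example, and the "no symbol" middle zone, are placeholders and assumptions of mine.

```python
def hr_measurements(hr_bases_empty, hr_runners_on, batters_faced):
    """Return (HRonbase, HRt) as fractions rather than percentages."""
    total_hr = hr_bases_empty + hr_runners_on
    hr_on_base = hr_runners_on / total_hr if total_hr else 0.0
    hr_rate = total_hr / batters_faced
    return hr_on_base, hr_rate


def predict_symbol(hr_on_base, hr_rate, low_cut, high_cut, rate_cut=0.02):
    """Guess the card symbol from the two measurements.

    low_cut and high_cut are hypothetical thresholds for a "small" or "large"
    share of homers with runners on; returning None in between is an assumption.
    """
    if hr_rate < rate_cut:
        return "R"  # low overall homer rate: R regardless of the split
    if hr_on_base >= high_cut:
        return "H"
    if hr_on_base <= low_cut:
        return "R"
    return None


# Henderson Alvarez: 0 solo homers, 2 with runners on, 418 plate appearances.
on_base, rate = hr_measurements(0, 2, 418)
symbol = predict_symbol(on_base, rate, low_cut=0.30, high_cut=0.50)  # placeholder cutoffs
print(f"{on_base:.0%} {rate:.2%} {symbol}")
```

With Alvarez the placeholder cutoffs never come into play, because his overall rate is under 2% and the first branch returns an R.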
So it’s obvious that the R symbol is used to reduce the number of homers from the batter’s card when the pitcher gives up fewer than average home runs in general. With power becoming scarcer recently, it’s not surprising that more Rs are required. On the other hand, although there were 273 fewer home runs in 2013 compared to 2012 (as Commish pointed out), there were even fewer in 2011.
I wondered why the overall homer rate couldn’t instead be handled via the Deep ranges. I think the answer is that if you lose the Deeps, then you lose the park variation that forms such an important part of the game. If a guy has no Deep ranges (and there are some, of course), then it doesn’t matter what park he’s pitching in or what Power the batter has (except for the Deeps from Park Effects, of course).
So, my conclusion is that the R & H symbols are based more on the overall home-run rate of the pitcher, and not so much on the state of the bases when the home runs were hit.
This came up at lunch: “Over a season how many more plate appearances does the leadoff batter have, compared to the ninth-place batter?”