bblog – baseball notes

Strat-O-Matic Basic Average e Ratings

I broke out some Basic Strat last night. I don’t like the basic fielding charts, so I’d rather use the simplified “card charts” I use for the Advanced game. Trouble is, the Old Timer cards don’t have e Ratings. If I had some average e Ratings for each position, I could use that.

I found an old article in Strat Fan that gives a formula for e Ratings:

SOM e = 1458 * Errors / Innings_Played

I can find this data for specific positions and seasons on Baseball Reference. I scraped the data for the National and American Leagues from 1901 through 2021 and calculated the average e Ratings. (I treat all outfielders together.) The whole mess can be seen in the busy chart below.

I broke the seasons into six somewhat arbitrary periods. (The longer periods have less variation.) Then I averaged the seasons in each period. This gave me the numbers for the new chart:

Now, if a card doesn’t have an e Rating, I can look up the average value here and use it. Of course, every player at a particular position will have the same rating, but at least it will be representative of the era.
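
A minimal Python sketch of the formula and the period averaging, for anyone who wants to repeat the exercise. The function names and the sample rows are made up for illustration; they are not real Baseball Reference totals, and this is not the script used for the charts. (The constant 1458 is 162 games times 9 innings, so the rating presumably reads as errors per full season of innings at a position.)

```python
from statistics import mean

def som_e_rating(errors, innings_played):
    """Strat Fan formula: SOM e = 1458 * Errors / Innings_Played."""
    return 1458 * errors / innings_played

def period_average(season_rows, first_year, last_year):
    """Average the per-season e ratings for seasons in [first_year, last_year]."""
    ratings = [
        som_e_rating(errors, innings)
        for year, errors, innings in season_rows
        if first_year <= year <= last_year
    ]
    return mean(ratings)

# Invented league totals for one position: (season, errors, innings played)
rows = [(1950, 300, 22000.0), (1951, 280, 22000.0), (1990, 250, 37000.0)]
print(round(period_average(rows, 1946, 1960)))
```

Averaging the season ratings (rather than pooling errors and innings across the period) matches the description above, where each season is rated first and the seasons are then averaged.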

The updated card charts can be downloaded here.

Jacob deGrom in Binghamton

As I write this, Jacob deGrom is having an historic pitching season for the Mets. He just got hurt again, and we’ll see to what extent he can bounce back. But it got me to thinking about seeing him in Binghamton with the B-Mets. I was surprised to see that he pitched only ten games for them in 2013. I thought he had been there longer. He didn’t do that great. He was 2-5 with a 4.80 ERA and a 1.483 WHIP. He was called up to AAA around mid-season and put up similar numbers. I thought I must have some pictures of him in Bingo, but I think that was just before I started bringing my camera to the park. I do have scoresheets, though, and they’re fun to look at after eight years. I saw three of his ten starts.

The first was an early outing against Brandon Workman. Ticket prices were reduced, because of a low temperature that morning, maybe below 40 degrees at 9 a.m. deGrom was a hard-luck loser. He allowed only three hits over eight innings, but gave up a run in the fourth, when Xander Bogaerts doubled and scored on a sac fly. The B-Mets offense provided absolutely zero run support. Getting the save for the Sea Dogs was Chris Martin, who is having a fine 2021 ABL season for the Titusville Perfectos. The scorebook notes that Skibby Bomysoad was at this game.

The second game I saw was against the Harrisburg Senators later in April. deGrom got touched up in this one, and gave up four earned runs in 5-2/3 IP. Anthony Rendon was 2-for-3 against him with a run and an RBI. This game stands out because the B-Mets were NO-HIT. (Possibly the first I’d ever witnessed live.) It was a combined no-hitter, with Paul Demny pitching the first eight innings. (Demny never made it to the bigs.) Also noted in the scorebook is that this was the 3,000th game played by the B-Mets, who started in 1992. Commish was at this game.

Game three was another matchup against Portland, the AA team of the Red Sox. deGrom had a good outing, pitching six innings and giving up only one earned run. He still couldn’t get Xander Bogaerts out! It was an exciting, back-and-forth game that ended with a walk-off wild pitch. The scorebook indicates that I was amused about the local super-fan whiffing on a foul ball, despite wearing a glove. He was a guy from work who came to all the games.

It was fun to look back at the old scorebook. I only wish I had a picture of deGrom as a B-Met.

Plate Appearance Outcomes Over the Years

We know that the three true outcomes have been increasing in recent years, but I’ve never seen a good graphical representation of it. I’ve been wanting to examine this data for a while—a conversation with a friend last night spurred me on to do it.

The goal was to track the outcomes of plate appearances over the years, categorized into plays that involve ball-bat contact and those that don’t. The greenish areas in the chart are the “contact” plays, and the reddish areas are the non-contact outcomes. The major takeaway for me is that, over the last 75 years or so, strikeouts have increased by about the same amount that in-play outs have decreased. (In-play outs are balls in play that are not hits, or, in other words, outs that are not strikeouts.)

I was surprised that the decrease in contact plays overall has not caused a proportionate decrease in the percentage of hits. The percentage of hits (singles, doubles, triples, and home runs) has stayed remarkably constant from 1940 through 2019, never going outside the range of 22-24%!

Likewise, the percentage of walks has been more consistent than I expected. It has been within the 7-10% range since 1936. Also, there’s no clear recent trend in walk percentage, unlike the case for strikeouts.

Although recent home-run percentages (2.7% for 2010-2019) are higher compared to the 1980s (1.9%), they aren’t that different from the “steroid era” (2.5% for 1990-2009) or the 1950s (2.4%).

One way to look at it is to say that the contact plays (green) are “exciting,” while the non-contact plays (red) are “boring.” By that measure, in 1973 (when I started following baseball) 76.7% of plate appearances were “exciting.” In 2019 that figure had dropped to 67.3%, so one could say that the game is only 88% as exciting now as it was when I started following it.

By the way, this is just a presentation of the data. I’m not getting into the reasons why the percentages have changed.

Miscellaneous notes:

The raw data is from the excellent Baseball Reference site.
I chose the National League because it’s the longest continuous league. Excluding the American League avoids effects of the Designated Hitter.
I excluded the 2020 season because of the relatively small number of games and because of the DH in the NL.
The “other” category collects outcomes like catcher’s interference.
Reaching base on an error is included in the “in-play outs” category, because such a play is officially scored as an out for the batter (but for the error).
The home-run statistics include inside-the-park home runs. They are very rare today, but they accounted for about 35% of home runs in 1901.
Statistics for sacrifice hits were not recorded prior to 1894, and sacrifice flies were not always in the scoring rules. No matter—they should all be counted in the “in-play outs” category.
Prior to 1889, more than four balls constituted a walk.
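
A rough Python sketch of the categorization described in this post. The function name, the argument names, and the demo totals are all invented, and putting hit-by-pitch in its own non-contact bucket is an assumption, since the post does not say where it goes. In-play outs are computed as the remainder, so sacrifices and reached-on-error plays land there, as the notes above say they should.

```python
def outcome_shares(pa, singles, doubles, triples, home_runs,
                   walks, hbp, strikeouts, other=0):
    """Return each outcome's share of plate appearances, in percent."""
    hits = singles + doubles + triples + home_runs
    # In-play outs are whatever is left after hits and the non-contact outcomes.
    in_play_outs = pa - hits - walks - hbp - strikeouts - other
    counts = {
        "hits": hits,
        "in_play_outs": in_play_outs,
        "strikeouts": strikeouts,
        "walks": walks,
        "hbp": hbp,
        "other": other,
    }
    shares = {name: 100 * n / pa for name, n in counts.items()}
    # "Contact" is the green area of the chart: hits plus in-play outs.
    shares["contact"] = shares["hits"] + shares["in_play_outs"]
    return shares

# Invented totals, scaled to 1000 plate appearances for easy reading.
demo = outcome_shares(pa=1000, singles=150, doubles=45, triples=5, home_runs=30,
                      walks=85, hbp=10, strikeouts=230, other=1)
print(round(demo["contact"], 1))

# The "88% as exciting" figure is the ratio of the 2019 and 1973 contact shares.
print(round(100 * 67.3 / 76.7))
```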

2021 Power Rankings

Pre-Draft value sums the positive values of all signed and presumptive players.

Post-Draft values are for the best combination of players at the eight fielding positions, plus the next two best position players, plus five starters, plus four relievers, including one closer. Negative values are used as well as positive values. Replacement levels are recalculated after the draft.

XR Prediction

The initial goal was to predict XR values for pitchers prior to the cards coming out. So I used MLB 2019 stats and ABL 2020 XR values (all parks). Got ERA, xFIP, & SIERA from Fangraphs. Correlations were fair, around 0.7. Best was xFIP at 0.71 for starters.

Finally found some linear weights at Baseball Savant, namely wOBA. This provided a much better correlation. (Also tried xwOBA, but it was worse.) Used a list of exclusive starters and exclusive shorts. They were fairly different, with shorts having worse correlation. I’m guessing that’s a combination of smaller sample size and deviation from average platoon rates. Did not try longs, closers, or combos. The pitcher estimate equations:

STARTER:  XR = 824 * wOBA - 103     R=0.88
  SHORT:  XR = 601 * wOBA -  33     R=0.80

If I ever had to choose keeps without cards, I would get data for all qualified pitchers, then sort them into categories, when appropriate.

Did batters too. Unsurprisingly, correlation to wOBA was even better. (xwOBA was again worse.) For batters:

      XR =  736 * wOBA - 138    R=0.86
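
The three fits above are easy to wrap in a small Python helper. This is only a sketch: the dictionary and function names are made up, and the slopes and intercepts are copied straight from the equations in this post.

```python
# (slope, intercept) pairs taken from the fitted equations above.
XR_FITS = {
    "starter": (824, -103),
    "short": (601, -33),
    "batter": (736, -138),
}

def estimate_xr(woba, role):
    """Estimate XR from wOBA using the linear fit for the given role."""
    slope, intercept = XR_FITS[role]
    return slope * woba + intercept

# Example: a starter who allowed a .300 wOBA, and a batter with a .320 wOBA.
print(round(estimate_xr(0.300, "starter"), 1))
print(round(estimate_xr(0.320, "batter"), 1))
```

For pitchers the wOBA is the wOBA allowed, so a lower input gives a lower (better) XR estimate.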

Take-aways:

TPB pitcher ratings correlate with linear weights, as opposed to, say, ERA.
I could get reasonable XR estimates for both batters and pitchers if I didn’t have access to the cards.

The spreadsheet with the data is named ABL Predictions.

2020 ABL Post-Draft Power Ranking

The power ranking is based on player value for an average of all ABL parks. Batter value is the sum of the greatest combination of the eight fielding positions, plus the highest values of two remaining position players. Pitching values are from the top five starters and top four relievers (includes one closer or short rated at closer). All 30 players on each team immediately after the draft are considered. The Titusville value is adjusted for system bias.

The scale is somewhat different from that of the pre-season ranking, which counted the maximum value of all players, without regard to position.

Power ranking is an estimate of team strength for entertainment purposes only and does not take into account management skill, trading savvy, or the luck of the dice.

2019 Post-Draft Power Ranking

Long Beach Island has the most valuable hitting, and Crown Heights has the most valuable pitching (both starting rotation and bullpen).

The power ranking is based on player value for an average of all ABL parks. Batter value is the sum of the greatest combination of the eight fielding positions, plus the highest values of two remaining position players. Pitching values are from the top five starters and top four relievers (max one closer). Only the active roster is considered, apart from some assumptions about early-season starter taxi moves. The Titusville value is adjusted for system bias.

The scale is somewhat different from that of the pre-season ranking, which counted the maximum value of all players, without regard to position.

Power ranking is an estimate of team strength for entertainment purposes only and does not take into account management skill, trading savvy, or the luck of the dice.

ABL 2019 Pre-Season Power Rankings

Player values are determined as described here. Negative player values are ignored. Replacement levels are estimated for ABL 2019. Average batter & pitcher cards are estimated for ABL 2019. The keeps and draft picks are as of 2018-12-31, so no 2019 trades are included. The positions the keeps are rated for are not a factor; that is, if a team has three players who can play only third base, all three of those players contribute to the total value.

It looks like there will be more parity in 2019 compared to this time last year.

Player Value

This is a description of the player values that I compute for the ABL. I have bits and pieces of the explanation in various places, but thought it would be good to have everything in one place.

The basis of my calculations is linear weights, which is a method for estimating the number of runs produced by a player using the number of each play outcome for the batter. The particular variety of linear weights I use is called Extrapolated Runs. (See note below.) Each outcome is associated with a run value: a home run is 1.44 runs, a single is 0.5 runs, and a strikeout is -0.098 runs. Note that the calculation can be done for both batters and pitchers. Of course, good batters will produce more runs, and good pitchers will allow fewer runs.

Now let’s consider a particular batter’s Triple Play Baseball card. If I can estimate the outcomes of each possible roll (000-999), then I can add up the run values (Extrapolated Runs) for each of those outcomes. If I divide that by 1000, then I have an average run estimate for one plate appearance by that batter. Note that I can do the same thing for a particular pitcher’s card.
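
As a sketch of that calculation in Python: given how many of the 1000 rolls produce each outcome, multiply by the run values and divide by 1000. The names and the sample card are invented. Only the single, home-run, and strikeout weights are quoted in this post; the other weights are the XR coefficients as commonly published and are an assumption here, so check them against Furtado's article before relying on them.

```python
RUN_VALUES = {
    # Quoted in the text above.
    "single": 0.50,
    "home_run": 1.44,
    "strikeout": -0.098,
    # Assumed: commonly published XR weights, not stated in this post.
    "double": 0.72,
    "triple": 1.04,
    "walk": 0.34,
    "in_play_out": -0.090,
}

def runs_per_pa(roll_counts):
    """Average run estimate for one plate appearance.

    roll_counts maps each outcome to the number of rolls (out of 000-999)
    that produce it, so the counts must add up to 1000.
    """
    if sum(roll_counts.values()) != 1000:
        raise ValueError("roll counts must cover all 1000 rolls")
    total_runs = sum(RUN_VALUES[outcome] * n for outcome, n in roll_counts.items())
    return total_runs / 1000

# A made-up card, not a real player.
card = {"single": 160, "double": 45, "triple": 5, "home_run": 30,
        "walk": 90, "strikeout": 200, "in_play_out": 470}
print(round(runs_per_pa(card), 4))
```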

To get all those outcomes requires a lot of data and a lot of estimates. The data part involves all the numbers in the main area of the card: this much of a home-run range, this much of an easy-fly range, etc. Then we need to create an average pitcher to face each batter, and vice versa. Then we need to estimate the number of times a batter will face righty and lefty arms, then weight those two values appropriately. We need to calculate the average outcomes of range plays and Deeps! But in the end we can get an estimated runs per plate appearance for every player.

Run values do not take into account the following ratings: injury, jump, steal, speed, hold, catcher throw, outfield throw, and double-play turn.

What’s missing at this point is defense. The Range and Error charts can be used to determine the runs saved by a defender using the same linear weights concept. These adjustments can be applied to a particular player, but if that player is carded at multiple positions, then the combined offensive-defensive run estimate is different for each position.

The goal is to calculate a player “value” that is something like WAR (Wins Above Replacement). Replacement players at different positions have different run-producing capacities. That holds true for both MLB and the ABL. For the ABL I set replacement levels close to the estimated run levels of the best available free agents at each position during the regular season. That level of runs at each position becomes the zero point of my calculated player value. The zero-adjusted run values are then scaled such that only the best players have a player value above 100. Players can have negative player values when free agents with higher run estimates are available at a position.
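
The zero point and the scaling can be sketched in a few lines of Python. The function name and the sample run levels are hypothetical; in particular, the "superstar" run level that maps to 100 is an arbitrary anchor, and the numbers below are not actual ABL replacement levels.

```python
def player_value(runs, replacement_runs, superstar_runs):
    """Scale a run estimate so replacement level is 0 and superstar level is 100."""
    return 100 * (runs - replacement_runs) / (superstar_runs - replacement_runs)

# Hypothetical runs-per-plate-appearance levels at one position.
REPLACEMENT = 0.100
SUPERSTAR = 0.150

print(round(player_value(0.125, REPLACEMENT, SUPERSTAR)))  # halfway up the scale
print(round(player_value(0.090, REPLACEMENT, SUPERSTAR)))  # worse than a free agent
```

A player below the replacement level comes out negative, which is the case described above where a better free agent is available at the position.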

Defensive ability and position value can lead to very different player values for the same player. For example, an average-hitting catcher may have a significant value behind the plate, but a very low value playing first base, especially if his defense at first is FR/8.

Values are adjusted according to the average number of appearances as a full-time pitcher or position player. For example, on average a closer will face fewer batters than a starter, so a closer’s value is adjusted down relative to a starter’s.

When I total the value of all players on a team, I do not count players with negative player value, because such players are unlikely to get lots of playing time. If a player plays multiple positions, I use the position with the highest value.
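
In Python, that team total might look like the sketch below. The roster layout (player name mapped to a dictionary of position values) and the sample numbers are assumptions made for illustration.

```python
def team_total(roster):
    """Sum each player's best positional value, skipping negative values."""
    total = 0.0
    for values_by_position in roster.values():
        best = max(values_by_position.values())
        if best > 0:
            total += best
    return total

# Invented roster: one multi-position player, one negative, one pitcher.
roster = {
    "Player A": {"C": 40, "1B": 5},
    "Player B": {"SS": -12},
    "Player C": {"SP": 63},
}
print(team_total(roster))
```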

Since all free agents are rated, I can use historical ABL draft data to estimate the player value for various points in the draft.

TL;DR: The numbers and ratings on the cards are used to estimate the frequency of outcomes (single, home run, walk, strikeout, etc.). The outcomes are converted into runs using linear weights. The run estimate is adjusted for defense, then adjusted to a scale with zero indicating that an equivalent free-agent player is available, and 100 indicating an arbitrary superstar level.

A note on Extrapolated Runs
Extrapolated Runs (XR) appealed to me, because it is an estimate of absolute runs, unlike Palmer’s Batting Runs, which is measured relative to an average player. XR also includes double plays, which can be estimated from TPB cards.

The big weakness of XR is that it’s formulated to apply over a large span of seasons, specifically 1955-1997. I don’t find any XR coefficients for single, recent seasons.

Jim Furtado wrote an article about the development of XR in 1999.

Historical Attendance Data

A simple chart from BBRef data. 1946!

Official Scoring Resources

SABR has a useful page for its Official Scoring Research Committee. There are downloadable files, including a booklet entitled “Official Scoring in the Big Leagues.” The newsletters are interesting too.

56-Value Dice System

Thought about this ordered dice system for some reason. The idea is to use multiple, uniformly colored, six-sided dice to produce a number of values, not all of which have the same probability. For example, with three dice, one can order the die values in ascending order, like 123, 256, 255, 224, 333, etc. This yields 56 possible values with the following probabilities.

20 values of probability 6/216 = 2.78% (no dice identical)
30 values of probability 3/216 = 1.39% (two dice identical)
 6 values of probability 1/216 = 0.46% (three dice identical)
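
The counts are easy to confirm by brute force. This short Python sketch (variable names are arbitrary) rolls all 216 ordered combinations, sorts each one, and tallies how many rolls map to each sorted value.

```python
from collections import Counter
from itertools import product

# Map every ordered roll of three dice to its ascending-order value, e.g. (5, 2, 6) -> "256".
ways = Counter(
    "".join(str(d) for d in sorted(roll))
    for roll in product(range(1, 7), repeat=3)
)

# Group the sorted values by how many of the 216 rolls produce them.
values_by_ways = Counter(ways.values())

print(len(ways))  # number of distinct sorted values
for n_ways, n_values in sorted(values_by_ways.items(), reverse=True):
    print(f"{n_values} values of probability {n_ways}/216 = {n_ways / 216:.2%}")
```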

This system is actually used in an old tabletop baseball game, “Be A Manager.” My brother found a group of guys playing it in a bar several years ago.

Hitter/Pitcher-Friendly Leagues

I had a random thought about the differences between minor leagues in terms of being hitter-friendly or pitcher-friendly. I’ve often read qualifications of individual performances, for example, “he’s hitting well, especially since that’s a pitchers’ league,” or “his ERA is not bad, considering that he’s in a hitter-friendly league.” So I decided to go to the stats. I chose to compute the averages of the last five complete regular seasons, 2013-2017. But which stats to use? Runs per game? ERA? Batting average? I decided to compile OPS and ERA as the measurements for hitting and pitching, respectively. I knew that the two would be highly correlated, and that was indeed the case. I really didn’t see anything interesting by considering both stats together, so I simply sorted the leagues by OPS. The data appears in the table below.

I was surprised to see the huge difference between the top and the bottom: 126 points of OPS, 1.59 earned runs! The next surprise was that the leagues don’t cluster much by level. The Rookie leagues are all over the map.

I had a few ideas to explain the differences, then the Commish suggested a few others. Here’s a list of possible explanations.

Elevation. The Pioneer League and Pacific Coast League parks are generally at higher elevations, which helps the hitters.
Big Spring Training Parks. The Florida State League teams play in the Spring Training parks, which are big. The same probably goes for the Gulf Coast League, even though those are back fields.
Wood Bats. Hitters in the Short-Season A leagues may be at a disadvantage, because some of the hitters are using wood bats regularly for the first time.
Windy Florida. Maybe windy conditions are tough on the hitters in the Florida State League and GCL.

The Designated Hitter

This analysis didn’t turn up much of interest. Although I’m not a fan of the DH rule, I had an idea that the use of the DH had probably changed from its MLB inception in 1973 to the present day. I figured that the early DHs were ageing sluggers like Cepeda & Oliva, and that the modern game uses a more mix-and-match approach to the DH. Nope.

I looked at regular-season starting lineups from the Retrosheet Event Files. I limited the analysis to American League lineups, because I wanted to focus on teams that used the DH most/all of the time. I included AL lineups in inter-league games when the DH was used.

I looked at the lineup slot occupied by the DH to see how that changed over the years. The table below shows the slots used for each season, 1973 through 2017. Cells are colored like a heat map, with red for the maximum and blue for the minimum.

I’m surprised how variable the data is from season to season. For example, in 1992 the DH led off 209 times (9.2%), and the following season the number was down to 32 (1.4%). Undoubtedly there were a couple of DHs in ’92 who led off regularly and did not do so in ’93. Still, the variation at all batting-order slots is greater than I had expected. Maybe there’s a bit more consistency in the last ten years or so, but I didn’t do a numerical analysis of this.

Note that the only starting-lineup slot that ever went an entire season without a DH was the 9 spot, which had no DH in 1975 and 1997.

Of course, it’s clear that the DH is usually slotted in the heart of the lineup, and that hasn’t changed through history. The totals for all seasons are shown in the chart below. It’s no surprise to me that cleanup is the most common DH slot.

The other thing I looked at is how often a team used a single player as DH through the season. I looked at the number of games started by the most-used DH on each team. The team with the most starts by one DH is plotted for each season, as is the team with the fewest starts by its most-used DH. The mean plotted is the average of the DH leaders’ starts over all teams. For example, in 1973 Orlando Cepeda started 142 games at DH for Boston (the max), while Kansas City had seven players with ten or more starts at DH, of whom Hal McRae had the most (33, the min).
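
The max, min, and mean for one season can be sketched in Python as follows. The input format (one team/player pair per game started at DH) is an assumption about how the lineup data might be arranged after parsing, and the teams and players in the demo are placeholders rather than Retrosheet data.

```python
from collections import Counter
from statistics import mean

def dh_leader_summary(dh_starts):
    """Summarize DH usage for one season.

    dh_starts is an iterable of (team, player) pairs, one per game started at DH.
    """
    starts_by_team = {}
    for team, player in dh_starts:
        starts_by_team.setdefault(team, Counter())[player] += 1
    # For each team, keep the (player, starts) pair of its most-used DH.
    leaders = {team: c.most_common(1)[0] for team, c in starts_by_team.items()}
    leader_starts = [starts for _, starts in leaders.values()]
    return {
        "max": max(leader_starts),
        "min": min(leader_starts),
        "mean": mean(leader_starts),
        "leaders": leaders,
    }

# Placeholder season with two teams.
season = ([("AAA", "p1")] * 100 + [("AAA", "p2")] * 62
          + [("BBB", "p3")] * 40 + [("BBB", "p4")] * 35)
summary = dh_leader_summary(season)
print(summary["max"], summary["min"], summary["mean"])
```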

The 1981 and 1994 seasons were shortened by strikes, so keep that in mind when looking at the data for those seasons.

There’s not much variation over history. I expected to see a decline in the max, but I don’t see it.

The coolest tidbit from this otherwise dull analysis was noticing that the maxima for 1978 & 1979 were both 162, meaning that in each of those seasons one player started every regular-season game at DH. Those turned out to be Rusty Staub for the 1978 Tigers and Willie Horton for the 1979 Mariners. Because of inter-league play, this record will likely never be broken!

MLB Documents

A few interesting documents are on an MLBA page, including the Major League (business) rules and the Basic (labor) Agreement.

2018 ABL Pre-Season Power Rankings

The power rankings are based on the run value of players relative to a replacement player. Replacement-player values are based on post-draft free agents at each position, but these pre-season ratings are based on replacement levels from the 2017 ABL season. The scale is set to zero for replacement players and 100 for an arbitrary “superstar” level. Run values are adjusted to expected game participation of full-time regular position players, starters, and relievers. Run values do not take into account the following ratings: injury, jump, steal, speed, hold, catcher throw, outfield throw, and double-play turn. Run values are based on an average of all current ABL parks.

The values of keeps are summed without regard to position. For example, if four keeps for one team can play only first base, all four are still counted.

The estimated value for draft picks is calculated differently from last season. Last season I used an average value from each round, taking into account the last few drafts. This time I assumed that the first pick would be the highest-value free agent and that each later pick would be the next highest-value free agent. This is not ideal, as value will tend to fall from the highest picks to lower picks, but at least it accounts for the order of picks within each round.
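
The value-order assumption for draft picks is simple to sketch in Python. The function name, the team labels, and the values below are invented; the only idea taken from the post is that the k-th overall pick is credited with the k-th most valuable free agent.

```python
def estimate_pick_values(free_agent_values, pick_order):
    """Credit each team with the free-agent value at each of its draft slots.

    free_agent_values: player values for everyone in the draft pool.
    pick_order: team names in overall pick order (a team may appear many times).
    """
    ranked = sorted(free_agent_values, reverse=True)
    totals = {}
    # zip stops at the shorter list, so extra picks or extra players are ignored.
    for team, value in zip(pick_order, ranked):
        totals[team] = totals.get(team, 0) + value
    return totals

# Toy pool of four free agents and a two-team, two-round draft.
print(estimate_pick_values([50, 80, 20, 65], ["A", "B", "A", "B"]))
```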

Sports Illustrated / Avalon Hill / 10-39 Dice

The family have a couple of tabletop games from the 70s that use funky, six-sided, wooden dice. (Superstar Baseball has a selection of all-time MLB greats, while Bowl Bound has college football teams from the 60’s & 70’s.) There are three dice: one black and two white. There are no pips on the dice; instead, numerals are printed on the sides. It’s a bit like Strat-o-Matic. The black-die value is multiplied by ten, and the two white dice are added to the total. So, for example, a black 2 and white 3 & 4 represent a value of 27. The faces of the dice are marked as follows:

 BLACK: 1, 2, 2, 3, 3, 3
WHITE1: 0, 1, 2, 3, 4, 5
WHITE2: 0, 0, 1, 2, 3, 4
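
With the faces listed above, the full distribution can be enumerated in a few lines of Python. The face lists come straight from the table; the function and variable names are arbitrary.

```python
from collections import Counter
from itertools import product

BLACK = [1, 2, 2, 3, 3, 3]
WHITE1 = [0, 1, 2, 3, 4, 5]
WHITE2 = [0, 0, 1, 2, 3, 4]

def roll_value(black, white1, white2):
    """Black die counts as tens; the two white dice are added on."""
    return 10 * black + white1 + white2

# Count how many of the 216 face combinations produce each value.
ways = Counter(roll_value(b, w1, w2) for b, w1, w2 in product(BLACK, WHITE1, WHITE2))

for value in sorted(ways):
    print(f"{value}: {ways[value]}/216 = {ways[value] / 216:.2%}")
```

The results run from 10 through 39, and because the black die has one 1, two 2s, and three 3s, the thirties come up three times as often as the teens.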


2017 Baseball Road Trip

June 6 – July 18, 2017

27 games
24 parks
20 new parks
7 shutouts
3 walk-offs
2 rain-outs
0 extra-inning games (!)

Links:

Post-Draft Power Rankings

The rating system is based on the run value of a player at a particular position, relative to a replacement player. Replacement-player values are based on post-draft free agents at each position. The scale is set to zero for replacement players and 100 for an arbitrary “superstar” level. Run values are adjusted to expected game participation of regular position players, starters, and relievers. Run values do not take into account the following ratings: injury, jump, steal, speed, hold, catcher throw, outfield throw, and double-play turn. Run values are based on an average of all current ABL parks.

Team power rankings are calculated by adding the run values of 19 players on each team:

8 position players chosen for maximum value as a group (Platoons are not considered.)
2 additional position players, which represent DH and bench strength
5 most valuable starters
4 most valuable relievers
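
Choosing the position players "for maximum value as a group" is an assignment problem, since a player carded at several positions can fill only one of them. Below is a hedged Python sketch of one way to do it, using a small dynamic program over the set of filled positions; it is not necessarily how the rankings were actually computed. The data layout, the function names, and the toy roster are assumptions. The starters and relievers need nothing special: sort by value and sum the top five and top four.

```python
POSITIONS = ("C", "1B", "2B", "3B", "SS", "LF", "CF", "RF")

def best_lineup(players, positions=POSITIONS):
    """Return (total, {position: player}) for the highest-value full lineup.

    players maps each name to {position: value}. Every position is filled
    by a different player. best[mask] holds the best partial lineup for the
    set of positions encoded in the bit mask.
    """
    best = {0: (0.0, {})}
    for name, values in players.items():
        updated = dict(best)  # option: this player is left out of the lineup
        for mask, (total, lineup) in best.items():
            for i, pos in enumerate(positions):
                if pos in values and not mask & (1 << i):
                    new_mask = mask | (1 << i)
                    candidate = total + values[pos]
                    if new_mask not in updated or candidate > updated[new_mask][0]:
                        updated[new_mask] = (candidate, {**lineup, pos: name})
        best = updated
    full = (1 << len(positions)) - 1
    if full not in best:
        raise ValueError("roster cannot fill every position")
    return best[full]

def position_player_value(players, positions=POSITIONS, extras=2):
    """Best lineup plus the best values of the next `extras` position players."""
    total, lineup = best_lineup(players, positions)
    used = set(lineup.values())
    bench = sorted((max(v.values()) for n, v in players.items() if n not in used),
                   reverse=True)
    return total + sum(bench[:extras])

# Toy roster with only two positions, to keep the example readable.
toy = {
    "A": {"C": 10, "1B": 30},
    "B": {"1B": 25},
    "C": {"C": 5},
    "D": {"C": 8, "1B": 1},
}
print(best_lineup(toy, positions=("C", "1B")))
print(position_player_value(toy, positions=("C", "1B")))
```

Negative values are not filtered out here, in keeping with the post-draft method of summing the run values of all 19 chosen players.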

The post-draft power rankings are depicted in the chart below.

Syracuse has the strongest position players; Orlando has the weakest. La Jolla has the strongest starting rotation; Ocracoke has the weakest. Chesapeake Bay has the strongest bullpen; Mudville has the weakest.

The power rankings are a simple measurement of team strength and may not accurately predict win/loss records.

2017 ABL Draft: Titusville Picks

The most significant feature of this year’s draft was the lack of starting pitching. Relief pitching was good at the top, but poor in the middle. Batters in the draft pool were stronger than normal. Titusville’s picks went very much according to plan for the first five rounds or so, before the normal confusion set in.

ROUND/OVERALL    PLAYER              POSITION
    1/  6        Steven Wright       starter
    2/ 16        Kevin Kiermaier     CF
    3/ 26        Tyler Thornburg     reliever
    4/ 36        Evan Longoria       3B
    5/ 46        Ervin Santana       starter
    6/ 56        Miguel Gonzalez     starter
    7/ 65        Adam Rosales        IF
    8/ 74        Michael Lorenzen    reliever
    9/ 84        George Kontos       reliever
   10/ 93        Jett Bandy          C
   11/102        Curtis Granderson   OF
   12/111        Matt Holliday       1B/LF
   13/120        Fernando Abad       LH reliever
   14/129        Domingo Santana     OF
   15/138        Jackie Bradley      CF
   16/146        Jake Barrett        reliever