2019 Post-Draft Power Ranking

Long Beach Island has the most valuable hitting, and Crown Heights has the most valuable pitching (both starting rotation and bullpen).

The power ranking is based on player value for an average of all ABL parks. Batter value is the sum of the greatest combination of the eight fielding positions, plus the highest values of two remaining position players. Pitching values are from the top five starters and top four relievers (max one closer). Only the active roster is considered, apart from some assumptions about early-season starter taxi moves. The Titusville value is adjusted for system bias.
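As a rough illustration of the roster rule described above, here is a minimal Python sketch. It is not the code behind the ranking: the function names, the toy data, and the choice to fix the best lineup first and pick the two extra bats afterward are all assumptions.

```python
from typing import Dict, List, Tuple

POSITIONS = ("C", "1B", "2B", "3B", "SS", "LF", "CF", "RF")

def best_lineup(roster: Dict[str, Dict[str, float]], positions=POSITIONS):
    """Exhaustive search for the highest-value assignment of players to positions.

    roster maps player name -> {position: value at that position}.
    Returns (total value, {position: player}).
    """
    def search(i, used):
        if i == len(positions):
            return 0.0, {}
        best = (float("-inf"), {})
        for name, values in roster.items():
            if name in used or positions[i] not in values:
                continue
            total, rest = search(i + 1, used | {name})
            total += values[positions[i]]
            if total > best[0]:
                best = (total, {positions[i]: name, **rest})
        return best
    return search(0, frozenset())

def batting_value(roster, positions=POSITIONS, extra_bats=2):
    """Best lineup plus the top values of the remaining position players."""
    total, lineup = best_lineup(roster, positions)
    if total == float("-inf"):
        raise ValueError("roster cannot fill every position")
    starters = set(lineup.values())
    bench = sorted((max(v.values()) for n, v in roster.items() if n not in starters),
                   reverse=True)
    return total + sum(bench[:extra_bats])

def pitching_value(starters: List[float], relievers: List[Tuple[float, bool]],
                   n_starters=5, n_relievers=4):
    """Top starters plus top relievers, allowing at most one closer.

    relievers is a list of (value, is_closer) pairs.
    """
    total = sum(sorted(starters, reverse=True)[:n_starters])
    picked, closer_used = 0, False
    for value, is_closer in sorted(relievers, reverse=True):
        if picked == n_relievers:
            break
        if is_closer and closer_used:
            continue  # a second closer is skipped
        closer_used = closer_used or is_closer
        total += value
        picked += 1
    return total

# Hypothetical two-position example: the best lineup puts A at C and E at 1B.
toy = {"A": {"C": 50, "1B": 10}, "B": {"1B": 40}, "E": {"C": 30, "1B": 42}, "F": {"1B": 5}}
print(batting_value(toy, positions=("C", "1B")))
```

The exhaustive search is fine for a roster of a dozen or so position players, since most players are carded at only a few positions.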

The scale is somewhat different from that of the pre-season ranking, which counted the maximum value of all players, without regard to position.

Power ranking is an estimate of team strength for entertainment purposes only and does not take into account management skill, trading savvy, or the luck of the dice.

Player Value

This is a description of the player values that I compute for the ABL. I have bits and pieces of the explanation in various places, but thought it would be good to have everything in one place.

The basis of my calculations is linear weights, a method for estimating the number of runs a player produces from the count of each play outcome. The particular variety of linear weights I use is called Extrapolated Runs. (See note below.) Each outcome is associated with a run value: a home run is 1.44 runs, a single is 0.5 runs, a strikeout is -0.098 runs. Note that the calculation can be done for both batters and pitchers. Of course, good batters will produce more runs, and good pitchers will allow fewer runs.

Now let’s consider a particular batter’s Triple Play Baseball card. If I can estimate the outcomes of each possible roll (000-999), then I can add up the run values (Extrapolated Runs) for each of those outcomes. If I divide that by 1000, then I have an average run estimate for one plate appearance by that batter. Note that I can do the same thing for a particular pitcher’s card.
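The averaging step can be sketched in a few lines of Python. Only the three run values quoted above are used; the card ranges are made up for illustration and cover just 300 of the 1000 rolls, so the result is a partial contribution rather than a full player estimate.

```python
# Run values quoted in the text (Extrapolated Runs).
RUN_VALUES = {"HR": 1.44, "1B": 0.50, "K": -0.098}

def runs_per_pa(outcome_rolls, run_values=RUN_VALUES):
    """Average run value per plate appearance.

    outcome_rolls maps an outcome to the number of rolls (out of 1000)
    that produce it. Outcomes without a run value raise KeyError, so
    nothing is silently counted as zero.
    """
    if sum(outcome_rolls.values()) > 1000:
        raise ValueError("a card has only 1000 rolls (000-999)")
    return sum(run_values[outcome] * rolls
               for outcome, rolls in outcome_rolls.items()) / 1000

# Hypothetical partial card: 30 HR rolls, 150 single rolls, 120 strikeout rolls.
card = {"HR": 30, "1B": 150, "K": 120}
print(runs_per_pa(card))
```

A full estimate would need a run value for every outcome on both the batter and pitcher halves of the roll.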

To get all those outcomes requires a lot of data and a lot of estimates. The data part involves all the numbers in the main area of the card: this much of a home-run range, this much of an easy-fly range, etc. Then we need to create an average pitcher to face each batter, and vice versa. Then we need to estimate the number of times a batter will face righty and lefty arms, then weight those two values appropriately. We need to calculate the average outcomes of range plays and Deeps! But in the end we can get an estimated runs per plate appearance for every player.

Run values do not take into account the following ratings: injury, jump, steal, speed, hold, catcher throw, outfield throw, and double-play turn.

What’s missing at this point is defense. The Range and Error charts can be used to determine the runs saved by a defender using the same linear weights concept. These adjustments can be applied to a particular player, but if that player is carded at multiple positions, then the combined offensive-defensive run estimate is different for each position.

The goal is to calculate a player “value” that is something like WAR (Wins Above Replacement). Replacement players at different positions have different run-producing capacities. That holds true for both MLB and the ABL. For the ABL I set replacement levels close to the estimated run levels of the best available free agents at each position during the regular season. That level of runs at each position becomes the zero point of my calculated player value. The zero-adjusted run values are then scaled such that only the best players have a player value above 100. Players can have negative player values when free agents with higher run estimates are available at a position.
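A minimal sketch of the zero point and scaling, assuming the scaling is linear. The exact formula is not spelled out above, and the parameter names and sample numbers are hypothetical. It is written for batters; for pitchers, where fewer runs is better, the sign of the difference would flip.

```python
def player_value(runs, replacement_runs, superstar_runs):
    """Map a run estimate onto a 0-to-100 scale.

    0 corresponds to the replacement level at the position and 100 to an
    arbitrary superstar level. Values below replacement come out negative.
    """
    return 100.0 * (runs - replacement_runs) / (superstar_runs - replacement_runs)

# Hypothetical levels: replacement at 0.10 runs/PA, superstar at 0.20 runs/PA.
print(player_value(0.15, 0.10, 0.20))  # halfway between the two levels
print(player_value(0.08, 0.10, 0.20))  # below replacement, so negative
```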

Defensive ability and position value can lead to very different player values for the same player. For example, an average-hitting catcher may have a significant value behind the plate, but a very low value playing first base, especially if his defense at first is FR/8.

Values are adjusted according to the average number of appearances as a full-time pitcher or position player. For example, on average closers will face fewer batters than a starter, so a closer’s value is adjusted down relative to a starter.

When I total the value of all players on a team, I do not count players with negative player value, because such players are unlikely to get lots of playing time. If a player plays multiple positions, I use the position with the highest value.
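The totaling rule is simple enough to show directly. This is a sketch with made-up names and values, not the actual spreadsheet or program.

```python
def team_value(players):
    """Sum each player's best positional value, skipping negative values.

    players maps a name to {position: player value}.
    """
    total = 0.0
    for values in players.values():
        best = max(values.values())  # multi-position players count at their best spot
        if best > 0:                 # negative-value players are not counted
            total += best
    return total

# Hypothetical team: a catcher who can also play first, a below-replacement
# shortstop, and a starting pitcher.
team = {"A": {"C": 60, "1B": 20}, "B": {"SS": -5}, "D": {"SP": 35}}
print(team_value(team))
```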

Since all free agents are rated, I can use historical ABL draft data to estimate the player value for various points in the draft.

TL;DR: The numbers and ratings on the cards are used to estimate the frequency of outcomes (single, home run, walk, strikeout, etc.). The outcomes are converted into runs using linear weights. The run estimate is adjusted for defense, then adjusted to a scale with zero indicating that an equivalent free-agent player is available, and 100 indicating an arbitrary superstar level.

A note on Extrapolated Runs
Extrapolated Runs (XR) appealed to me because it is an estimate of absolute runs, unlike Palmer’s Batting Runs, which is measured relative to an average player. XR also includes double plays, which can be estimated from TPB cards.

The big weakness of XR is that it’s formulated to apply over a large span of seasons, specifically 1955-1997. I haven’t found any XR coefficients for single, recent seasons.

Jim Furtado wrote an article about the development of XR in 1999.

2018 ABL Pre-Season Power Rankings

The power rankings are based on the run value of players relative to a replacement player. Replacement-player values are based on post-draft free agents at each position, but these pre-season ratings are based on replacement levels from the 2017 ABL season. The scale is set to zero for replacement players and 100 for an arbitrary “superstar” level. Run values are adjusted to expected game participation of full-time regular position players, starters, and relievers. Run values do not take into account the following ratings: injury, jump, steal, speed, hold, catcher throw, outfield throw, and double-play turn. Run values are based on an average of all current ABL parks.

The values of keeps are summed without regard to position. For example, if four keeps for one team can play only first base, all four are still counted.

The estimated value for draft picks is calculated differently from last season. Last season I used an average value from each round, taking into account the last few drafts. This time I assumed that the picks would start with the highest-value free agent and always proceed to the next highest-value free agent. This is not ideal, as value will tend to fall from the highest picks to lower picks, but at least it accounts for the order of picks within each round.
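The pick-by-pick assumption can be sketched as follows. The function names and sample values are hypothetical; the only rule taken from the text is that pick number n is credited with the nth highest free-agent value.

```python
def pick_values(free_agent_values, n_picks):
    """Assign the nth highest free-agent value to overall pick n (1-based)."""
    ranked = sorted(free_agent_values, reverse=True)
    return {pick: ranked[pick - 1]
            for pick in range(1, min(n_picks, len(ranked)) + 1)}

def team_draft_value(pick_numbers, values_by_pick):
    """Total estimated value of the overall picks a team holds."""
    return sum(values_by_pick.get(pick, 0.0) for pick in pick_numbers)

# Hypothetical free-agent pool and a team holding picks 1 and 3.
values = pick_values([10, 50, 30, 40], n_picks=3)
print(values)
print(team_draft_value([1, 3], values))
```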

Goodbye R, Hello L

During the 2014 ABL season everyone noticed the increase in pitcher cards with the R symbol. I wrote about it in the 2014 ABL Yearbook. Now that my 2015 card data is in the computer, it’s a good opportunity to see if the Rs are still as numerous. I did a simple count of the pitchers in recent seasons that have each symbol. Starters and relievers are all grouped together. The data is from only the pitchers with ABL eligibility; not all Triple Play cards are represented. I don’t think I’ve missed too many eligible players over the last few seasons, but the first couple of seasons considered here are probably missing a few, especially for the 2008 season. The years listed refer to the ABL season, so the 2015 data is from the 2014 Triple Play cards that we’ll be using in the upcoming 2015 ABL season. OK, enough of the fine-print bullshit, let’s go to the graphs.

sym_homerun

Well, it looks like 2014 was a blip for the R symbol. The frequency has dropped down to the previous level.

The H symbol continues to occur infrequently. (To the relief of all ABL managers!) It’s interesting that the level of the H symbol seems to follow that of the R from year to year. I didn’t notice that before, probably because the yearbook study weighted the symbols by how many innings were pitched in the ABL, and nobody likes to give an H pitcher a lot of innings. In 2014, when the R frequency doubled, the H frequency doubled too, from 4.5% to 9.5%! In 2015 it’s back down to 4.5%.

sym_walk

The L symbol is back with a vengeance! Lots of shorts have the L this year, and it looks like every single qualified closer has one. In the yearbook I speculated that the combination of B & L might be constant. It sure doesn’t look like that in 2015. This season should see more walks than ever before erased from batter cards, because the frequency of Bs is up too.

sym_single

And finally, the F symbol (found on relievers’ cards only) has not fluctuated much over the years.

In summary, compared to last season, expect fewer homers & deeps to be re-rolled, and expect to lose more walks off the batter’s card.

R & H Symbols

A few guys have mentioned that there are a lot more R symbols out there this season. Commish & I were talking about it and speculated about how the symbols are calculated. I guessed that the R & H symbols depend solely on how many home runs a pitcher gave up with runners on base relative to the total number of homers he surrendered.

I collected some stats from Baseball Reference to see how they compared to the symbols. I initially selected the 43 starters currently on the ABL active rosters. I later added some H-symbol starters from Taxi Teams and the free-agent pool, because the H symbols were underrepresented. I didn’t look at any relievers, but I don’t expect they would have rules different from the starters. I looked at the 2013 MLB stats and the TPB cards we’re using for the 2014 ABL season. In B-R you can find the relevant stats under the “Splits” menu in the “Standard Pitching” section on the particular pitcher’s page. Scroll down to the “Bases Occupied” table. Strasburg’s stats are shown below: 7 homers with the bases empty, 9 with runners on.

rh1

I noticed some patterns and figured out an easy rule that predicted all the actual symbols. It’s best understood by looking at the grid shown below. There are two measurements that figure in. The first is the number of homers hit with runners on base divided by the total number of homers. Call this HRonbase. My initial thought was that the symbols would depend on this number only. The average value of this measurement in my sample is 40%. The second measurement is the overall home-run rate: the total number of homers surrendered divided by the batters faced. Call this one HRt. The average value in my sample is 2.2%. So here’s the table showing how the combination of these measurements determines the symbol:

rh2

When the overall home-run rate is greater than 2%, the symbols act like I expected them to. If the percentage of home runs with runners on is large, the guy gets an H. If that percentage is small, he gets an R. But it’s a different story when the overall home-run rate is less than 2%. In that case, it doesn’t matter what the stats are for on-base and bases empty; the guy gets an R, period. The clearest example is Henderson Alvarez, who had guys on base every time a home run was hit against him. But that was only two homers in 418 plate appearances, a very low rate of 0.48%. That low rate earned him an R, despite the fact that he gave up zero solo shots.
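Here is the rule as a small Python sketch. The 2% cutoff on the overall rate comes from the text. The two HRonbase thresholds are placeholders, since the text gives no exact values for "large" and "small"; treat `high_on_base` and `low_on_base` as assumptions, along with the function name.

```python
def rh_symbol(hr_runners_on, hr_total, batters_faced,
              rate_cutoff=0.02, high_on_base=0.50, low_on_base=0.30):
    """Guess a pitcher's R/H symbol from his home-run splits.

    rate_cutoff is the 2% overall home-run rate from the text.
    high_on_base and low_on_base are hypothetical thresholds on HRonbase.
    Returns "R", "H", or "" for no symbol.
    """
    if batters_faced <= 0:
        raise ValueError("batters_faced must be positive")
    if hr_total == 0:
        return "R"  # a zero rate is below any cutoff
    if hr_total / batters_faced < rate_cutoff:
        return "R"  # low overall rate: R regardless of the split
    on_base_share = hr_runners_on / hr_total  # HRonbase
    if on_base_share >= high_on_base:
        return "H"
    if on_base_share <= low_on_base:
        return "R"
    return ""

# Henderson Alvarez: both homers came with runners on, but only 2 in 418 PA.
print(rh_symbol(2, 2, 418))
```

With the placeholder thresholds, a pitcher right at the 40% sample average for HRonbase and an overall rate above 2% gets no symbol.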

So it’s obvious that the R symbol is used to reduce the number of homers from the batter’s card when the pitcher gives up fewer than average home runs in general. With power becoming scarcer recently, it’s not surprising that more Rs are required. On the other hand, although there were 273 fewer home runs in 2013 compared to 2012 (as Commish pointed out), there were even fewer in 2011.

I wondered why the overall homer rate couldn’t instead be handled via the Deep ranges. I think the answer is that if you lose the Deeps, then you lose the park variation that forms such an important part of the game. If a guy has no Deep ranges (and there are some, of course), then it doesn’t matter what park he’s pitching in or what Power the batter has (except for the Deeps from Park Effects, of course).

So, my conclusion is that the R & H symbols are based more on the overall home-run rate of the pitcher, and not so much on the state of the bases when the home runs were hit.

ABL Reliever Usage Rules

Are the ABL reliever usage limits realistic? One way to try to answer this is to ask the question: “What percentage of 2008 MLB relief appearances would have violated the 2009 ABL rules?” If the answer is “0%,” then I’d say the ABL rules are too lenient in allowing relievers to pitch a lot. At the other extreme, if the answer is “50%,” then I’d say the ABL rules are too strict and don’t allow pitchers to pitch enough. What’s the right percentage that would make you feel that the ABL rules are just about right? 1%? 5%? 10%? More?

Here are the current ABL rules for reference:

Short  
IP          REST 
0-2         0** 
2.1-3       1 
3.1-4       2 
4.1-over    3   
 
Closer 
IP          REST 
0-1         0*** 
1.1-2       1 
2.1-3       2   
3.1-over    3  
 
**  Cannot pitch more than 2 consecutive games 
***  Cannot pitch more than 3 consecutive games 
Note: Short cannot pitch more than 4 IP’s unless no other pitchers are available. 
Note: Closer cannot pitch more than 3 IP’s unless no other pitchers are available. 
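To test real appearances against these tables, the rest requirement can be coded as a lookup. This is a sketch: innings are passed as outs (so 2.1 IP is 7 outs), the consecutive-game caps are not modeled, and the function names are my own.

```python
# (maximum innings pitched, required days of rest), from the tables above.
SHORT_LIMITS = ((2, 0), (3, 1), (4, 2))
CLOSER_LIMITS = ((1, 0), (2, 1), (3, 2))

def rest_days(role, outs):
    """Required rest after an appearance of `outs` outs (3 outs = 1 IP).

    role is "short" or "closer". Anything beyond the last row needs 3 days.
    """
    limits = {"short": SHORT_LIMITS, "closer": CLOSER_LIMITS}[role]
    for max_innings, rest in limits:
        if outs <= 3 * max_innings:
            return rest
    return 3

def is_violation(role, previous_outs, days_off):
    """True if the pitcher returned with fewer days off than required."""
    return days_off < rest_days(role, previous_outs)

print(rest_days("short", 7))      # 2.1 IP
print(is_violation("closer", 4, 0))  # 1.1 IP, then pitching the next day
```

Running `is_violation` over a season of MLB relief appearances and dividing by the number of appearances would give the percentage asked about above.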


average pitcher card

A companion to the average batter card, here is the average pitcher card. This is a bit of a hack—there’s no adjustment between starters & relievers. The pool is 107 pitchers, made up from the best of the draft, the Perfectos, and most pitchers from the teams I’ve played so far.

   vs L                vs R
---------            ---------
500 - 519     WP?    500 - 519
520 - 547  Range IF  520 - 545
548 - 573  Range OF  546 - 573
574 - 599     EF     574 - 603
600 - 616     RG     604 - 617
617 - 642     1B     618 - 645
643 - 669     EF     646 - 675
670 - 686     RG     676 - 690
687 - 704     SG     691 - 705
705 - 707     HB     706 - 711
708 - 734     1B     712 - 739
735 - 753     2B     740 - 755
754 - 776    Deep!   756 - 776
777 - 803  EF/Tired? 777 - 807
804 - 862     K      808 - 867
863 - 914   K/Tired? 868 - 920
915 - 965     BB     921 - 954
966 - 999     DP     955 - 999

Here are the range numbers:

  vs L              vs R
 ------            ------
  20.0      WP?     20.0
  28.4   Range IF   26.2
  25.7   Range OF   27.6
  26.1      EF      30.0
  17.0      RG      14.5
  26.2      1B      27.8
  26.4      EF      30.3
  17.4      RG      14.8
  17.7      SG      15.1
   3.7      HB       5.7
  26.6      1B      28.3
  18.9      2B      15.8
  23.5     Deep!    21.3
  26.7   EF/Tired?  30.7
  59.0      K       60.2
  51.8    K/Tired?  52.7
  50.9      BB      33.7
  34.4      DP      45.2

Home-field advantage and Park Effects

Following up on the previous post about home-field advantage, I got the MLB batting splits for the last five years from BR.com, which are summarized in the table below. (Here are the 2007 splits.)

hv-splits.png

The biggest effect is that the home team strikes out less. The next largest is more walks for the home team. To capture these effects, we need at least 12 rolls, which is an almost perfect fit into the 16 Park Effects rolls, something suggested by the Commish. A possible Park Effects replacement chart appears below. Since the Park Effects range is identical for every batter, we can eliminate the second roll and simply read the result from the original roll that landed us on Park Effects.

pe1.png

I worked in a couple of things that don’t otherwise tend to happen in the ABL: infield pop outs (pointed out by cnc14) and extra advancement on hits.

One could argue that the home team should have a few SF rolls to account for the extra sac-fly production.

This chart would produce a little more offense. A quick calculation shows that a Park Effects roll on this new chart would yield an average of 0.6 bases per roll, whereas the TPB chart yields only about 0.3 bases per Park Effects roll.

Effect of pitcher symbols on batter rating

I adjusted my rating method to take average pitcher symbols into account on the batters’ cards. As expected, the effect is most profound for guys with big walk & HR ranges, as can be seen for a few guys in the table below. “Before” means pitcher symbols were not taken into account; “after” means they were taken into account.

[image lost]

The higher-rated players will have bigger differences, simply because lots of homers & walks make them highly rated. Of course, nothing changed in the cards, it’s just that I overrated the batters. Assuming my current ratings are much more accurate, I overrated Burrell’s performance by 7.4%, not an insignificant amount!

Average pitcher symbols

For a calculation, I need to know the symbol content of the average pitcher. I did the starters and relievers separately, using the ABL pitchers. Say that of 53 likely ABL starters, 19 have a B symbol. That means that the average starter has 19/53 (0.36) of a B. The chart below shows the averages for all the symbols. Starters & relievers are combined in a ratio of 7:2, to estimate the pitching over an entire game.
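The averaging works out as below. The 19-of-53 starter figure is from the text; the reliever share in the example is a made-up number used only to show the 7:2 weighting.

```python
def symbol_share(pitchers_with_symbol, pool_size):
    """Fraction of a symbol carried by the average pitcher in the pool."""
    return pitchers_with_symbol / pool_size

def game_average(starter_share, reliever_share,
                 starter_weight=7, reliever_weight=2):
    """Blend starter and reliever shares in a 7:2 ratio."""
    total_weight = starter_weight + reliever_weight
    return (starter_weight * starter_share
            + reliever_weight * reliever_share) / total_weight

b_starters = symbol_share(19, 53)      # about 0.36 of a B per starter
print(round(b_starters, 2))
print(game_average(b_starters, 0.45))  # 0.45 is a hypothetical reliever share
```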

avg-symbols.png

Starters & relievers are quite comparable in terms of the walk symbols, B & L. Relievers have a definite edge for the runners-on symbols, R & H. Not surprisingly, only relievers have Fs.

quantifying error rating in terms of batting

This is a first step toward measuring defense in terms of offense. The question to answer is: how much good hitting makes up for poor fielding? Or, to look at it from the other side, to what extent does superior glove work compensate for light hitting?

There are two components to defense: error and range. Let’s consider error rating only for the moment. And let’s hope my logic is correct.

For one PA of his own, a player will be in the field for approximately nine PAs by the opponent. For one opponent PA, the chance of rolling a possible error is 6.3%, which is 60 rolls on 11-70 plus about 20% of the 16 rolls on 81-96 (Park Effects). A possible error will fall on our guy’s position 10% of the time. So, over 9 opponent PAs, the average number of possible errors for our guy will be 9 * 6.3% * 10% = 0.057, which is equivalent to 57 rolls.

Now we have to consider specific positions. Let’s look at 3B/SS, where errors occur the most, and let’s consider the most extreme difference in error ratings. On a possible error, an error-20 third baseman (Jeff Cirillo) will make an error 24% of the time, while an error-1 (Ryan Braun) will boot it 99% of the time. Let’s ignore the more complex outcomes (E(2), 1B+E(1), RG+) and just say that an error is equivalent to a single. In other words, an error made on defense neutralizes a single hit by the same guy on offense. The 57 rolls calculated above will work out to 14 rolls for a Cirillo error/single and 56 rolls for a Braun error/single, a difference of 42 rolls.
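The arithmetic of the last two paragraphs, spelled out in Python. All the inputs are the figures quoted in the text; the constant and function names are mine, and the rounding to whole rolls follows the text.

```python
ERROR_ROLLS = 60          # rolls 11-70
PARK_ROLLS = 16           # rolls 81-96 (Park Effects)
PARK_ERROR_SHARE = 0.20   # about 20% of Park Effects rolls are possible errors
POSITION_SHARE = 0.10     # a possible error lands on a given position 10% of the time
OPPONENT_PAS = 9          # opponent PAs in the field per PA of his own

def error_chances_per_1000():
    """Possible errors at one position, expressed as rolls per 1000."""
    p_possible_error = (ERROR_ROLLS + PARK_ERROR_SHARE * PARK_ROLLS) / 1000
    return round(OPPONENT_PAS * p_possible_error * POSITION_SHARE * 1000)

def neutralized_rolls(error_probability):
    """Single rolls cancelled out by a fielder's errors."""
    return round(error_chances_per_1000() * error_probability)

cirillo = neutralized_rolls(0.24)  # error-20 third baseman
braun = neutralized_rolls(0.99)    # error-1 third baseman
print(error_chances_per_1000(), cirillo, braun, braun - cirillo)
```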

One could use these results to adjust a player’s offensive rating. 14 of Cirillo’s average 25 1B range are neutralized by his errors. Via linear weights, Braun’s 56 neutralized singles are equivalent to 18 neutralized HR rolls. (Braun has a 95 HR range vs. L, 35 vs. R.) You could also do a comparison and say that Braun would have to have 42 more 1B rolls (or 14 more HR rolls, or equivalent) than Cirillo to make him equal value, all other things (range, power, etc.) being equal.

These “neutralized rolls” can be similarly calculated for other positions and all the error ratings. The results are plotted in the graph below. I didn’t count E(0)s (dropped fouls) as errors on the catcher.

neutral1.png

The numbers in the chart are just approximations, and each position’s accuracy is different. For example, all the catcher errors counted are of the two-base variety. Anyway, it’s a start.

It’s interesting that, with the exception of the catcher & outfielders, all the lines are nearly parallel. So the difference between an error-20 & an error-1 at a given position is pretty much the same for either 1B, 2B, SS, 3B, or P.

average batter card

Using the rating data (the best half of the free agents plus the Titusville roster), an average batter card can be calculated.

   vs L                vs R
---------           ---------
  0 -  10    Crazy    0 -  10
 11 -  70    Error   11 -  70
 71 -  80     LO     71 -  80
 81 -  96    Park    81 -  96
 97 - 140     1B     97 - 134
141 - 152     2B    135 - 144
153 - 174     DP    145 - 166
175 - 218     1B    167 - 204
219 - 221     3B    205 - 206
222 - 229     SF    207 - 215
230 - 245     HR    216 - 229
246 - 250     HB    230 - 234
251 - 278     RG    235 - 270
279 - 324     BB    271 - 302
325 - 399     K     303 - 379
400 - 412     2B    380 - 389
413 - 441     RG    390 - 426
442 - 499     EF    427 - 499

In terms of ranges, it looks like this:

  vs L             vs R
 ------           ------
   11.0    Crazy    11.0
   60.0    Error    60.0
   10.0     LO      10.0
   16.0    Park     16.0
   43.5     1B      37.9
   12.0     2B       9.7
   22.5     DP      22.2
   44.0     1B      38.4
    2.9     3B       2.3
    8.1     SF       8.3
   16.0     HR      13.9
    4.5     HB       5.2
   28.5     RG      36.1
   46.3     BB      32.2
   75.0     K       77.1
   12.4     2B      10.2
   28.9     RG      36.6
   58.3     EF      73.1

You can see why lefty pitchers can have a hard time.
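To put a number on that, the hit and on-base ranges from the table can be totaled for each side. This is only a tally of the batter-card rows listed above; it ignores Error, Park, Crazy, and everything on the pitcher's half of the roll.

```python
# (outcome, range vs L, range vs R), copied from the range table above.
ROWS = [
    ("1B", 43.5, 37.9), ("2B", 12.0, 9.7), ("1B", 44.0, 38.4),
    ("3B", 2.9, 2.3), ("HR", 16.0, 13.9), ("HB", 4.5, 5.2),
    ("BB", 46.3, 32.2), ("2B", 12.4, 10.2),
]

def total_range(outcomes):
    """Sum the vs-L and vs-R ranges over the given outcome labels."""
    vs_left = sum(left for name, left, _ in ROWS if name in outcomes)
    vs_right = sum(right for name, _, right in ROWS if name in outcomes)
    return round(vs_left, 1), round(vs_right, 1)

HITS = {"1B", "2B", "3B", "HR"}
print(total_range(HITS))                 # hit rolls vs L, vs R
print(total_range(HITS | {"BB", "HB"}))  # on-base rolls vs L, vs R
```

The average batter has roughly 18 more hit rolls and 14 more walk rolls against lefties than against righties.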

Deep Engine 2

More data from the Deep Engine. All results are based on ten million trials.

Here are the results for all 30 parks from the TPB 2007 data:

     power:     5        4        3        2        1

   homerun:   48.55%   32.38%   19.18%    9.26%    3.11%
    caught:   47.55%   63.72%   76.92%   86.84%   92.99%
      foul:    3.90%    3.90%    3.90%    3.89%    3.90%

As expected, no significant changes from 2006.

I re-ran with the 12 2008 ABL parks, using the TPB 2007 data.

     power:     5        4        3        2        1

   homerun:   40.45%   24.74%   12.86%    5.28%    1.54%
    caught:   55.65%   71.36%   83.24%   90.82%   94.56%
      foul:    3.90%    3.90%    3.90%    3.89%    3.90%

Wow, there are some big parks in the 2008 ABL! It’s much harder to homer, especially for the light hitters who will find it almost twice as hard to hit them out in the ABL compared to the 30-park circuit.

Now let’s see how the numbers look for the different hitting types. Again, this is ten million trials in the 2008 ABL parks.

Rsp
     power:     5        4        3        2        1
   homerun:   39.79%   24.05%   12.36%    5.01%    1.44%
    caught:   57.61%   73.34%   85.03%   92.38%   95.95%
      foul:    2.60%    2.61%    2.61%    2.60%    2.60%


Lsp
     power:     5        4        3        2        1
   homerun:   39.96%   24.20%   12.48%    5.02%    1.46%
    caught:   57.44%   73.19%   84.91%   92.38%   95.94%
      foul:    2.60%    2.60%    2.61%    2.59%    2.60%


Rp
     power:     5        4        3        2        1
   homerun:   40.83%   25.34%   13.35%    5.72%    1.68%
    caught:   53.98%   69.47%   81.46%   89.09%   93.11%
      foul:    5.19%    5.19%    5.19%    5.19%    5.21%


Lp
     power:     5        4        3        2        1
   homerun:   41.23%   25.34%   13.21%    5.37%    1.58%
    caught:   53.55%   69.47%   81.58%   89.43%   93.23%
      foul:    5.21%    5.19%    5.21%    5.20%    5.19%

Not surprisingly, the pull hitters end up with more foul balls. In spite of that, they still end up with a greater probability of homering.

This data can be combined with the batter’s power & the average deeps to estimate the number of home runs a batter will get with Deep! rolls against an average pitcher. Actually, the home-run potential is so similar among the hitting types that it’s not worth making a distinction. So, for example, a power-5 hitter will homer on about 40% of his 18.7 deep rolls against the average pitcher, effectively giving him an additional 7.5 home-run range.

Combine this with the power distribution, and the probability of a home run on a Deep roll works out to 20.4%. That’s an important number for rating individual pitchers against the average batter.
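Both calculations are short enough to sketch. The home-run percentages are the 2008 ABL park results from the table above. The power distribution in the example is invented, since the real one isn't listed here, so it will not reproduce the 20.4% figure.

```python
# Probability of a home run on a Deep! roll by power, 2008 ABL parks.
HR_ON_DEEP = {5: 0.4045, 4: 0.2474, 3: 0.1286, 2: 0.0528, 1: 0.0154}

def extra_hr_range(power, deep_rolls):
    """Home-run rolls a batter effectively gains from Deep! rolls."""
    return HR_ON_DEEP[power] * deep_rolls

def average_hr_on_deep(power_share):
    """Home-run probability on a Deep! roll for the average batter.

    power_share maps power rating -> fraction of batters with that rating.
    """
    if abs(sum(power_share.values()) - 1.0) > 1e-9:
        raise ValueError("power shares must sum to 1")
    return sum(share * HR_ON_DEEP[power] for power, share in power_share.items())

print(extra_hr_range(5, 18.7))  # close to the 7.5 quoted in the text

# Hypothetical power distribution, for illustration only.
shares = {5: 0.10, 4: 0.20, 3: 0.30, 2: 0.25, 1: 0.15}
print(average_hr_on_deep(shares))
```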

Deep Engine 1

For the ABL draft & season I’ll develop some ratings based on the card ranges. The batter’s card is pretty straightforward. The ranges can be used to directly compute things like OBP & Slugging that are independent of pitcher-card rolls. The one vital correction, however, is the power, which will determine how many HR result from Deep! rolls. So, I need to get a handle on how power affects those probabilities. Later, I can make calculations based on Deep! ranges, either averages or against particular pitchers.

I wrote a “Deep Engine” that captures the location and distance data and allows me to run some Monte Carlo simulations. For now I’m going to ignore robbed HRs, which is a pretty small effect. For the first calculation I’ll assume a random distribution of pull types (Lsp, Lp, Rsp, Rp) and use all the 2006 parks. (Later I should whittle it down to the 2007 parks in the ABL.) I ran ten million trials for each power rating and came up with the following probabilities.

     power:     5        4        3        2        1

  home run:   48.61%   32.40%   19.17%    9.27%    3.11%
    caught:   47.51%   63.71%   76.93%   86.82%   92.98%
      foul:    3.89%    3.89%    3.89%    3.91%    3.91%

So a power-5 guy has about a 50% chance of hitting it out on a Deep! The average power-1 batter is more likely to send it foul than over the fence fair.
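For readers who haven't seen a Monte Carlo run before, here is a toy version of the idea. It is not the Deep Engine: the real thing uses the TPB location and distance data, whereas every number here (the foul chance, the distance model, the fence distances) is a made-up stand-in. Only the structure, many random trials tallied into home run, caught, or foul, is the point.

```python
import random

def simulate_deep(power, fences, trials=100_000, foul_chance=0.039, seed=0):
    """Toy Monte Carlo of Deep! outcomes.

    power: batter power rating, 1-5.
    fences: fence distances in feet, one picked at random per trial
            (a crude stand-in for park and location data).
    foul_chance and the distance model below are hypothetical.
    Returns the fraction of trials ending in each outcome.
    """
    rng = random.Random(seed)
    counts = {"homerun": 0, "caught": 0, "foul": 0}
    mean_distance = 330 + 15 * power  # hypothetical: more power, longer flies
    for _ in range(trials):
        if rng.random() < foul_chance:
            counts["foul"] += 1
            continue
        fence = rng.choice(fences)
        distance = rng.gauss(mean_distance, 25)
        counts["homerun" if distance > fence else "caught"] += 1
    return {outcome: n / trials for outcome, n in counts.items()}

for power in (5, 3, 1):
    print(power, simulate_deep(power, fences=[380, 400, 420]))
```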

Range Factor and range ratings

The Bill James Handbook lists Range Factor, which is the number of Successful Chances (Putouts plus Assists) times nine divided by the number of Defensive Innings Played. Does this statistic correlate with the TPB range ratings? I picked a couple of the more important defensive positions and compared the 2007 Range Factors for starters with the TPB range ratings from the 2007 TPB Statistics Book. Graphs for shortstops and center fielders are below.
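Range Factor as defined above, in code form. The sample numbers are invented, not any particular player's stats.

```python
def range_factor(putouts, assists, innings):
    """Successful chances (putouts plus assists) per nine defensive innings."""
    if innings <= 0:
        raise ValueError("innings must be positive")
    return 9 * (putouts + assists) / innings

# Hypothetical shortstop: 250 putouts and 450 assists in 1350 innings.
print(round(range_factor(250, 450, 1350), 2))
```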

Shortstops show a bit of correlation. It’s no surprise to me that Furcal & Vizquel are highly rated by both measurements. I’m surprised to see that Reyes has such a low Range Factor.

Center fielders are all over the place. Vernon Wells has a Superior TPB rating and the lowest Range Factor!

The red lines are the linear fits to the data. The graphs assume that the TPB ratings are linear, that is, that the difference between VG & SP is the same as between PR & WK. Whether or not that’s the intent, it’s clear that there’s no strong correlation between the Range Factor and the TPB range rating. That could mean that either 1) the two measurements are meant for different purposes, or 2) one or both of the measurements are inaccurate.
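A plain least-squares fit, with the correlation coefficient, looks like this in Python. It assumes what the graphs assume: that the range ratings have already been converted to evenly spaced numbers. The example data is synthetic and exists only to show the call.

```python
import math

def linear_fit(xs, ys):
    """Ordinary least-squares line through (xs, ys).

    Returns (slope, intercept, r), where r is the Pearson correlation.
    xs would be range ratings coded as evenly spaced numbers and
    ys the matching Range Factors.
    """
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two or more paired points")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r

# Synthetic, perfectly linear data: slope 2, intercept 0, r = 1.
print(linear_fit([1, 2, 3, 4], [2, 4, 6, 8]))
```

An r near zero, as with the center fielders, means the rating tells you almost nothing about the Range Factor.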

I don’t think #1 is likely. Surely each is trying to quantify the ability of a fielder to field balls that are hit in his general direction. Of course, measuring any kind of defensive ability is difficult. (See this discussion of various methods.) Whatever the case, it’s clear that the TPB ratings are not based on Range Factor.