Goodbye R, Hello L

During the 2014 ABL season everyone noticed the increase in pitcher cards with the R symbol. I wrote about it in the 2014 ABL Yearbook. Now that my 2015 card data is in the computer, it’s a good opportunity to see if the Rs are still as numerous. I did a simple count of the pitchers in recent seasons that have each symbol. Starters and relievers are all grouped together. The data is from only the pitchers with ABL eligibility; not all Triple Play cards are represented. I don’t think I’ve missed too many eligible players over the last few seasons, but the first couple of seasons considered here are probably missing a few, especially for the 2008 season. The years listed refer to the ABL season, so the 2015 data is from the 2014 Triple Play cards that we’ll be using in the upcoming 2015 ABL season. OK, enough of the fine-print bullshit, let’s go to the graphs.

sym_homerun

Well, it looks like 2014 was a blip for the R symbol. The frequency has dropped down to the previous level.

The H symbol continues to occur infrequently. (To the relief of all ABL managers!) It’s interesting that the level of the H symbol seems to follow that of the R from year to year. I didn’t notice that before, probably because the yearbook study weighted the symbols by how many innings were pitched in the ABL, and nobody likes to give an H pitcher a lot of innings. In 2014, when the R frequency doubled, the H frequency doubled too, from 4.5% to 9.5%! In 2015 it’s back down to 4.5%.

sym_walk

The L symbol is back with a vengeance! Lots of shorts have the L this year, and it looks like every single qualified closer has one. In the yearbook I speculated that the combination of B & L might be constant. It sure doesn’t look like that in 2015. This season should see more walks than ever before erased from batter cards, because the frequency of Bs is up too.

sym_single

And finally, the F symbol (found on relievers’ cards only) has not fluctuated much over the years.

In summary, compared to last season, expect fewer homers & deeps to be re-rolled, and expect to lose more walks off the batter’s card.

R & H Symbols

A few guys have mentioned that there are a lot more R symbols out there this season. Commish & I were talking about it and speculated about how the symbols are calculated. I guessed that the R & H symbols depend solely on how many home runs a pitcher gave up with runners on base relative to the total number of homers he surrendered.

I collected some stats from Baseball Reference to see how they compared to the symbols. I initially selected the 43 starters currently on the ABL active rosters. I later added some H-symbol starters from Taxi Teams and the free-agent pool, because the H symbols were underrepresented. I didn’t look at any relievers, but I don’t expect they would have rules different from the starters. I looked at the 2013 MLB stats and the TPB cards we’re using for the 2014 ABL season. In B-R you can find the relevant stats under the “Splits” menu in the “Standard Pitching” section on the particular pitcher’s page. Scroll down to the “Bases Occupied” table. Strasburg’s stats are shown below: 7 homers with the bases empty, 9 with runners on.

rh1

I noticed some patterns and figured out an easy rule that predicted all the actual symbols. It’s best understood by looking at the grid shown below. There are two measurements that figure in. The first is the number of homers hit with runners on base divided by the total number of homers. Call this HRonbase. My initial thought was that the symbols would depend on this number only. The average value of this measurement in my sample is 40%. The second measurement is the overall home-run rate: the total number of homers surrendered divided by the batters faced. Call this one HRt. The average value in my sample is 2.2%. So here’s the table showing how the combination of these measurements determines the symbol:

rh2

When the overall home-run rate is greater than 2%, the symbols act like I expected them to. If the percentage of home runs with runners on is large, the guy gets an H. If that percentage is small, he gets an R. But it’s a different story when the overall home-run rate is less than 2%. In that case, it doesn’t matter what the stats are for on-base and bases empty; the guy gets an R, period. The clearest example is Henderson Alvarez, who had guys on base every time a home run was hit against him. But that was only two homers in 418 plate appearances, a very low rate of 0.48%. That low rate earned him an R, despite the fact that he gave up zero solo shots.
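Here is a minimal sketch of that rule in Python, as I understand it from the grid. The 2% cutoff on the overall rate comes from the text above; the function name and the `low_cut` and `high_cut` arguments are my own placeholders, because the exact HRonbase cutoffs live in the grid image and are not spelled out here. Returning an empty string (no symbol) for the middle band is also an assumption.

```python
def rh_symbol(hr_empty, hr_on, batters_faced, low_cut, high_cut, rate_cut=0.02):
    """Guess a pitcher's R/H symbol from his home-run splits.

    low_cut and high_cut are caller-supplied HRonbase cutoffs (hypothetical;
    the post does not state them). rate_cut is the 2% HRt threshold.
    """
    total_hr = hr_empty + hr_on
    hr_t = total_hr / batters_faced          # overall home-run rate (HRt)
    if hr_t < rate_cut:
        return "R"                           # low overall rate: R, period
    hr_onbase = hr_on / total_hr             # share of homers with runners on
    if hr_onbase >= high_cut:
        return "H"
    if hr_onbase <= low_cut:
        return "R"
    return ""                                # assumed: no symbol in between


# Henderson Alvarez: 2 homers in 418 PA, both with runners on.
# The cutoffs passed here are illustrative only; they do not affect this case.
print(rh_symbol(hr_empty=0, hr_on=2, batters_faced=418, low_cut=0.3, high_cut=0.5))
```

Whatever cutoffs you plug in, Alvarez comes out as an R, because the low overall rate is checked first.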

So it’s obvious that the R symbol is used to reduce the number of homers from the batter’s card when the pitcher gives up fewer than average home runs in general. With power becoming scarcer recently, it’s not surprising that more Rs are required. On the other hand, although there were 273 fewer home runs in 2013 compared to 2012 (as Commish pointed out), there were even fewer in 2011.

I wondered why the overall homer rate couldn’t instead be handled via the Deep ranges. I think the answer is that if you lose the Deeps, then you lose the park variation that forms such an important part of the game. If a guy has no Deep ranges (and there are some, of course), then it doesn’t matter what park he’s pitching in or what Power the batter has (except for the Deeps from Park Effects, of course).

So, my conclusion is that the R & H symbols are based more on the overall home-run rate of the pitcher, and not so much on the state of the bases when the home runs were hit.

ABL Reliever Usage Rules

Are the ABL reliever usage limits realistic? One way to try to answer this is to ask the question: “What percentage of 2008 MLB relief appearances would have violated the 2009 ABL rules?” If the answer is “0%,” then I’d say the ABL rules are too lenient in allowing relievers to pitch a lot. At the other extreme, if the answer is “50%,” then I’d say the ABL rules are too strict and don’t allow pitchers to pitch enough. What’s the right percentage that would make you feel that the ABL rules are just about right? 1%? 5%? 10%? More?

Here are the current ABL rules for reference:

Short  
IP          REST 
0-2         0** 
2.1-3       1 
3.1-4       2 
4.1-over    3   
 
Closer 
IP          REST 
0-1         0*** 
1.1-2       1 
2.1-3       2   
3.1-over    3  
 
**  Cannot pitch more than 2 consecutive games 
***  Cannot pitch more than 3 consecutive games 
Note: Short cannot pitch more than 4 IP unless no other pitchers are available.
Note: Closer cannot pitch more than 3 IP unless no other pitchers are available.
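
To check MLB appearances against these limits, the rest table first has to be turned into something a program can look up. The sketch below is one way to do that in Python; the names (`REST_TABLE`, `ip_to_outs`, `rest_days`) are mine, and it covers only the IP/REST columns, not the consecutive-game limits or the "no other pitchers available" exception.

```python
# Upper bound of each bracket in outs (3 outs per inning), paired with rest days.
REST_TABLE = {
    "short":  [(6, 0), (9, 1), (12, 2)],   # 0-2, 2.1-3, 3.1-4 IP
    "closer": [(3, 0), (6, 1), (9, 2)],    # 0-1, 1.1-2, 2.1-3 IP
}
MAX_REST = 3                                # anything over the last bracket


def ip_to_outs(ip):
    """Convert box-score innings such as '2.1' (two and a third) to outs."""
    whole, _, frac = str(ip).partition(".")
    return int(whole) * 3 + int(frac or 0)


def rest_days(role, ip):
    """Days of rest required after an outing of `ip` innings."""
    outs = ip_to_outs(ip)
    for max_outs, rest in REST_TABLE[role]:
        if outs <= max_outs:
            return rest
    return MAX_REST


print(rest_days("short", "2.1"))   # one out past 2 IP
print(rest_days("closer", "1.0"))
```

Working in outs avoids any trouble with treating "2.1" as a decimal number.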


average pitcher card

A companion to the average batter card, here is the average pitcher card. This is a bit of a hack—there’s no adjustment between starters & relievers. The pool is 107 pitchers, made up from the best of the draft, the Perfectos, and most pitchers from the teams I’ve played so far.

   vs L                vs R
---------            ---------
500 - 519     WP?    500 - 519
520 - 547  Range IF  520 - 545
548 - 573  Range OF  546 - 573
574 - 599     EF     574 - 603
600 - 616     RG     604 - 617
617 - 642     1B     618 - 645
643 - 669     EF     646 - 675
670 - 686     RG     676 - 690
687 - 704     SG     691 - 705
705 - 707     HB     706 - 711
708 - 734     1B     712 - 739
735 - 753     2B     740 - 755
754 - 776    Deep!   756 - 776
777 - 803  EF/Tired? 777 - 807
804 - 862     K      808 - 867
863 - 914   K/Tired? 868 - 920
915 - 965     BB     921 - 954
966 - 999     DP     955 - 999

Here are the range numbers:

  vs L              vs R
 ------            ------
  20.0      WP?     20.0
  28.4   Range IF   26.2
  25.7   Range OF   27.6
  26.1      EF      30.0
  17.0      RG      14.5
  26.2      1B      27.8
  26.4      EF      30.3
  17.4      RG      14.8
  17.7      SG      15.1
   3.7      HB       5.7
  26.6      1B      28.3
  18.9      2B      15.8
  23.5     Deep!    21.3
  26.7   EF/Tired?  30.7
  59.0      K       60.2
  51.8    K/Tired?  52.7
  50.9      BB      33.7
  34.4      DP      45.2

Home-field advantage and Park Effects

Following up on the previous post about home-field advantage, I got the MLB batting splits for the last five years from BR.com, which are summarized in the table below. (Here are the 2007 splits.)

hv-splits.png

The biggest effect is that the home team strikes out less. The next largest is more walks for the home team. To capture these effects, we need at least 12 rolls, which is an almost perfect fit into the 16 Park Effects rolls, something suggested by the Commish. A possible Park Effects replacement chart appears below. Since the Park Effects range is identical for every batter, we can eliminate the second roll and simply read the result from the original roll that landed us on Park Effects.

pe1.png

I worked in a couple of things that don’t otherwise tend to happen in the ABL: infield pop outs (pointed out by cnc14) and extra advancement on hits.

One could argue that the home team should have a few SF rolls to account for the extra sac-fly production.

This chart would produce a little more offense. A quick calculation shows that a Park Effects roll on this new chart would yield an average of 0.6 bases per roll, whereas the TPB chart yields only about 0.3 bases per Park Effects roll.

Effect of pitcher symbols on batter rating

I adjusted my rating method to take average pitcher symbols into account on the batters’ cards. As expected, the effect is most profound for guys with big walk & HR ranges, as can be seen for a few guys in the table below. “Before” means pitcher symbols were not taken into account; “after” means they were taken into account.

[image lost]

The higher-rated players will have bigger differences, simply because lots of homers & walks make them highly rated. Of course, nothing changed in the cards, it’s just that I overrated the batters. Assuming my current ratings are much more accurate, I overrated Burrell’s performance by 7.4%, not an insignificant amount!

Average pitcher symbols

For a calculation, I need to know the symbol content of the average pitcher. I did the starters and relievers separately, using the ABL pitchers. Say that of 53 likely ABL starters, 19 have a B symbol. That means that the average starter has 19/53 (0.36) of a B. The chart below shows the averages for all the symbols. Starters & relievers are combined in a ratio of 7:2, to estimate the pitching over an entire game.

avg-symbols.png

Starters & relievers are quite comparable in terms of the walk symbols, B & L. Relievers have a definite edge for the runners-on symbols, R & H. Not surprisingly, only relievers have Fs.
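
The arithmetic behind the averages is simple enough to sketch in Python. The 19-of-53 example and the 7:2 ratio come from the text above; the function names are mine, and the reliever figure in the usage line is a made-up placeholder, not a value from the chart.

```python
def average_symbol(count_with_symbol, pool_size):
    """Fraction of a symbol carried by the average pitcher in a pool."""
    return count_with_symbol / pool_size


def combine(starter_avg, reliever_avg, starter_weight=7, reliever_weight=2):
    """Blend starter and reliever averages over a nine-inning game (7:2)."""
    total = starter_weight + reliever_weight
    return (starter_weight * starter_avg + reliever_weight * reliever_avg) / total


starter_b = average_symbol(19, 53)       # about 0.36 of a B per starter
reliever_b = 0.40                        # hypothetical reliever value
print(round(starter_b, 2), round(combine(starter_b, reliever_b), 2))
```

For the F symbol the starter average is zero, so the game-level figure is just two ninths of the reliever average.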

quantifying error rating in terms of batting

This is a first step toward measuring defense in terms of offense. The question to answer is: how much good hitting makes up for poor fielding? Or, to look at it from the other side, to what extent does superior glove work compensate for light hitting?

There are two components to defense: error and range. Let’s consider error rating only for the moment. And let’s hope my logic is correct.

For one PA of his own, a player will be in the field for approximately nine PAs by the opponent. For one opponent PA, the chance of rolling a possible error is 6.3%, which is 60 rolls on 11-70 plus about 20% of the 16 rolls on 81-96 (Park Effects). A possible error will fall on our guy’s position 10% of the time. So, over 9 opponent PAs, the average number of possible errors for our guy will be 9 * 6.3% * 10% = 0.057, which is equivalent to 57 rolls.

Now we have to consider specific positions. Let’s look at 3B/SS, where errors occur the most, and let’s consider the most extreme difference in error ratings. On a possible error, an error-20 third baseman (Jeff Cirillo) will make an error 24% of the time, while an error-1 (Ryan Braun) will boot it 99% of the time. Let’s ignore the more complex outcomes (E(2), 1B+E(1), RG+) and just say that an error is equivalent to a single. In other words, an error made on defense neutralizes a single hit by the same guy on offense. The 57 rolls calculated above will work out to 14 rolls for a Cirillo error/single and 56 rolls for a Braun error/single, a difference of 42 rolls.

One could use these results to adjust a player’s offensive rating. 14 of Cirillo’s average 25 1B range are neutralized by his errors. Via linear weights, Braun’s 56 neutralized singles are equivalent to 18 neutralized HR rolls. (Braun has a 95 HR range vs. L, 35 vs. R.) You could also do a comparison and say that Braun would have to have 42 more 1B rolls (or 14 more HR rolls, or equivalent) than Cirillo to make him equal value, all other things (range, power, etc.) being equal.
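
The chain of arithmetic above can be written out as a short Python sketch. All the inputs are the figures quoted in this post; the constant and function names are mine, and I am assuming 1000 possible roll outcomes (card ranges 0 through 999), which is what makes 60-odd rolls come out to 6.3%.

```python
ROLLS = 1000                 # assumed: outcomes 0-999
OPP_PA_PER_OWN_PA = 9        # opponent PAs in the field per PA of his own
ERROR_ROLLS = 60             # possible-error range, 11-70
PARK_ROLLS = 16              # Park Effects range, 81-96
PARK_ERROR_SHARE = 0.20      # share of Park Effects rolls that become possible errors
POSITION_SHARE = 0.10        # chance the possible error lands on our guy


def possible_error_rolls():
    """Possible errors per own PA, expressed in rolls out of 1000."""
    chance = (ERROR_ROLLS + PARK_ERROR_SHARE * PARK_ROLLS) / ROLLS
    return OPP_PA_PER_OWN_PA * chance * POSITION_SHARE * ROLLS


def neutralized_rolls(error_probability):
    """Single rolls cancelled out by a fielder's errors."""
    return possible_error_rolls() * error_probability


cirillo = round(neutralized_rolls(0.24))   # error-20 third baseman
braun = round(neutralized_rolls(0.99))     # error-1 third baseman
print(round(possible_error_rolls()), cirillo, braun, braun - cirillo)
```

Swapping in the error probabilities for other ratings and positions gives the rest of the neutralized-roll figures.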

These “neutralized rolls” can be similarly calculated for other positions and all the error ratings. The results are plotted in the graph below. I didn’t count E(0)s (dropped fouls) as errors on the catcher.

neutral1.png

The numbers in the chart are just approximations, and each position’s accuracy is different. For example, all the catcher errors counted are of the two-base variety. Anyway, it’s a start.

It’s interesting that, with the exception of the catcher & outfielders, all the lines are nearly parallel. So the difference between an error-20 & an error-1 at a given position is pretty much the same for either 1B, 2B, SS, 3B, or P.

average batter card

Using the rating data (the best half of the free agents plus the Titusville roster), an average batter card can be calculated.

   vs L                vs R
---------           ---------
  0 -  10    Crazy    0 -  10
 11 -  70    Error   11 -  70
 71 -  80     LO     71 -  80
 81 -  96    Park    81 -  96
 97 - 140     1B     97 - 134
141 - 152     2B    135 - 144
153 - 174     DP    145 - 166
175 - 218     1B    167 - 204
219 - 221     3B    205 - 206
222 - 229     SF    207 - 215
230 - 245     HR    216 - 229
246 - 250     HB    230 - 234
251 - 278     RG    235 - 270
279 - 324     BB    271 - 302
325 - 399     K     303 - 379
400 - 412     2B    380 - 389
413 - 441     RG    390 - 426
442 - 499     EF    427 - 499

In terms of ranges, it looks like this:

  vs L             vs R
 ------           ------
   11.0    Crazy    11.0
   60.0    Error    60.0
   10.0     LO      10.0
   16.0    Park     16.0
   43.5     1B      37.9
   12.0     2B       9.7
   22.5     DP      22.2
   44.0     1B      38.4
    2.9     3B       2.3
    8.1     SF       8.3
   16.0     HR      13.9
    4.5     HB       5.2
   28.5     RG      36.1
   46.3     BB      32.2
   75.0     K       77.1
   12.4     2B      10.2
   28.9     RG      36.6
   58.3     EF      73.1

You can see why lefty pitchers can have a hard time.

Deep Engine 2

More data from the Deep Engine. All results are based on ten million trials.

Here are the results for all 30 parks from the TPB 2007 data:

     power:     5        4        3        2        1

   homerun:   48.55%   32.38%   19.18%    9.26%    3.11%
    caught:   47.55%   63.72%   76.92%   86.84%   92.99%
      foul:    3.90%    3.90%    3.90%    3.89%    3.90%

As expected, no significant changes from 2006.

I re-ran with the 12 2008 ABL parks, using the TPB 2007 data.

     power:     5        4        3        2        1

   homerun:   40.45%   24.74%   12.86%    5.28%    1.54%
    caught:   55.65%   71.36%   83.24%   90.82%   94.56%
      foul:    3.90%    3.90%    3.90%    3.89%    3.90%

Wow, there are some big parks in the 2008 ABL! It’s much harder to homer, especially for the light hitters who will find it almost twice as hard to hit them out in the ABL compared to the 30-park circuit.

Now let’s see how the numbers look for the different hitting types. Again, this is ten million trials in the 2008 ABL parks.

Rsp
     power:     5        4        3        2        1
   homerun:   39.79%   24.05%   12.36%    5.01%    1.44%
    caught:   57.61%   73.34%   85.03%   92.38%   95.95%
      foul:    2.60%    2.61%    2.61%    2.60%    2.60%


Lsp
     power:     5        4        3        2        1
   homerun:   39.96%   24.20%   12.48%    5.02%    1.46%
    caught:   57.44%   73.19%   84.91%   92.38%   95.94%
      foul:    2.60%    2.60%    2.61%    2.59%    2.60%


Rp
     power:     5        4        3        2        1
   homerun:   40.83%   25.34%   13.35%    5.72%    1.68%
    caught:   53.98%   69.47%   81.46%   89.09%   93.11%
      foul:    5.19%    5.19%    5.19%    5.19%    5.21%


Lp
     power:     5        4        3        2        1
   homerun:   41.23%   25.34%   13.21%    5.37%    1.58%
    caught:   53.55%   69.47%   81.58%   89.43%   93.23%
      foul:    5.21%    5.19%    5.21%    5.20%    5.19%

Not surprisingly, the pull hitters end up with more foul balls. In spite of that, they still end up with a greater probability of homering.

This data can be combined with the batter’s power & the average deeps to estimate the number of home runs a batter will get with Deep! rolls against an average pitcher. Actually, the home-run potential is so similar among the hitting types that it’s not worth making a distinction. So, for example, a power-5 hitter will homer on about 40% of his 18.7 deep rolls against the average pitcher, effectively giving him an additional 7.5 home-run range.

Combine this with the power distribution, and the probability of a home run on a Deep roll works out to 20.4%. That’s an important number for rating individual pitchers against the average batter.
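
As a rough Python sketch of both calculations: the home-run probabilities are the 12-park 2008 ABL figures from the table above, and 18.7 is the average deep range quoted in the text. The names are mine. The power distribution itself is not listed in this post, so `deep_hr_probability` takes it as an argument and the 20.4% figure is not reproduced here.

```python
# Probability of a homer on a Deep! roll, by power, in the 12 2008 ABL parks.
HR_ON_DEEP = {5: 0.4045, 4: 0.2474, 3: 0.1286, 2: 0.0528, 1: 0.0154}
AVG_DEEP_RANGE = 18.7        # average pitcher's deep rolls, from the text


def extra_hr_range(power, deep_range=AVG_DEEP_RANGE):
    """Home-run rolls a batter effectively gains from the pitcher's Deep! range."""
    return HR_ON_DEEP[power] * deep_range


def deep_hr_probability(power_shares):
    """Average homer chance on a Deep! roll for a mix of power ratings.

    power_shares maps power rating -> fraction of batters (must sum to 1).
    """
    return sum(share * HR_ON_DEEP[power] for power, share in power_shares.items())


print(round(extra_hr_range(5), 1))
```

The exact table value of 40.45% gives slightly more than the 7.5 quoted above, which used a rounded 40%.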

Deep Engine 1

For the ABL draft & season I’ll develop some ratings based on the card ranges. The batter’s card is pretty straightforward. The ranges can be used to directly compute things like OBP & Slugging that are independent of pitcher-card rolls. The one vital correction, however, is the power, which will determine how many HR result from Deep! rolls. So, I need to get a handle on how power affects those probabilities. Later, I can make calculations based on Deep! ranges, either averages or against particular pitchers.

I wrote a “Deep Engine” that captures the location and distance data and allows me to run some Monte Carlo simulations. For now I’m going to ignore robbed HRs, which is a pretty small effect. For the first calculation I’ll assume a random distribution of pull types (Lsp, Lp, Rsp, Rp) and use all the 2006 parks. (Later I should whittle it down to the 2007 parks in the ABL.) I ran ten million trials for each power rating and came up with the following probabilities.

     power:     5        4        3        2        1

  home run:   48.61%   32.40%   19.17%    9.27%    3.11%
    caught:   47.51%   63.71%   76.93%   86.82%   92.98%
      foul:    3.89%    3.89%    3.89%    3.91%    3.91%

So a power-5 guy has about a 50% chance of hitting it out on a Deep! The average power-1 batter is more likely to send it foul than over the fence fair.

Range Factor and range ratings

The Bill James Handbook lists Range Factor, which is the number of Successful Chances (Putouts plus Assists) times nine divided by the number of Defensive Innings Played. Does this statistic correlate with the TPB range ratings? I picked a couple of the more important defensive positions and compared the 2007 Range Factors for starters with the TPB range ratings from the 2007 TPB Statistics Book. Graphs for shortstops and center fielders are below.

Shortstops show a bit of correlation. It’s no surprise to me that Furcal & Vizquel are highly rated by both measurements. I’m surprised to see that Reyes has such a low Range Factor.

Center fielders are all over the place. Vernon Wells has a Superior TPB rating and the lowest Range Factor!

The red lines are the linear fits to the data. The graphs assume that the TPB ratings are linear, that is, that the difference between VG & SP is the same as between PR & WK. Whether or not that’s the intent, it’s clear that there’s no strong correlation between the Range Factor and the TPB range rating. That could mean that either 1) the two measurements are meant for different purposes, or 2) one or both of the measurements are inaccurate.

I don’t think #1 is likely. Surely each is trying to quantify the ability of a fielder to field balls that are hit in his general direction. Of course, measuring any kind of defensive ability is difficult. (See this discussion of various methods.) Whatever the case, it’s clear that the TPB ratings are not based on Range Factor.
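
For anyone who wants to repeat the comparison, here is a small Python sketch of the Range Factor formula quoted above, plus a plain Pearson correlation for checking it against the ratings. The function names are mine, and putting the TPB ratings on an evenly spaced numeric scale is the same linearity assumption the graphs make. The numbers in the usage line are invented for illustration.

```python
from math import sqrt


def range_factor(putouts, assists, innings):
    """Successful chances (putouts plus assists) per nine defensive innings."""
    return (putouts + assists) * 9 / innings


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical fielder: 200 putouts and 400 assists in 1200 innings.
print(range_factor(200, 400, 1200))
```

A correlation near zero between the numeric ratings and the Range Factors would back up what the scattered graphs suggest.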

bbpix

jose_cardenal_1975_toppsjpg.jpg

I found a site with lots of small baseball-card pictures. The guy uses them for display in the Strat-O-Matic software.

For a lot of the old TPB cards I can’t remember the guys or have never heard of them. I thought it would be cool to have a searchable database of these photos, so I hacked together bbpix. It’s not a database, but it works.

card conversion

I’ve got a better idea. I can tile them ten to a page, baseball-card size, and I won’t need any guide lines. 2.5+2.5+3.5=8.5

Since there are different numbers of cards on the page, need to cut them all out first (card_cutter_10up.pl), then paste them together in groups of ten (card_paster_10up.pl).

Printing from Preview, the dpi is actually about 359.1 (not 360), which is imposed by specifying the resolution when converting to PNG.

10up.png

I also tried 8-up, just like as-shipped. Worked out fine, but I still like 2.5×3.5.
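
For the record, here is how I read the 10-up geometry, sketched in Python rather than the Perl of the actual scripts. The only facts taken from above are the 2.5×3.5 card size, the 2.5+2.5+3.5=8.5 width, and the nominal 360 dpi. The arrangement (two upright columns of three cards plus one column of four cards turned sideways, on an 8.5×11 page) is my inference from those numbers, and the function names are hypothetical.

```python
DPI = 360                       # nominal resolution
CARD_W, CARD_H = 2.5, 3.5       # card size in inches


def ten_up_layout():
    """Return (x, y, width, height) in inches for ten cards on one page."""
    boxes = []
    for col in range(2):                        # two upright columns, 3 cards each
        for row in range(3):
            boxes.append((col * CARD_W, row * CARD_H, CARD_W, CARD_H))
    x = 2 * CARD_W                              # third column starts at 5.0 in
    for row in range(4):                        # 4 cards turned sideways
        boxes.append((x, row * CARD_W, CARD_H, CARD_W))
    return boxes


def to_pixels(box, dpi=DPI):
    """Convert an inch-based box to whole pixels."""
    return tuple(round(v * dpi) for v in box)


for box in ten_up_layout():
    print(to_pixels(box))
```

The upright columns use 10.5 inches of page height and the sideways column uses 10, so everything fits with the cards butted edge to edge and no guide lines.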

To do:

  • Deal with odd numbers of cards at the end.
  • Write out as TIFFs instead of PNGs (pnmtotiff -lzw), cat them (tiffcp), & convert to PDF (tiff2pdf -z). Will need to specify resolution at some point.