Wednesday, May 30, 2012

Tugging On Superman's Cape


Considering the number of hitting statistics, it is curious how few focus on runs. Runs, after all, are what decide ball games. It's not hits or home runs or stolen bases, but runs. The best-known statistics that focus on runs are Runs Batted In and Runs (Scored). Many believe that these two statistics describe an individual batter's performance poorly, since both are so heavily affected by the quality of the teammates batting around him in the lineup. To some extent, I agree.

Bill James created a stat called "Runs Created" in an attempt to measure the contribution of a hitter toward the scoring of runs. In its basic form, the formula is RC = (H + BB)(TB)/(AB + BB), where H is hits, BB is walks, TB is total bases, and AB is at-bats. I'll be honest. I'm not a fan of Runs Created.
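For anyone who wants to plug in numbers, here is a minimal Python sketch of that formula. The function name, the argument names, and the stat line at the bottom are my own inventions for illustration; they are not real player data.

```python
def runs_created(hits, walks, total_bases, at_bats):
    """Basic Runs Created: RC = (H + BB)(TB) / (AB + BB)."""
    opportunities = at_bats + walks  # the (AB + BB) denominator
    if opportunities == 0:
        # No at-bats and no walks: treat as zero runs created
        # (an assumption of this sketch, to avoid dividing by zero).
        return 0.0
    return (hits + walks) * total_bases / opportunities


# Made-up stat line, for illustration only.
rc = runs_created(hits=150, walks=50, total_bases=250, at_bats=500)
print(round(rc, 1))  # 90.9
```

The same function works for a single hitter or for a team's combined totals, since the formula only needs the four counts.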

My problem here is that there seems to be no logical reason for bringing these variables together in this formula. It's as though James has simply thrown together several statistical measures through some combination of mathematical operations, and out spits a result that "seems" to work. My suspicion is that there is little connection between Runs Created and actual runs scored. I checked Runs Created against the results of the games from 5/16/12 (the previous day's games when I started writing this) and found a correlation coefficient of only 0.739 between Runs Created and actual Runs Scored*. I'll admit that one day of games is not nearly enough to prove my suspicion, but I'd bet good money I'm on the right track. And I think this is the perfect time for me to play the I'm-A-15-Year-College-Math-Instructor-And-You're-Just-Gonna-Have-To-Trust-Me-On-This-One card, so I'm calling Runs Created a swing and a miss.

Keeping in mind the quest for the "Holy Grail" hitting statistic and keeping the focus on the scoring of runs, I'd like to end this post by asking two simple but thought-provoking questions. How are runs scored, and how can this process be measured?

Stay thirsty, my friends.

* - For each of the 30 teams, I used the team's combined statistics to calculate its Runs Created and compared that to the team's actual Runs Scored using a linear regression test, which gave an r-value of 0.739.
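If you'd like to run this kind of check yourself, here is a rough Python sketch of the r-value calculation (the Pearson correlation coefficient). The function name `pearson_r` is my own, and the two short lists at the bottom are placeholders rather than the actual team numbers from 5/16/12; you would swap in one Runs Created value and one Runs Scored value per team.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length, at least 2 values each")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one list has no variation")
    return sxy / math.sqrt(sxx * syy)


# Placeholder values only, NOT the real data from the post.
team_runs_created = [3.0, 5.0, 7.0, 9.0]
team_runs_scored = [2, 4, 6, 8]
r = pearson_r(team_runs_created, team_runs_scored)
print(round(r, 3))
```

One thing worth keeping in mind when reading an r-value: squaring it gives the share of the variation in one list that a straight-line fit on the other accounts for, so r = 0.739 works out to roughly 0.546.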
