Monday, December 30, 2013


Most baseball fans my age have an addiction to statistics:  batting average, home runs, runs batted in and stolen bases for hitters; wins, ERA and strikeouts for pitchers.  Over time, we've learned that on-base percentage is more important than batting average, that wins are more a product of the offense supporting the pitcher, and that there are much better ways to evaluate a player's value to a team.

By the late 1980s, a revolution of sorts was under way, led by Bill James.  Admittedly, he had started this work years earlier, but it took some time to catch on.  I don't know anyone who thought it was a bad idea, but not everyone accepted it:  after all, Bob Welch won the Cy Young in 1990, going 27-6 with a 2.95 ERA.  A great year, but Roger Clemens went 21-6 with an ERA a full run lower, and Welch's teammate Dave Stewart had a 2.56 ERA in nearly thirty more innings.  Voters were in love with the 27 wins.

Since then, we've come even further - but that wasn't enough.  Fans/statisticians/sabermetricians/geeks wanted a way to compare players who played different positions, different years, different eras...and so WAR (Wins Above Replacement) was developed.  Unfortunately, while many pay lip service to it not being the "end-all" stat, they continually use it and refer to it as a way of proving their point.

So - what is WAR, and why is it NOT as good as pundits suggest?

As Baseball-Reference states:
"The idea behind the WAR framework is that we want to know how much better a player is than what a team would typically have to replace that player. We start by comparing the player to average in a variety of venues and then compare our theoretical replacement player to the average player and add the two results together.
There is no one way to determine WAR. There are hundreds of steps to make this calculation, and dozens of places where reasonable people can disagree on the best way to implement a particular part of the framework. We have taken the utmost care and study at each step in the process, and believe all of our choices are well reasoned and defensible. But WAR is necessarily an approximation and will never be as precise or accurate as one would like."
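
The "add the two results together" step in that description can be sketched roughly in Python.  This is my own simplification, not Baseball-Reference's actual calculation:  the function name, the two inputs and the 10-runs-per-win conversion are all assumptions I've made for illustration.

```python
def war_sketch(runs_above_average, replacement_runs, runs_per_win=10.0):
    """Rough illustration of the additive WAR framework.

    runs_above_average: how many runs better (or worse) than an average
        player this player was, summed over hitting, fielding, etc.
    replacement_runs: how many runs an average player is worth over a
        replacement player in the same playing time.
    runs_per_win: assumed conversion from runs to wins (a rule of thumb,
        not an exact figure).
    """
    runs_above_replacement = runs_above_average + replacement_runs
    return runs_above_replacement / runs_per_win


if __name__ == "__main__":
    # Hypothetical player: 15 runs better than average, with 20 runs of
    # credit for the gap between average and replacement.
    print(war_sketch(15.0, 20.0))
```

Every input to that little function is itself an estimate, which is where the trouble starts.
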
So there are a number of ways to determine WAR.  What do they include, and why are they questionable?

  1. WAR establishes a base level, called "replacement level".  This represents what a AAA minor leaguer would be expected to accomplish at the major league level.  It usually assumes that a team of minor leaguers would have a 52-110 season.  Why?  It is an arbitrary choice - one of many within WAR that don't have a factual origin.  Again, it's not exact - Baseball-Reference uses 48-114 as its base.
  2. Ballpark effects are used to balance a hitter's performance between various ballparks.  We can accept that Petco Park is a pitcher's park and Coors Field is a hitter's park, but to what degree?  Again, the factors assigned to each ballpark are estimates, NOT exact numbers.
  3. Fielding statistics have come a long way since fielders were graded merely on putouts, assists and errors.  For instance, we now account for the fact that a shortstop on a team of ground ball pitchers will get more opportunities than a SS on a fly ball staff.  BUT, fielding statistics are still incomplete.  UZR (Ultimate Zone Rating) and TZR (Total Zone Rating) are two of the more accepted methods of evaluating fielding, but, as the description itself says, "defense is best judged over three-year spans, as a given year contains a relatively small sample and can result in large statistical swings."  In addition, if a fielder positions himself differently than UZR anticipates (such as in a shift), it will skew the numbers.
  4. So what is the value of defense vs. offense for a player?  No one has been able to establish the ratio of value - how much value does a no-hit, great-fielding shortstop bring in comparison to a Derek Jeter, for instance, who in his career has been viewed as a great-hitting, poor-fielding shortstop?  How much should defense count in a player's value?  Here too an arbitrary ratio is used.
  5. In addition, values are assigned to each position in the field. Again, from Baseball-Reference:

      1. C: +10 runs
      2. SS: +7.5 runs
      3. 2B: +3 runs
      4. CF: +2.5 runs
      5. 3B: +2 runs
      6. RF: -7.5 runs
      7. LF: -7.5 runs
      8. 1B: -10 runs
      9. DH: -15 runs
     These constants are included in the formula to give credit to the positions that are more important defensively...or more difficult to play.  For catchers, this does NOT include the ability to "frame pitches", which has significant value and may be included in future WAR formulas, nor does it include the ability to call a game.
  6. For pitchers, the formula is more complex, with just as many arbitrary variables.
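
To put numbers on the first and fifth items, here's a rough Python sketch.  The records and the positional values are the ones quoted above; everything else is my own assumption for illustration, in particular the idea that the positional values apply to a full season of about 1,350 innings and get prorated by innings played (including, crudely, for the DH).

```python
# Positional values in runs, as listed above.
POSITION_RUNS = {
    "C": 10.0, "SS": 7.5, "2B": 3.0, "CF": 2.5, "3B": 2.0,
    "RF": -7.5, "LF": -7.5, "1B": -10.0, "DH": -15.0,
}

# Assumption: the values above correspond to roughly this many innings.
FULL_SEASON_INNINGS = 1350.0


def win_pct(wins, losses):
    """Winning percentage implied by a won-lost record."""
    return wins / (wins + losses)


def positional_adjustment(innings_by_position, full_season=FULL_SEASON_INNINGS):
    """Prorate the positional constants by innings played at each position."""
    return sum(
        POSITION_RUNS[pos] * innings / full_season
        for pos, innings in innings_by_position.items()
    )


if __name__ == "__main__":
    # The two replacement-level records mentioned in item 1.
    for wins, losses in [(52, 110), (48, 114)]:
        print(f"{wins}-{losses}: {win_pct(wins, losses):.3f}")

    # Hypothetical player splitting a season evenly between SS and 3B.
    print(positional_adjustment({"SS": 675, "3B": 675}))
```

The two replacement-level records work out to winning percentages of about .321 and .296.  That is a four-win difference in the baseline before a single player has been evaluated.
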

So what's my point?

  1. With so many artificial constants placed in the equation, and variables that aren't fully understood, it is hard to accept WAR as the end-all formula for comparing player to player.  It is useful for comparing players...but it should not be relied on as the final say.
  2. With our ability to adjust statistics over different years and ballparks, ERA+, OPS+ and others do as good a job of evaluating players.  What those statistics do NOT do is allow the overall comparison between different positions, which is why a formula like WAR is used.
  3. I view WAR as similar to physicists' Unified Theory.  There MUST be a way of uniting all of these other formulas into a nice, neat single method.  Physicists have yet to find the Unified Theory, though much progress has been made.  In that sense, I think baseball is exactly like physics. 
