It gives some great insight into some of the more advanced stats, and I hope you enjoy it as much as I did.
In today’s sports world, statistics play a gigantic role in how a player is evaluated by coaches and fans. This is prevalent in football, where a quarterback is judged by his passer rating and a running back by his yards per carry. In basketball, athletes are assessed on their points, rebounds and assists per game to decide whether a player is bad, good or great. Every sport counts a player’s contributions and then uses those counts to try to quantify what he adds to the game. Baseball provides even more opportunity for this kind of evaluation because so many different things are counted in every game.
These counts made baseball a prime candidate for statistical evaluation. Somebody clearly noticed, because baseball statistics have been around since the sport’s beginning in the 1800s and, with very few exceptions, remained pretty much the same until the late 1970s. The first attempt to evolve them came when a man named Bill James stopped and analyzed the sport; he realized that the basic statistics in use were insufficient to truly evaluate the players. James then developed a new approach to statistics, called sabermetrics, which he felt was better constructed to evaluate players. This idea has helped the game evolve because players can be better evaluated using these new techniques. For the longest time, the only real problem with sabermetrics was that many general managers did not put much faith in the new statistics. This remains partially true today, but more people accept sabermetrics because a man named Billy Beane decided to implement some of the ideas, which led to the story told in Moneyball. Whether or not they are accepted, the real benefit of the statistics is that they let a general fan form an expectation for a team or a player in any given situation.
The problem with batting average is that a player’s impact is much wider than just the at bats that end with “excitement”; the true draw of baseball is a team winning, not how one specific player does in a certain situation. Therefore, a player’s impact over the entire game matters more to fans and teams than how visibly he affects it. A player’s value is best shown by how often he gets on base and by what kind of play happens after the ball is hit. Batting average has the benefit of being easy to calculate, but it misses much of the game if judgments rest solely on this basic calculation.
One statistic that many baseball insiders have quit using, but that played a gigantic role in the baseball world before the evolution of sabermetrics, is runs batted in (RBI). This is a basic counting stat that records how many runners score because of a play the batter made. The statistic is useful because it shows how often a player actually adds a run to the scoreboard, which is obviously necessary to win a game. Counting the runs driven in therefore makes sense, and the statistic could be said to measure a player’s clutch ability.
Of course, there is more to clutch than just driving in runs, and this stat is as flawed as any other counting statistic because all it does is count what happens, with no adjustment for context. The most obvious flaw in the RBI statistic is that RBI opportunities depend on position in the lineup as much as on anything the player does. As the book “Baseball Between the Numbers: Why Everything You Know is Wrong” points out, a player in the leadoff role gets fewer chances to drive in runs and is therefore at a situational disadvantage (The Baseball Prospectus Team). Also, clutch can rationally be seen as not just driving in runs but also getting on base in a tough situation. This implies that RBIs are only part of being clutch; counting any one thing can never truly capture how good a player is, and this is definitely true for baseball.
The most common statistic used today in baseball to describe a pitcher is earned run average, or ERA. ERA is an easy number to figure, based on how many earned runs (ER) a pitcher allows and how many innings, or parts of innings, he has pitched (IP). The two statistics are combined very simply as

$$\text{ERA} = 9 \times \frac{\text{ER}}{\text{IP}}$$

The first statisticians inserted the multiplication by 9 to put every pitcher’s ERA on the same nine-inning scale. It also helps a manager or fan see how many runs a pitcher would be expected to give up if he were allowed to pitch a whole game. The statistic is fairly reliable in that it often repeats from year to year, especially when the pitcher remains with the same team for multiple years and the team does not change the dimensions of its home ballpark. It also seems to be a pretty accurate judgment of a pitcher’s ability with respect to how his team plays.
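The ERA calculation described above can be sketched in a few lines of Python. This is only an illustration: the function name and the sample season line are made up, and innings are passed in as outs recorded so that partial innings are handled as thirds.

```python
def earned_run_average(earned_runs: int, outs_recorded: int) -> float:
    """Return ERA = 9 * ER / IP, where IP is outs recorded divided by 3."""
    if outs_recorded <= 0:
        raise ValueError("ERA is undefined until the pitcher records an out")
    innings_pitched = outs_recorded / 3  # partial innings count as thirds
    return 9 * earned_runs / innings_pitched


# Hypothetical season line (not real data): 70 earned runs in 210 innings.
print(f"ERA: {earned_run_average(70, 210 * 3):.2f}")
```

Passing outs instead of a decimal innings figure avoids the common trap of reading a box-score "6.1" innings as six and one tenth rather than six and one third.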
That being said, this statistic is really only good at judging how a pitcher performs with a certain defense behind him and in the dimensions of his own ballpark. Therefore, it is impossible to fairly compare pitchers when only ER and IP are taken into account. On top of that, IP is not fully under a pitcher’s control because the manager chooses when a pitcher comes into or out of a game. If a pitcher leaves the game with runners on base, he is accountable for their runs if they happen to score, assuming they did not reach base on an error. So if a bullpen pitcher comes in and allows such a runner to score, the starting pitcher gets the blame for the run even though he was not on the field when the runner crossed home plate.
Pitchers are probably the most scrutinized players on the diamond because the way they throw the ball changes the entire game. That is why the statistic of strikeouts per nine innings (K/9) was created: a strikeout keeps a batter off the base paths, and keeping runners off the bases makes it harder for the other team to score. K/9 is a great statistic, especially for judging relievers who come into the game in tough situations. It allows a manager to see who can get strikeouts consistently and then bring that pitcher in to face batters when a hit could cost the team the game.
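Strikeouts per nine innings follows the same nine-inning scaling as ERA. A minimal sketch, with a hypothetical function name and an invented reliever line:

```python
def strikeouts_per_nine(strikeouts: int, outs_recorded: int) -> float:
    """Return K/9 = 9 * K / IP, where IP is outs recorded divided by 3."""
    if outs_recorded <= 0:
        raise ValueError("K/9 is undefined until the pitcher records an out")
    innings_pitched = outs_recorded / 3
    return 9 * strikeouts / innings_pitched


# Hypothetical reliever (not real data): 80 strikeouts in 60 innings.
print(f"K/9: {strikeouts_per_nine(80, 60 * 3):.1f}")
```

Because relievers pitch few innings, the rate can swing a lot on a handful of outings, so it is worth reading alongside the innings total.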
As with every basic statistic, there are faults tied to K/9. The most obvious one is that baseball is a sport very susceptible to injury and pitchers seem to get injured quite often; up to 50% of the pitchers in the league end up on the disabled list (DL), according to Dr. Conte, the head doctor for the Los Angeles Dodgers (a major league baseball team), in an ESPN story. Injuries obviously detract from a player’s ability on the mound or in the field. Since so many pitchers end up hurt, it is possible to argue that injuries can easily cause fluctuation in who leads the league in K/9, depending on how good or bad each player’s luck has been with the injury bug. Of course, there are players with what is called “electric” stuff who generate a lot of strikeouts, and the number does appear somewhat consistent from year to year, but there are also players who have “beginner’s luck” in the league and players whose outstanding years turn out to be “one-hit wonders.” This is why basing a choice on one counting statistic is faulty and leads to more trouble if teams become overly reliant on it.
The big problem with these prior statistics is that a player is judged only by the obvious characteristics, like base hits for batters or runs allowed and strikeouts for pitchers. Relying on these meager counts leaves out much of what a baseball player actually contributes to the game, so the statistics drastically underperform their duty of estimating how a player performs. Deeper and more complete statistics are therefore necessary to really calculate what any player brings to the ball game through defense, pitching or batting. This is why someone, namely Bill James, led a statistical revolution into what is now known as sabermetrics. The word comes from two things: the first part is taken from SABR, the Society for American Baseball Research, an organization dedicated to baseball statistics, and “metrics” comes from econometrics, the application of math to describe the economic system.
This name was given to these statistics because of the group doing the research and because of the evolutionary idea they were trying to develop. Sabermetrics has done a lot to revolutionize baseball and how a player’s production is observed because its statistics usually take multiple counts into account and include all of them in the analysis of a player’s true ability. This evolution in the way baseball is analyzed, observed and even played has truly changed the sport. Through study, people have realized that the game is much more than just the “basic” statistics and that many of those statistics are not consistent throughout a player’s career. Many statistics have been created along the way, but some of the main ones that have truly changed the game are on-base percentage, slugging percentage, BABIP, WHIP and quality start.
The most common sabermetric statistic in baseball is on-base percentage (OBP), which is an adaptation of the batting average (BA) statistic. It takes hits into account and adds in walks and other counts that BA ignores because they were not thought to be flashy or important enough to include in the analysis. The equation for OBP is simply

$$\text{OBP} = \frac{\text{H} + \text{BB} + \text{HBP}}{\text{AB} + \text{BB} + \text{HBP} + \text{SF}}$$

where H is hits, BB is walks, HBP is times hit by a pitch, AB is at bats and SF is sacrifice flies. This takes into account all the ways that a player is able to get on base, which is obviously necessary for a team to score. This is why OBP plays a gigantic role in judging how good a batter really is; it shows how often a player gets on base and gives his team an opportunity to score. A better way to think of it comes out of “Baseball Between the Numbers” by the Baseball Prospectus team: OBP is really a measurement of the clock in the game, because baseball has no true timetable for how long a game will take. Instead, baseball is measured in outs, and a player who makes an out has literally cost his team one twenty-seventh of its game, since each team gets 27 outs. Therefore, OBP really tells how likely a player is to bring the game closer to its end, because it tells how often that player avoids making an out when he comes to the plate.
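To make the difference between BA and OBP concrete, here is a rough Python sketch. It assumes the standard OBP form, (H + BB + HBP) / (AB + BB + HBP + SF), which the text describes only loosely; the function names and both stat lines are invented for illustration.

```python
def batting_average(hits: int, at_bats: int) -> float:
    """BA = H / AB."""
    if at_bats <= 0:
        raise ValueError("at_bats must be positive")
    return hits / at_bats


def on_base_percentage(hits: int, walks: int, hit_by_pitch: int,
                       at_bats: int, sacrifice_flies: int) -> float:
    """OBP = (H + BB + HBP) / (AB + BB + HBP + SF), the assumed standard form."""
    times_on_base = hits + walks + hit_by_pitch
    chances = at_bats + walks + hit_by_pitch + sacrifice_flies
    if chances <= 0:
        raise ValueError("the player needs at least one counted plate appearance")
    return times_on_base / chances


# Two hypothetical hitters with the same BA but different patience.
free_swinger = on_base_percentage(150, 30, 2, 500, 5)
patient_hitter = on_base_percentage(150, 90, 5, 500, 5)
print(f"BA for both: {batting_average(150, 500):.3f}")
print(f"OBP: {free_swinger:.3f} vs {patient_hitter:.3f}")
```

Both hitters look identical by batting average, yet the patient one makes far fewer outs per trip to the plate, which is the point of the "clock" argument.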
OBP is a much better statistic than, say, BA because BA can be affected by injury much more easily than OBP can. The reasoning is that an injury takes away some of a player’s strength and physical ability, which he needs to consistently get the hits that boost his BA. OBP, on the other hand, can remain high because it is also affected by walks and other ways of getting on base, which a player can still achieve because his decision making is not impaired by a physical injury. Therefore, OBP is much more likely than BA to truly represent what a player is able to do at the plate.
That being said, injuries obviously still take away hits, which count toward OBP, but OBP can counteract this problem because it is based as much on mental decisions, which are not impaired by injuries, as on physical ability. It could be argued that pitchers would be more likely to pitch to an injured player and try to get him to hit the ball, but few pitchers have the control to do that consistently. Also, injured players only really lose speed and strength, so chances are they will still be able to fight off pitches and make long battles out of their at bats. This gives patient hitters an advantage over non-patient ones because good decision makers can get on base at a good rate. The other obvious problem with OBP is the same one that occurs with BA: a player with just a couple of plate appearances can have a high OBP if he happens to get on base most or all of the times he comes up to bat.
OBP does a great job of foretelling how consistently a player will perform, but it does not tell a perfect story by itself. That is why statisticians went back to the original offensive statistic, BA, and improved it to take a better look at a player’s power. The number they created is called slugging percentage and is represented by the term SLUG. The equation weights each kind of hit by the number of bases that hit gives the batter. For example, a home run (HR) is multiplied by four to account for the four bases the batter receives. The whole calculation is simplified with a count called total bases (TB), which absorbs the multiplication and turns the equation into one that looks very similar to BA:

$$\text{SLUG} = \frac{\text{TB}}{\text{AB}}$$
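The weighting of hits into total bases, and then into slugging percentage, can be sketched as follows. The function names and the stat line are hypothetical; the weights are the one to four bases per hit described above.

```python
def total_bases(singles: int, doubles: int, triples: int, home_runs: int) -> int:
    """Weight each hit by the number of bases it is worth."""
    return singles + 2 * doubles + 3 * triples + 4 * home_runs


def slugging_percentage(singles: int, doubles: int, triples: int,
                        home_runs: int, at_bats: int) -> float:
    """SLUG = TB / AB."""
    if at_bats <= 0:
        raise ValueError("at_bats must be positive")
    return total_bases(singles, doubles, triples, home_runs) / at_bats


# Hypothetical hitter (not real data): 100 singles, 30 doubles,
# 5 triples and 25 home runs in 500 at bats.
print(total_bases(100, 30, 5, 25))
print(f"{slugging_percentage(100, 30, 5, 25, 500):.3f}")
```

Note that despite the name, the result is bases per at bat rather than a true percentage, so it can exceed 1.000 in a small sample.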
In the sports world, there is only so much a statistician can do to account for the role of chance in a player’s results. There have been attempts, though, including the statistic of batting average on balls in play, or BABIP. BABIP is a modified BA equation that looks like:

$$\text{BABIP} = \frac{\text{H} - \text{HR}}{\text{AB} - \text{K} - \text{HR} + \text{SF}}$$

It starts from BA and removes everything that does not actually land in play: home runs (HR) are subtracted from the hits, and the at bats are adjusted by subtracting HR and strikeouts (K) and adding in sacrifice flies (SF), so as to account for every ball that is playable in the field. The result records how often those balls actually land for hits, which allows a keen look into a player’s abilities. It lets fans and mathematicians see how much luck has been involved in base hits, either against a pitcher or when a batter puts a ball into play.
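The adjustments described above (remove home runs from hits, remove home runs and strikeouts from at bats, add sacrifice flies) translate directly into code. A minimal sketch with a hypothetical function name and an invented stat line:

```python
def babip(hits: int, home_runs: int, at_bats: int,
          strikeouts: int, sacrifice_flies: int) -> float:
    """BABIP = (H - HR) / (AB - K - HR + SF)."""
    balls_in_play = at_bats - strikeouts - home_runs + sacrifice_flies
    if balls_in_play <= 0:
        raise ValueError("no balls were put in play")
    hits_in_play = hits - home_runs
    return hits_in_play / balls_in_play


# Hypothetical hitter (not real data).
value = babip(hits=150, home_runs=25, at_bats=500, strikeouts=100, sacrifice_flies=5)
print(f"BABIP: {value:.3f}")
```

The same function works for a pitcher if the inputs are the totals he has allowed rather than the totals a batter has produced.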
For batters, BABIP is useful for judging whether a BA is consistent with how a player has actually performed throughout the season. BABIP is one of those statistics that tends to follow a law of averages: about .300 of all balls in play land for hits. This means that if a player has an unexpectedly high BABIP, then his performance is bound to fall “back to earth,” as the saying goes. This is not to say that BABIP is perfect for judging luck, because some players tend to “find the holes” more often than others, meaning they seem able to get base hits at an above-average rate. That being said, for most players BABIP does a great job of predicting whether a player will continue his success or fall back to earth, depending on the difference between his number and the league average.
The other benefit of this statistic is that it applies to the mound as well as to home plate. Every pitcher will give up a hit at some point during a game, but there are pitchers who always seem to get the ball hit into an out. With BABIP, a pitcher’s performance can be analyzed to see how often balls in play against him actually land for hits. This is useful for seeing how much of a pitcher’s results can be credited to him. BABIP also allows a judgment call on whether a pitcher’s current numbers will continue or whether the law of averages will bring them up or down, based on where his luck has fallen so far in the year. There are always exceptions, and pitchers like Justin Verlander who have consistently low BABIPs might imply that a pitcher can have some control, but few pitchers show this consistent behavior. Even with these exceptions, most players fall in line with the league average, which would imply that BABIP is useful for predicting how a player will do.
BABIP by itself is not all that useful for evaluating a player’s ability without the aid of other statistics. Its main use is to assess whether a player’s other numbers are an accurate evaluation of who the player is on the field. Another important use of BABIP is to help evaluate a team’s defense and see how much it helps a pitcher. This is especially useful when a pitcher’s groundball rate or flyball rate is known, because the defense’s performance behind that pitcher can then be analyzed. This could give an owner some insight into the strengths of the defense, which could allow the general manager or owner to target certain kinds of pitchers for the team or to use one of the many flawed defensive statistics to try to find defenders that match the strength of their offense.
WHIP (walks plus hits per inning pitched), though, has the same problem as any statistic based on hits: a hit depends as much on the luck of how the defense is positioned as on a player’s skill, so the hits against a pitcher can be inflated if batters are having a good year against him and getting a lot of balls to fall in safely. The real mitigating factor for WHIP is that it also takes into account the walks a pitcher gives up and can therefore show how well the pitcher himself performs. WHIP gives a good picture of what a pitcher does and is a good indicator of how well he is doing in a given year. Look at a pitcher like Tim Lincecum:
For a long time, the number of games won by a pitcher dominated how he was judged throughout the year. This faulty ideal was one of the key pieces in deciding who won the award for best pitcher, the Cy Young Award. The problem with the idea is that a game often has multiple pitchers enter it even when the starter is doing well. Take the National League, for example: pitchers are routinely pulled late in close games, even if they are pitching well, because the team needs a hitter to bat in the pitcher’s spot in the lineup. Of course, there are other reasons a pitcher is pulled early even when he is pitching well; mainly, either he is a young pitcher with an innings limit, like Stephen Strasburg this year, or the manager wants a certain matchup, such as lefty vs. lefty or a pitcher against whom a certain batter has had a bad career. These situations can put the pitcher in a bad spot: if he is pulled with the lead, the bullpen can give it away, and if his team has been failing to score runs, he can be on the hook for a loss even if he pitched well. This is why a new statistic, built from counting certain things during the game, was created: the Quality Start (QS).
Quality Start is a very basic statistic in which certain parts of a pitcher’s game are counted to decide whether the start was a QS. The two parts are the number of innings pitched and the number of runs allowed; these are counted because a pitcher who goes deep into a game and allows few runs gives his team a better chance to win, even if he is not winning when he leaves. A Quality Start says that a pitcher did his part to try to win the game, which is all he can do. This statistic levels the field of wins and has started to replace the “wins” statistic as the dominant number showing how a pitcher is doing through the year. Of course, there are limitations: it obviously applies only to starting pitchers, which means it gives no value to relief pitchers (RP). Relievers therefore have no equivalent number for comparison with starting pitchers, which is why QS is not the only number used to evaluate pitchers.
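The text does not give the cutoffs for a Quality Start, so the sketch below assumes the conventional definition of at least six innings pitched with no more than three earned runs allowed. Those two thresholds, the function names and the sample starts are all assumptions made for illustration.

```python
from typing import Iterable, Tuple

MIN_OUTS = 18        # six full innings; assumed threshold, not stated in the text
MAX_EARNED_RUNS = 3  # assumed threshold, not stated in the text


def is_quality_start(outs_recorded: int, earned_runs: int) -> bool:
    """A start qualifies if the pitcher went deep enough and allowed few runs."""
    return outs_recorded >= MIN_OUTS and earned_runs <= MAX_EARNED_RUNS


def quality_start_rate(starts: Iterable[Tuple[int, int]]) -> float:
    """Share of starts that qualify; each start is (outs_recorded, earned_runs)."""
    starts = list(starts)
    if not starts:
        raise ValueError("at least one start is required")
    return sum(is_quality_start(outs, runs) for outs, runs in starts) / len(starts)


# Hypothetical game log (not real data): (outs recorded, earned runs).
game_log = [(21, 2), (15, 1), (18, 3), (18, 4)]
print(f"QS rate: {quality_start_rate(game_log):.3f}")
```

The second start shows the limitation of a hard cutoff: five strong innings with one run allowed does not count, while six innings with three runs does.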
One of the biggest problems with the way baseball is statistically analyzed is that it has no really good stat that accounts for a player’s defensive efficiency. That is not to say there have been no attempts to create one, but it is hard when the only defensive counting stat is based on a person’s opinion. This count is called the error, and it is charged when a player is judged to have made a mistake that let a batter reach base or a runner advance on a play the “scorer” defines as his fault. This would work to describe a player if an opinion did not shape whether a play is an error; the scorer is allowed to judge a mistake as not an error if he thinks the runner would have beaten the play or that the play was out of the fielder’s control, as when a pop fly is ruled lost in the sun. One could argue that any miscue by a defender is a negative for the defense, which is why errors seem flawed as a way to rate a defender. That becomes an even bigger problem when players are rewarded with an actual award, the Gold Glove, and all the voters really want to see is the highlight plays. They do take into account fielding percentage, the number of plays a player made divided by the number of plays he had an opportunity to make, but they care more about the showiness of the player and how outstanding his plays look. This means that defense is not being completely rewarded when the one defensive award is given out. The basic statistics have failed to find an adequate way to account for literally half the game of baseball.
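Fielding percentage, as described above, is plays made divided by opportunities. The sketch below assumes the usual bookkeeping in which plays made are putouts plus assists and opportunities add errors to that total; the function name and numbers are invented.

```python
def fielding_percentage(putouts: int, assists: int, errors: int) -> float:
    """Plays made (PO + A) divided by total chances (PO + A + E); assumed form."""
    total_chances = putouts + assists + errors
    if total_chances <= 0:
        raise ValueError("the fielder had no chances")
    return (putouts + assists) / total_chances


# Hypothetical infielder (not real data).
print(f"{fielding_percentage(putouts=250, assists=400, errors=10):.3f}")
```

Because the only negative input is the scorer's error count, a fielder who never reaches a difficult ball is never penalized for it, which is the weakness the paragraph points out.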
The sabermetric approach is to take errors and any other countable things from the defense and build a statistic that truly evaluates the defensive performance of any one player. Sabermetricians have tried multiple times to build such numbers from different combinations of these counts in order to judge a player’s ability in the field. They are usually based on how many outs a player makes and how many assists he makes, and they usually include the dreaded statistic of errors because it is the only way to show a player’s mistakes. Some of the more widely used statistics are range factor, defensive efficiency and fielding runs above replacement.
Range factor is designed specifically to measure how much ground a player can cover on defense. The equation is based entirely on two countable statistics that the defense is accountable for: putouts (PO) and assists (A). It adds the two together and divides the sum by the player’s playing time, measured in games or innings.
Judging on this stat does seem to show how good a player is, but there is one gigantic flaw in it. The stat is based on how many chances a player gets, which means that a player is bound to look better if his pitchers give him many opportunities to record an out. Of course, players who look good by this measure are bound to be pretty decent, because rarely is every opportunity a player receives actually routine. The numbers are therefore bound to be higher for better players anyway, but it is obviously possible for a player to look good simply because many balls are hit close to him; he would easily collect the putouts and assists needed to raise this figure to a very respectable level. The big problems with this statistic, then, are that it cannot be compared across positions and that it is easily fooled if a player has a certain pitcher to feed him outs.
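The flaw is easy to see with numbers. The sketch below assumes the per-game form of range factor, (PO + A) / G; a per-nine-innings form also exists, and the text does not say which one is meant. The function name and both stat lines are invented.

```python
def range_factor_per_game(putouts: int, assists: int, games: int) -> float:
    """Range factor in its per-game form: (PO + A) / G (assumed form)."""
    if games <= 0:
        raise ValueError("games must be positive")
    return (putouts + assists) / games


# Two hypothetical shortstops (invented numbers) over 150 games each.
# The second plays behind a groundball-heavy staff and simply sees more chances.
neutral_staff = range_factor_per_game(putouts=230, assists=400, games=150)
groundball_staff = range_factor_per_game(putouts=250, assists=470, games=150)
print(f"{neutral_staff:.2f} vs {groundball_staff:.2f}")
```

Nothing in the calculation says whether the second shortstop covered more ground or was simply handed more balls, so the higher figure cannot be credited to the fielder alone.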
A team’s defense is usually rated by fielding percentage, which is based on the team’s error rate, but Defensive Efficiency (Def Eff) takes a whole different approach. It takes BABIP and transforms it into a rating of how the team’s defense played by computing 1 - BABIP (baseballprospectus.com). This shows how well the defense does against balls that are hit into the field, and it can show just how good a defense is, but only if it is looked at over multiple years. As discussed in the BABIP section, pitchers seem to have bad luck and good luck at random, so a team can look better or worse depending on how batters happen to hit. The good thing about BABIP is that it does take into account factors such as pitching, which has a lot to do with how a ball is hit; it is known that a ball put in a certain spot is hard to hit high. Therefore, the one good part about this statistic is that it can show a total defense as well as anything, even if luck is a big part of a team’s BABIP. Also, if it is looked at consistently through a couple of years, it is possible to see how efficient the team is and analyze how well it is constructed overall.
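Since Def Eff is just 1 - BABIP computed on what a team allows, a short sketch can reuse the BABIP form (H - HR) / (AB - K - HR + SF). The function names and the team totals below are made up for illustration.

```python
def babip_allowed(hits: int, home_runs: int, at_bats: int,
                  strikeouts: int, sacrifice_flies: int) -> float:
    """BABIP against a team: (H - HR) / (AB - K - HR + SF), using totals allowed."""
    balls_in_play = at_bats - strikeouts - home_runs + sacrifice_flies
    if balls_in_play <= 0:
        raise ValueError("no balls were put in play")
    return (hits - home_runs) / balls_in_play


def defensive_efficiency(hits: int, home_runs: int, at_bats: int,
                         strikeouts: int, sacrifice_flies: int) -> float:
    """Def Eff = 1 - BABIP: the share of balls in play turned into outs."""
    return 1 - babip_allowed(hits, home_runs, at_bats, strikeouts, sacrifice_flies)


# Hypothetical team totals allowed over a season (not real data).
team_def_eff = defensive_efficiency(1400, 150, 5500, 1100, 50)
print(f"Def Eff: {team_def_eff:.3f}")
```

A single season of this number carries the same luck as BABIP itself, so averaging it over several seasons is the safer reading.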
Another big benefit of sabermetrics is that people have tried to use numbers from the past to predict how many wins a team will achieve and how many wins a player can add to his team through his contributions. There are some obvious connections between winning and getting on base, scoring runs and how a player does in a game. A few equations have tried to say just how much a player contributes to a team offensively and defensively. One of them, called WAR (wins above replacement), tries to say how many wins a player actually contributes to a team. It takes into account mostly two categories: runs created by the player and runs taken away by him as a defender. Runs created is based on many different categories, including base running, hitting and how a player does in certain situations. This research has expanded, though, to go beyond just a player’s contributions to a team. DePodesta in Moneyball used another equation that could predict how many games a team will win based on only two numbers: the runs scored (RS) by a team and the runs allowed (RA) by the same team. This equation, simply put, looks like this:

$$\text{Win\%} = \frac{\text{RS}^2}{\text{RS}^2 + \text{RA}^2}$$

where the result is the share of games that a team with those run totals usually wins in a season (math.bgsu.edu). This is not always true, as with the Orioles this year, who have scored only 667 runs but have allowed 673. This would imply a winning percentage of .495, and yet they are, at this point, at .571 and only 1.5 games behind the Yankees for first in the AL East. On top of that, they have the 3rd best record in the AL and the 7th best in all of MLB.
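The Orioles example can be checked with a short sketch of the runs-based formula. It assumes the classic form with an exponent of 2, RS^2 / (RS^2 + RA^2), which reproduces the roughly .495 figure quoted above; the function name is hypothetical, and the run totals and actual winning percentage are the ones given in the text.

```python
def pythagorean_win_pct(runs_scored: float, runs_allowed: float,
                        exponent: float = 2.0) -> float:
    """Expected winning percentage: RS^x / (RS^x + RA^x), with x assumed to be 2."""
    if runs_scored < 0 or runs_allowed < 0 or runs_scored + runs_allowed == 0:
        raise ValueError("run totals must be non-negative and not both zero")
    scored = runs_scored ** exponent
    allowed = runs_allowed ** exponent
    return scored / (scored + allowed)


# Orioles figures quoted in the text: 667 runs scored, 673 allowed, .571 actual.
expected = pythagorean_win_pct(667, 673)
actual = 0.571
print(f"expected {expected:.4f}, actual {actual:.3f}, gap {actual - expected:+.3f}")
```

The gap between the two numbers is the part of the record that run totals alone do not explain, whether that is bullpen usage, one-run games or plain luck.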
Looking at the statistics from last year, it is possible to say that the best players did not actually win the World Series. This is clear from the Cardinals’ starting rotation, where only one pitcher had a WHIP that ranked in the top 30 among starters (espn.go.com). On top of that, exactly zero Cardinals starters ranked in the top 30 in ERA. So, other than that one pitcher, Kyle Lohse, they did not have a starter who was really able to keep batters off base. The pitchers best suited to win a World Series would have a lower WHIP, meaning they allowed fewer players on base, and more QS, because they allow fewer runs and keep their bullpen fresh. Obviously, a team could not afford a perfect staff, but there were plenty of pitchers out there who would have improved this staff if the money could be found. The best pitchers in MLB in 2011, based on WHIP and QS, were not on the Cardinals. There were many other qualified pitchers, like Justin Verlander, who had the 2nd best QS percentage of games started and the best WHIP in the major leagues. If the world were fair, he would be the one with the World Series ring, but the world is not fair. If another pitcher were chosen instead of Verlander as the “ace” of the staff, Clayton Kershaw would be next in line because he had the 2nd best WHIP along with the 10th best QS percentage. Then, a team built with hindsight would choose a pitcher like Jered Weaver because he had the best QS percentage and the 4th best WHIP overall. A GM who wanted to complete a perfect starting rotation could then pick any of the Phillies’ top three of Cliff Lee, Roy Halladay and Cole Hamels, who ranked 7th, 8th and 4th in WHIP and 4th, 4th and 7th in QS percentage respectively.
That being said, the ability to win three games out of five in a playoff series is all that is really necessary to advance, so having any of these three pitchers would be more than enough to give a team a chance to win.
After making the perfect pitching staff, the next step is obviously forming a great offensive team. The team really only needs a few characteristics to be great: mainly, it needs to get on base at the top of the order, hit well in the middle and have some good power towards the end of the lineup. At least, this is one possible way to build a perfect team. Starting with the middle of the order, a team should be smart enough to focus on OPS (on-base plus slugging), and the three highest players from 2011 were Jose Bautista, Miguel Cabrera and Ryan Braun, all three with an OPS close to 1.000. Then, a team should focus on the top of the order, and the highest OBP players at positions not yet filled were Matt Kemp (.399), Alex Avila (.389) and Dustin Pedroia (.387). This would fill all three outfield positions along with catcher, second base and first base. That leaves shortstop and third base, where a team could target any characteristic it feels compelled to; this team would probably go back to OPS just because of the overall value of the number. The strongest players at those two positions were Adrian Beltre (.893 OPS at third base) and Troy Tulowitzki (.916 OPS at SS). A team with these players would be as close to perfect as possible on offense.
Of course, there are reasons any team wins a World Series, and the Cardinals had good reason to take the championship. In the regular season, they were great on offense, ranking in the top 6 in both OBP and SLUG, which gave them the 5th best OPS. They literally hit their way into the playoffs, because their pitching WHIP ranked in the middle of the pack at 15th. This means that they allowed teams a great opportunity to score and survived only by hitting their way out of trouble, or at least that is how they survived the regular season. In the postseason, their pitching had the best WHIP of any team that was not eliminated in the first round. The team was able to pick it up on the mound and carry that through the playoffs, and it did so consistently even though it should have faced better competition in each round. On top of that, the offense did not slack off either; it had the second best OPS behind Arizona, who was eliminated in 4 games. This implies that the Cardinals were able to carry their offense through the playoffs better than any other team and hit their way to a win just as often as they pitched themselves to victory. So even though the St. Louis Cardinals were behind teams such as the Rangers in OPS and WHIP in the regular season, they did better in the playoffs and therefore got past every other team, including the Rangers in the World Series. Even though the Cardinals got into the playoffs on the last game of the regular season and were nowhere near the best team during the season, they earned their victory by being the best team in these categories when it really mattered.
The great thing about baseball is that counting statistics like home runs, hits, strikeouts and other simple totals have been recorded since almost the beginning of the game. This allows players from different generations to be compared, both through the basic statistics that were tabulated and through sabermetric statistics calculated from the same data. A person could argue that the numbers are not directly comparable because players did not perform at the same level in every generation, but there is a way around this obstacle: if the league average is calculated for a particular year, dividing a player’s total by that year’s average shows how well he did compared to his peers and whether his success was as big as another player’s.
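The division by a year’s average can be sketched as a simple ratio. The function name is hypothetical and both players and league averages below are invented; a result above 1.0 means the player beat his peers that year.

```python
def relative_to_league(player_value: float, league_average: float) -> float:
    """Ratio of a player's figure to the league average for the same season."""
    if league_average <= 0:
        raise ValueError("league_average must be positive")
    return player_value / league_average


# Two hypothetical hitters from different eras (not real data).
# Hitter A batted .350 when the league batted .280;
# hitter B batted .330 when the league batted .250.
hitter_a = relative_to_league(0.350, 0.280)
hitter_b = relative_to_league(0.330, 0.250)
print(f"A: {hitter_a:.2f}x league, B: {hitter_b:.2f}x league")
```

In this made-up case the hitter with the lower raw average stands further above his own league, which is exactly the kind of comparison the raw totals hide.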