$Unique_ID{BAS00191}
$Pretitle{}
$Title{Sabermetrics}
$Subtitle{}
$Author{
Thorn, John}
$Subject{Sabermetrics statistical analysis sabermetrician sabermetricians
statistic statistics stat stats numbers Number Linear Weights Runs Wins Run
Productivity Average RPA RBIs RBI home run HR triple 3B double 2B single 1B
walks walk/HBP BB HB stolen base SB caught stealing CS bat AB hits H outs
OOB Formula SABR Fielding Pitching Earned Average ERA Park Factor Relativity
Relief}
$Log{}
Total Baseball: Registers, Leaders, and Rosters
Sabermetrics
Sabermetrics may be a new coinage for the statistical analysis of
baseball but it is not a new phenomenon. Henry Chadwick, in the antebellum
period, was as much a sabermetrician as Allan Roth or Bill James or Pete
Palmer: he saw as clearly as they that because the object of the game is to
win, runs are the best measure of player performance, just as they are of team
performance at the end of a game.
After many decades in which this fundamental truth was lost (amid the
general worship of false idols like batting average and pitcher won-lost
percentage), today's sabermetricians have come around full circle to the game
as it was originally understood. And what's remarkable about this is that in
order to return to the primordial simplicity of the 1840s, '50s, and '60s,
when runs and outs were all that went in to the box score, they have relied
upon computer simulations and higher mathematics. In other words, with the
new statistics, simplicity emerges from complexity; what baseball statistics
have offered for the last hundred years or so has been, despite the appearance
of simplicity, in fact extremely complex.
For the veteran fan as well as for Organized Baseball, new ideas, new
statistics, and new discoveries that dispute long-held verities (Ty Cobb's hit
total, Hoss Radbourn's number of victories in 1884, etc.) may represent a
challenge to tradition and thus a threat to the very soul of baseball, its
proud anachronism. Bernard Malamud wrote, "The whole history of baseball has
the quality of mythology." The editors of Total Baseball relish the game's
myths, from Abner Doubleday to the sacrifice bunt, and believe that in setting
the record straight or turning conventional wisdom on its head, they are
adding to the fan's enjoyment of the rich texture of the game. If you are one
of the skeptics--like Earl Weaver, who once said, "There's no such thing as a
new statistic"--please permit us to make the case for sabermetrics.
If you skipped over the preceding Introduction, which is largely taken up
with an overview of baseball's traditional measures, we would encourage you to
go back and read it before proceeding with this discussion of sabermetrics.
As with the Introduction, much of the material in this section is adapted from
the editors' earlier The Hidden Game of Baseball.
What's in a Number?
On April 27, 1983, the Montreal Expos came to bat in the bottom of the
eighth inning trailing the Houston Astros 4-2. First up to face pitcher Nolan
Ryan was Tim Blackwell, a lifetime .228 hitter who had struck out in his first
time at bat. At this routine juncture of this commonplace game, Ryan stared
down at Blackwell, but his invisible--yet, for all that, more
substantial--opponent was a man who had died the month before Ryan was born, a
man about whom Ryan knew nothing, he confessed, except his statistical line.
For at this moment of his glorious big-league career, Ryan had accumulated a
total of 3,507 strikeouts, only one short of the mark Walter Johnson set over
twenty-one seasons, from 1907 to 1927. Long thought invulnerable, Johnson's
record was in imminent danger of falling, in 1983, not only to Ryan but also
to Steve Carlton and Gaylord Perry.
Ryan fanned Blackwell and then froze the next batter, pinch-hitter Brad
Mills, with a 1-and-2 curveball. The pinnacle was his. Johnson had been
baseball's all-time strikeout leader since 1921, when he surpassed Cy Young.
Ryan would hold that title for just a few weeks, then would be overtaken by
Carlton, only to display an incredible finishing kick and top the 5,000 mark
in 1990. But at the time that Ryan topped Johnson, baseball savants scurried
to assess the meaning of 3,509 for both the deposed King of K and the new.
In the aftermath of Ryan's feat, some writers pointed out that he only
needed sixteen full seasons, plus fractions of two others, in which to record
3,509 strikeouts while Johnson needed twenty-one, or that Johnson pitched over
2,500 more innings than Ryan. Coming into the 1983 season, Ryan had fanned
9.44 men per nine innings, while Johnson was way down the list at 5.33. And
Ryan allowed fewer hits per nine innings than Johnson, or, for that matter,
anyone in the history of the game. So, it would seem 3,509 was not just one
batter better than Johnson, but rather was mere confirmation for the masses of
a superiority that was clear to the cognoscenti years before.
However, other writers introduced mitigating factors on Johnson's behalf,
much as Ruth found supporters as the home run king even after Aaron hit number
715. These champions of the old order cited Johnson's won-lost record of
417-279 and earned run average of 2.17 while scoffing at Ryan's mark, entering
1983, of 205-186 with an ERA of 3.11. This tack led to further argument in
print, bringing in the quality of the teams each man pitched for and against,
the resiliency of the ball, the attitudes of the batters in each era toward
the strikeout, the advent of night ball, integration, expansion, the
designated hitter, the overall talent pool, competition from other
professional sports . . . and on down into the black hole of subjectivism.
Why were so many things dragged into that discussion? Because the
underlying question about 3,509 was: Does this total make Ryan better than
Johnson, or even a better strikeout pitcher than Johnson? At the least, does
it make him a great pitcher? In our drive to identify excellence on the
baseball field (or off it), we inevitably look to the numbers as a means of
encapsulating and comprehending experience. This quantifying habit is at the
heart of baseball's hidden game, the one ceaselessly played by Ryan and
Johnson and Ruth and Aaron--and, thanks to baseball's voluminous records, more
than 13,000 other players--in a stadium bounded only by the imagination.
What's in a number? The answer to "How Many?" and sometimes a great deal
more. In this case, 3,509 men had come to the plate against Ryan and failed
to put the ball in play, one more man than Johnson had returned to the dugout,
cursing. So what's the big deal? That Ryan was .0002849 faster, scarier,
tougher--better--than Johnson? An absolute number like 3,509, or 714 (the
home-run record once thought invulnerable, too), or 4,191 (the erroneous hit
total of Ty Cobb that Pete Rose finally surpassed) does not resound with
meaning unless it is placed into some context that will give it life.
Baseball statistics are not the instruments of vivisection, taking the
life out of the game in order to examine it; rather, statistics are themselves
the vital part of baseball, the only tangible and imperishable remains of
contests played yesterday or a hundred years ago. Baseball may be loved
without statistics, but it cannot be understood without them. As the
statistics reflect more accurately the reality of what happened on the field,
greater understanding leads to a deeper love and appreciation of this great
game--which is, essentially, the case for sabermetrics and the reason for
Total Baseball.
The Linear Weights System
In 1982, Milwaukee's Robin Yount had the year of his life, batting .331
with 29 homers, 114 RBIs and 129 runs scored; he led the American League in
hits, doubles, total bases, and slugging percentage, while finishing just one
point behind the league leader in batting average. For the first of the two
times in his career, he was voted the Most Valuable Player in the American League,
being named first on all but one of the twenty-eight ballots cast by the
baseball writers.
Over in the other league, Mike Schmidt of the Phillies was having an off
year, batting only .280 with 35 homers and 87 RBIs; the previous year, when he
was awarded the MVP, in only 102 games played he had totaled 31 homers and 91
RBIs. He did lead the league once again in 1982 in slugging percentage, and
he did win the Gold Glove at third base for the seventh straight year, yet in
the MVP balloting none of the ballots listed him higher than fourth; ten
ballots were cast without listing him at all.
For Yount, 1982 was a crowning achievement; for Schmidt, a
disappointment: That is the verdict reached by the baseball writers and
conventional baseball statistics. Yet in terms of actual performance, as
determined by the number of runs contributed, Schmidt's "off year" was
scarcely different from Yount's. With the bat, Yount accounted for 50.3 runs
beyond what an average batter might have contributed; Schmidt, 47.2. Through
base stealing, Yount added 2.4; Schmidt none. With the glove, Yount was 4
runs below league average at his position; Schmidt was 18.8 above average at
his position. Total runs contributed: Yount 48.7, Schmidt 66.0. Total wins
contributed beyond average by each: Yount 6.5, Schmidt 6.3. Both men had
outstanding seasons, the best in their respective leagues, and both
outstripped the second-best player by about the same margin.
Viewing player (and team) performance through this sort of prism
frequently produces such illuminating results. Cecil Fielder had a wonderful
year in 1990, with his 51 homers, 132 RBIs, and league-leading figures in
slugging average and extra-base hits. But how did he convince any writer
voting for MVP that he had a better year than Rickey Henderson? In Total
Baseball, you could look it up: Fielder contributed 4.4 extra wins to his
team (wins that an average player would not), which was the fourth-best figure
in the American League that year; Henderson was responsible for a whopping
7.7, not only the top mark in 1990 but also the third-best mark in the AL
since Ted Williams' epic season in 1941!
This is the kind of analysis of player performance possible with a
variety of sabermetric measures, not just the Linear Weights System. The
common ingredient of most of the new, as yet unofficial statistics is their
creators' recognition of the relationship between runs and wins.
Runs and Wins
George Lindsey, in an article in Operations Research in 1963, was the
first to assign run values to the various offensive events which lead to runs:
Runs = (.41)1B + (.82)2B + (1.06)3B + (1.42)HR. He based these values on
recorded play-by-play data and basic probability theory. Unlike Earnshaw
Cook, who in the following year assigned run values on the basis of the sum of
the individual scoring probabilities--that is, the direct run potential of the
hit or walk plus those of the baserunners set in motion--Lindsey recognized
that a substantial part of the run value of any non-out is that it brings
another man to the plate. This additional batter has a one-in-three chance of
reaching base and thus bringing another man to the plate with the same chance,
as do the batters to follow. The indirect run potential of these batters
cannot be ignored.
Steve Mann's Run Productivity Average (RPA) assigned these values based
on observation of some 12,000 plate appearances: RPA = (.51)1B + (.82)2B +
(1.38)3B + (2.63)HR + (.25)BB + (.15)SB - (.25)CS, all divided by plate
appearances, then plus .016. His values were denominated in terms of the
number of runs and RBIs each event produced. Bill James, at about the same
time, came up with a similar formula, since shunned, with values based on runs
plus RBIs minus home runs. The drawbacks to the approaches of Mann and James
were the drawbacks of the RBI, which gives the entire credit for producing a
run to the man who plates it, and of the run scored, which gives credit only
to the man who touches home, no matter how he came to do so. For example,
with no outs, a man reaches first on an error; the next batter hits a double,
placing runners on second and third; the following batter taps a roller to
short and is thrown out at first, with the run scoring from third. The man
who produced the out is given the credit for producing a run, while the man
who started the sequence by reaching first on an error is likewise credited
with a run. The man who hit the double, which was surely the key event in the
sequence which produced the run, and the only one reflecting batting skill,
receives no credit whatsoever. In this regard, any formula based on "Runs
Produced" (whether R + RBI or R + RBI - HR) is philosophically inferior to
the formula Lindsey proposed, despite his failure to account for walks,
steals, and other events.
The run values in the Linear Weights formula for identifying batters'
real contribution are derived from Pete Palmer's 1978 computer simulation of
all major-league games played since 1901. All the data available concerning
the frequencies of the various events was collected; following a test run,
these were tabulated. Unmeasured quantities, such as the probability of a man
going from first to third on a single vs. that of his advancing only one base,
were assigned values based on play-by-play analysis of over 100 World Series
contests. The goal was to get all the measured quantities very nearly equal
to the league statistics; then the simulation would provide run values of each
event in terms of net runs produced above average. Expressing the values in
those terms would give a meaningful base line to individual performances,
because if you are told that a player contributed 87 runs you don't know what
that signifies unless you know the average level of run contribution in that
year: 87 may sound like a lot, but if the norm was 80, then you know the
player contributed only 7 runs beyond average.
The values obtained from the simulation are remarkably similar from one
era to the next, confounding expectations that the home run would prove more
valuable today than in the dead-ball era, or that the steal was once a primary
offensive weapon. These values are expressed in beyond average runs.
Run Values of Various Events, by Periods
-----------------------------------------------------------------
Event                 1901-20    1921-40    1941-60    1961-77
-----------------------------------------------------------------
home run                 1.36       1.40       1.42       1.42
triple                   1.02       1.05       1.03       1.00
double                    .82        .83        .80        .77
single                    .46        .50        .47        .45
walk/HBP                  .32        .35        .35        .33
stolen base               .20        .22        .19        .19
caught stealing          -.33       -.39       -.36       -.32
out [*]                  -.24       -.30       -.27       -.25
-----------------------------------------------------------------
* An out is considered to be a hitless at bat and its value
is set so that the sum of all events times their frequency is
zero, thus establishing zero as the base line, or norm, for
performance.
In the years since this simulation was conducted, statistician Dave
Smith ("Maury Wills and the Value of the Stolen Base," Baseball Research
Journal, 1980) convinced Pete to adjust the values of the stolen base and
caught stealing because of their situation-dependent, elective nature:
Attempts are apt to occur more frequently in close games, where they would be
worth more than if they were distributed randomly the way an event like a
single or a home run would be. Pete revised the value for the steal upward to
.30 runs, while that for the caught stealing became -.60 runs.
Just as these run values change marginally with changing conditions of
play, they differ slightly up and down the batting order (a homer is not
worth as much to the leadoff hitter as it is to the fifth-place batter; a walk
is worth more for the man batting second than for the man batting eighth);
however, these differences have been averaged out in the figures above. For
evaluating runs contributed by any batter at any time, there is no better
method than Batting Runs, the Linear Weights formula derived from the
computer simulation which is the basis of the table above.
The Formula
Runs = (.47)1B + (.78)2B + (1.09)3B + (1.40)HR + (.33)(BB + HB) + (.30)SB -
(.60)CS - (.25)(AB - H) - .50(OOB).
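    For readers who would rather let a computer do the arithmetic, here is a
minimal sketch of the formula above, written in Python; the season line plugged
in at the end is purely hypothetical, and the out value of -.25 is the fixed
approximation discussed a few paragraphs below.

# Batting Runs, using the Linear Weights event values printed above.
# OOB is outs on base; the out value may be recomputed per league so that
# the league total comes to zero (see below), but -.25 serves as a fixed
# approximation.
def batting_runs(singles, doubles, triples, hr, bb_hbp, sb, cs, ab, h, oob,
                 out_value=-0.25):
    return (0.47 * singles + 0.78 * doubles + 1.09 * triples + 1.40 * hr
            + 0.33 * bb_hbp + 0.30 * sb - 0.60 * cs
            + out_value * (ab - h) - 0.50 * oob)

# A purely hypothetical season line: 120 singles, 35 doubles, 5 triples,
# 25 home runs, 80 walks/HBP, 10 steals, 5 caught stealing, 550 at bats,
# 185 hits, 6 outs on base.
print(round(batting_runs(120, 35, 5, 25, 80, 10, 5, 550, 185, 6), 1))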
    The events not included in the formula that you might have thought to see
are sacrifice hits, sacrifice flies, grounded into double plays, and reached on
error. The last is not known for most years and in the official statistics is
indistinguishable from outs on base (OOB). The sacrifice has essentially
cancelling values, trading an out for an advanced base which, often as not,
leaves the team in a situation with poorer run potential than it had before
the sacrifice. The sacrifice fly has dubious run value because it is entirely
dependent upon a situation not under the batter's control: While a single or
a walk always has a potential run value, a long fly does not unless a man
happens to be poised at third base (whether it is achieved by accident or
design is open to question, as well, but that is beside the question--getting
hit by a pitch is not a product of intent, either). Last, the grounded into
double play is to a far greater extent a function of one's place in the
batting order than it is of poor speed or failure in the clutch, and thus it
does not find a home in a formula applicable to all batters. It is no
accident that Henry Aaron, who ran well for most of his long career and wasn't
too bad in the clutch, hit into more DP's than anyone else, nor that Roberto
Clemente, Al Kaline, and Frank Robinson, who fit the same description, are
also among the ten "worst" in this department. If Boston's Luis Rivera
doesn't hit into many twin killings, it's not because of adept bat handling or
blazing speed but because he bats ninth.
The Linear Weights formula for batters may be long, but it calls for only
addition, subtraction, and multiplication and thus is as simple as the
slugging average, whose incorrect weights (1, 2, 3, and 4) it revises and
expands upon. Each event has a value and a frequency, just as in slugging
average, yet as in no batting statistic you have ever seen, outs are treated
as offensive events with a run value of their own (albeit a negative one), a
truth so obvious it somehow escaped notice. Just as the run potential for a
team in a given half inning is boosted by a man reaching base, it is
diminished by a man being retired; not only has he failed to change the
situation on the bases but he has deprived his team of the services of a man
further down the order who might have come up in this half inning, either with
men on base and/or with scores already in.
What Batting Runs does is to take every offensive event and treat it in
terms of its impact upon the team--an average team, so that a man does not
benefit in his individual record for having the good fortune to bat cleanup
with the Giants or suffer for batting seventh with the Astros. The
relationship of individual performance to team play is stated poorly or not at
all in conventional baseball statistics. In Batting Runs it is crystal clear:
the linear progression, the sum of the various offensive events, when weighted
by their accurately predicted run values, will total the runs contributed by
that batter or that team beyond the league average.
    Recognizing that some dedicated readers of Total Baseball will wish to keep
track of batting performance by computing Batting Runs themselves over the
course of a season, and that they may be frustrated by the difficulty of
calculating the "At Bats-Hits" factor for the league, which is necessary to
determine the negative value of an out, we advise that using a fixed value of
-.25 for outs will tend to work quite well if you wish to include pitcher
batting performance, and a fixed value of -.27 will serve if you wish to
delete it. Actually, any fixed value will suffice in midseason; it's only
when all the numbers are in and you care to compare this year's results with
last year's (or with those of the 1927 Yankees) that more precision is
desirable. At that point the value of the out may be calculated by the
ambitious among you, but ideally, your newspaper or the sporting press will
provide accurate Batting Runs figures. Who, after all, calculates ERA for
himself?
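    For the truly ambitious among you, here is a sketch, again in Python, of
how the league out value can be set so that the league's Batting Runs sum to
zero, as the footnote to the run-values table above describes; the league
totals fed to it are whatever your source provides, with or without pitcher
batting, per the -.25/-.27 guidance above.

# Solve for the out value that makes league Batting Runs total zero,
# per the footnote to the run-values table above. All arguments are
# league totals.
def league_out_value(lg_1b, lg_2b, lg_3b, lg_hr, lg_bb_hbp, lg_sb, lg_cs,
                     lg_ab, lg_h, lg_oob):
    other_events = (0.47 * lg_1b + 0.78 * lg_2b + 1.09 * lg_3b + 1.40 * lg_hr
                    + 0.33 * lg_bb_hbp + 0.30 * lg_sb - 0.60 * lg_cs
                    - 0.50 * lg_oob)
    outs = lg_ab - lg_h              # hitless at bats
    return -other_events / outs      # typically in the -.25 to -.30 range

The value returned can then be plugged into the Batting Runs sketch above in
place of the fixed -.25.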
Batting Runs and Production
For those to whom calculation is anathema, or at the least no pleasure,
Batting Runs has a "shadow stat" that tracks its accuracy to a remarkable
degree and is a breeze to calculate: Production, which consists simply of On
Base Percentage Plus Slugging Average. While it is not expressed in runs and
thus lacks the philosophical appeal of Batting Runs, the standard deviation of
its most complete version is 20.4 runs compared to the 19.8 of Batting Runs.
In other words, the correlation between Batting Runs and Production over the
course of an average team season is 99.7 percent.
However, as an average or ratio, Production measures the rate of batting
success (efficiency), while Batting Runs measures the amount of success. For
example, a batter who goes 2-for-5 with a walk in one game, those 2 hits being
doubles, will have an On Base Percentage of .500 and a Slugging Average of
.800; his Production will be 1.3, or as stated for convenience in Total
Baseball, 130. Another batter, who in 162 games gets 200 hits and 100 walks
in 500 at bats, with 400 total bases, will have an identical OBP, SLG, and
PRO. Which player has contributed more to his team? Clearly, longevity, or
amount of production, is no less important than rate of production.
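    If you care to verify the one-game example above, the arithmetic is brief.
Note that this simple on-base figure counts only hits and walks over at bats
plus walks, as in the example; the official version also counts hit batsmen
and sacrifice flies.

# Production (PRO) = On Base Percentage plus Slugging Average, reproducing
# the one-game example above: 2-for-5, both hits doubles, plus a walk.
hits, at_bats, walks, total_bases = 2, 5, 1, 4
obp = (hits + walks) / (at_bats + walks)   # .500
slg = total_bases / at_bats                # .800
pro = obp + slg                            # 1.300, stated for convenience as 130
print(round(obp, 3), round(slg, 3), round(pro, 3))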
To cite a specific instance in which Production and Batting Runs differ,
take George Brett's remarkable 1980 season in which he batted .390, had 298
total bases, 75 bases through walks or HBP, and 118 RBIs--all in only 117
games played. In the table of all-time single-season leaders in production,
the Kansas City third baseman ranks 44th when his PRO of 1.124 is normalized
to the league average and adjusted for home-park effects. Yet in the table of
park adjusted Batting Runs, Brett's season ranks out of the top 100 because
he missed 45 games, in which his team derived no benefit from his high rate of
performance. (Had Brett played 162 games and continued to perform at the same
level, his Batting Runs would have been not 64.8 but 89.7, the 19th best mark
in history.)
Because PRO is not expressed in runs, it is less versatile than Batting
Runs. For just as runs are proportional to the events that form them, so are
they proportional to wins and losses. This statement, a truism today, was a
novelty in 1954 when Rickey and Roth first stated the correlation between run
differentials and team standings. But they did not take the next step, to
recognize that not only a team's standing but even its won-lost record could
be predicted from the run totals.
"The initial published attempt on this subject," Pete wrote in the 1982
issue of the SABR annual The National Pastime, "was Earnshaw Cook's Percentage
Baseball, in 1964. Examining major-league results from 1950 through 1960 he
found winning percentage equal to .484 times runs scored divided by runs
allowed. . . . Arnold Soolman, in an unpublished paper which received some
media attention, looked at results from 1901 through 1970 and came up with
winning percentage equal to .102 times runs scored per game minus .103 times
runs allowed per game plus .505. . . . Bill James, in the Baseball Abstract,
developed winning percentage equal to runs scored raised to the power x,
divided by the sum of runs scored and runs allowed each raised to the power x.
Originally, x was equal to two but then better results were obtained when a
value of 1.83 was used. . . .
"My work showed that as a rough rule of thumb, each additional ten runs
scored (or ten less runs allowed) produced one extra win, essentially the same
as the Soolman study. However, breaking the teams into groups showed that
high-scoring teams needed more runs to produce a win. This runs-per-win
factor I determined to be ten times the square root of the average number of
runs scored per inning by both teams. Thus in normal play, when 4.5 runs per
game are scored by each club, each team scores .5 runs per inning--totalling
one run, the square root of which is one, times ten."
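    Palmer's rule of thumb converts readily to a few lines of Python; the
function below simply restates the square-root relationship he describes, and
dividing a player's Linear Weights runs by this factor is all that is needed
to turn them into wins, as the paragraphs that follow illustrate with Ruth's
1927 season.

import math

# Runs needed per additional win: ten times the square root of the runs
# scored per inning by both teams, per Palmer's rule quoted above.
def runs_per_win(runs_per_game_each_team):
    runs_per_inning_both_teams = 2 * runs_per_game_each_team / 9
    return 10 * math.sqrt(runs_per_inning_both_teams)

print(runs_per_win(4.5))   # normal play: 10.0 runs per win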
Note that when Palmer refers to the need for approximately ten additional
runs scored (or ten fewer allowed) to provide a team with an additional win,
he does not mean that it takes ten runs to win any given game. Obviously, in
a specific case, a one-run margin is all that is required; but statistics are
designed for the long haul, not the short.
What does this have to do with Batting Runs? Remembering that Batting
Runs are expressed not simply in runs but in beyond-average runs, the
conversion from a batter's Linear Weights runs to his wins is a snap: simply
divide Batting Runs by the number of runs it takes to gain an extra win in a
given year. Taking the exploits of Babe Ruth in 1927, we see that through
batting alone he contributed 100.7 runs, or 9.56 wins, since in the American
League in 1927 it took 10.53 runs to produce an additional win. If every
other player on the Yankees had performed at the league average, the New York
record should have been 87-67; if each of the seven other batters had
performed only half as well as Ruth and had added five extra wins (discounting
reserves, pitchers, fielders, and stealers, whom we shall presume for this
discussion to have been average), the Yankees would have gained another 35
wins (7 X 5) to finish with a won-lost mark of 122-32.
Stolen Base Runs
The Linear Weights formula for batters contains a factor for base
stealers, expressed in runs. How do you judge the effectiveness of a base
stealer? Conventional baseball statistics will lead you to the conclusion that
whoever has the most steals is the best thief; that is the sole criterion for
The Sporting News annual "Golden Shoe Award" in each league. How often the
man with the most steals may have been thrown out is of no concern.
An article in the 1981 Baseball Research Journal by Bob Davids offered
something more sophisticated yet utterly simple: a stolen base percentage,
which is simply stolen bases divided by attempts. The best stolen base
average of all time, insofar as we know and based on a minimum of 30 attempts,
is Max Carey's in 1922 when he stole 51 bases in 53 attempts. The most times
caught stealing in the course of a season was Ty Cobb's 38 in 1915, until 1982
when Rickey Henderson was nabbed 42 times. But the best method yet devised,
and the one that is pleasingly simple, is to apply the Linear Weights method
to get Stolen Base Runs. One multiplies the steals by their run value of .30
and the failed attempts by -.60, and adds the two products. The
implication for such men as Ty Cobb, Rickey Henderson, and Vince Coleman is
clear: It takes a fabulous stealing performance to produce as much as one
extra win for the team.
    In 1915, when Ty Cobb established the modern stolen base record of 96, he
can be seen to have contributed 28.8 runs to his team, while his 38 foiled
larcenies cost 22.8. Thus Cobb, for all his whirling-dervish activity,
accounted for only 6 non-par runs--not even a single win. Whoa! You mean
that not a single one of Cobb's steals produced a victory? That is not what
is being said: the fact is that while the gain from the stolen base is
entirely visible--an extra base which may be followed by a hit that would
otherwise not have produced a run--the cost of the caught stealing is entirely
invisible, or conjectural, except with the aid of statistics. How many big
innings did Cobb run his team out of? How many batters reached base in
ensuing innings who might, in an earlier inning, have had their contributions
count for runs? What Stolen Base Runs indicate is that, on balance, not on a
specific-case basis, the stolen base is at best a dubious method of increasing
a team's run production.
Now let's take a look at what Henderson did. His record 130 stolen bases
in 1982 produced 39 runs for his team. His 42 failed attempts took away 25.2
possible runs. Net effect: approximately 14 runs, or one and a half wins, a
performance nearly three times as good as Cobb's. In 1983, stealing 22 fewer
bases, he was even better, accounting for 21.0 runs. However, the all-time
best stealing record is that of Maury Wills in 1962, when he stole 104 bases
and was caught only 13 times. Wills' 104 stolen bases produced 31.2 runs;
his 13 failed attempts cost only 7.8. So, his baserunning contribution was
23.4, or a little over two wins.
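    Stolen Base Runs thus reduce to one line of arithmetic; the little Python
sketch below reproduces the Cobb, Henderson, and Wills figures just cited.

# Stolen Base Runs: +.30 runs per steal, -.60 per caught stealing.
def stolen_base_runs(sb, cs):
    return 0.30 * sb - 0.60 * cs

for name, sb, cs in [("Cobb 1915", 96, 38),
                     ("Henderson 1982", 130, 42),
                     ("Wills 1962", 104, 13)]:
    print(name, round(stolen_base_runs(sb, cs), 1))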
Fielding Runs
As mentioned earlier, in 1954 when Branch Rickey and Allan Roth came up
with their "efficiency formula" for run scoring and run prevention, the
defensive half of the equation was divided into five segments, the last of
which was fielding, to which they assigned a mathematical value of zero.
"There is nothing on earth," Rickey declared, "anyone can do with fielding."
Since then many have tried, with mixed results, to improve upon the mere
toting up of raw data--putouts, assists, errors, double plays. In the second
edition of Total Baseball, we improved upon the Fielding Runs formula by
calculating innings played at each position, plate appearances for all players
on the team, and then rating each fielder based on his chances per inning.
(Formerly we had rated each position on each team based on totals for all
players on that team at that position; then we split up the total based on
putouts. For more on the formula, see the Introduction to the new Fielding
Register, and the Glossary.)
In the current edition we have also rated left fielders against left
fielders, center fielders against center fielders, and right fielders against
right fielders, where previously all outfield positions had been grouped
together. We revised thoroughly the formula for catchers, which retains the
highest degree of subjectivism because their primary defensive contribution
comes not with the glove but through calling the pitches.
More on this complex subject in the Glossary.
Pitching Runs
Determining the run contributions of pitchers is much easier than
determining those of fielders or batters, though not quite so simple as that
of base stealers. Actual runs allowed are known, as are innings pitched.
Let's assume that a pitcher is responsible only for earned runs. Then why, we
hear some of you asking, is the ERA not measure enough of his ability?
Because it tells only the pitcher's rate of efficiency, not his actual benefit
to the team. In a league with an ERA of 3.50, a starter who throws 300
innings with an ERA of 2.50 must be worth twice as much to his team as a
starter with the same ERA who appears in only 150 innings. Through Pitching
Runs, we seek to determine the number of beyond-average runs a pitcher
saved--the number he prevented from scoring that an average pitcher would have
allowed.
The formula for Earned Run Average is:
ERA = (Earned Runs x 9)/Innings Pitched
The number of average, or par, runs for a pitcher, which is represented
by a Pitching Runs figure of zero, is equal to:
(League ERA X IP)/9
If the league ERA is 3.79 (as the National League's was in 1990) and a
pitcher's ERA is also 3.79, he will by definition have held batters in check
at the league average no matter how many innings he pitched. If, however, his
ERA was 2.67 and he hurled 249 innings (as Frank Viola did for the Mets in
'90), he will have saved a certain number of runs that an average pitcher
might have allowed in his place; to find that number we employ the Pitching
Runs formula:
Pitcher's Runs = Innings Pitched X (League ERA/9) - ER
This represents the difference between the number of earned runs allowed
at the league average for the innings pitched and the actual earned runs
allowed. For the case of Viola, we get
Runs = 249 X 3.79/9 - 74 = 31.2
Viola was 31.2 runs better than the average National League pitcher in
1990, and had he been transported to an average NL team--that mythical entity
that scores as many runs as it allows while winning 81 and losing 81--he would
have made that team's mark 84-78. An alternative way to calculate pitchers'
Linear Weights, useful with oldtimers for whom you may have the ERA but not
the number of earned runs allowed, is to subtract the pitcher's ERA from the
league's ERA, multiply by the innings pitched, and divide by nine. In Viola's
case, this approach would look like:
(3.79 - 2.67) X 249/9 = 31.0
    The difference of two tenths of a run arises because we are using the
pitcher's ERA of 2.67, which has been rounded off, rather than the absolute
figure of his earned runs allowed, 74.
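    Both forms of the calculation are easily scripted; the Python sketch below
uses Viola's 1990 line as above, and the results differ slightly from each
other, and from the figures above, only because of rounding.

# Pitching Runs, two ways: from earned runs allowed, and from ERA alone
# (useful for oldtimers whose earned-run totals are unknown).
def pitching_runs(ip, er, lg_era):
    return ip * lg_era / 9 - er

def pitching_runs_from_era(ip, era, lg_era):
    return (lg_era - era) * ip / 9

# Frank Viola, 1990: 249 innings, 74 earned runs, 2.67 ERA; league ERA 3.79.
print(round(pitching_runs(249, 74, 3.79), 1))
print(round(pitching_runs_from_era(249, 2.67, 3.79), 1))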
The two parts of performance--efficiency and durability, or how well and
how long--are incorporated into all Linear Weights measures. If you are
performing at a better than average clip, the more regularly you do so, the
more your team will benefit and thus the higher your Linear Weights measure.
If you are stealing bases nine times out of ten, your team will benefit more
from sixty attempts than from forty; if you are batting at an above average
clip, it's better to play in 160 games than 110; if you're allowing one earned
run per game less than the average pitcher, your LWTS will increase with
innings pitched.
    A problem emerges in this regard when trying to compare the Pitching Runs
of a 1978 pitcher like Ron Guidry with those of Hoss Radbourn in 1884.
In the "efficiency" component of the formula, which may be understood as the
league ERA minus the individual's ERA, the two compare this way:
Guidry = 3.76 - 1.74 = 2.02
Radbourn = 2.98 - 1.38 = 1.60
Guidry's differential is "unfairly" boosted by the higher league ERA of
1978; in fact, if we had compared the two by their normalized ERAs, which is
logically more sound, the results would have been:
Guidry = 3.76/1.74 = 2.16 Radbourn = 2.98/1.38 = 2.16
Yet because rules and playing conditions allowed Radbourn to extend his
efficiency over 679 innings, while Guidry hurled "only" 274, their Pitching
Runs look like this:
Guidry = 62.0 Radbourn = 120.6
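    A short sketch makes the comparison concrete; the Pitching Runs it prints
come out a few tenths from the figures above only because the ERAs are rounded
to two places.

# ERA differential vs. normalized ERA, and the resulting Pitching Runs,
# for the Guidry/Radbourn comparison above.
def era_differential(lg_era, era):
    return lg_era - era

def normalized_era(era, lg_era):
    return lg_era / era

for name, lg_era, era, ip in [("Guidry 1978", 3.76, 1.74, 274),
                              ("Radbourn 1884", 2.98, 1.38, 679)]:
    diff = era_differential(lg_era, era)
    print(name, round(diff, 2), round(normalized_era(era, lg_era), 2),
          round(diff * ip / 9, 1))   # differential, normalized ERA, Pitching Runs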
There is a great deal more to say on the subject of pitching and
sabermetric stats: see the Introduction to the Pitching Register and the
Glossary.
Linear Weights in Practice
Having formulas for pitching, fielding, baserunning, and batting, we can
assess the run-scoring contribution of every individual who has ever played
the game, and thus the number of wins that he has contributed in a given
season or over his career. The number of runs required to produce an
additional win has varied over the years between 9 and 11 runs, with a very
few league seasons outside those parameters.
Limited by conventional baseball statistics, one might, in 1990, have
uttered something like, "Barry Bonds hit .293 with 33 homers and 114 RBIs--the
guy must have been worth 10 extra wins to Pittsburgh all by himself!" Or:
"The White Sox are only one pitcher away from winning the division." Or:
"The Yankees are only three players away from being a contender." Or,
"Letting Darryl Strawberry get away was the worst thing the Mets ever did;
they'll be a second-division club for a decade." With Linear Weights, these
statements, or rather the concerns they reflect, can be approached with some
data and with some degree of objectivity. First: Barry Bonds had a fine year
in 1990, but to have contributed 10 wins by himself he would have had to
account for nearly 100 Linear Weights runs, a mark that has been attained by
only three men in major-league history. In fact, Bonds contributed 6.5 wins
in '90, though he did post 9.0 wins in 1992.
As to the White Sox, they finished 94-68 in 1990, while their Linear
Weights projected them to finish at 81-81. The Athletics, who won the AL West
at 103-59, actually projected to finish 96-66. So, the Sox management might
have asked, how to close ground on the Athletics? Could one pitcher--like Bob
Welch, for whom they bid in the free-agent bazaar--make the difference? To do
so, he would have to contribute about 150 Pitching Runs, a feat no pitcher has
ever accomplished. In 1990, pitching for Oakland--and remember, the Linear
Weights formula is divorced from considerations of batter support--Welch
contributed 20.7 park-adjusted Pitching Runs. So presuming that he pitched as
well for the White Sox as he did for the Athletics, or even slightly better,
he would not be enough to "win" Chicago the flag on paper; Chicago would need
help from other quarters.
Regarding the other statements, you get the picture: sabermetric
analyses like the ones above that employed the Linear Weights System will tend
to puncture fantasies.
Park Factor
A central issue for sabermetricians is the network of illusion created by
home-park dimensions, atmospheric conditions, and visibility for batters. How
many home runs would Mark McGwire hit if he played half his games in Fenway
Park? Will the Atlanta Braves and Chicago Cubs keep "failing" to put together
solid pitching staffs--or has their pitching been adequate all along? Why
have the American League leaders in triples so often worn a Royals uniform?
One's home park has a powerful effect on a player or pitcher's record,
elevating some good players to greatness and denying the spotlight to some
outstanding performers.
It should be understood that the average player does better at home
regardless of the park--familiarity breeds success, it seems. Individuals bat
and pitch at a rate 10 percent higher at home, on average. But parks don't
create performance; they only affect it. For example, a lefthanded hitter at
Fenway can do very well indeed, as Wade Boggs has, by learning to take the
outside pitch to left field. Likewise, a righthanded batter can make the
friendly Green Monster into his nemesis by trying to pull every pitch.
For hard luck in home parks, it is tough to top the record of Dave
Winfield, who has had the misfortune to call both San Diego and Yankee
stadiums home before landing in the more or less neutral Big A in Anaheim.
Through 1990, his lifetime Production, normalized to league average but not
adjusted for park effects, was 117th best on the all time list of those
playing in 1,000 games. Had he played his home games instead in Fenway Park,
his PRO would have projected to the 45th best of all time. Had he even played
in an average hitters' park--which is what PRO+ measures--his record would
show itself to be the 80th best ever.
If we desire to remove the silver spoon or the millstone that a home park
can be, and measure individual ability alone, we must create a statistical
balancer that diminishes the individual batting marks created in parks like
Fenway and augments those created in San Diego. Pete Palmer developed an
adjustment that enables us, for the first time, to measure a player's
accomplishments apart from the influence of his home park.
Parks differ in so many ways that it may be hard to imagine how their
differences can be quantified. The most obvious way in which they differ is
in their dimensions, from home plate to the outfield walls, and from the base
lines to the stands. The older arenas--Fenway Park, Wrigley Field, Tiger
Stadium--tend to favor hitters in both regards, with reachable fences and
little room to pursue a foul pop. The exception among the older parks was
Chicago's Comiskey, which, in keeping with the theories of Charles Comiskey
back in 1910 and the team's perceived strength, was built as a pitcher's park.
Two parks can have nearly equal dimensions, like Pittsburgh's Three Rivers
Stadium and Atlanta's Fulton County Stadium, yet have highly dissimilar
impacts upon hitters because of climate (balls travel farther in hot weather),
elevation (travel farther above sea level), and playing surface (travel faster
and truer on artificial turf). Yet another factor is how well batters think
they see the ball; Shea Stadium is notorious as a cause of complaints.
And perhaps more important than any of the objective park
characteristics, suggested Robert Kingsley in a 1980 study of why so many
homers were hit in Atlanta, is the attitude of the players, the way that the
park changes their view of how the game must be played in order to win. Every
team that comes into Atlanta in August knows that the ball is going to fly
and, whether it is a team designed for power or not, it plays ball there as if
it were the 1927 Yankees. In their own home park the Astros may peck and
scratch for runs, but in Atlanta they will put the steal and hit-and-run in
mothballs. Conversely, a team which comes into the Astrodome and plays for
the big inning will generally get what it deserves--a loss. The successful
team is one that can play its game at home--the game for which the team was
constructed--yet is flexible enough to adapt when on the road. How to
quantify attitude?
Rather than try to assign a numerical value to each of the six or more
variables that might go into establishing an estimator of homepark impact,
Pete looked to the single measure in which all these variables are
reflected--runs. After all, why would we assign one value to dimensions,
another to climate, and so on, except to identify their impact on scoring? If
a stadium is a "hitters' park," it stands to reason that more runs would be
scored there than in a park perceived as neutral, just as a "pitchers' park"
could be expected to depress scoring.
The full and lengthy explanation for the computation of the Park Factor
is left to the Glossary, where hardy readers might consider taking a peek
right now. For most of us, though, it will be enough to understand that the
Park Factor consists mainly of the team's home-road ratio of runs allowed,
computed as it was above for the league, compared to the league's home-road
ratio.
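    A deliberately simplified sketch of that core idea follows; the figures
fed to it are entirely hypothetical, and the full formula in the Glossary
includes further refinements beyond this bare ratio.

# A bare-bones park factor: the team's home-road ratio of runs allowed
# compared with the league's home-road ratio. Values above 1.00 suggest a
# hitters' park, below 1.00 a pitchers' park. All inputs are hypothetical.
def simple_park_factor(team_home_allowed, team_road_allowed,
                       league_home_allowed, league_road_allowed):
    team_ratio = team_home_allowed / team_road_allowed
    league_ratio = league_home_allowed / league_road_allowed
    return team_ratio / league_ratio

# Hypothetical team in a pitchers' park: 300 runs allowed at home, 360 on
# the road, in a league allowing 9,500 at home and 10,400 on the road.
print(round(simple_park_factor(300, 360, 9500, 10400), 2))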
Just as Dave Winfield's stats suffered for the home parks he played in
until he joined the California Angels, Dean Chance, star pitcher of the Angels
in the mid-1960s, benefited from playing in Chavez Ravine when it was
notoriously rough on hitters. This is not to say Chance had anything but a
marvelous year in 1964: 20 wins, a 1.65 ERA, and 11 shutouts are hard to
argue with. Still, in 81 home games in 1964, the Angels allowed 226 runs; in
81 games on the road, they allowed 325--44 percent more, where a 10 to 11
percent increase would have been normal. To compare Chance fairly with, say,
Bert Blyleven in his years with Minnesota, you must deny one the benefit of
his home park and remove from the other the onus of his. This is
what Park Factor does.
For decades, the all-time scoring squelcher was Chicago's South Side
Park, which saw service at the dawn of the American League. From 1901 through
1909, its last full year of service to the White Sox, this cavernous stadium
produced home run totals like the 2 in 1904, 3 in 1906, and 4 in 1909; in two
years the Sox failed to hit any homers at home, thus earning the nickname
"Hitless Wonders." In 1906, Chicago pitchers held opponents to 180 runs at
South Side Park, an average of 2.28 runs per game, earned and unearned, in a
decade when 4 of every 10 runs were unearned. This mark held until 1981, when
the Astrodome intimidated opposing hitters to such a point that in the 51 home
dates of that strike-shortened season, Astro hurlers were touched for only 106
runs--2.08 per game. The Pitcher Park Factor of .817 for the Astrodome was
the lowest ever. Those who suspected that men like Joe Niekro, Don Sutton,
Vern Ruhle, et al., were perhaps not world beaters after all were right: Look
at the ERAs the Astro starters registered that year, and what these ERAs might
have been in an average park like Shea that year (BPF: 1.00) or a moderately
difficult pitchers' park like San Francisco (BPF: 1.06).
Houston Pitchers, 1981
-----------------------------------------------------------
                          ERA     BPF: 1.00     BPF: 1.06
-----------------------------------------------------------
Nolan Ryan               1.69          2.07          2.19
Joe Niekro               2.82          3.43          3.64
Vern Ruhle               2.91          3.56          3.77
Bob Knepper              2.18          2.66          2.82
Don Sutton               2.60          3.17          3.36
HOUSTON (all)            2.66          3.24          3.44
SAN FRANCISCO (all)      3.28          3.09          3.28
-----------------------------------------------------------
Some observations prompted by this table: San Francisco with its team ERA of
3.28 had a better pitching staff than Houston with its 2.66; and Houston
batters, regarded as a Punch-and-Judy crew by all observers, must have been a
lot more effective than heretofore suspected. In fact, when Houston batters'
totals (eighth in runs scored, eighth in LWTS) are adjusted for park, the
Astros emerge on ability as the best hitting team in the National League of
1981! Even without the application of Park Factor, one might have come to a
similar conclusion by examining the runs scored totals for all NL clubs on the
road in 1981. Houston's total was exceeded only by those of the Dodgers and
Reds.
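    The adjusted columns in the table above appear to scale each raw ERA by
the ratio of the target park factor to the Astrodome's own, the .817 figure
being the Pitcher Park Factor cited earlier; under that assumption, a sketch
would be as follows, with differences of a hundredth or two from the printed
values owing to rounding.

# Translating an ERA from one park context to another, assuming the table's
# adjusted columns scale by (target park factor / home park factor).
# The .817 home factor is the Astrodome figure cited above.
def park_adjusted_era(era, home_pf, target_pf=1.00):
    return era * target_pf / home_pf

print(round(park_adjusted_era(1.69, 0.817), 2))        # Ryan in a BPF 1.00 park
print(round(park_adjusted_era(1.69, 0.817, 1.06), 2))  # Ryan in a BPF 1.06 park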
Proceeding from a similar hunch, we may look at the batting record of the
"Hitless Wonders" of 1906, who won the pennant (and the World Series, in four
straight over a Cubs team which went 116-36 during the season). Baseball lore
has it that a magnificent pitching staff (Ed Walsh, Doc White, Nick Altrock,
and others) overcame a puny batting attack (BA of .230, 6 homers, slugging
percentage of .286). In fact, the Sox scored more runs on the road than all
but one AL team, and their Batting Linear Weights, when adjusted for park, was
third in the league--the same rank achieved by their pitching. (How they won
the pennant remains a mystery, though, for both Cleveland and New York had
vastly superior teams on paper.)
Relativity
Sabermetric statistics can be marvelous tools for cross-era comparisons,
enabling us to determine if baseball's history is truly a seamless web or if
its seams are real enough, but are camouflaged by traditional statistics.
If Batter A presented himself to you for approval with these
statistics--.330 batting average, 16 home runs, 107 RBIs--what would your
reaction be? You'd like to have him on your team, right? And what to make of
Batter B, who presents these numbers--.257 batting average, 14 home runs, 53
RBIs? Not bad for a middle infielder with a good glove, you say, but
otherwise undistinguished? In fact, the "impressive" figures of Batter A
represent the average performance of a National League outfielder in 1930,
while the "blah" figures of Batter B are those of the average American League
outfielder of 1968: The former has more than twice the RBIs of the latter,
along with a batting average 73 points higher, yet the two performed at
identical levels, and an argument could be made that Batter B was superior.
In a similar comparison involving those two years of extremes, Bill Terry
led the National League in 1930 with a BA of .401, a mark surpassed by Ted
Williams in 1941 but not equaled since; Carl Yastrzemski led the American
League of 1968 with a performance that oldtimers held to be a disgrace, a
lowly BA of .301, the worst ever to win a batting championship. Terry's mark
was achieved at a time when most pitchers had only two pitches, a fastball and
a curve, and not enough confidence in the latter to throw it when behind in
the count at 2-0 or 3-1. The parks were smaller; there was no night ball; the
game was segregated racially; and you played 22 games with each team, none
farther west than St. Louis, on the Mississippi. Moreover, 1930 was the year
in which National League officials, attempting to match the popularity of the
slugging American League, juiced the ball to such an extent that the entire
league batted .312 (if you remove pitcher batting). In other words, the
average nonpitcher in the NL of 1930 batted higher than the AL leader in 1968!
When Yaz hit .301, pitchers dominated the game and the average American League
nonpitcher hit .238. How to compare Terry and Yaz, who played under such
different conditions thirty-eight years apart?
You could view Terry's .401 in relation to his league's BA of .312,
concluding that Memphis Bill was a better hitter (by BA alone, which despite
its previously cited deficiencies remains the most comfortable stat by which
to introduce this technique) by 28.5 percent. You could compare Yaz's .301 to
his league's BA of .238 and conclude that he was a better than average hitter
by 26.5 percent. A mere 2 percentage points separate the men--had they both
played in the National League of 1983, when the league average was .255, the
Terry of 1930 might have hit .328, the Yaz of 1968, .323. (A further
refinement of this method would be to delete Terry's at bats and hits from his
league's, and those of Yastrzemski from his league's, so that the batters are
not in effect compared with themselves. This, however, necessitates the use
of at bats and hits rather than simply the averages and does not significantly
alter the results.)
Why do we need relative measures? Basically, for the same reason we need
statistics altogether, to compare, to interpret, and to comprehend, but in a
more reasonable and accurate manner when the disparity of the data sources
makes the use of absolute, unadjusted numbers illogical. If the analysis
involves data produced under widely varying conditions, such as a sample
including performances 20, 50, or 100 years apart, any comparison will be
meaningless without dragging in a series of rather complex historical
understandings to modify the analysis--and in a highly subjective, unreliable
manner. To compare Terry's .401 with Yastrzemski's .301 with no recognition
of the context in which these marks were achieved, that is, to infer that
Terry was 100 points better than Yaz, is equivalent to comparing Babe Ruth's
salary of $80,000 in 1930 with Pete Rose's $806,250 of fifty years later and
concluding that Rose was $726,250 richer. To understand those figures we must
place them within a context which includes such factors as I.R.S. regulations
and inflation: We might think to re-express the two salaries in terms of
their purchasing power, multiplying each by the Consumer Price Index of its
time as expressed in 1967 dollars; doing this would be to compute a "relative
salary" for Ruth and Rose, just as we computed a Relative Batting Average for
Terry and Yaz. (And just as we discovered there was little difference between
the BAs of the latter couple, we would discover there is little difference
between the salaries of the former pair.)
Few are the fans who could cite the context of Ross Barnes' .429 batting
average of 1876, let alone evaluate its ingredients (these include
considerations of equipment, schedule, travel, physiology, racial exclusion,
daytime games, rules variations, attitudes, and customs). A statistic removed
from its historical context can be as deceptive as a quotation pulled out of
context. How, then, to compare Barnes' .429 with, say, Bill Madlock's
league-leading figure of .339 a century later? Should we discount Barnes'
average 10 percent because in his day batters could demand a pitch above the
waist or below? Or should we augment it 17 percent because a pitcher could
throw eight "balls" before allowing a walk?
We are confronted with a similar problem in trying to quantify the
various differences between home parks; our solution there was to look at the
single measure which reflected all the variables--runs--and from that measure
we proceeded to devise a formula for Park Factor. Similarly, the many
variables that supply the context for Barnes in 1876 supplied an identical
context for every other batter in that year--and the context in which Bill
Madlock hit .339 prevailed for every other National League batter in 1976
(except for home park, of course). Accordingly, if we form a ratio of
Barnes' .429 to his league's average (.265) and another of Madlock's to his
league's average (.263) we obtain figures (1.62 for Barnes, 1.28 for
Madlock--stated for convenience in Total Baseball as 162 and 128), which may
reasonably be compared with each other: Barnes was 62 percent better than his
league in BA, while Madlock was 28 percent better than his; these become the
comparables, not the .429 and .339. The method will not become a time
machine--putting Barnes on a modern club and Madlock on an old-time one--any
more than Park Factor is a place machine, switching Joe DiMaggio to Beantown
and Ted Williams to the Bronx. However, the relativist approach offers
suggestive truths and does measure precisely the extent to which Barnes' and
Madlock's BAs dominated those of their contemporaries.
Until the 1970s, when David Shoebotham ("Relative Batting Averages,"
Baseball Research Journal, 1976) and Merritt Clifton ("Relative Baseball,"
Samisdat, 1979) introduced the relativist approach, all baseball stats were
absolute. And for cross-era comparison, that favorite Hot Stove League
activity, absolute stats were absolutely useless, generating plenty of heat
and precious little light. What the theory of relativity, baseball-style,
does beautifully is to eliminate the need for bringing historical baggage to
statistical analysis. The normalized or relative versions of any
statistic--batting average, Production, ERA, slugging average, you name it;
even homers or strikeouts, though there are problems with these--will be
greater than 1.00 for all above-average performers (1.41, for example, means
41 percent better than average in the given category) while relative
statistics less than 1.00 will indicate a below average level of play (0.88
means 12 percent below the norm).
It is as simple as can be. So Early Wynn had a 3.20 ERA in 1950? What
does that mean? Well, the league ERA was 4.58, so Wynn did very well indeed.
His normalized ERA thus was 143, a mark better than that earned by Tom Seaver
in 1968, when he had an absolute ERA a full run lower at 2.20.
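    In code, the normalization is a one-line ratio in either direction; note
that for ERA the league figure goes on top, so that better than average always
comes out above 1.00.

# Relative (normalized) statistics: player over league for batting average,
# league over player for ERA, so that 1.00 is always the league norm.
def relative_batting_average(player_ba, league_ba):
    return player_ba / league_ba

def normalized_era(player_era, league_era):
    return league_era / player_era

print(round(relative_batting_average(.429, .265), 2))  # Barnes, 1876
print(round(relative_batting_average(.339, .263), 2))  # Madlock, 1976
print(round(normalized_era(3.20, 4.58), 2))            # Wynn, 1950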
We cannot employ a Relative Won-Lost record, for the league average is
every year the same: .500. (A logical corollary is that one cannot fruitfully
use relative measures of any sort for a single season's analysis, as all like
figures will be compared to the same league average. The numbers may be
changed into normalized form, but the players' rankings will be unchanged:
The top ten in batting average in 1990, for example, will retain their ranks
in Relative Batting Average.)
Relativism in baseball echoes not only Einstein but also Shakespeare,
whose words in Hamlet might be modified to read "There is nothing either good
or bad, but context makes it so." No longer must we accept arbitrary
assessments of performance or regard with awe such old-time figures as Hugh
Duffy's BA of .438 in 1894 (not the accomplishment that Rod Carew's .388 was
in 1977) or George Sisler's .407 in 1920 (not as good as Roberto Clemente's
.357 in 1967). Conversely, a "mediocre" performance of recent years, such as
Bobby Murcer's .292 of 1972, stacks up as the equal of Eddie
Collins' .360 in 1923, while Charlie Grimm's seemingly solid .298 in 1929
compares unfavorably to Mike Cubbage's .260 in 1976.
Relativism redefines our understanding not only of particular
accomplishments but also of baseball history itself. We see that the men who
batted .400 with numbing regularity in the 1890s and 1920s were not supermen
(would you swap Wade Boggs for Tuck Turner? George Brett for Harry Heilmann?)
any more than the sub-2.00 ERA pitchers of the late 1960s (Gary Peters, Bob
Bolin, Dave McNally, et al.) were. Absolute figures lie. Are hitters today worse
because none has hit .400 since 1941? Or are they superior because a Dave
Kingman can average nearly 30 homers a year while Cap Anson only averaged 4?
Are infielders better today because they make fewer errors than their
counterparts of 50, 75, or 100 years ago? Do modern outfielders have
limp-noodle arms because their assist totals pale before those registered in
the early decades of the 1900s? Is baseball improving or declining, and has
its rise or fall been steady? One can spit absolute stats on the hot stove
all winter long and get no closer to the answer, but with relative statistics,
the issues are clarified.
In the May 1983 issue of The Coffin Corner, the newsletter of the
Professional Football Researchers Association, Bob Carroll offered a witty and
perceptive dissection of the relative approach to football statistics. It was
based upon a comparison of two great running backs, Tuffy Leemans of the New
York Giants of the late 1930s and early '40s and George Rogers, then with the
New Orleans Saints. "I've always liked the story," Carroll wrote, "of the
little old lady who scornfully toured a Picasso exhibit and then sniffed, 'If
Rembrandt were alive today, he wouldn't paint this way!' To which a bystander
replied, 'Ah, but if Rembrandt were alive today, he wouldn't be Rembrandt.'"
There are things that relative baseball stats won't do, questions they
won't answer. What would Ty Cobb bat if he were playing today? Lefty O'Doul
was asked this question by a fan at an offseason baseball banquet in 1960.
"Maybe .340," O'Doul answered. "Then why do you say Cobb was so great," the
fan remarked, "if he could only hit .340 with the lively ball today?" "Well,"
O'Doul said, "you have to take into consideration that the man is now 74 years
old." Relative Batting Average cannot tell with certainty what Cobb would hit
today, for as Carroll wrote of Tuffy Leemans, if Cobb were playing today he
wouldn't be the same Cobb; he would be bigger, stronger, and faster, and he
might choose to steal less and go for the long ball more.
Relief Pitching
Absent from the chapter to this point has been the relief pitcher, a
modern specialist who, because of his still-evolving role in baseball,
presents a variety of sabermetric problems and opportunities. The nature of
the job is such that his won-lost record is not meaningful (even less so
today than ten or fifteen years ago, with the ace in most bullpens being
called upon--a practice of highly dubious wisdom--only when his team has a
lead in the eighth or ninth
inning). A reliever may pick up a win with as little as a third of an
inning's work, if he is lucky, while a starter must go five innings; a
reliever may also pick up a loss more easily, for if he allows a run there may
be little or no opportunity for his teammates to get it back, as they can for
a starter. Earned run average is meaningful for the reliever, but it must be
.15 to .25 lower to equate with that of a starter of comparable ability: a
reliever frequently begins his work with a man or two already out, and thus
can put men on base and strand them without having to register three outs.
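To make the adjustment concrete, a minimal sketch follows; the 0.20
figure is simply the midpoint of the range above, our assumption rather
than a settled constant:

    # Rough starter-equivalent ERA for a reliever, per the rule of thumb
    # above: a reliever's ERA runs some .15 to .25 lower than that of a
    # starter of comparable ability. We assume the midpoint, 0.20.
    RELIEVER_DISCOUNT = 0.20

    def starter_equivalent_era(reliever_era):
        return reliever_era + RELIEVER_DISCOUNT

    # A reliever with a 3.30 ERA rates about even with a 3.50 starter.
    print(round(starter_equivalent_era(3.30), 2))   # -> 3.5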
Ratios of hits to innings, strikeouts to innings, strikeouts to
walks--all of these have their interest, but none is sufficient by itself to
measure relief-pitcher effectiveness. Relievers may also have an edge in
these ratios because they generally face each batter only once in a game, thus
leading to fewer hits and more strikeouts per inning. Before discussing the
modern alternatives of saves or Relief Points, and our own Relief Ranking,
let's review briefly the rise of the relief pitcher from the role of a mere
hanger-on to, some would say, the most indispensable part of a winning team.
Relief pitching before 1891 was limited, with rare exceptions, to the
starting pitcher exchanging places with one of the fielders, who was known as
the "change pitcher." Substitutions from the bench were not permitted except
in case of injury until 1889, when a tenth man became entitled to designation
as a substitute for all positions; free substitution came two years later,
but no relief specialists emerged until Claude Elliott, Cecil Ferguson, and
Otis Crandall in the first decade of this century.
The next decade's best relievers were starters doing double duty--notably
Ed Walsh, Chief Bender, and Three Finger Brown. The 1920s, and up to the end
of World War II, brought the first firemen to be employed in the modern way,
although they tended to work more innings and fewer games than today. These
were men such as Firpo Marberry, Johnny Murphy, Ace Adams, and several other
worthies.
When you think of a relief pitcher in the modern-day sense--that is, a
man who can appear in 50 or more ballgames a year, all or nearly all in
relief, and win/save 30 or more--you begin with Joe Page of the 1947-49
Yankees and Jim Konstanty of the 1950 Phils, though Marberry had one such
season in 1926. None of the three, however, ever heard of a "save" in his
playing days--this term wasn't introduced until 1960, the year after Larry
Sherry's heroic World Series in which he finished all four Dodger victories,
garnering two for himself and saving the others; 1959 was also the year
fireman Roy Face went 18-1, not losing until September 11.
Before Jerry Holtzman of the Chicago Sun-Times devised the save, baseball
people had really only one figure by which to measure a reliever's work,
and that was the number of games in which he appeared; any other appreciation
of his efforts was expressed impressionistically. A reliever did not work
enough innings to qualify for an ERA title (Hoyt Wilhelm in 1952 being the
exception), nor could he expect to win 20 games. The introduction of a
specialized statistic for the fireman was acknowledgement of his specialized
employment and conferred upon it a status it had never enjoyed, not even after
the exploits of Konstanty, Page, Wilhelm, and Face. Only when the save came
into being did the majority of relievers take pride in their work and stop
regarding their time in the bullpen as an extended audition for a starting
role.
When The Sporting News, spurred by Holtzman, began recording saves in its
weekly record of the 1960 season, the save was defined in a way different from
today. Then, upon entering the game, a reliever had to confront the tying or
winning run on base or at the plate, and of course finish the game with the
lead. This definition was later eased, so that simply finishing a game
would get the reliever a save; a memorably absurd result of the new ruling was
that the Mets' Ron Taylor gained a save in 1969 by pitching the final inning
of a 20-6 win over Atlanta. This outraged sportswriters and fans alike, so in
1973 the definition was tightened: a reliever now had to work three
innings or come in with the tying or winning run on base or at bat. This
definition was relaxed in 1975 so that the tying run could be on
deck, thus giving the relief pitcher license to allow a baserunner. It was a
good thing for statisticians when Dan Quisenberry surpassed John Hiller's 1973
record of 38 saves by a decisive margin of 7. Today, of course, Bobby
Thigpen's 1990 mark of 57 saves seems beyond challenge . . . but back in 1919
so did Babe Ruth's 29 homers.
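The shifting rules are easier to see side by side. Here is a minimal
sketch in Python of the 1973 qualification and its 1975 relaxation as
described above; the parameter names and the simplified game-state test
are ours:

    # Save qualification under the 1973 rule and its 1975 relaxation,
    # as described in the text. Tracking the tying run suffices: if the
    # winning run is on base or at bat, the tying run is at least as close.
    # tying_run is one of: "on base", "at bat", "on deck", "farther away".
    def qualifies_for_save(innings_pitched, tying_run, finished_with_lead,
                           rule_year=1975):
        if not finished_with_lead:
            return False
        if innings_pitched >= 3:
            return True
        close_enough = ["on base", "at bat"]
        if rule_year >= 1975:
            close_enough.append("on deck")   # license to allow a baserunner
        return tying_run in close_enough

    # Ron Taylor's one inning at the end of a 20-6 game fails either test:
    print(qualifies_for_save(1, "farther away", True))   # -> False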
There was a blip in the relievers' trend of rising importance when the
American League introduced the designated hitter in 1973. The predicted
outcome, based on the first few years' experience of the DH, was: increased
offensive production, no more need to pinch-hit for the pitcher, and thus a
greater number of complete games and fewer saves. All those things did happen
in 1973-76, although not quite to the degree expected--and soon the American
League's use of relief pitchers became as extensive as it had been in the
early 1970s. In 1982, despite the DH, American League starters completed only
19.6 percent of their games, an all-time league low (though still
substantially higher than the National League, where CGs had dropped below
15 percent in the preceding few years). In 1990 the AL and NL each logged
complete games
at about a 16 percent rate.
Relief Points, in any of its various incarnations--including the one that
penalizes a blown save as well as a defeat--is an improvement over saves.
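A sketch of the computation follows, with the caveat that the weights
below--one point per win and save, one subtracted per loss and blown save
in the stricter incarnation--are our assumption, not an official formula:

    # Relief Points under an assumed weighting: wins plus saves, with the
    # stricter incarnation also docking a point apiece for losses and
    # blown saves. The actual weights vary from version to version.
    def relief_points(wins, saves, losses=0, blown_saves=0, penalize=False):
        points = wins + saves
        if penalize:
            points -= losses + blown_saves
        return points

    print(relief_points(6, 33))                             # simple: 39
    print(relief_points(6, 33, losses=4, blown_saves=5,
                        penalize=True))                     # stricter: 30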
Some folks still long for a measure of middle-relief effectiveness, that
statistical no-man's land. In April 1981 Sports Illustrated came up with an
incredibly complicated series of tabulations to address these remaining
injustices. The SI method dazzled in the same way that the Mills brothers'
Player Win Average did--it was ingenious and
well conceived, but involved too much work. Not only did it require
play-by-play analysis, but it also reminded one (queasily) of the National
Football League's quarterback-rating system. Quarterbacks are rated in four
categories, variously weighted, to arrive at a number of "rating points." Not
one fan in a thousand could tell you how the rating points are derived, and
the same holds for the SI relievers' formulas.
The final relief statistic to be discussed is the one we think is the
best--Relief Ranking, which is a weighted variant of park-adjusted Relief Runs
(in the first edition of Total Baseball, we applied the measure to all
pitchers who averaged less than three innings per appearance, and this
resulted in some needless inclusions of pitchers who were primarily starters;
this time around we have broken out all pitchers' relief innings). Relief
Ranking tends to favor closers, while Relief Runs provides a good measure for
middle-relief outings. See the Relief Pitcher Register, a new feature in
this third edition of Total Baseball.
The Future
The most exciting frontier for sabermetrics is in situational stats, the
type employed by Elias, Stats, Inc., and The BaseBall Workshop; as the years
go by and their data bases grow, the sample sizes will enlarge and their
figures for day vs. night, turf vs. grass, and so on will be statistically
significant, not merely arithmetically correct. Cross-era
comparison remains a subject of intense interest, and the debate over
average-player skill rages on. Fielding and relieving, as discussed, also
provide fertile ground for invention.
Fantasy baseball aficionados seem caught up in the competition and
deal-making (as well as player evaluation), but some of the newsletters, such
as John Benson's, provide sound analysis and trend-spotting tips. It would not
be surprising if Rotisserie-type leagues, rather than SABR, furnished the best
sabermetricians of the 1990s. See also, in the Appendix, Gary Gillette's
article, "Baseball, Computers, and New Statistics."