$Unique_ID{BAS00190}
$Pretitle{}
$Title{Statistics: Introduction}
$Subtitle{}
$Author{}
$Subject{Registers Leaders Rosters Introduction statistic Statistics
statistical stat stats Chadwick Origins Flowering Golden Computer Age
Batting Base Stealing Fielding Pitching Errors Controversies Sources Missing
Incomplete data}
$Log{}
Total Baseball: Registers, Leaders, and Rosters
Statistics: Introduction
Part Two, the statistical section of Total Baseball, presents the record of
major league contests played from 1871 through 1993--all 158,982 of them. It
details the accomplishments of the game's 2,153 teams and 14,052 players more
completely and more accurately than any other encyclopedic work; it applies to
all of baseball's glorious past the "sabermetric" stats that fans first
embraced in the 1980s; it introduces original measures of player performance.
Yet for all its innovation, Total Baseball stands squarely in the tradition of
baseball record keeping; it is--like each new spring of our national
pastime--a link in a long, long chain. As the game of one hundred and fifty
years ago lives on in the game of today, so is this volume enriched by the
labors of statisticians from Henry Chadwick to Ernie Lanigan, from S.C.
Thompson to David Neft to Bill James.
The Origins, 1845-1875
In fact, baseball and stats were a tandem from the outset of the game's
history, as the editors of this volume first discussed in their earlier Hidden
Game of Baseball (1984), from which portions of this introduction are adapted.
The first box score appeared in the New York Morning News on October 22, 1845,
just a month after Alexander Cartwright and his Knickerbocker teammates
codified the first set of rules. Why did these early players and scribes
measure individual performance rather than simply count the score? In part to
imitate the custom of cricket; yet the larger explanation is that the numbers
served to legitimize men's concern with a boys' pastime. The pioneers of
baseball reporting--William Cauldwell of the Sunday Mercury, William Porter of
Spirit of the Times, the unknown annalist at the News, and later Henry
Chadwick--may indeed have reflected that if they did not cloak the game in the
"importance" of statistics, it might not seem worthwhile for adults to read
about, let alone play. Statistics elevated baseball from other boys' field
games of the 1840s and '50s to make it somehow "serious", like business; its
essential simplicity was adorned with intricate detail that suited it
perfectly to quantification.
In the development of baseball statistics, no man is more important than
Father Chadwick. Born in England in 1824, he came to these shores at age
thirteen steeped in the tradition of cricket. In his teens he played the
English game and in his twenties he reported on it for a variety of
newspapers, including the Long Island Star and the New York Times. In the
early 1840s, before the Knickerbocker rules eliminated the practice of
retiring a base runner by throwing the ball at him rather than to the base,
Chadwick occasionally played baseball too, but he was not favorably impressed,
having received "some hard hits in the ribs." Not until 1856, by which time
he had been a cricket reporter for a decade, were Chadwick's eyes opened to
the possibilities in the American game, which had improved dramatically since
his youth. In 1868 he recalled, "On returning from the early close of a
cricket match on Fox Hill, I chanced to go through the Elysian Fields during
the progress of a contest between the noted Eagle and Gotham clubs. The game
was being sharply played on both sides, and I watched it with deeper interest
than any previous ball game between clubs that I had seen. It was not long
before I was struck with the idea that baseball was just the game for a
national sport for Americans . . . as much so as cricket in England. At the
time I refer to I had been reporting cricket for years, and, in my method of
taking notes of contests, I had a plan peculiarly my own. It was not long,
therefore, after I had become interested in baseball, before I began to invent
a method of giving detailed reports of leading contests at baseball . . ."
Thus Chadwick's cricket background was largely the impetus to his method
of scoring a baseball game, the format of his early box scores, and the
copious if primitive statistics that appeared in his year-end summaries in the
New York Clipper, Beadle's Dime Base-Ball Player, and other publications.
Actually, cricket had begun to shape baseball statistics even before
Chadwick's conversion. The first box score reported on two categories, outs
and runs: outs, or "hands out," counted both unsuccessful times at bat and
outs run into on the basepaths; "runs" were runs scored, not those driven in.
The reason for not recording hits in the early years, when coverage of
baseball matches appeared alongside that of cricket matches, was that, unlike
baseball, cricket had no such category as the successful hit which did not
produce a run. To reach "base" in cricket is to run to the opposite wicket,
which tallies a run; if you hit the ball and do not score a run, you have been
put out.
Cricket box scores were virtual play-by-plays, a fact made possible by
the lesser number of possible events. This play-by-play aspect was applied to
a baseball box score as early as 1856; interestingly, despite the abundance of
detail, hits were not accounted, nor did they appear in Chadwick's own box
scores until 1867. The batting champion as declared by Chadwick, whose
computations were immediately and universally accepted as "official," was the
man with the highest average of runs per game. An inverse though imprecise
measure of batting quality was outs per game. After 1863, when a fair ball
caught on one bounce was no longer an out, fielding leaders were those with
the greatest total of fly catches, assists, and "foul bounds" (fouls caught on
one bounce). Pitching effectiveness was based purely on control, with the
leader recognized as the one whose delivery offered the most opportunities for
outs at first base and led to the fewest passed balls.
In a sense, Chadwick's measuring of baseball as if it were cricket can be
viewed as correct in that when you strip the game to its basic elements, those
that determine victory or defeat, outs and runs are all that count in the end.
No individual statistic is meaningful to the team unless it relates directly
to the scoring of runs. Chadwick's blind spot in his early years of baseball
reporting lay in not recognizing the linear character of the game, the
sequential nature whereby a string of base hits or men reaching base on error
(there were no walks then) was necessary in most cases to produce a run. In
cricket each successful hit must produce at least one run, while in baseball,
more of a team game on offense, a successful hit may produce none.
Early player stats were of the most primitive kind, the counting kind.
They'd tell you how many runs, or outs, or fly catches had occurred--later,
how many hits or total bases. Counting is the most basic of all statistical
processes; the next step up is averaging, and Chadwick was the first to put
this into practice.
As professionalism infiltrated the game, teams began to bid for
star-caliber players. Stars were known not by their stats but by their style
until 1865, when Chadwick began to record in the Clipper a form of batting
average taken from the cricket pages--runs per game. Two years later, in his
newly founded baseball weekly, The Ball Players' Chronicle, he began to record
not only average runs and outs per game, but also home runs, total bases,
total bases per game--and hits per game. The averages were expressed not with
decimal places but in the standard cricket format of the "average and over."
Thus a batter with 23 hits in 6 games would have an average expressed not as
3.83 but as "3-5"--an average of 3 with an overage, or remainder, of 5.
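For readers who want to check the arithmetic, the "average and over" is simple integer division with a remainder; a minimal sketch in Python (the function name is ours, for illustration only):

    def average_and_over(hits, games):
        # Cricket-style presentation: whole-number average plus the remainder ("over")
        average, over = divmod(hits, games)
        return f"{average}-{over}"

    print(average_and_over(23, 6))   # "3-5", the cricket rendering of 3.83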
Another innovation was to remove from the individual accounting all bases
gained through errors. Runs scored by a team, beginning in 1867, were divided
between those scored after a man reached base on a clean hit and those arising
from a runner's having reached base on an error. This was, of course, a
precursor of today's earned run average.
In 1868, despite Chadwick's derision, the Clipper continued to award the
prize for the batting championship to the player with the greatest average of
runs per game. Actually, the old yardstick had been less preposterous a
measure of batsmanship than one might imagine today, because team defense was
so much poorer and the pitcher, with severe restrictions on his method of
delivery, was so much less important. If you reached first base, whether by a
hit or by an error, your chances of scoring were excellent; indeed, teams of
the mid-1860s registered more runs than hits! By 1876, the caliber of both
pitching and defense had improved to the extent that the ratio of runs to hits
was about 6.5 to 10; today the ratio stands at roughly 5 to 10.
By the end of the decade Chadwick was recording total bases and home
runs, but he placed little stock in either, as conscious attempts at slugging
violated his cricket-bred image of "form". Just as cricket aficionados watch
the game for the many opportunities for fine fielding it affords, so was
baseball from its inception perceived as a fielders' sport. The original
Cartwright rules of 1845, in fact, specified that a ball hit out of the
field--in fair territory or foul--was a foul ball! "Long hits are showy,"
Chadwick wrote in the Clipper in 1868, "but they do not pay in the long run.
Sharp grounders insuring the first-base certain, and sometimes the second-base
easily, are worth all the hits made for home-runs which players strive for."
Chadwick prevailed, and hits per game became the criterion for the
Clipper batting championship and remained so until 1876, when the problem with
using games as the denominator in the average at last became clear. If you
were playing for a successful team, and thus were surrounded by good batters,
or if your team played several weak rivals who committed many errors, the
number of at-bats for each individual in that lineup would increase. The more
at-bats one is granted in a game, the more hits one is likely to have. So, for
example, if Player A had 10 at-bats in a game, which was not so unusual in the
1860s, he might have 4 base hits. In a more cleanly played game, Player B
might bat only 6 times, and get 3 base hits. Yet Player A, with his 4-for-10,
would achieve an average of 4.00; the average of Player B, who went 3-for-6,
would be only 3.00. By modern standards, of course, Player A would be batting
.400 while Player B would be batting .500.
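The distortion is easy to verify; a minimal sketch in Python contrasting the old per-game figure with the modern per-at-bat figure (the function names are ours):

    def hits_per_game(hits, games):
        return hits / games        # the Clipper's criterion into the 1870s

    def batting_average(hits, at_bats):
        return hits / at_bats      # the modern criterion, official from 1876

    # Player A: 4-for-10 in one game; Player B: 3-for-6 in one game
    print(hits_per_game(4, 1), batting_average(4, 10))   # 4.0 and .400
    print(hits_per_game(3, 1), batting_average(3, 6))    # 3.0 and .500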
In short, the batting average used in the 1860s is the same as that used
today except in its denominator, with at-bats replacing games. Moreover,
Chadwick created a measure in the 1860s that divided total bases by games
played; change the denominator to at-bats and you have today's slugging
average--which, incidentally, was not accepted by the National League as an
official statistic until 1923 and by the American until 1946 (baseball was
born and bred conservative).
Chadwick's "total bases average" represents the game's first attempt at a
weighted average--an average in which the elements collected together in the
numerator or the denominator are recognized numerically as being unequal. In
this instance, a single is the unweighted unit, the double is weighted by a
factor of two, the triple by three, and the home run by four. Statistically,
this is a distinct leap forward from, first, counting, and next, averaging.
The weighted average is in fact the cornerstone of today's statistical
innovations, or "sabermetrics."
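Chadwick's total bases average and the modern slugging average differ only in the denominator; a minimal sketch of the weighting just described (the function names are ours):

    def total_bases(singles, doubles, triples, homers):
        # The weighted numerator: 1B x 1, 2B x 2, 3B x 3, HR x 4
        return singles + 2 * doubles + 3 * triples + 4 * homers

    def total_bases_average(singles, doubles, triples, homers, games):
        # Chadwick's 1860s measure, divided by games
        return total_bases(singles, doubles, triples, homers) / games

    def slugging_average(singles, doubles, triples, homers, at_bats):
        # The modern measure, divided by at-bats (official NL 1923, AL 1946)
        return total_bases(singles, doubles, triples, homers) / at_bats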
The 1870s gave rise to some new batting stats and to the first attempt to
quantify thoroughly the other principal facets of the game, pitching and
fielding. Although the Clipper recorded base hits and total bases as early as
1868, a significant wrinkle was added in 1870 when at-bats were listed as
well. This was a critical introduction because it permitted the improvement of
the batting average, first introduced in its current form by H.A. Dobson of
Washington, D.C., in the Dime Base-Ball Player of 1872, and first computed
officially--that is, for the National League--in 1876. Since then the batting
average has not changed, except for 1876, when bases on balls were figured as
outs, and 1887, when they were counted as hits. Total Baseball counts a walk
as neither an at-bat nor an out for all years since 1871.
The objections to the batting average are well known, but to date have
not dislodged it from its place as the most popular measure of hitting
ability. First of all, the batting average makes no distinction between the
single, the double, the triple, and the home run, treating all as the same
unit. This objection had been addressed in 1868 by Chadwick's total bases
average.
Second, it gives no indication of the effect of that base hit--that is,
its value to the team. This was the reason Chadwick clung to runs per game as
the best possible batting measure. Third, the batting average does not take
into account those occasions when first base is reached via a walk, hit by
pitch, or error. This last point was addressed at a surprisingly early date,
too, as for 1879 the National League adopted as an official statistic a
forerunner of the on-base percentage; it was called "reached first base,"
which included times reached by error as well as base on balls and base hits.
(Being hit by a pitch did not give the batter first base until 1884 in the
American Association, 1887 in the National League.)
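The modern descendant of "reached first base" is the on base percentage; a minimal sketch of the present-day calculation (unlike the 1879 statistic, it excludes times reached on error and is expressed as a rate rather than a season total):

    def on_base_percentage(hits, walks, hit_by_pitch, at_bats, sac_flies):
        # Times on base divided by plate appearances (sacrifice bunts excluded)
        return (hits + walks + hit_by_pitch) / (at_bats + walks + hit_by_pitch + sac_flies)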
The Flowering, 1876-1920
Ever since the Civil War, serial guides like Beadle and DeWitt and
sporting columns like those in the Clipper had carried year-end tabulations of
batting, fielding, and pitching exploits, varying from year to year with the
brainstorms of Chadwick or other demon compilers like New York's M.J. Kelly or
Philadelphia's Al Wright. But the year 1876 was special. It was significant
not only for the founding of the National League and the official debut of the
batting average in its current form, it was also the Centennial of the United
States, which was marked by a giant exposition in Philadelphia celebrating the
mechanical marvels of the day. American ingenuity reigned, and technology was
seen as the new handmaiden of democracy. Baseball, that mirror of American
life, reflected the fervor for things scientific with an explosion of
statistics far more complex than those seen before, particularly in the
previously neglected areas of pitching and fielding. The increasingly minute
statistical examination of the game met a responsive audience, one primed to
view complexity as a measure of worth.
The crossroads year of 1876 highlights how the game had changed to that
point, as well as how it has changed since.
In that year, the number of offensive stats tabulated at season's end in
any of the publications inspired by Chadwick or Spalding was six: games,
at-bats, runs, hits, runs per game, and batting average. (And as with all the
various guides until 1941, the stats of men who played in fewer than a
specified minimum number of games were not noted.) Of these six, only runs
and runs per game were common in the 1860s, while that decade's tabulation of
total bases vanished. The number of offensive stats a hundred years later?
Twenty. (Today the number is twenty-one, with the addition of on base
percentage.)
The number of pitching categories in 1876 was eleven, and there were some
surprises, such as earned run average, hits allowed, hits per game, and
opponents' batting average. Strikeouts were not recorded, for Chadwick saw
them strictly as a sign of poor batting rather than good pitching (his view
had such an impact that pitcher strikeouts were not kept officially until
1889). The number of pitching stats today? Twenty-four.
The number of fielding categories in 1876 was six. One hundred years
later it was still six (with the exception of the catcher, who gets a seventh:
passed balls), dramatizing how the game, which originated as a showcase for
fielders, had changed. The fielding stats of 1876 lumped "battery errors" with
fielding errors, so that wild pitches and passed balls--in some years, even
walks--diminished one's fielding percentage. This practice continued until
1887, but in Total Baseball battery errors are not included in fielding stats.
Battery-mates' fielding stats were boosted by the awarding of an assist to the
pitcher on strikeouts. This practice lasted until 1889, but is not reflected
in Total Baseball.
The custom in 1876, as it is now, was to combine putouts, assists, and
errors to form a "percentage of chances accepted," or what is today known as
fielding average or fielding percentage. A "missing link" variant, devised by
Al Wright in 1875, was to form averages by dividing the putouts by the number
of games to yield a "putout average"; dividing the assists similarly to arrive
at an "assist average"; and dividing putouts plus assists by games to get
"fielding average." These averages took no account of errors. (Wright's
"fielding average" was reborn a century later as Bill James' Range Factor.)
The public's appetite for new statistics was not sated by the outburst of
1876. New measures were introduced in dizzying profusion in the remaining
years of the century. Some of these did not catch on and were soon dropped for
all time, like the ridiculous "total bases run," while others fizzled only to
reappear with new vigor in the twentieth century. These include (a) the
above-mentioned "reached first base," which resurfaced in the early 1950s in
an unofficial, improved form called on base percentage and became an official
stat more than thirty years later, and (b) an 1860s stat, earned run average,
which was periodically revived before dropping from sight in the 1880s, only
to return triumphant to the NL in 1912 and the AL in 1913. In 1913 Ban Johnson
not only proclaimed the ERA official but became so enamored with it that he
also instructed American League scorers to compile no official won-lost
records (this state of affairs lasted for seven years, 1913-1919).
Another stat that was "sent back to the minors" before its eventual
adoption as an official stat in 1920 was the run batted in. Introduced by a
Buffalo newspaper in 1879, the stat was picked up the following year by the
Chicago Tribune and even became an official NL stat for the opening months of
1891. By season's end it had faded as most NL scorers declined to account for
it in their summaries. (The American Association, however, recorded it all year
long.) Ernie Lanigan picked up the RBI baton with his reports to the New York
Press in 1907, but only about a third of his data has been found, and he did
not figure RBIs for men who played in fewer than ten games, or club totals for
traded players. For Total Baseball we have placed much reliance upon the
source material donated by Information Concepts, Inc. (ICI) to the National
Baseball Library in Cooperstown following publication of its Baseball
Encyclopedia for Macmillan in 1969. David Neft also kindly supplied us with
his unpublished RBI data for the previously missing National League seasons of
1880-1885. The John Tattersall collection of nineteenth century game accounts
and box scores was valuable as well.
Other statistics introduced officially before the turn of the century
were stolen bases (though not caught stealing); doubles, triples, and homers;
and sacrifice bunts (though an at-bat was charged from 1889 through 1893).
Pitcher strikeouts, bases on balls, and the hit-by-pitch also appeared before
1900, but hit-by-pitch stats were not kept for batters on a systematic basis
until 1917 in the NL and 1920 in the AL. Through newspaper research, we have
filled in HBP data from 1884 through 1916 in the National League, Players
League, and American Association, from 1901 through 1919 in the American
League, and the 1914-1915 Federal League.
Hit into double play--including line outs as well as groundouts--was
recorded erratically in the nineteenth century, but separate stats for
groundouts into double plays have been kept by the leagues only since 1933 in
the NL and 1939 in the AL. Batters' strikeouts were reported unofficially in
1891, but not as a league stat until 1910 in the NL and 1913 in the AL.
Innings pitched were not kept until 1908 in the AL and 1903 in the NL.
Stolen bases were awarded not only for clean steals but also for extra
bases taken through daring, from the first year in which totals were kept,
1886, until 1898 (the Macmillan Baseball Encyclopedia begins its record of
stolen bases with 1887). Because the figures reported in the guides were
grossly inflated (such as Harry Stovey's ostensible 156 steals in 1888), the
figures in Total Baseball reflect game-by-game research and refiguring.
Caught-stealing (CS) figures are available on a very sketchy basis in some of
the later years of the century, as some newspapers carried the data in the box
scores of hometown games. From 1912 on, Lanigan recorded CS in box scores of
the New York Press, but the leagues did not keep the figure officially until
1920. The AL has tabulated CS from that year to the present, excepting 1927,
which members of the Society for American Baseball Research reconstructed from
newspaper box scores. National League caught-stealing data exists for
1920-1925, and for 1951 to the present.
The new century added little in the way of new official statistics--ERA,
RBI, and slugging average are better regarded as revivals despite their
respective adoption dates of 1912, 1920, and 1923. But back in 1908 there was
a classic case of a statistic rushing in to fill a void, as Phillies' manager
Billy Murray observed that his outfielder Sherry Magee had the happy facility
of providing a long fly ball whenever presented with a situation of a man on
third and fewer than two outs. Taking up the cudgels on his player's behalf,
Murray protested to the National League office that it was unfair to charge
Magee with an unsuccessful time at bat when he was in fact succeeding, doing
precisely what the situation demanded.
Murray won his point, but baseball flip-flopped a couple of times on this
stat, in some years reverting to calling it a time at bat, in other years not
even crediting an RBI. The sacrifice-fly rule was in effect from 1908 through
1930, with a sacrifice being given for advancing any runner, not just to home,
for the final four years of this period. The rule was revived for one year in
1939. In none of these years was a distinction made between a sacrifice bunt
and a sacrifice fly. When the rule came back into force in 1954, there was a breakdown of
each.
More recent stats that have followed from this sort of perception--that
something important was occurring on the field which had no verifiable reality
because it was not yet being measured--are the save and the late, lamented
game-winning RBI, which will be discussed later.
A signal event took place in 1912: the publication by Baseball Magazine
editor John Lawres of Who's Who in Baseball, a small book that became the
first to provide career statistics and personal facts for a group of players.
Although thoroughly inadequate by today's standards--its only tabulations were
games, batting average, and fielding average (even for pitchers, who were
given no mound records!)--Who's Who was a groundbreaking work, giving rise to
a much-expanded format in 1916 and inspiring two other significant
encyclopedic works: in 1914, George Moreland's self-published opus called
Balldom (grandiosely subtitled "The Britannica of Baseball," which it surely
wasn't), and Ernest J. Lanigan's Baseball Cyclopedia, also sponsored by
Baseball Magazine, which debuted in 1922 and was updated annually through
1933.
The Golden Age, 1920-1968
There have been other new statistical tabulations in this century, but
generally of the counting sort: complete games (NL 1910, AL 1922), games
started (AL 1926, NL 1938), games finished (NL 1920, AL 1926). And there were
sacrifice bunts allowed (NL 1916, AL 1922), intentional bases on balls (only
since 1955), and, in the next period, saves (1969) and game-winning RBIs
(1980). The only new average since slugging average was adopted in 1923 has
been the on base percentage, adopted in 1984. The ICI group computed saves for
prior years. Another such stat that failed to survive, alas, was stolen bases
off pitchers, which the American League recorded only in 1920-1924; it has
been recorded on an unofficial basis in the 1980s by the Elias Sports Bureau
and Project Scoresheet. The only new fielding measure was team double plays,
added to the AL list in 1912 and the NL in 1919. Other new and more
interesting stats appeared in the 1940s and '50s but have not yet gained the
official stamp of approval, such as Ted Oliver's Weighted Rating System,
Alfred P. Berry's Average Bases Allowed (opponents' slugging average), and
Branch Rickey and Allan Roth's Isolated Power.
This period of baseball's history may have fielded its most dazzling
array of stars, but strategically and statistically it was pretty dim. There
was some excitement, however, in baseball record keeping. First came
Daguerreotypes, issued by The Sporting News in 1934, featuring the playing
records of many retired players both celebrated and obscure; most if not all
of these statistical and biographical profiles originally appeared in the
pages of TSN. Although its number of statistical categories was fewer than one
might have wished, Daguerreotypes was very useful and, through its several
editions ably edited by Paul MacFarlane, long-lived.
In 1940 came The Sporting News' Baseball Register, which supplied full
records for active players, managers, coaches, and umpires, plus a grab bag of
former stars. Since the expansion of the major leagues from sixteen teams to
twenty-six, the Register has only accommodated contemporary players and
managers, but it remains a valuable source. One year later, TSN issued a
notable edition of its Official Baseball Record Book, giving for the first
time full statistical lines for all men who played in a major league game the
previous year.
In 1944 a little-known man named Ted Oliver published in obscurity a
booklet called Kings of the Mound. It introduced a new stat called the
Weighted Rating System for pitchers, a stat that we modified and continue to
employ as Wins Above Team. Moved by the inadequacy of both the won-lost
percentage and the ERA to reflect the value of a decent pitcher laboring for a
lousy club, Oliver ingeniously subtracted the pitcher's decisions from his
team's, then took the difference between the pitcher's won-lost percentage and
his team's and multiplied that difference by the pitcher's number of
decisions. Although his concept and his math were flawed, his
principles--viewing a pitcher's record in relation to his team and weighting
the result of his calculation by the number of decisions--were of unparalleled
sophistication for the time. (Oliver's formula for his Weighted Rating System
and its modification in Total Baseball are detailed in the Glossary, as are
the calculations behind every statistic employed in this book.)
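As a rough guide to Oliver's idea (the exact formula and its Wins Above Team modification are in the Glossary, as noted), his calculation can be sketched as follows; the variable names are ours:

    def oliver_rating(p_wins, p_losses, team_wins, team_losses):
        # Remove the pitcher's own decisions from the team record, then weight
        # his edge (or deficit) in won-lost percentage by his number of decisions.
        decisions = p_wins + p_losses
        rest_wins = team_wins - p_wins
        rest_losses = team_losses - p_losses
        pitcher_pct = p_wins / decisions
        rest_pct = rest_wins / (rest_wins + rest_losses)
        return (pitcher_pct - rest_pct) * decisions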
Then in 1951 came the first true encyclopedia of baseball, the claims of
Moreland and Lanigan notwithstanding. Compiled by Hy Turkin and S.C. Thompson,
The Official Encyclopedia of Baseball was published by the A.S. Barnes
Company. Its 620 pages contained a wealth of features such as manager and
umpire rosters, historical essays, playing tips, a bibliography, and much
more. But the heart of the volume and the key to its subsequent success was a
register of nearly nine thousand men who played one or more games at the major
league level from 1871 through 1949 (the 1950 record of players appearing in
ten games or more was tacked on to the end). In this register, Turkin/
Thompson also offered birth and death data and what today seems fairly limited
statistical information but by previous standards was a veritable cornucopia:
year, club, league, position, games, and batting average or won-lost record. A
landmark volume that did much to inspire this one, The Official Encyclopedia
of Baseball lasted through ten revised editions, the last being published in
1979, ten years after the initial appearance of Macmillan's Baseball
Encyclopedia.
The genesis of the Turkin/Thompson opus was one day in September 1944
when musician Thompson invited his neighbor, New York Daily News sportswriter
Turkin, to "look over his baseball collection." What Turkin saw was a massive
treasure chest of data, collected and collated over twenty years. "Tommy"
Thompson was a baseball nut--a figure filbert, in the parlance of the
time--who researched baseball just for the love of it. He was not alone in
this pursuit, although very nearly so--other baseball archaeologists of the
time who contributed to this encyclopedia were Frank Marcellus, Tom Shea, Lee
Allen, Ralph Lin Weber, Joe Overfield, Bob McConnell, and the aforementioned
Ernie Lanigan.
The Official Encyclopedia of Baseball went a long way toward making the
study of baseball history and records a respectable pursuit, just as a century
earlier the statistical accounting of a boys' game had helped to make baseball
a sport for grown men. The researchers' ranks expanded to include such men as
Bob Davids, who in 1971, aided by other experts like Cliff Kachline, Bill
Haber, Ray Nemec, John Pardon, and Joe Simenic, would create SABR, the Society
for American Baseball Research (pronounced "saber"). Formerly the lonely
pursuit of a handful of "nuts" like S.C. Thompson, baseball research and
sabermetrics--a neologism coined in honor of SABR, signifying the statistical
analysis of the game's records--would become the pastime of thousands.
An article in Life magazine by Branch Rickey on August 2, 1954, gave
further impetus to the study of baseball statistics, but not just to set the
historical record straight. Indeed, this article may be viewed as the opening
shot of the sabermetric assault of the 1980s. In "Goodby to Some Old Baseball
Ideas," Rickey, with the aid of some new mathematical tools supplied by Dodger
statistician Allan Roth, sought to puncture some long-held conceptions about
how the game was divided among its elements (batting, baserunning, pitching,
fielding), who was best at playing it, and what caused one team to win and
another to lose. This is a pretty fair statement of what sabermetrics is
about.
Rickey attacked the batting average and proposed in its place the on base
percentage; advocated the use of Isolated Power (extra bases beyond singles,
divided by at-bats) as a better measure than slugging average; introduced a
"clutch" measure of run-scoring efficiency for teams, and a similar concept
for pitchers (earned runs divided by baserunners allowed); reaffirmed the
basic validity of the ERA; saw the strikeout as the insubstantial thing it
was--and more. But the most important thing Rickey did for baseball statistics
was to pull the discipline back from the wrong path it had taken with the introduction of
the batting average in 1876: to strip the game and its stats to their
essentials and start again, this time remembering that individual stats came
into being as an attempt to apportion the players' contributions to achieving
a team victory, for that is what the game is about.
Rickey and Roth devised a formula to measure a team's efficiency in
turning its offensive and defensive statistics into runs, and thus wins. They
realized, and had confirmed for them by mathematicians at the Massachusetts
Institute of Technology, that just as the team which scores more runs in a
game gets the win, so a team which over the course of a season scores more
runs than it allows should win more games than it loses--and by an extent
correlated to its run differential. From this startlingly simple (or rather,
seemingly simple) observation in 1954 flowed: first, the trailblazing but
little noted work of George Lindsey in the 1950s and early 1960s, when he
developed a model for run-scoring probability from the twenty-four
combinations of outs and bases occupied; the development of "percentage
baseball" stats and strategies by Earnshaw Cook in the 1960s; the play-by-play
analysis of complete seasons by the Mills brothers, Eldon and Harlan, in
1969-1970; and, over the next two decades, the statistical and historical
works of several sabermetricians, most notably Bill James.
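A back-of-the-envelope rendering of the Rickey-Roth insight, using the roughly ten-runs-per-win relationship discussed later in this Introduction (the linear form and the constant are illustrative, not the 1954 formula itself):

    def expected_wins(runs_scored, runs_allowed, games, runs_per_win=10.0):
        # A team with no run differential should play .500 ball; each ten runs
        # of differential is worth roughly one win in the standings.
        return games / 2 + (runs_scored - runs_allowed) / runs_per_win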
The Computer Age, 1969-
Despite the death of Turkin in 1957 and Thompson ten years later, their
Official Encyclopedia of Baseball remained the dominant book of baseball
statistics, although many fans were frustrated with the fragmentary records it
presented. As Frank V. Phelps wrote in the 1987 edition of The National
Pastime, "Gaps and obvious errors in official averages, the lack of many early
records, difficulty in securing the records of players who appeared in only a
few games, and frustrating discrepancies among existing guides and registers
had long since created a desire for an ultimate, complete, correct set of
major league records. But it wasn't until the mid-1960s that the development
of sophisticated computers which could absorb, retain, order, and output huge
amounts of data finally made a project feasible."
Beginning in 1967, a battalion of researchers commanded by David Neft
foraged through the official records and newspaper box scores to provide
freshly compiled figures for those who had no ERAs, RBIs, slugging averages,
saves, and all manner of wonderful things. The material which finally appeared
in the tome was entered into a data bank, and the book was the first to be typeset
entirely by computer, now a common practice. Published in 1969, The Baseball
Encyclopedia was a milestone in computer technology, but as indispensable as
the computer were the old-fashioned scrapbooks and files of Lee Allen and John
Tattersall. The result was a mammoth ledger book of the major leagues more
thorough than any that had appeared before.
The Baseball Encyclopedia researchers not only found new data to correct
old inaccuracies but also applied new yardsticks to men who had gone to their
graves never having heard of an RBI or a save. They also raised the hackles of
traditionalists with many of their findings, which prompted the formation of a
Special Baseball Records Committee. Its members ruled upon such matters as
whether, for the historical record, bases on balls should be counted as hits
(as they were in 1887), outs (as they were in 1876), or neither (as has been
the practice in all other years); or whether "sudden-death" home
runs--thirty-seven game-winning blows with men on base that they identified as
having occurred in the bottom half of the ninth or extra inning--would be
credited as homers or, in the practice before 1920, would count for only as
many bases as needed to push across the winning run. In the latter
controversy, committee members first decided to count the disputed blows as
homers, but then, when complaints arose that Babe Ruth's famous total of 714
would change to 715, they reversed themselves. They decided that the National
Association of 1871-1875 was not a major league, while the Federal League,
Union Association, and Players League were; and they ruled on several other
issues, all of which were published in the Appendix to The Baseball
Encyclopedia.
In Total Baseball, we have abided by most of the committee's
decisions--not to preserve Ruth's total, but because there were many more such
homers before 1920 than the thirty-seven the committee identified, and the
disputes surrounding some of them are now beyond settling. We have, however,
treated the National Association as a major league, as Turkin/Thompson and
all previous record books did, and in accordance with the views of most
historians. And we have differed from the committee's ruling on awarding
pitchers wins and losses in the years before 1920. Not finding any official
scoring rule or practice for that time, they chose to apply 1950 guidelines to
decisions awarded in 1876-1920. This well-intentioned decision produced
substantial alterations in the records of such hurlers as Cy Young, Christy
Mathewson, Grover Alexander, and others. In the ensuing years, the notable
research of Frank Williams (reported in "All the Record Books Are Wrong," The
National Pastime, 1982) revealed that there was indeed a pattern and a
rationale for the way decisions were awarded in those days; the data in Total
Baseball conforms with his findings.
ICI research created new stars, launching several previously
underappreciated heroes of old into the Hall of Fame. Sam Thompson, Addie
Joss, Roger Connor, Amos Rusie--their phenomenal level of play was hidden
simply because statisticians back then were not recording the particular
numbers which would show them off to best advantage. If sabermetrics consists
of finding things in the existing data that were not seen before, or
collecting that data which makes possible the application of new statistics to
old performances, the first edition of The Baseball Encyclopedia was a
monument in the course of sabermetrics.
However, its subsequent editions declined from that standard, dropping
valuable data, jimmying figures for star players in a misguided homage to
tradition, and making a shambles of individual/team balance in the totals. As
Phelps wrote of the second edition, edited by Joseph L. Reichler for the
Macmillan Company after the ICI group broke up and relinquished supervision:
"Players' batting statistics were changed without compensating for
changes in the records of other players on the same teams or in the
corresponding team and league totals. Later editions included even more
unbalanced adjustments . . .
"Quite apart from the problem of record-balancing, the numerous changes
in players' totals and averages has caused serious misapprehensions and
confusions for fans, writers, and researchers. The records of Fred Clarke and
Cy Young differ in all six editions [to 1987] even without counting Clarke's
astronomical 1899 BA [in the third edition, Clarke was credited with a batting
average of .986 that boosted his lifetime mark by 15 points]. The figures for
Burkett, Chesbro, Duffy, Hornsby, Walter Johnson, Radbourn, Speaker, and
Waddell differ in five of the six books. The same is so in four of six for at
least twenty-three other Hall of Famers, and many more less gifted players."
The seventh edition was issued in 1988 and, like the five that preceded it,
was less accurate than the classic first issue. The eighth edition, published
in 1990, corrected many of the errors in the seventh but--perhaps because of
its marketing link with Major League Baseball--retained many once-contested
errors that historians had long since expunged from the record, while changing
other statistics in a manner at variance with Major League Baseball's wishes.
For the ninth edition, MLB withdrew its product endorsement. (David Neft of
ICI, along with Baseball Encyclopedia staff alumni Dick Cohen and Jordan
Deutsch, went on to form Sport Products, Inc. Since 1974 they have issued the
excellent Sports Encyclopedia: Baseball, which has endured as the baseball
reference of choice for thousands of sophisticated fans.)
We will have more to add about accuracy and balance in the "Errors and
Controversies" section of this Introduction.
There were two other interesting developments in 1969. The first and less
celebrated was a research project launched by Eldon and Harlan Mills that,
like the ICI encyclopedia, could not have been contemplated without the
computer. The Mills brothers tracked the entire major league seasons of 1969
and 1970 on a play-by-play basis. Then they applied to that record the
probabilities of winning which derived from each possible outcome of a plate
appearance, as determined by a computer simulation incorporating nearly eight
thousand possibilities. What, for example, was the visiting team's chance of
winning the game before the first pitch was thrown? Fifty percent, if we are
pitting two theoretical teams of equal or unknown ability on a neutral site.
If the first man fails to get on base, the chances of the visiting team
winning are reduced to 49.8 percent; should he hit a double, the visiting
team's chance of victory is raised to 55.9 percent, as determined by the
probabilistic simulation. Every possible situation--combining half inning,
score, men on base, and men out--was tested by the simulator to arrive at "Win
Points."
The Millses' purpose was to determine the clutch value of, say, hitting a
homer with two men on and one man out in the bottom of the ninth, with the
team trailing by two runs, the situation Bobby Thomson faced in the climactic
National League game of 1951--oddly, the rookie year of the first modern
computer. (It gained for him 1,472 Win Points; had it come with no one on in
the eighth inning of a game in which his team led 4-0, the homer would have
been worth only 12 Win Points.) What the Mills brothers were attempting to do
was to evaluate not only the what of a performance, which traditional
statistics indicate, but the when, or clutch factor, which no statistic to
that time could provide.
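In modern terms the Millses were computing what is now called win probability added. The sketch below shows the bookkeeping only, using the three probabilities quoted above and an arbitrary scaling of probability change into "points"; the actual Mills tables and scaling came from their simulation and are not reproduced here:

    # Visiting team's chance of winning in a few states (figures quoted above);
    # the Millses derived a full table from their computer simulation.
    WIN_PROB = {
        "start of game":               0.500,
        "top 1st, one out, none on":   0.498,
        "top 1st, none out, man on 2": 0.559,
    }

    def win_points(before, after, scale=1000):
        # Change in win probability, scaled into "points" (the scale is our assumption)
        return (WIN_PROB[after] - WIN_PROB[before]) * scale

    print(win_points("start of game", "top 1st, none out, man on 2"))   # about 59 points for a leadoff double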
This project, detailed in a small book issued in 1970 called Player Win
Averages, proceeded from the same impulse that led to other measures of clutch
performance: the game winning RBI, introduced as an official major league stat
in 1980 and scrapped in 1989; the measure of batting performance in
late-inning pressure situations first published by Seymour Siwoff, Steve
Hirdt, and Peter Hirdt of the Elias Sports Bureau in 1985; and the
historically complete indexes of clutch hitting and clutch pitching developed
for this book.
The other noteworthy baseball event of 1969 (besides the centennial of
professional baseball and the miracle of the Mets) was the adoption by the
major leagues of the save, the stat associated with the most significant
strategic development since the advent of the gopher ball. Now shown in the
papers on a daily basis, saves were not officially recorded at all until 1960;
it was at the instigation of Jerry Holtzman of the Chicago Sun-Times, with the
cooperation of The Sporting News, that this statistic was finally accepted.
(Pat McDonough, a founding member of SABR, had developed a similar stat in
1924, which he called "games finished by relief hurlers"; its first
appearance in print came in the New York Telegram three years later.) The need
for the save arose because relievers operated at a disadvantage when it came
to picking up wins. The bullpen specialists were a new breed, and as their
role increased, the need arose to identify excellence, as it had long ago for
batters, starting pitchers, and fielders. The save's prime statistical
drawback is that there is no negative to counteract the positive, no stat for
saves blown (except, all too often, a victory for the "fireman"); unofficial
attempts to develop such a stat have accelerated in recent years, and now are
part of the formula for the Fireman of the Year award.
August 10, 1971, marked another milestone, the founding in Cooperstown of
SABR, the group in whose annual publications most of today's sabermetricians
cut their analytical teeth. Its statistical analysis research committee,
headed for more than a decade by Pete Palmer, has served as a sounding board
for the inventive approaches of such men as Dallas Adams, Dick Cramer, Steve
Mann, Craig Wright, and Bill James.
James published The Baseball Abstract from his home in Lawrence, Kansas,
for five years to a minute if appreciative audience (its 1977 publication
budget: $112.73). In 1982 Ballantine Books, recognizing the increasing
sophistication of baseball fans in the computer age, assumed publication of
the Abstract, and the audience for sabermetrics became sizable indeed, with
James' annuals reaching the bestseller lists and his Historical Baseball
Abstract becoming an essential book for anyone who viewed himself as a serious
fan. James has popularized a different approach to the whole question of what
baseball statistics are for--that they are not brass knuckles to beat a
barroom adversary with, but a means of achieving a better understanding of the
game and heightening one's pleasure in it.
Among the many valuable analytical tools he has developed are the Brock-2
System of projecting career totals, the Victory Important RBI, Offensive and
Defensive Winning Percentages, Secondary Average, Range Factor, and Runs
Created. The last-mentioned, perhaps because James developed it earlier in his
career, is the most widely known, and we apply it in this book to all batters,
in all fourteen variations of the formula, bringing in data for stolen bases,
caught stealing, hit-by-pitch, and grounded into double play for those years
in which it is available. (See the Glossary for the formulas.)
The 1980s also brought attention to another attempt to redefine the
measure of individual performance. In 1978 Barry Codell of Chicago distributed
a paper describing his new statistic, the Base-Out Percentage, to fellow
statisticians and figures in the sports media. At about the same time, Tom
Boswell, not a statistician by trade or inclination but rather a sportswriter
for the Washington Post, developed a stat called Total Average. Like the
Base-Out Percentage, Total Average is a gauge of offensive proficiency which
takes into account not only batting but also base-running skills. (See the
Glossary.)
Dallas Adams and Dick Cramer devoted themselves in the late 1970s to a
discussion of average batting, pitching, and fielding skill, which more than a
decade later remains a subject of intense interest and passionate
disagreement. The question, roughly put, is: How would Cy Young do against the
batters of today? Or Wade Boggs against the pitchers of the 1890s? How many
homers would Babe Ruth hit if he were active today? Or how many strikeouts
would Nolan Ryan have registered in 1880, pitching from a fifty-foot distance?
In other words, how can we adjust the statistics of players to reflect the
certainty that the average batter, pitcher, and fielder have improved over
time, thus narrowing the gap between each succeeding era's peak performance
and its average one? (For more on this philosophically and mathematically
complex subject, we refer you to The Hidden Game of Baseball.)
Adams and Cramer advanced a discussion that had begun in 1976 with the
first article on cross-era comparison, in which David Shoebotham proposed a
new statistic called the Relative Batting Average. Shoebotham recognized that
a .320 batting average in 1893, when the National League batted .280, did not
represent the same level of accomplishment as that average did in 1968 when,
for a number of reasons, the National League batted a measly .243. His
solution? To normalize the players' averages to their respective league
averages simply by dividing the player's batting average by that of his
league.
In this fashion he demonstrated, for example, that Pete Rose, who led the
NL with a .335 BA in 1968, had a Relative BA of 1.38; while Ed Delahanty, who
led the NL with a BA of .380 in 1893, had a Relative BA of only 1.36. Another
way of stating this conclusion is that Rose's .335 was 38 percent above the
average batting performance in the NL of 1968, while Delahanty exceeded his
league's norm by 36 percent. The inferences that might be drawn from this
approach are many: that batting skill has not declined since the days of Ruth,
Gehrig, Foxx, et al., but that pitching skill might have increased; that no
batting average of the years around 1930 ought to be taken without a carload
of salt; that some of the most notable batting performances of all time, as
measured by the batting average, have occurred right under our noses,
unbeknownst to us.
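Shoebotham's normalization requires nothing more than a division; a minimal sketch reproducing the comparison above (the function name is ours):

    def relative_batting_average(player_ba, league_ba):
        return player_ba / league_ba

    print(round(relative_batting_average(.335, .243), 2))   # Rose, 1968 NL: 1.38
    print(round(relative_batting_average(.380, .280), 2))   # Delahanty, 1893 NL: 1.36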
Normalizing a statistic to its league average is a valuable analytical
tool if employed logically. A Relative Batting Average, for example, tells a
good deal more, and tells it more straightforwardly, than Relative Homers or
Relative Strikeouts. The relativist approach works better with ratios such as
batting average, on base percentage, or slugging average--or for that matter
with Runs Created or Total Average--than it does for simple counter stats.
Another worthwhile adjustment to various averages is for home-park
effects. The pioneering work in this area was done by Robert Kingsley,
particularly in regard to why homers flew out of Atlanta's park despite its
"normal" dimensions, but Pete Palmer was first to measure the effects of home
parks on run totals and then to devise a park adjustment for the records of
batters and pitchers. These were discussed in depth in The Hidden Game of
Baseball, and the data base for park factor in this book has been upgraded to
include runs scored and runs allowed at home parks instead of just the latter.
In 1984 the editors of this volume introduced, in The Hidden Game, the
Linear Weights System of assessing players' contributions to their teams--at
the bat, on the basepaths, in the field, or on the mound--in terms of runs,
which are the currency of the game. Its back-to-basics foundation is the same
as that underlying the Rickey formula of 1954 and most of the new statistics
developed since then: that wins and losses are what the game is about; that
wins and losses are proportional in some way to runs scored and runs allowed;
and that runs in turn are proportional to the events which go into their
making.
In the Linear Weights System, these events are expressed not in the
familiar yet deceptive ratios--base hits to at-bats, wins to decisions,
etc.--but in runs themselves, the runs contributed (by batting or base
stealing) or saved (by pitching or fielding). Computer simulations of over
100,000 games produced the run values of, for example: a single (.47 runs),
double (.78), triple (1.09), home run (1.40), walk (.33), steal (.30), caught
stealing (-.60), out (-.25), and out made on base (-.50). Using a
straightforward additive formula, one can calculate a batter or baserunner's
contribution to his team in runs. These would be expressed in terms of runs
contributed beyond what a league-average replacement player could contribute
in his stead, and that average is defined as a baseline of zero. A team
composed entirely of average performers would finish with a record of .500, as
the league must--so each above-average player contributes positive runs toward
a win, and each subpar player contributes negative runs.
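As a sketch of the additive bookkeeping, using the run values quoted above (the published formulas also build in the league-average normalization described in the next paragraph; see the Glossary):

    # Run values quoted above; the Glossary carries the full, league-adjusted formulas.
    RUN_VALUES = {"single": .47, "double": .78, "triple": 1.09, "homer": 1.40,
                  "walk": .33, "steal": .30, "caught_stealing": -.60, "out": -.25}

    def batting_runs(events):
        # events maps event names to counts, e.g. {"single": 120, "out": 400, ...};
        # the sum is the player's contribution in runs beyond a league-average player.
        return sum(RUN_VALUES[name] * count for name, count in events.items())

Dividing the result by roughly ten, as discussed below, converts runs above average into wins above average.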
Normalizing factors (to league average) are built into the formulas for
all but base stealing, where league average is not a shaping force; these
factors enable us to compute, for example, the number of runs (Batting Runs)
that Cecil Fielder provided in 1990 beyond those an average hitter might have
produced in an equivalent number of plate appearances. And by adjusting
Fielder's Batting Runs for Detroit's homepark influences, the Linear Weights
comparison may be extended to how many runs he accounted for beyond what an
average player might have produced in the same number of at-bats had he too
played half his games in Tiger Stadium.
Furthermore, having determined the number of runs above average required
to transform a loss to a win in the final standings (generally around ten,
historically in the range of nine to eleven; for more on the theory behind
this, see the Glossary), we can convert a player's Linear Weights
record--expressed as Batting Runs, Base Stealing Runs, Pitching Runs, or
Fielding Runs--to the number of wins above average he alone contributed. What
are individual statistics for if not to achieve some understanding of this?
Last, by reviewing the win contributions of all a team's personnel, we may
establish a solid assessment of that team's strength and weaknesses--either to
predict a team's chances for success in the upcoming season or, in an
encyclopedia like Total Baseball, to analyze how and even why it failed its
reasonable statistical expectations or exceeded them.
Formulas for the Linear Weights measures for batting, baserunning,
fielding, and pitching will be found in the Glossary.
Other developments of the decade include the previously mentioned
adoption of the game-winning RBI (GWRBI) in 1980; it credited the batter who
drove in a run to give his club a lead that it never relinquished. This stat
was pilloried in the press from its introduction, with merit, until Major
League Baseball finally gave up on it before the 1989 season. In 1984 on base
percentage was made official, thirty years after its introduction to the
general baseball public by Branch Rickey and Allan Roth. Subsequent years
brought the Quality Start, which takes note of a pitcher who gives his club
six innings or more while allowing three runs or less. Under this
construction, an ERA of 4.50 in a mercifully shortened outing is held to be
commendable. The editors of this book do not regard the Quality Start as a
quality stat.
More interesting are the situational stats which are the specialty of
the Elias Sports Bureau and Project Scoresheet--performance in day games vs.
night, grass vs. artificial turf, lefty vs. righty, day game following night,
bases-loaded situations, and so on. When the data is drawn from a large enough
sample, these stats can be provocative and meaningful; too often, however,
television announcers desperate to maintain conversation flow will burden
their listeners with something like, "Over the last two seasons, he's batted
.375 against this guy" (not bothering to add that the figure represents three
hits in eight times at bat). Situational stats are the wave of the future in
baseball, but are not yet of much use for reviewing the past--Elias has kept
them systematically only since 1975.
Total Baseball
The next major event in the history of baseball record keeping is Total
Baseball. Founded upon a unique historical database that Pete Palmer has
cultivated for decades--in the tradition of baseball archivists like S.C.
Thompson, Bradshaw Swales, Leonard Gettelson, and John Tattersall--Total
Baseball is the third-generation encyclopedia of the game. Just as the advent
of the Macmillan/ICI encyclopedia supplanted Turkin/Thompson, the standard
for two decades, Total Baseball has taken advantage of new technology and new
research, notably by members of the Society for American Baseball Research, to
present more accurate data than ever before, and more of it. There are, of
course, the traditional stats one would expect in a baseball reference work;
there are many of the new, more revealing stats discussed above; there are
stats never published before and developed now for this book. And as you have
seen in Part One, there is a recognition that baseball history and knowledge
reside not only in its numbers.
But returning to the statistics and records which make up this second
part of Total Baseball, here is a brief rundown of what's coming (full
descriptions will be found in the separate introduction to each section):
- The Annual Record: Season-by-season standings and records for all teams
since 1871, plus the top five league leaders in generally forty-eight
categories per season.
- The Rosters: A completely revised manager roster, courtesy of some
splendid research into the early years by SABR's Bob Tiemann and Richard
Topp; and the application to all managers of the "actual vs. expected
win" method introduced in the editors' Baseball Annual 1990 (with Eliot
Cohen); a definitive roster of the men in blue, compiled by expert Larry
Gerlach; a roster of coaches, never before compiled; a roster of club
owners and presidents; and a roster of all the black men who played
professionally in the years of segregated ball.
- The Player, Pitcher, and Relief Pitcher Registers: The heart of this
section of Total Baseball, presenting complete seasonal and lifetime
records for every major leaguer, with twenty-three stats for players,
twenty-five for pitchers, and ten for relievers.
- All-Time Leaders: The top one hundred lifetime and single-season
performers in 219 categories, including important conventional stats not
found in other encyclopedias and dozens of the sabermetric variety.
Now that the genealogy of the more significant records and record books
has been described, it's time to say a few words about the measures you'll
find in the main statistical sections of Total Baseball: the annual record and
the player/pitcher registers. We will not attempt to define the basic counting
stats such as games, at-bats, wins, losses, and so on; if these are puzzling
to you, you have picked up the wrong book.
Batting
Let's start with the batting statistics, and the first of these to
consider will be that venerable, uncannily durable fraud, the batting average.
(It consists simply of hits divided by at-bats.) We know as well as anyone
else that this monument just won't topple; the best that can be hoped is that
in time fans and officials will recognize it as a bit of nostalgia, a
throwback to the period of its invention when power counted for naught, bases
on balls were scarce, and no one wanted to place a statistical accomplishment
in historical context because there wasn't much history yet.
Time has given the batting average a powerful hold on the American
baseball public; everyone knows that a man who hits .300 is a good hitter
while one who hits .250 is not. Everyone knows that--no matter that it is not
true. You want to trade Lenny Dykstra for Kevin Mitchell? Willie McGee for
Will Clark? Batting average treats all hits in an egalitarian fashion. A
two-out bunt single in the ninth with no one on base and your team trailing by
six runs counts the same as Bobby Thomson's "shot heard 'round the world." And
what about a walk? Say you foul off four 3-2 pitches, then watch a close one
go by to take your base. Where's your credit for a neat bit of offensive work?
Not in this stat. And a .250 batting average may have represented a distinct
accomplishment in certain years, like 1968 when the American League mean was
.230. That .250 hitter stood in the same relation to an average hitter of his
season as a .282 hitter did in the American League in 1988--or a .329 hitter
in the National League of 1930! If .329 and .282 and .250 all mean roughly
the same thing, it raises questions about the value of the measure.
And yet, the batting champion each year is declared to be the one with
the highest batting average, and this will not soon change. And the Hall of
Fame is filled with .300 hitters who couldn't carry the pine tar of many who
will stay forever on the outside looking in. Knowledgeable fans have long
realized that the ability to reach base and to produce runs are not adequately
measured by batting average, and they have looked to other measures--for
example, the other two components of the Triple Crown, home runs and RBIs.
Still more sophisticated fans have looked to the slugging average or on base
percentage, and in the 1980s to various sabermetric measures.
The slugging average does acknowledge the role of the man whose talent is
for the long ball and who may, with management's blessing, be sacrificing bat
control and thus batting average in order to let 'er rip. (Slugging average is
the number of total bases divided by at-bats.) But the slugging average has
its problems, too. It declares that a double is worth two singles, that a
triple is worth one and a half doubles, and that a home run is worth four
singles. All of these proportions are intuitively pleasing, for they relate to
the number of bases touched on each hit, but in terms of the hits' value in
generating runs, the proportions are wrong. One home run in four at-bats is
not worth as much as four singles, for instance, in part because the total run
potential for the team of four singles is greater, and in part because the man
who hit the four singles did not also make three outs; yet the man who goes
one for four at the plate, that one being a homer, has the same slugging
percentage of 1.000 as a man who singles four times in four at-bats.
Moreover, it is possible to attain a high slugging average without being
a slugger. In other words, if you have a high batting average, you must have a
decent slugging average; it's difficult to hit .350 and have a slugging
percentage of only .400. Even a bunt single boosts not only your batting
average but also your slugging average. (The attempt to counteract this
problem is a statistic called Isolated Power, which divides only extra bases
by at-bats.) Other things the slugging average does not do: indicate how many
runs the hits produced; give credit for other offensive categories, such as
walks, hit-by-pitch, or steals; or permit the comparison of sluggers from
different eras. (If Jimmie Foxx had a slugging percentage of .749 in 1932 and
Mickey Mantle had one of .705 in 1957, was Foxx 7 percent superior? The answer
is no, and the reason lies in the higher slugging average of the AL in 1932.)
Well, how about on base percentage? (To calculate this stat, divide hits,
walks, and hit-by-pitch by at-bats, walks, hit-by-pitch, and sacrifice flies.)
On base percentage has the advantage of giving credit for walks and
hit-by-pitch, but it is an unweighted average and thus makes no distinction
between those two events and, say, a grand-slam homer. A fellow like Eddie
Yost, who, in some years when he hit under .250, drew nearly a walk a game,
gets his credit with this stat as does a Gene Tenace, one of those guys whose
statistical line looks puny without his walks. Similarly, players like Mickey
Rivers or Omar Moreno, leadoff hitters with a lot of speed, no power, and no
patience, are exposed by the OBP as distinctly marginal major leaguers, even
in years when their batting averages look respectable or excellent. In short,
on base percentage does tell you more about a man's ability to get on than
does the batting average, and thus is a better indicator of run generation,
but it's not enough by itself to separate the "good" hitters from the
"average" or "poor" ones.
Not by itself, no . . . but when you add it to slugging average, you
come up with a very powerful indicator of batting ability. These two
one-legged men, when joined together, make for a very sturdy tandem, the
infirmity of the one being almost exactly compensated by the power of the
other. The virtues of on base plus slugging, a combined stat called
Production, are that it is easily computed from officially issued stats and
that it is the most accurate of all the newer stats except those denominated
directly in runs. Its weaknesses are that because it is stated as the sum of
two averages, it is--like a batting average or earned run average or any other
average--a measure of the rate of success rather than the amount, and the fan
needs considerable context to know what it means. Is a Production mark of
.750 poor, average, or outstanding? (Answer: pretty good, because the league
average figure in recent years has been in the low .700s--although in the NL
of 1930 it exceeded .800.)
This second drawback may be eliminated in the same manner for all
averages: by normalizing, or adjusting, each individual performance to the
league average in that category for the year in which it took place. If a
batter's Production was .700 in a year when the league average was .700, he
performed at a rate of 100 (his Production divided by the league's, discarding
the decimal point for ease of expression). If his Production was .800, his
league-adjusted mark would be 114. The meaningfulness of that performance
might be further refined by adjusting it once more, to take into account the
run-producing characteristics of the man's home park: a batter whose home park
was a hitters' haven like Wrigley Field might have his Production adjusted
downward, while another playing half his games in the Astrodome might have his
adjusted upward. In Total Baseball, figures adjusted for league average and
park factor are denoted by "/A" following the raw figure, and the Park Factor
(PF) is expressed with a baseline of 100--a hitter's park might have a factor
of 110, a pitcher's park 90. In this third edition of Total Baseball, we state
Production in the Player Register only in its normalized, park-adjusted form,
here termed "Production+."
RBIs? Don't they indicate run production and clutch ability? Yes and no.
The RBI does tell you something about run-producing ability, but not enough:
it's a situation-dependent statistic, inextricably tied to factors which vary
wildly for individuals on the same team or on others (including, importantly,
the position of each player in the batting order). And the RBI makes no
distinction between being hit by a pitch to drive in the twelfth run of a game
that concludes 14-3 and, again for comparison, the Thomson blast. RBIs tell
how many runs a batter pushed across the plate, all right, but they don't tell
how many fewer he might have driven in had he batted eighth rather than
fourth, or how many more he might have driven in on a team that put more men
on base. They don't even tell how many more runs a batter might have driven in
if he had delivered a higher proportion of his hits with men on base.
The American League kept RBI Opportunities--men on base presented to each
batter--as an official stat for the first three weeks of 1918, then saw how
much work was involved and ditched it. The problem remains: how to assess run
productivity for batters. Pitchers are easier. Their accomplishments are
directly measured in runs allowed. But batters, baserunners, and fielders make
their contributions in the constituent parts of runs--outs, hits, and a
variety of more or less successful other events. (Even a batter who hits a
solo homer contributes more than one run to his team, because he permits
another player to bat who otherwise would not have, and each batter has a
potential for producing further runs.)
You hear a lot in the media about the value of Runs Produced, a stat we
track in Total Baseball in the top five section of the Annual Record. Runs
Produced is simply runs scored plus runs batted in, subtracting homers because
a dinger gives a batter "double credit"--a run scored plus an RBI. The editors
view Runs Produced as an odd linkage of one opportunity-dependent stat with
another that depends upon largely the same factors, but we offer the stat for
those who like that sort of thing.
And so we come to the recently formulated game-winning RBI (GWRBI)--a
noble attempt at describing the value of a hit to the team, its "clutchness,"
but a measure which was misconceived in its presumption that a game could be
won with a hit in the first inning. A man who drives in a run in the first
inning is simply doing his job, not performing an extraordinary feat; if the
pitcher makes that run hold up by throwing a shutout, bully for him, but why
credit the hitter? Were he to drive in the lone run of the game in the seventh
inning or later, that would be different. Nonetheless, the latest formulation
of the stat gave the man who drove in that first-inning run a GWRBI even if
his team eventually won 22-0, since it gave the team a lead that was never
relinquished.
Worse, the GWRBI was situation-dependent to an even greater degree than
the RBI. You can't play for a lousy team and lead the league in GWRBIs because
there aren't enough games won to go around. And it's even harder to accumulate
GWRBIs from the eighth place in the batting order than it is to accumulate
RBIs. Last, if you put your team ahead with an RBI in the bottom of the
eighth, why should you lose your GWRBI simply because the pitcher allows the
lead to be lost? Wasn't your hit "clutch"? If the pitcher allows the score to
be tied, a teammate might pick up the GWRBI that should have been safely
tucked away for you. Nicely motivated, the GWRBI, but utterly without merit,
and so we barred it from the first edition of Total Baseball, even though at
the time we prepared that book it was still an official Major League Baseball
stat. It is no longer.
We do, however, present a measure called Clutch Hitting Index, which
addresses the problem of run-producing opportunities on a historical basis. We
offer this with several reservations, including the classic philosophical one
about whether clutch ability exists at all. Is a man who hits .280 with men on
base and .240 with the sacks clear a hero in the former situation or a bum in
the latter? The Clutch Hitting Index measures actual RBIs over expected RBIs,
which have been calculated on the basis of a man's extra-base hits and the
opportunities he could have been expected to have, based on the average RBIs
per league and where he batted in the lineup and who batted above him. This
is, by admission, a rough measure indeed, but we think it's an interesting
one. We included it in the Player Register in the first edition; since then we
have confined the stat to the Annual Record and Leaders Sections. For teams,
the measure of clutch hitting is more elegant: the ratio of its actual runs to
its runs as calculated by the Linear Weights method.
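For those who want to see the shape of the calculation, a minimal sketch in
Python of the two clutch ratios just described. How expected RBIs are
estimated--from extra-base hits, lineup slot, and league RBI rates--is
detailed in the Glossary; here they are simply taken as given, and the sample
figures are invented.

    # Clutch Hitting Index: actual RBIs over expected RBIs, scaled to 100.
    def clutch_hitting_index(actual_rbi, expected_rbi):
        return round(100 * actual_rbi / expected_rbi)

    # Team clutch: actual runs over runs predicted by Linear Weights.
    def team_clutch_index(actual_runs, linear_weights_runs):
        return round(100 * actual_runs / linear_weights_runs)

    print(clutch_hitting_index(95, 88))   # 108: more RBIs than his chances suggest
    print(team_clutch_index(740, 705))    # 105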
Previously discussed were Runs Created, Total Average, and Batting Runs.
Total Average numbers will tend to look like those of Production, which
measures largely the same things only in a different manner; for this reason
we have removed it from the Player Register in this edition while retaining it
in the Annual Record and Leaders. The numerical expression of Runs Created
exceeds that of Batting Runs because its baseline of zero defines the
worst player in the league rather than the average one.
Base Stealing
Many fans understand, as a result of sabermetric findings of the 1980s,
that a man with a lot of stolen bases is not necessarily the best baserunner,
nor even an asset to his team; he might have been caught nearly as often as he
stole and thus may have cost his team many runs on balance. The game's
encyclopedic reference works have in years past contained stolen base totals,
even if the tabulations for the early years were suspect because of unclear
standards for what differentiated a steal from clever baserunning. What they
have not offered is the flip side of the steal--the caught-stealing numbers
that make sense of the steal itself.
As mentioned above, caught stealing was recorded officially in the AL
beginning in 1920, then was dropped for 1927, was resumed in 1928, and has
been continuously in use ever since. In the NL, it was computed for 1920-1925,
then was dropped until 1951, when it resumed on a continuous basis. We have
figures kept by Ernie Lanigan for the years 1914-1916 in the AL and for
1915-1916 in the NL, and in the second edition we have added caught-stealing
data for 1927 from newspaper accounts (about 90 percent complete). In Total
Baseball we present, for those years in which the data exists, the raw CS
data, Stolen Base Averages, and Stolen Base Runs. This last is expressed in
runs, based on the computer-derived value of .30 runs for a stolen base and
-.60 runs for a thwarted steal. To make a positive contribution to his team, a
base thief must be successful in more than two-thirds of his attempts.
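A minimal sketch in Python of Stolen Base Runs as described above; the sample
base stealers are invented.

    # Stolen Base Runs: +.30 runs per steal, -.60 per caught stealing,
    # which puts the break-even success rate at two-thirds.
    def stolen_base_runs(sb, cs):
        return 0.30 * sb - 0.60 * cs

    print(round(stolen_base_runs(50, 25), 1))   #  0.0 -- exactly break-even
    print(round(stolen_base_runs(60, 15), 1))   #  9.0 -- a genuinely valuable thief
    print(round(stolen_base_runs(40, 30), 1))   # -6.0 -- many steals, negative value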
Fielding
When, back in 1954, Rickey and Roth came up with their "efficiency
formula" for run scoring and run prevention, the defensive half of the
equation was divided into five segments. The first was opponents' batting
average; the second was opponents' reaching base through bases on balls or hit
batsmen; the third was a measure of a pitcher's clutch ability; the fourth was
his strikeout capability; and the fifth was fielding, to which they assigned a
mathematical value of zero. "There is nothing on earth," Rickey declared,
"anyone can do with fielding." Besides, he added, good fielding might account
for the critical run in a ballgame only four or five times a year.
Was Rickey right? The central weakness of the fielding average has long
been known: you can't make an error on a ball you don't touch. To counter this
weakness in fielding average and to credit the plays made as well as the plays
not made, total chances per game is a more useful statistic--and when errors
are deducted from chances, you have a fielder's Range Factor. James pointed
out how absurd it had become, in a time when the best-fielding second baseman
might commit ten errors a season and the worst twenty, to focus on this
difference of ten rather than on the 250-300 in total chances which might
separate the most agile keystoner from the exemplar of Lot's wife.
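The contrast James drew can be put in a few lines of Python; the two second
basemen below are invented, with identical error totals but very different
reach.

    # Fielding average versus Range Factor, per the definitions above.
    def fielding_average(po, a, e):
        return (po + a) / (po + a + e)

    def range_factor(po, a, g):
        return (po + a) / g            # successful chances per game

    print(round(fielding_average(300, 450, 10), 3),
          round(range_factor(300, 450, 150), 2))    # 0.987 5.0
    print(round(fielding_average(260, 390, 10), 3),
          round(range_factor(260, 390, 150), 2))    # 0.985 4.33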
Another difficulty with the fielding average is that to understand what
figure represents mean performance (and thus be able to identify inferior and
superior fielders), one must adjust for position: a shortstop who fields
.980 has done quite well, but a first baseman, catcher, or outfielder with
that figure would have been below average. Thus the fan must bring to the
fielding average a great deal of background knowledge--the mean fielding
average for each position in each season. This is a demand that, on
first reflection, is not created by the batting average (all men stepping to
the plate occupy the same position--batter). On second thought, however, the
knowledgeable fan recognizes that a batting line of .267, 10 HRs, 80 RBIs will
mean different things when applied to a shortstop or to a left fielder. In
other words, just as any evaluation of fielding performance carries an
inherent positional bias, so does batting performance.
High double-play totals are believed to indicate excellence among middle
infielders, but the more double plays a club turns, as a rule, the worse the
pitching. Which teams had the most double plays in major league history? In
the 154-game season, the Philadelphia A's of 1949 and the Los Angeles Dodgers
of 1958; in the 162-game season, Toronto and Boston of 1980 and Pittsburgh of
1966. Of these, only the last-mentioned had a team ERA better than the league
average. If the pitchers are putting a lot of men on base, the team can get a
lot of double plays even without a great-fielding shortstop and second
baseman.
So what to do? How do we assess fielding excellence? The idea of
crediting stellar fielding plays individually has been proposed occasionally
ever since 1868, when Chadwick wrote: "The best player in a nine is he who
makes the most good plays in a match, not the one who commits the fewest
errors, and it is in the record of his good plays that we are to look for the
most correct data for an estimate of his skill in the position he occupies."
Father Chadwick was correct to see that fielding percentage emphasized failure
rather than success, but in truth the fielding percentage was a far better
measure of ability in the 1860s, when one play in four produced an error, than
now, when only two plays in a hundred are flubbed.
The choice in Total Baseball has been to concentrate on Total Chances but
not to disregard the error, as Range Factor does; nor to include it in Total
Chances, as David Neft would favor; nor to subtract it from Total Chances, as
Barry Codell once advocated. The error may be infrequent today but it is not
insignificant; instead, it is a peculiarly damaging event, turning an out
(with its computer-derived run value of -.25) into, in effect, a hit (with its
run value of +.50). This is a turnaround of .75 runs, or the equivalent of
three outs; an outfield error costs even more, because it so often produces
more than one base for batter and runners both. Thus the defensive stats we
favor in this book and include in the Player Register and Pitcher Register are
Linear Weights formulas, expressed in runs and computed differently for the
different positions (see the Glossary for the formulas, which have a
significant refinement in this edition). However, in all cases the elements of
the statistics are putouts, assists, double plays, and errors (and for
catchers, passed balls).
Position players are gauged by Fielding Runs, a Linear Weights measure of
the runs they saved (or allowed) through their play that an average man at
that position would not have (second basemen are compared with other second
basemen rather than, say, with left fielders--even the worst-fielding second
sacker would cost his team fewer runs at the position than the best defensive
left fielder). Pitcher Defense (like Pitcher Batting) is to be found in the
Pitcher Register. Raw fielding statistics, of questionable value in and of
themselves, have been excluded from this third edition of Total Baseball. An
innovation this time around is the placement of the fielding average in the
Player Register. It is computed for the position at which a man played the
most games.
Pitching
On to the pitching statistics you will see in the Annual Record and
Pitcher Register. First to be reviewed are wins and losses, and won-lost
percentage. Wins are a team statistic, obviously, as are losses, but we credit
a win entirely to one pitcher in each game. Why not to the shortstop? Or the
left fielder? Or some combination of the three? In a 13-11 game, several
players may have had more to do with the win than any pitcher. No matter.
We're not going to change this custom, though Ban Johnson gave it a good try
when he banished it from the American League records for seven years beginning
in 1913.
To win many games a pitcher generally must play for a team that wins many
games. Look at Red Ruffing's won-lost record with the miserable Red Sox of the
1930s, then at his mark with the Yankees. Or at Danny Jackson, first with
Kansas City, then with Cincinnati. There is an endless list of good pitchers
traded to stronger offensive clubs who "emerge" as stars.
The recognition of the weakness of this statistic came early. Originally
it was not computed by such men as Chadwick because most teams leaned heavily,
if not exclusively, on one starter, and relievers as we know them today did
not exist. As the season schedules lengthened, the need for a pitching staff
became evident, and separating out the team's record on the basis of who was
in the box seemed a good idea. However, it was not then nor is it now a good
measure of performance, for the simple reason that one may pitch poorly and
win, or pitch well and lose.
The natural corrective to this deficiency of the won-lost percentage is
the earned run average--which, strangely, preceded it, gave way to it in the
1880s, and then returned in 1912. Originally, the ERA was computed as earned
runs per game because pitchers almost invariably went nine innings. In this
century it has been calculated as earned runs times nine, divided by innings
pitched.
The purpose of the earned run average is noble: to give a pitcher credit
for doing what he can to prevent runs from scoring, aside from his own
fielding lapses and those of the men around him. It succeeds to a remarkable
extent in isolating the performance of the pitcher from his situation, but
objections to the statistic remain. Say a pitcher retires the first two men in
an inning, then has the shortstop kick a ground ball to allow the batter to
reach first base. Six runs follow before the third out is secured. How many of
these runs are earned? None.
The prime difficulty with the ERA in the early days, say 1913, when one
of every four runs scored was unearned, was that a pitcher got a lot of credit
in his ERA for playing with a bad defensive club. The errors would serve to
cover up in the ERA a good many runs which probably should not have scored.
Those runs would hurt the team, but not the pitcher's record. This situation
has been aggravated further by the use of newly computed ERAs for pitchers
prior to 1913, the first year of its official status. Example: Bobby Mathews,
sole pitcher for the New York Mutuals of 1876, allowed 7.19 runs per game, yet
his ERA was only 2.86--almost a perfect illustration of the league's 40
percent proportion of earned runs.
It is not an accident that pitchers of the dead-ball era of this century
(1900-1919) dominate the lifetime and seasonal leaders tables in ERA. Yes,
there were circumstances away from the mound that depressed batting, but the
pitchers of that period also benefited mightily in the ERA column from the
high number of errors, as compared to today. How to compare the ERA of an Ed
Walsh or Three Finger Brown with a Frank Viola or a Dwight Gooden? As with
batting stats, normalize the ERA to league average and adjust for home park
effects. A pitcher from 1908 whose Adjusted ERA was 150 can be compared to one
from 1988 with the same Adjusted ERA--each stood in the same relation to his
peers, that is, 50 percent better than average.
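A minimal sketch in Python of the ERA calculation and of the normalization
just described; park adjustment is omitted here, and the full Adjusted ERA
computation is given in the Glossary. The sample figures are invented.

    # ERA: earned runs times nine, divided by innings pitched.
    def era(earned_runs, innings):
        return 9 * earned_runs / innings

    # Adjusted ERA: the league ERA over the pitcher's, scaled so 100 is average.
    def adjusted_era(pitcher_era, league_era):
        return round(100 * league_era / pitcher_era)

    # A 1.60 ERA in a 2.40 league and a 2.50 ERA in a 3.75 league both come
    # out to 150 -- each pitcher 50 percent better than his peers.
    print(adjusted_era(1.60, 2.40), adjusted_era(2.50, 3.75))   # 150 150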
What gave rise to the ERA, and what we appreciate about it, is that like
the batting average it is an attempt at an isolating stat, a measure of
individual performance not dependent upon one's own team. Its principal
shortcoming is that it indicates only a pitcher's rate of efficiency, not his
actual benefit to the team. In a league with an ERA of 4.00, a starter who
throws 300 innings with an ERA of 3.50 must be worth more to his team than a
starter whose ERA is the same but who pitches in only half as many innings.
Through the Linear Weights figures of Pitching Runs (broken out in the
top-five section of the Annual Record as Starter Runs and Relief Runs), we can
determine the number of runs a pitcher saved his team beyond what a pitcher
performing at the league-average ERA would have allowed. A truly simple stat,
it consists of nothing more than a pitcher's normalized, or league-adjusted,
earned run average weighted by his innings pitched.
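One formulation consistent with that description, sketched in Python (any
further refinements belong to the Glossary; the figures are invented):

    # Pitching Runs: innings pitched times the gap between the league ERA and
    # the pitcher's own, divided by nine.
    def pitching_runs(innings, pitcher_era, league_era):
        return innings * (league_era - pitcher_era) / 9

    # 300 innings at 3.50 in a 4.00 league, versus 150 innings at the same ERA:
    print(round(pitching_runs(300, 3.50, 4.00), 1))   # 16.7 runs saved
    print(round(pitching_runs(150, 3.50, 4.00), 1))   #  8.3 runs saved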
Because Pitching Runs has a built-in normalizing factor, when you see it
in Total Baseball under a heading for "/A," that adjustment will be for park
factor. Pitchers' park factor is calculated differently from batters' park
factor, for a number of fairly complex reasons that technical-minded readers
might best consult in the Glossary.
While the ERA is a far more accurate reflection of a pitcher's value than
the BA is of a hitter's, it fails to a greater degree than the BA in offering
an isolated measure. For a truly unalloyed individual pitching measure, we
must look to the glamour statistic of strikeouts, the pitcher's mate to the
home run (though home runs are highly dependent upon home park, strikeouts are
to only a slight degree).
Is a strikeout artist a good pitcher? Maybe yes, maybe no; a good
analogue would be to ask whether a home run slugger is a good hitter. The two
stats run together: periods of high home run activity (as a percentage of all
hits) invariably are accompanied by high strikeout totals. Strikeout totals,
however, may soar even in the absence of overzealous swingers, say, as the
result of a rules change such as the legalization of overhand pitching in
1884, the introduction of the foul strike (NL, 1901; AL, 1903), or the
expanded strike zone in 1963.
Just as home run totals are a function of the era in which one plays, so
are strikeouts. The great nineteenth-century totals--Matches Kilroy's 513,
Toad Ramsey's 499, One Arm Daily's 483--were achieved under different rules
and fashions. No one in that era fanned batters at the rate of one per inning;
indeed, among regular pitchers (those with 154 innings pitched or more), only
Herb Score did until 1960. In the next five years the barrier was passed by
Sandy Koufax, Jim Maloney, Bob Veale, Sam McDowell, and Sonny Siebert. Walter
Johnson, Rube Waddell, and Bob Feller didn't run up numbers like that. Were
they slower, or easier to hit, than Sonny Siebert?
Even in today's game, which lends itself to the accumulation of, by
historic standards, high strikeout totals for a good many pitchers and
batters, the strikeout is, as it always has been, just another way to make an
out. Yes, it is a sure way to register an out without the risk of advancing
baserunners and so is highly useful in a situation such as when there is a man
on third with fewer than two outs; otherwise, it is a vastly overrated stat
because it has nothing to do with victory or defeat--it is mere spectacle. A
high strikeout total indicates raw talent and overpowering stuff, but the
imperative of the pitcher is simply to retire the batter, not to crush him.
Strikeouts by batters are not listed in your daily averages--fans are less
interested because it is a negative measure--yet the strikeout may be a more
significant stat for batters than it is for pitchers.
Bases on balls will drive a manager crazy and put lead in fielders' feet,
but it is possible to survive, even to excel, without first-rate
control--provided your stuff is good enough to hold down the number of hits.
Total Baseball offers two stats that are, like strikeouts, highly interesting
but ultimately of debatable value: Opponents' Batting Average and Opponents'
On Base Percentage. (The same could be said of Fewest Hits Per Game and Fewest
Walks Per Game, of course.) It is illuminating to compare one or the other
with a pitcher's ERA or Pitching Runs, but both calculations are somewhat
academic, for at the end of a game, season, or career, it doesn't matter how
many men a pitcher puts on base. Theoretically he can put three men on base
every inning, strand all twenty-seven of them, and pitch a shutout. A man
who gives up one hit over nine innings can lose 1-0; it's even possible to
allow no hits and lose. Who is the better pitcher? The man with the shutout
and twenty-seven baserunners allowed, or the man who allows one hit? No matter
how sophisticated your measurements for pitchers, the best ones are counted in
runs.
The nature of baseball at all points is one man against nine. It's the
pitcher against a series of batters. With that situation prevailing, we have
tended to examine batting with intricate, ingenious stats, while viewing
pitching through generally much weaker, though perhaps more copious,
measurements. What if the game were to be turned around so that we had a
"pitching order"--nine pitchers facing one batter? Think of that for a minute.
The nature of the statistics would change, too, so that your batting stats
would be vastly simplified. You wouldn't care about all the individual
components of the batter's performance, all combining in some obscure fashion
to reveal run production. You'd care only about runs. Yet what each of the
nine pitchers did would bear intense scrutiny, and over the course of a year
each pitcher's Opponents' Batting Average, Opponents' On Base Percentage,
Opponents' Slugging Average, and so forth, would be recorded and spun to come
up with a sense of how many runs each pitcher had saved.
A pitching stat with an interesting history is complete games. This is
your basic counter stat, but it's taken to mean more than most of those
measurements by baseball people and knowledgeable fans. When everyone was
completing 90-100 percent of his starts, the stat was without meaning and thus
was not kept. As relief pitchers crept into the game after 1905, the
percentage of completed games declined rapidly. By the 1920s it became a point
of honor to complete three quarters of one's starts; today the man who
completes half is quite likely to lead his league. So with these shifting
standards, what do CGs tell you? About pitchers, not a lot anymore; about
managers and bullpens, a great deal.
Can we say that a pitcher with 18 complete games out of 37 starts is
better than one with 12 complete games in 35 starts? Not without a lot of
supporting help, we can't, not without a store of knowledge about the
individuals, the teams, and especially the eras involved. The more uses to
which we attempt to put the stat, the weaker it becomes, the more attenuated
its force. If we declare the hurler with 18 CGs "better," how are we to
compare him with another pitcher from, say, fifty years earlier who completed
27 out of 30 starts? Or another pitcher of eighty years ago who completed all
the games he started? (Jack W. Taylor completed every one of the 187 games he
started over five years.) Or what about Will White, who in 1880 started 75
games and completed every blessed one of them? But the rules were different,
you say, or the ball was less resilient, or they pitched from a different
distance, with a different motion, or this, or that. The point is, there are
limits to what a traditional, unadjusted baseball statistic can tell you about
a player's performance in any given year, let alone compare his efforts to
those of a player from a different era.
Of shutouts there is little to say that is not perfectly obvious, except
that historical totals have been revised because (a) in 1920-1939 the American
League did not count games of less than nine innings as shutouts, and (b) in
those years and before, in both leagues, a pitcher was credited with a shutout
even if he was pulled midway, if he had pitched enough innings of a combined
whitewash. Total Baseball counts only complete-game shutouts.
Wins Above Team is, as discussed, a variation of Ted Oliver's stat made
public in 1944, which he called the Weighted Rating System. Apart from
modifying his math, we have taken Oliver's "points"--the thousands of points
his formula gave to hurlers who performed well for poor teams--and by
retaining the decimal that he would have discarded, we have come up with a
stat that is expressed quite properly in wins. In this edition, Wins Above
Team is recorded only for the top hundred lifetime and season marks.
Newly developed for the first edition of this book was a Clutch Pitching
Index that, like the measure for clutch hitting, could be applied to
historical data. The CPI is figured by taking how many earned runs the pitcher
should have allowed, based on the performance of the batters who faced him,
and how many he actually allowed (see the Glossary for the formula). The
Clutch Pitching Index consists of expected runs over actual runs, so marks
over 100 exceed league-average performance.
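Sketched in Python, with the expected earned runs taken as given (the formula
is in the Glossary) and the sample figures invented:

    # Clutch Pitching Index: expected earned runs over actual earned runs,
    # scaled so that 100 is league-average performance.
    def clutch_pitching_index(expected_er, actual_er):
        return round(100 * expected_er / actual_er)

    print(clutch_pitching_index(80, 72))   # 111: allowed fewer runs than expected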
For relief pitchers, we have previously discussed saves, and Relief Runs
(the Linear Weights category) are figured no differently than Starter Runs.
Newly developed here is Relief Ranking, which adjusts Relief Runs for the
greater situational importance of each run a bullpenner saves or yields. The
other elements of the formula are wins, losses, and saves, in a proportion
detailed in the Glossary. Games, innings, and ERA in relief are broken out in
the new Relief Pitcher Register.
Bringing It All Together
Pitcher Batting and Pitcher Defense are recorded in the Pitcher Register
as Linear Weights figures, expressed in runs. (Pitcher batting has been
removed from league stats for such computations, so that the batting records
of everyday players are compared only with those of their peers and pitchers'
batting records are compared only with those of their peers.) For this
edition we have added to Pitcher Batting Runs the seasonal totals for hits and
the pitchers' batting averages. The totals for Pitcher Batting Runs and
Fielding Runs are seldom of a great magnitude--and for AL pitchers since 1973,
the batting figure is, of course, zero--but in earlier years a pitcher's
ability to help himself and his team off the mound has occasionally counted
for a great deal in a given season; spitballer Ed Walsh in 1907, the year
before he won 40 games for the White Sox, accounted for an astounding 2.3
Fielding Wins. The hitting ability of a Wes Ferrell or Don Drysdale certainly
counted for something in their teams' prospects for victory. In Total Baseball
a pitcher's overall contribution is reflected in the Total Pitcher Index,
converted from Runs above average to Wins, based on the Runs required to
create an extra Win in that year.
For everyday position players, add Fielding Runs to Stolen Base Runs to
Batting Runs, then convert those combined Runs to Wins, and you have the best
measure of the complete ballplayer: the Total Player Rating. We believe,
however, that a positional adjustment must be made to the above combination to
reflect the greater skill required to play, for example, second base than left
field; this adjustment is based on the average batting skill required at that
position to hold a major league job. Historically, left fielders have
presented the best record in Batting Runs and middle infielders the worst. In
other words, a left fielder who accounted for 10 Fielding Runs should not be
regarded as having the same value to a team as a shortstop who also
contributed 10 Fielding Runs: Have the two men switch positions and you would
soon see who made more of a defensive contribution. And because some
positions--shortstop, catcher, second base, and third base--are harder to play
than others, we see a relative scarcity of good hitters at these positions and
an abundance at the others. Again, see the Glossary for more detail.
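A minimal sketch in Python of the Total Player Rating arithmetic described
above. The positional adjustment and the runs-per-win divisor--roughly ten
runs, though it is recalculated for each season's run environment--come from
the Glossary; the values below are invented for illustration.

    # Total Player Rating: batting, fielding, and base-stealing runs plus a
    # positional adjustment, converted from runs to wins above average.
    def total_player_rating(batting_runs, fielding_runs, stolen_base_runs,
                            positional_adjustment, runs_per_win=10.0):
        total_runs = (batting_runs + fielding_runs + stolen_base_runs
                      + positional_adjustment)
        return total_runs / runs_per_win

    # A shortstop: modest bat, fine glove, and a positional credit.
    print(round(total_player_rating(5, 12, 2, 8), 1))    # 2.7 wins
    # A left fielder with the same raw numbers but a positional debit.
    print(round(total_player_rating(5, 12, 2, -8), 1))   # 1.1 wins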
The ultimate stat brings together batters, pitchers, fielders, and
baserunners in the Total Baseball Ranking. The equivalent of a Most Valuable
Player Award and Cy Young Award wrapped into one, it reveals the best baseball
player every season and the best ever. Relief pitchers and shortstops can
compete on the same plane--wins contributed to their team through all their
accomplishments. In 1978 the MVP question in the American League was whether
to vote for Jim Rice, who had 46 homers, 139 RBIs, and 400 total bases (the
first time for an American Leaguer in forty-one years), or for Ron Guidry,
25-3 with an .893 won-lost percentage that was the all-time high for a starter
with 20 or more wins, and whose ERA of 1.74 was less than half the league
average. Why don't you flip to the page in the Annual Record for 1978 and see
for yourself who deserved the MVP Award that year?
Total Baseball also sums things up on the team level. Fielding Runs are
expressed as Wins in the team stats section of the Annual Record, as are
Batting Runs, Stolen Base Runs, and Pitching Runs. This enables one to see the
component parts of a team's predicted success or failure--that is, the wins or
losses beyond the average (a .500 season) that the players' performance could
have been expected to produce. The Differential figure (DIF) in this section
of the Annual Record states the spread between the team's actual won-lost
record and that predicted by the Linear Weights measures of batting, base
stealing, fielding, and pitching. The miracle Mets of 1969 exceeded
expectations by 13 Wins--in other words, instead of finishing 87-75 as their
players' performance would have warranted, they finished 100-62. Did the Blue
Jays also do it with mirrors in 1992? Check for yourself.
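The Differential is simple arithmetic; a two-line sketch in Python using the
1969 Mets figures from the text:

    # DIF: actual wins minus the wins predicted by the Linear Weights parts.
    predicted_wins, actual_wins = 87, 100
    print(actual_wins - predicted_wins)   # 13 -- the "miracle" in a single number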
Errors and Controversies
The data ICI reported in the first edition of The Baseball Encyclopedia
upset many people in baseball, for their numbers were different from those
traditionally accepted; however, their changes were responsible ones, the
product of new research that corrected errors of long standing, or in response
to the rulings of the Special Baseball Records Committee. For example, much of
the statistical information on Hall of Fame plaques was rendered obsolete. The
result has been that through the ensuing editions, the offending data has been
fudged to bring it into line with tradition--more on this in a moment.
Despite the uproar that greeted ICI's revised numbers, 1969 was hardly
the first time corrections had been made to official data. In 1929 Grover
Cleveland Alexander won his 373rd game, breaking Christy Mathewson's National
League record, then thought to be 372. He never won another game. A number of
years later, Joe Reichler found a game, played on May 21, 1902, in which, by
today's rules, Matty should have gotten the win. The record was changed and
the two pitchers were given a tie. The problem was that no one checked all
of Mathewson's other games to see how many times he received a win under the
old rules that wouldn't have been credited that way today. When ICI did their
research in 1968, they found Matty had only 367 wins total by today's rules,
while Alexander had 374. (Further research, notably by Frank Williams, has
restored Alexander and Mathewson to a tie at 373 wins.) The Records Committee
decided that all wins and losses should be awarded according to the present
rules, so Macmillan printed the totals as such in the Baseball Encyclopedia.
However, after the book came out, Commissioner Bowie Kuhn decided that it was
better to show stats that agreed with previously published recognized sources,
so all records--not only those of Mathewson and Alexander--were supposed to be
changed back in accordance with the scoring practices at the time. What
happened in the next edition was that some records, especially those of the
stars, were changed, while others were not; team totals and the records of
other players on the same team were not; and the data base was corrupted.
Here's another celebrated example of record-book flip-flops. When the
American League was formed in 1901, Nap Lajoie was credited with a .422
average, with 220 hits in 543 at-bats. After a number of years, someone
noticed that if you take these at-bats and hits, the average comes out only to
.405, so his average was changed. (Turkin/Thompson gave Nap a mark of .409 in
its first edition.) Later in the 1950s, John Tattersall had his doubts and
decided to go through his newspaper collection of box scores. He found 229
hits for Lajoie, not 220--the error had been in the figure for hits, not in
the figure for batting average. Thus his average was restored to .422, which
happened to be the highest in American League history. Then ICI research in
this area came up with a .426 mark (232 for 544, based on newspaper accounts),
which was published in the first edition, then trimmed back to .422 in
subsequent editions. The .426 figure is the one this book uses.
Nap seemed to be involved in a number of controversies. ICI research
found four more hits for him in 1902, raising his average from .369 to .378.
Later editions have changed Lajoie's stats back to the old values; we have
not.
In 1910 there was a very close batting race between Cobb and Lajoie. At
the end of the season, most people thought Nap had won, based on his getting
seven hits in a doubleheader on the final day of the season. There was talk
that the opposing Browns had let him get a number of bunts by playing back, so
that the hated Cobb would lose. However, the AL office went over their figures
and gave Cobb the title, .385 to .384. Nearly eighty years later, Pete Palmer
discovered a critical error: a game in which Cobb had two hits in three
at-bats had been entered twice. This was found because Sam Crawford had 14
games on his official sheet for the homestand yet the Tigers only played 13.
It turned out that Detroit played a doubleheader on September 24, but the
second game inadvertently was inserted in the official sheets as being played
on September 25. Later, this second game of the twenty-fourth, which appeared
to have been missing, was put in the scoresheets again. The League Office
discovered this mistake soon after its official announcement that Cobb had won
the batting title, because the double entry was corrected for all the other
Detroit players. However, Ban Johnson had made a big deal out of how carefully
his people had checked the figures in order to settle the controversy, so they
kept quiet about the gaffe, leaving Cobb the winner.
Appeals to Commissioner Kuhn in 1981 to set the matter straight
officially were to no avail, because that would not only have changed the
outcome of the 1910 batting race, it would also have altered Cobb's lifetime
hit total, then being pursued to massive media attention by Pete Rose. Kuhn's
statement read, in part, "The passage of 70 years, in our judgment . . .
constitutes a certain statute of limitation as to recognizing any changes in
the records with confidence of the accuracy of such changes. . . . Since a
variety of questions have been raised through the years about the accuracy of
the statistics of that period, the only way to make changes with confidence
would be for a complete and thorough review of all team and individual
statistics. That is not practical."  It may not have been practical, but
we have done it, and are continuing to do it. A notable area of change
reflected in this edition of Total Baseball is the National Association period
of 1871-1875, in which the research of Michael Stagno and a team of SABR
researchers has supplied not only new, more accurate statistics but also a
handful of new players, previously not included in any baseball encyclopedia.
In 1912, Heinie Zimmerman got credit for a Triple Crown victory, although
it wasn't called that then. Ernie Lanigan's RBI figures gave him 98, compared
to 94 for Honus Wagner. However, ICI research gave Wagner 102 and Zimmerman
99. Later editions of The Baseball Encyclopedia raised Zimmerman's total to
103--giving him back his phony Triple Crown.
The National League batting data has been pretty accurate since 1910.
That was the first year the NL kept daily game records for teams as well as
for players, compared the team totals to the sum of the players' figures, and
tried to resolve any differences. Before then, the team totals simply
were the sum for the players. The American League had team totals all the way
back to 1905, but never compared them with the sum of the players and
therefore had a great many errors. The AL, however, did introduce team
pitching first in 1930, while the NL followed in 1941. The AL never published
league totals, so the fact that the batters' hits, strikeouts, walks, and so
on did
not agree with the corresponding pitcher totals was somewhat academic.
However, the NL did publish league totals starting in 1926, and when they
first presented team pitching, the totals did not agree. In order to make this
look correct, they doctored the pitching totals to agree with the batting
stats. After a few years, this was no longer necessary, as they took the time
to resolve and correct differences. For the AL, most team totals did not agree
with the sum of the players for that team until around 1935: the at-bats,
runs, hits, and extra-base hits usually checked out, but walks and strikeouts
did not add up until the 1960s. The AL converted to computerized record
keeping in 1973 and the NL in 1981, improving accuracy.
On the whole there have been surprisingly few errors in the National
League stats. Most of the bigger ones have involved innings pitched in the
years before 1930. Because no one added up the innings and compared them to
putouts to check for discrepancies, in 1926 Wayland Dean brought up the rear
in ERA with a 6.10 mark. It turned out that his innings pitched had been added
up incorrectly, and he should have had 204, not 164. This reduced his ERA to
4.90. For a game in 1920, Jimmy Ring had his faced batsman total of 35 put in
the innings pitched column, giving him 26 extra innings pitched for the game.
It would seem that someone adding up innings pitched would question a figure
of 35 for one game, but it slipped through. Ring was also credited for nine
extra innings in 1923. But the strangest mix-up in the NL was in 1909, the
year before the team totals were kept. For some strange reason, 700 putouts
were dropped from the team totals, all the result of adding mistakes for
catchers. Pat Moran and Red Dooin each lost 200, while Peaches Graham, Bill
Bergen, and Doc Marshall lost 100 each.
The American League has had many errors of 100 or more putouts or assists
over the years due to addition mistakes, as well as quite a few blunders in
innings pitched. Ed Willett in 1910 lost 77 innings, showing only 147 instead
of 224. The correction lowered his ERA from 3.60 to 2.36. However, this was
still more than a run behind the leader, Ed Walsh. Frank Williams discovered a
dozen or more errors in entering wins and losses for pitchers in the AL every
year from 1905 through 1919. And John Tattersall, in his home run research,
found over 100 official errors, about 80 percent in the AL and most before
1920. George Sisler picked up 3 new homers, hit on April 12, 1916,
September 22, 1921, and June 29, 1929.
From 1912 to 1914, the AL statistician decided not to enter anything for
a player who had all zeroes for his line in any given game. Most of these were
relief pitchers, but they had entries on their pitching sheets and these games
were restored by the ICI researchers. There were about 600 other cases where
nonpitchers had games omitted. These are included in Total Baseball. This kind
of record keeping over the early years kept some men out of the encyclopedias
altogether, like pinch runners or defensive replacements. SABR research has
added several of these one-time ciphers to The Baseball Encyclopedia over the
years, as well as to Total Baseball.
For the American League records of 1913, the official sheets disagree
with the data published in the baseball guides for almost every player. The
only logical explanation is that the official figures weren't ready when it
came time to publish the guide, so they must have used data from another
source. Total Baseball uses the official figures, as they have daily sheets to
support the data.
An interesting quirk in the way records are kept--and another reminder,
as if one needed it, that baseball record keeping remains subject to error and
controversy--occurred as recently as 1981. The league rule was to round off
the innings pitched at the end of the season, although the weekly reports
showed thirds of innings. Baltimore's Sammy Stewart had 29 earned runs in 112
1/3 innings, while Oakland's Steve McCatty had 48 in 185 2/3 innings. This
gave Stewart the ERA title, 2.323 to 2.327. But when the innings were rounded
off, McCatty won, 2.32 to 2.33. McCatty got the title, but the next year both
leagues decided to count thirds of innings.
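The quirk is easy to reproduce in a few lines of Python, using the earned run
and inning totals given above:

    # With thirds of innings kept, Stewart edges McCatty; with innings rounded
    # to whole numbers, as the 1981 rule required, the order flips.
    def era(earned_runs, innings):
        return 9 * earned_runs / innings

    print(round(era(29, 112 + 1/3), 3), round(era(48, 185 + 2/3), 3))  # 2.323 2.327
    print(round(era(29, 112), 2), round(era(48, 186), 2))              # 2.33 2.32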
Sources
The computer has made possible the rapid analysis of mountains of raw
baseball data based upon observed games or mathematically accurate,
probabilistic computer simulations. Questions once thought to be unanswerable
are mysteries no longer. What is the worth, in terms of its run-producing
capacity, of a single, or a walk, or a homer? How valuable is a stolen base?
Who were the best clutch hitters? But as invaluable as the computer has been
in producing the statistical data for Total Baseball, the editors owe more to
the people who have contributed their time, their expertise, their love of the
game, and their passion for getting things right. These individuals are listed
here, in the Acknowledgements, or in the table at the end of the book of those
readers of the first two editions who helped us improve the accuracy of Total
Baseball this time around. A collective debt is owed to the Society for
American Baseball Research and the National Baseball Library.
The statistics were obtained primarily from the following sources:
- John Tattersall Collection of newspaper box scores and compilations for
1876-1890 NL.
- ICI computer printouts, National Baseball Library, 1891-1902 NL,
1882-1891 AA, 1884 UA, 1890 PL, 1901-1904 AL, 1914-1915 FL.
- Official league averages, 1903-date NL, 1905-date AL.
- Michael Stagno Collection of newspaper box scores and compilations for
1871-1875 NA, supplemented by research of SABR's nineteenth century
research committee, headed by Bob Tiemann, Bob Richardson,
and Fred Ivor-Campbell.
Supplemental sources were:
- For batters hit by pitch, 1884-1896 AA/NL/PL, 1909-1916 NL, 1909-1919 AL,
research from newspapers by Alex Haas, Pete Palmer, John Schwartz, Bob
Davids, John Tattersall, Lyle Spatz, Herb Goldman, Keith Carlson, and
others.
(Note: research continues for the 1897-1908 period, but the data is, at
this writing, about 90 percent complete.)
- For home runs allowed by pitchers, 1876-1950 AL/NL, the Tattersall
Collection, reviewed and corrected by Bob McConnell.
- For runs batted in, 1903-1919 NL, 1905-1919 AL, ICI research.
- For runs batted in, 1880-1885 NL, David Neft.
- For pitcher saves, 1876-1968 NL/AA/UA/PL/AL (except 1901-1919 AL).
- For stolen bases, 1886 NL, Spalding Baseball Guide.
- For wins and losses for pitchers, 1876-1900 NL/AA/PL, and for wins,
losses, games started, complete games, shutouts, saves, 1901-1919 AL, and
complete pitching data, 1892, research from newspapers and official
sheets by Frank Williams.
- For shutouts, 1876-1939, Joe Wayman.
- For biographical data, the biographical research committee of SABR,
notably Richard Topp, Bill Carle.
- For caught-stealing data, 1914-1916 AL, 1915-1916 NL, Ernie Lanigan,
courtesy of Bob Davids.
- For home/away data, 1876-1891 NL/AA/UA/PL, Bob McConnell.
- For game scores, 1876-1884 NL/AA/UA, Bob Tiemann.
- For game scores, 1885-1891 NL/AA/PL, Richard Topp.
- For runs and homers home/away, 1980s NL/AL, Bill Carr.
Missing data includes:
- Hit batters: 1897-1908, scattered data, especially for New York and
Cincinnati.
- Caught stealing: 1886-1914, 1916 (players with fewer than 20 steals),
1917-1919, 1926-1950 NL; 1886-1891 AA; 1890 PL; 1901-1913, 1916 (players
with fewer than 20 steals), 1917-1919 AL; 1914-1915 FL.
- Sacrifice hits: 1927-1930 (fly balls advancing runners to any base
counted as sacrifice hits).
- Sacrifice flies: 1908-1930, 1939.
- Runs batted in, 1882-1887, 1890 AA; 1884 UA.
- Strikeouts for batters: 1882-1888, 1890 AA; 1884 UA; 1897-1909 NL;
1901-1912 AL. (Team batting strikeouts are presented for 1897-1902
NL and 1901-1904 AL.)
Incomplete data for those years through 1902 NL and 1904 AL are available
from the ICI computer printouts at the National Baseball Library. Additional
research could turn up more data. If your research or sharp eye should detect
errors or gaps in Total Baseball, please write us in care of the publisher and
we'll be delighted to improve our data and credit your catch in the next
edition.