"What is the evidence, and what does it mean?" (Bill James)

This article was posted to the alt.sports.baseball.tor-bluejays newsgroup on April 3, 2000, before the Blue Jays' first game.
Welcome to my 4th-annual "Blue Jays Projected Record" article, where I add up individual player projections for each Blue Jays player to project a win-loss record for the Jays *before* the season starts. (See disclaimer below on accuracy.)
For each player, I projected his Equivalent Average (EqA) using my own home-brew EqA projection system (outlined at the end). This is a shortcut compared to previous years, when I used STATS' projections, which are probably more accurate but required more data entry on my part. My projections are based only on the major league data in the Lahman database, so for some players with limited or no major league playing time I overrode my calculated projection, as noted below.
For projected Plate Appearances, I normally used the STATS projection for At Bats plus Walks, except where noted below.
For those who don't know, an average major league hitter will have an EqA of .260, while an EqA of .300 represents excellence. Because of the DH rule, the average American League hitter has about a .264 EqA.
The individual projections are:
Regulars:
Player    Pos  Age  ProjEqA    PA  Notes
Stewart   LF    26   .276     676
Bush      2B    27   .244     518  (ProjEqA - .025)
Mondesi   RF    29   .280     656
Delgado   1B    28   .300     643
Fullmer   DH    25   .268     512
Batista   3B    26   .277     567
Fletcher  C     33   .258     418
Cruz      CF    26   .278     482
Gonzalez  SS    27   .246     520  (PA+150)
                     ----    ----
Total                .271    4992
As noted above, I lowered Bush's projection to take into account his subpar minor league numbers, and I increased STATS' projection for Gonzalez's playing time. Otherwise, I went with my computer EqA projection and STATS' playing time projection.
Backups:
Player    Pos  Age  ProjEqA    PA  Notes
Cordova   OF    30   .261     410
ACastillo C     30   .223     248
Grebeck   IF    35   .248     187
Woodward  IF    24   .220     170  (me, used BP EqA)
Wise      OF    22   .224      58  (me, used BP EqA)
VWells    OF    21   .247     100  (me)
                     ----    ----
Total                .242    1173
For Woodward and Wise, I used the EqA projection in Baseball Prospectus. I set the plate appearances for Woodward based on what appeared to be the leftover plate appearances for infielders. I assumed Vernon Wells would be called up for a month to cover for an outfield injury.
Overall, this works out to a team EqA of .265, and remember the A.L. average is .264, so the offense is projected to slip back to just a little better than average in 2000. Projected improvements at DH, LF and CF are more than offset by projected declines at SS, 2B, RF, C and 3B.
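As a rough cross-check of the team figure, the sketch below combines the two subtotals from the tables above as a PA-weighted average. Averaging EqA by plate appearances is an assumption made here for illustration, not necessarily how the .265 was derived, and the variable names are hypothetical.

```python
# PA-weighted average of the two group subtotals from the tables above.
# Assumption: EqA can be averaged by plate appearances as an approximation.
groups = [
    ("regulars", 0.271, 4992),
    ("backups", 0.242, 1173),
]

total_pa = sum(pa for _, _, pa in groups)
team_eqa = sum(eqa * pa for _, eqa, pa in groups) / total_pa

print(f"{total_pa} PA, team EqA about {team_eqa:.4f}")
```

The result lands within a point of the .265 quoted above; the last digit depends on how the rounding and averaging are done.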
Note that my projection system doesn't predict "breakout" years, though it does make age adjustments. The Jays have lots of breakout candidates. My feeling is that the Jays will have a better offense than last year. All it takes is for each player to beat his projection by 4 points of EqA (0.004), or about 12 points of OPS.
The EqAs for last year's Jays (my home-brew calculation) are online at http://www.stephent.com/jays/teams/1999.html
For the pitchers, I used STATS' projections where available. I assume the RA (Runs Allowed per 9 innings) will be 1.09 * the projected ERA.
STATS has surprisingly optimistic projections for Wells and Hamilton. I didn't assume any major improvements for Carpenter, Halladay and Escobar, except for durability. I used the STATS projection for Castillo in their '99 book, halving his innings and starts and slightly adjusting his RA for park factor and extra year of age:
Starters:
Pitcher    Age    IP    RA    R  GS  Source
Wells       37   226  4.44  112  33  (STATS)
Carpenter   25   195  4.71  102  30  (me)
Halladay    23   188  4.64   97  30  (me)
Escobar     24   198  5.09  112  33  (me)
FCastillo   31    70  5.68   44  12  (STATS for DET'99, IP/2, RA')
Hamilton    29   138  4.78   73  23  (STATS)
CAndrews    22     6  6.00    4   1  (me)
                ----  ----  ---
Total           1021  4.80  544
In the pen, STATS is surprisingly optimistic about Frascatore, but pessimistic about Quantrill. I increased Quantrill's innings projection. I doubt Fregosi will be so restrained in using Borbon, but I didn't change STATS' projection (except for park and league adjustments):
Pen:
Pitcher    Age    IP    RA    R  Source
Koch        25    70  3.99   31  (me)
Borbon      32    51  3.89   22  (STATS for LA/NL, RA*1.055*1.06)
Quantrill   31    84  5.18   48  (STATS, IP+25)
Frascatore  30    79  4.37   38  (STATS)
Painter     32    58  5.12   33  (STATS for STL/NL, RA*0.98*1.06)
Munro       25    80  5.96   53  (me)
                ----  ----  ---
Total            422  4.80  225
Total runs allowed: 769
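The staff totals can be cross-checked with a few lines of arithmetic. This is only a sketch: the innings and runs are the totals from the two tables above, and the helper name `ra_per_nine` is made up for the example.

```python
# Staff totals copied from the starter and bullpen tables above.
starters = {"ip": 1021, "runs": 544}
pen = {"ip": 422, "runs": 225}

def ra_per_nine(runs, ip):
    """Runs allowed per 9 innings."""
    return 9.0 * runs / ip

total_ip = starters["ip"] + pen["ip"]
total_runs = starters["runs"] + pen["runs"]
staff_ra = ra_per_nine(total_runs, total_ip)

print(total_ip, "IP,", total_runs, "R,", round(staff_ra, 2), "RA")
```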
If this all happens, a 4.80 RA translates to roughly a 4.40 ERA (4.80 / 1.09), which would have ranked 3rd in the league last year. The pitching is potentially very good.
Note that I didn't take into account the quality of the team defense in the pitching numbers. My feeling is that the defense has improved compared to last year, which should improve the pitching numbers, but the numbers already seem optimistic about the health of the rotation, so I won't make them rosier.
(Which isn't to say the pitching & defense can't be even better than projected above.)
I'm assuming STATS was figuring a league average of about 5.16 runs per 9 innings in 2000 for their pitcher projections, based on a weighted average of the past 3 years. That would be down a bit from 5.26 runs scored per 9 innings last year.
So to change the offense's .265 EqA to runs, the calculation is
((.265 EqA / .260) ^2.5) / 1.04 (lineup includes DH) * 0.99 (park factor) * 5.16 (DH-league average runs per 9 innings) * (1443 IP/9IP) = 826 runs.
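The same calculation, written out as a small function. This is a sketch of the formula on the line above; the function and parameter names are invented for the example, and the default values are simply the constants quoted there.

```python
def projected_runs(team_eqa, league_r9, innings,
                   park=0.99, dh_lineup=1.04, base_eqa=0.260):
    """Convert a team EqA into runs scored over a number of innings.

    (team_eqa / base_eqa) ** 2.5 gives the scoring rate relative to an
    average non-DH lineup; dividing by dh_lineup adjusts for a lineup that
    includes a DH, and park is the park factor.
    """
    relative_rate = (team_eqa / base_eqa) ** 2.5 / dh_lineup * park
    return relative_rate * league_r9 * innings / 9.0

runs_scored = projected_runs(0.265, league_r9=5.16, innings=1443)
print(round(runs_scored))
```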
We already figured 769 runs allowed, so if we apply the Pythagorean formula we get the following projection:
826^1.83 / ( 826^1.83 + 769^1.83 ) == .533 WPct ==> 86 wins, 76 losses
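For anyone who wants to plug in their own run totals, here is a minimal sketch of the Pythagorean step with the 1.83 exponent used above. The function name is hypothetical.

```python
def pythagorean_wpct(runs_for, runs_against, exponent=1.83):
    """Expected winning percentage from runs scored and runs allowed."""
    rf = runs_for ** exponent
    ra = runs_against ** exponent
    return rf / (rf + ra)

wpct = pythagorean_wpct(826, 769)
wins = round(162 * wpct)
losses = 162 - wins

print(f"{wpct:.3f} WPct, {wins}-{losses}")
```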
This year I project 86 wins. But it's easy to see this team winning more games. Add 10 points to a regular's EqA, which seems plausible for most of them, and you figure 1 more win from each. Subtract 0.50 from a starter's RA, which seems plausible for Carpenter/Halladay/Escobar, and you figure 1 more win from each. This team could win 100 games.
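The "1 more win" rule of thumb for a starter can be checked roughly as follows. The sketch takes Carpenter's projected 195 IP from the starters table, lowers his RA by 0.50, and reruns the Pythagorean formula; the helper name is made up, and only the pitching half of the claim is checked here.

```python
def pythagorean_wins(runs_for, runs_against, games=162, exponent=1.83):
    """Expected wins from runs scored and allowed (unrounded)."""
    rf = runs_for ** exponent
    ra = runs_against ** exponent
    return games * rf / (rf + ra)

base_wins = pythagorean_wins(826, 769)

# A starter with 195 IP whose RA drops by 0.50 saves about 10.8 runs.
runs_saved = 0.50 * 195 / 9.0
improved_wins = pythagorean_wins(826, 769 - runs_saved)
extra_wins = improved_wins - base_wins

print(f"runs saved: {runs_saved:.1f}, extra wins: {extra_wins:.2f}")
```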
But 86 wins is my "official" projection this year.
Toss a fair coin 162 times, and your best bet is to predict 81 heads, but there is only about a 6% chance of exactly that happening, even though it really is the best bet.
So this method probably has no better than about a 6% chance of producing the actual win total, and that's assuming the errors in the individual projections all cancel out, which they probably won't.
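The coin-toss figure can be computed exactly from the binomial distribution. The sketch below uses `math.comb` from the Python standard library (Python 3.8 or later); the variable names are illustrative.

```python
from math import comb

tosses = 162
heads = 81

# Probability of exactly 81 heads in 162 tosses of a fair coin.
p_exact = comb(tosses, heads) / 2 ** tosses

print(f"P(exactly {heads} heads) = {p_exact:.4f}")
```

The result is roughly 6%, which is why even the best single guess at a win total will usually miss.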
Previous years' articles are on the web at
Only the '98 projection produced exactly the right number of wins (88). Last year's projection was 4 wins too high.
My simple EqA projection system goes something like this:
(1) The inputs from each season are the hitters'
Runs (i.e. Estimated Runs Produced, park and league-adjusted),
Outs (roughly At Bats minus Hits), and
Age.
(2) To project the hitter's Future Runs & Outs at a particular
Future Age, assume for each year the hitter's Runs
will increase by roughly the following percentages:
for age 21 or less, assume 10% increase on each previous year;
for ages 22, 23, 24, 25, 26, 27
assume 9%, 8%, 7%, 6%, 4%, 2% respectively;
for ages 28, 29, 30, 31, 32, 33, 34
assume -1%, -1%, -2%, -2%, -3%, -4%, -5% respectively;
for ages 35 and higher, assume -6% per year;
and assume Outs stays constant.
For example, if a hitter produced 100 Runs in 400 Outs at
Age 25, then his projected Runs at Age 28 would be
100 * 1.04 * 1.02 * 0.99 = 105
i.e. a projection of 105 runs in 400 Outs at Age 28.
(3) If you add up the projected Runs and Outs for a given Age
from all of the hitter's previous seasons, and take the
Runs/Outs ratio, that's the basis for my first projection
[which wasn't used in this article].
(4) If you do the same thing but weight the projected Runs and Outs
from each season by 2^Age, then that's the basis for my second
projection. [This is the one I used in this article.]
(5) To convert Runs/Outs to EqA, the formula I use (based on
Davenport's) is (((27*Runs/Outs)/4.5)^0.4)*.260
so that .260 is average and .300 represents excellence.
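Steps (1) to (5) can be sketched in code roughly as follows. This is an illustration under stated assumptions, not the actual program: it reads each percentage as the change applied on reaching that age (which matches the worked example in step 2), it ignores the extra minor adjustment factor, and all function names are made up.

```python
# Year-over-year change in Runs assumed on reaching each age (step 2).
AGE_CHANGE = {22: 0.09, 23: 0.08, 24: 0.07, 25: 0.06, 26: 0.04, 27: 0.02,
              28: -0.01, 29: -0.01, 30: -0.02, 31: -0.02, 32: -0.03,
              33: -0.04, 34: -0.05}

def age_change(age):
    """Assumed fractional change in Runs on reaching `age`."""
    if age <= 21:
        return 0.10
    if age >= 35:
        return -0.06
    return AGE_CHANGE[age]

def project_runs(runs, age, future_age):
    """Carry a season's Runs forward from `age` to `future_age`; Outs stay constant."""
    for a in range(age + 1, future_age + 1):
        runs *= 1.0 + age_change(a)
    return runs

def eqa(runs, outs):
    """Step 5: convert Runs/Outs to EqA."""
    return ((27.0 * runs / outs) / 4.5) ** 0.4 * 0.260

def projected_eqa(seasons, future_age, weight_recent=True):
    """seasons is a list of (age, runs, outs) tuples.

    weight_recent=True weights each season by 2**age (step 4);
    weight_recent=False is the unweighted version (step 3).
    """
    total_runs = total_outs = 0.0
    for age, runs, outs in seasons:
        w = 2.0 ** age if weight_recent else 1.0
        total_runs += w * project_runs(runs, age, future_age)
        total_outs += w * outs
    return eqa(total_runs, total_outs)

# The worked example from step 2: 100 Runs in 400 Outs at age 25, projected to 28.
example_runs = project_runs(100, 25, 28)
example_eqa = projected_eqa([(25, 100, 400)], 28)
print(round(example_runs), round(example_eqa, 3))
```

With a single season the weighting makes no difference; it only matters when several seasons are combined, where each later season counts twice as much as the one before it.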
I haven't done any evaluation of whether the projections come close to what players actually do in future years, but I've noticed at least that the 2nd projection (which weights recent years more) is a pretty good predictor of what other projection systems will forecast (e.g. STATS, BP99) :-)
Baseball Prospectus is known for their systems trying to predict breakouts and collapses. My system seems obviously to be the opposite. However, those age adjustments can add up over a few years, especially for younger and older players, so a lot of times I see projections well above or below what the player has done in recent years.
Note that EqA has the nice property that a 1% difference in Runs/Outs typically causes about a 1-point change in EqA. So, for example, saying that a 21-year-old will be 10% better than a 20-year-old is roughly the same as saying a 20-year-old's EqA will go up by 10 points (e.g. from .250 to .260). For older players (Harold Baines), I'm essentially assuming a drop of 6 points per year (so deduct 12 points from 2 years ago and 18 from 3 years ago; it adds up).
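The "1% is about 1 point" property follows from the 0.4 exponent in the conversion formula, and is easy to verify numerically. A small sketch, using the same conversion as in step (5); the function name is hypothetical.

```python
def eqa(runs, outs):
    """EqA from Runs and Outs, using the conversion in step (5)."""
    return ((27.0 * runs / outs) / 4.5) ** 0.4 * 0.260

# An average hitter (4.5 runs per 27 outs) versus one who is 1% better.
average = eqa(4.5, 27)
one_percent_better = eqa(4.5 * 1.01, 27)
points_per_percent = (one_percent_better - average) * 1000

# A .250 hitter whose Runs/Outs improves by 10%.
after_ten_percent = 0.250 * 1.10 ** 0.4

print(f"{points_per_percent:.2f} points per 1%, .250 becomes {after_ten_percent:.3f}")
```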
I've left out a few details (e.g. I actually have another minor factor which adjusts both projected Runs & Outs). Some day I hope to evaluate the system and see if other weights or age adjustments will work better, and also to document the system fully (it's not meant to be a secret).
This is all just for fun of course.
STATS Major League Handbook 2000, http://www.stats.com/
Baseball Prospectus 2000, http://www.baseballprospectus.com/
Sean Lahman's Baseball Database, http://www.baseball1.com/
--
Stephen Tomlinson
http://www.stephent.com/jays/
mailto:stephent@ottawa.com
Ottawa, Ontario
"What is the evidence, and what does it mean?" (Bill James)
Last Updated: 2000 Apr 3
Comments are welcome at comments@stephent.com.