"What is the evidence, and what does it mean?" Bill James
This article was mailed to the Blue Jays' mailing list Feb 17, 1998.
Date: Tue, 17 Feb 1998 23:40:05 -0500
From: Stephen Tomlinson
To: Jays List
Subject: Jays Projected Record for '98
I finally got a copy of this year's STATS Major League Handbook, which
includes projected stats for many hitters and pitchers. So it's time for
my 2nd annual "Jays Projected Record" post. If you want to know how I did
last year, well, I've left that information to the end of the article.
------------------------------------------------------------------------------
'98 Offense
------------------------------------------------------------------------------
For estimating the Jays' offense, I assumed the total plate appearances of
an average '97 A.L. team. On average, each spot in the lineup gets 18 more
plate appearances than the spot after it (162 games / 9 spots), so the
lineup spots got this number of plate appearances:
757, 739, 721, 703, 685, 667, 649, 631, 613
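As a rough check on the slot figures above, here is a minimal Python sketch. It assumes the team total is simply the sum of the nine listed figures (6165); the function name and that total are illustrative and do not appear in the original post.

```python
# Spread a team's plate appearances over nine lineup slots, assuming each
# slot gets `step` more plate appearances than the slot behind it.
GAMES = 162
SLOTS = 9

def slot_plate_appearances(team_pa, step=GAMES // SLOTS):
    middle = team_pa / SLOTS       # the #5 slot gets the average share
    offset = (SLOTS - 1) // 2      # number of slots above the middle one
    return [round(middle + (offset - i) * step) for i in range(SLOTS)]

TEAM_PA = 6165  # assumption: the sum of the nine slot figures listed above
slot_pa = slot_plate_appearances(TEAM_PA)
print(slot_pa)
```

With these inputs the list runs from 757 for the leadoff spot down to 613 for the #9 spot, in steps of 18.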
Then I guessed TJ's starting lineup, and for the leftover plate appearances,
guessed the most likely backup. Here are the projections:
{Note: PA is Plate Appearances, estimated from At Bats + Walks;
ERP is Estimated Runs Produced, calculated by me, not STATS
(ERP = 0.16*(3H+2D+4T+6HR+2BB+SB-0.605(AB-H+CS))),
R27 is ERP/(27 outs), the '97 A.L. average was around 5.0,
the comment in curly brackets is my opinion of STATS' projection}
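The ERP and R27 definitions in the note above can be sketched in Python as follows. The stat line is invented purely for illustration, and counting outs as AB - H + CS for the R27 step is an assumption taken from the out term inside the ERP formula.

```python
def erp(ab, h, d, t, hr, bb, sb, cs):
    """Estimated Runs Produced, following the formula in the note above."""
    outs = ab - h + cs
    return 0.16 * (3*h + 2*d + 4*t + 6*hr + 2*bb + sb - 0.605 * outs)

def r27(erp_value, outs):
    """ERP scaled to a 27-out game."""
    return 27 * erp_value / outs

# Hypothetical stat line, NOT taken from the STATS handbook.
line = dict(ab=500, h=140, d=30, t=3, hr=20, bb=50, sb=10, cs=5)
runs = erp(**line)
outs = line["ab"] - line["h"] + line["cs"]   # assumed definition of outs
rate = r27(runs, outs)
print(round(runs, 1), round(rate, 2))
```

An R27 near 5.0 from a line like this would be about average for the '97 A.L., per the note.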
Slot Pos Age Player      PA  ERP   R27  Comment and source
 1.  LF  24  Stewart    661   82  4.77  {R27 too low} (STATS projection)
     LF      Backup      96   11  4.00
 2.  2B  36  Fernandez  416   46  4.08  {PA was too low} (STATS, PA*1.84)
     2B  30  Kelly      212   21  3.55  {PA too high} (STATS)
     2B  33  Grebeck    111   11  3.88  (STATS)
 3.  CF  24  Cruz       643   97  5.99  {okay} (STATS)
     CF      Backup      78    9  4.00
 4.  DH  33  Canseco    441   66  5.78  {okay} (STATS for OAK - 2%)
     1B  26  Delgado    418   63  5.88  {R27 low} (STATS, PA*(2/3))
 5.  1B  35  Stanley    457   66  5.83  {okay} (STATS for NYY)
     1B  25  Crespo      72    9  4.52  {okay} (STATS, PA*0.27)
 6.  RF  25  Green      551   77  5.39  {R27 low} (STATS)
     RF      Backup     116   13  4.00
 7.  3B  30  Sprague    578   68  4.39  {okay} (STATS)
     1B  25  Crespo      71    9  4.52  {okay} (STATS, PA*0.27)
 8.  C   31  Fletcher   346   42  4.52  (STATS MON-2.5%, PA*(346/417))
     C   33  Santiago   285   33  4.24  {R27 low} (STATS, PA*(285/428))
 9.  SS  25  Gonzalez   526   61  4.63  {okay} (STATS)
     SS  24  TPerez      87    7  2.77  {okay} (STATS, PA*(87/192))
                             ---
     Total                   791 ERP, roughly league average
Some notes on the above:
* I don't know who the Jays' backup outfielder will be, so I used
Brumfield's projected R27 of 4.00 to fill in these slots.
* I just used 2/3rds of Delgado's plate appearances to allow for missing
the first 1/3rd of the season.
* Rather than split up the plate appearances of one of Stanley, Delgado and
Canseco over two slots, I put both Delgado and Canseco in the #4 slot,
but made sure the #4 and #5 slots added up to the right number together.
* What happens when Delgado comes back? Well, STATS' projections for both
Stanley and Canseco have them missing lots of plate appearances themselves,
so I actually needed a backup to fill in all of the PAs for these slots.
* I couldn't fit in all of the projected PAs for both Fletcher and Santiago,
so I cut them both back, especially Santiago's.
* STATS didn't have a projection for Tom Evans, and really I didn't have room
for him anywhere because of all the Sprague PAs, so I left him out. This
just makes the estimate more conservative.
* I don't know if the Jays will really keep both Kelly and Grebeck around,
but the projected PAs for Fernandez were so low that I decided to put them
both in, and then inflated Fernandez's PAs to fill in the slot. Including
both Kelly and Grebeck makes the projection more conservative. I could
have put in Crespo or Patzke but again I stayed conservative and left them
out.
* By 'conservative' I mean my estimate for Runs is likelier to be too low
than too high.
* Potentially a bad thing is that I don't project many poor performances --
usually a team has a few guys with a handful of PAs who don't do much.
However, I think a lot of the STATS projections are too pessimistic anyway,
so I've decided not to further compensate.
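The slot-filling described in these notes can be checked with a short Python sketch. The (PA, ERP) pairs are copied from the projection table; the variable names are illustrative only.

```python
# (PA, ERP) entries for each lineup slot, copied from the projection table.
lineup = {
    1: [(661, 82), (96, 11)],
    2: [(416, 46), (212, 21), (111, 11)],
    3: [(643, 97), (78, 9)],
    4: [(441, 66), (418, 63)],
    5: [(457, 66), (72, 9)],
    6: [(551, 77), (116, 13)],
    7: [(578, 68), (71, 9)],
    8: [(346, 42), (285, 33)],
    9: [(526, 61), (87, 7)],
}
slot_target = [757, 739, 721, 703, 685, 667, 649, 631, 613]

slot_pa = [sum(pa for pa, _ in lineup[s]) for s in range(1, 10)]

# Slots 4 and 5 are pooled, because Delgado and Canseco share the #4 line.
pooled_ok = slot_pa[3] + slot_pa[4] == slot_target[3] + slot_target[4]
others_ok = all(slot_pa[i] == slot_target[i]
                for i in range(9) if i not in (3, 4))

total_erp = sum(e for rows in lineup.values() for _, e in rows)
print(pooled_ok, others_ok, total_erp)
```

Both checks pass, and the ERP column sums to the 791 shown in the table's total line.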
So, the Jays project to have a league average offense in '98, which is a
big improvement on last year (last year's offense was projected to be 12th
out of 14, and we all know it was actually 14th out of 14).
------------------------------------------------------------------------------
'98 Pitching
------------------------------------------------------------------------------
STATS doesn't run nearly as many pitcher projections so I have to fill in
more on my own. I found that I was estimating something close to a 4.50 ERA
for the Jays' pitchers with no projections, so I decided to project 4.50 ERA
for all of them, which meant my Innings Pitched projections became irrelevant.
Note that STATS projects an IP and an ERA (among other things). I've
estimated RA (Runs Allowed per 9 Innings, including unearned runs) as
1.09*ERA (that's based on the '97 A.L. average). Then I calculated Runs (R)
from RA and IP. If you reverse the calculation (calculate RA from IP and R),
there might be slight inconsistencies due to rounding.
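A minimal Python sketch of that conversion, using the 1.09 factor quoted above. The function names are illustrative, and the 127 IP, 4.50 ERA line is chosen only because it makes the rounding effect visible.

```python
RA_PER_ERA = 1.09  # unearned-run factor quoted above ('97 A.L. average)

def runs_allowed(ip, era, factor=RA_PER_ERA):
    """Return (RA per 9 innings, total runs) from innings pitched and ERA."""
    ra = factor * era
    return ra, ra * ip / 9

def ra_from_runs(ip, runs):
    """Reverse calculation: RA per 9 innings from innings and total runs."""
    return 9 * runs / ip

ra, runs = runs_allowed(ip=127, era=4.50)
rounded_runs = round(runs)                 # runs are reported as whole numbers
ra_back = ra_from_runs(127, rounded_runs)  # differs slightly from `ra`
print(round(ra, 2), rounded_runs, round(ra_back, 2))
```

Going forward gives an RA of about 4.91 and 69 runs; reversing from the rounded 69 runs gives about 4.89, which is the kind of slight inconsistency mentioned above.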
The average A.L. team in '97 had 1445 IP, so I assumed that below:
------------
'98 Starters
------------
Age  Pitcher     IP   ERA    RA    R   GS  Source
35   Clemens    257  2.91  3.18   91   34  (STATS)
29   Hentgen    265  3.77  4.11  121   35  (STATS)
31   Guzman     146  4.81  5.25   85   25  (STATS)
33   Hanson     120  4.65  5.07   68   20  (STATS)
23   Carpenter  200  4.50  4.91  109   31  (me)
31   Williams   127  4.50  4.91   69   19  (me)
               ----
     starters  1115  4.01  4.38  543  162
The STATS projections left 50 starts to fill in, so I assumed Carpenter
and Williams would make them up. STATS projects Clemens to be great again,
but Guzman and Hanson to be worse than average, and miss a lot of the season.
(Remember, an RA of 5.0 was average in the '97 American League.)
-------------
'98 Relievers
-------------
Age  Pitcher     IP   ERA    RA    R  Source
29   Quantrill  103  4.89  5.34   61  (STATS)
35   Myers       59  3.71  4.05   27  (STATS for BAL + 1.5%)
36   Plesac      57  3.47  3.79   24  (STATS)
22   Escobar     63  4.50  4.91   34  (me)
28   Crabtree    48  4.50  4.91   26  (me)
                ---
     relievers  330  4.29  4.69  172
You'll see that I left out guys like Carlos Almanzar, Robert Person,
Bill Risley and Luis Andujar. I just ran out of innings. Since I was
assuming a 4.50 ERA for anybody not projected by STATS (and I used all
the STATS projections), this doesn't matter a lot.
Total: 1445 IP, 4.08 ERA, 4.45 RA, 715 R
So the Jays project to allow 4.45 runs per 9 innings, which would've ranked
4th in the A.L. last year. This is a decline from last year, when I projected
the Jays to be 1st in pitching, and they were actually 3rd.
------------------------------------------------------------------------------
'98 Projected Record
------------------------------------------------------------------------------
If we apply the Pythagorean formula we get the following projection:
791^1.83 / ( 791^1.83 + 715^1.83 ) == .546 WPct ==> 88 wins, 74 losses
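The same calculation as a minimal Python sketch. The function name is illustrative; the 1.83 exponent and the 162-game schedule come from the line above.

```python
def pythagorean_wpct(runs_scored, runs_allowed, exponent=1.83):
    """Pythagorean winning percentage from runs scored and runs allowed."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)

GAMES = 162
wpct = pythagorean_wpct(791, 715)
wins = round(wpct * GAMES)
losses = GAMES - wins
print(f"{wpct:.3f}", wins, losses)
```

Plugging in 791 runs scored and 715 allowed reproduces the .546 winning percentage and the 88-74 record.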
Last year, 88 wins would've ranked 4th in the A.L., ahead of Cleveland, who
made it to the Series, but behind the wildcard team (the Yanks with 96 wins
last year).
I think the hitting could be a lot better than the projections, so I wouldn't
be surprised if the Jays win 95 games or more. I would be surprised if they
were under .500 again.
------------------------------------------------------------------------------
Review of Last Year's Projection
------------------------------------------------------------------------------
Last year, using a similar approach (actually, I did things in a bit more
detail last year), I projected 83 wins for the Jays, and they actually won
76, an error of 7 wins. So what happened?
Before breaking it down, one thing to note is that the league average offense
fell 8% from '96 to '97, from 5.39 runs per game in '96 to 4.98 runs per game
in '97. For projecting wins and losses this is irrelevant, but for seeing
whose projections were too high or too low, I decided to divide everyone's
projections last year by 1.08 to allow for the change in the league average.
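A small Python sketch of that adjustment. The 5.00 projection is a made-up value for illustration, and the function name is hypothetical.

```python
RPG_1996 = 5.39   # league runs per game in '96, from the text above
RPG_1997 = 4.98   # league runs per game in '97, from the text above

league_factor = RPG_1996 / RPG_1997   # about 1.08

def to_1997_context(projected_r27, factor=1.08):
    """Deflate a projection made in the '96 run environment."""
    return projected_r27 / factor

# Hypothetical projection of 5.00 R27, not a figure from last year's post.
adjusted = to_1997_context(5.00)
print(round(league_factor, 3), round(adjusted, 2))
```

A player projected at 5.00 in the '96 run environment would be held to a standard of roughly 4.63 in '97 terms.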
With that adjustment, it turns out that last year's pitching projection was
only 1% too optimistic, while the hitting projection was 7% too optimistic.
If you look at the details, you'll see that a lot of the pitcher projections
weren't that close, but optimistic projections for guys like Guzman and
Crabtree were balanced by Clemens having a much better than projected year.
So the pitching estimate ended up okay; it was responsible for just 1 win
of the 7-win error.
If you look at the details of the hitting projections, they were actually
more accurate on a player-by-player basis than the pitcher projections,
except for a few key errors, all in the unfortunate direction:
* The biggest error was at second base. Garcia's projected R27 (dividing by
1.08 as per above) was just 3.80, but his actual was 2.50, a significant
error. Then the Jays managed to find a replacement who was even worse
(Duncan, 1.97), while I had projected Crespo, a good hitter, as the backup.
I figure that 4 of the 7 wins of error came at second base.
* The next biggest error was at backup outfield. Brumfield was projected
to be a 4.38, but was actually 2.33, and while he didn't play as much as
projected, the other outfield backups were mostly bad too (Sierra 2.94,
RPerez 2.02). This caused almost 2 more wins of error.
The other errors were smaller and cancelled out. For example, Santiago and
TPerez were each about half-a-win worse than expected, but Delgado and Green
were each about half-a-win better than projected.
Were the significant errors avoidable? I think the projection for Garcia was
reasonable considering past performance -- he was projected to be his usual
worse-than-average self, but turned out much worse. I probably shouldn't have put
in Crespo as the second-base backup, and this year I resisted the temptation
to put him in there again, or put in Patzke. I don't feel bad about not
predicting the Jays would think Duncan was the saviour.
The backup outfield projection, well, that projection wasn't far off what's
usually considered "replacement level". I've been more conservative this
year, though I'm still not predicting the awful levels of last year.
By the way, last year's projection is still available in full on the web at
http://www.stephent.com/jays/proj97.html
------------------------------------------------------------------------------
Disclaimer
------------------------------------------------------------------------------
As I noted last year, even if you know the final stats of all the players,
studies show that if you plug them into a computer and play a bunch of
simulated seasons, you can still get wildly different win totals.
We all know that you have to poll 1100 people randomly to get a result
that's considered accurate within +/- 3 percentage points, 19 times out of
20. A way to approximate the accuracy of a sample is 1/sqrt(n), e.g.
1/sqrt(1100) is roughly 0.03. Well, that suggests if you take a poll of
162 games, the accuracy is 1/sqrt(162), which is almost 8 percentage points,
or +/- 13 wins. I.e., if a team finishes with a .500 record (81 wins), that's
only evidence that they were "really" a 68-to-94-win team -- in other words, they
were somewhere between pretty bad and pretty good. How insightful.
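The rule of thumb above can be worked through in a few lines of Python. This is only a sketch of the 1/sqrt(n) approximation as stated, not a proper confidence-interval calculation, and the names are illustrative.

```python
from math import sqrt

def margin(n):
    """Rule-of-thumb sampling margin for a sample of size n."""
    return 1 / sqrt(n)

GAMES = 162
poll_margin = margin(1100)                 # about 0.03
season_margin = margin(GAMES)              # just under 0.08
win_margin = round(season_margin * GAMES)  # margin expressed in wins

wins = 81                                  # a .500 team
low, high = wins - win_margin, wins + win_margin
print(round(poll_margin, 3), round(season_margin, 3), win_margin, low, high)
```

For a .500 team this gives a margin of 13 wins and a range of 68 to 94, matching the figures in the paragraph above.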
Both of the above may suggest that predictions of season win-loss records
are essentially a lost cause. But I'd still rather go into the season with
a team projected to win 75-101 games than one projected to win 70-96. I
think that's the difference between how things looked before last year and
how things look before this year. This is a better team, but anything can
happen.
--
Stephen Tomlinson http://www.stephent.com/jays/
mailto:stephent@ottawa.com Ottawa, Ontario
"What is the evidence, and what does it mean?" (Bill James)
Last Updated: 1998 Feb 18
Comments are welcome at comments@stephent.com.