On Saturday, when describing Steamer’s methodology at Saber Seminar, Dash presented the following graph:
Referring to this graph, he suggested that, for a hypothetical left-handed reliever who throws 95 mph, our best “prior” would be the 27% strikeout rate of his “peers” rather than the league-average rate of 18%. Dave Cameron pointed out that a left-handed reliever who throws 95 mph actually has few peers and asked how narrowly we defined our peer groups. This got us wondering: how narrowly *should* we define our peer groups? Does it matter? How large is the uncertainty in priors for pitchers with extreme velocities?
The four charts above show LOESS smoothing lines that attempt to describe the relationship between fastball velocity and strikeout rate for left-handed relievers. The gray regions around the lines show the uncertainty in each model. As you can see, the uncertainty is dramatically larger at extreme velocities. We have a very good sense of the best prior for a pitcher throwing 90 mph and little idea for someone throwing more than 98 mph. The charts on the left have a “span” of 0.5 and the ones on the right have a span of 1.0. The span is the fraction of the data used in each local fit, so the smaller the span, the narrower the peer group; the line with a span of 1.0 is therefore smoother and less sensitive to individual data points. Choosing a span of less than about 0.4 creates a noisy graph that fails the eyeball test. Steamer is currently using a smoother fit, much like the one on the upper right.
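For readers who want to see what the span does mechanically, here is a minimal sketch of a LOESS-style fit in plain Python. This is not Steamer's code: the function names, the tricube weights, the choice of a local *linear* fit (R's `loess` defaults to a local quadratic), and the toy velocity and strikeout numbers are all assumptions made for illustration.

```python
import math


def tricube(u):
    """Tricube kernel: weight 1 at distance 0, falling to 0 at distance 1."""
    u = abs(u)
    return (1 - u ** 3) ** 3 if u < 1 else 0.0


def loess_point(xs, ys, x0, span=0.5):
    """Local linear estimate at x0 from the nearest `span` fraction of points."""
    n = len(xs)
    k = max(2, min(n, math.ceil(span * n)))  # size of the "peer group"
    # Bandwidth: distance to the k-th nearest neighbor, inflated slightly so
    # that neighbor keeps a small positive weight.
    dists = sorted(abs(x - x0) for x in xs)
    h = max(dists[k - 1], 1e-9) * 1.001
    w = [tricube((x - x0) / h) for x in xs]
    # Weighted least squares for a straight line, evaluated at x0.
    sw = sum(w)
    mx = sum(wi * x for wi, x in zip(w, xs)) / sw
    my = sum(wi * y for wi, y in zip(w, ys)) / sw
    sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
    sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return my + slope * (x0 - mx)


# Toy data, invented for this example: fastball velocity (mph) and K rate.
velo = [86, 87, 88, 89, 90, 90, 91, 91, 92, 93, 94, 95, 96, 98]
k_rate = [0.14, 0.15, 0.16, 0.17, 0.18, 0.19, 0.19, 0.20,
          0.21, 0.23, 0.25, 0.27, 0.28, 0.33]

for span in (0.5, 1.0):
    prior = loess_point(velo, k_rate, 95, span=span)
    print(f"span={span}: prior K rate at 95 mph = {prior:.3f}")
```

A smaller span means fewer neighbors get any weight, which is exactly the "narrower peer group" trade-off: the fit tracks local wiggles more closely but leans on fewer pitchers.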
When our hypothetical pitcher becomes less hypothetical, say we’re forecasting Matt Thornton, we stumble upon another issue. A number of the data points in the neighborhood of 95 mph are, in fact, Matt Thorntons of years past. These points are shown in blue in the top row and have been removed from the data set in the bottom row.
Here’s the good news (for lazy forecasters): neither changing the span nor removing Matt Thornton makes much of a difference. In all four scenarios, our prior for Matt Thornton lingers near a 27% strikeout rate. This is, however, something we’ll need to be wary of in even more extreme cases. We shouldn’t let Aroldis Chapman’s peer group be dominated by Aroldis Chapman and we should recognize that his prior will always come with a great deal of uncertainty.
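The with-and-without comparison can be sketched in a few lines. To keep the example short, it swaps the LOESS fit for a simpler tricube-weighted average of the nearest peer seasons, and every name and number in it is made up: `peer_prior`, the `exclude` argument, and the "Target" and "Peer" seasons are hypothetical, not real pitcher data or Steamer's implementation.

```python
def tricube(u):
    """Tricube kernel: weight 1 at distance 0, falling to 0 at distance 1."""
    u = abs(u)
    return (1 - u ** 3) ** 3 if u < 1 else 0.0


def peer_prior(seasons, velo, exclude=None, k=6):
    """Weighted mean K rate of the k peer seasons closest to `velo`.

    `seasons` holds (pitcher, velocity, k_rate) tuples. Seasons belonging to
    `exclude` are dropped so a pitcher cannot serve as his own peer.
    """
    peers = [(v, kr) for name, v, kr in seasons if name != exclude]
    if not peers:
        raise ValueError("no peer seasons left after exclusion")
    nearest = sorted(peers, key=lambda p: abs(p[0] - velo))[:k]
    # Inflate the bandwidth slightly so the farthest peer keeps some weight.
    h = max(abs(nearest[-1][0] - velo), 1e-9) * 1.001
    weights = [tricube((v - velo) / h) for v, _ in nearest]
    total = sum(w * kr for w, (_, kr) in zip(weights, nearest))
    return total / sum(weights)


# Invented seasons: (pitcher, fastball velocity in mph, strikeout rate).
seasons = [
    ("Target", 95.0, 0.30), ("Target", 95.5, 0.29), ("Target", 96.0, 0.31),
    ("Peer A", 94.0, 0.25), ("Peer B", 94.5, 0.27), ("Peer C", 95.2, 0.26),
    ("Peer D", 96.5, 0.28), ("Peer E", 93.5, 0.24), ("Peer F", 97.0, 0.29),
]

with_self = peer_prior(seasons, 95.0)
without_self = peer_prior(seasons, 95.0, exclude="Target")
print(f"prior including own seasons: {with_self:.3f}")
print(f"prior excluding own seasons: {without_self:.3f}")
```

If the two numbers are close, as they were for Thornton, the pitcher's own history is not driving his prior. A large gap is the warning sign for the more extreme cases.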