Thursday, August 19, 2004

Yet another attempt to predict NFL winners

(from the Post Gazette)

I’m a little late on this, but this may be more relevant now than when I originally got the idea if you’re a Steelers fan. Charlie Batch is out for the season and Ben Roethlisberger has graduated to the backup role with Brian St. Pierre now a bona fide third-stringer.

Ok, this is my last attempt (at least this week) at making predictions about how the Steelers will do in 2004. I hit on the subject a few weeks ago with a couple of posts (here and here) and got some comments recommending I use more data (last time I only looked at the AFC North for no other reason than laziness).

Well this time, I looked at every NFL team over the last 15 seasons. Specifically, I had each team’s record (wins and losses), offensive rank, defensive rank and whether the starting QB was in his first season as a starter (meaning that he doesn’t necessarily have to be a rookie. In fact, when I first started collecting the data, I was interested in seeing what we could expect out of Carson Palmer this season). Now, with Batch down, this could also apply to a first sighting of either Roethlisberger or St. Pierre for the Steelers – and given that Boller only played nine games last season, we could loosely apply the results to the Ravens as well.

Anyway, on to the good stuff. If you’re interested (if you’re not interested, skip this paragraph), I ran a cross-sectional time series model to predict team wins in a season based on offensive rank (points scored), defensive rank (points allowed) and whether the QB was in his first season as a starter. Before you get all fired up: yes, there are a million other variables that predict winning, but I considered offensive and defensive rank a good proxy for most of them, and I’m interested in the effect of first-year QBs on wins. Here are the results:

Wins per season =
13.7 – (0.19*off. rank + 0.18*def. rank + 0.63*firstyrQB)
So what does this mean? Well, as a team’s offensive and defensive ranks get worse (get bigger), they can expect to win fewer games. Specifically, every time a team drops one place in the offensive rankings, they can expect to win 0.19 fewer games a season. Likewise, every time a team drops one place in the defensive rankings, they can expect to win 0.18 fewer games a season.

To make this more concrete, for every 5 places a team drops in either the offensive or the defensive rankings, they can expect to win, on average, about one fewer game a season (5 x 0.19 = 0.95 on offense, 5 x 0.18 = 0.90 on defense).

In terms of starting a first-time QB (I’m trying to distinguish between a “first-time” QB and a “rookie” QB here. Kyle Boller was both last season. Carson Palmer is the former this season because obviously he didn’t play in 2003), a team can off the bat expect to win 0.63 fewer games than a team that has a veteran QB under center. And if your next question is whether these variables are statistically significant, the answer is yes, and the r-squared (how well the model fit the data) was 0.78.
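
If you want to plug numbers into the equation yourself, here is a minimal Python sketch. The coefficients are the ones from the equation above; the function name and the example team (ranked 16th on both sides of the ball) are made up purely for illustration.

```python
# Coefficients copied from the fitted equation in the post.
INTERCEPT = 13.7
OFF_COEF = 0.19            # wins lost per place dropped in offensive rank
DEF_COEF = 0.18            # wins lost per place dropped in defensive rank
FIRST_YEAR_QB_COEF = 0.63  # wins lost when the QB is a first-time starter


def predicted_wins(off_rank: int, def_rank: int, first_year_qb: bool) -> float:
    """Predicted wins per season from the post's equation."""
    penalty = (OFF_COEF * off_rank
               + DEF_COEF * def_rank
               + FIRST_YEAR_QB_COEF * int(first_year_qb))
    return INTERCEPT - penalty


# Hypothetical team: 16th on offense and 16th on defense (made-up ranks).
veteran = predicted_wins(16, 16, first_year_qb=False)
first_timer = predicted_wins(16, 16, first_year_qb=True)

print(f"veteran QB:    {veteran:.2f} wins")
print(f"first-time QB: {first_timer:.2f} wins")
print(f"dropping 5 places on offense costs {5 * OFF_COEF:.2f} wins")
```

The only difference between the two printed predictions is the 0.63-win first-time QB penalty.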

One measure of how good a model you have is how well the predicted wins match the actual wins. This model does pretty well. Here are the results for all 32 NFL teams in 2003 (both actual wins and wins predicted by the model):

Team won pred diff
NE 14 11.2 2.8
KC 13 10.0 3.0
TEN 12 10.4 1.6
PHI 12 10.3 1.7
STL 12 10.0 2.0
CAR 11 9.0 2.0
BAL 10 10.4 -0.4
DEN 10 10.2 -0.2
MIA 10 9.9 0.1
DAL 10 9.4 0.6
GB 10 10.9 -0.9
SEA 10 9.5 0.5
MIN 9 8.4 0.6
CIN 8 6.2 1.8
NO 8 8.5 -0.5
CHI 7 5.2 1.8
SF 7 8.2 -1.2
TB 7 9.6 -2.6
PIT 6 7.4 -1.4
BUF 6 6.9 -0.9
NYJ 6 7.9 -1.9
CLE 5 6.0 -1.0
HOU 5 3.5 1.5
JAX 5 5.1 -0.1
ATL 5 4.5 0.5
DET 5 3.9 1.1
WAS 5 5.2 -0.2
OAK 4 4.2 -0.2
SD 4 5.0 -1.0
ARI 4 1.9 2.1
NYG 4 2.8 1.2

*won: actual wins, pred: predicted wins from model, diff: difference between actual and predicted wins.
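
One rough way to put a single number on "how well the predicted wins match the actual wins" is the mean absolute error, meaning the average size of the miss regardless of direction. The sketch below is not part of the original analysis; it just recomputes the diff column from the rows listed above and averages the absolute values. The variable names are my own.

```python
# Actual and predicted 2003 wins, copied from the table in the post
# (the 31 rows as listed; columns are team, won, pred).
TABLE_2003 = """
NE 14 11.2
KC 13 10.0
TEN 12 10.4
PHI 12 10.3
STL 12 10.0
CAR 11 9.0
BAL 10 10.4
DEN 10 10.2
MIA 10 9.9
DAL 10 9.4
GB 10 10.9
SEA 10 9.5
MIN 9 8.4
CIN 8 6.2
NO 8 8.5
CHI 7 5.2
SF 7 8.2
TB 7 9.6
PIT 6 7.4
BUF 6 6.9
NYJ 6 7.9
CLE 5 6.0
HOU 5 3.5
JAX 5 5.1
ATL 5 4.5
DET 5 3.9
WAS 5 5.2
OAK 4 4.2
SD 4 5.0
ARI 4 1.9
NYG 4 2.8
"""

rows = []
for line in TABLE_2003.strip().splitlines():
    team, won, pred = line.split()
    rows.append((team, int(won), float(pred)))

# diff = actual wins minus predicted wins, as in the table's last column.
diffs = {team: won - pred for team, won, pred in rows}

# Mean absolute error: the average size of the miss, ignoring direction.
mae = sum(abs(d) for d in diffs.values()) / len(diffs)

# The team the model missed by the most, in either direction.
worst_team = max(diffs, key=lambda team: abs(diffs[team]))

print(f"teams: {len(diffs)}")
print(f"mean absolute error: {mae:.2f} wins")
print(f"largest miss: {worst_team} ({diffs[worst_team]:+.1f})")
```
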
This model does a surprisingly good job of predicting wins using the existing data. But let’s see how it does predicting wins for the 2004 season. Before I go any further, I’ll have to make one assumption about the 2004 season. I assumed that each team’s 2004 offensive and defensive rank will be a weighted average of their previous performances (I know, this is seldom the case, but other than spending about 50 hours collecting more data, this is what I’m stuck with).
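
For anyone curious what a weighted average of previous ranks might look like in practice, here is a small Python sketch. To be clear, the weights and the rank history below are invented for illustration, and the actual weights behind the 2004 projections are not given in this post.

```python
def projected_rank(past_ranks, weights):
    """Weighted average of past ranks; both lists run most recent season first."""
    if len(past_ranks) != len(weights):
        raise ValueError("need exactly one weight per past season")
    total_weight = sum(weights)
    weighted_sum = sum(rank * w for rank, w in zip(past_ranks, weights))
    return weighted_sum / total_weight


# Hypothetical weights that favor recent seasons (NOT the post's weights).
WEIGHTS = [0.5, 0.3, 0.2]

# Made-up offensive ranks for one team over three seasons, most recent first.
off_history = [20, 10, 5]

projection = projected_rank(off_history, WEIGHTS)
print(f"projected offensive rank: {projection:.1f}")
```

The projected rank would then be fed into the wins equation in place of the real (and not yet known) 2004 rank.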

Ok, if you’ll concede me that last point, let’s look at what this model predicts:

Team 03wins 04wins
NE 14 10
KC 13 10
IND 12 9
TEN 12 10
PHI 12 10
STL 12 9
CAR 11 8
BAL 10 9
DEN 10 10
MIA 10 10
DAL 10 9
GB 10 10
SEA 10 8
MIN 9 9
CIN 8 6
NO 8 8
CHI 7 6
SF 7 9
TB 7 8
PIT 6 8
BUF 6 8
NYJ 6 8
CLE 5 6
HOU 5 4
JAX 5 6
ATL 5 6
DET 5 5
WAS 5 7
OAK 4 6
SD 4 6
ARI 4 3
NYG 4 5

Of course the first thing that sticks out is that there isn’t much difference between what teams actually did in 2003 and what they’re predicted to do in 2004. The reason is how I estimated the offensive and defensive rankings (obviously, we already know whether a team will be starting a first-time QB). But what may be mildly interesting is that of the seemingly 4 million methods I’ve looked at when trying to predict how the Steelers will fare in 2004, they all seem to come in around eight or nine wins (see here, here and here for my other attempts at prognostication).

In terms of the AFC North, the Ravens are again the favorites (9 predicted wins) followed by the Steelers (8 predicted wins), and then the Bengals and Browns (both have 6 predicted wins). As I’ve said before, I certainly hope the Steelers do much better than 8 wins, and I think having Deion Sanders sign with the Ravens is a start.