Friday, March 05, 2004

Is there a better measure of team defense in pro football?
After hearing every week during the NFL season how Bill Belichick, often described as a defensive mastermind, was terrorizing opposing offenses with his innovative schemes, I got to thinking, "Is New England really the best defense in the AFC?" At first glance I wasn't convinced. Their roster is filled with players taken from teams that didn't think twice about letting them leave. And while Ty Law, Richard Seymour and Willie McGinest are all impact players, other teams weren't clamoring for the Tedy Bruschis, Mike Vrabels, and Asante Samuels--despite their importance to the success of the Patriots.

Was the Patriots defense that much better than other teams? That's what I aimed to find out.

When thinking about what makes a defense successful, for me four things come to mind. First, how many yards do they give up; second, how long are they on the field; third, how many turnovers do they create; and finally, how many points do they allow?

Currently, the NFL ranks team defense by looking at total yards allowed per game (YPG) and total points allowed per game (PPG). While this lets you compare one team to another, it doesn't offer any insight into why some defenses are successful and others struggle. I hope to shed some light on this problem.

Every week during football season, most newspapers publish NFL team defensive rankings by both YPG and PPG. Without much thought, it certainly seems reasonable that defensive rankings should probably entail more than these two statistics. So what statistics provide a good measure of a defense's success?

Yards allowed per Game
First, let's start with YPG. This is pretty simple--the more yards a defense allows, the more likely it is the offense will score. In 2003 for example, for every seven points allowed, AFC defenses gave up 100 yards, on average.

But what if it was the case that a defense known for giving up huge chunks of yards routinely stopped the offense from scoring (maybe through turnovers, missed field goals or loss of downs)? For example, the New York Jets allowed 323 YPG (seventh highest in the AFC), but only gave up 18.7 PPG (fifth lowest in the AFC). Is there one statistic that gives us an idea of how many yards a defense allowed that resulted directly in points for the opposition?

If we take the yards allowed on scoring drives and divide them by the total yards allowed, we have an idea of the percentage of yards directly resulting in points (I'll call it Yards Given Up Resulting in Points--YGURP). For example, in week one, the Pittsburgh Steelers defense allowed 228 total yards against the Baltimore Ravens offense. However, only 82 of those yards contributed directly to Baltimore points (the Steelers won 34-15).

The Steelers YGURP = 82/228 = 0.36

Stated differently, 36 percent of all the yards allowed by the Steelers resulted in points for the Ravens. Now, instead of just having the raw number of yards a defense allows, we can see what percentage of yards contributed to points being scored. YGURP has its shortcomings, however. What if a team has a YGURP of 0.90? That means 90 percent of all yards allowed contributed to points being scored. On the surface that sounds horrible. But let's say a defense gives up one 90-yard drive and then plays stifling defense the rest of the game and only gives up ten more yards. You have a YGURP of 0.90 (90/100), but any defense that only gives up one score and 100 yards has put forth a pretty solid performance.
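
For anyone who wants to play with the numbers, here is a minimal Python sketch of the YGURP ratio. The function name and argument names are my own illustrative choices, not anything official; the inputs are the two examples above.

```python
def ygurp(scoring_drive_yards, total_yards):
    """Share of all yards allowed that came on drives ending in points."""
    if total_yards <= 0:
        raise ValueError("total_yards must be positive")
    return scoring_drive_yards / total_yards


# Week one, Steelers defense vs. Ravens: 82 of 228 yards came on scoring drives.
steelers = ygurp(82, 228)
print(f"Steelers YGURP: {steelers:.2f}")

# The one-long-drive scenario: a single 90-yard scoring drive, 100 yards total.
one_drive = ygurp(90, 100)
print(f"One-drive YGURP: {one_drive:.2f}")
```

The second call shows the shortcoming in miniature: the ratio alone can't tell a defense that gave up 100 yards from one that gave up 400.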

One way around this problem is to also consider the defense's average starting field position. Including field position in the YGURP calculation helps account for situations cited in the example above. Revisiting that example, let's assume that the defense's Average starting Field Position (AFP) was 68 yards from their own goal line (which was the AFC's average starting field position in 2003); we will still assume that they gave up 100 total yards, only one 90-yard scoring drive and have a YGURP of 0.90 (I divided AFP by 100, to get a number between 0 and 1).

If we recalculate YGURP when now considering AFP we get the following:

YGURP*AFP = 0.90*0.68 = 0.61

This new number now represents the yards given up that result in points when accounting for the average starting field position. 0.61 seems much more reasonable (and probably more reliable) than the 0.90 estimated above. I'll discuss this in more detail later.
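
The field-position adjustment is just one more multiplication. A small sketch, assuming AFP is supplied in yards from the defense's own goal line and divided by 100 as described above (the function name is hypothetical):

```python
def field_position_adjusted(ygurp_value, afp_yards):
    """Scale YGURP by average starting field position.

    afp_yards is the defense's average starting distance from its own
    goal line, in yards (0-100); dividing by 100 puts it on a 0-1 scale.
    """
    if not 0 <= afp_yards <= 100:
        raise ValueError("afp_yards must be between 0 and 100")
    return ygurp_value * (afp_yards / 100)


# The example above: a YGURP of 0.90 with the 2003 AFC-average AFP of 68 yards.
adjusted = field_position_adjusted(0.90, 68)
print(f"YGURP*AFP: {adjusted:.2f}")
```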

Time on the Field
The total time a defense spends on the field is very highly correlated with the number of points they allow, but it is seldom mentioned when discussing good defenses. We can take an approach similar to the one used to estimate YGURP when talking about getting a meaningful metric for time on the field.

If we take the ratio of the time on the field during scoring drives to the total time on the field, we have an idea of the percentage of time on the field directly resulting in points (call this Time on Field Resulting in Points--TOFRP). Revisiting the week one game between the Steelers and the Ravens, the Steelers defense was on the field a total of 27:20. Of that time, only 3:00 were on scoring drives.

The Steelers TOFRP = 3:00/27:20 = 0.11

That means that 11 percent of the time spent on the field was directly related to scoring drives. Just like YGURP, TOFRP also has some drawbacks. One can create a scenario where a defense is consistently starting at its own 10-yard line and as a result the offense scores quickly. This would lead to very high values of TOFRP, but would not necessarily be indicative of defensive performance--especially as it relates to time on the field. I will come back to ways around this limitation of TOFRP.
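
The only wrinkle in computing TOFRP is that clock times have to be turned into seconds before dividing. A minimal sketch, assuming times are written as "minutes:seconds" strings (the helper names are my own):

```python
def to_seconds(clock):
    """Convert a 'minutes:seconds' string such as '27:20' to seconds."""
    minutes, seconds = clock.split(":")
    return int(minutes) * 60 + int(seconds)


def tofrp(scoring_drive_time, total_time):
    """Share of defensive time on the field spent on scoring drives."""
    total = to_seconds(total_time)
    if total <= 0:
        raise ValueError("total_time must be positive")
    return to_seconds(scoring_drive_time) / total


# Week one, Steelers defense: 3:00 of 27:20 came on Ravens scoring drives.
steelers_tofrp = tofrp("3:00", "27:20")
print(f"Steelers TOFRP: {steelers_tofrp:.2f}")
```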

Turnovers Created
Surprisingly, turnovers do not correlate that highly with PPG. You would expect that the more turnovers a defense created, the fewer points it would allow. San Diego's defense allowed almost 28 points per game in 2003 (highest in the AFC), but was third in the AFC in turnovers created (30). The New York Jets' defense was fifth in the conference in fewest points allowed (18.7), but was 12th in the league in turnovers created (18). Why this might be the case raises some interesting questions, but I'll save that discussion for another time.

(I should note that even though turnovers were not highly correlated with PPG, team winning percentage was pretty highly correlated with turnovers. This indicates that turnovers do contribute to team wins, but they don't necessarily reduce the number of points the opposition scores.)

Points Allowed per Game
The statistic that most obviously portrays how well a defense is performing is PPG. San Diego gave up an AFC worst 28 PPG and finished at 4-12. New England gave up an AFC best 15 PPG and won the Super Bowl. So it's probably safe to say that the fewer points a team allows, the better their chances of winning. But is PPG the best measure to compare defensive effectiveness?

What if the San Diego defense had an average starting position on its own 20-yard line and the New England defense had an average starting position on its opponents' 20-yard line? That would mean that offenses would in all probability score more often on San Diego's defense than on New England's defense. So part of the blame for San Diego's high PPG should fall on San Diego's offense and special teams (since they are responsible for the lousy field position).

Is there a way to account for field position when determining PPG? If we weight the points allowed per game by where on the field the defense starts, we will have a much more accurate portrayal of PPG. The formula I used to obtain the Weighted Points Per Game (WPPG) is as follows:

(Points scored per Drive*Yards Allowed per Drive)/100

Returning to our ubiquitous week one example, the Ravens had thirteen offensive drives and scored on two of them. Their first scoring drive was 80 yards and resulted in a touchdown. Their final scoring drive was two yards and also resulted in a touchdown. To estimate the Steelers' WPPG using the formula above, we get:

Drive 1: (7*80)/100 = 5.60

Drive 2: (7*2)/100 = 0.14

Because drive 1 went 80 yards and resulted in a TD (and an extra point), we multiply the yards allowed by the defense by the points scored by the offense. This number divided by 100 gives us a weighted score of 5.60. Drive 2, on the other hand, started at the Steelers' two-yard line but also resulted in a TD. Doing the arithmetic shows that the weighted score is only 0.14.

The benefit of WPPG is that the score is now dependent on where the defense starts. The Ravens' second scoring drive was a result of a Pittsburgh turnover deep in its own territory. Weighting the ensuing touchdown by field position prevents the Pittsburgh defense from being 'penalized' by the Pittsburgh offense's miscue (as would be the case with the traditional measure of PPG).
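
Here is a short sketch of the per-drive weighting using the two Ravens scoring drives. Adding the drive scores together to get a game total is my assumption about how the per-game figure is built up; the function and variable names are illustrative only.

```python
def weighted_drive_score(points, yards_allowed):
    """Weight the points from one scoring drive by the yards the defense allowed."""
    return points * yards_allowed / 100


# (points scored, yards allowed) for each Ravens scoring drive in week one.
ravens_scoring_drives = [(7, 80), (7, 2)]

scores = [weighted_drive_score(p, y) for p, y in ravens_scoring_drives]

# Assumption: the weighted total for a game is the sum of its drive scores.
game_total = sum(scores)

for number, score in enumerate(scores, start=1):
    print(f"Drive {number}: {score:.2f}")
print(f"Weighted points for the game: {game_total:.2f}")
```

Under traditional scoring both drives count the same; here the short-field touchdown barely registers against the defense.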

Creating an Overall Measure of Team Defense
We now have three statistics that give us some insight into yards allowed, time on the field and points allowed. But how can we aggregate these numbers into one meaningful statistic of team defense? I took several steps to arrive at one measure, so let's go through them, one-by-one:

First, by multiplying YGURP by TOTYDS (the total yards allowed per game) we get the total yards given up resulting in points (let's call this TYGURP). For example, New England gave up 290.9 total yards per game (6th fewest in the AFC), but their TYGURP (this is simply 0.46*290.9) is 135.0 (which was a league best). TYGURP gives us a better sense of how many yards, of the total yards allowed, resulted in points. Next, we multiply TYGURP by AFP (call it T_TYGURP). Remember, AFP accounts for the defense's average starting field position. This metric will alleviate the problems detailed earlier concerning spurious values of YGURP.

We can also use the same methods for TOFRP. Multiplying TOFRP by TOT_TOF (the total time on the field per game) gives the total time on the field that resulted in points being scored (call it T_TOFRP). For example, New England's defense averaged 29:10 on the field per game (this ranked fifth lowest in the AFC). However, New England's T_TOFRP was 9:49 per game (ranking them second lowest in the AFC). The calculation, as above, is straightforward (0.34*29:10).
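
The two "total" statistics can be sketched in a few lines of Python using the New England figures quoted above. Two caveats: the ratios 0.46 and 0.34 are rounded to two decimals, so the results land near but not exactly on the 135.0 and 9:49 reported above; and New England's own AFP isn't listed here, so the AFC average of 68 yards stands in for it purely as an assumption. All names are illustrative.

```python
def to_seconds(clock):
    """Convert a 'minutes:seconds' string to seconds."""
    minutes, seconds = clock.split(":")
    return int(minutes) * 60 + int(seconds)


def to_clock(seconds):
    """Format a number of seconds as 'minutes:seconds'."""
    whole = round(seconds)
    return f"{whole // 60}:{whole % 60:02d}"


# New England's per-game figures from the text; the two ratios are rounded.
ygurp_value = 0.46
total_yards = 290.9
tofrp_value = 0.34
total_time_on_field = "29:10"

# Assumption: the AFC-average AFP (68 yards) stands in for New England's.
afp = 68 / 100

tygurp = ygurp_value * total_yards        # yards per game that led to points
t_tygurp = tygurp * afp                   # the same, adjusted for field position
t_tofrp_seconds = tofrp_value * to_seconds(total_time_on_field)

print(f"TYGURP:   {tygurp:.1f} yards")
print(f"T_TYGURP: {t_tygurp:.1f}")
print(f"T_TOFRP:  {to_clock(t_tofrp_seconds)}")
```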

With these new statistics, we're almost there. We still need to account for the weighted score. We can do this by multiplying T_TYGURP*T_TOFRP*WPPG (call it ADM--Aggregated Defensive Metric). ADM now gives us a way to look at several important variables that contribute to team defensive success with one number. These numbers by themselves aren't very intuitive, but we can rescale them to make them more meaningful. I used a 0.0-100.0 scale where 100.0 is a defense that gives up 0 PPG, 0 YPG and 0 TOFRP (the most dominant defense possible) and 0.0 is a defense that gives up 10 percent over the maximum PPG, maximum YPG and maximum TOFRP (a sufficiently bad defense that allows for variation with the worst 'real' defense in the AFC). Specifying the scale this way also allows for comparison of team defenses across years. Here are the results:

Team  ADM       NFL AVG   YPG        PPG       WPPG
BUF   84.3(4)   2.5(1)    276.6(2)   17.4(3)   9.3(4)
BAL   89.1(2)   2.5(1)    271.6(1)   17.6(4)   7.9(2)
NE    91.0(1)   3.5(3)    290.9(6)   14.9(1)   7.3(1)
DEN   81.8(5)   4.5(4)    277.6(3)   18.8(6)   9.9(5)
MIA   86.3(3)   5.5(5)    304.1(9)   16.3(2)   8.8(3)
CLE   78.2(7)   7.5(6)    299.1(8)   20.1(7)   10.1(6)
JAC   75.9(8)   7.5(6)    282.6(5)   20.7(10)  11.0(9)
NYJ   72.1(11)  8.0(8)    322.9(11)  18.7(5)   11.0(8)
PIT   81.4(6)   8.0(8)    295.6(7)   20.4(9)   10.7(7)
IND   75.3(9)   8.0(8)    279.1(4)   21.0(12)  11.9(11)
TEN   74.8(10)  9.0(11)   304.3(10)  20.3(8)   11.3(10)
KAN   69.0(12)  11.5(12)  339.5(12)  20.8(11)  12.7(12)
CIN   52.2(14)  14.0(13)  340.9(13)  24.0(15)  14.2(14)
OAK   53.3(13)  14.5(14)  364.8(16)  23.7(13)  13.8(13)
HOU   43.7(16)  14.5(14)  362.9(15)  23.8(14)  14.4(15)
SD    50.1(15)  15.0(16)  342.4(14)  27.6(16)  14.9(16)
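
For readers who want to experiment, here is a rough sketch of the aggregation and rescaling steps described before the table. It is one possible reading, not a reconstruction of the exact calculation behind the ADM column: it assumes T_TOFRP is measured in minutes, that the rescaling is linear, and that the 0.0 point is a hypothetical defense 10 percent worse than the worst observed value of each component. The input numbers are made up for illustration and are not 2003 data.

```python
def raw_adm(t_tygurp, t_tofrp_minutes, wppg):
    """Unscaled ADM: the product of the three component statistics."""
    return t_tygurp * t_tofrp_minutes * wppg


def rescale(raw, worst_components, margin=0.10):
    """Map a raw ADM onto 0-100, where 100 is a raw value of zero.

    worst_components holds the largest (worst) observed value of each of
    the three components; the zero point of the scale is a hypothetical
    defense that is `margin` worse than every one of them.
    """
    floor = raw_adm(*(value * (1 + margin) for value in worst_components))
    return 100 * (1 - raw / floor)


# Hypothetical inputs, for illustration only; not taken from the 2003 data.
worst = (160.0, 14.0, 15.0)     # worst T_TYGURP, T_TOFRP (minutes), WPPG
good_defense = raw_adm(90.0, 9.8, 7.3)

print(f"Perfect defense: {rescale(0.0, worst):.1f}")
print(f"Good defense:    {rescale(good_defense, worst):.1f}")
```

Because the floor is tied to the worst values in a given season, a scale built this way would need a fixed floor if scores are to be compared across years.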


The table above lists ADM, NFL AVG, YPG, PPG and WPPG, with each team's ranking in parentheses (notice that the teams are ranked in ascending order by NFL AVG and all other categories are compared to this one). For example, in the PPG column, Tennessee ranks 8th (20.3 PPG), but in the WPPG column, Tennessee ranks 10th (11.3 WPPG). Additionally, I took the average of the rankings for PPG and YPG as kind of an ad hoc measure of defensive success (Note: the NFL has a total team defense statistic--I've called it YPG here, but I'll use the NFL AVG for the sake of comparison).
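
The NFL AVG column is easy to reproduce from the table. This sketch copies the YPG and PPG ranks from the parentheses above and averages them; the dictionary layout is just one convenient way to hold the data.

```python
# (YPG rank, PPG rank) for each team, copied from the table above.
ranks = {
    "BUF": (2, 3), "BAL": (1, 4), "NE": (6, 1), "DEN": (3, 6),
    "MIA": (9, 2), "CLE": (8, 7), "JAC": (5, 10), "NYJ": (11, 5),
    "PIT": (7, 9), "IND": (4, 12), "TEN": (10, 8), "KAN": (12, 11),
    "CIN": (13, 15), "OAK": (16, 13), "HOU": (15, 14), "SD": (14, 16),
}

# NFL AVG is the plain average of the two ranks.
nfl_avg = {team: (ypg + ppg) / 2 for team, (ypg, ppg) in ranks.items()}

# Lower is better; sorted() is stable, so tied teams keep the table's order.
ordered = sorted(nfl_avg.items(), key=lambda item: item[1])
for team, value in ordered:
    print(f"{team:4s}{value:5.1f}")
```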

A couple of things stand out as you look at the table. First, PPG and WPPG are for the most part very similar. In fact, their correlation coefficient is 0.95 (where 1.0 means two variables are perfectly correlated). The largest difference in rankings when looking at PPG compared to WPPG is with the New York Jets; they go from fifth in PPG (18.7) to eighth in WPPG (11.0). The fact that PPG and WPPG are very similar implies that field position across teams was relatively similar. Houston's defense had the best AFP (their opponent's 29-yard line) while Jacksonville's defense had the worst AFP (their opponent's 35-yard line). Only six yards separate the best team from the worst team when comparing AFP, and because of that similarity, PPG and WPPG turn out not to be that different. Nonetheless, WPPG is a more precise measure of points allowed, because it also accounts for field position.
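
The correlation between the two columns can be checked directly from the table. The sketch below computes a standard Pearson correlation coefficient by hand (I'm assuming that is the coefficient meant above); the result should land close to the 0.95 quoted, with any small gap down to the table values being rounded to one decimal.

```python
from math import sqrt

# PPG and WPPG columns from the table above, in the same team order.
ppg = [17.4, 17.6, 14.9, 18.8, 16.3, 20.1, 20.7, 18.7,
       20.4, 21.0, 20.3, 20.8, 24.0, 23.7, 23.8, 27.6]
wppg = [9.3, 7.9, 7.3, 9.9, 8.8, 10.1, 11.0, 11.0,
        10.7, 11.9, 11.3, 12.7, 14.2, 13.8, 14.4, 14.9]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)


r = pearson(ppg, wppg)
print(f"PPG vs. WPPG correlation: {r:.2f}")
```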

ADM also correlates highly with NFL AVG. But what is important when considering ADM is how each team obtained its ADM ranking. For example, the NFL AVG ranks Buffalo and Baltimore as the two best defensive teams (based on YPG and PPG), with New England ranked third. ADM, however, has New England ranked first and Buffalo ranked fourth. Why? Well, remember that ADM is made up of several variables (YGURP, AFP, TOFRP, TOTYDS and WPPG) that account for defensive success. One reason the ADM ranked New England ahead of Buffalo was that New England led Buffalo in all the categories that comprise the ADM. In fact, when looking at the NFL AVG, Buffalo only fared better than New England in YPG (see the table above). To be more specific, out of all the AFC teams, New England ranked first in TYGURP and WPPG, second in T_TOFRP and third in AFP, while Buffalo ranked fourth, fourth, seventh and fourteenth, respectively.

Another team that dropped in the rankings when using ADM was the New York Jets. Their NFL AVG was eighth and the ADM rank was eleventh. This was primarily the result of the Jets being eleventh in the AFC in YPG, tenth in TYGURP, twelfth in T_TOFRP, eighth in WPPG and ninth in AFP. Despite the Jets being fifth in the conference in PPG, all of the factors contributing to the ADM resulted in their defense being no better than eleventh overall.

So after going through all this, what's the big deal about ADM? Well, on the surface there doesn't seem to be much difference between the ADM and the NFL AVG. But it's important to consider a couple of things:

First, ADM offers a more precise estimate of team defense because it considers variables that determine how successfully a defense performs. Yards given up that result in points, time on the field, average starting field position and weighted points per game are all important factors that affect how a defense will perform. Finding a way to incorporate all of these factors into one measure is a good thing (although it can certainly be argued that there are better measures than ADM).

Next, it's important to remember that the ADM estimated above looks only at AFC teams for the 2003 season. Perhaps looking at both conferences, or looking at data from previous years might yield more variation between the ADM and the NFL AVG (I'll put this on my to-do list). But as I stated above, the ADM does consider many variables when constructing a measure of team defense. Additionally, it more precisely portrays the overall effectiveness of team defense when compared to the statistical measures used before.

So to finally answer the question I posed at the beginning of this post: "Is New England really the best defense in the AFC?" Well, based on the ADM, New England gets a score of 91.0 with the Baltimore Ravens close behind with a score of 89.1. So as an unbiased observer (in reality I'm a Steelers fan so I don't even like the Patriots) I guess New England can claim the crown as the best defense in 2003. While the Patriots ranked near the top in most defensive categories, they also did the little things that translated into wins on the field. For example, they were sixth in the AFC in YPG, but were first in the conference in TYGURP. They ranked fifth best in time on the field, but ranked second best in TOFRP. They ranked ninth in turnovers created, but ranked first in WPPG. Maybe Belichick knew all of this years ago and fashions teams that emphasize these parts of the game. Whatever the case, the Patriots have proven over the last few years that a defense can be great without a surplus of great players--apparently what's more important is great strategy.

[A quick note: Hopefully this exercise has shed some light on new ways to measure team defensive success and has offered some insight into why some defenses succeed while others continue to struggle. Please send me an email with any questions or comments you might have (like why I included 'X' or excluded 'Y' in the estimates), because this is version 1.0. I'm sure there are a lot of ways this metric can be improved.]