For a decade, Football Outsiders has been using advanced analytics to measure and predict team performance. And since the Football Outsiders database now goes back to 1989, I thought it would be worthwhile to test the predictive power of Football Outsiders’ ratings.
If you’re not familiar, FO uses DVOA as its base measure of team strength. The goal here is to use DVOA ratings in Year N to predict win totals in Year N+1. Now, what expectations should we have for DVOA? The fact that the team with the best DVOA in history — Washington in 1991 — won only 9 games the following season is not a knock on DVOA. That was an outstanding Super Bowl team that declined significantly the following year. Ditto the 16-0 Patriots looking less impressive without Tom Brady in 2008. But at a minimum, DVOA must do better at predicting future wins than say, just wins. And it should also do better than Pythagenpat ratings, which only incorporate points scored and points allowed. So does it?
Let’s start with the basics. The best-fit formula [1] to project wins in Year N+1 using *only* wins in Year N is:
5.343 + 0.332 * Year N Wins (Correlation Coefficient: 0.32)
And, as shown last week, by using Pythagenpat wins, we get a correlation coefficient of 0.36. So what happens if we instead use Year N DVOA as our input? We get the following best-fit formula:
8.01 + 6.378 * DVOA (Correlation Coefficient: 0.39)
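To make the two single-input fits concrete, here is a minimal sketch that plugs a hypothetical team into each formula. The function names and the sample inputs (a 12-win team, a +20% DVOA team) are illustrative assumptions, not anything from the Football Outsiders data. DVOA is entered as a fraction (0.20 for 20%), which is the scale that reproduces the projected win totals in the table later in this post.

```python
def wins_from_wins(year_n_wins: float) -> float:
    """Project Year N+1 wins from Year N wins (best-fit line quoted above)."""
    return 5.343 + 0.332 * year_n_wins


def wins_from_dvoa(year_n_dvoa: float) -> float:
    """Project Year N+1 wins from Year N team DVOA, given as a fraction."""
    return 8.01 + 6.378 * year_n_dvoa


# Hypothetical inputs, chosen only to show how strongly both fits
# pull a good team back toward eight wins.
print(round(wins_from_wins(12), 2))    # 12-win team
print(round(wins_from_dvoa(0.20), 2))  # +20% DVOA team
```

Both lines regress hard toward the mean: even a 16-0 team projects to fewer than 11 wins under the wins-only formula.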
As a result, DVOA does beat both regular wins and the Pythagenpat ratings. Now, what if we use both DVOA ratings and number of wins to predict future wins? As it turns out, the wins variable was nowhere near significant (p = 0.61), which means once we know the DVOA ratings, knowing the number of wins adds no predictive power. In other words, the evidence doesn’t prove that a team with a lot of wins but an average DVOA rating is better than a team with an average number of wins and an average DVOA rating.
But can we improve on DVOA? What if instead of using Team DVOA as our input, we use Offensive DVOA, Defensive DVOA, and Special Teams DVOA? Team DVOA obviously incorporates all three of these elements, but perhaps analyzing team strength on a more granular level will tell us more about the appropriate weights. Keeping in mind that for defenses, a negative DVOA grade means an above-average defense, here is the best-fit formula to predict future wins with those three inputs:
8.01 + 6.779 * OFFDVOA – 5.642 * DEFDVOA + 6.518 * STDVOA
All four terms (the constant and the three DVOA inputs) are statistically significant, although as you might suspect the special teams one is the least statistically significant (at only p = 0.024; the other three have vanishingly small p-values). As it turns out, this gives us the same correlation coefficient of 0.39, but I prefer looking at teams using this formula. Also, there’s something else to keep in mind when looking at the weights on the coefficients. In general, the range of DVOA grades is wider for offenses (removing outliers, about -35% to 35%) than it is for defenses (about -30% to 25%) or special teams (-8% to 10%). So even though the weight on the special teams variable is larger than the weight on the defense variable, this doesn’t mean special teams is more important than defense.
Continuing the momentary diversion, the standard deviation of DVOA grades from 1989 to 2012 [2] was 14.5% for offense, 10.4% for defense, and 3.9% for special teams. From those numbers, one could put forth the argument that offense is roughly 50% of the game, defense is about 36% of the game, and special teams is around 14%.
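The arithmetic behind those shares, and behind the earlier point about coefficient weights, can be sketched in a few lines. This assumes the shares are simply each unit's standard deviation divided by the sum of the three, which reproduces the 50/36/14 split. The "wins per standard deviation" figure is an added illustration (coefficient times standard deviation), not a number from the regression output.

```python
# Standard deviations of DVOA grades (as fractions) and the regression
# coefficients from the three-input formula quoted above.
STD_DEV = {"offense": 0.145, "defense": 0.104, "special teams": 0.039}
COEF = {"offense": 6.779, "defense": -5.642, "special teams": 6.518}


def shares(std_dev: dict[str, float]) -> dict[str, float]:
    """Each unit's standard deviation as a fraction of the total."""
    total = sum(std_dev.values())
    return {unit: sd / total for unit, sd in std_dev.items()}


def wins_per_sd(coef: dict[str, float], std_dev: dict[str, float]) -> dict[str, float]:
    """Projected wins gained by a one-standard-deviation improvement."""
    return {unit: abs(coef[unit]) * std_dev[unit] for unit in std_dev}


for unit, share in shares(STD_DEV).items():
    print(f"{unit}: {share:.1%} of the game")

for unit, wins in wins_per_sd(COEF, STD_DEV).items():
    print(f"{unit}: {wins:.2f} wins per standard deviation")
```

Measured per standard deviation, defense moves the projection more than twice as much as special teams, even though its raw coefficient is smaller.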
What if we instead break down DVOA into five parts: Pass DVOA, Rush DVOA, Pass Defense DVOA, Rush Defense DVOA, and Special Teams DVOA? [3] The correlation coefficient does not change, but all variables wind up being statistically significant. The best-fit formula is:
7.64 + 3.069 * PASS OFF + 4.297 * RUSH OFF – 2.231 * PASS DEF – 3.990 * RUSH DEF + 6.368 * ST DVOA
The intercept drops to 7.64 because, on average, passing offenses have above-average ratings compared to the baseline FO is using. This is not the case with offenses as a whole, as team offensive DVOA had an average of zero throughout the period. That difference is due to the fact that false starts and delays of game are counted against the offense, but against neither the pass offense nor the rush offense. [4] The standard deviations for pass offense, rush offense, pass defense, and rush defense are 22.0%, 11.1%, 15.1%, and 9.4%. Add in the 3.9% for special teams, and here’s another potential conclusion: pass offense is 33% of the game, rush offense is 17%, pass defense is 22%, rush defense is 14%, and special teams remains at 14%. Those numbers sound and feel appropriate to me.
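The post does not spell out how the five-way split was computed. One reading that lands within a point of every quoted figure is to hold special teams at its earlier 14% and divide the remaining 86% among the four offensive and defensive units in proportion to their standard deviations. The sketch below works through that reading; the method and the names are assumptions, not the author's stated calculation.

```python
# Standard deviations in percentage points, as quoted above.
STD_DEV = {
    "pass offense": 22.0,
    "rush offense": 11.1,
    "pass defense": 15.1,
    "rush defense": 9.4,
}
SPECIAL_TEAMS_SHARE = 14.0  # assumption: carried over from the three-way split


def split_remaining(std_dev: dict[str, float], fixed_share: float) -> dict[str, float]:
    """Divide (100 - fixed_share) among units in proportion to their std dev."""
    remaining = 100.0 - fixed_share
    total = sum(std_dev.values())
    return {unit: remaining * sd / total for unit, sd in std_dev.items()}


for unit, share in split_remaining(STD_DEV, SPECIAL_TEAMS_SHARE).items():
    print(f"{unit}: {share:.1f}%")
print(f"special teams: {SPECIAL_TEAMS_SHARE:.1f}%")
```

Normalizing all five standard deviations together would instead put special teams near 6%, which is why this sketch treats the 14% as fixed.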
So what does this mean for 2013? We can use the formula [5] and the 2013 DVOA grades to project the number of wins for each team in 2014. Note that all the numbers for the five team grade columns should really be represented with a % sign, but including non-numeric data prevents the user from having the ability to sort the table.
Rk | Team | Pass O | Rush O | Pass D | Rush D | ST | Proj 2014 Wins |
---|---|---|---|---|---|---|---|
1 | SEA | 27.7 | 6.2 | -34.3 | -15.1 | 4.8 | 10.5 |
2 | DEN | 60.7 | 4.3 | 10.2 | -14.4 | -1.1 | 10 |
3 | PHI | 29.9 | 23.7 | 16.6 | -11.3 | -2.8 | 9.5 |
4 | CAR | 11.5 | 9.4 | -15.4 | -16.3 | 1 | 9.5 |
5 | SF | 31.8 | 2.2 | -2.1 | -8.1 | 3.7 | 9.4 |
6 | KC | 5.5 | 12.3 | -6.9 | -6.4 | 7.8 | 9.3 |
7 | NE | 28.1 | 7 | 3.9 | 4.3 | 6.7 | 9 |
8 | NO | 35.9 | -5.3 | -9.2 | -1.5 | -2.5 | 8.7 |
9 | CIN | 13.7 | -5.7 | -14.5 | -9.9 | 1.3 | 8.7 |
10 | ARI | 10.9 | -9.7 | -11.3 | -24.9 | -4 | 8.6 |
11 | SD | 51.3 | 3 | 23.9 | 8.6 | 0.8 | 8.6 |
12 | CHI | 28.6 | 1.4 | 7.5 | 9.8 | 2.1 | 8.2 |
13 | DAL | 15.8 | 7.9 | 20.8 | 4.3 | 3.5 | 8.1 |
14 | IND | 8.2 | 3.1 | 1.5 | -0.1 | -0.1 | 8 |
15 | STL | 0.1 | -14.9 | 4.8 | -17.1 | 6.3 | 8 |
16 | DET | 9.8 | -11.8 | 9.9 | -16.9 | -0.5 | 7.9 |
17 | GB | 12.2 | 10.9 | 21.3 | 6.4 | -0.4 | 7.8 |
18 | NYJ | -15.8 | -7.9 | 7.4 | -23.2 | 2.1 | 7.8 |
19 | MIN | -8.3 | 5.7 | 22 | -6.2 | 3.9 | 7.7 |
20 | TB | -1.5 | -11.6 | -0.1 | -15.4 | -1.5 | 7.7 |
21 | PIT | 23.5 | -14.9 | 8 | -0.9 | 0.6 | 7.7 |
22 | TEN | 5.5 | 2.7 | 6.3 | 1.3 | -3.2 | 7.6 |
23 | BUF | -14 | -3.9 | -22.8 | -3.2 | -5.6 | 7.4 |
24 | BAL | -8.4 | -27.2 | -4.8 | -13.6 | 6.4 | 7.3 |
25 | MIA | 3.8 | -4.1 | 0.1 | 4.8 | -2.3 | 7.3 |
26 | ATL | 14.6 | -6.9 | 24.6 | 1.3 | -0.1 | 7.2 |
27 | CLE | -9.1 | -7.1 | 14.1 | 0.6 | 1 | 6.8 |
28 | NYG | -18.9 | -20.3 | -6.9 | -17.2 | -5 | 6.8 |
29 | WAS | -14 | 4.5 | 12.7 | -5.4 | -12 | 6.6 |
30 | HOU | -19.4 | -9.4 | 15.9 | -11.4 | -5.1 | 6.5 |
31 | OAK | -20.4 | 0.8 | 22.1 | -3.6 | -7.1 | 6.3 |
32 | JAC | -24.2 | -27.2 | 20.1 | 1.1 | 2.6 | 5.5 |
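
For anyone who wants to reproduce the last column, here is a minimal sketch of the five-input formula with the adjusted constant of 7.697 from footnote 5. The function name and argument names are illustrative. The grades are taken in percentage points exactly as they appear in the table and divided by 100 before the coefficients are applied.

```python
def project_wins(pass_off: float, rush_off: float, pass_def: float,
                 rush_def: float, special_teams: float) -> float:
    """Project next-season wins from five DVOA grades in percentage points."""
    return (
        7.697  # adjusted constant (footnote 5)
        + 3.069 * pass_off / 100
        + 4.297 * rush_off / 100
        - 2.231 * pass_def / 100   # negative defensive DVOA is good
        - 3.990 * rush_def / 100
        + 6.368 * special_teams / 100
    )


# 2013 grades copied from three rows of the table above.
teams = {
    "SEA": (27.7, 6.2, -34.3, -15.1, 4.8),
    "DEN": (60.7, 4.3, 10.2, -14.4, -1.1),
    "JAC": (-24.2, -27.2, 20.1, 1.1, 2.6),
}
for team, grades in teams.items():
    print(team, round(project_wins(*grades), 1))
```

Rounded to one decimal, these three rows match the table's projections of 10.5, 10, and 5.5 wins.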
It’s important not to infer too much from tables like these. The only inputs are 2013 DVOA grades, and the formula doesn’t know that Robert Griffin III should be a lot better this year or about Houston’s draft picks, Oakland’s cap room, or Green Bay getting twice as much Aaron Rodgers in 2014. But what’s interesting to me is the teams that stand out as different from their Pythagenpat ratings. Here are some thoughts:
The Eagles are projected for nearly one full win more using the DVOA projection (9.5) than Pythagenpat (8.6). That makes some sense, I think, because Philadelphia had excellent offensive pass and rush DVOA grades, and the below-average special teams grade doesn’t mean much. Philadelphia did rank 4th in points, but I think their DVOA grades are farther from the mean than their points scored number indicates.
The Jets are a less intuitive example. I think part of the reason for optimism here is that New York forced 18 fumbles but recovered only two of them! That’s an absurd result, and one that would make the Jets look much better in DVOA grades than points differential. Also, while pass defense is more important than run defense generally, perhaps it’s not as predictive: the regression has a significant weight on rush defense, where New York excels. That’s why Football Outsiders has the Jets as nearly an 8-win team, compared to the 7-win team from Pythagenpat.
The Buccaneers are another team that is projected for nearly one more full win (7.7 vs. 6.8) in 2014. Tampa Bay was 30th in points scored and 32nd in NY/A, but 22nd in Football Outsiders’ pass DVOA. I’m not sure of the reason for the discrepancy (perhaps a higher weight on completion percentage?), but it does mean FO is more optimistic on the Bucs passing game, and by extension, the team. But perhaps the biggest reason is strength of schedule. Tampa’s schedule was the hardest in the league at 3.6 points tougher than average according to the Simple Rating System, and Football Outsiders also graded the Bucs as having the hardest schedule in the league.
Those three teams stand out as the biggest beneficiaries when using Football Outsiders’ analysis as opposed to just straight points differential. There’s no team significantly harmed by the analysis, although the Bengals, Colts, and Jaguars come closest. The Bengals lose about two-thirds of a win, and I suspect it’s because of the Andy Dalton effect. Cincinnati ranked 6th in points scored but just 12th in pass offense DVOA and 20th in rush offense DVOA. In other words, the Football Outsiders analysis is not nearly as high on the Bengals offense, which would reduce their expected wins total in 2014. The Bengals also had the 8th easiest schedule in 2013 by DVOA.
Anyway, I think these are a pretty useful starting point for your 2014 team projections rather than say, last year’s standings. Even Football Outsiders won’t use these for more than a starting point — their preseason projections will have the customary tweaks for things like teams getting new quarterbacks, injuries (or the lack thereof) in 2013, rookies, offensive line continuity, etc. Everyone will handle those questions differently, but I do think the table above presents a nice base for everyone’s team projections.
References
↑1 | Over the period 1989 to 2012, excluding the 1994, 1998, and 2001 seasons. |
---|---|
↑2 | Excluding ’94, ’98, and ’01 |
↑3 | I’ll leave it to someone with more time or inclination to break down the relationship between kicking, kickoffs, punting, kickoff returns, and punt returns re: special teams data. |
↑4 | As explained to me via e-mail by Aaron Schatz, the average of passing offense and run offense will always be higher than the average of pass defense and run defense because the offensive ratings account for things like false start and delay of game penalties, which are all negative. They’re also not included in defensive rating. So when running my regression, this means I’ve basically given teams a free pass when it comes to false start and delay of game penalties. I’m okay with that, but wanted to make the reader aware of this issue. |
↑5 | With one adjustment. When I ran the numbers, the average team was only winning 7.944 games, so I increased the constant from 7.642 to 7.697. |