In Buffalo’s loss to Tennessee on Sunday, Chan Gailey faced an interesting decision. Buffalo trailed 28-27 in the final seconds of the third quarter when Ryan Fitzpatrick hit Steve Johnson for a 27-yard touchdown. Now up 33-28, Gailey chose to kick the extra point, and ultimately saw his team lose, 35-34.

Why did Gailey choose to go for 1? Bill Barnwell has his theory:

[The next mistake was] Gailey’s decision to kick an extra point on a touchdown at the end of the third quarter, which created the margin of victory. By going for one with seconds left in the third and a five-point lead (pending the extra point), Gailey paid tribute to the long-standing rule that teams shouldn’t go for two and try to create a seven-point lead before the fourth quarter. It’s an absurd rule, of course, that breaks down when you ask anybody to explain at any length why it makes sense. The two-point conversion chart at footballcommentary.com suggests that the Bills should have tried to tack a two-pointer onto their 33-28 lead if their chances of converting were better than 24 percent. Because the clock hadn’t ticked for 10 additional seconds and bumped the decision into the fourth quarter, though, the Bills kicked and ended up losing by one.

When I read that, my reaction was “yep, that sounds about right.” Up 5 with just over 15 minutes left, it seems like the “stats-geek” move is to go for two while the “conservative old school train of thought” says it’s “too early” to go for two. Of course, if that’s all there was to the story, you wouldn’t be reading this post right now. Take it away, Jason Lisk:

When I look at the game winning probabilities at Advanced NFL Stats, though, Gailey’s decision was different [than Mike Tomlin’s]. It pains me to say that conventional wisdom is right here, but it is. With 15 minutes left, being up 5 is more costly than up 7 is beneficial with all the permutations. There are enough possessions that you can get beat by two field goals gained, or not extend the lead with another field goal.

When is it too late to go for one point in either of these situations, though? As it turns out, the answer is roughly between the 6 and 7 minute mark of the fourth quarter. That’s when possessions become more limited and you must try to tie, or make it where a touchdown doesn’t beat you.

A little surprised, I went over to Advanced NFL Stats and entered the numbers into Brian Burke’s Win Probability Calculator. Up 5, at the start of the 4th quarter, with the opponent having 1st and 10 at the 22 yard line, yields a 72% win probability for the leading team. Up 6 translates to a 77% win probability and up 7 increases it to 80%. That’s what Lisk meant: the difference between being up 5 and up 6 (5%) is greater than the difference between being up 6 and up 7 (3%).
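To see what those three numbers imply for the decision itself, here is a minimal sketch of the break-even calculation. It assumes the extra point is automatic and that a failed two-point try simply leaves the lead at 5; the function name and those simplifications are mine for illustration, not anything taken from Burke's calculator.

```python
from fractions import Fraction


def break_even_two_point_rate(wp_up5, wp_up6, wp_up7):
    """Conversion rate at which going for two is worth the same as kicking.

    Assumes (for illustration only) that the extra point always succeeds,
    so kicking is worth wp_up6, while going for two is worth
    p * wp_up7 + (1 - p) * wp_up5. Setting the two equal and solving for p
    gives (wp_up6 - wp_up5) / (wp_up7 - wp_up5).
    Inputs are win probabilities in whole percentage points.
    """
    return Fraction(wp_up6 - wp_up5, wp_up7 - wp_up5)


# Win probabilities quoted above from the Win Probability Calculator.
rate = break_even_two_point_rate(72, 77, 80)
print(f"Break-even conversion rate: {float(rate):.1%}")  # prints 62.5%
```

Under these inputs a team would need to convert about 62.5% of its two-point tries to justify going for it, far above the 24 percent threshold Barnwell cites from footballcommentary.com. That gap is the whole disagreement in one number.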

Nerd Fight! Brian is a good friend of the site and one of the smartest minds out there, but he’d be the first to tell you that his Win Probability model is not perfect. So the question we have to ask is, is this a situation where his Win Probability Model breaks down?

Let’s not forget what Barnwell noted: according to footballcommentary.com, going for 2 is the obvious call here. And let’s use my tried-and-true method for making any football decision. If you were a Titans fan, now trailing by 5 at the end of the 3rd quarter, would you have been happy to see Buffalo’s kicking team run onto the field, or would you have wished that they had gone for 2 instead? My gut tells me (and let’s stipulate that the Bills would have had a 50% chance of converting the 2-point attempt) that as a hypothetical Titans fan, I’d want Buffalo to kick the extra point. Being down 7 sounds really bad, while the difference between 5 and 6 seems pretty negligible to my Nashville gut.

The fact that this score happened at the end of the quarter is very convenient, because we can measure how frequently teams win games when leading after the third quarter by 5, 6, or 7 points. [1] But instead of just giving you the answer, I want you to take a second and guess what each answer is.

  • In 54 games from 2000 to 2011 (including playoffs), what percentage of the time did teams leading by 5 after the 3rd quarter ultimately win the game?

  • In 129 games from 2000 to 2011 (including playoffs), what percentage of the time did teams leading by 6 after the 3rd quarter ultimately win the game?

  • In 323 games from 2000 to 2011 (including playoffs), what percentage of the time did teams leading by 7 after the 3rd quarter ultimately win the game?


Pretty interesting results, eh? And they would certainly support Lisk’s and Burke’s idea that the difference between 5 and 6 is much greater than the difference between 6 and 7.

Hey, you went to Harvard, help me out on this.

If you’ve made it this far, I only have bad news for you: it’s about to get more complicated.

There are several ways we can go from here if we want to argue the Barnwell/Footballcommentary/gut theory: we could say that the data above undervalues teams leading by 5, overvalues teams leading by 6, or undervalues teams leading by 7. I’m going to choose door #2.

To me, the surprise in the data isn’t that “5” is so low but rather that “6” is so high. In fact, teams had a higher winning percentage when leading by 6 than by 7. We can write that off as a small-sample quirk given the minuscule difference, but what of the larger point that 6 is significantly closer to 7 than it is to 5?

There are two notable things going on here.

Sample size problems

Burke’s data is from 2000 to 2011, the era for which we have play by play logs, which form the backbone of his Win Probability Model. From 2000 to 2011, in regular season games, teams leading by 6 after three quarters won 78.4% of the time (98 out of 125). But from 1994 (the start of the two-point conversion era) to 1999, teams up by 6 after three quarters won only 62.5% of the time (45 out of 72). The table below shows how often teams won when leading by 6 after three quarters for each year since 1990:

Year    #Gm   #Wins   Win%
2011      8       4   50%
2010      9       8   88.9%
2009     10       8   80%
2008     13      10   76.9%
2007      7       5   71.4%
2006     14      10   71.4%
2005      7       6   85.7%
2004     12      12   100%
2003     10       9   90%
2002     17      11   64.7%
2001     11      10   90.9%
2000      7       5   71.4%
1999     14      11   78.6%
1998      8       6   75%
1997     18       8   44.4%
1996     15      10   66.7%
1995      7       5   71.4%
1994     10       5   50%
1993     20      18   90%
1992     12      10   83.3%
1991     14      13   92.9%
1990     11       6   54.5%
Total   254     190   74.8%

The results vary from year to year, as would be expected with tiny sample sizes, but on average, teams up by 6 entering the 4th quarter win 75% of the time. Since Burke was using data from games where teams won 79% of the time, he was essentially using a deck of games where teams didn’t come back as frequently as they are apt to do. As a result, I have to assume his win probability model overstates the likelihood of success when up by 6 entering the 4th quarter by a couple of percentage points.
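For anyone who wants to check the era splits, here is a small sketch that sums the yearly rows from the table above. The row values are transcribed from that table; the `era_record` helper and the choice of year ranges are just my illustration.

```python
# (year, games, wins) for teams leading by 6 after three quarters,
# transcribed from the table above.
ROWS = [
    (2011, 8, 4), (2010, 9, 8), (2009, 10, 8), (2008, 13, 10),
    (2007, 7, 5), (2006, 14, 10), (2005, 7, 6), (2004, 12, 12),
    (2003, 10, 9), (2002, 17, 11), (2001, 11, 10), (2000, 7, 5),
    (1999, 14, 11), (1998, 8, 6), (1997, 18, 8), (1996, 15, 10),
    (1995, 7, 5), (1994, 10, 5), (1993, 20, 18), (1992, 12, 10),
    (1991, 14, 13), (1990, 11, 6),
]


def era_record(first_year, last_year):
    """Return (games, wins) summed over the inclusive year range."""
    games = sum(g for y, g, _ in ROWS if first_year <= y <= last_year)
    wins = sum(w for y, _, w in ROWS if first_year <= y <= last_year)
    return games, wins


for first, last in [(1990, 1993), (1994, 1999), (2000, 2011), (1990, 2011)]:
    games, wins = era_record(first, last)
    print(f"{first}-{last}: {wins}/{games} = {wins / games:.1%}")
```

The 2000-2011 and 1994-1999 lines reproduce the 98-of-125 and 45-of-72 figures quoted earlier, and the full range reproduces the 190-of-254 total.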

Is that significant? Recall that the win probabilities when up by 5, 6, and 7 points were 72, 77 and 80. If that 77 was really, say, 74, that changes the analysis significantly: the difference between 5 and 6 would in fact be much smaller than the difference between 6 and 7. Therefore, I feel pretty comfortable siding with Barnwell and footballcommentary over Lisk and Burke. Both theory and my gut tell me that the difference between being up by 6 vs. 7 is larger than the difference between being up by 5 vs. 6. And while some data says otherwise, I think a larger sample size presents a clearer picture.
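Here is a rough sketch of how much that one number matters, using the 50% conversion rate I stipulated earlier. It treats the extra point as automatic, and the 74 is only the "say, 74" hypothetical from the paragraph above, not a measured value; the function name is mine.

```python
def go_for_two_wp(p_convert, wp_up5, wp_up7):
    """Expected win probability (in percentage points) when going for two."""
    return p_convert * wp_up7 + (1 - p_convert) * wp_up5


P_CONVERT = 0.5          # the 50% conversion rate stipulated earlier in the post
WP_UP5, WP_UP7 = 72, 80  # calculator values quoted above

go = go_for_two_wp(P_CONVERT, WP_UP5, WP_UP7)
for label, wp_up6 in [("calculator value", 77), ("hypothetical adjusted value", 74)]:
    # Kicking is treated as automatic (an assumption), so it is worth wp_up6.
    better = "go for two" if go > wp_up6 else "kick"
    print(f"up-6 = {wp_up6}% ({label}): go = {go:.0f}%, kick = {wp_up6}% -> {better}")
```

With the calculator's 77, kicking edges out the two-point try by a point; drop the up-6 figure to 74 and the two-point try comes out ahead by two. A swing of three percentage points in one input is enough to flip the call.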
[Update: Per pt’s comment, I went back and checked how teams up by 5 and 7 after 3 quarters fared going back to 1990, too. The data looks largely the same when looking at 7, but from 1994 to 1999, teams up by 5 after 3 quarters won 77.3% of the time, which is a much higher rate than from 2000 to 2011. From 1990 to 2011, teams up by 5 after 3 quarters won 69.2% of the time, so the data from 2000 to 2011 may slightly undervalue being up by 5, too. If that’s the case, it decreases the difference between 5 and 6 even further, and makes it more clear that going for 2 is the correct call.]

But I discovered something else in the data that I can’t quite understand.

Does the type of 6-point deficit matter?
When leading after three quarters by a score of 6-0, 13-7, 20-14, or 27-21, teams won 77.6% of the time since 1994 (90 out of 116). When leading after three quarters by a score of 9-3, 16-10, 23-17, or 30-24, teams won just 61.9% of the time (39 out of 63). You might think that just means that lower scoring games are less likely to see a comeback, but that’s not the case. In reality, the former set of games (we can call them the “opponent has zero field goals” games, or “0FG”) simply correlates better with winning than the latter group, the “opponent has one field goal” games (“1FG”). Being up 27-21 (100% in 8 games) is better than being up 23-17 (53.8%), being up 20-14 (78.1%) is better than being up 16-10 (69.7%), and being up 13-7 (75.4%) is better than being up 9-3 (50%).

Now, why in the world would this be the case? Small sample size is always a possible answer here, as our N is pretty small. But the fact that the winning percentage of the leading teams in the 0FG games holds consistent across a sliding scale makes me at least consider the possibility that something is actually going on; the p-value is less than 3%, which makes the results statistically significant, although that is far from conclusive.
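As a sanity check on that p-value, here is a sketch of one common way to compare the two win rates, a pooled two-proportion z-test. This is my assumption about a reasonable test, not necessarily the calculation behind the figure above, and the function name is illustrative.

```python
import math


def two_proportion_z_test(wins_a, games_a, wins_b, games_b):
    """Return (z, two-sided p-value) for a pooled two-proportion z-test."""
    p_a, p_b = wins_a / games_a, wins_b / games_b
    pooled = (wins_a + wins_b) / (games_a + games_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / games_a + 1 / games_b))
    z = (p_a - p_b) / se
    # Two-sided tail area under the standard normal curve.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# 0FG games: 90 wins in 116; 1FG games: 39 wins in 63 (figures from the text).
z, p = two_proportion_z_test(90, 116, 39, 63)
print(f"z = {z:.2f}, two-sided p = {p:.3f}")
```

Run on the 90-of-116 and 39-of-63 records, this test lands a little under 3%, in line with the figure above. With groups this small, an exact test might give a somewhat different answer, so treat it as a rough check.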

Note that Gailey in this case was in one of those 0FG games, so obviously that means he should have won. So far in 2012, in 0FG games, teams are 3-3 when leading by 6, with the three losses coming in memorable games (Bills-Titans, Patriots-Ravens, and Dolphins-Cardinals). In 1FG games, teams up by 6 are 2-0. So maybe there is nothing to the data. But I can’t help but wonder if there is.

Do you remember my Snake Eyes Post, where I noted that teams trailing by 8 win less frequently than teams trailing by 9? I wonder if the same effect could possibly be going on here, where teams up 23-17 feel like they’re in a better position to win than teams up 20-14 (although that doesn’t really pass the smell test). The reason teams in the 0FG games have a higher winning percentage is mostly that they score more points in the 4th quarter: on average, 5.6 points (while allowing 6.3 points) vs. 4.6 points (and allowing 6.5 points) for teams in the 1FG games. Teams in 0FG games were shut out in the 4th quarter 26% of the time, while teams in the 1FG games were shut out 35% of the time. Who knows, there may be nothing there, but it’s yet another odd quirk in the fun annals of NFL history.

References
1 To be clear, this isn’t perfect; in these examples, we don’t know who has possession, what down it is, or where the ball is placed.