One year ago, I started a Twitter account to share the football stats that I was constantly aggregating (@ButkusStats). I did not think this account would go anywhere, but I figured that I might as well share the mass amounts of data and spreadsheets that I had been accumulating over the last few years with no purpose besides arguing with people on Reddit. Surprisingly, the account gained some traction over the next 12 months, and I am thankful for each follower gained.
The point of this article is to share and explain some of my favorite stats for anybody out there who may not know about these resources or fully understand the calculations behind these stats. Links for where to find these stats are within this article.
“Football isn’t played on spreadsheets.”– Ben Baldwin
Key Football Stats Sources and Follows
- Pro Football Reference: Database of traditional football stats, with some limited advanced stat functionality. A great resource for volume stats, draft data, Hall of Fame data, and general NFL stats.
- Stathead: A search function for Pro Football Reference. Allows you to build searches by player, season, team, game, and draft class, among other options. Requires a subscription.
- Football Outsiders: In-depth stats that cannot be found anywhere else, using methods for analyzing skill players, linemen, special teams, and total team efficiency. @FO_ASchatz serves as the Editor-in-Chief, with many other contributors.
- Pro Football Focus: Premium stats included through the PFF Elite subscription.
- nflfastR: A set of functions to efficiently scrape NFL play-by-play data dating back to the year 1999. Developed by @MrCaseB (Author) and @BenBBaldwin (Maintainer, Author). Additional contributors include @LeeSharpeNFL, @BklynMaks, @Stat_Ron, @Stat_Sam, @_TanHo, and @John_B_Edwards.
- RBSDM.com: Advanced box scores that include win probability, EPA, CPOE, and success rates, among other items, for entire games and play-by-play data. Additionally, includes links to tools for calculating team offensive and defensive tiers, rushing and passing EPA, fumble recovery luck, third-down conversion rates over expected, QB heat maps, and a fourth-down decision calculator, among other items. All data scraped from nflfastR. Created by @BenBBaldwin.
- Sharp Football Stats: Data on personnel groupings, offensive and defensive efficiency, strength of schedule, play frequencies, among other items. Created by @SharpFootball.
- RYOE Calculator: Calculator and graphics for Rush Yards Over Expectation dating back to 1999. Data scraped from nflfastR. Created by @TejFBAnalytics.
- Inside the Pylon: A great source for their football term glossary and explanations.
Team Advanced Stats
Defensive-adjusted Value Over Average (DVOA)
Source: Football Outsiders
- This is a metric developed by Football Outsiders that measures a team’s efficiency by comparing success on every single play to the league average based on situation and opponent. This stat is shown as a percentage above or below average, with 0% representing league average.
- For Total DVOA, Offensive DVOA, and Special Teams DVOA, a positive number represents being better than league average. For Defensive DVOA, a negative number represents being better than league average.
- Offensive DVOA – Defensive DVOA + Special Teams DVOA = Total DVOA
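As a quick illustration of the formula above, here is a minimal Python sketch. The function name and the sample percentages are hypothetical and only show the arithmetic; they are not real Football Outsiders figures.

```python
def total_dvoa(offense: float, defense: float, special_teams: float) -> float:
    """Combine unit DVOA values (in percentage points) into Total DVOA."""
    # Defense is subtracted because a negative Defensive DVOA is better
    # than league average.
    return offense - defense + special_teams


# Hypothetical team: +10.0% offense, -5.0% defense, +1.5% special teams.
example_total = total_dvoa(10.0, -5.0, 1.5)
print(f"Total DVOA: {example_total:+.1f}%")  # Total DVOA: +16.5%
```

Note how the good (negative) defense adds to the total rather than taking away from it.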
For example, here are 2020’s Top 10 Teams by DVOA:
“I never wanted to revolutionize the way football teams were run—I wanted to revolutionize the way they were covered.”– Aaron Schatz
Expected Points Added (EPA)
- A statistic that aims to measure the value of individual plays in terms of points.
- This is done by calculating “Expected Points” for every down. Through the use of historical data, “Expected Points” are calculated for any given play based on down, distance, and field position. Then, the expected points are compared against the actual result of each play in order to determine the expected points added on the play.
- In 1969, Cincinnati Bengals QB Greg Cook won the AFL Rookie of the Year award. To this day, Cook still owns rookie records for yards per attempt (9.4) and yards per completion (17.5). However, Cook’s career was riddled with injuries beyond his rookie season. In 1970, backup QB Virgil Carter started 11 games for legendary head coach Paul Brown and offensive coordinator Bill Walsh.
- Carter was athletic, accurate on short throws, and a quick processor, but lacked the arm strength of Cook. In order to take advantage of Carter’s abilities, the West Coast offense was born, utilizing an attack focused on the short passing game. Carter would go on to earn a master’s degree from Northwestern while he was playing and co-publish a study with Northwestern professor Robert Machol that reviewed 8,000 NFL plays and assigned point values to various field positions.
- In 1988, Bob Carroll, Pete Palmer, and John Thorn used Carter and Machol’s research in order to develop an updated Expected Points framework for their book, “The Hidden Game of Football”.
- In the mid-2000s, Brian Burke created a more in-depth model for EPA and shared the calculations on the internet. ESPN, Football Outsiders, FiveThirtyEight, and others have since furthered their EPA applications.
“Really what happened was there was this confluence, this lucky moment in time where we had the internet, the data became freely available at the same time, computing horsepower got good enough, you could do these sorts of things in a reasonable amount of time on your own PC.”– Brian Burke
- EPA per play is the sum of Expected Points Added over the course of a game, season, or any other sample, divided by the number of plays. This can be utilized for an entire offense, a defense, or a specific player.
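The arithmetic behind EPA and EPA per play can be sketched in a few lines of Python. This is only a sketch: it assumes a play's EPA is the Expected Points after the play minus the Expected Points before it, and the Expected Points values below are made up for illustration rather than taken from any real model.

```python
from typing import List, Tuple


def epa(ep_before: float, ep_after: float) -> float:
    """Expected Points Added on one play (assumed: EP after minus EP before)."""
    return ep_after - ep_before


def epa_per_play(plays: List[Tuple[float, float]]) -> float:
    """Average EPA over a list of (ep_before, ep_after) pairs."""
    if not plays:
        raise ValueError("need at least one play")
    return sum(epa(before, after) for before, after in plays) / len(plays)


# Hypothetical four-play drive ending in a touchdown (7.0 points).
drive = [(0.9, 1.6), (1.6, 1.1), (1.1, 3.0), (3.0, 7.0)]
print(round(epa_per_play(drive), 3))
```

The same function works for a game, a season, or a single player's plays, since it only needs the list of plays you care about.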
For example, here are the Bears’ top three offensive plays by EPA in the 2020 season:
Win Probability (WP)
- Win Probability measures each play of a game in terms of how much it increased or decreased a team’s chance of winning that game.
- Bob Carroll, Pete Palmer, and John Thorn authored the book “The Hidden Game of Football”, previously mentioned in the EPA discussion. These men also developed a win probability framework that calculated a team’s chance of winning based on EP, point differential, and time remaining.
- Win Probability has been part of baseball sabermetrics for years. Being a little behind on the trend, the football mainstream began to adopt the stat in the mid-2010s.
Win Probability Added (WPA)
- WPA measures the impact of each play towards winning or losing a game. This stat is not predictive, and does not measure the true ability of any individual player or team. It is a measure of what actually happened on any given play.
- Most notably, WPA tells us which plays were most critical to each game. Through this data, we can determine which players shine the brightest in the biggest moments. Additionally, this stat can help us measure risk and reward in specific situations, such as whether taking a sack was better than throwing up a prayer into coverage, or whether a team should go for it on fourth down, punt, or kick a field goal.
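To make the idea concrete, here is a rough Python sketch that ranks plays by WPA. It assumes WPA is the team's win probability after the play minus its win probability before the play. The play descriptions and probabilities are invented for illustration.

```python
def wpa(wp_before: float, wp_after: float) -> float:
    """Win Probability Added on one play (assumed: WP after minus WP before)."""
    return wp_after - wp_before


# Hypothetical plays; win probabilities are on a 0-to-1 scale.
plays = [
    {"desc": "3-yard run in the 1st quarter", "wp_before": 0.50, "wp_after": 0.51},
    {"desc": "4th-quarter go-ahead touchdown pass", "wp_before": 0.35, "wp_after": 0.80},
    {"desc": "Sack on 3rd down", "wp_before": 0.60, "wp_after": 0.52},
]

# Sort from most helpful to most harmful play.
ranked = sorted(plays, key=lambda p: wpa(p["wp_before"], p["wp_after"]), reverse=True)
for play in ranked:
    print(f'{wpa(play["wp_before"], play["wp_after"]):+.2f}  {play["desc"]}')
```

The late touchdown tops the list even though all three plays could look similar in a traditional box score, which is the point of the stat.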
For example, here are the Bears’ top three offensive plays by WPA in the 2020 season:
“We confirmed that there’s a competitive advantage in analytics in a league that is structured to prevent you from having a competitive advantage.”– Joe Banner, former Eagles President
- Success rate is simple once you grasp EPA. All plays that result in EPA greater than zero are “successful”. Therefore, if a team has 80 offensive plays in a game and 40 of those plays resulted in EPA greater than zero, then they would generate a success rate of 50% (40/80).
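The 40-out-of-80 example above can be reproduced with a short Python sketch. The function name and the EPA values are hypothetical placeholders; only the "EPA greater than zero" rule comes from the definition above.

```python
from typing import List


def success_rate(epa_values: List[float]) -> float:
    """Share of plays with EPA strictly greater than zero."""
    if not epa_values:
        raise ValueError("need at least one play")
    successes = sum(1 for value in epa_values if value > 0)
    return successes / len(epa_values)


# Hypothetical game: 40 plays with positive EPA, 40 with negative EPA.
game = [0.5] * 40 + [-0.3] * 40
print(f"{success_rate(game):.0%}")  # 50%
```

Note that a play with exactly zero EPA counts as unsuccessful here, which follows the "greater than zero" wording.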
2020 Top 5 Offenses and Defenses in Success Rate
QB Advanced Stats
“When you’re trying to answer a question, you’re probably the first person in the history of the world to ever answer that exact question with this exact data.”– Michael Lopez, NFL Director of Football Data & Analytics
Completion Percentage Over Expectation (CPOE)
- This stat measures a QB’s performance on any given throw relative to the difficulty of that throw. Expected completion percentage is determined by air yards, target distance from closest defender and sideline, QB distance from closest pass rusher, speed at time of the release (throwing on the run), and time to throw the ball.
- Next Gen Stats has collected this data dating back to the 2016 season.
- The top throws in CPOE tend to be some of the best passing play highlights of the season.
- CPOE has proven to be the QB stat that best predicts the jump from NCAA success to NFL success. Chicago Bears rookie QB Justin Fields was the most accurate QB of the PFF College Football era by CPOE.
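Here is a minimal Python sketch of how a CPOE figure comes together once every throw has an expected completion probability. The probabilities below are invented. In practice they come from a model built on the tracking inputs listed above, which this sketch does not attempt to reproduce.

```python
from typing import List, Tuple


def cpoe(passes: List[Tuple[bool, float]]) -> float:
    """CPOE in percentage points.

    Each pass is (completed, expected_completion_probability), where the
    probability is a hypothetical model output between 0 and 1.
    """
    if not passes:
        raise ValueError("need at least one pass")
    actual = sum(1 for completed, _ in passes if completed) / len(passes)
    expected = sum(prob for _, prob in passes) / len(passes)
    return 100 * (actual - expected)


# Hypothetical sample: three completions on four throws of varying difficulty.
throws = [(True, 0.85), (True, 0.40), (False, 0.70), (True, 0.65)]
print(f"{cpoe(throws):+.1f}")
```

In this sample the QB completed 75% of his passes against an expected 65%, so he finishes about 10 percentage points over expectation.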
For example, here are the Bears’ top three offensive plays by CPOE in the 2020 season:
- The EPA + CPOE composite is a metric that combines EPA with CPOE.
- By combining EPA with CPOE, we can find the effectiveness of a QB’s style of play and what results it renders.
2020 QBs with Above Average EPA/Play and Below Average CPOE
- These QBs consistently generated value for their teams by EPA standards, but did not perform well in CPOE. The weak CPOE figures may be due to issues with downfield accuracy, a tendency to take difficult shots that fall incomplete, or a tendency to fall behind early, leading to the need for hero ball.
2020 QBs with Below Average EPA/Play and Above Average CPOE
- These QBs failed to consistently generate value for their teams by EPA standards, but performed well by CPOE standards. The strong CPOE figures may be due to a high checkdown rate or a lack of difficult throws attempted.
2020 QBs’ EPA + CPOE Composite
Other Advanced Stats
“If you love something or are passionate about something, you will do things that most people would find ridiculous.”– Neil Hornsby (PFF Founder)
Rushing Yards Over Expected (RYOE)
Source: RYOE Calculator
- This stat begins with Expected Rushing Yards. Expected Rushing Yards tells us how many rushing yards a ball-carrier is expected to gain on any given carry based on the location of the handoff, as well as the speed and direction of the blockers and defenders. All data used for this stat is based on historical data in similar circumstances.
- RYOE is the difference between actual rushing yards on a carry and expected rushing yards on a carry.
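A small Python sketch of the subtraction described above follows. The expected-yards numbers are made up. A real Expected Rushing Yards value would come from a model, which is outside the scope of this example.

```python
from typing import List, Tuple


def ryoe(actual_yards: float, expected_yards: float) -> float:
    """Rushing Yards Over Expected on a single carry."""
    return actual_yards - expected_yards


def ryoe_per_carry(carries: List[Tuple[float, float]]) -> float:
    """Average RYOE over (actual_yards, expected_yards) pairs."""
    if not carries:
        raise ValueError("need at least one carry")
    return sum(ryoe(actual, expected) for actual, expected in carries) / len(carries)


# Hypothetical carries: (actual yards, expected yards).
carries = [(7, 4.2), (2, 3.8), (12, 5.0)]
print(round(ryoe_per_carry(carries), 2))
```

A positive number means the ball-carrier gained more than a typical runner would have in the same situations.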
For example, here is every 2020 Bears RB Carry by RYOE, and Montgomery vs. Elliott vs. Chubb by Cumulative DYAR:
Defensive-adjusted Yards Above Replacement (DYAR)
Source: Football Outsiders
- This stat gives the yardage value of performance on plays where the WR/TE/RB caught or carried the football, versus the performance expected from a replacement level player in the same situation.
- First downs, touchdowns, and turnovers all have an estimated yardage value in this stat.
- Estimates for replacement level production are computed differently for each position.
- The ultimate goal of this stat is to evaluate players independent from the quality of their teammates.
WRs with over 1,000 Receiving Yards (Yards vs. DYAR Ranks)
TEs with over 600 Receiving Yards (Yards vs. DYAR Ranks)
RBs with over 900 Rushing Yards (Yards vs. DYAR Ranks)
Adjusted Line Yards (ALY)
Source: Football Outsiders
Adjusted Line Yards (ALY) aims to separate the abilities of a running back from the abilities of his offensive line. The calculation assigns every yard gained as follows:
- The offensive line is assigned 120% of the yardage on runs that lose yards.
- The offensive line gets credit for 100% of the yards gained up to four yards beyond the line of scrimmage.
- 50% of every yard from five to 10 yards beyond the line of scrimmage is assigned to the offensive line.
- The offensive line does not get any credit for yardage gained 11-plus yards beyond the line of scrimmage.
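The yardage split above can be sketched as a single Python function. This covers only the per-carry weighting listed in the bullets. The function name is hypothetical, and any further adjustments Football Outsiders applies on top of these weights are not modeled here.

```python
def line_yards(gain: int) -> float:
    """Yards credited to the offensive line on one carry.

    Weights follow the list above: 120% of losses, 100% of yards 1-4,
    50% of yards 5-10, and nothing beyond 10 yards.
    """
    if gain < 0:
        return 1.2 * gain
    full_credit = min(gain, 4)                     # yards 1 through 4
    half_credit = 0.5 * max(0, min(gain, 10) - 4)  # yards 5 through 10
    return full_credit + half_credit


# Hypothetical carries of -2, 3, 7, and 15 yards.
for gain in (-2, 3, 7, 15):
    print(gain, line_yards(gain))
```

A 15-yard run credits the line with only 7 yards (4 at full credit plus 6 at half credit), since the rest is attributed to the running back.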
Power Success Rate
Source: Football Outsiders
- Power success rate measures the success of run plays, rather than the yardage gained. The sample is based on third- and fourth-down attempts with two or fewer yards to go. Converting a first down means that the play was successful. This stat does include QB sneaks, as they are heavily dependent on the offensive line.
- This stat ultimately shows an offensive line’s ability to impose their will on an opponent when everybody knows that the offense will likely run the ball.
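Here is a rough Python sketch of the calculation, assuming each run is stored with its down, yards to go, and whether it converted. The field names and sample plays are hypothetical, and the filter uses only the third- and fourth-down rule described above.

```python
from typing import Dict, List


def power_success_rate(runs: List[Dict]) -> float:
    """Share of short-yardage runs (3rd/4th down, 2 or fewer to go) that convert."""
    qualifying = [
        run for run in runs
        if run["down"] in (3, 4) and run["yards_to_go"] <= 2
    ]
    if not qualifying:
        raise ValueError("no qualifying short-yardage runs")
    conversions = sum(1 for run in qualifying if run["converted"])
    return conversions / len(qualifying)


# Hypothetical runs; the first one does not qualify (1st down).
runs = [
    {"down": 1, "yards_to_go": 10, "converted": False},
    {"down": 3, "yards_to_go": 1, "converted": True},
    {"down": 3, "yards_to_go": 2, "converted": False},
    {"down": 4, "yards_to_go": 1, "converted": True},  # e.g. a QB sneak
    {"down": 4, "yards_to_go": 2, "converted": True},
]
print(f"{power_success_rate(runs):.0%}")  # 75%
```

Yardage gained never enters the calculation. A one-yard sneak that moves the chains counts the same as a 20-yard burst.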
2020 Top 10 Offensive Lines in ALY, and their rank in Power Success Rate
Thank you for reading, and hopefully you can find some use for all these stats, and enjoy them as much as I do. If you are interested in previous articles I have written, all of my football stat articles are included here as well. In the words of Peyton Manning:
“God bless you, and God bless football.”– Peyton Manning
And Go Bears!