nflfastR is a set of functions to efficiently scrape NFL play-by-play data.
nflfastR expands upon the features of nflscrapR:
- Completion probability (cp), completion percentage over expected (cpoe), and expected yards after the catch (xyac_epa and xyac_mean_yardage) in play-by-play going back to 2006
- A function update_db() that creates and updates a database
We owe a debt of gratitude to the original nflscrapR
team, Maksim Horowitz, Ronald Yurko, and Samuel Ventura, without whose
contributions and inspiration this package would not exist.
The easiest way to get nflfastR is to install it from CRAN with:
install.packages("nflfastR")
To get a bug fix or to use a feature from the development version, you can install the development version of nflfastR either from GitHub with:
if (!require("pak")) install.packages("pak")
pak::pak("nflverse/nflfastR")
or prebuilt from the development repo with:
install.packages("nflfastR", repos = "https://nflverse.r-universe.dev")
We have provided some application examples in the Getting Started article. However, these require a basic knowledge of R. For this reason we have the nflfastR beginner’s guide, which we recommend to all those who are looking for an introduction to nflfastR with R.
You can find column names and descriptions in the Field Descriptions article,
or by accessing the field_descriptions data frame from the package.
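As a minimal sketch of looking up a column from R, the snippet below searches a description table for a keyword. The helper find_fields() is hypothetical (it is not part of nflfastR) and makes no assumption about the table's column names; the call at the end only runs if nflfastR is installed.

```r
# Hypothetical helper (not part of nflfastR): return the rows of a
# description table in which any column mentions `pattern`.
find_fields <- function(descriptions, pattern) {
  hit <- apply(descriptions, 1, function(row) {
    any(grepl(pattern, row, ignore.case = TRUE))
  })
  descriptions[hit, , drop = FALSE]
}

# Only touch the package data when nflfastR is actually available.
if (requireNamespace("nflfastR", quietly = TRUE)) {
  print(find_fields(nflfastR::field_descriptions, "cpoe"))
}
```

Searching every column rather than a named one keeps the sketch independent of how the data frame is laid out.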
Even though nflfastR is very fast, for historical games we recommend
downloading the data from here.
These data sets include play-by-play data of complete seasons going back
to 1999 and we will update them in 2020 once the season starts. The
files contain both regular season and postseason data, and one can use
game_type or week to figure out which games occurred in the postseason.
Data are available as .csv.gz, .parquet, or .rds.
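The postseason split described above can be sketched with base R. The data frame here is a toy stand-in, not real play-by-play, and the game_type codes ("REG", "WC", "SB") are assumptions made for illustration; check the values in the file you download before relying on them.

```r
# Toy stand-in for a play-by-play table; the rows and the game_type codes
# ("REG", "WC", "SB") are illustrative assumptions, not real data.
pbp <- data.frame(
  game_id   = c("g1", "g1", "g2", "g3"),
  week      = c(1, 1, 18, 21),
  game_type = c("REG", "REG", "WC", "SB"),
  stringsAsFactors = FALSE
)

# Keep only plays from games that are not regular-season games.
postseason <- pbp[pbp$game_type != "REG", ]

# Count distinct postseason games.
n_post_games <- length(unique(postseason$game_id))
```

Filtering on week instead would require knowing how many regular-season weeks the season had, which this sketch deliberately does not assume.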
nflfastR uses its own models for Expected Points, Win
Probability, Completion Probability, and Expected Yards After the Catch.
To read about the models, please see this
post on Open Source Football. For a more detailed description of the
motivation for Expected Points models, we highly recommend this paper from the nflscrapR team
located here.
Here is a visualization of the Expected Points model by down and yardline.
Here is a visualization of the Completion Probability model by air yards and pass direction.
nflfastR includes two win probability models: one that incorporates the
pre-game spread and one that does not.
(nflfastR uses this source for 1999 and 2000 and previously also used it for 2001-2010)
nflfastR models
nflfastR 1.0
The nflscrapR team, Maksim Horowitz, Ronald Yurko, and Samuel Ventura, whose
work represented a dramatic step forward for the state of public NFL research