GitHub - samhigh/Adv-ML-NFL

#Purpose

The NFL is notoriously unpredictable and known for unexpected results. As a consequence, NFL prediction is fertile ground for assessing the performance of various machine learning tools.

The purpose of this project is to explore the effectiveness of advanced machine learning techniques using multiple tiers of NFL data. This project proposes that machine learning techniques can be used to predict three things: the next play given a series of previous plays, the winner of a game, and the performance of a player in a single game. These predictions could be useful when applied to live play calling strategy, personnel analysis, sports betting, and fantasy football.
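As a minimal sketch of what "predicting the next play given a series of previous plays" could look like as a supervised learning problem, the snippet below turns a sequence of play types into (previous plays, next play) training pairs. The function name `make_windows`, the window length, and the sample drive are hypothetical and chosen only for illustration.

```python
def make_windows(plays, window=3):
    """Turn a play sequence into (previous plays, next play) training pairs."""
    pairs = []
    for i in range(window, len(plays)):
        # The `window` plays before position i are the features;
        # the play at position i is the label to predict.
        pairs.append((tuple(plays[i - window:i]), plays[i]))
    return pairs


# Hypothetical drive, for illustration only.
drive = ["run", "pass", "pass", "run", "punt"]
pairs = make_windows(drive, window=3)

for features, label in pairs:
    print(features, "->", label)
```

Each pair could then be encoded numerically and passed to any of the classifiers considered in this project.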

This project will also compare various machine learning algorithms and assess their strengths and weaknesses. We also examine the effect of data aggregation, as models can be trained on data at the play, drive, quarter, half, game, or season level.
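The aggregation levels mentioned above can be illustrated with a small sketch that rolls play-level records up to the drive level and the game level. The field names (`game_id`, `drive`, `play_type`, `yards`) and the sample rows are assumptions for illustration and do not reflect the actual columns of the project's dataset.

```python
from collections import defaultdict

# Hypothetical play-level records; field names are illustrative only.
plays = [
    {"game_id": "G1", "drive": 1, "play_type": "run", "yards": 4},
    {"game_id": "G1", "drive": 1, "play_type": "pass", "yards": 12},
    {"game_id": "G1", "drive": 2, "play_type": "pass", "yards": 0},
    {"game_id": "G1", "drive": 2, "play_type": "run", "yards": 3},
    {"game_id": "G1", "drive": 2, "play_type": "pass", "yards": 25},
    {"game_id": "G2", "drive": 1, "play_type": "run", "yards": 2},
]


def aggregate(plays, keys):
    """Sum play counts, yards, and pass counts for each group defined by `keys`."""
    totals = defaultdict(lambda: {"plays": 0, "yards": 0, "passes": 0})
    for play in plays:
        group = tuple(play[k] for k in keys)
        totals[group]["plays"] += 1
        totals[group]["yards"] += play["yards"]
        totals[group]["passes"] += 1 if play["play_type"] == "pass" else 0
    return dict(totals)


drive_level = aggregate(plays, ["game_id", "drive"])
game_level = aggregate(plays, ["game_id"])

print(drive_level[("G1", 2)])
print(game_level[("G1",)])
```

Coarser levels (quarter, half, season) would follow the same pattern with a different choice of grouping keys.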

#Previous Work

Between 1989 and 2012, Las Vegas selected the correct favorite at a rate of 61.2-74.7% [3]. Previous work [1] has achieved a success rate of 64%. The NFL also has one of the lowest rates of upsets among all sports at 36.4% [4].

There is no history of research on play prediction based on play by play data. There is also no research predicting the performance of players, although many fantasy football pundits offer projections. A comparison against such projections has been omitted, as aggregating a useful dataset for comparison is not feasible.

#Data Sets

Play by play Data: Consists of a summary of each play in each game, including time, down, yards to go, pass/run, distance, and current score, as well as other metrics. From 2009 to present.
NFL Combine data: A set of metrics used to measure the athleticism of players, as well as one test to measure mental aptitude.
Coach history: Head coach, offensive coordinator, defensive coordinator
Rosters: The players that make up a team
Weather: Weather of games
Injury History: Football is a contact sport with a high frequency of injuries.
Team Offensive and Defensive Aggregated Data
Temperature differences between teams' cities: Used in previous work [1]

#Algorithms

Deep Belief Nets - The structure of the deep belief nets will be varied and compared to determine the most useful structure. Play by play data can be represented as a time series. Deep belief nets have been successfully applied to time-series data in previous work [8] by treating the previous historical data as the feature set.
SVMs with Kernel Transformations - NFL game prediction has previously been done using SVMs with kernel transformations in [5], which uses linear, polynomial, and tangent kernels. This approach can be applied to play prediction.
Hidden Markov Model - Markov models can be used to predict plays by assessing the current state of the game. This can be used as another strategy to predict plays without considering the sequence of previous plays. A Markov model of football has been described in [9].
Logistic Regression - This will be used as a benchmark, as previous work has mainly used logistic regression for prediction.
Ensemble of the above - After optimization of the above models, a weighted ensemble will be explored.
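One simple way a weighted ensemble could combine the models above is weighted soft voting, where each model's class probabilities are averaged using per-model weights. The sketch below assumes this scheme; the function name `weighted_ensemble`, the probabilities, and the weights are all hypothetical and are not results from this project.

```python
def weighted_ensemble(predictions, weights):
    """Combine per-model class probabilities using a normalized weighted average."""
    total = sum(weights)
    combined = {}
    for probs, weight in zip(predictions, weights):
        for label, p in probs.items():
            combined[label] = combined.get(label, 0.0) + weight * p / total
    return combined


# Hypothetical next-play probabilities from three models (illustrative values).
dbn_probs = {"run": 0.55, "pass": 0.45}
svm_probs = {"run": 0.40, "pass": 0.60}
logit_probs = {"run": 0.35, "pass": 0.65}

combined = weighted_ensemble([dbn_probs, svm_probs, logit_probs], [0.5, 0.3, 0.2])
predicted = max(combined, key=combined.get)
print(combined, predicted)
```

The weights themselves are parameters that could be tuned on held-out data, for example with the optimization techniques described in the next section.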

#Optimizations

Particle Swarm Optimization - This optimization technique was used in previous work [8] to determine the most useful structure for deep belief nets. We will apply it to deep belief nets as well as the other techniques listed above.
Differential Evolution - This optimization technique will be used on the above algorithms. Comparisons will be made between this and particle swarm optimization.
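To make the first technique concrete, here is a minimal sketch of particle swarm optimization in its generic textbook form. It is not the implementation from [8]. The function name `particle_swarm_minimize`, the coefficient values, and the toy objective (a stand-in for a model's validation error as a function of two hyperparameters) are assumptions for illustration.

```python
import random


def particle_swarm_minimize(objective, bounds, n_particles=20, n_iters=100,
                            inertia=0.7, c_personal=1.5, c_global=1.5, seed=0):
    """Minimize `objective` over the box `bounds` with a basic particle swarm."""
    rng = random.Random(seed)
    dim = len(bounds)
    positions = [[rng.uniform(lo, hi) for lo, hi in bounds]
                 for _ in range(n_particles)]
    velocities = [[0.0] * dim for _ in range(n_particles)]
    personal_best = [p[:] for p in positions]
    personal_best_val = [objective(p) for p in positions]
    best_idx = min(range(n_particles), key=lambda i: personal_best_val[i])
    global_best = personal_best[best_idx][:]
    global_best_val = personal_best_val[best_idx]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity blends momentum, pull toward the particle's own best,
                # and pull toward the best position found by the whole swarm.
                velocities[i][d] = (
                    inertia * velocities[i][d]
                    + c_personal * r1 * (personal_best[i][d] - positions[i][d])
                    + c_global * r2 * (global_best[d] - positions[i][d])
                )
                lo, hi = bounds[d]
                # Move the particle and clamp it to the search box.
                positions[i][d] = min(max(positions[i][d] + velocities[i][d], lo), hi)
            value = objective(positions[i])
            if value < personal_best_val[i]:
                personal_best[i] = positions[i][:]
                personal_best_val[i] = value
                if value < global_best_val:
                    global_best = positions[i][:]
                    global_best_val = value
    return global_best, global_best_val


def toy_validation_error(params):
    """Toy objective with its minimum at (3, -1); stands in for a real model's error."""
    x, y = params
    return (x - 3.0) ** 2 + (y + 1.0) ** 2


best_params, best_error = particle_swarm_minimize(
    toy_validation_error, bounds=[(-10.0, 10.0), (-10.0, 10.0)]
)
print(best_params, best_error)
```

In practice the objective would train a model with the candidate parameters and return its validation error, which makes each evaluation far more expensive than in this toy example.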

#Study Plan

Data aggregation
Data pre-processing: Parsing of play by play data, mapping coaches to teams
ML Implementation
Optimization Implementation
Ensemble Implementation
Evaluation of models
Comparison of performance of different tiers of data

#Testing Validity

Accuracy of predicting the type of the next play (run, pass, punt, field goal) and the strategic location of the next play (left, right, inside, deep, short, middle)
Accuracy of predicting player performance in a game
Accuracy of predicting winners, to be measured against the 64% achieved by previous research [1], the historic rate for bookies, and the upset rate for teams.
Regression accuracy of predicting scores.
Analysis of bias and variance of various models.
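The winner-prediction comparison can be sketched as a plain accuracy calculation checked against the published benchmarks. The benchmark figures come from the Previous Work section; the function name `accuracy` and the sample predictions are hypothetical and are not project results.

```python
PRIOR_WORK_ACCURACY = 0.64       # success rate reported in [1]
VEGAS_ACCURACY_RANGE = (0.612, 0.747)  # favorite selection rate, 1989-2012 [3]


def accuracy(predicted, actual):
    """Fraction of predictions that match the actual outcomes."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Hypothetical winner predictions for five games (illustrative only).
predicted = ["home", "away", "home", "home", "away"]
actual = ["home", "home", "home", "away", "away"]

model_accuracy = accuracy(predicted, actual)
beats_prior_work = model_accuracy > PRIOR_WORK_ACCURACY
beats_vegas_low = model_accuracy > VEGAS_ACCURACY_RANGE[0]
print(model_accuracy, beats_prior_work, beats_vegas_low)
```

A real evaluation would use a held-out set of games far larger than five, since accuracy estimates on small samples are very noisy.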

#Potential Challenges

The size of the dataset is relatively small, as each team plays 16 regular season games and runs approximately 50-60 plays per game. We hope to mitigate this issue by training on historical data going back 6 years. It could also be assumed that coaches talented enough to earn multi-million-dollar contracts do not have predictable tendencies, as these would be exploited by opposing coaches.
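A rough back-of-the-envelope estimate of the dataset size follows from the figures above. The league size of 32 teams is an assumption not stated in the text, and the estimate covers regular season offensive plays only.

```python
TEAMS = 32               # assumption: league size, not stated in the text
GAMES_PER_TEAM = 16      # regular season games per team
PLAYS_PER_GAME = (50, 60)  # approximate plays run per team per game
SEASONS = 6              # years of historical data


def estimate_plays(plays_per_game):
    """Estimated number of plays across all teams and seasons."""
    return TEAMS * GAMES_PER_TEAM * plays_per_game * SEASONS


low, high = (estimate_plays(n) for n in PLAYS_PER_GAME)
print(f"Estimated plays over {SEASONS} seasons: {low:,} to {high:,}")
```

Under these assumptions the play-level dataset has on the order of 150,000 to 185,000 rows, while the game-level dataset is much smaller, at 256 games per season.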

#Deliverables

A report, source code, and data will be delivered. This project will be tracked and managed at a publicly available GitHub repository: https://github.com/aisobran/Adv-ML-NFL

#References

