<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<!-- Title for the page -->
<title>The role of Salary within the NBA</title>
<!-- Link to my style sheet, fonts -->
<link rel="stylesheet" href="index.css">
<link rel="stylesheet" href="https://fonts.googleapis.com/css?family=Roboto+Mono">
<!--The next three lines allow the Vega embed-->
<script src="https://cdn.jsdelivr.net/npm/[email protected]"></script>
<script src="https://cdn.jsdelivr.net/npm/[email protected]"></script>
<script src="https://cdn.jsdelivr.net/npm/[email protected]"></script>
</head>
<body>
<!-- CONDITIONAL SCRIPT TO EMBED BASED ON SCREEN WIDTH -->
<script>
// Find the current screen width:
let width = screen.width;
// Use an if statement to pick the appropriate visualisation:
if (width > 450) {
var myChart1 = "nba_project/charts/chart_1/count_contracts.json";
var myChart2 = "nba_project/charts/chart_2/payrolls_adj_cpi_bri.json";
var myChart3 = "nba_project/charts/chart_3/correlation_heatmap.json";
var myChart4 = "nba_project/charts/chart_3/predicted_playoff_payroll_2022_23.json";
var myChart5 = "nba_project/charts/chart_4/salary_vorp_regression.json";
var myChart6 = "nba_project/charts/chart_4/correlation_heatmap.json";
var myChart7 = "nba_project/charts/chart_1/salary_distribution.json";
} else {
var myChart1 = "nba_project/charts/chart_1/count_contracts_narrow.json";
var myChart2 = "nba_project/charts/chart_2/payrolls_adj_cpi_bri_narrow.json";
var myChart3 = "nba_project/charts/chart_3/correlation_heatmap_narrow.json";
var myChart4 = "nba_project/charts/chart_3/predicted_playoff_payroll_2022_23_narrow.json";
var myChart5 = "nba_project/charts/chart_4/salary_vorp_regression_narrow.json";
var myChart6 = "nba_project/charts/chart_4/correlation_heatmap_narrow.json";
var myChart7 = "nba_project/charts/chart_1/salary_distribution_narrow.json";
}
</script>
<!-- END - CONDITIONAL SCRIPT TO EMBED BASED ON SCREEN WIDTH -->
<!-- HEADER -->
<header class="header" id="webHeader">
<h2 class="visually-hidden">Header</h2>
<div class="wrapper">
<nav class="header__nav">
<h2 class="visually-hidden">Navigation</h2>
<a href="/" class="header__home">
Sam Blundell: Portfolio
<span class="visually-hidden">(to home page)</span>
</a>
<a href="https://github.com/SLBlundell" target="_blank" class="header__social">
<svg
xmlns="http://www.w3.org/2000/svg"
width="25"
height="24"
aria-labelledby="socialGitHub"
role="img"
>
<title id="socialGitHub">GitHub</title>
<path
fill="#FFF"
fill-rule="evenodd"
d="M12.304 0C5.506 0 0 5.506 0 12.304c0 5.444 3.522 10.042 8.413 11.672.615.108.845-.261.845-.584 0-.292-.015-1.261-.015-2.291-3.091.569-3.891-.754-4.137-1.446-.138-.354-.738-1.446-1.261-1.738-.43-.23-1.046-.8-.016-.815.97-.015 1.661.892 1.892 1.261 1.107 1.86 2.876 1.338 3.584 1.015.107-.8.43-1.338.784-1.646-2.738-.307-5.598-1.368-5.598-6.074 0-1.338.477-2.446 1.26-3.307-.122-.308-.553-1.569.124-3.26 0 0 1.03-.323 3.383 1.26.985-.276 2.03-.415 3.076-.415 1.046 0 2.092.139 3.076.416 2.353-1.6 3.384-1.261 3.384-1.261.676 1.691.246 2.952.123 3.26.784.861 1.26 1.953 1.26 3.307 0 4.721-2.875 5.767-5.613 6.074.446.385.83 1.123.83 2.277 0 1.645-.015 2.968-.015 3.383 0 .323.231.708.846.584a12.324 12.324 0 0 0 8.382-11.672C24.607 5.506 19.101 0 12.304 0Z"
/>
</svg>
</a>
<a href="https://www.linkedin.com/in/sam-blundell-7608b7196/" target="_blank" class="header__social">
<svg
xmlns="http://www.w3.org/2000/svg"
width="25"
height="24"
aria-labelledby="socialLinkedIn"
role="img"
>
<title id="socialLinkedIn">LinkedIn</title>
<path
fill="#FFF"
fill-rule="evenodd"
d="M5.551 3.304c-1.14 0-2.067.926-2.067 2.064 0 1.14.928 2.066 2.067 2.066a2.066 2.066 0 0 0 0-4.13ZM3.767 8.998v11.453h3.562L7.33 8.998H3.767Zm5.798 0V20.45l3.554.002.002-5.668c0-1.454.253-2.941 2.132-2.941 1.851 0 1.851 1.755 1.851 3.036v5.571l3.559-.001v-6.28c0-2.834-.517-5.457-4.27-5.457-1.763 0-2.916.997-3.368 1.85h-.05V8.997h-3.41ZM22.435 24H1.982c-.976 0-1.77-.777-1.77-1.732V1.731C.212.776 1.006 0 1.982 0h20.453c.98 0 1.777.776 1.777 1.73v20.538c0 .955-.797 1.732-1.777 1.732Z"
/>
</svg>
</a>
<a href="assets/Sam Blundell - Academic CV.pdf" class="header__social">
<svg
xmlns="http://www.w3.org/2000/svg"
width="40"
height="40"
viewBox="0 0 30 24"
aria-labelledby="socialCV"
role="img"
>
<title id="socialCV">CV</title>
<path
fill="#FFF"
fill-rule="evenodd"
d="M19.71,7.29l-5-5A1,1,0,0,0,14,2H5A1,1,0,0,0,4,3V21a1,1,0,0,0,1,1H19a1,1,0,0,0,1-1V8A1,1,0,0,0,19.71,7.29ZM19,21H5V3h9V7a1,1,0,0,0,1,1h4ZM8,6.75A1.64,1.64,0,0,1,9.5,5,1.64,1.64,0,0,1,11,6.75,1.64,1.64,0,0,1,9.5,8.5,1.64,1.64,0,0,1,8,6.75ZM6,12H18v1H6Zm0,2H18v1H6Zm0,2H18v1H6Zm0,2H16v1H6Zm7-7H6V10A5.6,5.6,0,0,1,9.5,9,5.6,5.6,0,0,1,13,10Z"
/>
</svg>
</a>
</nav>
</div>
</header>
<!--END OF HEADER -->
<main id="main">
<div class="wrapper">
<h1>The Midas Touch:<br>What is the on-court impact of salary within the NBA?</h1>
<p><img src="assets/mj_layup_crop.jpg" alt="Michael Jordan"><br>This project aims to determine whether NBA franchises allocate their salary caps effectively in order to achieve better playoff performance and win titles. The motivation for this question is illustrated by chart one, which shows that the NBA is an outlier in how many highly valuable stars it contains compared with other sports.</p>
<div class="figures"><div class="chart" id="Chart1"></div><br><br><a href="https://github.com/SLBlundell/slblundell.github.io/blob/main/nba_project/charts/chart_1/sports_contracts.csv" class="button" target="_blank">Data</a> <a href="https://github.com/SLBlundell/slblundell.github.io/blob/main/nba_project/charts/chart_1/scraper.ipynb" class="button" target="_blank">Scraper</a></div>
<p>Furthermore, chart two demonstrates that whilst many NBA players are amongst the world’s highest-paid athletes, the vast majority of players are paid far less. This raises the question: does this practice of significant spending on a few stars lead to better playoff performance? The third chart seems to suggest so, showing that even after adjusting for inflation and increased league revenue, teams have been spending more of their revenue on player salaries in recent years.
The answers to these questions both shed light on the efficiency of fiscal decision-making within the NBA and show how the composition of the NBA’s salary rules influences on-court outcomes.</p>
<div class="figures"><div class="chart" id="Chart7"></div><br><br><a href="https://github.com/SLBlundell/slblundell.github.io/blob/main/nba_project/data/chart_4_player_data.csv" class="button" target="_blank">Data</a> <a href="https://github.com/SLBlundell/slblundell.github.io/blob/main/nba_project/charts/chart_3/2023_scraper.ipynb" class="button" target="_blank">Scraper</a></div>
<div class="figures"><div class="chart" id="Chart2"></div><br><br><a href="https://github.com/SLBlundell/slblundell.github.io/blob/main/week4/team_active_payroll.csv" class="button" target="_blank">Data</a> <a href="https://github.com/SLBlundell/slblundell.github.io/blob/main/week4/payroll_scraper.ipynb" class="button" target="_blank">Scraper</a></div>
<h2>Data Used</h2>
<p>For the primary data sources, two scraper scripts were written in Python: one for scraping data from 2013 to 2021 and another for current-season data. In the 2013-21 scraper, a for loop gathers data across multiple years efficiently, combining basic and advanced statistics from Basketball-Reference with salary data from HoopsHype. This approach can be replicated to scrape additional years, back to the earliest published data.
The 2023 scraper is kept separate from the 2013-21 scraper so that its data can be updated periodically as new games are played this season. The updated data feeds through to the linear regression models and provides up-to-date playoff predictions.
</p>
<h2>Challenges in cleaning data</h2>
<p>The most significant cleaning challenge was merging data from Basketball-Reference and HoopsHype. The two sites used differing team names, with one site using shortened forms. To overcome this, I defined a list of full team names and replaced each shortened team name with the full name it most resembled, based on the string similarity between the shortened name and the entries in the full team name list. For this, I used the SequenceMatcher class from Python’s difflib module (I found through trial and error that a similarity threshold of 53% ensured correct matching of names without causing incorrect matches).
Secondly, player naming conventions differed between datasets, causing hundreds of rows to be dropped when merging player and player salary data. To combat this, I created two lists of player name strings that appeared in one dataset but not the other (player_diff_1 and player_diff_2). Through browsing these lists, I found common differences in naming convention, which I corrected using if statements. For the remaining misnamed players, I was forced to correct the names manually using the .replace() method. Because these corrections are written into the script, automation is maintained when data from new games is scraped.
There was also an issue with accented characters in player names causing errors, which was corrected by applying a unidecode function to all rows in the player name column.
</p>
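<p>The team-name matching described above can be sketched roughly as follows. This is a minimal illustration rather than the project’s actual code: the sample list of names, the function name <code>best_full_name</code> and the lower-casing step are assumptions made for the example; only <code>SequenceMatcher</code> and the 53% threshold come from the description.</p>
<pre>
```python
from difflib import SequenceMatcher

# Hypothetical sample of full team names; the project's real list is not shown here.
FULL_TEAM_NAMES = ["Golden State Warriors", "Los Angeles Lakers", "Boston Celtics"]

# Similarity cut-off reported in the text (53%).
THRESHOLD = 0.53


def best_full_name(short_name, full_names=FULL_TEAM_NAMES, threshold=THRESHOLD):
    """Return the full name most similar to short_name, or None if no
    candidate reaches the threshold."""
    # Score every candidate with SequenceMatcher's similarity ratio (0.0 to 1.0).
    scored = [
        (SequenceMatcher(None, short_name.lower(), full.lower()).ratio(), full)
        for full in full_names
    ]
    # Keep the highest-scoring candidate, then apply the cut-off.
    score, match = max(scored)
    return match if score >= threshold else None


print(best_full_name("Celtics"))
print(best_full_name("Golden State"))
```
</pre>
<p>Returning <code>None</code> below the threshold, instead of the nearest candidate, leaves unmatched names visible so they can be checked by hand.</p>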
<h2>Conclusions</h2>
<p>Considering chart four, we can see that out of all measurable game statistics, overall shooting efficiency ratings (such as TS% and eFG%) and margin of victory indicators (SRS, MOV) are the best indicators of a team’s success in the playoffs (P_W). However, team payrolls (Pay) are only somewhat correlated with efficiency ratings, and almost entirely uncorrelated with both margin of victory indicators and playoff performance. In short, teams within the modern NBA that spend more do not win more.</p>
<div class="figures"><div class="chart" id="Chart3"></div><br><br><a href="https://github.com/SLBlundell/slblundell.github.io/blob/main/nba_project/data/correlations_playoff.csv" class="button" target="_blank">Data</a> <a href="https://github.com/SLBlundell/slblundell.github.io/blob/main/nba_project/charts/chart_3/2013_to_2021_scraper.ipynb" class="button" target="_blank">Scraper</a> <a href="https://github.com/SLBlundell/slblundell.github.io/blob/main/nba_project/charts/chart_3/model.ipynb" class="button" target="_blank">Model</a></div>
<p>Using a linear regression model, we can see that this season there seems to be a much stronger relationship between payroll and predicted playoff performance, with a correlation of .53 (compared to a correlation of .13 historically). I hypothesise that this change is a result of teams with spare salary-cap space ‘tanking’ (i.e. losing on purpose) in order to achieve better odds in the draft lottery for generational prospect Victor Wembanyama – this type of opportunity has not occurred within our dataset before.</p>
<div class="figures"><div class="chart" id="Chart4"></div><br><br><a href="https://github.com/SLBlundell/slblundell.github.io/blob/main/nba_project/data/team_per_game_2023.csv" class="button" target="_blank">Data</a> <a href="https://github.com/SLBlundell/slblundell.github.io/blob/main/nba_project/charts/chart_3/2013_to_2021_scraper.ipynb" class="button" target="_blank">Scraper</a> <a href="https://github.com/SLBlundell/slblundell.github.io/blob/main/nba_project/charts/chart_3/model.ipynb" class="button" target="_blank">Model</a></div>
<p>Chart six breaks down player-level relationships this season. Here we can see that player salary (pay) is far more correlated with highly observable characteristics, such as points and free throws, than with the factors that chart four showed to be correlated with playoff performance. This implies that ‘flashier’ players out-earn players who actually generate winning potential.</p>
<div class="figures"><div class="chart" id="Chart6"></div><br><br><a href="https://github.com/SLBlundell/slblundell.github.io/blob/main/nba_project/data/chart_4_correlations.csv" class="button" target="_blank">Data</a> <a href="https://github.com/SLBlundell/slblundell.github.io/blob/main/nba_project/charts/chart_3/2023_scraper.ipynb" class="button" target="_blank">Scraper</a> <a href="https://github.com/SLBlundell/slblundell.github.io/blob/main/nba_project/charts/chart_4/model_players.ipynb" class="button" target="_blank">Model</a></div>
<p>Chart seven shows us that, on a player level, higher-paid players this season typically underperform relative to their salary compared with their lower-paid peers...</p>
<div class="figures"><div class="chart" id="Chart5"></div><br><br><a href="https://github.com/SLBlundell/slblundell.github.io/blob/main/nba_project/data/player_per_game_salary_2023.csv" class="button" target="_blank">Data</a> <a href="https://github.com/SLBlundell/slblundell.github.io/blob/main/nba_project/charts/chart_3/2023_scraper.ipynb" class="button" target="_blank">Scraper</a></div>
<p>In combination, the charts tell a story that franchises pay for highly observable characteristics in their stars over efficient players, which leads to little relationship between spending and wins. The exception to this rule this season may be a result of teams losing on purpose in the hope of drafting a guaranteed future star. Such a hypothesis would need to be tested by examining past seasons with similar draft prospects, such as LeBron James.
</p>
<h2>Word Count: 799</h2>
</div>
</main>
<footer class="footer">
</footer>
<script>
// Embed charts:
vegaEmbed('#Chart1', myChart1);
vegaEmbed('#Chart2', myChart2);
vegaEmbed('#Chart3', myChart3);
vegaEmbed('#Chart4', myChart4);
vegaEmbed('#Chart5', myChart5);
vegaEmbed('#Chart6', myChart6);
vegaEmbed('#Chart7', myChart7);
</script>
</body>
</html>