Project 1 : NHL Goals and Assists
- hlfwatson
- Feb 8
- 6 min read
Updated: Feb 13
Given the opportunity to explore my interests using Data Science and Analytics, I looked for datasets that detail National Hockey League (NHL) athlete performance over a set time frame. My partner got me interested in watching hockey shortly after we began dating, and I became intrigued by the immense amount of data gathered on each player every night. Others have as well; several hockey data-centric accounts have amassed followings on social media: @MoneyPuckdotcom (124k), @MBOnHockey (46.4k), @BigHeadHcky (35.4k), @DataDrivenHockey (26.9k). In the data field, there was a similar appreciation for how readily available NHL player data is to model, along with examples of others’ individual projects. To put my own spin on the project, I combined some popular statistic categories to make my work somewhat unique and comprehensible to non-hockey fans.
A consistent aspect of play is scoring, and individuals can contribute through goals or assists. Since hockey is a team-oriented sport, athletes talented enough to reach the NHL are used to passing the puck to create chances. Goals result from precise passing and rebound plays in front of the net to beat the opposing goalie. While scoring a goal involves one athlete beating the goalie, up to two assists (primary and secondary) can be credited on each goal. For that reason, it is more common to see athletes record more assists than goals during a season. There are exceptions, and I will visualize the differential between assists and goals for certain athletes over their careers. By highlighting Stanley Cup-winning seasons, we can also consider how a championship affects athletes' performance in the following seasons.
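As a rough illustration of the differential described above, here is a minimal sketch that subtracts goals from assists for two stat lines quoted later in this post. The function name and the dictionary layout are my own assumptions for the example, not part of the dataset.

```python
def assist_goal_differential(goals, assists):
    """Positive values mean more assists than goals; negative means more goals."""
    return assists - goals


# (goals, assists) pairs taken from the stat lines quoted in this post.
stat_lines = {
    "Victor Hedman, first season": (4, 16),
    "Brayden Point, 56-game season": (23, 25),
}

differentials = {
    label: assist_goal_differential(goals, assists)
    for label, (goals, assists) in stat_lines.items()
}

for label, diff in differentials.items():
    print(f"{label}: {diff:+d}")
# Victor Hedman, first season: +12
# Brayden Point, 56-game season: +2
```

Plotting this value season by season is one way to see whether a player leans toward playmaking or finishing.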
Currently, the oldest players in the NHL are Marc-Andre Fleury (MIN), Ryan Suter (DAL), and Brent Burns (CAR). Fleury and Burns began their NHL careers at the start of the 2003-04 season, so that is where my range begins. The one exception is Jaromir Jagr, who began his career prior to the 2003 season but played in the league until 2017-18. I used the web scraping tool that Patrick Bacon of TopDownHockey developed to gather data between 2003 and the current season, ending up with a couple thousand data points. The TopDownHockey EliteProspects scraper I used takes two arguments, season and league, each given as a string or a series of strings. For my visualizations, I will be looking at NHL data from 2003-04, 2004-05, 2005-06, 2006-07, 2007-08, 2008-09, 2009-10, 2010-11, 2011-12, 2012-13, 2013-14, 2014-15, 2015-16, 2016-17, 2017-18, 2018-19, 2019-20, 2020-21, 2021-22, 2022-23, 2023-24, and 2024-25.
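Rather than typing out every season by hand, the list can be generated. The sketch below only builds the two inputs; it does not call the scraper itself. The helper name is hypothetical, the labels follow the format used in this post (the scraper may expect a different season format), and the "nhl" league string is an assumption.

```python
def season_labels(first_start_year, last_start_year):
    """Build labels such as '2003-04' for every season in the range, inclusive."""
    labels = []
    for year in range(first_start_year, last_start_year + 1):
        # Two-digit suffix for the year the season ends in.
        labels.append(f"{year}-{(year + 1) % 100:02d}")
    return labels


seasons = season_labels(2003, 2024)
league = "nhl"  # assumed league identifier

print(seasons[0], seasons[-1], len(seasons))
# 2003-04 2024-25 22
```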
Without a clear direction for a research question yet, I began by cleaning out all the unknown values and spacing discrepancies so the data points were uniform. I replaced the single ‘position’ column produced by the scraper with a new column that allows two position categories, since coaches deploy some players at more than one position; this kept my new column accurate. The position listed on a player's profile is not guaranteed to be accurate (e.g., it may be unchanged since the draft year), so the second option serves as a backup. I replaced the blank (-) values with zeroes because a missing value indicates a player was injured and did not log ice time (nor change their plus-minus). Similarly, I used the strip function on strings in the playername column to ensure uniformity when locating by name. Many strings in the playername column failed to match in the .loc function because letters with diacritics (accent marks) are not equal to the plain letters typed on a keyboard. I fixed this by applying the unidecode package to the playername column. Nearing the end of cleaning, I dropped the league column because all the data points are from the NHL. To make my dataset resemble ESPN and NHL layouts, I rearranged the columns so vital information is readily available at the front. With this last step, I felt good about my cleaned dataset.
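The cleaning steps above can be sketched on a single row. This is not my actual notebook code: it uses plain dictionaries instead of a DataFrame, and it swaps in the standard library's unicodedata module as a stand-in for unidecode so the example runs without extra installs. The stat column names (gp, g, a, pm) and the sample values are hypothetical; only playername and league come from the description above.

```python
import unicodedata


def to_ascii(name):
    """Strip surrounding whitespace and drop accent marks from a name."""
    decomposed = unicodedata.normalize("NFKD", name.strip())
    # After NFKD, accents are separate combining characters that can be skipped.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def clean_row(row, stat_columns):
    """Return a cleaned copy of one scraped row (a dict of strings)."""
    cleaned = dict(row)
    cleaned["playername"] = to_ascii(cleaned["playername"])
    for col in stat_columns:
        value = str(cleaned[col]).strip()
        # A lone dash marks a missing stat; treat it as zero.
        cleaned[col] = 0 if value in ("-", "") else int(value)
    cleaned.pop("league", None)  # every row is NHL, so the column adds nothing
    return cleaned


# Hypothetical raw row; the values are made up for illustration.
raw = {"playername": "  Jaromír Jágr ", "league": "NHL",
       "gp": "-", "g": "-", "a": "-", "pm": "-"}
cleaned = clean_row(raw, ["gp", "g", "a", "pm"])
print(cleaned["playername"])
# Jaromir Jagr
```

One caveat with this stand-in: unicodedata only removes accents that decompose into a base letter plus a mark, so letters such as "ø" pass through unchanged, whereas unidecode transliterates them.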

The first visualization example is Steven Stamkos, drafted first overall by the Lightning in 2008. I wanted to explore this team because athletes who grow with a team often reflect their front office's organizational performance. In Stamkos' second season, the Bolts began building a strong defensive core with the first-round addition of defenseman Victor Hedman. A defenseman can increase the ratio of assists to goals, as defensemen are not known for precision shots (with exceptions) but rather for protecting possession of the puck at both ends of the ice. Stamkos' performance increased after building team chemistry and adding Hedman, resulting in more goals than assists in his second season. It should be noted that Hedman's first season as a defenseman was substantial, as he contributed four goals and sixteen assists. These effects could be attributed to one another's growth in the league, as Hedman's defense allowed Stamkos more offensive freedom.
I would hypothesize that the center position, played by Stamkos, is more likely to create scoring drives, as centers distribute the puck from the middle of the ice outward. Wingers are more likely to be snipers, whose shots reach the net front to be either gloved by the goalie or batted out for a rebound.

Victor Hedman, as mentioned above, was the second overall pick in the 2009 NHL Entry Draft. As an athlete, Hedman excelled defensively, moving the puck quickly out of the defensive zone and turning it into a scoring chance. This can be observed in the continual increase in his assists throughout his first seasons with the team. Positioning his large frame in front of the crease, Hedman could aid the offense by screening the goalie's line of sight to set up a goal. Sharp-shooting wingers and centers remain essential because you will never win a hockey game if you can't score a goal. Hedman's tendency to join the rush created additional offensive pressure by keeping the puck in the opponent's zone and creating more scoring opportunities in high-danger areas, especially when there had not been a line change.
Over time, Stamkos and Hedman developed strong on-ice chemistry built around each athlete's strengths. Stamkos consistently ranked among the NHL's top 50 scorers between 2010 and 2015, including a 60-goal season in 2011-12. Tampa's decision to put offensive trust in Stamkos and defensive coordination in Hedman paid off for multiple seasons, as both had record-setting seasons. Stamkos' sixty goals in 2011-12 remain the most in a season by a Lightning athlete.

Solidifying their potential, the partnership between Victor Hedman and Steven Stamkos during the early 2010s was vital to the team's future success. The athletes themselves also helped develop a team culture that prioritized skill, resilience, and chemistry - qualities that were on display in the 2020 and 2021 Stanley Cup teams. The emergence of Nikita Kucherov as a dynamic athlete who could create scoring drives added another dimension to the Lightning core. Kucherov's ability to read plays and execute precision passes adds to Stamkos' chances and gives Hedman the leeway not to guard the blue line on every shift. The 'core' players of Tampa Bay would drive the team to new heights in the coming seasons.
The 2019-20 season was not the turning point we might expect: the league suspended play in March, so teams played roughly 11 to 14 fewer games than the full 82-game regular season. Turning toward Stanley Cup-winning seasons, Tampa Bay's first Cup with this core came during this season, when the reduced number of regular-season games played a role in athletes' total goals and assists. Each group of playoff teams stayed in a 'Bubble' that included the hotels, restaurants, and iceplex surrounding the host city's arena. Without a consistent schedule, teams had over four months off before resuming play, which likely affected their conditioning, chemistry, and season momentum. Tampa Bay proved their resilience to these changes by eliminating the Presidents' Trophy-winning Boston Bruins in the second round and then winning the Eastern Conference Final to advance to the Stanley Cup Final. The team's ability to adapt and rely on cohesive playmaking became their greatest asset.
The 2020 Stanley Cup was a testament to the solid core of players and the consistent playmaking that Tampa Bay's coaches had emphasized all year. After lifting the Cup in celebration, the team quickly carried their momentum and trophy back to the state of Florida. You may notice that the players presented here improved their performance statistics after winning the Cup. The gap between the Final and the next preseason and regular season was the shortest on record, which could have played a role in their continued success. The defending Stanley Cup champions did not change much of their roster ahead of the 2020-21 season, apart from Nikita Kucherov landing on IR with a hip injury. The season was shortened to a 56-game schedule with realigned divisions to minimize travel. The reigning champions were expected to lead the Central Division against the Carolina Hurricanes, Dallas Stars, and Florida Panthers. Despite few adjustments to their Cup-winning team, Tampa Bay finished third in the Central with a record of 36-17-3.
Several factors contributed to the Lightning's successful Cup run that season. Their offensive depth was resupplied by the acceleration of Brayden Point's pace (23 G, 25 A, 56 GP) compared with other seasons. Their defense was anchored by alternate captains Victor Hedman and Ryan McDonagh, giving the team a well-structured defensive group. Combining young playmakers and veteran leadership, Tampa Bay capitalized on a shortened preseason and regular season to carry their momentum into another playoff run.
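Because shortened seasons lower raw totals, one way to compare them with full seasons is to scale each total to an 82-game pace. The sketch below is only an illustration of that idea, not a step from my analysis; the function name is hypothetical, and the input is Point's stat line quoted above.

```python
def pace_over_full_season(total, games_played, full_season_games=82):
    """Scale a season total to what it would be at the same rate over a full season."""
    if games_played <= 0:
        raise ValueError("games_played must be positive")
    return total / games_played * full_season_games


# Brayden Point's line from the 56-game season: 23 G, 25 A, 56 GP.
goal_pace = pace_over_full_season(23, 56)
assist_pace = pace_over_full_season(25, 56)

print(f"{goal_pace:.1f} goals, {assist_pace:.1f} assists per 82 games")
# 33.7 goals, 36.6 assists per 82 games
```

A straight-line projection like this ignores injuries and scheduling effects, so it is best read as a rough comparison rather than a prediction.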

@MoneyPuckdotcom : Instagram 67.8k followers, X 56.2k followers
@MBonHockey : Instagram 38.3k followers, X 8.1k followers
@BigHeadHcky : X 35.4k followers
@DataDrivenHockey : Combined: Instagram 29.7k followers, X 225 followers
@TopDownHockey on GitHub : Made possible by Marcus Sjölin and Harry Shomer.