DatsyukToZetterberg
Hello everyone, today I am releasing an updated version of the Hockey Marcels I posted 2 years ago. The Hockey Marcels are based on the baseball version created by Tangotiger, which was designed to be the simplest, most basic forecasting system possible. It creates a weighted average of a player's past seasons, with the most recent seasons weighted most heavily, and then regresses that average towards the league average. The Marcels turned out to be very accurate for such a simple system, so much so that they can act as a barometer for determining how effective more complex systems are.
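For anyone who hasn't seen a Marcel before, here is a minimal Python sketch of the weighted-average-then-regress idea. It is not a port of the R code behind these projections: the function name, the 5/4/3 weights, the league rate, and the regression amount are all placeholders chosen for illustration, and regressing by adding a block of league-average games is just one common way to do it.

```python
def marcel_rate(seasons, weights, league_rate, regression_games):
    """Project a per-game rate from past seasons.

    seasons: list of (stat_total, games_played) tuples, most recent first.
    weights: one weight per season, most recent first (placeholder values).
    league_rate: league-average rate per game for the same stat.
    regression_games: how many league-average games to blend in
        (hypothetical knob; larger means more regression to the mean).
    """
    weighted_stat = sum(w * stat for w, (stat, _) in zip(weights, seasons))
    weighted_games = sum(w * games for w, (_, games) in zip(weights, seasons))
    # Adding league-average games pulls the weighted rate toward the league rate.
    return (weighted_stat + regression_games * league_rate) / (
        weighted_games + regression_games
    )


# Made-up player: goals and games for the last three seasons, newest first.
seasons = [(30, 82), (24, 80), (20, 75)]
weights = (5, 4, 3)          # assumption, not the Hockey Marcels weights
league_rate = 0.20           # assumption: league-average goals per game
regression_games = 100       # assumption

raw_rate = sum(w * s for w, (s, _) in zip(weights, seasons)) / sum(
    w * g for w, (_, g) in zip(weights, seasons)
)
projected_rate = marcel_rate(seasons, weights, league_rate, regression_games)
print(f"weighted rate: {raw_rate:.3f}, regressed rate: {projected_rate:.3f}")
```

The regressed rate always lands between the player's weighted rate and the league rate, and players with fewer games get pulled further toward the league average.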
I’ve started a blog for my hockey projects so you can get a longer breakdown of the changes made to the Marcels there. You can check out the post here. In this post I’ll just provide a quick breakdown of the major changes and the model performance.
The original version of my “Hockey Marcels” was pretty basic and lacked some key features, most notably games played (GP) projections. The older model was created under the assumption that all players would play a full season, but we all know that is not the case. The updated version of the Marcels now has GP forecasts, and this allows for more accurate forecasts overall. Each player is sorted into 1 of 16 potential bins, depending on how many games they've played over the past 2 seasons and what their projected time on ice (TOI) is. These bins share a similar formula structure; what changes is the amount of regression to the mean that occurs or what number anchors the regression to the mean.
While the method to produce these GP forecasts is a little more involved than the typical Marcel projection, I believe it’s still within the spirit of the system. There are no advanced statistical methods used and the GP forecasts still involve a regression towards the mean component.
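To make the binning idea a bit more concrete, here is a rough Python sketch of "same formula, different regression amount and anchor per bin". Everything specific in it is hypothetical: the four bins stand in for the real 16, and the cut-offs, regression amounts, and anchors are invented for illustration rather than taken from the model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BinParams:
    regression: float  # share of the forecast pulled toward the anchor (0 to 1)
    anchor: float      # GP value the forecast is regressed toward


# Hypothetical parameters: four made-up bins standing in for the real 16.
BINS = {
    ("regular", "high_toi"): BinParams(regression=0.2, anchor=75.0),
    ("regular", "low_toi"): BinParams(regression=0.3, anchor=65.0),
    ("part_time", "high_toi"): BinParams(regression=0.5, anchor=55.0),
    ("part_time", "low_toi"): BinParams(regression=0.6, anchor=40.0),
}


def assign_bin(gp_last_two, projected_toi):
    """Pick a bin from games played over the past 2 seasons and projected TOI.

    The 120-game and 15-minute cut-offs are assumptions for this sketch.
    """
    games_key = "regular" if gp_last_two >= 120 else "part_time"
    toi_key = "high_toi" if projected_toi >= 15.0 else "low_toi"
    return games_key, toi_key


def project_gp(weighted_gp, gp_last_two, projected_toi):
    """Blend a player's weighted GP with the anchor of the bin they fall in."""
    params = BINS[assign_bin(gp_last_two, projected_toi)]
    return (1 - params.regression) * weighted_gp + params.regression * params.anchor


# An everyday player keeps most of their own history...
print(project_gp(weighted_gp=80, gp_last_two=160, projected_toi=18.0))
# ...while a part-timer is pulled harder toward a lower anchor.
print(project_gp(weighted_gp=30, gp_last_two=50, projected_toi=10.0))
```

The point of the structure is that the formula itself never changes; only the two numbers looked up for each bin do.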
The other major change is how TOI is projected. The previous version had just two methods for projecting TOI, one for forwards and one for defencemen. The new method once again involves binning players, this time based on which seasons they've played in and what position they play. It also uses less regression to the mean than the previous version did. NHL players see very little change in their TOI from year to year, so you can base almost all of the projected TOI on the previous year’s number. The new binned projections work better for all players, but are much better for those who only have 1 year of data.
To evaluate the model, I’ve decided to use the Mean Absolute Error (MAE) and the Coefficient of Determination (R²). Below are the MAE and R² for each 82-game season since 2011.
The MAE for most categories falls within a relatively stable range. There are some outliers, such as points or PIM, but this is due to how the NHL game has changed over time.
The R² values were also in a relatively stable range, the exceptions being GP and PIM. This is not unexpected, as these are the two most volatile stats in the data set. I should also note that while the GP R² number may look underwhelming, in my previous research an MLR approach produced an R² of about 0.05.
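If you want to run the same kind of check on your own projections, both metrics are only a few lines of Python. This is a sketch with made-up numbers, and it assumes R² is computed as 1 minus the ratio of residual to total sum of squares; the squared correlation between projected and actual values is another common definition and can give a different number.

```python
def mean_absolute_error(actual, projected):
    """Average size of the miss, in the same units as the stat."""
    return sum(abs(a - p) for a, p in zip(actual, projected)) / len(actual)


def r_squared(actual, projected):
    """Coefficient of determination as 1 - SS_res / SS_tot (assumed definition)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, projected))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Made-up point totals for four players, not real results.
actual = [60, 45, 30, 15]
projected = [55, 50, 25, 20]

print(f"MAE: {mean_absolute_error(actual, projected):.2f}")
print(f"R^2: {r_squared(actual, projected):.3f}")
```

MAE tells you how far off a typical projection is, while R² tells you how much of the spread between players the projections explain, which is why a volatile stat like GP can have a reasonable MAE and still a low R².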
Overall, I believe that the newer version of the “Hockey Marcels” does an excellent job as a base-level model. It’s simple, efficient, and straightforward to create. A more tuned model should be able to beat it; if it can’t, that indicates some tweaks are needed.
Lastly, if you’d like to see the Marcel projections for any of the 2011-2023 seasons, check out this Google Doc. If you want to edit the sheet, just click “Make a copy” under the File menu.
As well, I’ve put the code I used to create these projections here on my GitHub. The program is built using R, so you will need R and an IDE, such as RStudio, to use it. To produce a projection, you need to load the libraries and the function "proj_season", then enter the year you wish to project. I did my best to comment the code, but I’ve never shared a project like this before, so I hope you can follow the logic behind some of my decisions.
If something isn’t clear, don’t hesitate to ask, and I’ll do my best to explain it. If you want to tinker with the weights or create your own forecasting model, feel free to use any of the code used in the Marcels as the base.