Melvin
21/12/05
I am back and playing with hockey data again. I've been working on some stuff for the upcoming season and it doesn't quite fit anywhere in existing threads so here is a thread to bounce ideas off each other about doing some projections of the upcoming season.
Obviously, none of us know what the season is even going to look like, how many games will be played, if there's division re-alignment, and so on. I am ignoring all of that for now.
I am sticking my fingers in my ears and assuming an 82 game season with the normal divisions. LA LA LA LA LAAAA.
What I am doing is pretty rudimentary which makes it easier to discuss. Please view it as a work in progress!
Here is what I am starting with:
1) Take all of the rosters as of this weekend from CapFriendly.com. Obviously rosters are still in flux and I will update these as they go along. I will probably update them weekly unless something major happens, but for example I doubt Jayce Hawryluk is going to move the needle all that much so he will go in with the weekly update (probably Saturday.) If the Canucks trade for McDavid or something I'll manually update it right away.
2) I had to do my best to project usage. This meant basically assigning for each player whether he will get 1st line ice time, 4th line ice time, 1st unit PP time, etc. And for goalies it's starter or backup (although some teams are 50/50.) This was very subjective although I did my best to try to have it make sense. I already know that I screwed a few of them up though so will be going back to re-assign a few players (mostly guys who kill a lot of penalties who I don't have on a PK unit.)
3) For skaters, I am using the xGF/60 and xGA/60 numbers from evolving-hockey.com. I have taken each of these metrics for each player over the last 3 years to project the same for next season, using a TOI-weighted average at 3-2-1 (so last season is weighted 3x the oldest of the three.) I might have to bump that up to account for the shortened season, not sure. I am then applying an age adjustment based on this research, and then regressing towards the mean. I have split the age factor in half to apply a + to the GF and a - to the GA. This is probably dumb.
I know there's an issue with using this metric, which is that it's tightly coupled with teammates. I decided I am mostly OK with this, because most players have basically the same teammates. Like, Boeser's numbers might be inflated due to Pettersson... but he'll be playing with Pettersson again next year so I don't think it's a big problem? Does anyone have any thoughts on this? And for the players switching teams, yes Nate Schmidt will probably suffer a bit due to playing for a worse team, but I'm not sure if it's a big enough problem to worry about.
Anyway, once I have this I can project team xGF/60 and xGA/60 using the projected figures for the individual players and estimated usage of said players. Oh yes, I forgot to mention that I am doing this separately for ES/PP/PK (for now just assuming 48 min ES, 6 PP and 6 PK.) See TO DO (a).
4) After I have xGF/xGA for each team, I can do the goaltending. I explained this in another post but basically, I am then applying the goaltending by projecting the Goals Saved/60 of the two goalies and estimating the usage. This number is then added/subtracted from the GA column. So each team has a GF that assumes facing average goaltending and then a GA that accounts for the projected goaltending of the team. I hope that makes sense. I explained in more detail here.
5) I'm not factoring in injuries at all in the sense of trying to project playing time, but players who are currently significantly injured are not included. So Boston is missing Pastrnak and Marchand, which is pretty amazing because they still project really well...
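For anyone who wants to poke at step 3, here is a rough Python sketch of how I read it: a TOI-weighted 3-2-1 average, the age factor split half onto GF and half off GA, then regression towards the league mean, followed by a minutes-weighted roll-up to a team rate. The names (`Season`, `project_player`, `team_rate`) and the `prior_toi` regression constant are made up for illustration, and the actual regression method in the spreadsheet may well be different.

```python
from dataclasses import dataclass

SEASON_WEIGHTS = (3, 2, 1)  # most recent season first, per the 3-2-1 scheme


@dataclass
class Season:
    toi: float    # minutes played that season
    xgf60: float  # on-ice expected goals for per 60
    xga60: float  # on-ice expected goals against per 60


def weighted_rate(seasons, attr):
    """TOI-weighted 3-2-1 average of one rate stat (seasons listed newest first)."""
    num = sum(w * s.toi * getattr(s, attr) for w, s in zip(SEASON_WEIGHTS, seasons))
    den = sum(w * s.toi for w, s in zip(SEASON_WEIGHTS, seasons))
    if den == 0:
        raise ValueError("player has no ice time in the window")
    return num / den


def regress(rate, league_mean, toi, prior_toi=1000.0):
    """Shrink a rate towards the league mean; prior_toi is a hypothetical constant."""
    return (rate * toi + league_mean * prior_toi) / (toi + prior_toi)


def project_player(seasons, age_factor, league_xgf60, league_xga60):
    """Return projected (xGF/60, xGA/60) for one skater."""
    toi = sum(s.toi for s in seasons)
    # Half of the age factor is added to GF, the other half is taken off GA.
    xgf = weighted_rate(seasons, "xgf60") + age_factor / 2
    xga = weighted_rate(seasons, "xga60") - age_factor / 2
    return regress(xgf, league_xgf60, toi), regress(xga, league_xga60, toi)


def team_rate(players):
    """Minutes-weighted mean of player rates; players = [(minutes_per_game, rate), ...]."""
    total_minutes = sum(m for m, _ in players)
    return sum(m * r for m, r in players) / total_minutes
```

The minutes-weighted mean is used for the team roll-up because, looking backwards, the minutes-weighted average of on-ice rates over all skaters reproduces the team rate. The per-game figure is then just the ES, PP and PK rates scaled by the assumed minutes in each state.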
~~TO DO~~
a) I want to factor in penalties. My idea here is that it should be easy to project how often each player on the team takes or draws penalties and then I can estimate PP opportunities/PK opportunities. I have not done this yet, so each team is basically assuming the average of 3 mins/game on PP and PK.
b) Obviously, what I would like to do next is plug in the NHL schedule and run some simulations, but that won't be possible until we see the schedule and LA LAA LA ALAL.
c) Any other ideas?
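For (a), a first pass could look something like the sketch below. It assumes each player has a penalties-drawn and penalties-taken rate per 60 plus projected minutes per game; the field names (`toi`, `drawn60`, `taken60`) and the flat `minutes_per_penalty` are placeholders, not anything from the current spreadsheet, and it ignores the opponent's tendencies entirely.

```python
def penalties_per_game(players, key):
    """Sum per-game penalties over a roster.

    players: list of dicts with 'toi' (minutes per game) and a per-60 rate under `key`.
    """
    return sum(p[key] * p["toi"] / 60 for p in players)


def special_teams_minutes(players, minutes_per_penalty=2.0):
    """Return (pp_minutes, pk_minutes) per game for one team.

    minutes_per_penalty=2.0 treats every call as a full minor, which overstates
    the time because power plays end early when a goal is scored.
    """
    drawn = penalties_per_game(players, "drawn60")
    taken = penalties_per_game(players, "taken60")
    return drawn * minutes_per_penalty, taken * minutes_per_penalty


# Two made-up players, just to show the shape of the input.
roster = [
    {"toi": 18, "drawn60": 1.0, "taken60": 0.5},
    {"toi": 12, "drawn60": 0.5, "taken60": 1.0},
]
pp_minutes, pk_minutes = special_teams_minutes(roster)
```

The team totals would then replace the flat PP/PK minutes assumption, with ES time being whatever is left of the 60 minutes.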
OK, so here is what I have right now, with all of the above caveats that rosters are still changing, season still in flux, and I have some errors in my usage patterns like whoops I can't believe I forgot to put Colton Sceviour on the PK for the Penguins what was I thinking!!!
Since I can't run any sims without the schedule the best I can do is rank teams by their goal differential.
TEAM | TOT_xGF | TOT_xGA | GSAA | TOT_GA | Diff |
DAL | 192 | 180 | 22 | 158 | 34 |
VGK | 210 | 184 | 6 | 177 | 33 |
CAR | 220 | 192 | 1 | 191 | 29 |
BOS | 185 | 172 | 8 | 164 | 21 |
N.J | 194 | 186 | 12 | 174 | 20 |
T.B | 196 | 179 | 2 | 177 | 19 |
PIT | 198 | 184 | 3 | 181 | 17 |
CGY | 204 | 191 | 3 | 188 | 16 |
STL | 191 | 184 | 8 | 176 | 15 |
ARI | 181 | 187 | 20 | 167 | 14 |
COL | 190 | 186 | 10 | 176 | 14 |
TOR | 208 | 193 | -4 | 197 | 11 |
NSH | 189 | 186 | 2 | 183 | 6 |
ANA | 183 | 192 | 13 | 179 | 4 |
FLA | 178 | 186 | 10 | 176 | 2 |
EDM | 195 | 193 | 0 | 193 | 2 |
PHI | 193 | 190 | -1 | 192 | 1 |
CBJ | 181 | 175 | -5 | 180 | 1 |
WSH | 192 | 193 | 0 | 193 | -1 |
MTL | 199 | 191 | -10 | 202 | -3 |
NYR | 188 | 197 | 5 | 192 | -4 |
WPG | 189 | 200 | 6 | 194 | -5 |
NYI | 189 | 194 | 0 | 194 | -5 |
MIN | 178 | 173 | -16 | 190 | -12 |
L.A | 180 | 187 | -6 | 193 | -13 |
OTT | 182 | 184 | -12 | 197 | -15 |
S.J | 190 | 188 | -22 | 210 | -20 |
BUF | 179 | 188 | -12 | 200 | -21 |
CHI | 184 | 198 | -8 | 206 | -22 |
VAN | 182 | 195 | -10 | 205 | -23 |
DET | 172 | 204 | 3 | 201 | -29 |
The way to read the above is, for example Dallas have an expected goals-for of 192. They have an expected goals-allowed of 180, but then their goaltending is expected to save them 22 goals compared to expected, so the actual projected goals allowed is 158. Thus, their differential is 192-158 = 34.
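The same arithmetic as a tiny Python helper, in case anyone wants to recompute rows with their own goalie numbers. The function name is just for illustration. A few rows in the table (Vegas, for example) come out one goal off from this, presumably because the posted figures are rounded.

```python
def apply_goaltending(xgf, xga, gsaa):
    """Return (projected goals allowed, goal differential) for one team."""
    tot_ga = xga - gsaa  # goals saved above average come off the expected total
    return tot_ga, xgf - tot_ga


dal = apply_goaltending(192, 180, 22)   # Dallas row from the table
van = apply_goaltending(182, 195, -10)  # Vancouver row: negative GSAA adds goals
```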
OK, so the first thing that jumps out is the generally low numbers across the board. I can explain part of this, which is just that I am projecting 82 regulation games so no OT or SO. I also wouldn't really have any empty-netters, but I don't think that fully explains it. I need to look into this but I'm guessing it's something systemic that affects all teams equally, as I doubt it's something that would greatly bias for or against particular teams. But not sure.
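One sanity check that might help track down anything systemic: in a regulation-only projection every goal for is someone else's goal against, so the league totals of TOT_xGF and TOT_GA should match. Here is a quick sketch using the numbers copied from the table above (variable names are mine).

```python
# (team, TOT_xGF, TOT_GA) copied from the projection table
ROWS = [
    ("DAL", 192, 158), ("VGK", 210, 177), ("CAR", 220, 191), ("BOS", 185, 164),
    ("N.J", 194, 174), ("T.B", 196, 177), ("PIT", 198, 181), ("CGY", 204, 188),
    ("STL", 191, 176), ("ARI", 181, 167), ("COL", 190, 176), ("TOR", 208, 197),
    ("NSH", 189, 183), ("ANA", 183, 179), ("FLA", 178, 176), ("EDM", 195, 193),
    ("PHI", 193, 192), ("CBJ", 181, 180), ("WSH", 192, 193), ("MTL", 199, 202),
    ("NYR", 188, 192), ("WPG", 189, 194), ("NYI", 189, 194), ("MIN", 178, 190),
    ("L.A", 180, 193), ("OTT", 182, 197), ("S.J", 190, 210), ("BUF", 179, 200),
    ("CHI", 184, 206), ("VAN", 182, 205), ("DET", 172, 201),
]

league_gf = sum(gf for _, gf, _ in ROWS)
league_ga = sum(ga for _, _, ga in ROWS)
imbalance = league_gf - league_ga  # would be 0 if the projections balanced

print(f"GF {league_gf}, GA {league_ga}, imbalance {imbalance}")
print(f"average GF per team per game: {league_gf / len(ROWS) / 82:.2f}")
```

On the posted figures the two totals are 86 goals apart, so the for and against sides are not netting out. My guess is that this comes from the goalie projections not summing to zero and from GF and GA being regressed separately, but that is only a guess.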
....and yeah, this doesn't look too good for Vancouver. Their ES scoring numbers are just...very bad, and the goaltending also projects to be bad. Sorry.
Any other surprises? Dallas comes out looking like they might not be the fluke that everyone thought, and Vegas is still very strong, as is Boston. New Jersey is a pretty big surprise I'd say. My numbers still really like Crawford and Blackwood. I know I was bullish on MTL earlier but after running the numbers, maybe I had it wrong. Price does poorly with my goalie model though, so idk.
I will be tinkering with this over the coming days and weeks to take my mind off of *waves hand around.*
I welcome any thoughts/suggestions/ideas or to just share your own projections. Would love to compare notes if anyone else has tried anything! I can provide any data if you want to see any of the raw details as well.
Please be nice if I made some dumb errors; that is why I am posting this.
I ... really wasn't expecting the Canucks to look this bad when I started. You guys probably want to know what I did with the usage:
PlayerName | Pos | EV_Line | PP_Line | PK_Line |
J.T. Miller | F | 1 | 1 | 4 |
Bo Horvat | F | 1 | 1 | 4 |
Elias Pettersson | F | 1 | 1 | |
Tyler Myers | D | 1 | 2 | 2 |
Alex Edler | D | 1 | 4 | 1 |
Braden Holtby | G | 1 | ||
Nate Schmidt | D | 2 | 2 | 1 |
Brock Boeser | F | 2 | 1 | |
Tanner Pearson | F | 2 | 2 | 4 |
Jake Virtanen | F | 2 | 4 | |
Quinn Hughes | D | 2 | 1 | |
Thatcher Demko | G | 2 | ||
Jordie Benn | D | 3 | 2 | |
Loui Eriksson | F | 3 | 2 | |
Tyler Motte | F | 3 | 2 | |
Brandon Sutter | F | 3 | 1 | |
Micheal Ferland | F | 4 | 4 | |
Antoine Roussel | F | 4 | 2 | |
Sven Baertschi | F | 4 | 4 | |
Adam Gaudette | F | 4 | 2 | |
Jay Beagle | F | 4 | 1 | |
Zack MacEwen | F | 4 |
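To read this, it's basically what you'd think: under EV a "1" means I'm giving them 1st line ice time, so around 16 ES minutes a game. Under PP and PK I have 3 groupings: "1" for 1st unit time, which is something like twice as much as "2" for 2nd unit time, and then "4" for players who get just a smattering of PP/PK time, like 30 seconds per game or something.
I know we tend to think of Horvat as the 2C but he actually gets more minutes at ES than Pettersson. Anyway, I can definitely swap this around to what you guys think makes sense but I don't think it will have a significant effect.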
For the goalies...I have Holtby getting 65% of the minutes but I have no idea if that's correct. I don't think any of us know right now and also it will certainly depend on performance. I might be better off going 50/50 here. Thoughts?
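To see how much the split matters, here is a small sketch that blends two goalies' Goals Saved/60 by their share of the minutes. The rates in the example are placeholders, not my actual projections for Holtby and Demko, and the function name is made up.

```python
def team_goals_saved(starter_gs60, backup_gs60, starter_share, games=82):
    """Season goals saved for a tandem, given the starter's share of minutes."""
    blended = starter_share * starter_gs60 + (1 - starter_share) * backup_gs60
    # Regulation games are 60 minutes, so a per-60 rate times games gives the season total.
    return blended * games


# Placeholder rates: the two splits can be compared side by side.
split_65 = team_goals_saved(0.10, -0.10, 0.65)
split_50 = team_goals_saved(0.10, -0.10, 0.50)
```

If the two goalies project close to each other, the split barely moves the total; it only matters when there is a real gap between them.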