2018-19 stats and underlying metrics thread

Whileee

Registered User
May 29, 2010
46,072
33,114
I don’t know much about climate or weather, obviously, hahah!

I do know weather doesn’t have a known objective though, which is one difference.
But weather models have the same general objectives as hockey or other predictive models. Obviously, they differ in terms of how much randomness, the best way to fit models to data, etc.
 

Board Bard

Dane-O-Mite
Jun 7, 2014
7,888
5,055
Climate vs weather. Past a week or so, weather is nearly impossible to predict, but the average follows strong underlying patterns. Even while getting almost none of the daily predictions right, the average tracks these underlying patterns closely, which allows you to make very good predictions over longer periods (typically 10-30 years).

Hockey is similar. Smaller samples are dominated by luck, but as you look at larger sample sizes the underlying patterns emerge. Trying to predict an individual hockey game is like trying to predict the weather a week out. There is a good chance you will be wrong even with a model that provides good predictions, but that doesn't mean there are no larger patterns to be discovered.

A further similarity is that for both you get a group of people utterly convinced the larger patterns shown by the model are wrong because reasons.

I don't know if there's much using of Corsi etc. around here to predict individual game results. Seems what most people are doing with those stats is trying to deduce a likely trend for the team, looking ahead generally, and which players best support a rise of fortunes. They can also look back to see how the trajectory and its components fare against established trends. All the while knowing that even a strong Corsi case is only going to give you about a 60 percent chance of an unscathed prediction.
 

garret9

AKA#VitoCorrelationi
Mar 31, 2012
21,738
4,380
Vancouver
www.hockey-graphs.com
But weather models have the same general objectives as hockey or other predictive models. Obviously, they differ in terms of how much randomness, the best way to fit models to data, etc.

Aye.

I was just pointing out that, while hockey has more influence from "randomness" than sports like basketball and such, the sport differs in those things you are pointing out.
 

winnipegger

Registered User
Dec 17, 2013
8,128
6,354
Shot quantity isn't the whole picture, but it's a BIG part of the picture. If you look only at it, you miss some of the picture... but if you ignore it you miss an even greater part of the picture:
jets.png

I feel like I've asked you this before, but how is this graph derived? We can't really know if it's robust, right? How many seasons of data is it based on?

In a general sense I can see how this works, but how in the world do you isolate the component of luck to 33%? To suppose something is luck means you know what "should" happen absent luck, right? There is no reference data of a "perfect game" absent any element of luck from which to calibrate your luck meter. Does that make sense?
 

garret9

AKA#VitoCorrelationi
Mar 31, 2012
21,738
4,380
Vancouver
www.hockey-graphs.com
I feel like I've asked you this before, but how is this graph derived? We can't really know if it's robust, right? How many seasons of data is it based on?

In a general sense I can see how this works, but how in the world do you isolate the component of luck to 33%? To suppose something is luck means you know what "should" happen absent luck, right? There is no reference data of a "perfect game" absent any element of luck from which to calibrate your luck meter. Does that make sense?

Luck is variance around the expected value due to limited sample size.

You flip a coin 100 times, you probably won't get exactly 50 heads. You'll probably get something close but not quite.

If you keep repeating those 100 flips, and keep recording how many heads you get each time, you'll end up with a roughly normal distribution centred around 50. 50 will be the most common outcome, but not remotely the only outcome.

Now, let's say the NHL had perfect parity... Would every team end up with a 0.500 record? No. Just like the coin tosses, you'd have some teams with winning records and some teams with losing records.

We already expect, before accounting for some teams being better than others, for there to be a certain degree of spread in the standings due to luck.

Now, using this distribution and the real distribution, you can estimate how much luck is in the standings:
luck_vs_actual_medium.jpg


Gabriel Desjardins used this method, and it comes out to about 38% luck, or 62% non-luck. This also gives us a theoretical limit to how well the best analyst or the best statistical model (or the best combination) would be able to predict team and player performance.

Josh Weissbock, as part of his thesis, re-looked at this and found 38% as well (well, 0.376419753), and then used ML techniques to look at what the predicted upper limit to predicting win% is, to see if it matched... and found: 62% upper limit. (Fun aside: he found the CHL leagues all being just over 70%, suggesting there is less luck at those levels, which makes sense as there is less parity).

Tore Purdy, looking at team-to-team variation in point totals rather than record, obtained similar results. (Fun aside: he looked at pre-2007 lockout seasons and found less luck (25-30%), which makes sense again as there was less parity).

I can break down how I determined all the other factors as well, but in each case it's something like this where multiple people used different methods and different tools to find similar results.
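
For anyone who wants to play with the idea, here is a minimal sketch of the coin-flip comparison described above. It assumes the luck share is simply the variance a perfect-parity league would produce divided by the variance actually observed in the standings, and it ignores overtime and shootout points. The function names and the standings numbers are made up for illustration; this is not necessarily the exact calculation Desjardins, Weissbock, or Purdy used.

```python
import random
import statistics


def simulate_parity_league(n_teams, games, seed=0):
    """Win% for each team when every game is a fair coin flip."""
    rng = random.Random(seed)
    return [
        sum(rng.random() < 0.5 for _ in range(games)) / games
        for _ in range(n_teams)
    ]


def luck_share(win_pcts, games):
    """Share of the spread in win% that pure chance would produce.

    Assumption: share = binomial variance under parity / observed variance.
    """
    luck_var = 0.25 / games  # variance of win% for a 50/50 team
    observed_var = statistics.pvariance(win_pcts)
    if observed_var == 0:
        raise ValueError("no spread in the standings to explain")
    return luck_var / observed_var


# Hypothetical standings, made up for illustration only.
made_up_standings = [0.38, 0.44, 0.47, 0.50, 0.52, 0.55, 0.58, 0.62]
print(f"luck share: {luck_share(made_up_standings, games=82):.2f}")

# Spread you would see from luck alone in a very large parity league.
parity = simulate_parity_league(n_teams=5000, games=82)
print(f"parity league spread (SD): {statistics.pstdev(parity):.3f}")
```

Feeding in real win percentages instead of the made-up list is the only change needed to try it on an actual season. A tighter real spread pushes the luck share up, and a wider one pushes it down.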
 

lomiller1

Registered User
Jan 13, 2015
6,409
2,967
I don’t know much about climate or weather, obviously, hahah!

I do know weather doesn’t have a known objective though, which is one difference.
I didn’t intend for the analogy to go beyond both effectively being best modeled as random variation around an underlying trend. :) There is still a lot of commonality in the types of mistakes people make when looking at this type of data, specifically the tendency to mistake random variation for a trend that isn’t really there when you do the math.


In terms of differences, climate models are physical models with statistical output, while hockey models have more in common with those used in economics and finance. The latter are usually purely statistical models, and what can happen is that using the model can change the behavior it’s modeling. This tends to create a "they work until they don’t" phenomenon. We could be seeing something similar wrt Corsi. As teams recognize its importance and start using it to evaluate players and systems, it becomes more difficult for teams to gain an advantage, which in turn reduces its usefulness.
 

lomiller1

Registered User
Jan 13, 2015
6,409
2,967
Well ya. Of course shot quality would not factor when trying to estimate how often a goalie has to react to a shot. :P

As to shots not being created equal, that doesn't necessarily make it a bad thing.

You want to be able to break down the game into its individual components sometimes. A team or player that is outscoring but failing in the shot quantity portion of the game has an area they are failing in. As we've noted, those players and teams on average fall more often than not.

As to the style thing, don't forget what is explanative is not necessarily predictive.
Example: Adding shot location information to Corsi for an xGoal model increases in-sample correlation to goals but decreases out-of-sample correlation to goals. In other words, Corsi without shot location is more predictive than Corsi with shot location. My theory has been that this is because players control shot quantity to a far greater degree than they control shot quality, and the distribution in talent is larger for the former than the latter.

There's also a danger to the style thing. People fall for this all the time. Every single time a team or player is "beating Corsi" we see explanations for why it is due to their style, but far more often than not, we see these players and teams collapse.
Wrt shot quality, one thing many people seem to miss is that quality shot generation is not independent of shot attempt quantity.

First, the way most NHL teams try to generate quality shots is to attempt lower quality shots and look for tips, rebounds etc. If they don’t get these the failure probably isn’t the shot attempt, rather the failure is likely that players don’t get in position to "create traffic". Since most coaches emphasize this it should not be a surprise that most teams turn shot attempts into quality shots at very similar rates.

A second way in which they are not independent is that a truly bad shot attempt is one that results in a change in possession while having only a small chance to score either directly or via tip/deflection. Since the change in possession reduces your shot attempts (and increases the other team's), "bad" shot attempts are already factored into total shot attempts.

The net result is that shot volume already contains large amounts of information on shot quality, so when you try to add to it there is going to be a lot of redundancy, and the improvements people expect may not be there. IMO it would be interesting to see similar analysis for the KHL, which has a reputation for taking a different approach to quality shot generation.
 

winnipegger

Registered User
Dec 17, 2013
8,128
6,354
Luck is variance around the expected value due to limited sample size.

You flip a coin 100 times, you probably won't get exactly 50 heads. You'll probably get something close but not quite.

If you keep repeating those 100 flips, and keep recording how many heads you get each time, you'll end up with a roughly normal distribution centred around 50. 50 will be the most common outcome, but not remotely the only outcome.

Now, let's say the NHL had perfect parity... Would every team end up with a 0.500 record? No. Just like the coin tosses, you'd have some teams with winning records and some teams with losing records.

We already expect, before accounting for some teams being better than others, for there to be a certain degree of spread in the standings due to luck.

Now, using this distribution and the real distribution, you can estimate how much luck is in the standings:
luck_vs_actual_medium.jpg


Gabriel Desjardins used this method, and it comes out to about 38% luck, or 62% non-luck. This also gives us a theoretical limit to how well the best analyst or the best statistical model (or the best combination) would be able to predict team and player performance.

Josh Weissbock, as part of his thesis, re-looked at this and found 38% as well (well, 0.376419753), and then used ML techniques to look at what the predicted upper limit to predicting win% is, to see if it matched... and found: 62% upper limit. (Fun aside: he found the CHL leagues all being just over 70%, suggesting there is less luck at those levels, which makes sense as there is less parity).

Tore Purdy, looking at team-to-team variation in point totals rather than record, obtained similar results. (Fun aside: he looked at pre-2007 lockout seasons and found less luck (25-30%), which makes sense again as there was less parity).

I can break down how I determined all the other factors as well, but in each case it's something like this where multiple people used different methods and different tools to find similar results.

This was very good, thank you. I am slightly surprised to hear of people doing a thesis on hockey lol, but hey, I like reading it.

If you composed two teams of identical hockey robots with robot referees and made them play 1000 games or so, you would find a robust number for "luck" in hockey, because we would know the inputs from every individual consistently over time. Time to apply for funding for my thesis. I need 8 million dollars per hockey robot at minimum.
 

SUX2BU

User of registers
Feb 6, 2018
17,884
38,982
Canada
Have a stat for measuring heart, both during the season and the playoffs?

Stats can only go so far ....
 

garret9

AKA#VitoCorrelationi
Mar 31, 2012
21,738
4,380
Vancouver
www.hockey-graphs.com
Wrt shot quality, one thing many people seem to miss is that quality shot generation is not independent of shot attempt quantity.

First, the way most NHL teams try to generate quality shots is to attempt lower quality shots and look for tips, rebounds etc. If they don’t get these the failure probably isn’t the shot attempt, rather the failure is likely that players don’t get in position to "create traffic". Since most coaches emphasize this it should not be a surprise that most teams turn shot attempts into quality shots at very similar rates.

A second way in which they are not independent is that a truly bad shot attempt is one that results in a change in possession while having only a small chance to score either directly or via tip/deflection. Since the change in possession reduces your shot attempts (and increases the other team's), "bad" shot attempts are already factored into total shot attempts.

The net result is that shot volume already contains large amounts of information on shot quality, so when you try to add to it there is going to be a lot of redundancy, and the improvements people expect may not be there. IMO it would be interesting to see similar analysis for the KHL, which has a reputation for taking a different approach to quality shot generation.

Yup! It is true that in general there is correlation. For the most part, there is correlation in all stat qualities (some more than others), because if you are good at hockey you are likely good at *most* aspects of hockey, and similarly if you are bad.

Statistically speaking, this means that while the public (and even private) data doesn’t cover everything, it covers slices of even what it doesn’t cover.

So, for each new additional area we can account for, we reduce the marginal gain we will get from the next discovery.

This is why when I argued with Whilee and Jet about Byfuglien's defensive play I had fairly high confidence. After adjusting for the environment, the stats suggest Byfuglien has been essentially the same poor defensive defenseman in terms of shot quality throughout his entire career as a Jet. Whilee pointed out, correctly, that these models don’t cover everything that goes into shot quality. However, I am entirely skeptical that if Byfuglien were significantly better and different defensively, it would show up entirely and solely in the aspects we don’t account for and not at all in the aspects we can (shot distance, shot angle, type of shot (wrist, slap, etc), rebound, rush shot), which are likely correlated with those we can’t.
 

garret9

AKA#VitoCorrelationi
Mar 31, 2012
21,738
4,380
Vancouver
www.hockey-graphs.com
Graphs for your asssssss:

Forwards
Connor
sixfold-connoky96-1618.png

Scheifele
sixfold-scheima93-1618.png

Wheeler
sixfold-wheelbl86-1618.png


Perreault
sixfold-perrema88-1618.png

Little
sixfold-littlbr87-1618.png

Laine
sixfold-lainepa98-1618.png


Vesalainen

(N/A)
Roslovic (warning, low sample)
sixfold-rosloja97-1618.png

Ehlers
sixfold-ehlerni96-1618.png


Copp
sixfold-coppxan94-1618.png

Lowry
sixfold-lowryad93-1618.png

Tanev
sixfold-tanevbr91-1618.png


Dano
sixfold-danoxma94-1618.png

Lemieux

(N/A)


^Note: This is not a WAR statistic, but just looking at how a player impacts shot differentials, relative to their placement, and how much threat that equates to.
 

garret9

AKA#VitoCorrelationi
Mar 31, 2012
21,738
4,380
Vancouver
www.hockey-graphs.com
How good are the Jets forwards?

Well, to answer that question I took how well each forward ranked in 4 different stats (EW's WAR, Corsica's WAR, Corsica/DailyFaceoff Star Rating, and Game Score). I took the average of each player's rankings, and voila:

Screen_Shot_2018-10-04_at_10.42.16_AM.png


According to statistical performances, Laine, Scheifele, Wheeler, Ehlers, and Perreault are all top line talents. Connor, Lowry, Little, Roslovic, and Copp are considered 2nd line talents. Tanev is a third line talent.

Now, let's be honest, some of this is a "rising tide raises all boats". I bet the Jets lower ranked players would rank lower if they weren't used in an optimal role and play with such strong linemates (two of the stats <Game Score and Star Rating> don't account for usage or team strength) -- I'm looking at you Tanev and Copp. But still, that's pretty impressive.

*For Roslovic, I used the stat per GP instead of the raw total. I normally don't do that because we want to account for health factors; for example, Perreault ranks higher per game, but we know he's likely to miss games.
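
If you want to try the average-of-ranks idea yourself, here is a rough sketch. The player labels, stat keys, and rank numbers are all hypothetical and made up for illustration; they are not the values behind the chart above, and the tie-breaking and skip-if-missing rules are my own assumptions.

```python
from statistics import mean


def average_rank(ranks_by_stat):
    """Average each player's rank across stats; lower is better.

    ranks_by_stat maps stat name -> {player: rank}. Players missing from
    any stat are skipped so every average covers the same stats.
    """
    stats = list(ranks_by_stat.values())
    common = set(stats[0]).intersection(*stats[1:])
    averaged = {p: mean(s[p] for s in stats) for p in common}
    # Sort by average rank, then by name so ties come out in a stable order.
    return sorted(averaged.items(), key=lambda item: (item[1], item[0]))


# Hypothetical league-wide ranks, made up for illustration only.
example = {
    "war_a": {"Player A": 12, "Player B": 140, "Player C": 60},
    "war_b": {"Player A": 20, "Player B": 155, "Player C": 48},
    "star_rating": {"Player A": 9, "Player B": 120, "Player C": 75},
    "game_score": {"Player A": 15, "Player B": 133, "Player C": 57},
}
for player, avg in average_rank(example):
    print(f"{player}: {avg:.1f}")
```

Averaging ranks rather than raw values keeps stats on different scales from swamping each other, at the cost of throwing away how big the gaps between players are.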
 

surixon

Registered User
Jul 12, 2003
48,897
69,657
Winnipeg
How good are the Jets forwards?

Well, to answer that question I took how well each forward ranked in 4 different stats (EW's WAR, Corsica's WAR, Corsica/DailyFaceoff Star Rating, and Game Score). I took the average of each player's rankings, and voila:

Screen_Shot_2018-10-04_at_10.42.16_AM.png


According to statistical performances, Laine, Scheifele, Wheeler, Ehlers, and Perreault are all top line talents. Connor, Lowry, Little, Roslovic, and Copp are considered 2nd line talents. Tanev is a third line talent.

Now, let's be honest, some of this is a "rising tide raises all boats". I bet the Jets lower ranked players would rank lower if they weren't used in an optimal role and play with such strong linemates (two of the stats <Game Score and Star Rating> don't account for usage) -- I'm looking at you Tanev and Copp. But still, that's pretty impressive.

*For Roslovic, I used the stat per GP instead of the raw total. I normally don't do that because we want to account for health factors; for example, Perreault ranks higher per game, but we know he's likely to miss games.

This is neat, are you going to be playing one for the defense?
 

Whileee

Registered User
May 29, 2010
46,072
33,114
How good are the Jets forwards?

Well, to answer that question I took how well each forward ranked in 4 different stats (EW's WAR, Corsica's WAR, Corsica/DailyFaceoff Star Rating, and Game Score). I took the average of each player's rankings, and voila:

Screen_Shot_2018-10-04_at_10.42.16_AM.png


According to statistical performances, Laine, Scheifele, Wheeler, Ehlers, and Perreault are all top line talents. Connor, Lowry, Little, Roslovic, and Copp are considered 2nd line talents. Tanev is a third line talent.

Now, let's be honest, some of this is a "rising tide raises all boats". I bet the Jets lower ranked players would rank lower if they weren't used in an optimal role and play with such strong linemates (two of the stats <Game Score and Star Rating> don't account for usage or team strength) -- I'm looking at you Tanev and Copp. But still, that's pretty impressive.

*For Roslovic, I used the stat per GP instead of the raw total. I normally don't do that because we want to account for health factors; for example, Perreault ranks higher per game, but we know he's likely to miss games.
Here's my hot take... Copp is going to take another step forward this season. That CLT line might be even better by adding a bit of offense.
 

garret9

AKA#VitoCorrelationi
Mar 31, 2012
21,738
4,380
Vancouver
www.hockey-graphs.com
And now for defense:

Screen_Shot_2018-10-04_at_11.09.40_AM.png


That defense could use Enstrom. He'd rank in the low 100s like Myers, and without a career PP year to do it.

It has three 1st pairing defenders, but all three of them are below average.
Myers is considered a below average 2nd pairing defender, but almost exclusively because of his PP points pulling him up. I'm skeptical he repeats that production.
Chiarot, Morrow, and Kulikov: having only 1 would be optimal, and 2 would still be good. Having all three of them is not that great.
 

Weezeric

Registered User
Jan 27, 2015
4,475
6,550
Here's my hot take... Copp is going to take another step forward this season. That CLT line might be even better by adding a bit of offense.

I could see Copp having an Andrew Ladd-like career. He may be a guy the Jets won’t be able to keep after this season and who has more success with more opportunity in another city.
 

surixon

Registered User
Jul 12, 2003
48,897
69,657
Winnipeg
And now for defense:

Screen_Shot_2018-10-04_at_11.09.40_AM.png


That defense could use Enstrom.

It has three 1st pairing defenders, but all three of them are below average.
Myers is considered a below average 2nd pairing defender, but almost exclusively because of his PP points pulling him up. I'm skeptical he repeats that production.
Chiarot, Morrow, and Kulikov: having only 1 would be optimal, and 2 would still be good. Having all three of them is not that great.

Well, hopefully Niku can come in and perform around where Myers is currently ranked when he gets a chance. There is still growth opportunity for Morrissey and Trouba, who I feel are below average largely due to next to no PP time.
 

Ducky10

Searching for Mark Scheifele
Nov 14, 2014
19,809
31,386
And now for defense:

Screen_Shot_2018-10-04_at_11.09.40_AM.png


That defense could use Enstrom. He'd rank in the low 100s like Myers, and without a career PP year to do it.

It has three 1st pairing defenders, but all three of them are below average.
Myers is considered a below average 2nd pairing defender, but almost exclusively because of his PP points pulling him up. I'm skeptical he repeats that production.
Chiarot, Morrow, and Kulikov: having only 1 would be optimal, and 2 would still be good. Having all three of them is not that great.


Hmm, I said that in another thread, and didn't even need a chart. I got roundly mocked, but I'm ok with that.


(I like the charts though, neat).
 
