News Article: "Fun With Numbers" - Advanced Stats Talk Here

operasen

Registered User
Apr 27, 2004
5,681
346
Advanced stats are only an advantage to a club if they find an angle that other teams have not considered. If they are doing everything the same way as everyone else the game is about not falling behind instead of getting ahead. That is the big push toward innovation and development of new metrics.

I would caution against accepting teams at face value when they say they are already doing advanced stats and it is hush-hush. There is a discussion of this earlier in the thread.

This is so new that I agree there is a lot of misinformation as teams get up to speed on what it's all about.

Plus, learning how to use them could completely redefine what teams look for when hiring a cutting-edge coach and associates.

This could require a changing of the guard in places like Toronto as compared to Edmonton. Both have openly hired people to develop and learn to use them. The coach in Toronto is pretty old school while Eakins in Edmonton is quite pleased with the idea.
 

Micklebot

Moderator
Apr 27, 2010
53,913
31,129
Not an article on the Sens per se, but Sportsnet published a list of D with > 400 touches in the defensive zone last year. From their data, Karlsson has the highest success rate exiting the zone (not sure what definition they used in determining this), and I believe Wiercioch is 5th in that metric.

Full list:
http://www.sportsnet.ca/hockey/nhl/datat-for-all-d-with-450-touches-in-the-dz/

The list is sorted by touches per Corsi against, which seemed an odd metric to me, but the author explains it in another article.

The data is not kind to Cowen; he had a low exit success %, high turnover %, and low touches/CA, though still above 1, which is the break-even point.

Overall, it's an interesting attempt at breaking down defensive zone only stats.

edit: First criticism: it ignores that teams may target or avoid certain players. Weber is likely to get fewer touches because the opposition dumps or carries it in on his partner's side; likewise, a lesser player may be targeted.

Also, I'd prefer to see touches per TOI (in the DZ), total TOI in the DZ, and total TOI. As it stands, it completely ignores how good a player is at denying entry altogether (which would result in no touches and no CA). To me, that's the best defensive play you can make.
 

FlyingJ

Registered User
Feb 25, 2014
841
148
An advanced stat that doesn't make Cowen look like a solid top 4 D on any team in the league? Say it ain't so! I thought blatant cognitive bias on only 1 or 2 bad plays was the reason for some people ****ing on him? Next thing you'll tell me is that there are advanced stats that make the GSN line's performance last year look like garbage :sarcasm:
 

Caeldan

Whippet Whisperer
Jun 21, 2008
15,459
1,046
Well... the other part of those stats shows that Phillips comes in 2nd on the team behind Karlsson and that Wiercioch was the worst (maybe there was something to MacLean keeping him off the ice?)
 

Micklebot

Moderator
Apr 27, 2010
53,913
31,129
For what it's worth, anything over 1.0 in their touches/Corsi Against is considered good by the author, and both Cowen and Wiercioch meet that mark, though I'm not sure why >1.0 is the benchmark.
 

Caeldan

Whippet Whisperer
Jun 21, 2008
15,459
1,046
Well, they explain it this way: if you're skilled, you'll have more touches than you'll have shot attempts against (CA).

If you're weak, they'll have more shot attempts against you than you'll have touches of the puck.

Though, I'm not sure that bears out exactly... especially when the definition of touch is 'possession of the puck with the intention of exiting the zone'.

A heavy shot-blocking defender would allow several shot attempts before ever actually gaining the puck. In fact, more often than not it might be their more skilled puck-handling defensive partner who ends up picking up the deflection or wide shot (due to the defender being in position), and so the 'Volchenkov' could look significantly worse by this metric compared to the 'PrimePhillips'.
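To make the ratio concrete, here is a minimal Python sketch of touches per Corsi against with the 1.0 break-even mark. The function name and the numbers are made up for illustration; they are not taken from the Sportsnet data.

```python
def touches_per_ca(touches: int, corsi_against: int) -> float:
    """Defensive-zone touches divided by on-ice shot attempts against (CA)."""
    if corsi_against <= 0:
        raise ValueError("corsi_against must be positive")
    return touches / corsi_against


# Hypothetical pairing: both face the same shot attempts against,
# but the puck mover ends up with more of the touches.
pairing = {
    "puck_mover": {"touches": 520, "corsi_against": 450},
    "shot_blocker": {"touches": 410, "corsi_against": 450},
}

ratios = {name: touches_per_ca(**line) for name, line in pairing.items()}
above_break_even = {name: ratio > 1.0 for name, ratio in ratios.items()}
```

With these invented numbers the shot blocker lands below 1.0 even though both defenders were on the ice for the same attempts, which is the concern raised above.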
 

Micklebot

Moderator
Apr 27, 2010
53,913
31,129
I think the first step to improving it would be to switch from Corsi against to Fenwick against, though it would still favour puck movers over shot blockers.
 

Caeldan

Whippet Whisperer
Jun 21, 2008
15,459
1,046
That and/or possibly include blocked shots as 'touches' to buoy their ratio slightly.
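A rough sketch of how the two tweaks suggested here would change the ratio, assuming Fenwick against is Corsi against minus the attempts blocked while the player is on the ice. The class, field names and numbers are hypothetical, not from the article.

```python
from dataclasses import dataclass


@dataclass
class DZoneLine:
    touches: int        # defensive-zone touches
    corsi_against: int  # all on-ice shot attempts against
    on_ice_blocks: int  # attempts blocked by the defending team while on the ice
    own_blocks: int     # blocks made by this player


def ratio_corsi(p: DZoneLine) -> float:
    return p.touches / p.corsi_against


def ratio_fenwick(p: DZoneLine) -> float:
    # Fenwick against = unblocked shot attempts against
    return p.touches / (p.corsi_against - p.on_ice_blocks)


def ratio_blocks_as_touches(p: DZoneLine) -> float:
    # Variant: credit the player's own blocks as touches
    return (p.touches + p.own_blocks) / p.corsi_against


# Invented numbers for a shot blocker and a puck mover on the same pairing
blocker = DZoneLine(touches=410, corsi_against=450, on_ice_blocks=90, own_blocks=60)
mover = DZoneLine(touches=520, corsi_against=450, on_ice_blocks=90, own_blocks=15)
```

In this made-up example the blocker moves above 1.0 under either tweak, but the puck mover still comes out ahead on every version of the ratio.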
 

dumbdick

Galactic Defender
May 31, 2008
11,353
3,774
For those interested in this stuff I ran an advanced prediction model forecasting player data going back to 1998 to forecast points in this upcoming season. It was pretty quick and dirty just to practice the technique. I read the first name on the list of predictions and was impressed - it was Tyler Seguin, which matches what the hockey forecaster is predicting.

Then I saw number two -- Kyle Turris -- and I immediately changed my mind. Also interesting to note that it predicted Da Costa to make the biggest year over year gains in point totals. :laugh:

Here's the top 20 predictions from the model:
Seguin, Turris, Kane, Duchene, Toews, Kessel, Johansen, Hall, Tarasenko, O'Reilly, Perry, Okposo, Pavelski, Giroux, Crosby, Steen, Vanek, Skinner, Couture, Kopitar

Crosby's way too low and some big time omissions from the top 20 (no Tavares, no Malkin, no Stamkos). Back to the drawing board I guess but not a bad start.
 

StefanW

Registered User
Mar 13, 2013
6,286
0
Ottawa
www.storiesnumberstell.com
You can rule out SDC, but with the others you never know. I think you should save this iteration and see how it matches up against reality when it happens. If predictive modeling were only about lining things up with common sense there would be no point to the models.
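One simple way to score a saved set of predictions once the season is over is a rank correlation between predicted and actual points. The sketch below uses Spearman's rho in plain Python; the player labels and point totals are placeholders, not real forecasts or results.

```python
def ranks(values):
    """Rank 1 = highest value; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    spread_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (spread_x * spread_y)


def spearman(predicted, actual):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(ranks(predicted), ranks(actual))


# Placeholder data: predicted vs. actual points for five unnamed players
predicted_points = [90, 85, 80, 70, 60]
actual_points = [88, 60, 82, 75, 65]
rho = spearman(predicted_points, actual_points)
```

A rho near 1 would mean the saved ordering held up; a value near 0 would mean it did no better than a shuffle.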
 

dumbdick

Galactic Defender
May 31, 2008
11,353
3,774
Yeah, it's one of these "black box" type machine learning techniques, so it's difficult to break out which variables are really driving the predictions. I would be pleasantly surprised if Turris is runner up to the Art Ross though. If it happens, you can be sure I'll be bumping that post.
 

Caeldan

Whippet Whisperer
Jun 21, 2008
15,459
1,046
I think injuries are a major factor throwing off your model, if the big names out of place are Crosby, Malkin, Tavares and Stamkos.
That's something they all have in common: at least one season with a significant number of games missed due to injury.
 

dumbdick

Galactic Defender
May 31, 2008
11,353
3,774
That makes sense for some of the players that were injured last season. It's a simple-ish test model based on 20+ variables, trained on 10 years of data, looking only at a player's previous-year stats to predict the current season's point totals. The fact that it only uses the previous year's stats has me really scratching my head as to how it didn't predict Crosby higher. He had 104 points last season, and the model doesn't have any information to identify which player is Crosby or the fact that he has an injury history. It's almost like the model "recognized" Crosby directly from his unique stat signature and inferred the injury potential from his previous seasons (which probably means my model is overfit). Pretty cool the way these models pull out a bunch of patterns in the interactions between variables that would be impossible to see from the numbers themselves.
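For anyone wondering what "previous-year stats to predict the current season" looks like as data, here is a minimal sketch of building the training pairs. It is only an illustration of the setup, not the actual model; the field names and rows are invented.

```python
from collections import defaultdict


def lagged_pairs(rows, feature_keys, target_key="points"):
    """Pair each player's season-N stat line with their season-N+1 target.

    The player id is used only to join consecutive seasons; it is never
    included in the features, so the model cannot tell who is who.
    """
    by_player = defaultdict(dict)
    for row in rows:
        by_player[row["player"]][row["season"]] = row

    pairs = []
    for seasons in by_player.values():
        for year in sorted(seasons):
            following = seasons.get(year + 1)
            if following is not None:  # skip gaps, e.g. a fully missed season
                features = [seasons[year][key] for key in feature_keys]
                pairs.append((features, following[target_key]))
    return pairs


# Invented rows; "season" is the starting year of the season
rows = [
    {"player": "p1", "season": 2011, "games": 82, "shots": 200, "points": 60},
    {"player": "p1", "season": 2012, "games": 48, "shots": 120, "points": 40},
    {"player": "p1", "season": 2013, "games": 80, "shots": 210, "points": 70},
    {"player": "p2", "season": 2012, "games": 40, "shots": 90, "points": 25},
    {"player": "p2", "season": 2013, "games": 78, "shots": 180, "points": 55},
    {"player": "p3", "season": 2011, "games": 70, "shots": 150, "points": 45},
    {"player": "p3", "season": 2013, "games": 75, "shots": 160, "points": 50},
]
pairs = lagged_pairs(rows, feature_keys=["games", "shots", "points"])
```

Each pair is one training example: a single-season stat line as input and the following season's points as the target, so an injury-shortened year shows up only through features like games played.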
 

StefanW

Registered User
Mar 13, 2013
6,286
0
Ottawa
www.storiesnumberstell.com
That settles it, I am dropping Crosby from my pool. :sarcasm:

I do think the model is taking his injury potential into account, which is kind of cool. Injuries are as much a part of the game as progression and regression.
 

Caeldan

Whippet Whisperer
Jun 21, 2008
15,459
1,046
To test the injury hypothesis... What's it done with Spezza?
Considering he only had 5 games in the lockout season and a few 60 game seasons?

Also, where's Ovechkin? He's been top 10 in scoring pretty much every season but 11-12 and plays nearly every game.
 

dumbdick

Galactic Defender
May 31, 2008
11,353
3,774
Here's what the model predicts for points ranks next year compared to actual points ranks last year for the top 100 players. It's predicting regression from players like Ovie and Spezz. (Big year for Turris, but regression for Karlsson). I actually agree with a lot of the players that it's predicting for improvement here. Points at a young age is obviously an important variable for predicting upward trends, but there are a lot of exceptions like Hossa, Palat, etc... Again worth noting that the model doesn't know which players are which or anything at all about a player's performance before 2013-2014. It's just backing this stuff out of their stats line for a single season. My apologies if nobody else finds this stuff interesting, but I think this stuff is pretty cool (even if I doubt it works).

[Image: PredictedPointsLeaders2014-15_zpsf4b80de6.jpg (predicted points ranks for 2014-15 vs. actual points ranks last year, top 100 players)]
 

StefanW

Registered User
Mar 13, 2013
6,286
0
Ottawa
www.storiesnumberstell.com
I think this is really cool, thanks for sharing. :handclap:

Do you mind a few questions? I was wondering how many years of data you fed into it to establish the regression line. Also, it seems like the residuals would be pretty brutal because of the difference in talent levels of NHL players. How did you deal with those types of issues?
 

dumbdick

Galactic Defender
May 31, 2008
11,353
3,774
It's not regression based, it's a classification model. It's trained on 13 years of data, going back to 2001. The model fit isn't that great overall, but not that bad considering injuries, trades, linemates, etc, and all the other noise in the data. This was really just meant to be a test of the procedure. Next phase is to beef up the model a bit more and try to fix some underlying problems.
 

StefanW

Registered User
Mar 13, 2013
6,286
0
Ottawa
www.storiesnumberstell.com
Ok, gotcha. I'm more familiar with SEMs that combine regression and factor analysis, so I can't give you much insight beyond that it looks cool.
 

dumbdick

Galactic Defender
May 31, 2008
11,353
3,774
Is that Joel Ward at the bottom there?

That's a weird name to see on a graph like this.

:laugh: Yeah, there are some weird ones in there. He had a big season last year with 49 points and played all 82 games, which is probably how he snuck in there. There are probably some really big names that were injured last year that the model didn't predict well.
 

dumbdick

Galactic Defender
May 31, 2008
11,353
3,774
Ok, gotcha. I'm more familiar with SEMs that combine regression and factor analysis, so I can't give you much insight beyond that it looks cool.

Cool. Can't say I know much about either of those (SEM or Factor Analysis), but I find a lot of times different disciplines will use different names for similar approaches.
 

BonkTastic

ಠ_ಠ
Nov 9, 2010
30,901
10,092
Parts Unknown
:laugh: Yeah, there are some weird ones in there. He had a big season last year with 49 points and played all 82 games, which is probably how he snuck in there. There are probably some really big names that were injured last year that the model didn't predict well.

Yeah, I guess it's hard to account for some of those statistical outliers like Ward's freakishly odd career season at 33 years old.



Fun Fact: Joel Ward's 18.0% shooting percentage last season was 63% better than his career average (11%). Regressing to his career average would mean a drop in goals from 24 over the course of an 82 game season, to ~14... which would still be the 3rd highest goal total of his NHL career.
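The arithmetic behind that can be sketched in a few lines of Python, assuming the same number of shots and the rounded percentages quoted above (the variable names are just for illustration):

```python
goals_last_season = 24
shooting_pct_last_season = 0.18
career_shooting_pct = 0.11  # rounded career average as quoted

# Shots implied by 24 goals at 18.0%
implied_shots = goals_last_season / shooting_pct_last_season

# Goals on the same shot volume at the career rate
expected_goals = implied_shots * career_shooting_pct

# How much better last season's rate was than the career rate
relative_gain = shooting_pct_last_season / career_shooting_pct - 1
```

With these rounded inputs the same shot volume works out to somewhere around 14 to 15 goals, and the relative gain to a bit under two thirds, in line with the numbers above.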
 
