What to Look for when Finding Comparable Players?

silverfish

got perma'd
Jun 24, 2008
34,644
4,353
under the bridge
Pretty simple ask here, but as with all things I put my hands on, I'm going to make it more complicated than it probably is :)

I'm trying to find the best way to compare players to one another by looking at players with very similar games. Obviously, I'm trying to do this in a data-based way rather than a subjective way. For instance, I know I could probably compare JT Miller and Brandon Dubinsky, but does the data back that up? Who is a surprise comparison to JT Miller? Things like that...

The metrics I'm looking at are a bit more "tendency" based than statistically based. For instance, I'd rather look at how often a player shoots the puck rather than how many goals he scores. Is that ridiculous? Am I being too "nitpicky"?

Right now, here are the metrics I'm looking at:

  • Age
  • Primary Points per 60
  • Individual Shot Attempts per 60
  • Percent of shot attempts that are unblocked
  • Relative scoring chances for

And then a few things that are out of the player's control that may provide an underlying meaning to their metrics:

  • tCF60
  • Relative Zone Starts

Curious what you guys think of this approach, and if you have any other recommended metrics that may be worthwhile?

Also, a bit OT, but is anyone a member of any good Slack channels that are dedicated to talking Hockey Analytics? I'd love to get involved in something like that, even if it's just being a fly on the wall. I think I account for 25% of the posts in the HFNYR Advanced Stats thread, and I think everyone there is tired of my crazy/preposterous projects that I take on (including this one) :laugh:
 

Canadiens1958

Registered User
Nov 30, 2007
20,020
2,779
Lake Memphremagog, QC.
Considerations

Perhaps consider the following for skaters.

Position - LW/C/RW/LD/RD. Handedness - LHS/RHS.
Provenance. All of the game metrics above have elements of provenance. This is not an exhaustive list, but one that serves as an example of the approach.

Relative Zone Starts. In either the offensive or defensive zone, which centers get the left circle draws? Which ones get the right circle draws? Is there a difference in performance by a specific center in the left circle or right circle in each zone?

Shooting related. Individual skater shooting skills are evaluated using a grid. A LW will take shots mainly from the left wing in the offensive zone, a RW from the right wing. But where exactly on each wing are they shooting from? How successful are they from the various areas? Are the shots coming off a rush, a set play following a faceoff, skill-related shots, re-directs, deflections, wrap-arounds, etc.?

So forth....
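
A rough sketch of the grid idea, assuming shot locations are available as (x, y) coordinates in feet. The sample shots, the 10-foot cell size, and the function name `shot_grid` are all made up for illustration; this is not any tracking provider's actual format.

```python
from collections import defaultdict

def shot_grid(shots, cell=10.0):
    """Bin shots into square cells; return {cell: (attempts, goals, rate)}."""
    counts = defaultdict(lambda: [0, 0])  # cell -> [attempts, goals]
    for x, y, is_goal in shots:
        # Floor division keeps negative coordinates in their own cells.
        key = (int(x // cell), int(y // cell))
        counts[key][0] += 1
        counts[key][1] += int(is_goal)
    return {k: (a, g, g / a) for k, (a, g) in counts.items()}

# Hypothetical shots: (x, y, scored?) for one skater in the offensive zone.
shots = [
    (72, -15, False),
    (75, -12, True),
    (78, -18, False),
    (60, 20, False),
    (85, 3, True),
]
grid = shot_grid(shots)
for cell_key, (attempts, goals, rate) in sorted(grid.items()):
    print(cell_key, attempts, goals, round(rate, 3))
```

The same binning could be repeated per shot type (rush, faceoff play, deflection, and so on) if that label exists in the data.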
 

silverfish

got perma'd
Jun 24, 2008
34,644
4,353
under the bridge
Bumping this back up... what do you guys think about these stats and the weight I am applying to them? I'm not in love with it, considering some of the results I'm getting

(i.e., right now, Derick Brassard's top comparable for last season is Paul Gaustad's 2010-2011 season. Both of them having 10 5v5 goals is maybe playing too big of a part, but that's important, no?)

iCF60 (weight: 2.5)
PST (Percent of shot attempts for that belong to the player - allows me to figure out the shooting tendencies | weight: 2.5)
PrimaryPoints per 60 (weight: 3.5)
Unblocked Shot Attempt Success (Individual Fenwick divided by Individual Corsi | weight: 2)
SCF.Relative (weight: 1.5)
TOI/Gm (weight: 2)
tCF60 (weight: 1)
cCF60 (weight: 1)
Goals (weight: 5)
Primary assists (weight: 4)

I'm not sure if there is a statistically relevant way to gauge what the correct weights should be? And I'm afraid the more I manipulate the data, the more it just becomes me making it what I want it to be.

Maybe Brassard last season and Gaustad in 2010-2011 are comparable :dunno:
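
For anyone following along, here is a minimal sketch of how a weighted comparison like this could be computed. The thread doesn't say how the similarity score is actually calculated, so this is only one possible setup: each metric is converted to a z-score so that raw counts (like goals) and rates (like iCF60) sit on the same scale, and the weights are then applied inside a Euclidean distance. The player names and numbers are invented, and only four of the listed metrics are used to keep it short.

```python
import math
from statistics import mean, pstdev

# Hypothetical player-seasons; all numbers are invented for illustration.
players = {
    "Player A": {"iCF60": 14.0, "P1_60": 1.6, "Goals": 10, "TOI_Gm": 14.5},
    "Player B": {"iCF60": 9.0, "P1_60": 0.7, "Goals": 10, "TOI_Gm": 11.0},
    "Player C": {"iCF60": 13.5, "P1_60": 1.5, "Goals": 14, "TOI_Gm": 14.0},
    "Player D": {"iCF60": 8.5, "P1_60": 0.8, "Goals": 6, "TOI_Gm": 10.5},
}
# Weights copied from the list above for the metrics used here.
weights = {"iCF60": 2.5, "P1_60": 3.5, "Goals": 5.0, "TOI_Gm": 2.0}

def zscores(players):
    """Standardize every weighted metric across the whole player pool."""
    out = {name: {} for name in players}
    for m in weights:
        vals = [p[m] for p in players.values()]
        mu, sd = mean(vals), pstdev(vals)
        for name, p in players.items():
            out[name][m] = (p[m] - mu) / sd if sd else 0.0
    return out

def distance(a, b, z):
    """Weighted Euclidean distance between two players' z-scores."""
    return math.sqrt(sum(w * (z[a][m] - z[b][m]) ** 2 for m, w in weights.items()))

def comparables(target, z):
    """Other players sorted from most to least similar to the target."""
    others = [n for n in z if n != target]
    return sorted(others, key=lambda n: distance(target, n, z))

z = zscores(players)
ranked = comparables("Player A", z)
print(ranked)
```

In this toy data, Player B has the same goal total as Player A but a very different usage profile, and still ranks behind Player C once everything is standardized. Whether that holds for real data depends entirely on the pool and the weights.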
 

36kap36

Registered User
Jan 21, 2011
882
0
Ohio
Are these just arbitrary weights? I get the point of them all, but what do they actually mean on their own? And better yet, how can you convince someone that, for example, TOI/Gm is twice as important as tCF60 on its own? It seems obvious, yeah, but it's still important to justify the weights.
 

silverfish

got perma'd
Jun 24, 2008
34,644
4,353
under the bridge
Completely arbitrary.
 

36kap36

Registered User
Jan 21, 2011
882
0
Ohio
I'd suggest some sort of method to get non-arbitrary weights. Maybe run a regression on a list of players you already think are similar, based on your current stats plus more that you might not think are necessary? That is still somewhat arbitrary to start, yes, and you need to have a large enough set to run the regression, but then I believe you'd be on the right track.

I'd be happy to help out on compiling the data set, if you'd want help. If not, just my two cents!
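
One way this could look in practice, as a sketch only: take pairs of players, label each pair 1 if you already consider them similar and 0 if not, and use the absolute difference in each (standardized) metric as the inputs to a logistic regression. A metric whose differences strongly predict "not similar" gets a large negative coefficient, and flipping the sign gives a candidate weight. The pair data, the two metric names, and the helper `fit_logistic` below are all hypothetical, and a real run would need far more labelled pairs than this.

```python
import math

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Plain gradient-descent logistic regression; returns (bias, coefs)."""
    n_feat = len(X[0])
    b, w = 0.0, [0.0] * n_feat
    for _ in range(epochs):
        gb, gw = 0.0, [0.0] * n_feat
        for xi, yi in zip(X, y):
            score = b + sum(wj * xj for wj, xj in zip(w, xi))
            p = 1.0 / (1.0 + math.exp(-score))
            err = p - yi
            gb += err
            for j, xj in enumerate(xi):
                gw[j] += err * xj
        b -= lr * gb / len(X)
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, gw)]
    return b, w

# Hypothetical pairs: [|diff in iCF60 z-score|, |diff in TOI/Gm z-score|].
X = [
    [0.1, 0.5], [0.2, 1.5], [0.3, 1.0], [0.2, 0.2],  # pairs judged similar
    [1.5, 0.4], [1.8, 1.4], [2.0, 1.0], [1.6, 0.3],  # pairs judged not similar
]
y = [1, 1, 1, 1, 0, 0, 0, 0]

bias, coefs = fit_logistic(X, y)
# A bigger metric gap lowering the odds of "similar" means a negative
# coefficient, so the sign is flipped and anything non-negative is clipped.
metric_weights = {
    "iCF60": max(0.0, -coefs[0]),
    "TOI_Gm": max(0.0, -coefs[1]),
}
print(metric_weights)
```

In this made-up data the iCF60 gap separates the two groups and the TOI/Gm gap barely does, so iCF60 ends up with the larger weight. The labels are still subjective, so this only moves the arbitrariness from the weights to the choice of "similar" pairs.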
 

silverfish

got perma'd
Jun 24, 2008
34,644
4,353
under the bridge
I've got the data in place, AFAIK. I'd be more interested in how to run the regressions to get the weights.

Cordially,
Someone who wishes they went to college after the increase of statistical influence in hockey became mainstream ;)
 

silverfish

got perma'd
Jun 24, 2008
34,644
4,353
under the bridge
I've made some changes for fear of this being an overfit model. Now using:

iCF60 (2.5)
Percent of Shots Taken (2.5)
Primary Points per 60 (3.5)
TOI per game (2)
tCF60 (1)
Goals (5.5)
A1 (4)

I wonder if it would be smarter to use goals per 60 and primary assists per 60.
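
For what it's worth, the per-60 conversion is just the count scaled by ice time. A tiny sketch, with invented ice-time totals and a hypothetical helper name `per_60`:

```python
def per_60(count, toi_minutes):
    """Convert a raw count into a rate per 60 minutes of ice time."""
    if toi_minutes <= 0:
        raise ValueError("toi_minutes must be positive")
    return count * 60.0 / toi_minutes

# Hypothetical: two players with 10 5v5 goals each but different 5v5 ice time.
top_line = per_60(10, 1100)     # about 0.55 goals per 60
fourth_line = per_60(10, 700)   # about 0.86 goals per 60
print(round(top_line, 2), round(fourth_line, 2))
```

With rates instead of counts, two players who both scored 10 goals no longer look identical on that metric unless their ice time was also the same.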
 

A4T1L6

Registered User
Feb 10, 2015
2,850
1,213
that model looks good however
 

eperry

Registered User
Jun 27, 2016
64
9
Given a reasonably reliable categorized subset of players, you could build a simple classification model (K-Nearest Neighbours, multinomial logistic) to define weights. I've tried this in the past with crowd-sourced data using questionnaires.
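
A minimal sketch of how the nearest-neighbour version might be used to judge weights, assuming a small set of players has already been hand-labelled by type. It scores each candidate weight vector by leave-one-out accuracy: hide one player, find the nearest labelled neighbour under those weights, and check whether the labels match. The labels ("shooter", "passer"), the z-scored values, and the candidate weights are all invented, and this is not a description of the questionnaire-based approach mentioned above.

```python
import math

# Hypothetical labelled player-seasons: (z-scored metrics, role label).
labelled = [
    ((1.2, 0.9), "shooter"), ((1.0, 1.1), "shooter"), ((0.8, -1.0), "shooter"),
    ((-1.1, 1.0), "passer"), ((-0.9, -0.8), "passer"), ((-1.0, -1.2), "passer"),
]

def wdist(a, b, w):
    """Weighted Euclidean distance between two metric vectors."""
    return math.sqrt(sum(wi * (ai - bi) ** 2 for ai, bi, wi in zip(a, b, w)))

def loo_accuracy(data, w, k=1):
    """Leave-one-out accuracy of a k-nearest-neighbour vote under weights w."""
    hits = 0
    for i, (x, label) in enumerate(data):
        rest = [d for j, d in enumerate(data) if j != i]
        nearest = sorted(rest, key=lambda d: wdist(x, d[0], w))[:k]
        votes = [lab for _, lab in nearest]
        guess = max(set(votes), key=votes.count)
        hits += guess == label
    return hits / len(data)

# Two made-up weightings: one leaning on metric 1, one on metric 2.
candidates = {"metric 1 heavy": (3.0, 0.5), "metric 2 heavy": (0.5, 3.0)}
scores = {name: loo_accuracy(labelled, w) for name, w in candidates.items()}
print(scores)
```

The weighting with the higher score groups the labelled players better, which gives at least some basis for preferring it over the other. A grid or random search over weight vectors would follow the same pattern.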
 

