Series Discussion: My Completely Statistical NHL Ranking Input please

SKRusty

Napalm
Jan 20, 2016
2,611
1,062
Okay, I have been playing with statistics, trying to get a ranking system that is based completely on stats.

I have incorporated PDO, save percentage, advanced stats and of course points.

PDO and save percentage are weighted by games played.

Please give constructive input. Don't give me the subjective "I don't think they belong there because I don't think they are that good."

Give me a statistical basis for the error.


Rank  Team                    GP  Points  Projected Pts
1     Tampa Bay Lightning     14  21      127
2     Nashville Predators     14  22      124
3     San Jose Sharks         14  17      112
4     Minnesota Wild          13  18      112
5     Edmonton Oilers         13  17      109
6     Calgary Flames          15  19      106
7     Winnipeg Jets           14  17      103
8     Toronto Maple Leafs     14  18      103
9     Montreal Canadiens      13  16      103
10    Boston Bruins           13  16      100
11    Dallas Stars            13  16      98
12    Arizona Coyotes         12  14      98
13    Pittsburgh Penguins     12  15      97
14    Carolina Hurricanes     14  14      96
15    New York Islanders      13  17      94
16    Colorado Avalanche      14  17      93
17    Vancouver Canucks       15  18      92
18    Buffalo Sabres          15  16      88
19    Vegas Golden Knights    14  13      88
20    Washington Capitals     12  13      86
21    Columbus Blue Jackets   14  15      85
22    Chicago Blackhawks      15  15      84
23    New Jersey Devils       11  11      81
24    New York Rangers        14  13      75
25    Philadelphia Flyers     14  13      75
26    St Louis Blues          12  11      73
27    Anaheim Ducks           15  15      72
28    Florida Panthers        11  9       72
29    Ottawa Senators         14  13      66
30    Los Angeles Kings       13  9       61
31    Detroit Red Wings       14  10      59
 

tfong

HFBoards Sponsor
Sponsor
Sep 29, 2008
10,402
972
www.instagram.com
What I'd like to know is what statistical relevance you put on the relationships between these stats and how you weight them relative to each other (1:1 or otherwise).
What I find strange is that you count PDO and SV% separately, because PDO already takes SV% into account. So you're essentially counting it twice, or giving it double the weight, are you not?
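
For reference, PDO is conventionally the sum of a team's shooting percentage and save percentage, so a composite that adds save percentage on top of PDO weights it twice. A minimal sketch of that point, with made-up shot totals (the numbers and the `composite` name are illustrative assumptions, not anything taken from the ranking itself):

```python
def pdo_components(goals_for, shots_for, goals_against, shots_against):
    """Return (shooting %, save %, PDO), each as a fraction of 1."""
    shooting_pct = goals_for / shots_for
    save_pct = 1.0 - goals_against / shots_against
    return shooting_pct, save_pct, shooting_pct + save_pct


# Hypothetical team totals, for illustration only.
sh, sv, pdo = pdo_components(goals_for=30, shots_for=300,
                             goals_against=24, shots_against=300)

# Adding save % again on top of PDO counts it twice:
# the composite equals shooting % + 2 * save %.
composite = pdo + sv
print(f"SH%={sh:.3f} SV%={sv:.3f} PDO={pdo:.3f} composite={composite:.3f}")
```

Whether that double weight is a problem depends on how the save percentage term is actually used, which the numbers alone don't show.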

Furthermore, I think if one wants a purely stats-based ranking, we're headed towards an Elo-type relationship, where goals against weaker teams should serve as lesser multipliers than goals against teams above them.
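
As a rough illustration of the Elo idea, here is a sketch of the standard Elo update applied to game results. It works on wins rather than goals, so a goal-based multiplier would be a further extension; the ratings, the `k` value and the function names are assumptions chosen for the example.

```python
def expected_score(rating_a, rating_b):
    """Standard Elo expectation that A beats B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update(rating_a, rating_b, score_a, k=20.0):
    """Return new (A, B) ratings; score_a is 1 for an A win, 0 for a loss."""
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


# Hypothetical ratings: a strong team (1600) and a weak team (1400).
strong_after_win, _ = update(1600, 1400, score_a=1)  # favourite wins
weak_after_win, _ = update(1400, 1600, score_a=1)    # underdog wins

gain_vs_weaker = strong_after_win - 1600
gain_vs_stronger = weak_after_win - 1400
print(round(gain_vs_weaker, 2), round(gain_vs_stronger, 2))
```

Beating a weaker opponent moves the rating less than beating a stronger one, which is the "lesser multiplier" effect in its simplest form.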
 

SKRusty

tfong said:
> What I'd like to know is what statistical relevance you put on the relationships between these stats and how you weight them relative to each other (1:1 or otherwise).
> What I find strange is that you count PDO and SV% separately, because PDO already takes SV% into account. So you're essentially counting it twice, or giving it double the weight, are you not?

To a point it is, but I factor in the number of games played and the save percentage relative to league norms. Early in the season there are really high and really low save percentages that are not going to be maintained, so to mitigate their influence the save percentage has to be adjusted closer to the expected norms. Save percentage is used to adjust the PDO as accurately as possible.

Unlike team shooting percentage, which is spread across 20 players, save percentage relies in most cases on a primary and a secondary goaltender, and if they are hot or cold early they can give you artificial numbers. I therefore tried to mitigate the effect on the bottom line.
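
One common way to express this kind of games-played weighting is to shrink the observed value toward the league average, trusting the sample more as games accumulate. The sketch below is a generic shrinkage estimate, not the formula used in the ranking; the constant `k`, the league average and the save percentages are all assumptions.

```python
def regress_to_norm(observed, league_avg, games_played, k=25):
    """Blend an observed rate with the league average.

    k is a hypothetical tuning constant: the number of games at which
    the observed value and the league average get equal weight.
    """
    weight = games_played / (games_played + k)
    return weight * observed + (1.0 - weight) * league_avg


# Hypothetical hot goaltending: .940 observed against a .910 league norm.
early = regress_to_norm(0.940, 0.910, games_played=5)
late = regress_to_norm(0.940, 0.910, games_played=60)
print(f"after 5 GP: {early:.4f}, after 60 GP: {late:.4f}")
```

After 5 games the estimate stays close to the league norm; after 60 games it sits much closer to what the team has actually posted.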
 

SKRusty

Quote:
> We can't really give constructive feedback that isn't subjective without being able to see the underlying formula you used for the ranking.

There are about 12 different formulas used; some are basic, others are more involved. I was trying to get a sense from some of the stat junkies of whether it shakes out how they think it should.

I am trying to devise a system that is more accurate and realistic than TSN's or Sportsnet's point-standings rankings, which are in essence nothing more than winning percentage.
 

tfong

SKRusty said:
> There are about 12 different formulas used; some are basic, others are more involved. I was trying to get a sense from some of the stat junkies of whether it shakes out how they think it should.
>
> I am trying to devise a system that is more accurate and realistic than TSN's or Sportsnet's point-standings rankings, which are in essence nothing more than winning percentage.

I guess my next question would be: what would the purpose be?

Without knowing the intention of your point system, it's hard to judge its accuracy or comment on possible problems. Is it a power ranking of some sort? What is it supposed to tell us? Having alternate standings based on what you define as "good traits" is fine and all, but at the end of the day the NHL won't be using that to determine who gets into the playoffs.

Manipulation of statistics always comes with a purpose, otherwise there is no direction in the information flow.
 

Kanye

Life of Pablo
Feb 25, 2012
5,618
1,134
Chicago
I still don't get the purpose of this except having way too much downtime and trying to prove to HFFlames you're smarter than you are.

win = 2 points is good enough lol
 

SKRusty

Here are the formulas compared to actual results from last season.

Team                    GP  Points  Projected Points
Nashville Predators     82  117     118
Boston Bruins           82  112     116
Winnipeg Jets           82  114     116
Tampa Bay Lightning     82  113     115
Vegas Golden Knights    82  109     110
Toronto Maple Leafs     82  105     104
Pittsburgh Penguins     82  100     103
Washington Capitals     82  105     103
San Jose Sharks         82  100     101
Anaheim Ducks           82  101     100
Columbus Blue Jackets   82  97      99
Minnesota Wild          82  101     99
Philadelphia Flyers     82  98      98
Los Angeles Kings       82  98      98
St Louis Blues          82  94      96
New Jersey Devils       82  97      96
Florida Panthers        82  96      96
Dallas Stars            82  92      93
Colorado Avalanche      82  95      92
Calgary Flames          82  84      87
Carolina Hurricanes     82  83      87
Edmonton Oilers         82  78      79
Chicago Blackhawks      82  76      78
New York Islanders      82  80      77
New York Rangers        82  77      73
Detroit Red Wings       82  73      72
Montreal Canadiens      82  71      71
Vancouver Canucks       82  73      71
Arizona Coyotes         82  70      68
Ottawa Senators         82  67      64
Buffalo Sabres          82  62      60

To me it appears as if the formula is pretty accurate for modeling purposes.
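
How close the projections are can be put into numbers. The sketch below takes the points and projected points from the table above and computes the mean and largest absolute error. It only measures fit against a season whose full stats went into the model, so it says nothing yet about predictive accuracy; the variable names are just for illustration.

```python
# (team, actual points, projected points) from last season's table
results = [
    ("Nashville Predators", 117, 118), ("Boston Bruins", 112, 116),
    ("Winnipeg Jets", 114, 116), ("Tampa Bay Lightning", 113, 115),
    ("Vegas Golden Knights", 109, 110), ("Toronto Maple Leafs", 105, 104),
    ("Pittsburgh Penguins", 100, 103), ("Washington Capitals", 105, 103),
    ("San Jose Sharks", 100, 101), ("Anaheim Ducks", 101, 100),
    ("Columbus Blue Jackets", 97, 99), ("Minnesota Wild", 101, 99),
    ("Philadelphia Flyers", 98, 98), ("Los Angeles Kings", 98, 98),
    ("St Louis Blues", 94, 96), ("New Jersey Devils", 97, 96),
    ("Florida Panthers", 96, 96), ("Dallas Stars", 92, 93),
    ("Colorado Avalanche", 95, 92), ("Calgary Flames", 84, 87),
    ("Carolina Hurricanes", 83, 87), ("Edmonton Oilers", 78, 79),
    ("Chicago Blackhawks", 76, 78), ("New York Islanders", 80, 77),
    ("New York Rangers", 77, 73), ("Detroit Red Wings", 73, 72),
    ("Montreal Canadiens", 71, 71), ("Vancouver Canucks", 73, 71),
    ("Arizona Coyotes", 70, 68), ("Ottawa Senators", 67, 64),
    ("Buffalo Sabres", 62, 60),
]

# Absolute gap between actual and projected points for each team.
abs_errors = [abs(actual - projected) for _, actual, projected in results]

mean_abs_error = sum(abs_errors) / len(abs_errors)
max_error = max(abs_errors)
worst = [team for (team, _, _), err in zip(results, abs_errors) if err == max_error]

print(f"mean absolute error: {mean_abs_error:.2f} points")
print(f"largest miss: {max_error} points ({', '.join(worst)})")
```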
 

SKRusty

Kanye said:
> I still don't get the purpose of this except having way too much downtime and trying to prove to HFFlames you're smarter than you are.
>
> win = 2 points is good enough lol

The point is to exercise the brain and do something other than insult other people because I feel inadequate, as others choose to do.
 

Kanye

SKRusty said:
> The point is to exercise the brain and do something other than insult other people because I feel inadequate, as others choose to do.

You wanted criticism; I'm just giving you what you wanted. To be constructive, this thread is 100% a waste of time and provides pointless information.
 

SKRusty

tfong said:
> I guess my next question would be: what would the purpose be?
>
> Without knowing the intention of your point system, it's hard to judge its accuracy or comment on possible problems. Is it a power ranking of some sort? What is it supposed to tell us? Having alternate standings based on what you define as "good traits" is fine and all, but at the end of the day the NHL won't be using that to determine who gets into the playoffs.
>
> Manipulation of statistics always comes with a purpose, otherwise there is no direction in the information flow.

I was looking at the power rankings TSN and others put together, and I found them to be either slanted by personal views or simply done lazily by winning percentage. I decided to do this because I thought there was a better way to assess where the teams will be in 6 months, if there is a way to be somewhat accurate by using advanced stats and other tools. A model will never be perfect, but anything over 80% accuracy might be cool to see... Is your team on the right path?
 

tfong

SKRusty said:
> I was looking at the power rankings TSN and others put together, and I found them to be either slanted by personal views or simply done lazily by winning percentage. I decided to do this because I thought there was a better way to assess where the teams will be in 6 months, if there is a way to be somewhat accurate by using advanced stats and other tools. A model will never be perfect, but anything over 80% accuracy might be cool to see... Is your team on the right path?

OK, so we're looking at power rankings to project the strength of a team outside of the standings.

My first thought (without seeing your formulas) would be to remove the double count of PDO and SV%; in this case probably just remove SV%, since PDO takes it into account. Then I'd add (if you haven't already) a possession-time metric as a multiplier, as well as multipliers for high-risk versus low-risk shots (which would gauge the effectiveness of team defense). That is, if the goal is to gauge team strength outside of the point standings to determine a "true" strength.
 

MNNumbers

HFBoards Sponsor
Sponsor
Nov 17, 2011
7,658
2,536
Obviously, I'm from a different fanbase, but this kind of thing could be interesting in a way. When you post last year's results, the question I have is: Did you use season end stats to calculate your "expected points"?

A couple of other questions:
How does the prediction work? Does it run back through the whole season, comparing each opponent and assigning a certain number of points from each matchup? Or does it do things some other way?

I'm interested because I have my own power ranking system, which uses game results only (W, L, OTL). Because of the statistics involved, strength of schedule comes out inherently in the calculations, and back-running the entire season would give you perfect results.

You are obviously doing something different here....
 

SKRusty

MNNumbers said:
> Obviously, I'm from a different fanbase, but this kind of thing could be interesting in a way. When you post last year's results, the question I have is: Did you use season end stats to calculate your "expected points"?
>
> A couple of other questions:
> How does the prediction work? Does it run back through the whole season, comparing each opponent and assigning a certain number of points from each matchup? Or does it do things some other way?
>
> I'm interested because I have my own power ranking system, which uses game results only (W, L, OTL). Because of the statistics involved, strength of schedule comes out inherently in the calculations, and back-running the entire season would give you perfect results.
>
> You are obviously doing something different here....

I am in the process of building a scraper to test the stats from past seasons week by week and check the accuracy, but that will take me a few weeks because of limited spare time.

For the previous season, the model factors in all the stats and then models what the final result should be. The model I built is very accurate (the largest variance is 4 points in the standings), but it should be, considering that in essence it has all the data.

Cataloging previous seasons and cataloging players actually builds in more possibility for errors. With so many moving variables, once you get about 8 weeks of stats the formula should in theory work well. I've been trying to wrap my head around how to handle injuries, but for now it appears to do well.
 

MNNumbers

What would be interesting is to look at only the stats from the start of last season to Dec 1, and then check the predictions. It's not difficult to tweak the variable weights to get good results for a whole year. If you are building a predictive model, that's different.
 

Flames Fanatic

Mediocre
Aug 14, 2008
13,362
2,904
Cochrane
SKRusty said:
> The point is to exercise the brain and do something other than insult other people because I feel inadequate, as others choose to do.

The problem is you literally aren't giving us anything to work with. You want constructive criticism, but without actually showing your work you just look like you are looking for people to stroke your ego.
 

Volica

Papa Shango
May 15, 2012
21,436
11,110
Flames Fanatic said:
> The problem is you literally aren't giving us anything to work with. You want constructive criticism, but without actually showing your work you just look like you are looking for people to stroke your ego.

This is the major flaw of this thread in general.

You can't get published in a scientific journal by just posting what you've found. You have to, at minimum, outline your work: what you've done to get your result.

By simply posting numbers and saying "wow, look at this", you leave us with essentially no idea what you're working on. Saying "I've included a bunch of things!" doesn't get us anywhere. Don't worry, Rusty; no one on this board is going to steal your life's work.
 

SKRusty

Volica said:
> This is the major flaw of this thread in general.
>
> You can't get published in a scientific journal by just posting what you've found. You have to, at minimum, outline your work: what you've done to get your result.
>
> By simply posting numbers and saying "wow, look at this", you leave us with essentially no idea what you're working on. Saying "I've included a bunch of things!" doesn't get us anywhere. Don't worry, Rusty; no one on this board is going to steal your life's work.

I am hardly worried about that. The formulas are all broken into separate functions. As they stand right now, nobody here would understand what is going on if I were to copy and paste. I was thinking about writing the mathematical function out, but where I started is far from where the product is now, and it is substantially different. I added in a fair amount of bell curve analysis with standard deviations for CORSI, shooting percentage, PDO, save percentage, HDSF, HDSA, SCF and SCA.

[Image: standard deviation diagram]


In essence, the formula takes current points and win percentage. Then it takes the CORSI and adjusts it based on standard deviations: for anything more than 1 standard deviation from the mean, the formula calculates the expected CORSI to end up at 1 standard deviation (this predicts the likely historical bounce back to the normal range). This is then adjusted by the number of games remaining, meaning a club that maintains an extraordinary CORSI for longer and longer will be adjusted less and less. The final number generated is an adjustment to win percentage. In essence, it takes the new adjusted CORSI, subtracts 50, applies a weight of 2, and adds the result in tenths to the win percentage. So if the team's adjusted CORSI is 55.2, the number is turned into 5.2, multiplied by 2 for 10.4, then divided by ten and applied to the win percentage. If the old win percentage was 0.500 (50%), the adjusted one will be 51.04%, which is 0.5104.
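
The CORSI step can be sketched roughly as follows. The exact regression rule isn't spelled out here, so the linear pull toward the 1 standard deviation band, scaled by the share of the season remaining, is an assumption, as are the function names and the example mean and standard deviation. Only the final step (subtract 50, weight of 2, divide by ten) comes straight from the description above.

```python
def regress_beyond_one_sd(value, mean, sd, games_played, season_games=82):
    """Pull a value lying outside mean +/- 1 SD back toward the band edge.

    Assumed rule: the pull is proportional to the share of the season
    still to play, so a long-sustained outlier is adjusted less.
    """
    lower, upper = mean - sd, mean + sd
    if lower <= value <= upper:
        return value  # inside the normal range: leave it alone
    target = upper if value > upper else lower
    remaining_share = (season_games - games_played) / season_games
    return value + (target - value) * remaining_share


def corsi_bump(adjusted_corsi, weight=2.0):
    """Percentage points added to win %: (CORSI - 50) * weight / 10."""
    return (adjusted_corsi - 50.0) * weight / 10.0


# Hypothetical league spread: mean 50, SD 3. A 58% CORSI after 41 games
# gets pulled halfway back toward 53.
print(regress_beyond_one_sd(58.0, mean=50.0, sd=3.0, games_played=41))

# The worked example from the post: adjusted CORSI of 55.2.
print(round(50.0 + corsi_bump(55.2), 2))
```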

SCF and SCA, and the ratios for HDSF and HDSA, are then run through the same bell curve analysis to adjust any data outside one standard deviation. Where CORSI has failed over the years is in the analysis of scoring chances, so I had to work that in. When the final number is generated, it takes the average of the 2 ratios and turns it into a scoring-chances-for percentage. Then it subtracts 50 from the percentage, applies a weight of 0.5, and applies the result to the winning percentage. If you have a scoring-chance ratio of 55% for, it gets turned into 5; with the 0.5 weight it ends up being 2.5, which is then divided by 10 and added to the win percentage: 51.04% plus 0.25 gives 51.29%, or 0.5129.

PDO is then treated with the same bell curve analysis, adjusted separately for shooting percentage and save percentage. Historically, when PDO is more than 1 standard deviation over or under, save percentage moves back to normal less aggressively than shooting percentage, which is why there have to be 2 adjusting formulas. After all is said and done, the final adjusted PDO is multiplied by the win percentage. Say the adjusted PDO is 1.01: 0.5129 * 1.01 = 0.518.

If 10 games had been played, with 10 points earned, the projection would be 72 * 2 * 0.518 + 10 pts for a total of 84.592, which of course rounds to 85 points. Games played is also weighted in, so the formula has less of an impact as more games are played.
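
Putting the worked example together end to end, a minimal sketch might look like this. The weights (2 for CORSI, 0.5 for scoring chances), the PDO multiplier and the 82-game season follow the description above; the function and parameter names are made up for illustration, and the inputs are assumed to be already adjusted by the bell curve step. Note that 51.04 plus 0.25 comes to 51.29, so without intermediate rounding the sketch lands on about 84.6 points, which still rounds to 85.

```python
def project_points(win_pct, adj_corsi, adj_chance_pct, adj_pdo,
                   games_played, points, season_games=82):
    """Project season points from already-adjusted inputs.

    win_pct is a fraction (0.500); adj_corsi and adj_chance_pct are
    percentages (55.2, 55.0); adj_pdo is centred on 1.00.
    """
    pct = win_pct * 100.0
    pct += (adj_corsi - 50.0) * 2.0 / 10.0       # CORSI: weight of 2, in tenths
    pct += (adj_chance_pct - 50.0) * 0.5 / 10.0  # chances: weight of 0.5, in tenths
    adjusted_win_pct = pct / 100.0 * adj_pdo     # PDO applied as a multiplier
    games_remaining = season_games - games_played
    # Two points per win over the remaining games, plus points already banked.
    return points + games_remaining * 2 * adjusted_win_pct


projected = project_points(win_pct=0.500, adj_corsi=55.2, adj_chance_pct=55.0,
                           adj_pdo=1.01, games_played=10, points=10)
print(f"{projected:.3f} -> {round(projected)} points")
```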

The bell curve analysis I use is from another project designed for economics so in essence I took my work there and tried to apply it to hockey.
 

Janks

Pope Janks
Jan 7, 2010
7,731
1,702
Calgary
SKRusty said:
> The formulas are all broken into separate functions. As they stand right now, nobody here would understand what is going on if I were to copy and paste.

All hail the mighty leader, the only one of us that can understand the ancient language of mathematics. :laugh:

I get you’re looking for feedback, but coming here and saying no one will understand your work is a bit of a stretch. I’d say the board is fairly educated, so I’d wager someone understands something.
 

SKRusty

Janks said:
> All hail the mighty leader, the only one of us that can understand the ancient language of mathematics. :laugh:
>
> I get you’re looking for feedback, but coming here and saying no one will understand your work is a bit of a stretch. I’d say the board is fairly educated, so I’d wager someone understands something.

It is not about understanding mathematics, which, judging from your condescending manner, you did not pick up from my initial statement. It is that it is all written in Python (a scripting language). It is not that the math is too difficult; it is that 99.99% of people on this planet don't know how to script in Python.

The bell curve analysis portion that I have written is a package with more than 200,000 lines of code, with a university grad student guiding and testing. It isn't that I am "GREAT" because I am a great mathematician; I am great at what I do because there are few people that can do what I do with code.

Despite the rude manner in which I have been treated here, I had been willing to let that go until now.

This was supposed to be a fun project to play with in my spare time. I was taking my passion for hockey and coding and bringing it here for you guys to use in fantasy leagues: a tool that could make your fantasy leagues more fun, and maybe allow you to make modeled decisions about who to drop, pick up, or trade for.

I didn't take it to the main board because this wasn't about flexing my ego about my knowledge level. It was to get a couple of people to root around in the advanced stats and go "yeah, I do think that is close to how it would shake out", and then to give you guys access to the league manager I was in the process of building, but screw that.

Complete waste of my time thinking that this board would be at all respectful of what I was trying to do.
 

super6646

Registered User
Apr 16, 2018
17,882
15,730
Calgary
SKRusty said:
> It is not about understanding mathematics, which, judging from your condescending manner, you did not pick up from my initial statement. It is that it is all written in Python (a scripting language). It is not that the math is too difficult; it is that 99.99% of people on this planet don't know how to script in Python.
>
> The bell curve analysis portion that I have written is a package with more than 200,000 lines of code, with a university grad student guiding and testing. It isn't that I am "GREAT" because I am a great mathematician; I am great at what I do because there are few people that can do what I do with code.
>
> Despite the rude manner in which I have been treated here, I had been willing to let that go until now.
>
> This was supposed to be a fun project to play with in my spare time. I was taking my passion for hockey and coding and bringing it here for you guys to use in fantasy leagues: a tool that could make your fantasy leagues more fun, and maybe allow you to make modeled decisions about who to drop, pick up, or trade for.
>
> I didn't take it to the main board because this wasn't about flexing my ego about my knowledge level. It was to get a couple of people to root around in the advanced stats and go "yeah, I do think that is close to how it would shake out", and then to give you guys access to the league manager I was in the process of building, but screw that.
>
> Complete waste of my time thinking that this board would be at all respectful of what I was trying to do.

Just ignore them. I actually like what you are doing. Yes, it's a little convoluted and hard to understand at face value, but I personally appreciate that you put your time into something like this, and I'm sure much of the board thinks the same way.
 
