91Kadri91*
Guest
The first, and most pressing, issue is the lack of a 'Discussion' prefix; why isn't there a 'Discussion' tag? I was going to use the 'GDT' prefix, but I elected to use the 'Proposal' prefix, because that's (sort of) what this is. I'm proposing that this thread become the all-purpose statistics thread, where analysis of players across the league, new models, and questions regarding statistics can all 'take place'. I recognize (and respect) that not everyone wants to discuss statistics, so I figured this would be a good place to have arguments/discussion involving new (or old) models.
So what spurred this desire? Well, The Passing Project is releasing copious amounts of data tomorrow. TPP was started by Ryan Stimson as a way to measure a volitional, calculated skill:
That’s the first question I had before I even started tracking anything way back in October of 2013. What could I hope to learn from tracking passing plays? It started around the idea that as so much of hockey is random, passing represents a skillful play to link between players and generate shot attempts. By isolating that aspect of the game, you’re essentially isolating skilled, intentional plays made by the players. The other thing was a quote from Gavin Fleig in Soccernomics: "Teams that complete a higher number of passes in the final third consistently finish in the top four of the league."
So, that’s where it started. Could I find a passing metric that correlated well with teams winning games? Could isolating skilled plays teach us anything new about the game? The answer to both questions was a resounding "Yes."
http://www.allaboutthejersey.com/2015/5/20/8627957/2015-passing-project-data-release-volume-i
So which passes are tracked, and how?
The issue with goals is that, while they're valid (goal differential is, at least), there's a ton of variance (unexplained variance that is often attributed to luck). Burtch did a study of how reliable a variety of statistics are and how well each correlates with Pt% and Win%, and found that GF% had a YoY (year-over-year) correlation (R^2 based on 6 years of data) of 0.175:
Now look carefully at some of the other "important" yearly statistics. 5v5 PDO and the GF60 and GA60 stats. Notice how low their reliability scores are? That's because there's a large amount of variation in how much teams score or how many goals they allow year over year. This is because SH% and SV% are NOT repeatable, reliable statistics at the team level. Yes one player might be consistently good or consistently bad, but the randomness of all of his team-mates (the guy on the hot streak - the guy in a funk) has a way of balancing all of this out over the course of a season.
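For anyone who wants to see what a year-over-year R^2 actually measures, here is a minimal sketch. It is not Burtch's code or data: the function names and the GF% pairs are made up purely for illustration, and it assumes reliability is taken as the squared Pearson correlation between a team's value in one season and its value in the next.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / sqrt(var_x * var_y)


def yoy_r_squared(season_pairs):
    """R^2 between a stat in year N and the same stat in year N+1.

    season_pairs holds one (year_n_value, year_n_plus_1_value) tuple
    per team-season.
    """
    this_year = [a for a, _ in season_pairs]
    next_year = [b for _, b in season_pairs]
    return pearson_r(this_year, next_year) ** 2


# Hypothetical GF% pairs for five teams (NOT real data).
toy_gf_pct = [(52.1, 49.0), (47.5, 50.2), (50.3, 51.1), (55.0, 50.5), (45.2, 48.7)]
print(round(yoy_r_squared(toy_gf_pct), 3))
```

A value near 1 means last season's number tells you a lot about next season's; a value like the 0.175 quoted above means most of the season-to-season movement is left unexplained.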
Corsi, meanwhile, is far more reliable, while offering some validity. The issue some have with Corsi is that it's not completely valid, nor is it entirely reliable. Consider this (very basic) explanation of what a statistic (or research) should 'have' to be considered sufficient:
While reliability is necessary, it alone is not sufficient. For a test to be reliable, it also needs to be valid. For example, if your scale is off by 5 lbs, it reads your weight every day with an excess of 5lbs. The scale is reliable because it consistently reports the same weight every day, but it is not valid because it adds 5lbs to your true weight. It is not a valid measure of your weight.
https://www.uni.edu/chfasoa/reliabilityandvalidity.htm
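The scale example from the quote can be put into numbers with a quick simulation. This is only a sketch under my own assumptions: the function names, the 150 lb true weight and the noise levels are all hypothetical, chosen to contrast a consistent-but-biased scale with an unbiased-but-noisy one.

```python
import random
import statistics


def simulate_scale(true_weight, bias, noise_sd, days, seed=0):
    """Daily readings from a scale with a fixed bias and random noise."""
    rng = random.Random(seed)
    return [true_weight + bias + rng.gauss(0, noise_sd) for _ in range(days)]


def describe(readings, true_weight):
    """Return (spread, mean_error) for a list of readings.

    spread     : standard deviation of the readings; small means reliable.
    mean_error : average reading minus the true weight; near zero means valid.
    """
    spread = statistics.pstdev(readings)
    mean_error = statistics.fmean(readings) - true_weight
    return spread, mean_error


TRUE_WEIGHT = 150.0  # hypothetical

# Consistent but 5 lbs off: reliable, not valid (the scale in the quote).
biased = simulate_scale(TRUE_WEIGHT, bias=5.0, noise_sd=0.1, days=30)
# Right on average but jumping around: valid on average, not reliable.
noisy = simulate_scale(TRUE_WEIGHT, bias=0.0, noise_sd=4.0, days=30)

for label, readings in (("biased", biased), ("noisy", noisy)):
    spread, mean_error = describe(readings, TRUE_WEIGHT)
    print(f"{label}: spread={spread:.2f} lbs, mean error={mean_error:+.2f} lbs")
```

In hockey terms, the argument in this thread is that goal metrics behave more like the noisy scale and Corsi more like the consistent one, with the open question being how far off Corsi's readings are.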
Another issue with goals is that they don't predict future goals well because of the limited sample (the scarcity of events is one of the reasons goal metrics are so unreliable, and why they're not a reasonable indication of player or team ability). Corsi is a vast improvement over goal metrics in this regard, but it's still not great. One solution (a way to increase the validity of Corsi and the reliability of goal metrics) is a Poisson model (expected GF-GF=xGF/EGF is the basic idea), where expected goals are calculated using a weighted goals-and-shots approach (shots from rebounds, shot angle, rush shots, volume, etc.). This model shows a significant improvement over both Corsi and GF%:
That's pretty impressive, but it can still be better. How do passing metrics compare? It's hard to say right now, since the data set isn't particularly large for specific passing events, but the aggregate result (total passes) is predictive:
Here are some graphs showing the reliability of specific passing metrics (again, small sample):
Okay, so at the very least we can say that passing in aggregate is reliable (and therefore indicative of a player's skill), even if the samples are still too small to say definitively how reliable specific types of passes are (so far they appear to be reliable). How valid are they? Well... it appears that passing efficiency could be very valid (the sample is a little too small to state with absolute certainty):
Last season, I began tracking passes that generate shot attempts in an undertaking to explain how offense was being created. This led to various metrics to measure the efficiency of an individual player as well as a team when attempting passes. The name I ascribed to this metric was SAGE, Shot Attempt Generation Efficiency, and it was quite simple.
If a player completes two passes and the recipient attempts two shots on net, forcing a save and missing the net, the player making the pass had generated one shot on two attempts for a SAGE of 50%. Simple, yes?
I began keeping track of how often a team won various categories of efficiency and how often they also won games decided in regulation. The results through 82 games last season and 134 games this season are that efficiency matters a great deal. Today, I wanted to offer an early answer on the relationship, if any, between passing (shot generation) efficiency and how many goals a team scores.
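The SAGE arithmetic in the quote is easy to sketch. This is my reading of the quoted example only (shots on net divided by shot attempts generated by a player's passes), not the project's official definition; the function names are hypothetical.

```python
def sage(shots_on_net, shot_attempts):
    """Shot Attempt Generation Efficiency as read from the quoted example:
    the share of pass-generated shot attempts that end up on net."""
    if shot_attempts <= 0:
        raise ValueError("need at least one shot attempt")
    if not 0 <= shots_on_net <= shot_attempts:
        raise ValueError("shots on net must be between 0 and shot attempts")
    return shots_on_net / shot_attempts


def sage_from_outcomes(outcomes):
    """outcomes: one bool per pass that led to a shot attempt,
    True if that attempt was on net, False if it was not."""
    return sage(sum(outcomes), len(outcomes))


# The quoted example: two completed passes lead to two attempts,
# one forcing a save and one missing the net.
print(f"{sage_from_outcomes([True, False]):.0%}")  # prints 50%
```

None of these numbers come from the tracked games; the article's actual findings are the ones discussed below.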
This is based on the tracking of five teams (Panthers, Devils, Rangers, Islanders and Blackhawks). Two of the teams didn't have all of their games tracked. Here are the results using only the teams with all of their games tracked:
That's... almost perfect correlation. Again, the sample is small, but here's what the author has to say:
Well, isn’t that something? Through the first quarter of the season for the Devils, Panthers, and Blackhawks, there is nearly a perfect correlation between how efficient they are passing the puck and how many goals they score during 5v5 situations. Why could this be?
It’s my logical belief that a goalie has less time to get set and diagnose a situation when players make crisp, effective passes. We see this on a nightly basis when watching games. I don’t expect this correlation to remain at this level, but I also don’t expect to drop off significantly.
http://www.hockeyprospectus.com/the-pass-tracking-project-passing-and-goals/
So passing metrics may be more reliable than Corsi statistics, and just as valid as goal metrics. That's incredibly exciting.
How, exactly, does passing data correlate to Corsi?
There's some relation, but it's far from perfect. Is it possible that amalgamating passing metrics and shot metrics (Corsi, Fenwick, shot attempts) could result in a complete picture (with both better reliability and validity)? Yes. Many of the specific pass metrics are counted as 'successful' when they result in a specific type of shot, but the shot itself is not accounted for. Both are a 'skill' (with significantly less unexplained variance than goal metrics have), and accounting for both should improve the overall validity and reliability. As a result, there's a 'Passing Corsi' metric. Of course, these metrics can also show us who plays well together, and why:
It’s one thing for a player to shoot the puck. It’s another thing for them to generate offense from a pass. These are both valuable and, should be, essential means of player evaluation. If you’re only looking at a player’s individual shot attempts, you’re only getting half the picture. If you want a total picture, you need to look at both means of production.
Along that same line, certain players may combine with certain line mates better than others. So, if we can quantify how often Player A passes to Player B and vice versa, it allows us to move closer to answering questions like, "Does Player A do something to improve Player B’s production? How does he do that? Is there a ‘chemistry’ between the two players? Is one simply easier or harder to play with?" Lots of questions like this come to mind when we look at players together and apart. That’s why I think tools like David Johnson’s WOWY and Super WOWY are of the utmost importance. How players play together and apart is incredible information. The passing linkups below simply take it a step closer to the ice level of what is actually occurring.
http://www.allaboutthejersey.com/2015/6/3/8694363/2015-passing-project-data-release-volume-ii
Okay, so that all sounds great, but where can you actually view these results? Well, you can view forward and defensemen data from last season. "But what about this season?" Well, you may recall that I mentioned passing data being released tomorrow. While you're waiting, here are a few charts they've released that elucidate the capabilities of Leafs players (and the team) based on a 13-game sample:
View Forwards at the link: http://hockey-graphs.com/2015/12/16/toronto-maple-leafs-passing-lane-corsi/
View more at the link: http://hockey-graphs.com/2015/12/15/toronto-maple-leafs-passing-and-linkup-network/
Much more at the link: http://hockey-graphs.com/2015/12/14/toronto-maple-leafs-passing-metrics-101/
So where does statistical analysis go from here? Well, more microanalysis will help to further the validity and reliability of metrics, and help determine what, exactly, causes (future) goals (zone-exits/entries, loose-puck recoveries, deke success rates, etc). There are many events being tracked (by companies such as SportLogIQ), but not a lot are publicly available.
----------------------------------------------------------------------------------------------------------------------
Alright, so since this is supposed to be an 'all-encompassing' statistical thread, I thought I'd provide some Links of Interest (I'll update this section with links provided by posters):
Advanced Stats (full season or single game; use player and lab tabs to find dCorsi, WAR and Comparison Scores): http://war-on-ice.com/
Single Game Tracking: http://www.hockeystats.ca/
Single Game Tracking: http://www.naturalstattrick.com/
Advanced Stats (full season): http://xtrahockeystats.com/players.php
Advanced Stats (full season): http://stats.hockeyanalysis.com/
Passing Viz: http://hockeyviz.com/passing.html
Play Tracking: http://sportlogiq.com/
Advanced Stats: http://hockeysimplified.blogspot.ca/
OHL Game Notes: http://buckeyestatehockey.com/2015/ohl-game-notes-13-16-mis-ldn-sag-nia-osh-bar.html
Corsi Plus-Minus: http://donttellmeaboutheart.blogspot.ca/p/corsi-plus-minus.html
Passing Data: https://public.tableau.com/profile/spencer.mann#!/vizhome/PassingDataDefense/Compare
https://public.tableau.com/profile/spencer.mann#!/vizhome/PassingDataForwards/Compare
Advanced Stats (articles): http://hockey-graphs.com/about/
Zone Entries/Scoring Chance Data (suggested by Menzinger):
https://mapleleafshotstove.com/leafsnews/post-game/
Articles of Interest (will be updated):
Does Scoring the First Goal Matter? (simplified version of an interesting finding; I once saw a much more analytical presentation of the data, but I can't find it now): http://news.psu.edu/story/313928/2014/04/29/athletics/want-win-nhl-score-first-not-until-third
Using Corsi to Analyze 'Luck': http://www.pensinitiative.com/2014/10/utilizing-corsi-to-better-analyze-pdo.html
Do Zone-Starts Matter?: http://puckplusplus.com/2015/01/15/how-much-do-zone-starts-matter-i-maybe-not-as-much-as-we-thought/
http://puckplusplus.com/2015/01/20/how-much-do-zone-starts-matter-part-ii-a-lot-on-their-own-not-that-much-in-aggregate/
http://www.hockeybuzz.com/blog/James-Tanner/How-Much-Do-Zone-Starts-Matter/200/70744
http://hockeyanalysis.com/2014/12/13/zone-starts-dont-matter-much/
http://nhlnumbers.com/2012/5/22/the-effect-of-zone-starts-on-offensive-production
Does Competition Matter?: http://nhlnumbers.com/2012/7/23/the-importance-of-quality-of-competition
http://www.arcticicehockey.com/2011/7/7/2264529/does-qualcomp-matter
http://www.arcticicehockey.com/2011/7/27/2294013/further-to-does-qualcomp-matter
http://hockey-graphs.com/2014/10/14/how-much-does-matching-competition-matter-on-a-team-level/
Older (somewhat simpler) Expected Goals Model: http://www.sloansportsconference.com/wp-content/uploads/2012/02/NHL-Expected-Goals-Brian-Macdonald.pdf
Faceoffs Analysis: http://statsportsconsulting.com/main/wp-content/uploads/FaceoffAnalysis12-12.pdf
Shot Types and Goal Probability: http://hockeyanalytics.com/Research_files/Shot_Quality.pdf
The Importance of Zone Entries/Exits (read these articles): https://jenlc13.wordpress.com/2015/09/30/back-to-basics-offensive-zone-entries/
http://hockeyanalysis.com/2014/08/26/team-zone-entry-data-predicting-standings/
http://www.sbnation.com/nhl/2014/4/9/5592622/nhl-stats-zone-entries-defense
http://www.sportsnet.ca/hockey/nhl/a-whole-new-way-to-look-at-nhl-defencemen/
Who To Follow (will be updated):
https://twitter.com/IneffectiveMath
https://twitter.com/SteveBurtch
https://twitter.com/RK_Stimp
https://twitter.com/MannyElk
https://twitter.com/RegressedPDO
https://twitter.com/307x
https://twitter.com/ChrisBoyle33
https://twitter.com/Chris_LogiQ
https://twitter.com/AndrewBerkshire
https://twitter.com/NMercad
https://twitter.com/JDylanBurke
https://twitter.com/joshweissbock
https://twitter.com/MoneyPuck_
https://twitter.com/robvollmanNHL
https://twitter.com/NickAbe
https://twitter.com/DTMAboutHeart
https://twitter.com/web_sant
https://twitter.com/SpenceIce
https://twitter.com/Classlicity
https://twitter.com/ShutdownLine
https://twitter.com/GarretHohl
https://twitter.com/ToddCordell
https://twitter.com/Null_HHockey
https://twitter.com/hockeyanalysis
https://twitter.com/garik16
https://twitter.com/puckintel
----------------------------------------------------------------------------------------------------------------------
Well, that's a start. Let the discussion begin!