All Purpose Analytics and Extended Stats Discussion

SimplySensational

Heard of Hough
Mar 27, 2011
18,839
6
VA
From above:



IMO when you include the bold part with the first part it all becomes a whole lot more valuable. It gives it actual context.

But as far as I can tell the fancystats crowd disagrees with this and says a shot is a shot is a shot and I have serious issues with that.

This isn't unique. The websites track where shots come from; they average out shot distance only because the bandwidth required to show heatmaps of players' shots would be prohibitive.
 

Ajax1995

Registered User
Dec 9, 2002
8,809
867
This isn't unique. The websites track where shots come from; they average out shot distance only because the bandwidth required to show heatmaps of players' shots would be prohibitive.

I didn't say it was unique. And IMO average shot distance alone is not nearly enough context to then say a shot is a shot is a shot.
 

Hivemind

We're Touched
Oct 8, 2010
37,073
13,535
Philadelphia
From above:



IMO when you include the bold part with the first part it all becomes a whole lot more valuable. It gives it actual context.

But as far as I can tell the fancystats crowd disagrees with this and says a shot is a shot is a shot and I have serious issues with that.

It depends on what type of analysis you're trying to conduct. Nobody is going to complain about having more data to work with. But that doesn't mean that the existing data isn't valuable and useful.
 

Ajax1995

Registered User
Dec 9, 2002
8,809
867
It depends on what type of analysis you're trying to conduct. Nobody is going to complain about having more data to work with. But that doesn't mean that the existing data isn't valuable and useful.

Well, answer me this: is Bowman's crew wrong? Is "a shot is a shot is a shot" true or not? Why would they be trying to give context to something that doesn't matter?

Again, my chief gripe with fancystats is the principle that shot quality doesn't matter. It seems to me that you have to accept that for the rest of it to mean anything and I have major issues accepting that.

I get that making a determination on good shot / bad shot is subjective and thus difficult to do in a way most agree with, just like good play / bad play. But just because something is difficult to quantify doesn't mean it doesn't matter, or that it can be ignored without losing context, and IMO a significant amount of it.

Let's say Player X has a very nice CORSI close of 55%. Not taking into account usage and quality of competition I would expect the fancystats crowd to trumpet this as a player who helps drive possession and thus a 'good' player. But what if Bowman's deeper analysis revealed that a significantly higher number of this player's negative CORSI events were high quality opportunities compared to Player Y, whose CORSI close was only 49%, and thus a 'bad' fancystat player? What if this same deeper analysis revealed a significant amount of these same scenarios all over the league?

Plus/minus still has some value IMO but again in so many respects it lacks context when trying to use it to evaluate players because it doesn't take into account in any way how the goals were scored. Is that really so different than not taking into account anything about the shot attempts?

I don't know...
 

ChibiPooky

Yay hockey!
May 25, 2011
11,486
2
Fairfax, VA
Shot quality against is shown not to be a repeatable skill - in other words, over a large number of shot attempts against, the defender is not the primary driver of shot quality against. It makes sense too - as an offensive player, if the D drives you to a bad shooting area, you're probably going to look for a better shot rather than taking the crappy shot you're given. In the end it's more about limiting quantity (by taking the puck away) than quality (which is usually achieved by possessing the puck long enough to break down the defense).
 

Hivemind

We're Touched
Oct 8, 2010
37,073
13,535
Philadelphia
Well, answer me this: is Bowman's crew wrong? Is "a shot is a shot is a shot" true or not? Why would they be trying to give context to something that doesn't matter?
I haven't seen their data, but generally speaking, shot quality has not been demonstrated to be a sustainable skill or have predictive value. People have looked at scoring chances, shot distance, and shot location and none have found any measure of shot quality to be highly sustainable. Individual players' On-Ice Shooting % has some sustainability, but only around half of what Fenwick is and tends to be incredibly expensive.

Again, my chief gripe with fancystats is the principle that shot quality doesn't matter. It seems to me that you have to accept that for the rest of it to mean anything and I have major issues accepting that.
I don't think that's true at all. Most simply put, if you have a larger volume of shots, you will almost always have a larger volume of high quality shots.

Overall, I think you're reading a lot into that line in the article. Chicago is secretive about what they do. We don't know if shot information is a central part of it or not. They could have collected it and determined it's not of value. They could have only collected it in the past, which would be why they divulged it to the reporter. Or the reporter could be blindly speculating on that one line. We don't really know, and unless the Blackhawks spill the beans on what they do (unlikely), we won't anytime soon.
 

g00n

Retired Global Mod
Nov 22, 2007
30,625
14,712
It's not whether shot quality is sustainable or predictable. It's whether or not a shot taken (or NOT taken) is being affected by factors that don't show up in the stat analysis but are still important. If there's no difference from shot to shot then simply taking more shots would win hockey games and every team would just be flinging the puck at the net as soon as it's in range. And any player would be as good as the next guy.

I think it's a tempting compromise to assume shot quality balances out. I've wondered about it before. But I don't think it's reflective of reality.
 

Hivemind

We're Touched
Oct 8, 2010
37,073
13,535
Philadelphia
It's not whether shot quality is sustainable or predictable. It's whether or not a shot taken (or NOT taken) is being affected by factors that don't show up in the stat analysis but are still important. If there's no difference from shot to shot then simply taking more shots would win hockey games and every team would just be flinging the puck at the net as soon as it's in range. And any player would be as good as the next guy.

I think it's a tempting compromise to assume shot quality balances out. I've wondered about it before. But I don't think it's reflective of reality.

It is about whether or not shot quality metrics have been shown to be sustainable or predictable. With the current meta-game and existing public metrics, the metrics suggest that shot quality is something that largely cannot be controlled (or minimally, is significantly less controllable than shot attempts). If players started dumping the puck at the net all the time, it would constitute a change to the meta-game where shot attempts (presumably) no longer correlated well with scoring goals or winning. Simply put, in the game as it currently stands with the publicly available data, if you're limiting shots in general, the data suggests that you're also limiting quality shots.

This is an area on which SportVU will shed a lot of light. The stats discussion is going to look significantly different in a few years, assuming the data is publicly available.
 

Ajax1995

Registered User
Dec 9, 2002
8,809
867
...but generally speaking, shot quality has not been demonstrated to be a sustainable skill or have predictive value.

Based on what? Who is determining the quality? I don't believe anyone is going to claim an unscreened wrister from the blueline is the same quality as a breakaway or predict they will go into the net at the same rate.

Quality is subjective, and thus I would assume it is not actually taken into account at all. What the fancystats crowd calls quality is just something masquerading as quality, simply because there is an objective way to quantify it.
 

g00n

Retired Global Mod
Nov 22, 2007
30,625
14,712
It is about whether or not shot quality metrics have been shown to be sustainable or predictable. With the current meta-game and existing public metrics, the metrics suggest that shot quality is something that largely cannot be controlled (or minimally, is significantly less controllable than shot attempts). If players started dumping the puck at the net all the time, it would constitute a change to the meta-game where shot attempts (presumably) no longer correlated well with scoring goals or winning. Simply put, in the game as it currently stands with the publicly available data, if you're limiting shots in general, the data suggests that you're also limiting quality shots.

This is an area on which SportVU will shed a lot of light. The stats discussion is going to look significantly different in a few years, assuming the data is publicly available.


This seems like a circular argument to me. More shots equals better, but too much blows the curve. That means, to me, quality matters.
 

txpd

Registered User
Jan 25, 2003
69,649
14,131
New Bern, NC
If there's no difference from shot to shot then simply taking more shots would win hockey games and every team would just be flinging the puck at the net as soon as it's in range.

Quality of goaltending can trump quality of shot. So more shots doesn't necessarily translate to more wins.
 

g00n

Retired Global Mod
Nov 22, 2007
30,625
14,712
Quality of goaltending can trump quality of shot. So more shots doesn't necessarily translate to more wins.

Just one of the factors I'm talking about when saying a shot is not just a shot. Thank you.

The same shot may have a chance of going in or creating a rebound vs one goalie, but against another it may be worthless and amount to little more than a faceoff or change of possession. Shots of a certain type may be more likely to beat a goalie on a given night, and may fit a player's strengths more. And a lot of it may even depend on how the goalie is playing that day, or how the defenders around him are playing, the ice quality, and on and on. So "more shots = possession = winning" is only useful in this context, otherwise every team would just shoot shoot shoot no matter what. Therefore the better teams are the ones that are better at BOTH getting shots off and doing so within the "meta" framework of the variables involved. Which means a shot is not always just a shot. I think this is obvious, yes?

Point being... BECAUSE we can't predict these things, they are important. Some stat people want to find ways to stretch the numbers to eliminate these variables from the metrics they prefer, and cite inability to control for them (or quantify them) as reasons for essentially ignoring them. Well, that's kind of the problem with trying to empiricize complex systems into objective data. You can't just throw away these influences and considerations while citing the very reasons they're relevant as your justification for removing them. It is an assumption among many key assumptions.
 

Hivemind

We're Touched
Oct 8, 2010
37,073
13,535
Philadelphia
The methodology behind roster building doesn't count as roster building? These discussions usually stem from player-related discussions. It's August, there isn't much concrete roster composition discussion to be had. Get over it.

Based on what? Who is determining the quality? I don't believe anyone is going to claim an unscreened wrister from the blueline is the same quality as a breakaway or predict they will go into the net at the same rate.

Quality is subjective, and thus I would assume it is not actually taken into account at all. What the fancystats crowd calls quality is just something masquerading as quality, simply because there is an objective way to quantify it.
Based on shot distance, scoring chances, and on-ice shooting percentage.

This seems like a circular argument to me. More shots equals better, but too much blows the curve. That means, to me, quality matters.
The metrics that are meaningful in the way the game is currently played are valued. If you significantly change the way the game is played, you change what stats would be valued. To put it another way, the stats say that better teams tend to outshoot their opponent. They don't say that shooting more makes you a better team.
These premises hold true in any sport. Baseball statheads value on-base percentage more than batting average. If pitchers started walking players far more liberally, the value of on-base percentage would wane. There are very real discussions of issues like this when you talk about pitch framing and catchers, in particular.
 

ChibiPooky

Yay hockey!
May 25, 2011
11,486
2
Fairfax, VA
The methodology behind roster building doesn't count as roster building? These discussions usually stem from player-related discussions. It's August, there isn't much concrete roster composition discussion to be had. Get over it.

With the amount of discussion analytics generate on this board, I'm fine with giving the topic its own thread.
 

Ajax1995

Registered User
Dec 9, 2002
8,809
867
Based on shot distance, scoring chances, and on-ice shooting percentage.

I want to make sure I understand what makes up 'shot quality.'

Shot distance is obvious. Is on-ice shooting percentage the current shooting percentage of the person taking the shot? What are scoring chances?

Thanks
 

Hivemind

We're Touched
Oct 8, 2010
37,073
13,535
Philadelphia
I want to make sure I understand what makes up 'shot quality.'

Shot distance is obvious. Is on-ice shooting percentage the current shooting percentage of the person taking the shot? What are scoring chances?

Thanks

On-Ice Shooting Percentage is the shooting percentage of a player's team while he is on the ice (both him and his teammates). It does have some individual sustainability year-to-year, but it's lower than Fenwick-for. It's also very difficult to assess an individual's impact with it (players like Alex Burrows and Brooks Orpik have very favorable on-ice sh%, presumably because they get to play with the Sedins and Malkin/Crosby). There is some potential in these stats, but nobody has really come up with a good way to harness them. Additionally, it's been shown that defensemen don't really impact the on-ice sh% against them.

Scoring chances are typically defined as shots-on-goal taken from within the "homeplate area" extending from the goal line to the faceoff dots to the top of the circles. Scoring chances correlate heavily with shot-attempt measures, and creating more scoring chances per shot doesn't seem to be a repeatable skill.
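
For anyone who wants to see the on-ice sh% definition above spelled out, here is a rough sketch in Python. The event layout (shooter, set of teammates on the ice, goal or not) and the player labels are made up for illustration; this is not how any stats site actually stores its data.

```python
def on_ice_shooting_pct(events, player):
    """Team shooting % over shots on goal taken while `player` was on the ice.

    Each event is (shooter, players_on_ice, is_goal). This layout is
    hypothetical and only meant to illustrate the definition.
    """
    shots = 0
    goals = 0
    for shooter, on_ice, is_goal in events:
        # Count every shot by the player's team while he is on the ice,
        # whether he or a teammate took it.
        if player in on_ice:
            shots += 1
            goals += int(is_goal)
    return 100 * goals / shots if shots else None


# Made-up shots for one team; "A", "B", "C" are placeholder players.
events = [
    ("A", {"A", "B"}, True),
    ("B", {"A", "B"}, False),
    ("B", {"B", "C"}, False),
    ("C", {"A", "C"}, False),
    ("A", {"A", "B"}, False),
]

print(on_ice_shooting_pct(events, "A"))
print(on_ice_shooting_pct(events, "C"))
```

Note that the shooter's identity never enters the calculation, which is part of why it is hard to separate an individual's impact from his linemates'.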
 

NobodyBeatsTheWiz

Happy now?
Jun 26, 2004
23,422
1,973
The Burbs
It amazes me that with all the focus on 'puck possession', there's no widely-used tracking of it, AFAIK.

Corsi and Fenwick don't measure puck possession, so we need to stop saying they do.

I'd think zone time and actual puck possession would be more useful measures.
 

Hivemind

We're Touched
Oct 8, 2010
37,073
13,535
Philadelphia
Zone Time and Corsi/Fenwick correlate very highly.

And to try and pin values on it, the Pearson correlation coefficient between zone time and the three different shots metrics:

Shots +/-: .86
Fenwick +/-: .87
Corsi +/-: .90
http://vhockey.blogspot.com/2008/08/zone-time.html

Extremely well. After 24 games, the Leafs' 5v5 score tied TOA was 42.2%. Their 5v5 score tied Fenwick number after 24 games? 42.6%. Yikes. Their 5v5 score tied Corsi was hardly very different, at 44.0%.

'But wait', you say. 'Maybe those numbers just happen to match up at the end!' Nope.
[Image: TOA vs. Corsi chart]
http://www.pensionplanpuppets.com/2013/9/16/4727746/leafs-attack-time-at-the-halfway-mark
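
As a minimal sketch of what a Pearson correlation coefficient like the ones quoted above measures, here is the calculation in plain Python. The two lists are invented numbers, not real NHL zone time or Corsi data, and `pearson_r` is just a helper name chosen for this example.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Invented per-team differentials, for illustration only.
zone_time_diff = [-40.0, -15.0, 0.0, 10.0, 35.0]
corsi_diff = [-120, -30, -10, 45, 100]

r = pearson_r(zone_time_diff, corsi_diff)
print(f"r = {r:.2f}")
```

A value near 1 means teams that are ahead in one measure are almost always ahead in the other, which is the sense in which the shot metrics are said to track zone time.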
 

ChibiPooky

Yay hockey!
May 25, 2011
11,486
2
Fairfax, VA
It amazes me that with all the focus on 'puck possession', there's no widely-used tracking of it, AFAIK.

Corsi and Fenwick don't measure puck possession, so we need to stop saying they do.

I'd think zone time and actual puck possession would be more useful measures.

Corsi and Fenwick both correlate extremely strongly to actual possession. That's why people use them instead of actual possession - they're much easier to count and track and get you essentially the same result.

I think the neutral zone metrics are where the league is headed, though.

With regard to tracking, I believe zone time is counted by the NHL but not actual possession. SportVU should bring a lot of tools to bear as far as tracking goes, too. Remains to be seen how it's used but it's helped bring a huge stats boom to basketball.
 

Hivemind

We're Touched
Oct 8, 2010
37,073
13,535
Philadelphia
If no one measures possession, how could anyone conclude that Corsi and Fenwick "correlate extremely strongly" to it?

NHL used to record zone time (see my link to Vic Ferrari's article). Individual blogs have tracked zone time since then. After seeing the very high correlation their numbers had with fenwick/corsi, most (all?) stopped, just like they did with scoring chances.
 

NobodyBeatsTheWiz

Happy now?
Jun 26, 2004
23,422
1,973
The Burbs
NHL used to record zone time (see my link to Vic Ferrari's article). Individual blogs have tracked zone time since then. After seeing the very high correlation their numbers had with fenwick/corsi, most (all?) stopped, just like they did with scoring chances.

Well, Vic Ferrari's article is an abomination from a statistical standpoint, given the minimal sample size used. 2% of the NHL games in a single season is light years away from a statistically significant sample. There is not a single legitimate statistical conclusion that he could have come to.

Some baseball #fancystats like WAR, FIP, ZiPS etc. were developed using decades of data, and they're still rife with problems.

Seems to me there's a whole lot of assuming going on in an area that should be completely absent of assumption.
 

Hivemind

We're Touched
Oct 8, 2010
37,073
13,535
Philadelphia
Well, Vic Ferrari's article is an abomination from a statistical standpoint, given the minimal sample size used. 2% of the NHL games in a single season is light years away from a statistically significant sample. There is not a single legitimate statistical conclusion that he could have come to.

You might want to read that a bit closer.

So I wrote a simple little Excel macro to scrape the zone time info off of the NHL.com game sheets for 2001/2002, and the shots stuff from the NHL.com event sheets. Laziness prevented me from filtering out the empty net goals, and the data is missing for 21 of the 1230 games, c'est la vie.

He's only missing 21 games, as in he has a 98.3% sample.
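
The arithmetic behind that figure, as a quick check in Python, using only the two numbers from the quoted passage (1230 games, 21 missing):

```python
total_games = 1230   # games in the 2001/2002 data set, per the quoted article
missing_games = 21   # games with no data, per the quoted article

covered = total_games - missing_games
coverage_pct = 100 * covered / total_games
print(f"{covered} of {total_games} games = {coverage_pct:.1f}%")
# prints "1209 of 1230 games = 98.3%"
```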
 
