Corsi for/quality negative correlation?

SnowblindNYR

HFBoards Sponsor
Sponsor
Nov 16, 2011
52,096
30,687
Brooklyn, NY
I think some people hold the hypothesis that the more shots attempted the more likely a higher percentage of those are low danger shots. This was a criticism of the Canes at one point. The idea is that if you're more selective about shot quality, you'll also end up with lower quantity. I wonder if anyone has looked at the correlation between high danger chance % and Corsi for, and if not, whether there's a place I can find the breakdown of both so I can do this myself.
 

JaegerDice

The mark of my dignity shall scar thy DNA
Dec 26, 2014
25,142
9,398
There’s a direct positive correlation, though I’m not sure how strong it is. Teams with higher CF% also tend to have higher xGF%.

It goes back to the simple fact that there is only one puck. A team with the skill to fire the puck, retrieve it, fire again, etc, will typically generate more scoring chances as well by simple virtue of having the puck more. Teams that have the puck less will have fewer attempts and fewer chances.

Teams that score more than average on fewer shots tend to do so because they have better shooters (higher current and career sh%), not because their fewer chances are of higher quality.
 

SnowblindNYR

I'm more curious about high danger chance %. All things being equal, of course the team with a higher Corsi would have a higher GF%, since they take more shots.
 

hairylikebear

///////////////
Apr 30, 2009
4,177
1,804
Houston
"the more shots attempted the more likely a higher percentage of those are low danger shots"
Based on this it sounds like you are trying to correlate CF with HDCF/CF, which is an issue because CF and 1/CF will clearly have a strong negative correlation.

I think you would get more meaningful insight correlating these variables with goals for, which is what these metrics are designed to measure.

naturalstattrick allows you to export to CSV, and both Excel and Google Sheets have a correlation function (CORREL) that should make the task pretty simple if you're interested in exploring these relationships.
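For anyone who would rather script it than use a spreadsheet, here is a minimal sketch of the same calculation in Python. The rows are made up for illustration and the column names (Team, CF, HDCF, GF) are assumptions, so they would need to be renamed to match whatever the real export contains.

```python
import csv
import io
import math

# Made-up rows standing in for a team-level CSV export. The column names
# (Team, CF, HDCF, GF) are assumptions; rename them to match the real file.
SAMPLE_CSV = """Team,CF,HDCF,GF
A,4800,900,260
B,4500,820,240
C,4200,800,235
D,3900,700,210
E,3600,690,205
"""

def pearson(xs, ys):
    """Pearson correlation coefficient, the statistic a spreadsheet CORREL returns."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def load_columns(handle, names):
    """Read the named numeric columns from an open CSV file object."""
    rows = list(csv.DictReader(handle))
    return {name: [float(row[name]) for row in rows] for name in names}

# With a real export, replace io.StringIO(SAMPLE_CSV) with open("your_file.csv", newline="").
cols = load_columns(io.StringIO(SAMPLE_CSV), ["CF", "HDCF", "GF"])
r_cf_gf = pearson(cols["CF"], cols["GF"])
r_hdcf_gf = pearson(cols["HDCF"], cols["GF"])
print(f"r(CF, GF)   = {r_cf_gf:.3f}")
print(f"r(HDCF, GF) = {r_hdcf_gf:.3f}")
```

Correlating each raw count with goals for, rather than correlating CF with a ratio that has CF in its denominator, sidesteps the built-in negative relationship mentioned above.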
 

SnowblindNYR

It's funny, I had a similar conversation at work regarding gross margin % (I work in corporate finance). When revenue went up, I assumed the gross margin % would go down because the denominator was going up. My manager was like, yeah, but the numerator would also go up. Just like in that example, you have to remember to account for both the denominator AND the numerator. CF and 1/CF are negatively correlated, but HDCF and CF are positively correlated (raw numbers). HDCF/CF + MDCF/CF + LDCF/CF = 1, and I'd like to see how the distribution changes, if at all. These three shares always sum to 1, so none of them will go up just by virtue of CF going up.
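A small sketch of the numerator-and-denominator point, using hypothetical totals for three made-up teams and taking the identity above (CF = LDCF + MDCF + HDCF) as given. The numbers are chosen so that raw HDCF rises with CF while the high danger share does not move at all.

```python
# Hypothetical season totals for three made-up teams: (LDCF, MDCF, HDCF).
teams = {
    "low volume": (1500, 1000, 500),
    "mid volume": (1950, 1300, 650),
    "high volume": (2400, 1600, 800),
}

def danger_shares(ld, md, hd):
    """Return (LD share, MD share, HD share) of total attempts CF."""
    cf = ld + md + hd  # assumes every attempt falls into one of the three bins
    return ld / cf, md / cf, hd / cf

for name, (ld, md, hd) in teams.items():
    shares = danger_shares(ld, md, hd)
    print(name, "CF =", ld + md + hd, "HDCF =", hd,
          "shares =", [round(s, 3) for s in shares])
```

In this made-up example HDCF goes from 500 to 800 as CF goes from 3000 to 4800, yet the high danger share stays at one sixth for all three teams, so a bigger denominator by itself says nothing about the mix.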
 

hairylikebear

In this case you would just want to account for the numerator because the correlation with the denominator is trivial - so just leave it out.

But I don't think correlation is what you're even looking for here. Maybe another possible approach is to put the league in buckets for high/mid/low CF values and then look at those distributions. But I don't know where to find that kind of info first hand so the only option I'm aware of is getting into the spreadsheets.
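The bucket idea could be sketched roughly like this. The (CF, HDCF) pairs are made up, and the helper name bucket_hd_share is hypothetical; real values would have to come from exported data.

```python
# Made-up (CF, HDCF) season totals; real values would come from exported data.
teams = [
    (4900, 880), (4750, 870), (4600, 800), (4450, 810),
    (4300, 760), (4150, 770), (4000, 700), (3850, 690), (3700, 680),
]

def bucket_hd_share(rows, n_buckets=3):
    """Sort teams by CF, split them into buckets, and return the mean
    HDCF/CF for each bucket, lowest-CF bucket first."""
    ordered = sorted(rows, key=lambda row: row[0])
    size = len(ordered) // n_buckets
    means = []
    for i in range(n_buckets):
        # The last bucket absorbs any remainder when the split is uneven.
        end = (i + 1) * size if i < n_buckets - 1 else len(ordered)
        chunk = ordered[i * size:end]
        means.append(sum(hd / cf for cf, hd in chunk) / len(chunk))
    return means

low, mid, high = bucket_hd_share(teams)
print(f"low CF bucket:  mean HD share = {low:.3f}")
print(f"mid CF bucket:  mean HD share = {mid:.3f}")
print(f"high CF bucket: mean HD share = {high:.3f}")
```

Comparing the three means (or the full spread within each bucket) shows whether the shot mix shifts with volume without leaning on a single correlation coefficient.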
 

SnowblindNYR

I was using "correlation" in the broader sense rather than meaning the "r" statistic.
 
