What Corsi really translates to (in numbers that are easy to grasp)

SladeWilson23

I keep my promises.
Sponsor
Nov 3, 2014
26,735
3,220
New Jersey
You saying there's no correlation isn't gonna convince me of anything; I've been looking up numbers for the past 5 years. I've had numerous discussions with people in the advanced stats community about almost everything Corsi-related.

Can you provide proof that zone starts affect corsi?
 

Appleyard

Registered User
Mar 5, 2010
31,808
41,288
Copenhagen
twitter.com
From my reading there are not too many who say zone starts do not matter at all.

Many say though that for a lot of the league the impact is negligible... as most players get between 48-52% relative zone starts.

But in turn big negative or positive relative zone starts do have an impact on corsi.
 

TT1

Registered User
May 31, 2013
23,753
6,244
Montreal
Can you provide proof that zone starts affect corsi?

I don't have a chart or anything but look up players (especially dmen) and compare their yearly numbers based on zone starts.

The numbers you look at should preferably be a few years apart (at the most) and the player should still be playing on the same team, preferably with the same d partner, forward core etc (that would give you the most precise data).

Personally I'm waiting to see Gostisbehere's numbers once his o-zone starts start dropping. He's an interesting case study.

http://www.hockey-reference.com/players/g/gostish01-advanced-5on5.html
 

Filthy Dangles

Registered User*
Sponsor
Oct 23, 2014
28,839
40,536
The thing is, there not being enough information to say that zone starts have any significance is not the same as there not being any significance. Logic would suggest that they do matter. If you're starting 180 feet from your own net vs. 10 feet away, well, anyone who has played can tell you that one situation is more likely to see a shot against your net than the other.

If the "advanced" stats community feels that it's a closed discussion, I think they are doing themselves a disservice. You shouldn't just be looking at what the numbers say. You should also ask why it is saying it, and if it makes sense. To me, that's the difference between someone who has a real understanding of numbers and how to use them, and someone who lets the numbers tell them what to think.

Yeah. I'd love to see the average difference in CF% when only considering off vs def zone starts. There obviously has to be a difference in these two situations. But like you said, whether that difference has any significance is unclear.
 

Appleyard

Registered User
Mar 5, 2010
31,808
41,288
Copenhagen
twitter.com
Yeah. I'd love to see the average difference in CF% when only considering off vs def zone starts. There obviously has to be a difference in these two situations. But like you said, whether that difference has any significance is unclear.

I mean... there IS a relationship for sure.

When you plug in zone starts (OZ & DZ) and run them against CF% and CF% rel you come back with pretty consistent correlation over big samples of players and minutes...

Generally it is in the 0.45-0.5 range, with OZ being positive, DZ being negative... and when NZ is run the result is always right at zero. (I started to try and look at it myself the other year, but realised pretty quickly that: A. the range of data I would have available is not really sufficient to actually draw much that is worthwhile/concrete. B. It would take a lot of time! So stopped!)

So there is a moderate relationship. Especially when talking about samples of 200+ players over ~4 year time-spans each with like 2000+ mins.
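For anyone wanting to try this kind of check, here is a minimal sketch of a Pearson correlation between offensive zone start share and CF%. The player values are made up purely for illustration (they are not real NHL data), and the DZ share is assumed to be simply 100 minus the OZ share, i.e. neutral zone starts are ignored. Under that assumption the DZ correlation is the exact mirror image of the OZ one, which would fit the "same size, opposite sign" pattern described above.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Made-up illustrative values, one entry per hypothetical player.
oz_start_pct = [44.0, 47.5, 49.0, 50.5, 52.0, 55.5, 58.0]
cf_pct = [47.1, 50.3, 48.2, 51.0, 49.5, 52.4, 51.8]

# Assumption: DZ share is the complement of OZ share (NZ starts ignored).
dz_start_pct = [100.0 - oz for oz in oz_start_pct]

r_oz = pearson_r(oz_start_pct, cf_pct)
r_dz = pearson_r(dz_start_pct, cf_pct)
print(f"r(OZ%, CF%) = {r_oz:+.2f}")
print(f"r(DZ%, CF%) = {r_dz:+.2f}")
```

With real data the two shares would not be exact complements once NZ starts are counted, so the two coefficients could differ somewhat in size.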

You are talking about a P value (off the top of my head, I am not going away and running it!) of 0.05-0.08 for that realistically... not perfect, but close to what would be acceptable academically to say shows a significant level of correlation. (~5-8% chance that it is just down to chance.)
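As a rough check on that figure, the sketch below estimates the p-value for a correlation of 0.45 over 200 players, both numbers taken from the post above. It uses the standard t statistic for a Pearson correlation and, as a simplifying assumption, a normal approximation to the t distribution (reasonable only for large samples). On these inputs the p-value should come out far below 0.05-0.08. Note also that a p-value is the probability of seeing a correlation at least this strong if there were truly no relationship, which is not quite the same as the chance that the result is "just down to chance".

```python
import math

def correlation_p_value(r, n):
    """Approximate two-sided p-value for a Pearson r from n paired observations.

    Uses t = r * sqrt((n - 2) / (1 - r^2)) and a normal approximation to the
    t distribution, which is reasonable only when n is large.
    """
    t = r * math.sqrt((n - 2) / (1.0 - r * r))
    p = math.erfc(abs(t) / math.sqrt(2.0))
    return t, p

# r and n as quoted in the post; everything else is an assumption of this sketch.
t_stat, p_val = correlation_p_value(r=0.45, n=200)
print(f"t = {t_stat:.2f}, approximate p = {p_val:.1e}")
```

This says nothing about the size of the effect on CF%, only that a correlation of that strength in a sample that large is very unlikely to be pure noise.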

But while it is pretty certain there is a relationship between the two the extent of it is what is actually important... and that is what everyone has had trouble really isolating, as you are working with numbers that are generally in a very narrow range, with a lot of other variables around them.
 

SladeWilson23

I keep my promises.
Sponsor
Nov 3, 2014
26,735
3,220
New Jersey
I don't have a chart or anything but look up players (especially dmen) and compare their yearly numbers based on zone starts.

The numbers you look at should preferably be a few years apart (at the most) and the player should still be playing on the same team, preferably with the same d partner, forward core etc (that would give you the most precise data).

Personally I'm waiting to see Gostisbehere's numbers once his o-zone starts start dropping. He's an interesting case study.

http://www.hockey-reference.com/players/g/gostish01-advanced-5on5.html

One thing I'd like to see done is keeping track of where a player's ice time is distributed. An expansion of zone starts data.
 

Appleyard

Registered User
Mar 5, 2010
31,808
41,288
Copenhagen
twitter.com
I remember that when I ran the initial coefficients, two things stood out:

Raw Corsi and Corsi rel against zone starts were basically the same variable... only for forwards. The results for both were like +0.45-0.49(OZ) and -0.45-0.49 (DZ).

But for Dmen the Raw Corsi was at ~0.48-0.55 both ways (so slightly stronger than for forwards)... but the Corsi rel was down near 0.4 flat (both ways, a bit weaker than forwards and not matching the raw for Dmen).

NZ was always within ~0.05 of zero, positive or negative, for both forwards and Dmen... though each time it was negatively correlated for Dmen, positive for forwards. But probably just noise given how tiny the numbers were.

EDIT: And I think I remember that DZ starts themselves were better correlated than OZ starts with corsi... but only for Dmen, for forwards both were similar but flipped, OZ being +, DZ being -.

But this was like 18 months ago, so I may be a bit out! But that was the basic gist.

Though ofc it does not matter apart from being of minor interest... since it does not really tell you the extent of the impact on actual results, just that there is a relationship!
 

The Thin White Duke

Registered User
Aug 11, 2009
3,909
1
One thing I'd like to see done is keeping track of where a player's ice time is distributed. An expansion of zone starts data.

Having a little RFID chip somewhere in a player's equipment, as well as one in every puck would help out so much. I assume teams already have guys measuring zone time and puck-on-stick time possession privately, it would be nice to see those numbers and compare them to what we have now.
 

Machinehead

GoAwayTrouba
Jan 21, 2011
144,691
118,621
NYC
Why don't they matter though? Is it because zone start imbalances tend to even out or do they really not matter at all? Consider this unrealistic example. A 52% top pairing D plays 1000 consecutive shifts that start in the D zone followed by 1000 consecutive shifts in the O zone. What would both % look like?

The data exists to look at such a situation too. Take a top pairing D who logs tons of minutes and shifts over the last 3-4 seasons. Ignore neutral zone starts. Then, only consider shifts that started in the off zone and then the shifts that started in the D zone.

Willing to bet significant difference.
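The split described above could be sketched like this. The shift records are hypothetical (made-up counts, not real data), and the record layout of start zone, shot attempts for, and shot attempts against is an assumption of this example rather than the format of any actual stats site.

```python
from collections import defaultdict

# Hypothetical shift records (made-up numbers):
# (start zone, shot attempts for, shot attempts against) while the player is on the ice.
shifts = [
    ("OZ", 3, 1),
    ("OZ", 2, 2),
    ("DZ", 1, 3),
    ("DZ", 0, 2),
    ("NZ", 1, 1),
    ("OTF", 2, 1),  # on-the-fly change, no faceoff
]

def cf_pct_by_start(shift_records, zones=("OZ", "DZ")):
    """Return CF% per start zone, using only shifts that began in the given zones."""
    totals = defaultdict(lambda: [0, 0])
    for zone, cf, ca in shift_records:
        if zone in zones:
            totals[zone][0] += cf
            totals[zone][1] += ca
    return {
        zone: 100.0 * cf / (cf + ca)
        for zone, (cf, ca) in totals.items()
        if cf + ca > 0
    }

split = cf_pct_by_start(shifts)
for zone, pct in split.items():
    print(f"{zone} starts: {pct:.1f}% CF")
```

Neutral zone and on-the-fly shifts are dropped here, matching the "ignore neutral zone starts" idea; whether the gap between the two numbers matters would still depend on how many such shifts a player actually has.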

In that hypothetical, he'd probably come out to around 60% CF in his 1,000 OZ shifts and 40% CF in his 1,000 DZ shifts. Fair enough.

But here's how it actually works - out of those 2,000 shifts, 1,200 will start when the player jumps over the boards. The majority of shifts do not start in any of the three zones; everyone forgets that.

Out of the remaining 800, another 270-300 start in the neutral zone.

Now we're talking about 500-530 shifts out of 2,000 that actually have an effect.

It's not that OZ and DZ starts don't affect corsi, it's more so that over the course of a season, there just aren't actually that many OZ and DZ starts - not enough to influence the data.

Now let's assume this hypothetical player has 60% OZ starts. That's 300 out of the 500, or 50 more than a player who is split evenly. Over 82 games, 50 shifts is a fraction of peanuts.

And it's really 25 because half the time, you win/lose the faceoff depending on which zone you're in, and nothing happens.
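The arithmetic in this post can be laid out as a short sketch. All inputs are the post's own hypothetical figures; the variable names are just for illustration. Subtracting 270-300 neutral zone starts from 800 faceoff starts leaves 500-530 OZ/DZ starts.

```python
total_shifts = 2000
on_the_fly = 1200  # shifts that start with a change over the boards
faceoff_starts = total_shifts - on_the_fly

# Neutral zone starts are given as a range of 270-300.
nz_low, nz_high = 270, 300
oz_dz_range = (faceoff_starts - nz_high, faceoff_starts - nz_low)

# Take 500 OZ/DZ starts and a 60% offensive zone share, as in the post.
oz_dz = 500
oz_pct = 60
oz_starts = oz_dz * oz_pct // 100
even_split = oz_dz // 2
extra_oz = oz_starts - even_split

# Extra OZ starts as a share of all shifts.
extra_share = extra_oz / total_shifts

# The post halves the figure again on the premise that about half
# of those faceoffs lead to nothing.
effective_extra = extra_oz // 2

print(f"faceoff starts: {faceoff_starts}")
print(f"OZ/DZ starts: {oz_dz_range[0]}-{oz_dz_range[1]}")
print(f"extra OZ starts vs. even split: {extra_oz} ({extra_share:.1%} of all shifts)")
print(f"after halving: {effective_extra}")
```

The halving step is the post's own premise about faceoff outcomes, not something this sketch verifies.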
 

Ol' Jase

Steaming bowls of rich, creamy justice.
Sponsor
Jul 24, 2005
12,485
4,818
In that hypothetical, he'd probably come out to around 60% CF in his 1,000 OZ shifts and 40% CF in his 1,000 DZ shifts. Fair enough.

But here's how it actually works - out of those 2,000 shifts, 1,200 will start when the player jumps over the boards. The majority of shifts do not start in any of the three zones; everyone forgets that.

Out of the remaining 800, another 270-300 start in the neutral zone.

Now we're talking about 500-530 shifts out of 2,000 that actually have an effect.

It's not that OZ and DZ starts don't affect corsi, it's more so that over the course of a season, there just aren't actually that many OZ and DZ starts - not enough to influence the data.

Now let's assume this hypothetical player has 60% OZ starts. That's 300 out of the 500, or 50 more than a player who is split evenly. Over 82 games, 50 shifts is a fraction of peanuts.

And it's really 25 because half the time, you win/lose the faceoff depending on which zone you're in, and nothing happens.

I know this will fall on deaf ears, but this is a perfect synopsis of the inherent flaw in hockey advanced stats compared to baseball.

There is never a situation, not a single one, where a shift doesn't start in a zone. The area where the puck is in play is where the player's shift starts. Period. Creating caveats to eliminate this fact from statistical analysis makes the statistic dishonest.

YMMV.
 

Machinehead

GoAwayTrouba
Jan 21, 2011
144,691
118,621
NYC
I know this will fall on deaf ears, but this is a perfect synopsis of the inherent flaw in hockey advanced stats compared to baseball.

There is never a situation, not a single one, where a shift doesn't start in a zone. The area where the puck is in play is where the player's shift starts. Period. Creating caveats to eliminate this fact from statistical analysis makes the statistic dishonest.

YMMV.



We simply don't measure it that way. That's your opinion on what a zone start should be, but corsiRelOlJase is not a stat.
 

Ol' Jase

Steaming bowls of rich, creamy justice.
Sponsor
Jul 24, 2005
12,485
4,818


We simply don't measure it that way. That's your opinion on what a zone start should be, but corsiRelOlJase is not a stat.


"You"? Your "group's" definition?

This is the major reason for the flaws in advanced stats. The arrogant *******s who still don't understand the simple concept that hockey is not subject to very many static variables.

It's laughable that you think "your" definition of a zone start is somehow realistic.

Try to follow now. If "you" are creating caveats based on the fact that a zone start doesn't pertain to 60% of the situations where Corsi would be measured, then the concept of a zone start not affecting Corsi full stop is impossible to determine, because Corsi isn't segregated between the two variables of a recorded zone start and one that isn't.

Tell me you at least understand this extremely basic principle. Please.
 

Machinehead

GoAwayTrouba
Jan 21, 2011
144,691
118,621
NYC
It's a textbook definition. If the other team wins 1-0 because one of their shots crossed the line, you don't get to say "I don't think that's a goal"
 

Machinehead

GoAwayTrouba
Jan 21, 2011
144,691
118,621
NYC
Also I love the notion that "over the boards" starts are happening in different zones.

Show me a team change lines while they're playing defense. I'll wait.
 

Ol' Jase

Steaming bowls of rich, creamy justice.
Sponsor
Jul 24, 2005
12,485
4,818
Also I love the notion that "over the boards" starts are happening in different zones.

Show me a team change lines while they're playing defense. I'll wait.

It doesn't matter where the players are. The puck determines where a player starts his shift. Of course very few line changes happen when a team is "playing defense", but many line changes happen when the team has puck possession in their own zone. It's still a shift start.

If you are eliminating 60% of the situations where Corsi would be measured as "non zone starts", then it is impossible to properly determine full stop that zone starts don't affect Corsi because you're not segregating the Corsi measurement between situations where zone starts are applicable and when they are not.

How do you not understand this?
 

Machinehead

GoAwayTrouba
Jan 21, 2011
144,691
118,621
NYC
It doesn't matter where the players are. The puck determines where a player starts his shift. Of course very few line changes happen when a team is "playing defense", but many line changes happen when the team has puck possession in their own zone. It's still a shift start.

If you are eliminating 60% of the situations where Corsi would be measured as "non zone starts", then it is impossible to properly determine full stop that zone starts don't affect Corsi because you're not segregating the Corsi measurement between situations where zone starts are applicable and when they are not.

How do you not understand this?

Because that's how you do it.

That's the definition of a zone start used by every advanced stats website, every traditional stats website, every coach, every GM, and the NHL itself per their official website.

How do you not understand this?
 

Ol' Jase

Steaming bowls of rich, creamy justice.
Sponsor
Jul 24, 2005
12,485
4,818
Because that's how you do it.

That's the definition of a zone start used by every advanced stats website, every traditional stats website, every coach, every GM, and the NHL itself per their official website.

How do you not understand this?

I understand how the stat is measured. Jesus.

Have "you guys" gotten around to properly segregating Corsi measurements for zone start applicable and not applicable so you can actually properly determine if zone starts affect Corsi?

Yeah, didn't think so.
 

Machinehead

GoAwayTrouba
Jan 21, 2011
144,691
118,621
NYC
I understand how the stat is measured. Jesus.

Have "you guys" gotten around to properly segregating Corsi measurements for zone start applicable and not applicable so you can actually properly determine if zone starts affect Corsi?

Yeah, didn't think so.

Yes, we have. Thanks for asking.
 

TT1

Registered User
May 31, 2013
23,753
6,244
Montreal
But it's obviously true that hockey advanced stats aren't as precise as baseball. Hockey has a lot of variables that influence stats; baseball variables are isolated, making for far more precise data.
 

Bomber0104

Registered User
Apr 8, 2007
15,178
7,158
Burlington
If you really think there's no difference between 55% and 45%, I just don't know what to say about that.

One team has the puck 10% more than the other team...

That tells you nothing about what they are doing with the puck, whether the teams utilize systems that allow or don't allow shots from the outside or whether their systems stress shot quality vs. quantity ... all of which will affect a shot count.

Anyone who uses Corsi for any sort of "analysis" needs their head examined in my opinion. It's a cute little stat that shows an approximation of possession but in terms of statistical meaningfulness, it's garbage.

That's why it's been shown over and over that puck possession doesn't translate to wins.

The correlation doesn't exist.
 

Machinehead

GoAwayTrouba
Jan 21, 2011
144,691
118,621
NYC
But it's obviously true that hockey advanced stats aren't as precise as baseball. Hockey has a lot of variables that influence stats; baseball variables are isolated, making for far more precise data.

That's true that baseball situations are isolated, but baseball also gets into really advanced algorithms.

Everything in hockey is fairly simple and almost all of it is based on shooting the puck. I'd be very skeptical of something like WAR in hockey, but I think attempted shots are fairly manageable.
 

Machinehead

GoAwayTrouba
Jan 21, 2011
144,691
118,621
NYC
One team has the puck 10% more than the other team...

That tells you nothing about what they are doing with the puck, whether the teams utilize systems that allow or don't allow shots from the outside or whether their systems stress shot quality vs. quantity ... all of which will affect a shot count.

Anyone who uses Corsi for any sort of "analysis" needs their head examined in my opinion. It's a cute little stat that shows an approximation of possession but in terms of statistical meaningfulness, it's garbage.

That's why it's been shown over and over that puck possession doesn't translate to wins.

The correlation doesn't exist.

22 of the last 24 Stanley Cup winners were top 10 corsi teams. Try again.
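To put that figure in context, here is a rough sketch of how likely it would be to see at least 22 of 24 champions come from the top 10 if Corsi rank had nothing to do with winning the Cup. It takes the 22-of-24 claim at face value (it is not verified here) and assumes a 30-team league in which every team is equally likely to win, so a random champion lands in the top 10 one time in three. Both assumptions are simplifications.

```python
from math import comb

def prob_at_least(k_min, n, p):
    """Probability of at least k_min successes in n independent trials with success chance p."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1))

# Assumption: 30 teams, so a randomly chosen champion is top 10 with probability 10/30.
p_top10 = 10 / 30
p_chance = prob_at_least(22, 24, p_top10)
print(f"P(at least 22 of 24 by chance) = {p_chance:.1e}")
```

A tiny probability here would only rule out "no relationship at all" under these assumptions; it would not show how much of a team's success Corsi explains.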
 

Bomber0104

Registered User
Apr 8, 2007
15,178
7,158
Burlington
22 of the last 24 Stanley Cup winners were top 10 corsi teams. Try again.

That means nothing.

If puck possession is considered to be a good thing, it must correlate to winning hockey games.

Screen-Shot-2015-11-05-at-11.56.02-AM.png


Here's a chart plotting wins against CF% for the 2015 season with a miserable .22 r-value.

Screen-Shot-2015-11-05-at-12.11.19-PM.png


Over 4 years (2011-2015), the r-value between CF% and Wins is .35, which again demonstrates weak-to-no correlation between the two variables.

If being a good puck possession team doesn't statistically lead to winning hockey, just why exactly are people like you hyping it so much?
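One way to read those two r-values is through r squared (the share of variance in wins that CF% accounts for in a simple linear fit) and the t statistic for the correlation. The sketch below uses the r-values quoted above; the sample sizes are assumptions (30 teams for one season, 30 teams times 4 seasons for the longer span), since the charts' underlying data are not shown.

```python
import math

def r_squared(r):
    """Share of variance explained by a simple linear fit with correlation r."""
    return r * r

def t_stat(r, n):
    """t statistic for testing whether a Pearson r from n pairs differs from zero."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))

# r-values from the post; n values are assumptions of this sketch.
samples = {
    "2015 season": {"r": 0.22, "n": 30},
    "2011-2015": {"r": 0.35, "n": 120},
}

for label, s in samples.items():
    print(
        f"{label}: r^2 = {r_squared(s['r']):.3f}, "
        f"t = {t_stat(s['r'], s['n']):.2f}"
    )
```

As a rule of thumb, an absolute t value above roughly 2 is significant at about the 5% level for samples of this size. Under the assumed sample sizes, the single-season figure would fall short of that while the four-season figure would clear it, even though the variance explained is modest in both cases.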
 
