Philosophy of hockey sabermetrics: Can hockey accurately be measured?

Nalens Oga

Registered User
Jan 5, 2010
16,780
1,053
Canada
I love advanced stats. I don't really rely on them much for now, but how cool would it have been to have had them 40 years ago so we could compare defensive-minded players and two-way centres? We'd know which ones bore a tough or easy workload, etc.
 

James Mirtle

Registered User
May 15, 2006
226
0
Toronto
www.facebook.com
Some pretty funny happenings tonight re: advanced stats.
The Leafs were being outshot 11-3 by the Isles I believe, yet still had a 2-0 lead.
Leafs media guys that don't like advanced stats or words like "unsustainable" (James Mirtle and Dave Shoalts, in particular) proceeded to make fun of stat geeks on Twitter because of what had unfolded.

You need to read something I've written. Anything.
 

Warden of the North

Ned Stark's head
Apr 28, 2006
46,314
21,591
Muskoka
Some pretty funny happenings tonight re: advanced stats.
The Leafs were being outshot 11-3 by the Isles I believe, yet still had a 2-0 lead.
Leafs media guys that don't like advanced stats or words like "unsustainable" (Dave Shoalts, in particular) proceeded to make fun of stat geeks on Twitter because of what had unfolded.

At the end of the 1st period though, the Isles had a 19-5 edge in shots and 3-2 lead. :laugh:

The first period was somewhat misleading though, as the Leafs had significant PK time, including a large chunk of 5 on 3. That's going to skew the shot count of any team.
 

Michael Farkas

Grace Personified
Jun 28, 2006
13,424
7,947
NYC
www.HockeyProspect.com
I'm not a stats guy, but I'm not going to just shut out the idea of them without doing due diligence, certainly.

I tried on behindthenet for an explanation, but didn't really get a sufficient one. Is there a good place where I could read about the explanation of these terms...how QoC is calculated, how QoT is calculated, etc.

Any recommended reading that yous could point me towards...?
 

hella rights

Registered User
Oct 9, 2006
431
214
Regarding fenwick/corsi stats, i just looked them up for the rangers/devils game this afternoon http://timeonice.com/shots1213.php?gamenumber=20671

Judging by fenwick/corsi alone it looks like the devils outplayed the rangers, but the rangers actually won the game pretty handily 4-1. Am I interpreting this wrong and, if so, how should these results be interpreted?
 

The Noot

scaldin ur d00dz
Apr 12, 2012
9,841
404
Zurich
Advanced stats are interesting and can explain many things that seem random at first sight.
But some guys simply take them "too seriously" and try to predict things that can't be (precisely) predicted with our current formulas.

And what I keep encountering: Blatant ignorance towards this topic. Especially this chapter gets ignored way too often.
 

Danny34

Registered User
May 18, 2013
5
0
Regarding fenwick/corsi stats, i just looked them up for the rangers/devils game this afternoon http://timeonice.com/shots1213.php?gamenumber=20671

Judging by fenwick/corsi alone it looks like the devils outplayed the rangers, but the rangers actually won the game pretty handily 4-1. Am I interpreting this wrong and, if so, how should these results be interpreted?
The thing is: Every hockey game is influenced by chance so much, that outplaying your opponent may not be enough on any given night. BUT if you continue to outshoot your opponent, you will end up winning more games than you lose. That's why FenClose after 30 games is a better indicator of a team's strength than their actual record. Just look at Minnesota in 11-12. After 35 games they were very close to the top of the NHL standings (if not at the top), but they won a lot of those games while being outshot. In the end they didn't even make the playoffs, because their good luck was bound to run out.
If you compare the end of year standings with the FenClose or FenTied standings, they're pretty similar, but it does take some time before the effects of luck even out.

To come back to your question: Over the course of a season the Fenwick and Corsi data "sync up" with the scoring chances and the scoring chances "sync up" with goals. So if New Jersey continues to outshoot their opponents, they will start winning games.
 

GKJ

Global Moderator
Feb 27, 2002
186,671
38,700
Regarding fenwick/corsi stats, i just looked them up for the rangers/devils game this afternoon http://timeonice.com/shots1213.php?gamenumber=20671

Judging by fenwick/corsi alone it looks like the devils outplayed the rangers, but the rangers actually won the game pretty handily 4-1. Am I interpreting this wrong and, if so, how should these results be interpreted?

Score effects take hold. Teams trailing by more than two goals tend to take more shot attempts.

or

The Devils did outplay the Rangers handily. Just wasn't their day.
 

Shrimper

Trick or ruddy treat
Feb 20, 2010
104,192
5,268
Essex
I've always tried to understand Corsi and Fenwick but struggled. Anyone give me a 101 on it?
 

BoHorvatFan

Registered User
Dec 13, 2009
9,091
0
Vancouver
when every single team plays the exact same system, minutes are distributed equally for all players of equal ability then sure. Until then the stats are just showing us what we are already seeing but missing lots.
 

Doctor No

Registered User
Oct 26, 2005
9,250
3,971
hockeygoalies.org
when every single team plays the exact same system, minutes are distributed equally for all players of equal ability then sure. Until then the stats are just showing us what we are already seeing but missing lots.

What's this "missing lots" that you mention?

Yes, current hockey stats miss things - and they're better than what's currently used by mainstream audiences (which is better than what was used before, and so forth). And they're not perfect - if they were perfect, then hockey could be explained 100% by the numbers, and it wouldn't be very interesting at all. On the other hand, the fact that they aren't perfect is a good thing, since it means that progress is able to be made.

You probably could have picked out relevant criticisms instead of using a scatter shot approach - the way you wrote your post, it doesn't seem like you're even aware of which statistics do what.

And how are you not "missing lots" when you watch the games? I'd recommend Daniel Kahneman's book "Thinking, Fast and Slow" for a great primer on what narrative bias (among other things) does to you when you watch hockey. Your brain's not as good at this as you'd like to think.
 

Bank Shot

Registered User
Jan 18, 2006
11,377
6,942
The thing is: Every hockey game is influenced by chance so much, that outplaying your opponent may not be enough on any given night. BUT if you continue to outshoot your opponent, you will end up winning more games than you lose. That's why FenClose after 30 games is a better indicator of a team's strength than their actual record. Just look at Minnesota in 11-12. After 35 games they were very close to the top of the NHL standings (if not at the top), but they won a lot of those games while being outshot. In the end they didn't even make the playoffs, because their good luck was bound to run out.
If you compare the end of year standings with the FenClose or FenTied standings, they're pretty similar, but it does take some time before the effects of luck even out.

To come back to your question: Over the course of a season the Fenwick and Corsi data "sync up" with the scoring chances and the scoring chances "sync up" with goals. So if New Jersey continues to outshoot their opponents, they will start winning games.

Why can't teams get lucky on their Fenwick and corsi's for parts of the season?

I see people applying the luck factor to everything except those stats.
 

Master_Of_Districts

Registered User
Apr 9, 2007
1,744
4
Black Ruthenia
Why can't teams get lucky on their Fenwick and corsi's for parts of the season?

I see people applying the luck factor to everything except those stats.

That would be because the luck factor becomes negligible reasonably early in the season for those statistics.

For example, if we're looking at Corsi percentage specifically, the non-luck/luck division would be about 95%/5% at the 40 game mark.

If you wanted to get the best measure of each team's outshooting/possession talent, you could regress each team's Corsi percentage 5% of the distance to the league average at that point.

But that would almost be a waste of time.
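
As a rough sketch of that regression step: the function name and the 54% team below are made up for illustration, the league average is assumed to be 50%, and the 5% luck share is just the figure quoted above for the 40 game mark.

```python
def regress_to_league_average(observed_pct, league_avg_pct=50.0, luck_share=0.05):
    """Pull an observed Corsi% toward the league average by the luck share.

    luck_share=0.05 is the rough 40-game figure from the post, not a fitted value.
    """
    return observed_pct + luck_share * (league_avg_pct - observed_pct)


# Hypothetical team sitting at 54% Corsi after 40 games.
estimate = regress_to_league_average(54.0)
print(round(estimate, 2))
```

The estimate only moves from 54.0 to about 53.8, which is why the adjustment is close to pointless at that stage of the season.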
 

Bank Shot

Registered User
Jan 18, 2006
11,377
6,942
That would be because the luck factor becomes negligible reasonably early in the season for those statistics.

For example, if we're looking at Corsi percentage specifically, the non-luck/luck division would be about 95%/5% at the 40 game mark.

If you wanted to get the best measure of each team's outshooting/possession talent, you could regress each team's Corsi percentage 5% of the distance to the league average at that point.

But that would almost be a waste of time.

Over the long haul, a lot of other things become non luck as well, like goal differential.

I see a lot of people using corsi/fenwick to express what happened in individual games, or groups of 5-10 games.

In those instances, Corsi, and fenwick should be susceptible to luck just like any other statistics.
 

kmad

riot survivor
Jun 16, 2003
34,133
61
Vancouver
I've always tried to understand Corsi and Fenwick but struggled. Anyone give me a 101 on it?

Corsi = Shot attempts (shots on goal, blocked shots, missed shots)

Fenwick = Shot attempts minus blocked shots (shots on goal, missed shots)
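
If it helps, here is a minimal sketch of those two definitions. The function names and the shot counts are invented for the example, and "blocked" here means a team's own attempts that the opponent blocked.

```python
def corsi(shots_on_goal, missed_shots, blocked_shots):
    """All shot attempts: on goal + missed + blocked by the opponent."""
    return shots_on_goal + missed_shots + blocked_shots


def fenwick(shots_on_goal, missed_shots):
    """Unblocked shot attempts: on goal + missed."""
    return shots_on_goal + missed_shots


def share(attempts_for, attempts_against):
    """Percentage of all attempts in the game that belong to one team."""
    return 100.0 * attempts_for / (attempts_for + attempts_against)


# Made-up single-game totals for two teams.
team_a = {"sog": 30, "missed": 12, "blocked": 15}
team_b = {"sog": 25, "missed": 10, "blocked": 8}

corsi_a = corsi(team_a["sog"], team_a["missed"], team_a["blocked"])
corsi_b = corsi(team_b["sog"], team_b["missed"], team_b["blocked"])
fenwick_a = fenwick(team_a["sog"], team_a["missed"])
fenwick_b = fenwick(team_b["sog"], team_b["missed"])

print("Corsi%  :", round(share(corsi_a, corsi_b), 1))
print("Fenwick%:", round(share(fenwick_a, fenwick_b), 1))
```

Anything above 50% means that team took more of the attempts than its opponent.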
 

Muelch

Registered User
Mar 14, 2014
55
0
Los Angeles
I've always liked playing around with regressions of hockey stats on Excel so the advanced stats are great for the convenience of data collection.

That said I think people misinterpret the value of the advanced stats, trying to use them as an actual ranking of teams or players. As a simple score, they are virtually non-usable in any predictive manner. However, they are very useful in determining the styles of players and teams, which can help determine ultimate value only in comparing several stats and real time situations together.

For example, one of my favorite stats is the Offensive Zone Finish %. Players with a high value in this stat are clearly Powerforwards and net crashers who cause goalie stoppages while low % are often players who have good skilled scoring and typically jump off the ice after a long shift in the opponent's zone has been cleared.

Similarly, high Offensive Zone Start % players are the top line offensive types, while low start % players are the more defensive types. A relatively low % is a common trait shared among Selke nominees, but a player with a slightly lower stat isn't necessarily a better defensive player, as there's a good chance he and his team keep the puck out of his zone properly. Looked at with several comparisons, however, this stat can be very useful.

As for the overwhelming interest there is in shot differential among these advanced stats, it is certainly a mistake to think these stats can show a straightforward chance of winning in the playoffs. Having done analysis on various forms of shot differential compared to playoff success, I have found there is definitely a "significant" relationship between the two, but the actual correlation (how much of the difference in success is explained by shots) is very slim.

It may be a strange concept to those not familiar with statistical analysis, but just because something has "statistical significance" doesn't mean it has a MAJOR effect, it only means that it does have some effect, regardless of how much.
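
A small sketch of that distinction, using numbers that are invented and not taken from my own regressions: with a big enough sample, a weak correlation still clears the usual significance bar while explaining almost none of the variation.

```python
import math


def t_statistic(r, n):
    """t statistic for testing whether a Pearson correlation r differs from zero."""
    return r * math.sqrt((n - 2) / (1 - r * r))


# Made-up values: a weak correlation measured over a large sample.
r, n = 0.10, 1000
t = t_statistic(r, n)
r_squared = r * r  # share of the variance explained by the relationship

# Roughly 1.96 is the two-sided 5% cutoff for a sample this large.
print("significant at 5%:", abs(t) > 1.96)
print("share of variance explained:", round(r_squared, 2))
```

Here the t statistic comes out a little above 3, so the relationship counts as significant, yet it accounts for only about 1% of the variance.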

In saying that, the value of shot differential does make logical sense to a degree, a team with a better differential is more likely to have a strong possession game, something that is valuable in the playoffs. That doesn't mean we can write off Shooting% and Save% as luck just because long term regressions DON'T have statistical significance with Playoff Success, it only means that a strong possession game has been the more consistent trait of the teams that have won in the playoffs. Some years might show significance in other aspects, and while one year might show save% to be significant, another might show PP% to be significant. One year might show rainfall in China to be significant, but obviously examining that with any kind of logic shows that as meaningless.

So while I strongly believe that hockey can be analyzed with advanced statistics, there is no way to plug in an equation and get a perfect playoff bracket. Any statistics have to be looked at with logic and scrutiny, and comparing them to actual events in the game is the only way to understand their value.
 

MarkGio

Registered User
Nov 6, 2010
12,533
11
Maybe a math wizard could speculate on chaos? Human behavior seems very chaotic. Matt Stajan just had his child pass away, so how do we quantify the effect that may or may not have on his performance?

I feel like a margin for chaos doesn't cut it, as an athlete. I feel like, I'm in control when I play and can see the sequences of events unwinding. Imperfection or lack of perfect control is felt, so I'm not sure what chaos is other than no game ever feels exactly the same. Which is strange because the same parameters are laid out time and time again within this tiny 200ft x 85ft confinement.
 

kmad

riot survivor
Jun 16, 2003
34,133
61
Vancouver
Maybe a math wizard could speculate on chaos? Human behavior seems very chaotic. Matt Stajan just had his child pass away, so how do we quantify the effect that may or may not have on his performance?

I feel like a margin for chaos doesn't cut it, as an athlete. I feel like, I'm in control when I play and can see the sequences of events unwinding. Imperfection or lack of perfect control is felt, so I'm not sure what chaos is other than no game ever feels exactly the same. Which is strange because the same parameters are laid out time and time again within this tiny 200ft x 85ft confinement.

What is chaos other than variables that we don't know how to measure?
 

Mathletic

Registered User
Feb 28, 2002
15,777
407
Ste-Foy
I'm no expert on chaos theory. I never had courses on the subject specifically, but as far as I know chaos theory applies to deterministic systems. Without getting too crazy about the maths, a deterministic system is one that, if given the same initial conditions, will always return the same answer. For example, let's take the function 2x + 3. If we replace x by 4 we'll always get the same answer no matter what, same for 5, 6 and so on. Like I said, 2x + 3 is a function, so it is also total (not sure what you call those in English, but in French we'd say totale), but that's a different story.

What chaos theory says is that a system will behave chaotically if it is deterministic and if a small change in its initial conditions results in a completely different outcome.
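
To make that concrete, here is a small sketch using the logistic map, a textbook chaotic system that I'm picking purely for illustration (it has nothing to do with hockey). The starting value, the size of the nudge and the step count are arbitrary choices.

```python
def logistic_trajectory(x0, steps, r=4.0):
    """Iterate x -> r * x * (1 - x), a fully deterministic rule."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


a = logistic_trajectory(0.2, 100)
b = logistic_trajectory(0.2, 100)          # exactly the same start
c = logistic_trajectory(0.2 + 1e-10, 100)  # start nudged by a tiny amount

# Deterministic: identical initial conditions give identical results.
print("same start, same path:", a == b)

# Sensitive: the tiny nudge is invisible early on, then grows until the paths disagree.
print("gap after 5 steps:", abs(a[5] - c[5]))
print("largest gap over the run:", max(abs(x - y) for x, y in zip(a, c)))
```

Both properties show up at once: nothing random is involved, yet the two nearly identical starts end up on unrelated paths.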

I don't think the case of Matt Stajan is one that applies under chaos theory. One thing that psychologists have learned over the years is that we generally have a baseline "happiness" level that we all tend to gravitate to over the long term. No matter how hard the events we have to go through, we generally come back to the same "happiness" level we were at before the events. I use the word happiness in a general way. I cannot express how tough it must be for the Stajan family to go through this.

In the end, Hockey isn't deterministic, so I doubt chaos theory has many applications in hockey or sports in general. Maybe to some specific problems but not in a general way.
 

Epsilon

#basta
Oct 26, 2002
48,464
369
South Cackalacky
What is chaos other than variables that we don't know how to measure?

It's a specific type of dynamical system whose properties have been largely misunderstood and butchered by the media, general public, and entertainment. The so-called "butterfly effect" in particular is based on a misunderstanding of what Lorenz's point was in his eponymous talk on the subject.
 

MarkGio

Registered User
Nov 6, 2010
12,533
11
All I know is that despite an endless amount of variables, the variance in the results is very small. Hockey usually ends in 1-10 goals per game. Teams usually score 200-300 goals per season. Scoring leaders usually have 45-60 goals and point leaders usually have 90-120 points.

So there is chaos within a game, but somehow the results are always similar. That's strange to me
 

Mathletic

Registered User
Feb 28, 2002
15,777
407
Ste-Foy
All I know is that despite an endless amount of variables, the variance in the results is very small. Hockey usually ends in 1-10 goals per game. Teams usually score 200-300 goals per season. Scoring leaders usually have 45-60 goals and point leaders usually have 90-120 points.

So there is chaos within a game, but somehow the results are always similar. That's strange to me

A statistician could explain this better but I'll give it a try :P

In order to understand all of this a bit better, I think it would be a good idea to read a good stats book or watch some videos on YouTube or Khan Academy. There's a good bunch of resources that can help you understand all this a bit better.

I'm not sure we can talk about chaos all that much. I think what you refer to is what statisticians mean by noise or explained/unexplained variation.

What you're noticing on a large scale is what is known as the central limit theorem. In small samples, certain events will follow a Poisson distribution, others a gamma distribution and whatnot. Over large samples, however, what you're seeing is that the totals and averages of all these distributions tend to follow a normal distribution. The infamous bell curve, inverted U or what have you.

Let's take for example a coin flip. Say we flip it twice and got two heads. If we wanted a deep understanding on why it turned that way we could study the physics of it. I.e. the force applied on the coin, the height at which it was thrown, friction with air and so on. Those physical models are fairly easy to make. That way we could predict 100% of the time on what side the coin will fall.

However, in most endeavours, it becomes very difficult if not impossible to make such models. So, we lean towards statistics and probabilities. If we flip a coin 10 times, 30 times, a million or a billion times we'll notice that the results tend to be 50/50 +/- a small margin of error. It's definitely not as good as a true physical model that would provide enough information to predict 100% of the time the expected outcome given certain conditions, but in most cases, statistics and probabilities are our best bet.
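
A quick simulated version of the coin flips, as a sketch only: the function name, the fixed seed and the flip counts are arbitrary picks, and the coin is assumed to be perfectly fair.

```python
import random


def heads_share(n_flips, seed=0):
    """Simulate n_flips of a fair coin and return the fraction that came up heads."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    heads = sum(rng.random() < 0.5 for _ in range(n_flips))
    return heads / n_flips


for n in (10, 30, 1_000, 100_000):
    print(n, round(heads_share(n), 3))
```

The margin around 50/50 is wide at 10 or 30 flips and narrows as the count grows. Strictly speaking, that narrowing is the law of large numbers, while the central limit theorem describes the bell shape of the leftover error.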

Over the course of a hockey season we tend to see the same things. On a given night Jan Bulis might score 4 goals and Sidney Crosby 0. There's a lot of variables and noise in those small samples. However, over the course of a season, a career or multiple careers we see the central limit theorem taking over.

I'm not sure thinking of hockey as a chaotic system helps understanding the game a whole lot better.
 

matthew94

Registered User
Jul 2, 2005
617
0
WNY
www.matthew94.blogspot.com
Don't mean to over-analyze, but...

I really think the debate between pro-analytics and anti-analytics is closely connected to our broader cultural debate between belief in a spiritual realm and naturalism.

The former, if held antagonistically (toward analysis), just ignores this whole analytics industry to its own peril.

The latter, wrongly in my opinion, believes that nature is all there is. If true, this would lend itself directly to scientific measurement and analysis of all things. More importantly, it suggests that such measurements are fully capable of explaining pretty much everything (as long as we have enough stats).

The right position, it seems to me, is to realize that there are (technically) invisible components involved in hockey. Relationships, moods, what's going on at home, line chemistry, intuition, etc. all can have a big impact on the game that doesn't necessarily show up on the stats sheet.

You can come up with ways to measure a lot, but there are some things that are either very difficult or impossible to measure. Analytics should be used, but not abused.
 
