Philosophy of hockey sabermetrics: Can hockey accurately be measured?

plusandminus

Registered User
Mar 7, 2011
1,404
268
MOD EDIT: Posts 1 & 2 were moved from the Past Studies thread to reduce clutter in that thread. Please see post 3 for the actual beginning of this thread.

I studied the "with or without" effect (team results when a player participated in a game vs. when he didn't) for all players over roughly the last 25 seasons, lately even the goalies, also comparing (on a game-by-game basis) to an "expected" game outcome. I also studied how some players' point production was affected when certain other players weren't around. These are, as I see it, things that would be of great interest, considering all the debates about how much different players were helped by each other, etc.
I've studied the strength of different year groups too. And lots on adjusted scoring, scoring distributions, the effect of faceoffs, situational adjusted goalie stats, penalty killing stats that attempt to take away the goalie effect, different kinds of adjusted +/-, a combined overall player stat for ES+PP+SH, "how easy it was to produce points during a certain season", schedule-adjusted standings, individual winning %, etc.
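To make the "with or without" comparison concrete, here is a minimal sketch of the basic calculation (the game-log format and names are just assumptions for the example, not my actual data):

```python
# Minimal "with or without" sketch: team points% with a player in the lineup
# vs. without him. The game-log structure below is an illustrative assumption.

def points_pct(games):
    # 2 points for a win, 1 for a tie/OT loss, 0 for a regulation loss
    pts = sum({"W": 2, "T": 1, "L": 0}[result] for _, result in games)
    return pts / (2 * len(games)) if games else None

def with_without(game_log, player):
    """game_log: list of (dressed_players, result) tuples."""
    with_games = [(p, r) for p, r in game_log if player in p]
    without_games = [(p, r) for p, r in game_log if player not in p]
    return points_pct(with_games), points_pct(without_games)

# toy example
log = [({"Lemieux", "Stevens"}, "W"), ({"Stevens"}, "L"),
       ({"Lemieux", "Stevens"}, "W"), ({"Stevens"}, "T")]
print(with_without(log, "Lemieux"))   # (1.0, 0.25)
```

The "expected" outcome comparison mentioned above would replace the raw points% with the difference between actual and expected results for each game.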

I find a problem with basically all studies here (including my own) in that they require a lot of work and time. There are usually many hours of boring research that have to be done in order to learn about the many factors leading up to the end results. There also seem to basically always be factors "biasing" things, including (of course) "randomness" or "circumstances".
(To use a common example, people often try to determine who produced more impressively between peak Gretzky and peak Mario. We can adjust based on mathematical methods, ending up with adjusted stats. But then we also want to know how much teammates or playing systems affected their stats. And in the end, we just end up with more or less arbitrary "feelings" about who did best.)
So I think "boring" research needs to be done in order to progress, to sort of lay foundations or references to build upon. There are so many more or less automatic assumptions being made, and I think those too need to be closely examined.
 
Last edited by a moderator:

Czech Your Math

I am lizard king
Jan 25, 2006
5,169
303
bohemia
I find a problem with basically all studies here (including my own) in that they require a lot of work and time. There are usually many hours of boring research that have to be done in order to learn about the many factors leading up to the end results. There also seem to basically always be factors "biasing" things, including (of course) "randomness" or "circumstances".

I agree, and that is why I believe the topic, metrics, methodology, and estimated time involved in such studies should be considered and chosen carefully. It is important that the effect being measured, and the metric used in measuring it, are not likely to be overwhelmed by random error and/or factors which can't be removed, quantified, or easily assessed.

If an author wishes feedback on a potential or ongoing study, he/she can always start a thread (and even post a link to such in this thread, while requesting feedback in the thread for the study). The thread might contain the specific results to date and could be used to receive feedback about and discuss the various aspects of the study (topic, metrics, methodologies, etc.), especially those factors which the author believes are complicating the study.

(To use a common example, people often try to determine who produced more impressively between peak Gretzky and peak Mario. We can adjust based on mathematical methods, ending up with adjusted stats. But then we also want to know how much teammates or playing systems affected their stats. And in the end, we just end up with more or less arbitrary "feelings" about who did best.)
So I think "boring" research needs to be done in order to progress, to sort of lay foundations or references to build upon. There are so many more or less automatic assumptions being made, and I think those too need to be closely examined.

Data analysis is obviously limited by the types and quantity of data available. The less the data is able to quantify a variable, the less one is able to analyze that variable. At least data analysis provides a more objective starting point from which non-quantifiable variables can be considered. If the starting point is wrong, the conclusion is much more likely to be wrong.

IMO this thread is best if not used to discuss individual studies (completed, ongoing or potential). However, since you reference a specific, common type of comparison for which math and data analysis are used, let me briefly continue for illustrative purposes only.

Let's say one is comparing Gretzky and Lemieux with "simple adjusted" data (adjusted for league schedule length, GPG and assist/goal ratio), but believes there are many other factors not being assessed. This is how I might approach such a comparison. First, let me say that even if we stop with "simple adjusted" data for the two, we very likely have a better starting point than if we used raw data. One might be tempted to stop there and use mental estimates for other factors in the interest of saving time/effort. However, the constraint is often more one of time/effort than of limits in the data. Eventually, one reaches the point of diminishing returns, where either the time/effort vs. info obtained is too much, and/or the info provided by the data vs. the influence of non-quantifiable factors and random error is too little.
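For readers unfamiliar with "simple adjusted" scoring, here is a minimal sketch of one common form of it; the baseline constants are illustrative assumptions and the exact details vary by source:

```python
# One common form of "simple adjusted" scoring: scale goals for schedule
# length and league scoring level, and assists additionally by the league
# assist/goal ratio. Baseline values here are illustrative assumptions.

def simple_adjust(goals, assists, league_games, league_gpg, league_ag_ratio,
                  base_games=82.0, base_gpg=3.0, base_ag_ratio=1.7):
    schedule = base_games / league_games
    goal_factor = base_gpg / league_gpg
    assist_factor = goal_factor * (base_ag_ratio / league_ag_ratio)
    adj_g = goals * schedule * goal_factor
    adj_a = assists * schedule * assist_factor
    return round(adj_g, 1), round(adj_a, 1), round(adj_g + adj_a, 1)

# toy example: a 70-goal, 100-assist season in an 80-game, high-scoring league
print(simple_adjust(70, 100, league_games=80, league_gpg=4.0, league_ag_ratio=1.7))
```

The point is only that the adjustment is mechanical; the contextual factors below are what the raw and simple-adjusted numbers leave out.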

In such a comparison, there are often ways one could use other data and/or build upon the studies of others to help filter out other factors and refine the comparison.

- League quality: A study of league quality and/or difficulty could assist in this case (it seems we, and others, have studied such things). Specifically, Lemieux probably played more of his prime years in a league of higher quality (although due to his injuries, the differences aren't as drastic), while if using "simple adjusted" data, Gretzky played more during a time when such data is biased against him to some degree (probably due in large part to factors such as scoring being more balanced between lines).

- Competition: The possibility and likely causes of differing competition should be assessed and taken into account somehow if possible. Specifically, Gretzky's final years and much of (what should have been) Lemieux's late prime were impacted by a large group of forwards from outside Canada, which differs from Gretzky's prime years when such impact was relatively minor. The simplest, yet admittedly imperfect way I have found to look at this impact in isolation is the thought experiment: what if there were no non-Canadian (or non-North American) players in the NHL, or player X (the one being studied) were the only one? How would this have affected player X's rankings in various categories? Again, the point is to have a much fairer and better starting point, not to be perfect, since the other choice is much less fair and therefore less useful in considering the impact of this other factor. Without looking at the data, Lemieux should be impacted more by his additional competition, but since both players were often leaders in various categories anyway, I'm not sure if there's much/any difference. This might have been different if Lemieux had played more from '98-01.

- Teammates/Linemates: One can look at how each player performed with various linemates and/or teammates, and how he performed without each/all of them. One can also look at how some/all of those players performed without the player being studied. Specifically, I haven't really looked at this effect in depth.

- Team Playing Style: One can look at team performance in such categories as ESGA (even-strength goals against) as a general, imperfect indicator of whether a team was more open or restrictive in playing style. Of course this metric depends on the quality of the defense/goalie, etc., so it's far from perfect, but at least it may give us some important info. Specifically, Gretzky tended to play on much better overall teams than Lemieux, and I think his teams tended to have lower ESGA (but I'm not certain without looking at the data).

- Overall Impact: One can look at adjusted plus-minus to see how much better each player's team was with him on the ice at even strength than without him. Specifically, Gretzky's adj. +/- is better than Lemieux's, but this becomes complicated by the fact that they had unusual comparisons (Jagr, one of the leaders in this metric, and Messier, who often performed poorly in this metric in large part due to having Gretzky as part of his "off ice"). Also, although of limited value, the win% of the team with and without the player can be examined (if calculated properly). Specifically, Lemieux's teams performed much, much worse without him than with him. This is complicated by the fact that he missed the majority of some seasons (which makes the "with" component much less reliable). Gretzky didn't miss enough games during his prime to have even a decent sample of games from which to assess his overall impact.
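For anyone unfamiliar with the on-ice/off-ice idea behind such numbers, a bare-bones version looks something like this (toy numbers; the adjusted +/- figures referenced above involve considerably more refinement than this):

```python
# Bare-bones on-ice vs. off-ice comparison: the team's even-strength goal
# differential per 60 minutes with the player on the ice vs. with him on the
# bench. Real adjusted +/- methods add further corrections; this is only the
# core comparison, and all numbers below are invented.

def per60(gf, ga, minutes):
    return (gf - ga) * 60.0 / minutes

def on_off(on_gf, on_ga, on_min, off_gf, off_ga, off_min):
    on_rate = per60(on_gf, on_ga, on_min)
    off_rate = per60(off_gf, off_ga, off_min)
    return round(on_rate, 2), round(off_rate, 2), round(on_rate - off_rate, 2)

print(on_off(on_gf=90, on_ga=60, on_min=1500,
             off_gf=150, off_ga=160, off_min=3200))   # (1.2, -0.19, 1.39)
```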

- Playoff/International: The importance of this is often overemphasized in proportion to regular season performance. I'm not necessarily making a judgement, but simply saying that many/most people give it much more importance in proportion to the number of games played. There are some factors that often seem to be mostly or completely neglected when using this metric. First, while most at least know about adjusted numbers, they often cite actual playoff data which is unadjusted and therefore difficult to compare across different periods. Second, if career numbers are used, the proportion of playoff games played during a player's peak/prime vs. career can vary dramatically. Third, while differences in strength of schedule during the regular season are relatively small, the differences in playoff opponents are much larger. A player on a very strong team will still generally face teams which are worse than his, but on average the playoff opponents will be better than the team's regular season opponents. However, a player on a weaker (in playoff terms) or mediocre team (in regular season terms) will almost always be facing superior opposition, so his performance can generally be expected to be significantly worse than that of a player on a significantly better team. For instance, Dionne and Kariya are often criticized for their playoff performances, yet they were likely the underdogs in most cases, so their playoff performance would generally be expected to be worse. They also generally can less afford to rest during the regular season, lest their team miss the playoffs. Specifically, without looking at the data more in-depth at present, but using my own adjusted playoff data, I think Gretzky performed slightly better on a prime/career basis, but given the generally better teams he played for, it's difficult to distinguish between them.
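As a purely illustrative way to put a number on that strength-of-schedule point (not a method used in this thread; all figures below are invented):

```python
# Crude illustration of playoff strength of schedule: weight a player's
# per-series scoring by how stingy each opponent was relative to the league.
# All numbers are made up for illustration.

def adjusted_playoff_ppg(series_results, league_gpg):
    """series_results: list of (points, games, opponent_GA_per_game)."""
    adj_points, games = 0.0, 0
    for points, gp, opp_ga in series_results:
        factor = league_gpg / opp_ga   # >1 vs. a stingy opponent, <1 vs. a leaky one
        adj_points += points * factor
        games += gp
    return round(adj_points / games, 2)

# toy example: 3 points in 6 games vs. a stingy team, 10 in 5 vs. a weak one
print(adjusted_playoff_ppg([(3, 6, 2.4), (10, 5, 3.6)], league_gpg=3.0))   # 1.1
```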

- Trophies & Voting: This is another factor often cited by people when evaluating players. What most don't acknowledge is that it is simply quantifying the opinions of alleged "experts." Just how much importance should be given to the opinions of others, given that their choices are often difficult to explain and their credentials may vary substantially? The source data is simply the opinions of a select group (most often sportswriters). Quantifying this does not change that fact. While some interesting work has been done in this field (such as HockeyOutsider's Hart & Norris shares, which place emphasis on how often and/or how close a player was to the top, rather than simple trophy counting), the source data is completely subjective and this needs to be remembered.

In summary, while there are usually limits on the availability of and information provided by various data, we often assume that data cannot be used to evaluate various, seemingly non-quantifiable factors, rather than finding a way to properly use what data is available to shed further light on the factor being considered. The goal IMO isn't to create some grand unified theory of hockey, but to attempt to provide a more objective starting point for further subjective discussion. It's for the individual to decide whether quantifying various categories of performance and/or contextual factors is worth the time and effort involved, but we must be careful about declaring such factors completely unquantifiable and otherwise resorting to completely subjective means of analyzing performance and providing context.
 
Last edited:

kmad

riot survivor
Jun 16, 2003
34,133
61
Vancouver
I'll toss up a weak first opinion, to echo what I said in the Introduce Yourself thread:

I don't think it can. It's not segmented like baseball or football. It's too fluid, too complicated, and there are far too many variables to consider. For each statistic shown, it has to be given at least two layers of context to have any meaning. Usually far more.

Thoughts?
 

Doctor No

Registered User
Oct 26, 2005
9,250
3,971
hockeygoalies.org
My answer to those who think that it can't be: you're trying to measure too much all at once (or equivalently, you're trying to take too large of a bite).

Start by measuring things that you can measure.

Or work on measuring something better - even in baseball, you can't get a perfectly isolated statistic.

Perfection is the enemy of good. If you know that you can't achieve "perfection", and that stops you from aiming for "good", then we lose in the process.
 

Polansky

Registered User
Apr 5, 2007
547
37
I'll toss up a weak first opinion, to echo what I said in the Introduce Yourself thread:

I don't think it can. It's not segmented like baseball or football. It's too fluid, too complicated, and there are far too many variables to consider. For each statistic shown, it has to be given at least two layers of context to have any meaning. Usually far more.

Thoughts?

I also think it will be very difficult.

Just imagine the following simple situation. Patrick Kane comes down the left side of the rink, the defence plays him perfectly, they keep him to the outside and allow a weak shot from almost the corner. A stick even gets in there causing the puck to slide weakly on the ice. Great defence from all five players on the ice. Oh no! The puck goes in. Game over! Stanley Cup goes to Chicago.

In real life, a team played good defence and really worked hard to keep a dynamic player to the outside, but on the stat sheet, all it says about that shift is one goal against.

I think finding a way to take that situation and fairly put it into a statistic is going to be really difficult. That goal is entirely on the goalie, but there is no way to know that unless we have humans rating the difficulty of each goal, and as shown by UZR in baseball, the second humans have to rate something, there are problems.
 

metalfoot

Karlsson!
Dec 21, 2007
1,575
2
Manitoba, Canada
It can be mathematically modelled, but only to the same extent as, say, weather can. As previous posters noted, there are a lot of variables in hockey, more so than in, say, football or baseball. As research indicates which variables are more important and which are less, the value of the respective statistics becomes clearer. But there is a lot of variability which is hard to break into discrete, measurable chunks, in my opinion.
 

Czech Your Math

I am lizard king
Jan 25, 2006
5,169
303
bohemia
My answer to those who think that it can't be: you're trying to measure too much all at once (or equivalently, you're trying to take too large of a bite).

Start by measuring things that you can measure.

Or work on measuring something better - even in baseball, you can't get a perfectly isolated statistic.

Perfection is the enemy of good. If you know that you can't achieve "perfection", and that stops you from aiming for "good", then we lose in the process.

I agree with this. Throwing the baby out with the bathwater doesn't solve anything. Otherwise, we all go back to arguing based solely on what we (think we) saw.

As far as measuring everything at once, I see two approaches:

- using linear regression, which assigns numerical importance (coefficients) to competing factors without having to try to integrate measurements for multiple factors simultaneously (see the sketch after this list)

- designing the study in such a way that it essentially filters out many factors which are difficult to measure and/or unknown
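A minimal sketch of the first approach, with invented factor names and toy data rather than a real study:

```python
# Minimal regression sketch: let least squares assign coefficients to
# competing factors instead of weighting them by hand. The factors and data
# are invented purely for illustration.
import numpy as np

rng = np.random.default_rng(0)
n = 200
shot_diff = rng.normal(0, 5, n)    # per-game shot differential
pp_diff = rng.normal(0, 2, n)      # power-play opportunity differential
goal_diff = 0.15 * shot_diff + 0.3 * pp_diff + rng.normal(0, 1, n)

X = np.column_stack([shot_diff, pp_diff, np.ones(n)])
coefs, *_ = np.linalg.lstsq(X, goal_diff, rcond=None)
print(coefs)   # recovers roughly [0.15, 0.3, ~0]
```

The second approach doesn't lend itself to a code sketch: it's about choosing comparisons so that hard-to-measure factors stay roughly constant and cancel out.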
 

cptjeff

Reprehensible User
Sep 18, 2008
20,531
34,478
Washington, DC.
I also think it will be very difficult.

Just imagine the following simple situation. Patrick Kane comes down the left side of the rink, the defence plays him perfectly, they keep him to the outside and allow a weak shot from almost the corner. A stick even gets in there causing the puck to slide weakly on the ice. Great defence from all five players on the ice. Oh no! The puck goes in. Game over! Stanley Cup goes to Chicago.

In real life, a team played good defence and really worked hard to keep a dynamic player to the outside, but on the stat sheet, all it says about that shift is one goal against.

I think finding a way to take that situation and fairly put it into a statistic is going to be really difficult. That goal is entirely on the goalie, but there is no way to know that unless we have humans rating the difficulty of each goal, and as shown by UZR in baseball, the second humans have to rate something, there are problems.

Another issue is the relative paucity of quantifiable events. When you're manipulating stats, it's best to go off hard data where possible. In baseball, you get a pretty solid data point with every pitch. Yes, it's not absent complicating factors, but there are far, far fewer than there are in hockey. For every shot, there's about a minute and a half of play in an average hockey game, much of which can't be quantified. And the shots themselves are wildly different in situation and quality. It's difficult to factor that in.

When you build advanced stats on a foundation so weak, your uncertainty multiplies, and in some of the stuff I've seen trotted out, the uncertainty is multiplied to the point where the stat is utterly useless.

Some of the stats may be valuable, but I'm very dubious of the category as a whole. Going back to baseball again, with those advanced stats the external factors to each result can be assumed to cancel out over time or be so small as to be insignificant over a large sample. Think Newton's Laws and relativity. Yes, relativity always applies, but until you hit .1c, you can just zero out the effect with no consequence unless you're using more than something like 5 significant figures. That's baseball. The external stuff can be worked around or ignored without much consequence. With hockey, the factors you can't quantify are much more significant. When you have 20 significant variables and only 5 data points, you can't solve the equation. It's simply impossible, but that's what a lot of these advanced stats try to do, and more often than not they seem to do it by simply deleting 14 of the variables and pretending that they don't exist.
 

Smokey McCanucks

PuckDaddy "Perfect HFBoard Trade Proposal 02/24/14
Dec 21, 2010
3,165
283
Intangibles, clutchness, will to win, these are not things that can be statistically defined. Often the players who become heroes in the playoffs do so precisely because there's no reason to think based on their stats and their career trajectories that they will come through in a huge moment.

Would statistical analysis tell you that Dustin Penner would be one of the most key guys in the playoffs this season? The stats would probably tell you Penner sucks. But he came through when it mattered. What about a Max Talbot, or a Sean Bergenheim, or a Claude Lemieux, or a Bill Barilko, guys like that are legends (not Bergenheim but he totally came out of nowhere that one year, you know what I mean)... precisely because the immense value of their contributions was surprising and unexpected, because their lackluster overall stats didn't matter and they stepped it up when it really mattered. That's a part of hockey, it's intangible and it transcends "Sabremetric" statistical analysis, it exists on another level, another plane.
 
Last edited:

Hynh

Registered User
Jun 19, 2012
6,170
5,345
For every shot, there's about a minute and a half of play in an average hockey game, much of which hasn't yet been quantified.

Fixed that for you. Although I think your estimate of 90s per shot is high. That's only 40 shots per game, a number often passed by a single team. Even the St. Louis-LA series averaged more than 40 shots per game and those teams are famed for their low shots against totals.
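For reference, the arithmetic behind that quibble (the combined shot totals are rough assumptions, not measured figures):

```python
# Back-of-the-envelope check: seconds of regulation play per combined shot
# on goal, for a few assumed combined shot totals.
game_seconds = 60 * 60   # 60 minutes of regulation
for combined_shots in (40, 55, 60):
    print(combined_shots, "combined shots ->", round(game_seconds / combined_shots), "s per shot")
# 90 s per shot corresponds to only 40 combined shots per game, which is low.
```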

Some of the stats may be valuable, but I'm very dubious of the category as a whole.

Then it's a good thing it is improving. People are constantly finding new ways to quantify a player/team. Traditional hockey analysis (good 'ol Tranna boy, not some gutless yuropeen *slams hands on desk*) is static. It doesn't change. It's out of date and doesn't reflect the reality of the game.

Think Newton's Laws and relativity. Yes, relativity always applies, but until you hit .1c, you can just zero out the effect with no consequence unless you're using more than something like 5 significant figures. That's baseball.

Baseball is Newtonian physics. Simple and easy to measure. Throw ball up, ball falls down. Hockey is quantum mechanics. While it may be more difficult, it can be figured out.



I don't understand how you have to choose to either take all of the stats or none of the stats. Too often I see anti-stat folk discarding perfectly accurate and reasoned stats because they say mean things about their favorite player/team, then saying all stats suck.

I prefer stats that start by looking at things we know to be true, yet don't measure. Zone starts are a great example. Every hockey fan understands that starting in your own zone is harder than starting in the opponent's zone. People are now moving on to zone exits and zone entries. We understand that a player able to break out of his own zone is a good player, so tracking who these players are helps us understand their relative value.
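For anyone unfamiliar, the zone start number itself is simple to compute; a minimal sketch with invented counts:

```python
# Offensive zone start percentage: share of a player's shift-starting
# faceoffs taken in the offensive zone (neutral-zone starts are commonly
# excluded). Counts below are invented for illustration.

def oz_start_pct(oz_faceoffs, dz_faceoffs):
    return oz_faceoffs / (oz_faceoffs + dz_faceoffs)

print(round(oz_start_pct(oz_faceoffs=320, dz_faceoffs=180), 3))   # 0.64
```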
 

almostawake

Registered User
Jan 19, 2006
4,805
620
Lausanne
I'm quite certain that it can be, to a large degree. A lot of the 'issues' that posters have identified in this thread aren't really issues at all.

The general approach for isolating a player's on-ice contribution is regression analysis. A problem with this approach is that you need an opposition ensemble that has a wide range of player quality. But this is not so difficult; for most players you can come up with a decent ensemble. The real killer with hockey is that players play with the same teammates so much. Basically what I'm getting at is that if a player is only on the ice with the same four teammates over a full season, it is statistically impossible to decouple that player's contribution from the other four's.
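A toy demonstration of that decoupling problem, with invented shift data:

```python
# If two players are on the ice for exactly the same shifts, their indicator
# columns in the regression design matrix are identical, so no amount of data
# lets the model attribute the results to one rather than the other.
import numpy as np

rng = np.random.default_rng(1)
n_shifts = 1000
player_a = rng.integers(0, 2, n_shifts)     # 1 = on ice for that shift
player_b = player_a.copy()                  # always on with player A
others = rng.integers(0, 2, (n_shifts, 3))  # three other skaters, varied usage

X = np.column_stack([player_a, player_b, others])
print(np.linalg.matrix_rank(X), "independent columns out of", X.shape[1])   # 4 of 5
# One column is redundant, so A's and B's coefficients are not separately
# identifiable no matter how many shifts are observed.
```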

Another trend that seems to be emerging on this board is that people focus only on the currently available data. Even in baseball the amount of raw data has exploded in the last few years: player tracking, pitch tracking, etc. Typical SABR stuff is about re-arranging a bunch of statistics to come up with a number that correlates well with a desirable outcome. But even baseball has moved way beyond this in the last few years. Now the game is about analytics. The quantitative guys being hired in MLB front offices these days aren't working with OPS, WAR, etc. They're developing and exploiting new data streams. Hockey is headed this direction too. Anything that can be done likely will be.

Just as a side note, I find it kind of funny that American football has been brought up a few times in this thread. No sport is more difficult to quantitatively analyse: short seasons mean there's little data, plus extreme position specialization, the O-line, etc. If you want to talk about a sport that's exceptionally ill-suited for statistical analysis, it's football.
 

cptjeff

Reprehensible User
Sep 18, 2008
20,531
34,478
Washington, DC.
Then it's a good thing it is improving. People are constantly finding new ways to quantify a player/team. Traditional hockey analysis (good 'ol Tranna boy, not some gutless yuropeen *slams hands on desk*) is static. It doesn't change. It's out of date and doesn't reflect the reality of the game.

That's not traditional hockey analysis. That's Don Cherry; scouts aren't that stupid. Good, insightful analysis from a scout is very different, and might look something like this:

X defenseman has good gap control, and is highly skilled at angling forwards to low percentage areas. His stickwork is weak, and he's overly prone to trying for pokechecks, though he shows remarkable recovery ability when those attempts fail. Good at blocking shots low, but often puts himself out of position when trying to block shots further away from the net. Has good poise with the puck, and passes, while uncreative and somewhat predictable, are usually executed well. Decent skating with an average top speed, above average mobility and acceleration. Strong, but not a great hitter. Proficient at pinning opposing players to the board, allowing teammates to recover pucks, but takes more holding penalties than he should for doing it. On offense, capable but not spectacular with passes, and possessing an average, but accurate shot that usually stays low. Occasionally has difficulty holding pucks on his right side in the zone, though far above average on his stick side. Pinches, while infrequent, never seem to be inappropriate, and he rarely ranges past the top of the circles.

Note that I'm not a professional scout and that's modeled after no particular player, just a generic higher end 2nd pairing defenseman.

But you get a pretty good picture of the player, and his strengths and weaknesses from that kind of report, and you get a lot of details that stats wouldn't give you, which allows you to determine if the player would be a good fit. The player I describe doesn't have the strongest transition game, so if your team focuses on that, maybe you don't trade for him. If, on the other hand, you like a defense that reads plays well and minimizes high risk shots, he'd be an excellent fit. If you like a physical, shutdown defense that gives up very few shots and hits really well, again, not a good fit. If you like a defense that engages in the offensive zone, again, maybe not a player you like so much. If you like your D to be more responsible even if it costs you a few offensive chances, the fact that he doesn't go low and is really smart about pinches is a big plus.

Stats, even advanced stats, don't tell you stuff like that. They may be interesting when negotiating a contract and comparing a player's effectiveness league-wide, but they don't tell you how any given player fits the role you want him to play, or how he'll mesh with your coach's style or his linemates. But a scout can tell you all of those things. It's not either/or, but there's a lot more to think about than just whether the player has a good zone start ratio or a bad one.
 

metalfoot

Karlsson!
Dec 21, 2007
1,575
2
Manitoba, Canada
I'm wondering how you would quantify things like good decisions, right times to pinch, backchecking capabilities, and such. All important parts of the game, but how do you quantify them to analyze them?
 

Bear of Bad News

Your Third or Fourth Favorite HFBoards Admin
Sep 27, 2005
13,493
26,828
I'm wondering how you would quantify things like good decisions, right times to pinch, backchecking capabilities, and such. All important parts of the game, but how do you quantify them to analyze them?

If they're truly good decisions, then they lead to good results (which can hopefully be measured).
 

kmad

riot survivor
Jun 16, 2003
34,133
61
Vancouver
If they're truly good decisions, then they lead to good results (which can hopefully be measured).

The best decision you can make with the information you have doesn't always net you the best end result.

Case in point: Can you name me one team that wouldn't have drafted Alexandre Daigle first overall?
 

Doctor No

Registered User
Oct 26, 2005
9,250
3,971
hockeygoalies.org
The best decision you can make with the information you have doesn't always net you the best end result.

I never claimed that the best decision always leads to the best result (so I'm not sure why you're implying otherwise). Hopefully that was implied in my post above. Over the long run, the best decision will usually lead to a better outcome.

If you're trying to compel me with Alexandre Daigle, rest assured that I'm aware of sample size considerations. In university, I played hockey with a guy who never stretched before games because he knew a guy one time who stretched and then got hurt in the game.
 

kmad

riot survivor
Jun 16, 2003
34,133
61
Vancouver
I never claimed that the best decision always leads to the best result (so I'm not sure why you're implying otherwise). Hopefully that was implied in my post above. Over the long run, the best decision will usually lead to a better outcome.

You did say:

If they're truly good decisions, then they lead to good results

A good decision can only be measured by how good of a decision it was given the information available at the time. Good results don't always follow.
 

Iain Fyffe

Hockey fact-checker
Would statistical analysis tell you that Dustin Penner would be one of the most key guys in the playoffs this season? The stats would probably tell you Penner sucks. But he came through when it mattered.
The crucial point here is: no one could have told you that. Some people might have predicted it, but people are often subject to confirmation bias, where they tout their successful predictions and ignore their incorrect ones. Stats couldn't tell you that Penner would perform well in the 2012 playoffs? Well, neither could anything else.

All players have peaks and valleys in performance. Those that have peaks in the playoffs are called "clutch." Those that have valleys are called "chokers," even if these peaks and valleys are randomly distributed. That's one of the biggest differences in points of view between statistical and traditional analysis. Traditional analysis tries to attach meaning to every variance in performance (AKA building narratives from randomness). Statistical analysis looks at trends in performance level, and sees peaks and valleys everywhere, with no particular pattern. Traditional analysis loves streaks and slumps; statistical analysis sees little value in them.

One reason that statistical analysis cannot define clutchiness is that there is little to no evidence that it actually exists. Players who are clutch one year are suddenly not clutch the next. Cam Ward is so clutch that he usually can't even get his team into the playoffs. If it were a real thing, it would be observable and predictable, and not rely on ex post facto explanation. If clutchiness can only be identified after the fact, chances are you're attaching meaning to patterns that are not necessarily real patterns.

Because that's another thing the human mind is seriously prone to: seeing patterns in random noise. It's often called apophenia or patternicity. Ask a person to create a random series of As and Bs and they will almost always alternate too much and avoid long runs, whereas an actual random series will include long stretches of As or Bs, which a human brain will see as a pattern, even though there isn't one.

Here's a random sequence of 100 I created with an online tool:

babbabaaabbbaaaabbaaaabbaaaababbbababaaabbaabbabbbabbbbababbabbbabbbaaabbaabababbaaabababaabbbbbbbaa

Note the seven consecutive Bs near the end, which traditional analysis calls a streak, and if it occurred in the playoffs would be seen as clutch performance (if B is good) or choking (if B is bad). There's only one sequence where it goes ABABABA, otherwise it's groups of the same letters for the most part. There are three runs of four consecutive As, very close to each other, resulting in a "streak" of 12 of 16 letters that are A. There's also a run where 16 of 21 letters are Bs.

There are all kinds of streaks and slumps here, and yet the sequence is randomly generated by a computer. Traditional analysis assigns meaning to streaks and slumps. Statistical analysis sees normal variance in performance.

Here's the next one I did (I'm using this tool BTW):

baabababaaaaaababbbabbaababaaaabbbbbbbaaaaaababbbbbbaabaabbabbbbaaabbbbbbbbbbbbbbaabbaaaaabbbbbabaaa

55 Bs and 45 As (the first was 53 and 47, this player must be good at getting Bs), and a series of 14 consecutive Bs at one point. Also four As followed by six Bs then five As. Patterns and streaks everywhere, and yet it's actually random.

These are just the first two series I created. Further ones might "look" a little more random than this last one, but not much more. The fact is, most people don't really comprehend what "random" means, even if they think they do.
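If you want to reproduce the effect yourself, a few lines of code will do it (this is a stand-in for the online tool, not the same one):

```python
# Generate random A/B sequences and report the longest run in each. Long
# "streaks" appear routinely even though nothing but chance is involved.
import random
from itertools import groupby

random.seed(42)
for trial in range(5):
    seq = "".join(random.choice("AB") for _ in range(100))
    longest = max(len(list(run)) for _, run in groupby(seq))
    print(f"trial {trial}: longest run = {longest}")
```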

That's a part of hockey, it's intangible and it transcends "Sabremetric" statistical analysis, it exists on another level, another plane.
And so it belongs, perhaps, in another forum?

Case in point: Can you name me one team that wouldn't have drafted Alexandre Daigle first overall?
Indeed, both statistical analysis and the scouts agreed that Daigle was the best bet that year. But things like drafting are just that: bets. You can find the players most likely to succeed in the NHL, but no more than "most likely".

Was drafting Daigle a bad idea? Probably not. If there had been better, consistent psychological profiling of prospects, maybe there would have been a clue. But then, maybe not. Professional scouts miss all the time on draft picks. There's just no way of ever predicting the future with 100% accuracy.
 

Canadiens1958

Registered User
Nov 30, 2007
20,020
2,778
Lake Memphremagog, QC.
Clutchness

Intangibles, clutchness, will to win, these are not things that can be statistically defined. Often the players who become heroes in the playoffs do so precisely because there's no reason to think based on their stats and their career trajectories that they will come through in a huge moment.

Would statistical analysis tell you that Dustin Penner would be one of the most key guys in the playoffs this season? The stats would probably tell you Penner sucks. But he came through when it mattered. What about a Max Talbot, or a Sean Bergenheim, or a Claude Lemieux, or a Bill Barilko, guys like that are legends (not Bergenheim but he totally came out of nowhere that one year, you know what I mean)... precisely because the immense value of their contributions was surprising and unexpected, because their lackluster overall stats didn't matter and they stepped it up when it really mattered. That's a part of hockey, it's intangible and it transcends "Sabremetric" statistical analysis, it exists on another level, another plane.

Clutchness is misunderstood and misrepresented.

Example: a goalie has strengths and weaknesses. Over the course of a season, playing against the full range of league teams, the goalie will have strong games and weak games. The same goalie in a seven-game series will look great against a team that cannot exploit his weaknesses and very average against a team that can. String together four such great series and a Stanley Cup victory is possible. The next season or playoffs, the same goalie reverts to form and may have a weak playoff because he faces opposing teams that can exploit his weaknesses. See Michael Leighton.

Likewise for forwards and defensemen. If the opposition in the playoffs cannot play to the player's weaknesses, the player will look great; otherwise he will look weak.
 

Iain Fyffe

Hockey fact-checker
Don't you think some players start to care and try harder in the playoffs? Not really 'clutchness', but it can't be attributed to randomness either.
Maybe, but until someone can demonstrate that that is true, rather than simply assert that it is true, it shouldn't be part of the analysis. All we know for a fact is that the same player plays in the regular season and in the playoffs. If you want to assert that there is something inherently and predictably different about this player's ability to perform based on what time of year it is, you need to provide evidence. Otherwise you fail to show anything that cannot be explained by normal variance in game-to-game performance.

Claude Lemieux scored .34 goals per game in the playoffs, and .31 in the regular season. He had some big playoff years, but also years where he scored 1 in 11, 3 in 19 and 4 in 23. When looking at players with clutch reputations, proponents tend to focus on the data that supports the assertion while downplaying that which does not. It's classic confirmation bias: I believe that clutch play exists, therefore I will find and present data to support it. It's starting with the conclusion and then looking for supporting evidence. You need to look at the evidence first, and derive your conclusion from that.
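To put a rough number on how small that gap is (a back-of-the-envelope check; the playoff game count is approximate and goals are treated as a simple Poisson process, which is itself a simplification):

```python
# Rough check of whether .34 playoff vs. .31 regular-season goals per game is
# a meaningful gap, given the playoff sample size. Game count is approximate.
import math

reg_rate, playoff_rate = 0.31, 0.34
playoff_games = 230   # approximate career playoff games, assumed for the example
se = math.sqrt(reg_rate / playoff_games)   # std. error of a Poisson rate over that sample
print(f"gap = {playoff_rate - reg_rate:.3f}, standard error ~ {se:.3f}")
# The gap (~0.03) is smaller than one standard error (~0.037), i.e. the kind
# of difference ordinary game-to-game variance produces on its own.
```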

And then, moving beyond all that, is the underlying value judgement. If certain players try harder in the playoffs, that means they are not giving 100% in the regular season. Yet they are held up as heroes, while those who give the same effort year-round are not.
 

Iain Fyffe

Hockey fact-checker
Likewise for forwards and defensemen. If the opposition in the playoffs cannot play to the player's weaknesses, the player will look great; otherwise he will look weak.
When people talk about clutchiness, they generally don't mean matchups, though those certainly have an effect. But you're once again falling into the trap of saying that everything can be explained by matchups, and leaving no room whatever for normal variance. We've been down this road before, of course. Not everything can be explained by Team A having a preponderance of right-handed centres.
 

Canadiens1958

Registered User
Nov 30, 2007
20,020
2,778
Lake Memphremagog, QC.
No

When people talk about clutchiness, they generally don't mean matchups, though those certainly have an effect. But you're once again falling into the trap of saying that everything can be explained by matchups, and leaving no room whatever for normal variance. We've been down this road before, of course. Not everything can be explained by Team A having a preponderance of right-handed centres.

No, you are simply drifting towards your own bias in favour of variance and trying to narrow the issue down to matchups.

A goalie's performance may be measured by SV% in a global manner, as is often the case on HF Boards. Or you can extend the SV% analysis the way the teams do it, factoring in the provenance of the shot by zone, the location of the shot (which part of the net) and the type of offence faced (perimeter, rush, crash the net, east/west movement, etc.). Then match the results against the opposition's results: where do they score their goals from, what types of shots, what type of offence, etc. Looking at all aspects of the data advances understanding.

For variance to be a factor, it would have to be shown that from season to season a goalie's global SV% is relatively constant, but the component sub-SV%s vary greatly. His performance on breakaways may vary by 25-30% from season to season, by 40% on shots from the LW (strong one season/weak the next) vs. the RW (weak one season/strong the next), etc.
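A sketch of what that season-to-season check might look like in practice (the shot categories and counts are invented; real team data would have many more splits):

```python
# Split save percentage check: compute SV% by shot category per season and
# see how much each component swings while the overall number stays stable.
# All shot counts below are invented for illustration.

def sv_pct(saves, shots):
    return saves / shots

seasons = {
    "Season 1": {"breakaway": (38, 50), "LW rush": (180, 200), "RW rush": (165, 200)},
    "Season 2": {"breakaway": (30, 50), "LW rush": (176, 200), "RW rush": (177, 200)},
}
for season, splits in seasons.items():
    total_saves = sum(s for s, _ in splits.values())
    total_shots = sum(n for _, n in splits.values())
    parts = ", ".join(f"{k}: {sv_pct(*v):.3f}" for k, v in splits.items())
    print(f"{season}  overall {sv_pct(total_saves, total_shots):.3f}  ({parts})")
# Both seasons come out at .851 overall, while the breakaway and LW/RW splits
# swing noticeably, which is the pattern described above.
```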
 
