TomasHertlsRooster
Don’t say eye test when you mean points
Since the start of the 2007-08 season, there have been 180 NHL playoff series: 15 per season over 12 seasons.
The team with the higher place in the regular season standings has won 99 of those series. (55%)
The team with the higher regular season team GAR has won 108 of those series. (60%)
In other words, if you had picked the winner of every playoff series based on nothing but regular season standings rankings, you would've called 9 more series correctly than if you had flipped a coin for every series and landed on exactly half (90 of 180). If you had picked based on nothing but regular season team GAR, you would've called 9 more series correctly than the standings method, and 18 more than the 50% coin flip.
To put it more simply: the gap between using team GAR and using standings rankings is the same size as the gap between using standings rankings and flipping a coin. If we wrote Team GAR >> Standings Rankings >> Coin Flips, both ">>" gaps would be exactly the same size.
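The arithmetic behind those gaps is simple enough to check:

```python
# Tallying the 180 playoff series since 2007-08 from the post above.
series = 180
coin_flip = series // 2      # 90 correct if a coin landed on exactly half
standings = 99               # the higher-seeded team won 99 series
team_gar = 108               # the higher team-GAR side won 108 series

print(standings - coin_flip)         # 9: standings over coin flips
print(team_gar - standings)          # 9: team GAR over standings
print(round(team_gar / series, 2))   # 0.6: GAR's hit rate
```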
GAR is built to correlate closely with a team's performance, and it does: just under three quarters of the variance in a team's standings points since 2007-08 can be explained by its team GAR. Here is what this looks like:
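For what it's worth, that "share of standings points explained" figure is an R² value. A toy sketch of the computation (the numbers below are invented for illustration, not real league data):

```python
import numpy as np

# Hypothetical team GAR totals and standings points for eight made-up teams.
gar    = np.array([55.0, 40.0, 30.0, 20.0, 10.0, 0.0, -10.0, -20.0])
points = np.array([112.0, 104.0, 98.0, 93.0, 88.0, 80.0, 78.0, 70.0])

# R^2 of a linear relationship: square of the Pearson correlation.
r = np.corrcoef(gar, points)[0, 1]
r_squared = r ** 2   # share of standings-point variance explained by GAR
print(round(r_squared, 3))
```

With the real league data, this value would come out just under 0.75 per the claim above; the toy numbers here are only meant to show the calculation.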
So, why does GAR do a better job of predicting playoff series winners than standings points? Because, although the two track each other closely, GAR adjusts for context. I won't go into too much detail on how GAR does this, but to put it quite simply: it uses a regression to estimate how contextual factors such as opposition, rest, score state, and venue play into a team's results, and then subtracts the impact of those factors from the raw results.
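The adjustment described above can be sketched as an ordinary least-squares fit: regress raw results on contextual factors, then subtract the context-predicted portion. The factor names and numbers here are hypothetical placeholders, not the actual GAR model's inputs:

```python
import numpy as np

# Each row is one game: [home_ice (1/0), rest_days, opponent_strength]
context = np.array([
    [1, 2, 0.6],
    [0, 1, 0.4],
    [1, 3, 0.5],
    [0, 2, 0.7],
    [1, 1, 0.3],
])
goal_diff = np.array([2.0, -1.0, 1.0, -2.0, 3.0])   # raw per-game results

# Least-squares fit: how much of each result does context alone predict?
X = np.column_stack([np.ones(len(context)), context])   # add an intercept
coefs, *_ = np.linalg.lstsq(X, goal_diff, rcond=None)

context_effect = X @ coefs                # results predicted by context alone
adjusted = goal_diff - context_effect     # context-adjusted residual
print(np.round(adjusted, 2))
```

The residual that's left over after subtracting the context-predicted portion is the part attributed to the team itself; the real model works on far more data and factors, but the subtraction step is the same idea.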
As a thought experiment, let's say we took two identical teams full of replacement-level players, and replaced the Vegas Golden Knights with one of them, and the New Jersey Devils with the other. The team that replaced Vegas would do much better in the standings. Why? Because in that scenario, they are replacing the strongest team in the weakest division, while the team that replaced New Jersey is replacing the weakest team in the strongest division. The team that replaced Vegas might finish with 5, maybe 10 more points. But would that make them a better team? No! The teams are f***ing identical! This is why context is very important, and this is why a metric like GAR is better at predicting the winner of a playoff series than a metric like standings points. Both teams would have a team GAR of 0, but very different standings points.
I probably didn't need to explain what context means or why it's important; just about all hockey fans have an opinion on context and believe it should be adjusted for. By page 5 of every thread comparing two players with similar accomplishments and accolades, the discussion shifts from individual performance to each player's linemates and context. Find an Auston Matthews vs. Leon Draisaitl thread from last summer, and by page 5 the discussion will have shifted to fans downplaying their own players in a comparison between Rieder/Lucic/McDavid and Hyman/Nylander/Kapanen. Taylor Hall won the Hart Trophy in 2017-2018 not because he had the most points, but because he had the largest gap in points between himself and his next-closest teammate, and his weak team made the playoffs. Back in the 1940s, fans probably argued that Toe Blake wouldn't have scored as many points if he weren't lining up next to Elmer Lach and Maurice Richard on Montreal's top line.
This is all by way of saying that the concept of adjusting for context is not new. We can pretty much all agree that some sort of context should be considered when assessing players and teams. But the concept of using regressions to objectively adjust for context is very new, and it draws a ton of ire from more traditionalist hockey fans who prefer more arbitrary, subjective methods of doing so. In fact, I never see people shout "but context!" more often than when metrics like GAR and RAPM are brought up - even though these metrics actually do adjust for context, unlike metrics such as points, which are never held to that standard. I don't know if this is because people don't understand that GAR/RAPM adjust for context, or because they just want an easy excuse to discredit whatever assessment came alongside them, but it's rather misguided to slam these metrics for not including context. In reality, these metrics require the least additional context, because they already adjust for so much of it.
It's worth noting that everything I just looked at was at the team level. At the individual player level, things get a bit more dicey, and I think even the creators of GAR, along with its biggest proponents, would tell you that it does not perfectly distribute credit among individual players. For example, the top-6 players in PPO GAR (power play offense) are McDavid, Chiasson, Draisaitl, Bergeron, Pastrnak, and DeBrusk. Is it likely that Chiasson and DeBrusk, whose PP scoring rates rank 65th and 164th, respectively, are two of the six most positively impactful PP players in the league? No, probably not. It's far more likely that the model isn't giving enough PP credit to guys like McDavid, Draisaitl, and Pastrnak - whose respective PP scoring rates rank 1st through 3rd. At the team level, Edmonton and Boston rank 1st and 2nd in PP GAR, so it's likely that they have some of the league's best PP players - the credit just isn't perfectly distributed among them by GAR.
However, that doesn't mean GAR should be discredited entirely for individual players. The issues with credit distribution are more prominent on the power play than at even strength for a few reasons, and I went out of my way to pull out the wonkiest GAR results that I don't agree with. By understanding how GAR is calculated, it's easier to make subjective assessments of where GAR results may differ from a player's actual impact. And since GAR gives us a good picture of team quality, teams full of players with strong GAR will be good teams, and vice versa; so the players whose GAR is not at all indicative of their level of performance are more likely to be a few outliers who have offsetting outlier teammates at the other end of the spectrum. It's not like this year's Detroit Red Wings are going to be full of players with good GAR, or last year's Tampa Bay Lightning full of players with bad GAR, but the contextual adjustments made by GAR will show that a few players on those teams (Larkin, Mantha, Koekkoek, Schenn) do fit that bill. If a player's GAR really doesn't match their impact, it should be pretty easy to look under the hood, check whether they've got a teammate whose GAR is an outlier in the opposite direction, and figure out why.
Most importantly, because the contextual adjustments behind GAR make it so much more effective than raw standings at the team level, it's reasonable to say those same adjustments also make it more effective than the unadjusted metrics we use at the individual player level, even if there should still be more uncertainty around GAR at the player level.
To be clear, there should still be some uncertainty when approaching GAR at the team level as well; these metrics aren't perfect, and I've got my own issues with GAR that I could go into in more detail below. But by adjusting for context, they are better than the metrics that don't, and that shows up in how well they predict playoff series victories. They clearly do a better job of assessing team quality than raw standings points. And since team-level GAR is made up of the GAR of all of a team's players, GAR can tell us which teams are full of good players and which aren't, even if it will occasionally fail to perfectly distribute credit among the players on a given team.
If you want to come into this thread and discredit the merit of stats, in favor of the eye test, go ahead. But in that case, you have to discredit all stats, including the traditional ones like standings points that are clearly inferior to GAR. I don't think anybody is actually prepared to do that.