Expected goals for & against

Hockey Outsider

Registered User
Jan 16, 2005
9,166
14,496
Has anyone (on HF or elsewhere) looked at whether expected goals for (or against) correlate more highly from year to year than actual goals for (or against)? My understanding is, at least in theory, expected goals are supposed to be more predictive of the future.

Anecdotally, I've seen many examples where a player consistently over- (or under-) performs their expected goals. That's based on my impressions, not actual evidence, so I'm curious if anyone has looked into this more rigorously.
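
For anyone who wants to test this themselves, here is a minimal sketch of the comparison I have in mind: correlate year-1 goals and year-1 xG with year-2 goals and see which is higher. The field names (`g_y1`, `xg_y1`, `g_y2`, `xg_y2`) and the toy numbers are made up for illustration; they are not real NHL data.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def year_to_year(rows):
    """Compare how well year-1 G and year-1 xG track year-2 results.

    rows: list of dicts with the hypothetical keys g_y1, xg_y1, g_y2, xg_y2.
    """
    g1 = [r["g_y1"] for r in rows]
    xg1 = [r["xg_y1"] for r in rows]
    g2 = [r["g_y2"] for r in rows]
    xg2 = [r["xg_y2"] for r in rows]
    return {
        "G -> next G": pearson(g1, g2),
        "xG -> next G": pearson(xg1, g2),
        "xG -> next xG": pearson(xg1, xg2),
    }

# Made-up toy rows, only to show the expected input shape.
toy_rows = [
    {"g_y1": 30, "xg_y1": 26.1, "g_y2": 27, "xg_y2": 25.4},
    {"g_y1": 12, "xg_y1": 15.3, "g_y2": 16, "xg_y2": 14.8},
    {"g_y1": 21, "xg_y1": 19.9, "g_y2": 18, "xg_y2": 20.5},
    {"g_y1": 8, "xg_y1": 9.7, "g_y2": 11, "xg_y2": 10.2},
]

for label, r in year_to_year(toy_rows).items():
    print(f"{label}: r = {r:.3f}")
```

With real data you would want one row per player (or team) who appears in both seasons, ideally on a per-60 or per-game basis so ice time differences don't drive the result.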
 

Mickey Marner

Registered User
Jul 9, 2014
19,600
21,316
Dystopia
At the team level they reach peak predictive value at around 30 games. This would likely vary by model though.

[Image: newplot.png]

Linked article: Expected Goals are a better predictor of future scoring than Corsi, Goals

At the player level, I think there will be quite a bit of variance. I'm not even sure how close the sum of expected goals is to the sum of actual goals league-wide. I speculate that expected goals for would be more accurate than expected goals against at the team level, simply because it's easier to predict league-average opposition goaltending than the individual goaltending one team will receive.

Natural Stat Trick data is all downloadable if that interests you.
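
If you do download it, checking the league-wide sum is quick. This is a rough sketch that assumes a team-level CSV export with goal and expected-goal columns named `GF` and `xGF`; the column names and the file name in the usage note are assumptions, so adjust them to whatever your export actually contains.

```python
import csv

def load_rows(path):
    """Read a CSV export into a list of dicts, one per team."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))

def league_calibration(rows, goals_col="GF", xg_col="xGF"):
    """Return (total goals, total xG, goals / xG) summed over all rows.

    A ratio near 1.0 means the model's expected goals add up to roughly
    the number of goals actually scored league-wide.
    """
    total_g = sum(float(r[goals_col]) for r in rows)
    total_xg = sum(float(r[xg_col]) for r in rows)
    return total_g, total_xg, total_g / total_xg
```

Usage would look something like `league_calibration(load_rows("teams.csv"))`, where `teams.csv` is a hypothetical file name. Passing `goals_col="GA", xg_col="xGA"` would run the same check on the against side, assuming those columns exist in the export.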
 

Hockey Outsider

Registered User
Jan 16, 2005
9,166
14,496
Thanks for the article - will try to take a deeper dive on the weekend.
 

DatsyukToZetterberg

Alligator!
Apr 3, 2011
5,550
739
Island of Tortuga
I know that when it comes to projecting players' future goal outputs, xG holds significant predictive power. When I was testing my fantasy projection model, I found that a model of just xG vs. one of just G would end up with similar MAE and r^2 values. There are diminishing returns, though: the more seasons of data a player has, the less effect xG has on the regression.

For example, if I were projecting ES goals for a player who has played the past 3 seasons, the model and regression coefficients would be:

Pred_Goals = B0 + B1*G.2019 + B2*G.2020 + B3*G.2021 + B4*xG.2019 + B5*xG.2020 + B6*xG.2021

where the betas are equal to:

B1      B2      B3      B4      B5      B6
0.1137  0.1116  0.2360  0.089   0.100   0.206

However, if a player had just 1 season of data, the model and regression coefficients were:

Pred_Goals = B0 + B1*G.2021 + B2*xG.2021

where the betas are equal to:

B1      B2
0.2939  0.5006

The xG coefficient carries far more weight in the 1-season regression (0.5006) than any single xG coefficient in the 3-season example (0.089 to 0.206). In both cases the data was normalized, so B0 was essentially 0 and I left it out of the tables.
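
To make the 1-season case concrete, here is a minimal sketch of a two-predictor regression on standardized (z-scored) data, which is one reading of "normalized" and is why the intercept drops out. It uses the textbook closed form for two standardized predictors and is not the actual code behind my model; the function names are just for illustration.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def standardized_betas(g_prev, xg_prev, g_next):
    """Least-squares betas for z(g_next) ~ B1*z(g_prev) + B2*z(xg_prev).

    With every variable z-scored the intercept is 0, and the two betas
    follow directly from the three pairwise correlations.
    """
    r_g = pearson(g_prev, g_next)     # past G vs future G
    r_xg = pearson(xg_prev, g_next)   # past xG vs future G
    r_12 = pearson(g_prev, xg_prev)   # past G vs past xG
    denom = 1.0 - r_12 ** 2
    b1 = (r_g - r_12 * r_xg) / denom
    b2 = (r_xg - r_12 * r_g) / denom
    return b1, b2

def predict_z(b1, b2, z_g, z_xg):
    """Predicted future goals in standard-deviation units (B0 = 0)."""
    return b1 * z_g + b2 * z_xg

# Using the 1-season coefficients from the table: a player one standard
# deviation above average in both G and xG.
print(predict_z(0.2939, 0.5006, 1.0, 1.0))
```

The closed form also shows why the split between B1 and B2 is sensitive: the more strongly past G and past xG correlate with each other, the smaller the denominator and the less stable the individual coefficients.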

There are issues with using xG as a variable in the regression. Certain players who are poor shooters will look far better than you'd expect; think of someone like Jordan Staal or Brady Tkachuk. Overall, I think the model improvements you get from including xG outweigh the overrating of those types of players, especially for players with just 1 or 2 seasons of data.
 

RationalExpectations

Registered User
May 12, 2019
4,987
3,773
A peak R^2 of 30% is quite weak, to be honest. Given the correlation between past goals and past expected goals, I don't think a combined model with both variables would show an R^2 above 35%; that's still 65% unexplained variance.
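
The reasoning can be sketched with the standard formula for the R^2 of a two-predictor linear model in terms of the pairwise correlations. Only the 0.30 comes from the chart; the 0.25 for past goals and the 0.8 correlation between the two predictors are purely illustrative guesses on my part.

```python
from math import sqrt

def two_predictor_r2(r1, r2, r12):
    """R^2 of a linear model with two predictors.

    r1, r2: correlation of each predictor with the outcome.
    r12:    correlation between the two predictors.
    """
    return (r1 ** 2 + r2 ** 2 - 2 * r1 * r2 * r12) / (1 - r12 ** 2)

r_xg = sqrt(0.30)  # past xG vs future goals (R^2 of 30% from the chart)
r_g = sqrt(0.25)   # past goals vs future goals (assumed, illustrative)
r_12 = 0.8         # past goals vs past xG (assumed, illustrative)

combined = two_predictor_r2(r_xg, r_g, r_12)
print(f"combined R^2 = {combined:.3f}")
```

With those assumed inputs the combined R^2 comes out at about 0.31, barely above xG alone, because the two predictors carry mostly the same information. If they were uncorrelated, the two R^2 values would simply add up.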
 

bossram

Registered User
Sep 25, 2013
15,596
14,855
Victoria
It's still a higher r^2 than any other single metric (e.g. past goal differential, win%, etc.).

65% unexplained off a very basic model checks out to me. Hockey is inherently a very random sport. I think it was 538 that estimated roughly 40% of outcomes in hockey are just luck? Sounds about right.

Add special teams and goaltending to 5v5 xG and maybe you get a model closer to 50% r^2? I would be highly dubious of any model or person claiming they can predict better than that.
 
