Refining Adjusted Scoring

BraveCanadian

Registered User
Jun 30, 2010
14,706
3,573
I'm not convinced one person's role in a 5-on-5 scenario resulting in a goal against is anywhere close to that person's role in one resulting in a goal for. Obviously a pure goal/assist contribution to a GF is clearly measured; on a GA, it's unclear.

The assumption that each player's role in a GF is equal (as in the PLUS side of +/-) is just as misguided as assuming each player's role is equal in a GA scenario.

Using GA as a metric individually is way more of a minefield than using GF where goals and assists have been tracked. At least with those you have a chance of determining who was responsible for the outcome.
 

canucks4ever

Registered User
Mar 4, 2008
3,997
67
Another thing about adjusted stats is that they make Jaromir Jagr look very godlike. If you adjust his four-year peak from 1998-2001 and put him in the NHL from 1980 to 1983, his stats look like this:

1997-98: 102 points. adjust to 1979-80: 135 points in 77 games
1998-99: 127 points. adjust to 1980-81: 184 points
1999-00: 96 points. adjust to 1981-82: 140 points in 63 games
2000-01: 121 points. adjust to 1982-83: 169 points
 

plusandminus

Registered User
Mar 7, 2011
1,404
268
This may be what team GF and GA would look like if adjusting based on opposition.

Seas|Seas2|Team|GF|GA|adjGF|adjGA|GF/adjGF|GA/adjGA|GF-adjGF|GA-adjGA
2010|2011|VAN|258|180|247.8926|182.7046|1.0408|0.9852|10.107|-2.705
2010|2011|NAS|213|190|207.1659|185.7373|1.0282|1.0230|5.8341|4.2627
2010|2011|WAS|219|191|215.2762|192.1225|1.0173|0.9942|3.7238|-1.122
2010|2011|LAK|209|196|205.6189|191.4040|1.0164|1.0240|3.3811|4.5960
2010|2011|MIN|203|229|200.5288|224.7572|1.0123|1.0189|2.4712|4.2428
2010|2011|BOS|244|189|241.0787|193.8096|1.0121|0.9752|2.9213|-4.810
2010|2011|CGY|241|230|238.2989|231.1063|1.0113|0.9952|2.7011|-1.106
2010|2011|CHI|252|220|249.5761|220.6716|1.0097|0.9970|2.4239|-0.672
2010|2011|NYR|224|195|222.0549|198.0278|1.0088|0.9847|1.9451|-3.028
2010|2011|SJS|243|208|241.0376|208.9693|1.0081|0.9954|1.9624|-0.969
2010|2011|PIT|228|196|226.6188|198.1463|1.0061|0.9892|1.3812|-2.146
2010|2011|MTL|213|206|211.9572|207.2419|1.0049|0.9940|1.0428|-1.242
2010|2011|PHO|226|220|225.6778|217.4356|1.0014|1.0118|0.3222|2.5644
2010|2011|STL|236|228|236.2026|226.6726|0.9991|1.0059|-0.203|1.3274
2010|2011|NJD|171|207|171.3342|202.4513|0.9980|1.0225|-0.334|4.5487
2010|2011|FLO|191|222|191.5049|220.8076|0.9974|1.0054|-0.505|1.1924
2010|2011|DAL|222|226|222.8204|222.8756|0.9963|1.0140|-0.820|3.1244
2010|2011|EDM|191|260|191.7822|253.6909|0.9959|1.0249|-0.782|6.3091
2010|2011|ANA|235|233|236.0400|231.3294|0.9956|1.0072|-1.040|1.6706
2010|2011|DET|257|237|258.8212|237.9182|0.9930|0.9961|-1.821|-0.918
2010|2011|TBL|241|234|242.8219|239.0866|0.9925|0.9787|-1.822|-5.087
2010|2011|PHI|256|216|257.9494|222.9530|0.9924|0.9688|-1.949|-6.953
2010|2011|BUF|240|228|241.9020|232.3756|0.9921|0.9812|-1.902|-4.376
2010|2011|CAR|232|234|234.7487|238.4438|0.9883|0.9814|-2.749|-4.444
2010|2011|CBS|210|250|212.6471|244.4229|0.9876|1.0228|-2.647|5.5771
2010|2011|TOR|213|245|216.6609|245.7556|0.9831|0.9969|-3.661|-0.756
2010|2011|OTT|190|245|193.5036|242.2320|0.9819|1.0114|-3.504|2.7680
2010|2011|COL|221|287|225.8517|283.5163|0.9785|1.0123|-4.852|3.4837
2010|2011|ATL|218|262|222.8773|264.1229|0.9781|0.9920|-4.877|-2.123
2010|2011|NYI|225|258|232.1667|261.2689|0.9691|0.9875|-7.167|-3.269

Vancouver faced Colorado many times, which may favour them as other teams tended to score more goals (and more goals per game) on Colorado than on any other team. Vancouver also had Edmonton in their division. They may have had the opportunity to gain 10 GF extra by playing the schedule they did. (If so, the Sedins might have gained about 4 pts each.)

Highest since 1960-61 is around 8 % either way, but that is rare.
1968-69 St. Louis was very favoured. They had very low GA themselves (40-year-old Jacques Plante and 39-year-old Glenn Hall had unretired to play for them). St. Louis also played in a poor division (half) of the league. I suppose St. Louis benefited by not having to play themselves, while all other teams had to.
1989-90 Quebec was very unfavoured, maybe because they were by far the worst team in the league, including GA-wise. They never got the opportunity to play themselves in order to boost their GF. Their division rivals finished 1st, 2nd, 3rd and 6th in GA out of 21 teams, while they themselves finished dead last.
(I know one can argue about this, like what caused what, and I will look further into it.)
Edit: Even if excluding all Quebec's games, their rivals still finished 1, 2, 3 and 6 in GA per game. So Quebec surely did face division rivals who were very strong GA wise.

It's a bit complicated to explain how I did the calculations.
Many games vs opponents allowing many GA per GP against other teams is favourable.
Many games vs opponents allowing few GA per GP against other teams, is non-favourable.

My method can be improved further, though I don't think results will be very different.
Don't take the table as something completed.
I will try to refine the method a bit.

Edit: If GF and GA are different than in the standings, it might be because I exclude all SO goals (as they don't affect the scoring).
 

Czech Your Math

I am lizard king
Jan 25, 2006
5,169
303
bohemia
Good thread.

I understand the overlapping thing, taking 3 season periods. It does even things out.
But I think that's unfair too. If doing so, I would rather have it 20%-60%-20% than 1/3-1/3-1/3%. But it is questionable doing at all.

I don't think you understand how I am using overlapping seasons. I tried to present the methodology as completely and yet as simply as possible, but I can see how it would be confusing.

I was not calculating a moving average of seasons. I was calculating a factor to potentially further refine adjusted scoring data. Once a factor for each season was calculated, the factors can then be multiplied to create cumulative effects and index numbers.

The only overlap is the "before" and "after" seasons:

1947: 1946 = before 1947 = after

If a player went from 1.00 adjusted PPG to 1.10 adjusted PPG, that is +10%. If he went from 1.00 to .90 that is -10%. This is calculated for each player in the study, the data sorted by % change and medians (I used 1/2 or 1/3 of players in study... maybe this is where you got 1/3?) are then determined and the percentages of the players in the median group are averaged. This gives a refinement factor for the season. These can then be multiplied to get an index number (and at some point the results can be normalized which allows comparison to unrefined data).

If median % change for seasons are:

season 1= -10% or -.10
season 2= +5% or +.05
season 3 = +10% or +.10

This gives factors of:

season 1= 1 - .10 = .90
season 2= 1 + .05 = 1.05
season 3 = 1 + .10 = 1.10

Season 0 (the before season for season 1) would be base of 1.00:

season 0 = 1.00
season 1 = 1.00 * .90 = .90
season 2 = .90 * 1.05 = .945
season 3 = .945 * 1.10 = 1.0395

So season 0 adjusted stats would remain same, season 1 stats divided by .90, season 2 stats divided by .945, and season 3 stats divided by 1.0395.
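
As a rough sketch of the steps above (not the exact spreadsheet work behind the study), here is how the season factor and the chained index could be computed in Python. The function names, the `middle_share` parameter and the sample percentages are made up for illustration; the study only says that the middle 1/2 or 1/3 of players was averaged.

```python
from statistics import mean

def season_factor(pct_changes, middle_share=0.5):
    """Return 1 + the average % change of the middle group of players.

    pct_changes: year-over-year changes in adjusted PPG, e.g. 0.10 for +10%.
    middle_share: fraction of players kept around the median (assumed knob).
    """
    ordered = sorted(pct_changes)
    keep = max(1, round(len(ordered) * middle_share))
    start = (len(ordered) - keep) // 2
    return 1.0 + mean(ordered[start:start + keep])

def cumulative_index(factors, base=1.0):
    """Chain the season factors; the first entry is the base season (season 0)."""
    index = [base]
    for factor in factors:
        index.append(index[-1] * factor)
    return index

# The three example seasons from the post: -10%, +5%, +10%.
index = cumulative_index([0.90, 1.05, 1.10])
print([round(value, 4) for value in index])

# Refining a stat means dividing by that season's index value,
# e.g. 100 adjusted points in season 2:
print(round(100 / index[2], 1))
```

Dividing rather than multiplying follows the post: a season where the median player's adjusted production dropped gets its stats scaled up, and vice versa.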

We can stop there, the only reason to normalize the factors calculated would be to put them on roughly the same level as the adjusted stats we started with.

Another thing is that scoring also changes during seasons, as do things affecting scoring such as officiating, penalties/PP, etc. In that sense, one might actually want to look at smaller time periods (while also looking at the whole season). On the other hand, this is probably captured well enough in the total stats for the season.

I think looking at less than a season is going to overcomplicate things for a minimal increase in accuracy. I think the biggest issue for adjusted stats that needs refinement is how much more/less difficult it was for top players to score points than is already indicated by adjusted stats. Delving into intra-season stats is not going to help solve that issue and the more you subdivide the data, the more it is going to be prone to "noise" influencing results.

One might want to consider different situations like ES, PP and SH. Some seasons see a larger number of PP and/or SH goals than others.
Not only do these things affect total scoring, but they probably also affect distribution. PP specialists are favoured when PP goals make up a large share of scoring.

Then, like I said yesterday, one should also adjust for opposition, i.e. calculate as if all teams faced each other an equal number of games.
Take Edmonton.
1. What GA per game did their opponents have when not facing Edmonton?
2. How did that compare to league average?
3. Adjust.
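
The three steps could be sketched roughly like this in Python. The game list is invented, and the details (weighting each opponent by games played, comparing against a plain league-wide average) are assumptions for illustration, not necessarily how the tables in this thread were built.

```python
from collections import defaultdict

# Hypothetical results: (team_a, team_b, goals_a, goals_b)
GAMES = [
    ("EDM", "CGY", 5, 3),
    ("EDM", "CGY", 4, 4),
    ("EDM", "VAN", 6, 2),
    ("CGY", "VAN", 3, 3),
    ("CGY", "WPG", 2, 4),
    ("VAN", "WPG", 5, 1),
    ("EDM", "WPG", 3, 2),
]

def goals_against_log(games):
    """Map each team to a list of (opponent, goals allowed) entries."""
    log = defaultdict(list)
    for team_a, team_b, goals_a, goals_b in games:
        log[team_a].append((team_b, goals_b))
        log[team_b].append((team_a, goals_a))
    return log

def ga_per_game_excluding(log, team, excluded):
    """Step 1: a team's GA per game in games NOT played against `excluded`."""
    allowed = [goals for opp, goals in log[team] if opp != excluded]
    return sum(allowed) / len(allowed)

def schedule_factor(games, team):
    """Step 2: average opponent GA rate (one entry per game played)
    divided by the league-wide GA per team-game."""
    log = goals_against_log(games)
    total_goals = sum(goals for entries in log.values() for _, goals in entries)
    team_games = sum(len(entries) for entries in log.values())
    league_avg = total_goals / team_games
    opp_rates = [ga_per_game_excluding(log, opp, team) for opp, _ in log[team]]
    return (sum(opp_rates) / len(opp_rates)) / league_avg

# Step 3: divide raw GF by the factor to get schedule-adjusted GF.
factor = schedule_factor(GAMES, "EDM")
raw_gf = 5 + 4 + 6 + 3
print(round(factor, 4), round(raw_gf / factor, 2))
```

A factor above 1 means the team faced opponents who leak more goals than average against everyone else, so its raw GF gets scaled down.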

This is important. I will present some league power play vs. even strength scoring data by season in the near future.

Adjusting for schedule strength may be worthwhile too. The only problem is that there is then no league-wide factor being used, which makes it much more complicated for most people to understand and utilize.

Regarding your table of 1st to 4th players on team, it is interesting. I intend to do similar things. Actually, the tables I posted yesterday were based on the average player. One can take extremes like Gretzky out of the calculations, but on the whole even Gretzky's points aren't very many compared to the league as a whole. Let's say 21 teams scored an average of 4 goals per game. That's 84 goals. Let's say Gretzky is responsible for adding 2 goals per game. Left is 82 goals. 82/21=3.9048. Gretzky's effect on total goals per game thus was "only" 0.1. 84/82=1.0244. The presence of Gretzky raised total scoring per game by 2.44 %. OK, it has an effect, but not that big. The seasons 1980-85 saw an increase in scoring of 25-30 %.
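
The back-of-the-envelope numbers in that paragraph can be reproduced with a few lines of Python. The function name and its parameters are just for illustration; the inputs (21 teams, 4 goals per team, 2 goals attributed to one star) are the hypothetical ones from the paragraph.

```python
def star_effect(teams, avg_goals_per_team, star_goals):
    """Return (league average without the star, relative lift from the star)."""
    total = teams * avg_goals_per_team   # 21 * 4 = 84 goals
    without = total - star_goals         # 82 goals left
    avg_without = without / teams        # per-team average without the star
    lift = total / without - 1.0         # relative increase caused by the star
    return avg_without, lift

avg_without, lift = star_effect(21, 4.0, 2.0)
print(round(avg_without, 4), f"{lift:.2%}")
```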

If taking away the extremes like Gretzky, there might be even more reason to take away the negative extremes, like scoring vs teams with high GA.

I agree with you, that's a great example of a refinement that makes things way more complicated (in terms of calculation) for a minimal (if any) increase in accuracy.
 

pappyline

Registered User
Jul 3, 2005
4,587
182
Mass/formerly Ont
1968-69 St. Louis was very favoured. They had very low GA themselves (40-year-old Jacques Plante and 39-year-old Glenn Hall had unretired to play for them). St. Louis also played in a poor division (half) of the league. I suppose St. Louis benefited by not having to play themselves, while all other teams had to.

I don't think the calibre of the St. Louis' division had much to do with it. If I remember correctly, they played each team an equal number of times regardless of division.
 

plusandminus

Registered User
Mar 7, 2011
1,404
268
Scoring adjusted by schedule.

Below is scoring adjusted by schedule. G, A and Pts are totally non-adjusted. asG, asA and asPts are adjusted to make each team play each other the same number of times.
GP is totally unadjusted, so note that some seasons are 84 games while others are 78, etc.
(Rank is within parentheses. Players with the same number still get different ranks, so keep that in mind and check the details to determine whether a player is actually tied at a higher rank.)

2010-11 adjusted by schedule:
Seas|Seas2|t|Team|Pos|Name|GP|G|A|Pts|asG|asA| asPt
2010|2011|1|VAN|L|Daniel Sedin|82| 41 ( 2) | 63 ( 3) | 104 ( 1) | 39.394 ( 2) | 60.532 ( 3) | 99.926 ( 1)
2010|2011|1|TBL|R|Martin St. Louis|82| 31 ( 19) | 68 ( 2) | 99 ( 2) | 31.234 ( 21) | 68.514 ( 2) | 99.748 ( 2)
2010|2011|1|TBL|C|Steven Stamkos|82| 45 ( 1) | 46 ( 16) | 91 ( 4) | 45.340 ( 1) | 46.348 ( 13) | 91.688 ( 3)
2010|2011|1|VAN|C|Henrik Sedin|82| 19 ( 108) | 75 ( 1) | 94 ( 3) | 18.256 ( 114) | 72.062 ( 1) | 90.317 ( 4)
2010|2011|1|WAS|L|Alex Ovechkin|79| 32 ( 12) | 53 ( 5) | 85 ( 5) | 31.456 ( 18) | 52.099 ( 6) | 83.555 ( 5)
2010|2011|1|DET|L|Henrik Zetterberg|80| 24 ( 49) | 56 ( 4) | 80 ( 6) | 24.170 ( 51) | 56.397 ( 4) | 80.567 ( 6)
2010|2011|1|DAL|C|Brad Richards|72| 28 ( 34) | 49 ( 8) | 77 ( 7) | 28.103 ( 31) | 49.181 ( 8) | 77.285 ( 7)
2010|2011|1|CAR|C|Eric Staal|81| 33 ( 9) | 43 ( 23) | 76 ( 8) | 33.391 ( 9) | 43.509 ( 23) | 76.900 ( 8)
2010|2011|1|PHI|R|Claude Giroux|82| 25 ( 47) | 51 ( 7) | 76 ( 10) | 25.190 ( 44) | 51.388 ( 7) | 76.579 ( 9)
2010|2011|1|CHI|C|Jonathan Toews|80| 32 ( 16) | 44 ( 20) | 76 ( 9) | 31.692 ( 17) | 43.577 ( 22) | 75.269 ( 10)
2010|2011|1|BUF|L|Thomas Vanek|80| 32 ( 17) | 41 ( 26) | 73 ( 13) | 32.254 ( 13) | 41.325 ( 29) | 73.579 ( 11)
2010|2011|1|DAL|L|Loui Eriksson|79| 27 ( 37) | 46 ( 17) | 73 ( 15) | 27.100 ( 37) | 46.170 ( 16) | 73.270 ( 12)
2010|2011|1|SJS|C|Patrick Marleau|82| 37 ( 4) | 36 ( 47) | 73 ( 12) | 36.701 ( 4) | 35.709 ( 50) | 72.410 ( 13)
2010|2011|1|CHI|R|Patrick Kane|73| 27 ( 35) | 46 ( 15) | 73 ( 14) | 26.740 ( 38) | 45.558 ( 18) | 72.298 ( 14)
2010|2011|1|LAK|C|Anze Kopitar|75| 25 ( 46) | 48 ( 12) | 73 ( 16) | 24.596 ( 49) | 47.223 ( 12) | 71.819 ( 15)
2010|2011|1|DAL|C|Mike Ribeiro|82| 19 ( 100) | 52 ( 6) | 71 ( 18) | 19.070 ( 106) | 52.192 ( 5) | 71.262 ( 16)
2010|2011|1|CHI|C|Patrick Sharp|74| 34 ( 7) | 37 ( 46) | 71 ( 17) | 33.673 ( 8) | 36.644 ( 46) | 70.317 ( 17)
2010|2011|1|VAN|C|Ryan Kesler|82| 41 ( 3) | 32 ( 72) | 73 ( 11) | 39.394 ( 3) | 30.746 ( 87) | 70.140 ( 18)
2010|2011|1|SJS|C|Joe Thornton|80| 21 ( 87) | 49 ( 9) | 70 ( 19) | 20.830 ( 85) | 48.604 ( 9) | 69.435 ( 19)
2010|2011|1|NYI|C|John Tavares|79| 29 ( 27) | 38 ( 41) | 67 ( 21) | 29.924 ( 25) | 39.210 ( 38) | 69.134 ( 20)
2010|2011|1|PHI|C|Danny Briere|77| 34 ( 6) | 34 ( 57) | 68 ( 20) | 34.259 ( 7) | 34.259 ( 57) | 68.518 ( 21)
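
For anyone wanting to check the table: the asG/asA/asPt columns look consistent with scaling a player's raw totals by his team's adjGF/GF from the team table earlier in the thread. That is an inference from the posted numbers, not a description of the actual calculation, and the function name below is made up.

```python
# (GF, adjGF) for 2010-11, copied from the team table earlier in the thread.
TEAM_GF = {
    "VAN": (258, 247.8926),
    "TBL": (241, 242.8219),
}

def schedule_adjust(stat, team):
    """Scale a raw player total by the team's schedule-adjusted GF share."""
    gf, adj_gf = TEAM_GF[team]
    return stat * adj_gf / gf

# Daniel Sedin (104 pts) and Martin St. Louis (99 pts)
print(round(schedule_adjust(104, "VAN"), 3))
print(round(schedule_adjust(99, "TBL"), 3))
```

Note this applies one team-wide factor to every player on the roster, which is why both Sedins and Kesler shrink by the same percentage.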

Best seasons, adjusted for schedule, 1960-2011:
Seas|Seas2|t|Team|Pos|Name|GP|G|A|Pts|asG|asA|asPt
1985|1986|1|EDM|C|Wayne Gretzky|80| 52 ( 103) | 163 ( 1) | 215 ( 1) | 50.251 ( 145) |157.517 ( 1) |207.768 ( 1)
1981|1982|1|EDM|C|Wayne Gretzky|80| 92 ( 1) | 120 ( 6) | 212 ( 2) | 89.594 ( 1) |116.862 ( 7) |206.456 ( 2)
1984|1985|1|EDM|C|Wayne Gretzky|80| 73 ( 8) | 135 ( 2) | 208 ( 3) | 70.870 ( 11) |131.060 ( 2) |201.930 ( 3)
1988|1989|1|PIT|C|Mario Lemieux|76| 85 ( 4) | 114 ( 8) | 199 ( 5) | 86.225 ( 2) |115.643 ( 8) |201.868 ( 4)
1983|1984|1|EDM|C|Wayne Gretzky|74| 87 ( 2) | 118 ( 7) | 205 ( 4) | 84.458 ( 4) |114.552 ( 9) |199.010 ( 5)
1982|1983|1|EDM|C|Wayne Gretzky|80| 71 ( 10) | 125 ( 3) | 196 ( 6) | 69.673 ( 14) |122.663 ( 3) |192.336 ( 6)
1986|1987|1|EDM|C|Wayne Gretzky|79| 62 ( 28) | 121 ( 5) | 183 ( 7) | 61.330 ( 30) |119.692 ( 5) |181.021 ( 7)
1988|1989|1|LAK|C|Wayne Gretzky|78| 54 ( 74) | 114 ( 9) | 168 ( 9) | 55.790 ( 57) |117.779 ( 6) |173.569 ( 8)
1987|1988|1|PIT|C|Mario Lemieux|77| 70 ( 12) | 98 ( 14) | 168 ( 8) | 71.832 ( 9) |100.565 ( 13) |172.397 ( 9)
1980|1981|1|EDM|C|Wayne Gretzky|80| 55 ( 71) | 109 ( 10) | 164 ( 10) | 55.370 ( 63) |109.733 ( 10) |165.104 ( 10)
1995|1996|1|PIT|C|Mario Lemieux|70| 69 ( 16) | 92 ( 21) | 161 ( 12) | 70.006 ( 13) | 93.341 ( 21) |163.346 ( 11)
1990|1991|1|LAK|C|Wayne Gretzky|78| 41 ( 435) | 122 ( 4) | 163 ( 11) | 40.398 ( 448) |120.209 ( 4) |160.607 ( 12)
1992|1993|1|PIT|C|Mario Lemieux|60| 69 ( 17) | 91 ( 25) | 160 ( 13) | 68.033 ( 16) | 89.725 ( 25) |157.758 ( 13)
1988|1989|1|LAK|C|Bernie Nicholls|79| 70 ( 13) | 80 ( 56) | 150 ( 16) | 72.321 ( 8) | 82.652 ( 50) |154.973 ( 14)
1988|1989|1|DET|C|Steve Yzerman|80| 65 ( 22) | 90 ( 26) | 155 ( 14) | 64.465 ( 21) | 89.260 ( 28) |153.725 ( 15)
1995|1996|1|PIT|R|Jaromir Jagr|82| 62 ( 27) | 87 ( 35) | 149 ( 17) | 62.904 ( 24) | 88.268 ( 31) |151.171 ( 16)
1970|1971|1|BOS|C|Phil Esposito|78| 76 ( 6) | 76 ( 85) | 152 ( 15) | 74.148 ( 5) | 74.148 ( 91) |148.295 ( 17)
1989|1990|1|LAK|C|Wayne Gretzky|73| 40 ( 446) | 102 ( 12) | 142 ( 23) | 41.072 ( 414) |104.734 ( 12) |145.807 ( 18)
1981|1982|1|QUE|C|Peter Stastny|80| 46 ( 231) | 93 ( 19) | 139 ( 25) | 47.957 ( 189) | 96.956 ( 16) |144.912 ( 19)
1987|1988|1|EDM|C|Wayne Gretzky|64| 40 ( 447) | 109 ( 11) | 149 ( 18) | 38.853 ( 548) |105.875 ( 11) |144.728 ( 20)
1985|1986|1|PIT|C|Mario Lemieux|79| 48 ( 198) | 93 ( 20) | 141 ( 24) | 49.095 ( 172) | 95.122 ( 17) |144.217 ( 21)
1992|1993|1|BUF|C|Pat LaFontaine|84| 53 ( 89) | 95 ( 18) | 148 ( 19) | 51.603 ( 120) | 92.497 ( 24) |144.100 ( 22)
1981|1982|1|NYI|R|Mike Bossy|80| 64 ( 24) | 83 ( 51) | 147 ( 20) | 62.277 ( 25) | 80.766 ( 55) |143.043 ( 23)

For example, Gretzky's best numbers get slightly worse, while Mario's often get slightly better. It seems that the gap in goal scoring between Gretzky's best and Mario's best would go from 7 (92 vs. 85) to just about 3.4 goals (89.594 vs. 86.225).

Total 1960-2011, when adjusting for schedule (not 100 % due to players sometimes changing team during seasons):
Name|GP|G|A|Pts|asG|asA|asPts
Wayne Gretzky|1487| 894 ( 1) |1963 ( 1) |2857 ( 1) | 886.008 ( 1) |1948.455 ( 1) |2834.463 ( 1)
Mark Messier|1756| 694 ( 5) |1193 ( 3) |1887 ( 2) | 689.755 ( 6) |1188.377 ( 3) |1878.133 ( 2)
Ron Francis|1731| 549 ( 21) |1249 ( 2) |1798 ( 3) | 556.279 ( 21) |1265.700 ( 2) |1821.979 ( 3)
Marcel Dionne|1348| 731 ( 3) |1040 ( 8) |1771 ( 4) | 734.588 ( 3) |1046.098 ( 9) |1780.687 ( 4)
Mario Lemieux|915| 690 ( 7) |1033 ( 9) |1723 ( 6) | 699.134 ( 5) |1047.786 ( 8) |1746.919 ( 5)
Steve Yzerman|1514| 692 ( 6) |1063 ( 7) |1755 ( 5) | 687.290 ( 7) |1055.292 ( 7) |1742.582 ( 6)
Joe Sakic|1378| 625 ( 12) |1016 ( 10) |1641 ( 7) | 633.134 ( 12) |1031.409 ( 10) |1664.543 ( 7)
Jaromir Jagr|1273| 646 ( 10) | 953 ( 12) |1599 ( 8) | 643.444 ( 10) | 947.600 ( 12) |1591.044 ( 8)
Raymond Bourque|1612| 410 ( 58) |1169 ( 4) |1579 ( 9) | 409.542 ( 60) |1167.470 ( 4) |1577.012 ( 9)
Mark Recchi|1652| 577 ( 16) | 956 ( 11) |1533 ( 10) | 578.280 ( 16) | 955.479 ( 11) |1533.760 ( 10)
Paul Coffey|1409| 396 ( 67) |1135 ( 5) |1531 ( 11) | 391.333 ( 73) |1124.914 ( 5) |1516.247 ( 11)
Luc Robitaille|1431| 668 ( 8) | 726 ( 37) |1394 ( 16) | 677.604 ( 8) | 735.401 ( 34) |1413.005 ( 12)
Adam Oates|1337| 341 ( 114) |1079 ( 6) |1420 ( 13) | 337.351 ( 112) |1067.591 ( 6) |1404.942 ( 13)
Bryan Trottier|1279| 524 ( 24) | 901 ( 14) |1425 ( 12) | 516.183 ( 26) | 887.296 ( 14) |1403.479 ( 14)
Dale Hawerchuk|1188| 518 ( 26) | 891 ( 15) |1409 ( 14) | 515.248 ( 27) | 885.719 ( 15) |1400.967 ( 15)
Brett Hull|1264| 740 ( 2) | 650 ( 48) |1390 ( 17) | 734.735 ( 2) | 645.498 ( 48) |1380.234 ( 16)
Jari Kurri|1251| 601 ( 15) | 797 ( 22) |1398 ( 15) | 592.249 ( 15) | 787.061 ( 23) |1379.309 ( 17)
Mike Modano|1499| 561 ( 19) | 813 ( 20) |1374 ( 18) | 560.481 ( 19) | 812.603 ( 20) |1373.084 ( 18)
Brendan Shanahan|1524| 656 ( 9) | 698 ( 42) |1354 ( 19) | 652.490 ( 9) | 694.395 ( 44) |1346.885 ( 19)
Mats Sundin|1346| 564 ( 18) | 785 ( 25) |1349 ( 21) | 562.666 ( 18) | 783.679 ( 24) |1346.344 ( 20)
Pierre Turgeon|1294| 515 ( 27) | 812 ( 21) |1327 ( 26) | 518.480 ( 25) | 816.777 ( 19) |1335.258 ( 21)
Dave Andreychuk|1639| 640 ( 11) | 698 ( 41) |1338 ( 23) | 636.785 ( 11) | 696.630 ( 41) |1333.415 ( 22)
Phil Esposito|973| 608 ( 13) | 724 ( 38) |1332 ( 25) | 606.570 ( 13) | 723.221 ( 38) |1329.791 ( 23)
Mike Gartner|1432| 708 ( 4) | 627 ( 56) |1335 ( 24) | 702.853 ( 4) | 622.333 ( 56) |1325.186 ( 24)
Denis Savard|1196| 473 ( 37) | 865 ( 16) |1338 ( 22) | 466.701 ( 39) | 853.863 ( 16) |1320.564 ( 25)
Guy Lafleur|1126| 560 ( 20) | 793 ( 23) |1353 ( 20) | 545.514 ( 22) | 773.911 ( 27) |1319.425 ( 26)

If there had been "fairer" schedules, there might have been some changes. Like Ron Francis moving past Messier. Mario moving past Yzerman. Pierre Turgeon moving up.
Mario might have been ahead of Yzerman and Messier in goal scoring.
Mario might also have been ahead of Marcel Dionne in assists.
 

seventieslord

Student Of The Game
Mar 16, 2006
36,129
7,215
Regina, SK
I don't think the calibre of the St. Louis' division had much to do with it. If I remember correctly, they played each team an equal number of times regardless of division.

You're confusing it with the later post-expansion years, which we discussed before. I recall we were both surprised to find out that in the 1971 and 1972 seasons, the league was balanced and Hull's stats were therefore not inflated.

In 1969, however, the schedule was unbalanced with more intra-divisional play.
 

plusandminus

Registered User
Mar 7, 2011
1,404
268
I don't think the calibre of the St. Louis' division had much to do with it. If I remember correctly, they played each team an equal number of times regardless of division.

8 games vs each divisional rival. 6 games vs non-divisional.
 

Czech Your Math

I am lizard king
Jan 25, 2006
5,169
303
bohemia
I like the idea of adjusted stats, I'd like to see "better" stats. But I'm not sure if this thread is meant to be purely statistical or whether there's room to add the element of IMPACT.

It isn't meant to be purely statistical. I think it's important that there is logic to how and why things are calculated and adjusted. A qualitative discussion as to the effects of different things like talent dilution (war, competitor leagues, expansion, etc.) and talent contraction (post-war, mergers, influx of overseas talent, lack of expansion for long periods, etc.) can help explain patterns in the data that otherwise may be attributed to other factors.

So, is there any value, assuming there's the right data, to apply some level of multiplier to certain statistics?

For example, can you weigh:
- a goal more than an assist,
- a game winning goal more than a goal
- devalue an empty net goal, or assist
- increase the value of a playoff goal
(not all of these are valuable or realistic - treat as thought starters)

Goals created is a great stat if you put more value on a goal than an assist.

I don't think GWG should be given extra value in general, at least not enough to bother further adjustment. Same goes for devaluing empty net goals/assists, especially as it's a lot harder to find this data. Playoff stats should be treated separately IMO, because it's not so much a level playing field (in terms of schedule, etc.) as in the regular season. However, one could use adjusted playoff stats and give them a higher proportional value when combining the two.

I also love the plusandminus suggestion of adjusting for opposition.

Of course, every adjustment, meant to refine, can easily lead to more uncertainty.

I don't know how you account for style of play, for example. Some teams play a style that other teams, or players, are better suited to. And that's across all eras as well. Look at Bryan McCabe pre- and post-lockout. Bigger, slower defensemen are much more effective in an obstruction era but wouldn't have jobs today.

Tim Kerr was effective as hell in his era, but he couldn't skate at all. How would he play today? Adjusting for style is probably impossible but interesting nonetheless.

I think style should be considered when evaluating players. By that I mean that a player whose skills would allow him to succeed in most/all eras should be considered better than a player of the same level whose skills were less versatile and who may have only been able to succeed in certain eras. Still, this is very subjective and I don't see how it can be objectively quantified and analyzed easily.

I would love to see adjusted stats JUST for playoffs. Because you get the best of the era competing in really important games, the data becomes MORE indicative of real impact. Teams are more evenly matched in the playoffs, especially in the later rounds. Measuring a player's performance in those heightened moments of intensity will do a much better job of statistically measuring the impact of a player (plus you get closer to a player's impact on wins/cups)

I have adjusted playoff data that I will present in the near future. I know Pnep has done some work in this area, and his results seemed to be very similar to mine. Perhaps he will present those.

This is really hard, but interesting stuff.

How do you account for Gretzky's impact on Messier? Just being a #2 centre, not facing top defenders and top checking lines, makes Messier's job disproportionately easier than, say, Dale Hawerchuk's job.

cool but impossible. good thread.

I know it seems impossible, and in many cases it is.

It is easy to get bogged down in the endless ways that data can be subdivided and analyzed. That is why I would prefer looking at broader trends such as adjusted scoring in different eras.

Still, I believe the year over year study (somewhat similar to a study of league quality I have seen) has a lot of validity and may be the only way to reflect how much easier/more difficult it was from season to season for top players to produce.

I see the goal as further shortening the distance between current adjusted data and "perfectly adjusted" data. If raw data is 60/100 in terms of reliability, and current adjusted data is 80-85, then it seems very possible that we could further refine it to ~90 with only a couple more important adjustments or refinements. Making several or dozens of minor adjustments that only increase random error and substantially decrease the data sample sizes, in the attempt to get to 92 or 93 instead of 90/100 is probably not the best idea at this point.
 

Czech Your Math

I am lizard king
Jan 25, 2006
5,169
303
bohemia
Interesting thread.

Would like to see the bottom tier of players represented in a fashion since regardless of era they have a drag impact on scoring relative to the era. Qualifier would be that such players should have played at least 70% of the scheduled games.

The adjusted scoring regular season vs playoffs comparison suggested by Redbull has merit.

I will present more tiered data and adjusted playoff data in the near future.
 

Czech Your Math

I am lizard king
Jan 25, 2006
5,169
303
bohemia
There also needs to be some refinement for players in the 1970's. In 1974, the league averaged 6.39 goals per game, yet only 3 players cracked 90 points. Doesn't sound like a run n gun season to me. The fact that Orr outscored his non-teammates by over 35 points that year just shows how truly dominant he was.

I agree the 70's need refinement... but we may disagree on the direction. By far the largest jump in the year over year study is from 1967 to 1968 during the first year of expansion. Also, if you look at the scoring and GF/GA ratios of the original six teams before and after expansion, the effect is shown there as well. Similarly, the parity in the 70's shows it to be by far the most unbalanced time. Maybe players on the expansion teams get underrated, but probably need to do a separate year over year study to determine whether that is the case.

I don't see too much unusual about 1974. The % of goals scored on the power play is lower than most other years in that era, which may be one reason. Strangely, from 1973 to 1974 is the lowest change in adjusted scoring of any years in the YoY study ('47-'79).

Another thing about adjusted stats is that they make Jaromir Jagr look very godlike. If you adjust his four-year peak from 1998-2001 and put him in the NHL from 1980 to 1983, his stats look like this:

1997-98: 102 points. adjust to 1979-80: 135 points in 77 games
1998-99: 127 points. adjust to 1980-81: 184 points
1999-00: 96 points. adjust to 1981-82: 140 points in 63 games
2000-01: 121 points. adjust to 1982-83: 169 points

I know a lot of people are not fans of Jagr, but his peak adjusted stats (and what you listed is only about half of his peak/prime) are only one of the many alternative studies that show his value:

- career adjusted goals/assists/points (especially if factor missed lockout season and three seasons in KHL... and still going)
- adjusted plus-minus (by Overpass, this was fantastic work)
- Hart shares (as calculated by Hockey Outsider, great as always)
- the records and GF/GA data of his teams when he was playing and when he was injured (I have presented some of this before)
- adjusted playoff stats (if I present individual data here)
 

Czech Your Math

I am lizard king
Jan 25, 2006
5,169
303
bohemia
Here is data for 1st N thru 12th N, adjusted to 6.0 GPG:

Year 1N 2N 3N 4N 5N 6N 7N 8N 9N 10N 11N 12N
-----------------------------------------
2011 83 66 58 52 48 44 40 35 32 28 26 23
2010 88 68 58 52 47 42 38 35 31 28 24 22
2009 87 69 59 52 47 43 40 35 31 28 25 22
2008 91 72 61 54 45 41 38 34 30 27 24 21
2007 92 71 61 55 47 42 37 33 30 27 25 22
2006 89 69 60 53 47 42 38 34 31 27 25 22
2004 85 65 59 52 47 41 37 34 31 29 26 23
2003 92 70 61 53 47 42 37 34 31 27 23 21
2002 85 71 61 54 48 43 38 34 30 28 25 21
2001 93 75 61 53 47 40 36 33 30 26 24 22
2000 85 69 59 52 46 42 38 34 31 28 25 23
1999 94 67 60 54 48 43 38 35 32 28 25 21
1998 89 69 58 52 47 42 38 34 30 27 25 22
1997 91 69 59 52 46 41 37 33 30 27 24 22
1996 100 72 60 53 48 42 37 32 29 26 23 20
1995 92 70 62 52 45 42 37 34 31 27 24 22
1994 89 71 62 52 47 41 37 33 29 27 24 22
1993 95 71 62 55 47 41 36 32 29 25 23 20
1992 88 69 62 53 49 42 37 33 29 26 24 21
1991 91 68 59 51 45 40 37 33 30 27 24 22
1990 89 70 61 55 48 42 37 34 30 27 23 21
1989 91 67 59 53 48 42 37 35 32 28 24 21
1988 89 68 60 52 46 42 37 33 29 26 23 21
1987 83 65 59 52 46 43 39 36 33 29 26 23
1986 88 63 57 52 47 43 38 34 31 29 26 24
1985 88 65 55 50 46 42 37 34 31 28 25 23
1984 86 67 58 51 46 42 38 35 31 28 25 23
1983 85 64 58 53 47 42 38 34 32 29 26 24
1982 88 66 58 53 47 43 38 34 31 29 25 22
1981 86 64 56 52 49 45 42 37 32 28 25 23
1980 89 68 60 54 48 42 39 35 32 29 26 22
1979 90 67 58 52 48 43 38 35 31 29 26 23
1978 88 68 59 54 49 44 39 36 33 30 25 22
1977 88 67 57 52 47 43 40 37 33 29 27 24
1976 93 71 62 55 49 44 40 36 33 30 27 24
1975 92 69 60 55 49 42 39 35 33 30 27 25
1974 90 72 62 55 49 44 40 35 32 30 27 23
1973 93 76 65 55 49 44 40 36 33 30 27 23
1972 99 73 61 54 49 46 41 37 32 29 25 22
1971 95 68 60 54 49 44 40 35 33 30 27 24
1970 91 73 64 58 50 46 41 37 34 29 26 23
1969 100 74 63 56 50 46 42 36 32 29 26 23
1968 91 72 62 56 50 46 42 37 34 30 26 24
1967 87 68 57 53 49 46 42 41 38 32 29 24
1966 92 71 63 57 52 46 41 38 33 29 24 22
1965 93 70 59 55 51 47 43 36 34 29 25 22
1964 101 79 69 60 50 43 37 33 31 29 26 24
1963 91 75 69 56 50 43 39 35 31 30 26 24
1962 90 73 66 57 53 48 45 40 33 28 25 22
1961 96 76 63 57 49 45 40 38 36 32 27 22
1960 91 79 62 55 49 44 41 36 34 32 28 24
1959 102 76 68 61 51 46 42 37 32 28 25 24
1958 94 73 64 59 53 49 45 39 34 30 27 24
1957 103 72 61 55 52 48 43 39 31 28 25 21
1956 105 76 67 57 53 46 40 35 31 28 26 23
1955 100 78 65 57 50 46 42 39 34 28 25 23
1954 93 71 64 57 52 46 42 39 35 30 26 23
1953 104 70 61 56 52 44 39 36 35 30 26 21
1952 94 74 65 59 55 51 44 37 32 27 24 19
1951 94 76 57 53 50 46 43 40 34 31 28 22
1950 92 71 63 58 54 49 44 40 35 29 26 22
1949 97 75 66 61 53 48 42 36 32 27 25 21
1948 90 79 67 57 51 45 39 35 31 29 27 24
1947 96 78 68 60 53 46 42 35 32 29 26 22
1946 95 80 69 59 52 46 41 36 32 29 26 22
 

Czech Your Math

I am lizard king
Jan 25, 2006
5,169
303
bohemia
power play data

This is some power play data since 1964. ES/G is even strength GPG, PP/G is power play GPG, %PPG is percentage of total goals scored on the power play, and PP% is percentage of power plays converted. Shorthanded goals are counted as even strength goals, but not sure where they should be counted.

YEAR ES/G PP/G %PPG PP%
---------------------------------------
1964 2.20 0.57 20.6% 15.7%
1965 2.18 0.70 24.3% 15.8%
1966 2.33 0.71 23.5% 18.9%
1967 2.38 0.60 20.1% 18.1%
1968 2.18 0.54 19.9% 16.9%
1969 2.37 0.61 20.3% 17.4%
1970 2.18 0.72 24.9% 19.5%
1971 2.43 0.69 22.2% 18.8%
1972 2.40 0.67 21.8% 19.4%
1973 2.65 0.63 19.2% 18.7%
1974 2.57 0.63 19.7% 19.1%
1975 2.63 0.80 23.4% 20.3%
1976 2.59 0.83 24.2% 20.5%
1977 2.66 0.66 20.0% 19.8%
1978 2.62 0.68 20.5% 21.2%
1979 2.72 0.78 22.2% 22.7%
1980 2.75 0.76 21.7% 21.9%
1981 2.88 0.96 25.0% 22.5%
1982 3.10 0.91 22.7% 22.9%
1983 2.98 0.89 23.0% 22.9%
1984 3.02 0.93 23.4% 21.9%
1985 3.00 0.89 22.8% 22.2%
1986 2.95 1.03 25.8% 22.1%
1987 2.77 0.90 24.5% 21.0%
1988 2.60 1.11 29.9% 20.3%
1989 2.68 1.06 28.4% 21.0%
1990 2.74 0.95 25.8% 20.8%
1991 2.57 0.89 25.7% 19.4%
1992 2.52 0.96 27.7% 19.2%
1993 2.59 1.04 28.6% 19.6%
1994 2.34 0.90 27.9% 18.6%
1995 2.21 0.77 25.8% 17.7%
1996 2.24 0.90 28.7% 17.9%
1997 2.24 0.67 23.0% 16.3%
1998 1.94 0.70 26.3% 15.1%
1999 1.94 0.70 26.4% 15.8%
2000 2.10 0.65 23.5% 16.2%
2001 1.99 0.77 27.9% 16.6%
2002 1.97 0.65 24.7% 15.8%
2003 1.92 0.73 27.6% 16.4%
2004 1.87 0.70 27.0% 16.5%
2006 1.99 1.04 34.3% 17.7%
2007 2.03 0.85 29.7% 17.6%
2008 1.96 0.76 27.8% 17.8%
2009 2.06 0.79 27.8% 19.0%
2010 2.08 0.68 24.7% 18.2%
2011 2.05 0.68 25.0% 18.0%
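
For anyone who wants to check the %PPG column, here is a minimal sketch. It assumes %PPG is simply PP/G divided by (ES/G + PP/G), with shorthanded goals already folded into ES/G as described above. The function name is made up for illustration; the sample rows are copied from the table.

```python
def pp_share(es_per_game: float, pp_per_game: float) -> float:
    """Share of all goals that were scored on the power play."""
    total_per_game = es_per_game + pp_per_game
    return pp_per_game / total_per_game


# (ES/G, PP/G) pairs copied from the table above
sample_rows = {1964: (2.20, 0.57), 1988: (2.60, 1.11), 2006: (1.99, 1.04)}

for year, (es, pp) in sample_rows.items():
    print(year, f"{pp_share(es, pp):.1%}")
```

PP% cannot be rebuilt the same way, since it needs the number of power play opportunities, which is not in the table.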
 

plusandminus

Registered User
Mar 7, 2011
1,404
268
This is some power play data since 1964. ES/G is even strength GPG, PP/G is power play GPG, %PPG is percentage of total goals scored on the power play, and PP% is percentage of power plays converted. Shorthanded goals are counted as even strength goals, but not sure where they should be counted.

...

Are you sure you are excluding shootout goals?
 

Czech Your Math

I am lizard king
Jan 25, 2006
5,169
303
bohemia
adjusted playoff data

This is the GPG for adjusted playoff data. This would be the number to use for a player instead of the regular season GPG when adjusting playoff stats for that playoff season.

RSG/G is regular season GPG (league-wide, all teams)
EPG/G is expected goals per game in playoffs
PG/G is actual GPG in playoffs
ADJ is adjusted GPG (this is number to use for adjusting playoff data)
%RS is actual GPG as percentage of expected GPG

The great thing about this adjusted playoff GPG number is that once data is adjusted, it is comparable to regular season data (in the sense that the averages are the same, but not in the sense that the composition of teams was the same).

Expected GPG was calculated as follows:

for each team: PO games * regular season GPG = expected goals
sum PO games for all teams, sum expected goals for all teams
Expected GPG = (sum of expected goals)/(sum of PO games)

Actual GPG = (total PO GF+GA)/(total PO games)

Adjusted Playoff GPG = Reg. Season GPG * (Actual PO GPG/Expected PO GPG)
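
The three formulas above can be sketched roughly as follows. This is only an illustration: the class and function names are invented, each team's regular season GPG is assumed to mean (GF + GA) per game, and the two-team series is hypothetical data, not real playoff results.

```python
from dataclasses import dataclass


@dataclass
class TeamPlayoffs:
    name: str
    rs_gpg: float   # regular season (GF + GA) per game; assumed definition
    po_games: int   # playoff games played
    po_gf: int      # playoff goals for
    po_ga: int      # playoff goals against


def expected_po_gpg(teams):
    """Playoff-games-weighted average of each team's regular season GPG."""
    expected_goals = sum(t.po_games * t.rs_gpg for t in teams)
    games = sum(t.po_games for t in teams)
    return expected_goals / games


def actual_po_gpg(teams):
    """(total PO GF + GA) / (total PO games), both summed over teams."""
    goals = sum(t.po_gf + t.po_ga for t in teams)
    games = sum(t.po_games for t in teams)
    return goals / games


def adjusted_po_gpg(league_rs_gpg, actual, expected):
    """Reg. season GPG scaled by how playoff scoring compared to expectation."""
    return league_rs_gpg * (actual / expected)


# Hypothetical four-game series between two made-up teams
teams = [
    TeamPlayoffs("A", rs_gpg=6.0, po_games=4, po_gf=12, po_ga=8),
    TeamPlayoffs("B", rs_gpg=5.0, po_games=4, po_gf=8, po_ga=12),
]
exp = expected_po_gpg(teams)
act = actual_po_gpg(teams)
print(exp, act, adjusted_po_gpg(5.6, act, exp))
```

Games and goals are both counted once per team, so the double counting cancels out in each ratio.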

YR RSG/G EPG/G PG/G ADJ. %RS
-------------------------------------
2011 5.46 5.36 5.61 5.71 105%
2010 5.53 5.49 5.98 6.02 109%
2009 5.70 5.96 5.48 5.36 94%
2008 5.45 5.46 5.46 5.47 100%
2007 5.76 5.77 4.95 5.07 88%
2006 6.05 6.20 5.66 5.64 93%
2004 5.14 5.06 4.40 4.47 87%
2003 5.31 4.97 4.71 4.86 93%
2002 5.23 4.77 5.24 4.76 91%
2001 5.51 5.64 4.80 4.69 85%
2000 5.49 5.36 4.67 4.79 87%
1999 5.27 5.23 5.12 5.15 98%
1998 5.28 5.19 5.05 5.13 97%
1997 5.83 5.75 5.35 5.43 93%
1996 6.29 6.38 5.88 5.80 92%
1995 5.97 5.95 6.36 6.38 107%
1994 6.48 6.42 5.72 5.78 89%
1993 7.25 7.30 6.84 6.79 94%
1992 6.96 6.80 6.43 6.42 92%
1991 6.91 6.83 6.59 6.50 94%
1990 7.36 7.22 6.65 6.78 92%
1989 7.48 7.42 6.54 6.59 88%
1988 7.43 7.36 7.57 7.63 103%
1987 7.33 7.18 6.28 6.41 87%
1986 7.93 7.81 6.50 6.56 83%
1985 7.77 7.74 7.19 7.22 93%
1984 7.88 7.99 6.31 6.22 79%
1983 7.72 7.60 7.53 7.65 99%
1982 8.02 8.07 7.21 7.17 89%
1981 7.68 7.60 7.94 8.02 104%
1980 7.02 6.95 6.60 6.66 95%
1979 6.99 7.03 6.18 6.14 88%
1978 6.59 6.53 5.78 5.83 88%
1977 6.64 6.67 6.39 6.35 96%
1976 6.82 6.75 5.79 5.85 86%
1975 6.85 7.09 6.14 5.93 87%
1974 6.39 6.38 5.76 5.77 90%
1973 6.55 6.58 6.34 6.32 96%
1972 6.13 6.13 6.03 6.03 98%
1971 6.24 6.18 6.12 6.34 102%
1970 5.81 5.77 6.06 6.10 105%
1969 5.96 5.88 5.67 5.75 96%
1968 5.58 5.45 5.60 5.73 103%
1967 5.96 5.80 5.44 5.59 94%
1966 6.08 5.91 5.31 5.46 90%
1965 5.75 5.64 5.15 5.25 91%
1964 5.55 5.43 5.33 5.45 98%
1963 5.95 5.63 5.69 6.01 101%
1962 6.02 5.85 5.67 5.83 97%
1961 6.00 5.78 5.00 5.19 87%
1960 5.90 5.71 5.43 5.61 95%
1959 5.80 5.80 6.00 6.00 103%
1958 5.60 5.64 6.19 6.15 110%
1957 5.38 5.32 5.60 5.66 105%
1956 5.07 5.02 5.60 5.65 111%
1955 5.04 5.02 5.75 5.77 114%
1954 4.80 4.65 4.31 4.45 93%
1953 4.79 4.94 5.50 5.34 111%
1952 5.19 4.95 3.93 4.12 79%
1951 5.42 5.16 4.38 4.60 85%
1950 5.47 5.21 4.32 4.53 83%
1949 5.43 5.29 4.63 4.74 87%
1948 5.86 5.67 5.80 5.99 102%
1947 6.32 6.00 5.25 5.53 87%
1946 6.69 6.44 6.50 6.75 101%
1945 7.35 7.33 4.90 4.91 67%
1944 8.17 7.41 5.57 6.14 75%

I will present some individual data from application of these results, probably in a separate thread.
 

Czech Your Math

I am lizard king
Jan 25, 2006
5,169
303
bohemia
Are you sure you are excluding shootout goals?

Actually, I forgot to do that for 2011, but it has now been corrected. I'm still mystified as to why 2007, 2008 and 2011 team GF and GA data didn't match in the parity/standard deviation data presented earlier. I'm sure there are errors in my data, but they should be relatively few and are certainly unintentional. Anyone who has the same or similar data, or the means and inclination to calculate such is encouraged to do so.
 

plusandminus

Registered User
Mar 7, 2011
1,404
268
Actually, I forgot to do that for 2011, but it has now been corrected. I'm still mystified as to why 2007, 2008 and 2011 team GF and GA data didn't match in the parity/standard deviation data presented earlier. I'm sure there are errors in my data, but they should be relatively few and are certainly unintentional. Anyone who has the same or similar data, or the means and inclination to calculate such is encouraged to do so.

OK. This is what I get when summarizing.

Seas|Seas2|G|ESG|PPG|SHG|ESperc|PPperc|SHperc
1963|1964|1166|896|240|30|0.76844|0.20583|0.02573
1964|1965|1208|862|291|55|0.71358|0.24089|0.04553
1965|1966|1277|950|299|28|0.74393|0.23414|0.02193
1966|1967|1252|966|253|33|0.77157|0.20208|0.02636
1967|1968|2476|1918|490|68|0.77464|0.19790|0.02746
1968|1969|2718|2096|550|72|0.77116|0.20235|0.02649
1969|1970|2649|1904|663|82|0.71876|0.25028|0.03096
1970|1971|3409|2552|752|105|0.74861|0.22059|0.03080
1971|1972|3348|2512|731|105|0.75030|0.21834|0.03136
1972|1973|7431|5789|1443|199|0.77903|0.19419|0.02678
1973|1974|7394|5772|1406|216|0.78063|0.19015|0.02921
1974|1975|8964|6815|1876|273|0.76026|0.20928|0.03046
1975|1976|8888|6652|1971|265|0.74842|0.22176|0.02982
1976|1977|8212|6379|1604|229|0.77679|0.19532|0.02789
1977|1978|7317|5644|1491|182|0.77135|0.20377|0.02487
1978|1979|6673|5058|1452|163|0.75798|0.21759|0.02443
1979|1980|5902|4493|1286|123|0.76127|0.21789|0.02084
1980|1981|6457|4639|1607|211|0.71845|0.24888|0.03268
1981|1982|6741|5011|1536|194|0.74336|0.22786|0.02878
1982|1983|6493|4807|1493|193|0.74034|0.22994|0.02972
1983|1984|6627|4839|1547|241|0.73019|0.23344|0.03637
1984|1985|6530|4801|1498|231|0.73522|0.22940|0.03538
1985|1986|6667|4714|1716|237|0.70706|0.25739|0.03555
1986|1987|6165|4431|1516|218|0.71873|0.24590|0.03536
1987|1988|6237|4109|1861|267|0.65881|0.29838|0.04281
1988|1989|6286|4248|1778|260|0.67579|0.28285|0.04136
1989|1990|6189|4363|1599|227|0.70496|0.25836|0.03668
1990|1991|5805|4084|1493|228|0.70353|0.25719|0.03928
1991|1992|6123|4174|1700|249|0.68169|0.27764|0.04067
1992|1993|7311|4918|2081|312|0.67268|0.28464|0.04268
1993|1994|7081|4801|1975|305|0.67801|0.27892|0.04307
1994|1995|3727|2619|964|144|0.70271|0.25865|0.03864
1995|1996|6701|4471|1927|303|0.66721|0.28757|0.04522
1996|1997|6216|4527|1422|267|0.72828|0.22876|0.04295
1997|1998|5624|3873|1491|260|0.68866|0.26511|0.04623
1998|1999|5830|4077|1533|220|0.69931|0.26295|0.03774
1999|2000|6306|4594|1496|216|0.72851|0.23723|0.03425
2000|2001|6782|4638|1877|267|0.68387|0.27676|0.03937
2001|2002|6442|4621|1601|220|0.71732|0.24853|0.03415
2002|2003|6530|4513|1787|230|0.69112|0.27366|0.03522
2003|2004|6318|4357|1717|244|0.68962|0.27176|0.03862
2005|2006|7443|4580|2545|318|0.61534|0.34193|0.04272
2006|2007|7082|4715|2099|268|0.66577|0.29639|0.03784
2007|2008|6691|4581|1871|239|0.68465|0.27963|0.03572
2008|2009|7006|4833|1938|235|0.68984|0.27662|0.03354
2009|2010|6803|4948|1664|191|0.72733|0.24460|0.02808
2010|2011|6721|4944|1571|206|0.73560|0.23374|0.03065
 

Czech Your Math

I am lizard king
Jan 25, 2006
5,169
303
bohemia
I don't ask anyone to take any of the numbers presented as gospel. The important thing is the logic behind the calculations and the methodology involved in calculating the numbers.

I believe the most likely wide-ranging improvements might come from:

- Year over Year studies: Keeps constant the upper tiers of players and is possibly the only way to find variations from season to season rather than between eras (larger groups of years). Must have large enough group of players that there are a handful of players born in each year (to keep age from being a bias) and a substantial number of players in each two consecutive seasons (so the sample is large enough for a valid study, several outliers on each extreme of % change can be excluded and still have a large median sample remaining from which to calculate good numbers).

- Studies of tiers (whether proportional such as 1st N to 12th N or tiers using fixed numbers... I tend to think a combination of the two may be best, but weighted much more toward proportionality): This shows how the top players (which vary from season to season) performed on an adjusted basis. However, since the players vary from season to season, significant changes in the quality of the league will greatly influence the results of these types of studies, which necessitates a lot more qualitative discussion/analysis to reach results worthy of using in further refinement of adjusted data.

- Indirect data which influences scoring: This would include such data as power play data, league parity, strength of schedule, etc. It's going to be tricky to integrate much of this type of data, but it may be worthwhile in some cases.
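
To make the year over year idea a bit more concrete, here is a rough sketch. It assumes the comparison is done on points per game, that players are matched by name across two consecutive seasons, and that a fixed number of outliers is dropped from each extreme before averaging. None of those details are settled above, the names are invented, and the sample numbers are hypothetical. It also does nothing about balancing birth years.

```python
from statistics import mean


def yoy_factor(prev, curr, trim=1):
    """Average season-to-season change in points per game.

    prev, curr: dicts mapping player -> points per game in each season.
    trim: number of outliers dropped from each extreme of the ratios.
    """
    common = set(prev) & set(curr)          # only players in both seasons
    ratios = sorted(curr[p] / prev[p] for p in common if prev[p] > 0)
    if len(ratios) <= 2 * trim:
        raise ValueError("not enough players left after trimming")
    kept = ratios[trim:len(ratios) - trim]  # drop the extremes
    return mean(kept)


# Hypothetical points-per-game figures for two consecutive seasons
prev_season = {"A": 1.0, "B": 0.8, "C": 1.2, "D": 0.5, "E": 1.0}
curr_season = {"A": 1.1, "B": 0.8, "C": 1.2, "D": 1.0, "E": 0.5, "F": 0.9}

print(yoy_factor(prev_season, curr_season))
```

Player F is ignored because he only appears in one season, and D and E are dropped as the outliers at each end.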
 

Czech Your Math

I am lizard king
Jan 25, 2006
5,169
303
bohemia
Style of play, being a 2nd liner, etc., is hard to quantify.
There are other things we cannot consider. So our attempts at adjusting scoring have limited value.

Looking at greatness of players, scoring is also only one factor - scoring ability. This makes a guy scoring 1+1 but allowing 3 ES goals, appear better than a guy scoring 0+1 without allowing ES goals.

It's important to separate the things which are quantifiable from those that aren't. In your ES goal example, looking at adjusted plus-minus is an example of taking raw data with a lot of flaws and adjusting it to make it a valuable metric in most cases. There are still flaws which can't be factored out (at least not easily), but they are mostly inherent in the data. Basically, the metric has likely been taken to its practical limit by adjusting it with a sound methodology, and that's all that can be expected. Any further refinement may only be of use in individual examples (e.g. when comparing Sakic and Forsberg, one can look at linemates and situational ice time, as was done in a thread comparing the two).
 

plusandminus

Registered User
Mar 7, 2011
1,404
268
I don't ask anyone to take any of the numbers presented as gospel. The important thing is the logic behind the calculations and the methodology involved in calculating the numbers.

I believe the most likely wide-ranging improvements might come from:

- Year over Year studies: Keeps constant the upper tiers of players and is possibly the only way to find variations from season to season rather than between eras (larger groups of years). Must have large enough group of players that there are a handful of players born in each year (to keep age from being a bias) and a substantial number of players in each two consecutive seasons (so the sample is large enough for a valid study, several outliers on each extreme of % change can be excluded and still have a large median sample remaining from which to calculate good numbers).

- Studies of tiers (whether proportional such as 1st N to 12th N or tiers using fixed numbers... I tend to think a combination of the two may be best, but weighted much more toward proportionality): This shows how the top players (which vary from season to season) performed on an adjusted basis. However, since the players vary from season to season, significant changes in the quality of the league will greatly influence the results of these types of studies, which necessitates a lot more qualitative discussion/analysis to reach results worthy of using in further refinement of adjusted data.

- Indirect data which influences scoring: This would include such data as power play data, league parity, strength of schedule, etc. It's going to be tricky to integrate much of this type of data, but it may be worthwhile in some cases.

I think we might be heading in different directions. I am not really with you on the importance and chronology of some of the things you suggest.
For one, I think adjusting for schedule (which I have spent the day doing, see posts above) is a natural start. (Isn't it natural to start with making adjustments within seasons, so that we adjust for things like some players facing worse teams GA wise more often than others?)

From that start, we can then do other calculations and adjustments. For example, we can recalculate seasonal total goals per game based on the schedule adjusted data. Then we calculate them - when possible - separately for ES, PP and SH. We also look at EN goals. We get new factors for ES, PP and SH, which we apply separately.

I also think you sometimes post a lot of numbers, but you don't say very much about what they actually mean/tell, or how you think one can use them? Maybe that is intentional, as you're still developing your thoughts? Maybe you are at this stage posting different adjustment components, that later on are going to be merged together?

I do think you're on an interesting path regarding looking at what you call "tiers". I intend to do that too, but not tonight. In the thread that seemed to inspire this thread, I started doing it, but didn't complete it. (Well, in a way I did complete it, as I chose the "average player" as the reference.)
But let's suppose we list scoring by "tiers". It is interesting, but using it to adjust scoring would still be arbitrary. If we adjust so that one tier looks good, then other tiers may not look good. When adjusting them, the first one gets worse. Still, we probably need to be consistent in the way we adjust based on tiers.
 

Czech Your Math

I am lizard king
Jan 25, 2006
5,169
303
bohemia
I think we might be heading in different directions. I am not really with you on the importance and chronology of some of the things you suggest.
For one, I think adjusting for schedule (which I have spent the day doing, see posts above) is a natural start. (Isn't it natural to start with making adjustments within seasons, so that we adjust for things like some players facing worse teams GA wise more often than others?)

That's an interesting approach, I see your point. For example, in studying a fixed group of players over time (the year over year study... similar to studies of league quality) it would make sense to adjust for schedule first (if you were going to do so at all).

How do you deal with the games each player misses or traded players? Is that already built into your adjustment? Do you deduct the team's GF against an opponent from that opponent's GA?

I think if the methodology is correct, it makes sense to adjust based on schedule strength. The effect seems significant enough to make it potentially worthwhile. I hadn't given much thought to chronology of making adjustments, but you are right it can be important. I was thinking more along the lines of looking at different factors and how they can be quantified and/or analyzed, then figuring out the best methods and order to do so.

From that start, we can then do other calculations and adjustments. For example, we can recalculate seasonal total goals per game based on the schedule adjusted data. Then we calculate them - when possible - separately for ES, PP and SH. We also look at EN goals. We get new factors for ES, PP and SH, which we apply separately.

I don't see how the total goals per game is going to change based on schedule adjustments, it seems like only the adjusted distribution of those goals is going to change, can you explain this more?

The ES/PP/SH/EN goals would have to be applied separately as you suggest, since each player scores different proportions of each. Where do you find the assist data for these categories? That would be essential in properly adjusting each season.

I also think you sometimes post a lot of numbers, but you don't say very much about what they actually mean/tell, or how you think one can use them? Maybe that is intentional, as you're still developing your thoughts? Maybe you are at this stage posting different adjustment components, that later on are going to be merged together?

You are right, at least as far as this thread goes, there's an abundance of numbers and often a lack of explanations. I have explained methodology mostly, so that those who are interested can at least try to understand how the numbers were obtained. I don't have a magic answer that will solve everything instantly.

Basically, I have presented a lot of data, and as it stands now:

Year over Year study: I detailed the methodology and also gave an example of how the best adjusted point seasons (e.g. from WWII until expansion) would change substantially using this method. While my explanation wasn't nearly exhaustive, I think explaining much more at this point would only further confuse most people. I'll gladly answer any questions or try to correct any confusion.

Tiers of players: The difficulty with this data is that it would be more easily understood in graphical form. I have some graphs in Excel format, but I'm not sure how to post them here. Maybe someone can advise how I could do so? I can still add some commentary even without the graphs, but looking at the graphs reinforces the trends much more than digging through an overwhelming amount of data.

Power play data: I think most people can look at this data and see the trends or spikes and correlate them with certain seasons or eras. These could also be presented in graphical form, although it's not quite as essential as for the tiered data.

Adjusted playoff data: I only presented this because at least one person specifically requested this type of data. This isn't further refinement of adjusted numbers; it is adjusting playoff scoring to catch up to the adjustments already done to regular season data. I will present some applications of the adjusted playoff GPG numbers that were calculated, but may start a separate thread at some point to do so.

I do think you're on an interesting path regarding looking at what you call "tiers". I intend to do that too, but not tonight. In the thread that seemed to inspire this thread, I started doing it, but didn't complete it. (Well, in a way I did complete it, as I chose the "average player" as the reference.)
But let's suppose we list scoring by "tiers". It is interesting, but using it to adjust scoring would still be arbitrary. If we adjust so that one tier looks good, then other tiers may not look good. When adjusting them, the first one gets worse. Still, we probably need to be consistent in the way we adjust based on tiers.

You are right, as someone has to score the points... if one tier scores less, then other tier(s) have to score more to balance it out. However, it does show the trends for each tier and the gaps between tiers. I think there are a lot of interesting things to be learned from this approach, but how easily that will be translated into further refining adjusted scoring for the top tiers is not something I can definitively say at this time.
 

livewell68

Registered User
Jul 20, 2007
8,680
52
Another thing about adjusted stats is that they make jaromir jagr look very god like. If you adjust his four year peak from 1998-2001 and put him in the nhl from 1980 to 1983, his stats look like this.

1997-98: 102 points. adjust to 1979-80: 135 points in 77 games
1998-99: 127 points. adjust to 1980-81: 184 points
1999-00: 96 points. adjust to 1981-82: 140 points in 63 games
2000-01: 121 points. adjust to 1982-83: 169 points

This is without including his 1994-95, 1995-96, 1996-97 seasons which are a part of his peak or prime years and then add 2005-06 to the mix and I would be interested to see where he ranks in terms of adjusted points for peak.
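
For reference, the translated totals quoted above look consistent with simply scaling points by the ratio of league GPG between the two seasons. That is my reading and not a confirmed method. The sketch below uses the RSG/G figures from the adjusted playoff data table earlier in the thread; the function and variable names are made up.

```python
# League-wide regular season GPG, copied from the RSG/G column posted earlier
LEAGUE_GPG = {
    1980: 7.02, 1981: 7.68, 1982: 8.02, 1983: 7.72,
    1998: 5.28, 1999: 5.27, 2000: 5.49, 2001: 5.51,
}


def translate_points(points, from_year, to_year, gpg=LEAGUE_GPG):
    """Scale a point total by the ratio of league scoring levels (assumed method)."""
    return points * gpg[to_year] / gpg[from_year]


# Jagr's four seasons moved back 18 years each
seasons = [(102, 1998, 1980), (127, 1999, 1981), (96, 2000, 1982), (121, 2001, 1983)]
for pts, src, dst in seasons:
    print(src, "->", dst, round(translate_points(pts, src, dst), 1))
```

With these inputs each season lands within about a point or so of the figures quoted above, and games played are left unchanged.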
 

plusandminus

Registered User
Mar 7, 2011
1,404
268
That's an interesting approach, I see your point. For example, in studying a fixed group of players over time (the year over year study... similar to studies of league quality) it would make sense to adjust for schedule first (if you were going to do so at all).

Nice that we agree.

How do you deal with the games each player misses or traded players? Is that already built into your adjustment?

That would require game by game analysis. Unfortunately that kind of data is either missing completely for some seasons, or contains errors.

Player misses games... Well, I only look at how favoured the player's team was during the whole season. So, the method "assumes" he missed a well balanced set of games. That's OK unless the player misses mostly games against either very weak or very strong opposition. I think we'll have to live with this. (Treating games missed is tricky as it is.)

Trades... Well, I can look at games played for each team they played with during the season. But extremes may bias.


Do you deduct the team's GF against an opponent from that opponent's GA?

Yes.

I can also run my method consecutive times in order to refine. First time, I use real GFs and GAs. Second time, I can use the schedule adjusted ones to refine further. I didn't do that, but can do it.
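
For readers trying to follow along, one way such a schedule adjustment could look is sketched below. This is an illustrative guess and not necessarily the exact method used for the tables above: it assumes each opponent's GA rate is computed with the team's own games removed, weights those rates by games played against each opponent, and compares the result with the league average rate. The names and the three-team schedule are hypothetical.

```python
from collections import defaultdict


def schedule_adjusted_gf(matchups):
    """matchups: dict (team, opp) -> (games, goals team scored against opp).

    Both directions of every pairing must be present. Assumes every
    opponent also plays at least one game against somebody else.
    """
    gp = defaultdict(int)
    gf = defaultdict(int)
    ga = defaultdict(int)
    for (team, opp), (games, goals) in matchups.items():
        gp[team] += games
        gf[team] += goals
        ga[opp] += goals
    league_rate = sum(gf.values()) / sum(gp.values())  # goals per team-game

    adjusted = {}
    for team in list(gp):
        expected = 0.0
        for (t, opp), (games, goals) in matchups.items():
            if t != team:
                continue
            # Opponent's GA rate with this team's own games deducted
            opp_rate = (ga[opp] - goals) / (gp[opp] - games)
            expected += games * opp_rate
        factor = expected / (gp[team] * league_rate)  # > 1 means easy schedule
        adjusted[team] = gf[team] / factor
    return adjusted


# Hypothetical unbalanced schedule: B and C meet twice as often
matchups = {
    ("A", "B"): (2, 8), ("B", "A"): (2, 4),
    ("A", "C"): (2, 6), ("C", "A"): (2, 6),
    ("B", "C"): (4, 8), ("C", "B"): (4, 16),
}
print(schedule_adjusted_gf(matchups))
```

A second pass, as described above, would feed the adjusted totals back in place of the real ones; that step is left out here.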


I think if the methodology is correct, it makes sense to adjust based on schedule strength. The effect seems significant enough to make it potentially worthwhile. I hadn't given much thought to chronology of making adjustments, but you are right it can be important. I was thinking more along the lines of looking at different factors and how they can be quantified and/or analyzed, then figuring out the best methods and order to do so.

OK. Well, I sometimes improvise and sometimes am more systematic. I just reacted because you several times wrote about being systematic (maybe not necessarily that particular word). One can improvise first, and then tie things together.


I don't see how the total goals per game is going to change based on schedule adjustments, it seems like only the adjusted distribution of those goals is going to change, can you explain this more?

Yes, the total sum of goals for the season should remain the same.
And the distribution within teams too.
It's only if one starts to factor in things like team GF and so on that this would change, but that's not first on my agenda. ;)


The ES/PP/SH/EN goals would have to be applied separately as you suggest, since each player scores different proportions of each. Where do you find the assist data for these categories? That would be essential in properly adjusting each season.

Lack of data is a problem. SHG and PPG seem to be missing for older seasons (early 60s and earlier). SHA and PPA are missing even more often; I don't remember to what extent. It can be frustrating to not have the necessary data.
Here are game by game data (although partly incomplete and also containing many errors):
http://sports.groups.yahoo.com/group/hockey_summary_project
(You need to subscribe.)

There's another group too, containing seasonal data. Can't look it up at this moment, but hockey database or something.


(You mention what you have done so far.)

Good.

(About tiers.)
You are right, as someone has to score the points... if one tier scores less, then other tier(s) have to score more to balance it out. However, it does show the trends for each tier and the gaps between tiers. I think there are a lot of interesting things to be learned from this approach, but how easily that will be translated into further refining adjusted scoring for the top tiers is not something I can definitively say at this time.

We'll see what comes out of it.
 
