Black Gold Extractor
I suspect that the biggest contention between people who do historical rankings of players is whether league depth can be accounted for, as well as how it should be done. (See this thread page for a glimpse of that.)
VsX assumes that the league can reliably generate a typical 2nd-place scorer (or, to be more accurate, a typical highest non-outlier scorer) in every era and that this is a good benchmark to use to compare other players. Personally, I think that, averaged over several seasons, this is a good assumption. However, can we test this?
RATIONALE
First, let's consider a completely fictional list of numbers:
50, 43, 39, 38, 37, 36, 35, 34, 33, 32, ...
The VsX benchmark for this list would be 43. This is a little bit of an outlier, but 39/43 = 0.907 >= 0.9, so it stays put. The VsX score for the outlier would be 116.
Now, let's say that our sample size is halved.
50, 39, 37, 35, 33, ...
The VsX benchmark for this list would be 39. The VsX score for the outlier would be 128. That's a pretty big difference. But wait, I'm kind of cheating, aren't I? Let's halve the sample size another way:
50, 43, 38, 36, 34, ...
The VsX benchmark for this list would be 38. This is because 38/43 = 0.884 < 0.9, so instead of the 2nd place scorer we use the 3rd. The VsX score for the outlier would be 132 (as opposed to 116 for the full sample).
Yeah, I'm still kind of cheating. What if the 2nd number was 42 instead of 43?
50, 42, 39, 38, 37, 36, 35, 34, 33, 32, ... benchmark 42, VsX 119
50, 39, 37, 35, 33, ... benchmark 39, VsX 128
50, 42, 38, 36, 34, ... benchmark 42, VsX 119
While the above are heavily constructed examples showing the worst possible outcomes, they do raise the possibility that the loss of "resolution" due to a smaller talent pool could benefit the outlier's VsX score. The rest of the field is basically the same relative to the outlier, but the outlier can get some fairly different scores.
What if, instead, we wanted to estimate the top values X, Y, and Z for these lists from the rest of each list?
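The arithmetic in the examples above can be checked with a short sketch. This is only the simplified rule as used in these examples (take 2nd place unless 3rd place is below 90% of it, then take 3rd place), not the full VsX rule set; the function names `vsx_benchmark` and `vsx_score` are made up for illustration.

```python
def vsx_benchmark(points, threshold=0.9):
    """Simplified benchmark rule taken from the examples above.

    `points` is a list of point totals sorted from highest to lowest.
    The 2nd-place total is the benchmark unless the 3rd-place total is
    below `threshold` times it, in which case the 3rd place is used.
    """
    second, third = points[1], points[2]
    return second if third / second >= threshold else third


def vsx_score(player_points, benchmark):
    """Player's points as a percentage of the benchmark, rounded."""
    return round(100 * player_points / benchmark)


# The fictional lists from the examples (truncated where the text uses "...").
examples = {
    "full sample, 2nd = 43": [50, 43, 39, 38, 37, 36, 35, 34, 33, 32],
    "halved, 2nd dropped": [50, 39, 37, 35, 33],
    "halved, 2nd = 43 kept": [50, 43, 38, 36, 34],
    "halved, 2nd = 42 kept": [50, 42, 38, 36, 34],
}
for label, pts in examples.items():
    bench = vsx_benchmark(pts)
    print(label, "benchmark", bench, "VsX", vsx_score(pts[0], bench))
```

The same outlier of 50 lands anywhere from 116 to 132 depending on which players happen to be left in the sample.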
X, a, 39, 38, 37, 36, 35, 34, 33, 32, ...
Y, b, 37, 35, 33, ...
Z, c, 38, 36, 34, ...
It's not too hard given the above lists: just continue each list's trend up two more places. We would find that X = 41, Y = 41, and Z = 42. (If you're wondering, a = 40, b = 39, and c = 40.) As such, given the outlier of 50, VsX = VsY = 100 × 50/41 ≈ 122, and VsZ = 100 × 50/42 ≈ 119. That's a much smaller variation, and it doesn't matter if the second number is 43, 42, or 49.
So that's basically it. We estimate the highest non-outlier using a decent sample size and then we compare the real numbers to the estimate, hence VsEst. (Bonus: it sounds like "VsX" if you say it really quickly, too!)
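Here is a rough sketch of that extrapolation for the three fictional lists, assuming the known values sit at ranks 3 and below and that a plain least-squares line is good enough. The helper names `fit_line` and `extrapolate_top` are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def extrapolate_top(known, first_rank=3):
    """Estimate the values at ranks 1 and 2 from totals starting at `first_rank`."""
    ranks = list(range(first_rank, first_rank + len(known)))
    slope, intercept = fit_line(ranks, known)
    return slope * 1 + intercept, slope * 2 + intercept


# Known parts of the three lists (ranks 3 and below).
tails = {
    "X, a": [39, 38, 37, 36, 35, 34, 33, 32],
    "Y, b": [37, 35, 33],
    "Z, c": [38, 36, 34],
}
for label, tail in tails.items():
    top, second = extrapolate_top(tail)
    # Score of the outlier (50) against the estimated top value.
    print(label, round(top), round(second), round(100 * 50 / top))
```

Because the actual 2nd-place number never enters the fit, changing it from 43 to 42 or 49 leaves the estimate untouched.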
SPECIFICS
It's a bit tough to find a balance between a large sample size and equal opportunity for the sample being analyzed. Using the top 12 scorers basically guarantees a first-line-caliber player in the O6 era and provides an okay sample size. (If we're really pushing it, the top 12 is basically the starters for the 4-team era pre-consolidation, i.e. 1919-20 onward.)
I make two estimates for all years: one using the scorers ranked 2-12*, and one using the scorers ranked 5-12. Then I average the two estimates to give a final VsEst benchmark.
The estimate for each sample is made in OpenOffice Calc/Microsoft Excel by fitting a line to each sample and adding its slope and intercept (i.e. the fitted value at rank 1).
*Edited on March 31, 2018.
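For anyone who would rather not use a spreadsheet, the procedure above can be sketched in a few lines. This assumes the fit is points against rank (so slope + intercept is the line's value at rank 1); the function names and the top-12 totals are invented for illustration and are not real season data.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rank_one_estimate(points, first_rank, last_rank):
    """Fit points vs. rank over first_rank..last_rank; return slope + intercept."""
    ranks = list(range(first_rank, last_rank + 1))
    sample = points[first_rank - 1:last_rank]  # ranks are 1-based
    slope, intercept = fit_line(ranks, sample)
    return slope + intercept  # fitted value at rank 1


def vsest_benchmark(points):
    """Average of the rank-1 estimates from ranks 2-12 and ranks 5-12."""
    wide = rank_one_estimate(points, 2, 12)
    narrow = rank_one_estimate(points, 5, 12)
    return (wide + narrow) / 2


# Hypothetical top-12 point totals, sorted from 1st to 12th.
top12 = [120, 96, 94, 92, 90, 88, 86, 84, 82, 80, 78, 76]
benchmark = vsest_benchmark(top12)
print(benchmark, round(100 * top12[0] / benchmark))
```

Note that the 1st-place total is never part of either sample, so an outlier season at the top cannot drag the benchmark up with it.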
BENCHMARK COMPARISON
I'm going to compare the VsEst benchmark with the actual VsX benchmarks and what I'm going to call VsX*. VsX* is going to be VsX without the wartime fudges, Bathgate rule, and Orr rules applied. Additionally, the averaging for benchmarks is contracted by one on each end, e.g. the VsX* benchmark for 1988-89 is the average of 168, 155, 150, 115, 113, and 110, giving 135 instead of the original 139.
Yr. | VsX | VsX* | VsEst | Flag |
2016-17 | 89 | 89 | 92 | * |
2015-16 | 89 | 89 | 86 | |
2014-15 | 86 | 86 | 85 | |
2013-14 | 87 | 87 | 85 | |
2012-13 | 57 | 57 | 58 | |
2011-12 | 97 | 97 | 90 | |
2010-11 | 99 | 99 | 99 | |
2009-10 | 109 | 109 | 105 | |
2008-09 | 110 | 110 | 102 | |
2007-08 | 106 | 106 | 104 | |
2006-07 | 114 | 114 | 109 | |
2005-06 | 106 | 106 | 113 | * |
2003-04 | 87 | 87 | 86 | |
2002-03 | 104 | 104 | 105 | |
2001-02 | 90 | 90 | 84 | |
2000-01 | 96 | 96 | 103 | * |
1999-00 | 94 | 94 | 92 | |
1998-99 | 107 | 107 | 106 | |
1997-98 | 91 | 91 | 95 | * |
1996-97 | 109 | 109 | 104 | |
1995-96 | 120 | 120 | 128 | * |
1994-95 | 70 | 70 | 66 | |
1993-94 | 120 | 120 | 117 | |
1992-93 | 148 | 148 | 147 | |
1991-92 | 123 | 123 | 117 | |
1990-91 | 115 | 115 | 124 | * |
1989-90 | 129 | 129 | 124 | |
1988-89 | 139 | 135 | 145 | * |
1987-88 | 131 | 131 | 128 | |
1986-87 | 108 | 108 | 112 | * |
1985-86 | 141 | 141 | 139 | |
1984-85 | 135 | 135 | 131 | |
1983-84 | 121 | 126 | 128 | * |
1982-83 | 124 | 124 | 117 | |
1981-82 | 147 | 147 | 143 | |
1980-81 | 135 | 135 | 122 | |
1979-80 | 119 | 118 | 120 | ! |
1978-79 | 116 | 117 | 127 | * |
1977-78 | 109 | 108 | 106 | |
1976-77 | 105 | 105 | 107 | * |
1975-76 | 119 | 119 | 118 | |
1974-75 | 121 | 127 | 130 | * |
1973-74 | 91 | 101 | 100 | * |
1972-73 | 104 | 104 | 101 | |
1971-72 | 109 | 117 | 112 | * |
1970-71 | 98 | 116 | 111 | * |
1969-70 | 86 | 86 | 85 | |
1968-69 | 107 | 107 | 101 | |
1967-68 | 84 | 84 | 83 | |
1966-67 | 70 | 70 | 74 | * |
1965-66 | 78 | 78 | 84 | * |
1964-65 | 83 | 83 | 78 | |
1963-64 | 78 | 78 | 85 | * |
1962-63 | 81 | 81 | 78 | |
1961-62 | 84 | 84 | 79 | |
1960-61 | 90 | 90 | 85 | |
1959-60 | 80 | 80 | 81 | |
1958-59 | 83 | 84 | 84 | |
1957-58 | 71 | 71 | 75 | * |
1956-57 | 77 | 72 | 80 | * |
1955-56 | 71 | 71 | 78 | * |
1954-55 | 74 | 74 | 71 | |
1953-54 | 61 | 59 | 59 | |
1952-53 | 61 | 61 | 67 | * |
1951-52 | 69 | 69 | 64 | |
1950-51 | 66 | 66 | 68 | * |
1949-50 | 69 | 69 | 67 | |
1948-49 | 54 | 54 | 57 | * |
1947-48 | 60 | 60 | 60 | |
1946-47 | 63 | 63 | 66 | * |
1945-46 | 60 | 52 | 54 | ! |
1944-45 | 78 | 61 | 65 | ! |
1943-44 | 95 | 77 | 81 | ! |
1942-43 | 72 | 66 | 66 | |
1941-42 | 54 | 54 | 55 | |
1940-41 | 44 | 44 | 47 | * |
1939-40 | 43 | 43 | 44 | |
1938-39 | 44 | 44 | 44 | |
1937-38 | 44 | 44 | 45 | |
1936-37 | 45 | 45 | 43 | |
1935-36 | 40 | 40 | 41 | |
1934-35 | 47 | 47 | 48 | |
1933-34 | 43 | 42 | 44 | ! |
1932-33 | 44 | 44 | 44 | |
1931-32 | 50 | 50 | 52 | * |
1930-31 | 43 | 43 | 48 | * |
1929-30 | 62 | 62 | 65 | * |
1928-29 | 29 | 29 | 31 | * |
1927-28 | 35 | 35 | 38 | * |
1926-27 | 32 | 32 | 35 | * |
1925-26 | N/A | 29 | 31 | ! |
1924-25 | N/A | 36 | 41 | ! |
1923-24 | N/A | 23 | 24 | |
1922-23 | N/A | 34 | 32 | |
1921-22 | N/A | 33 | 37 | ! |
1920-21 | N/A | 33 | 38 | ! |
1919-20 | N/A | 40 | 43 | ! |
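The "*" and "!" denote seasons where the VsEst is greater than the VsX or VsX* value, respectively, by at least 2 points. This becomes more frequent as we go further back in time.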
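The flags in the last column of the table mark seasons where VsEst is at least 2 points above VsX ("*") or above VsX* ("!"). A minimal sketch of that check follows. It assumes "*" takes precedence when both apply, which is what the table appears to do, and the function name `flag` is made up.

```python
def flag(vsx, vsx_star, vsest, margin=2):
    """Return "*", "!" or "" for one season.

    `vsx` may be None for seasons with no VsX benchmark listed.
    """
    if vsx is not None and vsest - vsx >= margin:
        return "*"
    if vsest - vsx_star >= margin:
        return "!"
    return ""


# A few rows copied from the table: (season, VsX, VsX*, VsEst).
rows = [
    ("2016-17", 89, 89, 92),
    ("2015-16", 89, 89, 86),
    ("1979-80", 119, 118, 120),
    ("1925-26", None, 29, 31),
]
for season, vsx, vsx_star, vsest in rows:
    print(season, flag(vsx, vsx_star, vsest))
```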
BRIEF PLAYER COMPARISONS
This is the usual best 7-year VsX, unweighted.
Rk. | Player | VsX | VsEst | Debut |
1 | Wayne Gretzky | 155.6 | 155.3 | 1979-80 |
2 | Phil Esposito | 130.4 | 124.5 | 1963-64 |
3 | Gordie Howe | 125.5 | 123.4 | 1946-47 |
4 | Mario Lemieux | 119.8 | 119.1 | 1984-85 |
5 | Jaromir Jagr | 114.2 | 111.5 | 1990-91 |
6 | Bobby Orr | 114.8 | 108.6 | 1966-67 |
7 | Stan Mikita | 107.8 | 106.9 | 1958-59 |
8 | Sidney Crosby | 102.4 | 104.5 | 2005-06 |
9 | Bobby Hull | 108.3 | 104.2 | 1957-58 |
10 | Ted Lindsay | 104.4 | 103.3 | 1944-45 |
10 | Maurice Richard | 102.4 | 103.3 | 1942-43 |
12 | Marcel Dionne | 103.3 | 103.1 | 1971-72 |
13 | Guy Lafleur | 104.5 | 102.8 | 1971-72 |
14 | Jean Beliveau | 105.7 | 102.6 | 1950-51 |
15 | Bill Cowley | 97.0 | 100.2 | 1934-35 |
16 | Cy Denneny | N/A | 99.8 | 1914-15 |
17 | Alex Ovechkin | 98.4 | 99.7 | 2005-06 |
18 | Andy Bathgate | 101.1 | 99.6 | 1952-53 |
19 | Howie Morenz | 102.2 | 98.9 | 1923-24 |
20 | Joe Sakic | 97.7 | 97.1 | 1988-89 |
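For reference, a "best 7-year, unweighted" number like the ones in the table can be sketched as a plain average of a player's seven highest season scores. This assumes the seven seasons need not be consecutive; the function name and the season scores below are invented and do not belong to any real player.

```python
def best_n_average(season_scores, n=7):
    """Unweighted mean of the n highest season scores, to one decimal."""
    best = sorted(season_scores, reverse=True)[:n]
    return round(sum(best) / len(best), 1)


# Hypothetical per-season scores (100 * points / benchmark).
scores = [110.0, 95.5, 102.3, 88.0, 120.1, 99.9, 105.0, 91.2, 97.4]
print(best_n_average(scores))
```

Swapping the benchmark (VsX or VsEst) changes each season score, and the same averaging then gives the corresponding column.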
IS IT WORTH IT?
It's not massively different from VsX, and there are a couple of problems:
1) VsEst uses linear extrapolation on the tail end of a distribution. I'd probably be kicked out of stats class... if I had ever taken stats. (This may also be an issue.)
2) Sometimes, the reduction in the talent pool isn't random... which is a problem when it comes to statistical arguments. There's a possibility that the healthiest players were more fit to go to war (and thus the higher-end players were more likely to be away). The WHA obviously tried to poach the biggest stars from the NHL.
Thoughts?
EDIT (March 31, 2018): I modified the benchmark to include the 2nd place scorer for the first sample. I also included the debut date to see the distribution of talent over the eras in the top 20.
1910's: 1 (Denneny)
1920's: 1 (Morenz)
1930's: 1 (Cowley)
1940's: 3 (Richard, Lindsay, Howe)
1950's: 4 (Beliveau, Bathgate, Hull, Mikita)
1960's: 2 (Esposito, Orr)
1970's: 3 (Dionne, Lafleur, Gretzky)
1980's: 2 (Lemieux, Sakic)
1990's: 1 (Jagr)
2000's: 2 (Crosby, Ovechkin)