Thanks for the thoughtful comments.
Thanks.
I'm not sure you understood me fully (or I understood you). But if you're happy with your method the way it is, it's OK.
I mentioned the importance of separating home games from away games, and (just like another poster often does when I point out things like these) you automatically seemed to assume I was probably wrong. That is a phenomenon I often encounter on this board when I try to point out less obvious things that apparently go against more or less fixed beliefs.
I'm skeptical. I know that teams generally play better and win more at home, but how many goalies are going to have a significant difference between home and away starts? I don't have the data to look into this, but if there are major differences let me know.
Just a few examples from the 2008-09 to 2010-11 seasons ("2008" is in this case 2008-2009):
Seas|Team|Name|hGP|aGP|Diff
2009|NAS|Pekka Rinne|38|20|18
2008|FLO|Tomas Vokoun|36|23|13
2010|MIN|Niklas Backstrom|32|19|13
2010|PHO|Ilya Bryzgalov|40|28|12
2009|MIN|Niklas Backstrom|35|25|10
2010|TBL|Dwayne Roloson|22|12|10
2008|SJS|Brian Boucher|6|16|-10
2009|VAN|Andrew Raycroft|5|16|-11
2008|OTT|Alex Auld|16|27|-11
2009|STL|Ty Conklin|7|19|-12
2010|PHO|Jason LaBarbera|2|15|-13
2010|MIN|Jose Theodore|8|24|-16
2009|NAS|Dan Ellis|6|25|-19
The data should show not only starts, but all games a goalie actually played in.
Data is mainly from hockeyreference. (I have spent many hours putting it into my own database, which means I have data for all seasons from 1987-88 to 2010-11 and can do more or less advanced studies on it.)
Do you still think this is not a significant difference?
If you don't trust me and my stats, I can give you a link to hockeyreference:
http://www.hockey-reference.com/players/r/rinnepe01/splits/2010/
And what do we see?
Name|Home/Away|GP|Save%|GAA
Pekka Rinne|Home|38|.921|2.31
Pekka Rinne|Away|20|.889|3.21
Quite a significant difference, wouldn't you say?
We can convert save_percentage to goals_allowed_percentage with the simple formula 1 - save_percentage. At home, he allows 7.9 % of all shots, and on the road he allows 11.1 % of all shots. That's about 1.4 times as many.
You mentioned presentation. Well, the presentation of official goalie stats may look appealing to North Americans (I cannot comment on that), but it comes at the price of oversimplifying things. You were already aware that situational play "biases" the official stats presented. Now you hopefully are aware that home and away play does too. (Yet another factor is strength of opposition.)
I thus think you mainly provide yet another more or less biased study. Following the advice I gave you would (in my opinion) have improved it.
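The arithmetic above can be checked with a few lines of Python. This is only a sketch; the function name `goals_allowed_rate` is my own, and the two save percentages are the home and away figures from the table above.

```python
def goals_allowed_rate(save_pct):
    """Share of shots that become goals: 1 - save percentage."""
    return 1.0 - save_pct

home_rate = goals_allowed_rate(0.921)  # home save percentage from the table
away_rate = goals_allowed_rate(0.889)  # away save percentage from the table

# How many times as many goals per shot he allows on the road as at home.
away_to_home = away_rate / home_rate

print(f"home: {home_rate:.1%}  away: {away_rate:.1%}  ratio: {away_to_home:.1f}")
```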
Regarding normalizing save percentage:
This goes back to my point about presenting & selling your analysis. If I tell everyone that Hasek has a career average adjusted save percentage of 93.0% (compared to a league average of 91.0%), people immediately understand that he was as dominant as his name suggests. If I say his career save ratio is 1.022, many people (myself included) would find that less informative because it's an abstract ratio rather than something intuitive and understandable.
In itself, save percentage says nothing about how well the goalie has performed (save percentage wise) compared to other goalies in the league. A save percentage of .890 can be very good or below average depending on era and context. Adding a column showing normalized save percentage would thus add information, especially when comparing seasons.
I actually prefer goal_allowance_percentage (which is 1 minus save_percentage), as I find it more telling and better at showing the actual difference between having a great goalie and a poor one. Let's compare save_percentages of .93 and .86, which means goal_allowance_percentages of .07 and .14. The .93 goalie is in that regard twice as good as the .86 goalie, as he allows half the number of goals that the .86 goalie allows (given they faced the same number of shots). The worse goalie allows 14 goals when the better one allows 7. This is to me a telling stat that immediately gives an indication of how team and skater stats are affected by goalie performance.
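A minimal sketch of what such a normalized column could look like, assuming "normalized" simply means the goalie's save percentage divided by the league average for that season (the same ratio that gives 1.022 for 93.0% against 91.0%). The function name and the two league averages in the .890 example are made up for illustration, not real season data.

```python
def save_ratio(save_pct, league_save_pct):
    """Save percentage relative to league average (1.0 = league average)."""
    return save_pct / league_save_pct

# The 93.0% vs 91.0% example quoted above.
hasek_ratio = save_ratio(0.930, 0.910)

# The same .890 in two hypothetical eras (league averages are invented).
low_era = save_ratio(0.890, 0.875)   # league saves less: .890 is above average
high_era = save_ratio(0.890, 0.905)  # league saves more: .890 is below average

print(f"{hasek_ratio:.3f} {low_era:.3f} {high_era:.3f}")
```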
I'm not sure how much you have used normalized save percentage, but I'm sure you would find it very useful to include in your goaltending formulas.
The main purpose is this puts goaltending into the currency of winning, which is ultimately the goal of any team. Save percentage is a good statistic but it can be rather abstract - it's not immediately obvious how many games a goalie with a 93% save percentage would help his team win relative to a goalie with say a 92% save percentage.
On one hand, if a goalie plays on a stronger team, it increases the number of games we'd expect him to win per the formula. However, it should also increase the number of games he'd actually win in real life, so he wouldn't be penalized.
I tried to show you a (in my opinion) simpler, yet maybe even more "fair"/"accurate" (to the goalies) way of doing it. I haven't dug deep into this, but my suggestions would be "fairer" to the goalies. In your presentation, you focus on goalies, and rank goalies, as if individual goalie importance is the key thing you're after. If you want to "rank" goalies, I don't think your method is as good as what I suggested. To me, it at this point looks as if you have done a combined "goalie and skaters performance" study, but present it as being a goalie study.
My method would "isolate" goalie performance, by putting all goalies into an average team. But if one wants to focus on it the way you do - which according to comments in the thread appears to be popular - then it's of course OK to do so.
Regarding "not mixing apples and oranges"...
What do you mean by expected and factual?
a. We can use the Pythagorean win formula to calculate the expected number of wins with an average goalie.
b. We can also use it to calculate the expected number of wins based on the particular goalie's actual stats.
c. We can also use factual wins ("decisions").
In this case, I would probably prefer to use a and b, or possibly a and a combined b/c.
I haven't dug into this more than shallowly.
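To make a and b concrete, here is a rough sketch. The Pythagorean formula is win% = GF^k / (GF^k + GA^k); the exponent k = 2 is an assumption on my part (other values are sometimes used for hockey), and all team numbers below are invented for illustration, not taken from any real season.

```python
def pythagorean_win_pct(goals_for, goals_against, exponent=2.0):
    """Expected win share from goals for and against (exponent is an assumption)."""
    gf = goals_for ** exponent
    ga = goals_against ** exponent
    return gf / (gf + ga)

def expected_goals_against(shots_against, save_pct):
    """Goals allowed if the given save percentage is applied to the shots faced."""
    return shots_against * (1.0 - save_pct)

# Hypothetical team: all four numbers are made up.
goals_for = 230
shots_against = 2400
league_save_pct = 0.910
goalie_save_pct = 0.925

# a. expected win share with a league-average goalie behind the same team
win_pct_average = pythagorean_win_pct(
    goals_for, expected_goals_against(shots_against, league_save_pct))

# b. expected win share with this goalie's actual save percentage
win_pct_goalie = pythagorean_win_pct(
    goals_for, expected_goals_against(shots_against, goalie_save_pct))

# Difference between b and a, expressed in wins over an 82-game season.
wins_added = 82 * (win_pct_goalie - win_pct_average)
print(f"a: {win_pct_average:.3f}  b: {win_pct_goalie:.3f}  wins added: {wins_added:.1f}")
```

Comparing b against a keeps the skaters constant, so the difference is down to the goalie's save percentage alone.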
The NHL has a very strange way of rewarding draws, handing out an extra point in games where the teams are tied after 60 minutes of play. It would be best to always hand out 2 points (or always 3 points, as they do internationally) per game. I personally recalculate points when doing studies based on team performance, for example to 2, 1.5, 0.5, 0 rather than the bizarre 2, 2, 1, 0 system currently in use.
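A small sketch of that recalculation, assuming the four values stand for regulation win, overtime/shootout win, overtime/shootout loss and regulation loss, in that order. The result labels and the sample record are made up for illustration.

```python
# Points per result: the current system versus the recalculated one.
NHL_POINTS = {"reg_win": 2.0, "ot_win": 2.0, "ot_loss": 1.0, "reg_loss": 0.0}
RECALC_POINTS = {"reg_win": 2.0, "ot_win": 1.5, "ot_loss": 0.5, "reg_loss": 0.0}

def total_points(record, points_table):
    """Sum points for a record given as {result label: number of games}."""
    return sum(points_table[result] * games for result, games in record.items())

# Hypothetical 82-game record.
record = {"reg_win": 35, "ot_win": 10, "ot_loss": 8, "reg_loss": 29}

nhl_total = total_points(record, NHL_POINTS)
recalc_total = total_points(record, RECALC_POINTS)
print(nhl_total, recalc_total)
```

With the recalculated table every game hands out exactly 2 points in total (2 + 0 or 1.5 + 0.5), which is the whole point of the exercise.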
In this particular study of yours, shootouts will be special. Normally, they are often considered sort of a "lottery", but in this study - where the focus is on goaltending - it's a bit different.