What specific variables does one need to account for when presenting a statistic that shows what happens when a given player is on the ice? How do those variables impact that number? Again, the intent isn't necessarily to assign responsibility or lay blame, but to show, as much as possible, what happens when a player is on the ice which, combined with other measures (such as how other players do without them) can provide some insight as to who is doing what out there. Over a big enough sample those variables will smooth out and you're going to glean some useful information from what's left.
I don't think I really understand what you're trying to say here.
The metric is event based. Player A acts in a specific way that facilitates Action X, which generates Result Y. Result Y is attributed equally to Player A AND Players B-J (not accounting for goalies here). Anything that affects the process by which Action X comes to be is a variable. Anything. There are dozens, if not hundreds, of different Result Ys possible through Action X. What affects those outcomes are the variables at work. I'm sure you can think of many different things at work during a fast-paced game that will affect the result of Player A's shot that is intended to make its way to the net.
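To make the attribution point concrete, here's a rough sketch of how an on-ice tally works. The event format, the field names (`shooting_side`, `defending_side`) and the players A through J are all made up for illustration; this is not how any particular stats site actually stores its data.

```python
from collections import defaultdict

def on_ice_tallies(events):
    """Credit each event equally to every skater on the ice (goalies ignored)."""
    tallies = defaultdict(lambda: {"for": 0, "against": 0})
    for event in events:
        # Every skater on the shooting side gets the same +1 "for",
        # no matter who actually took the shot or made the play.
        for player in event["shooting_side"]:
            tallies[player]["for"] += 1
        # Every skater on the defending side gets the same +1 "against".
        for player in event["defending_side"]:
            tallies[player]["against"] += 1
    return dict(tallies)

# Hypothetical 5v5 events: two for the A-E unit, one for the F-J unit.
unit_1 = ["A", "B", "C", "D", "E"]
unit_2 = ["F", "G", "H", "I", "J"]
events = [
    {"shooting_side": unit_1, "defending_side": unit_2},
    {"shooting_side": unit_2, "defending_side": unit_1},
    {"shooting_side": unit_1, "defending_side": unit_2},
]

tallies = on_ice_tallies(events)
print(tallies["A"], tallies["E"])
```

Player A and Player E end up with identical numbers even if one of them drove every play and the other was changing lines. Nothing in the tally itself tells them apart.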
Now, ignoring the actual on-ice event, the biggest glaring omission in player assessment, particularly player vs. player, is the non-existence of what I'll call n-count variance. If one is to treat every event, regardless of variable, as an equal metric, then the variance in the number of events measured when comparing player vs. player is vital. If Player A's GF% is based upon 25% more events than Player B's, the comparison is inherently flawed from a statistical viewpoint. Not to mention that if what you say is true, and the sample size smooths it out, it's important to understand the baseline at which it does actually "smooth out".
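One rough way to put a number on the n-count issue is the standard error of a proportion. This is only a sketch, and it assumes each event is an independent coin flip (a plain binomial model), which real on-ice events certainly are not. The function name and the event counts are invented for illustration.

```python
import math

def share_with_error(events_for, events_against):
    """Return (share, standard error) under a simple binomial assumption."""
    n = events_for + events_against
    share = events_for / n
    # Standard error of a proportion: sqrt(p * (1 - p) / n).
    std_err = math.sqrt(share * (1 - share) / n)
    return share, std_err

# Hypothetical players with the same 55% share on different event counts.
player_a = share_with_error(55, 45)  # 100 events, 25% more than Player B
player_b = share_with_error(44, 36)  # 80 events

for name, (share, err) in (("A", player_a), ("B", player_b)):
    print(f"Player {name}: {share:.1%} +/- {err:.1%}")
```

Both players show the same share, but the uncertainty attached to it differs with the event count, and it only shrinks with the square root of n. That is the kind of baseline you would want stated before putting two players' numbers side by side.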
This is the reason why possession numbers work far better for predicting team success. Many of the variables at play are taken out of the picture, which makes the analysis far less variable-driven. Team A has X number of events for in a game, which means Team B has the same X events against. Each result is attributed to two entities only: the team that facilitated the action and the team the result of the action went against. That's it. Outliers are naturally accounted for because the results take every event during a game into consideration. (McDavid events are weighted equally compared to Larsson events.)
Now, take that same concept and apply it to units, which is the level at which Result Y is actually attributed. Five players on each side get an equally weighted share of Result Y, which measures the outcome of Action X. This can be useful as the n-count of events gets larger and larger when measuring the effectiveness of the unit. Again, because variables such as different teammates are removed (therefore eliminating performance outliers to a large degree), the unit-based metrics can be useful when viewed over a sample size large enough to "smooth out" the variable effect.
Now we get to the individual player aspect. I'll use an example to best illustrate this. I'm sure you would agree that there was some hype around the metrics that Caleb Jones was putting up early in his season. Using the understanding of the above concepts, it's not hard to see that those metrics were incredibly skewed when it came to judging the usefulness of the player in comparison to his teammates, and they most certainly offered no credible insight into his future trajectory. The reason for this is the exact reason you give for overcoming the variable effect: sample size, or in this case the lack of one. When comparing player vs. player, a lot of stuff is in flux that isn't when comparing team vs. team or the effectiveness of three forwards playing 5v5 together for 15+ games. This is because of how the metrics are measured.
I get that this is likely TL;DR, but this has always been the rub with possession stats and it continues to be the same. It's fairly telling that what has changed isn't the accounting for results, the accounting for outliers, or even the idea of a shift baseline (what events transpire on an average McDavid shift) that could be used to further understand player contribution. What has changed is the introduction of a subjective assessment of "shot quality" under the guise of an objective measurement. In the end, the outcomes are the same. Very good predictor for team success, above-average predictor for unit production, dubious for individual player assessment.