Again, you appear to be conflating the "how" and the "what". I don't know how Kuznetsov has turned around his numbers. Maybe Laviolette has helped him regain his form. Maybe Kuznetsov himself did some soul-searching this offseason and decided he needed to train better and dedicate himself to the game more so than in the past. Perhaps a wizard came and cast a spell on him, making him a dominant player in terms of xGF%. It'd be an odd spell, but I suppose it's possible. Maybe it's all of the above.
Unfortunately we are not privy to the "how" (at least I am not), but we do know that his on-ice performance metrics are significantly different, and better, than they were for the prior two seasons. I'm not trying to claim how he did it, I'm just saying it's more likely than not that his level of play is going to continue based on past data. I don't care if it's mostly attributable to Peter Laviolette, or Kuzy's wife, or his kids, or a wizard. I'm just saying he is likely to continue to have a positive net effect on the team's success as measured by on-ice goal differential. As someone who wants the Capitals to keep winning, that's all that I can ask for.
Regarding sample size:
Sample Size Calculator - Confidence Level, Confidence Interval, Sample Size, Population Size, Relevant Population - Creative Research Systems
Sample Size Calculator
Indeed, once the population size becomes large enough, the sample size needed doesn't depend on the population size at all. It does depend on the margin of error you are looking for: I think 3% is a reasonable margin of error, so maybe r^2 is as low as 0.20 or as high as 0.26, but for some reason I don't think this would alter your argument. I consulted the formulas and calculators as you asked, and I came to the same conclusion I did beforehand: 1000 is plenty big as a sample size, even with an infinite population. Perhaps you have a better calculator, or perhaps these sites are both using the incorrect formula? I'd love to learn something new!
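For anyone who wants to check the arithmetic without a website, here is a rough Python sketch of the standard sample-size formula for a proportion, with the finite population correction. I'm assuming this is what calculators like these use, and the function names, the 95% confidence level (z = 1.96), and the worst-case p = 0.5 are my own choices for illustration.

```python
import math

def required_sample_size(margin, z=1.96, p=0.5, population=None):
    """Sample size needed for a given margin of error on a proportion.

    Assumptions (mine, for illustration): 95% confidence (z = 1.96)
    and the worst-case proportion p = 0.5.
    """
    # Sample size for an effectively infinite population.
    n0 = (z ** 2) * p * (1 - p) / margin ** 2
    if population is None:
        return math.ceil(n0)
    # Finite population correction: shrinks n0 only when the
    # population is not much larger than the sample.
    return math.ceil(n0 / (1 + (n0 - 1) / population))

def margin_of_error(n, z=1.96, p=0.5):
    """Margin of error on a proportion for a sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)

if __name__ == "__main__":
    for pop in (10_000, 100_000, 1_000_000, None):
        label = "infinite" if pop is None else f"{pop:,}"
        print(f"population {label:>9}: n = {required_sample_size(0.03, population=pop)}")
    print(f"margin of error at n = 1000: {margin_of_error(1000):.4f}")
```

The required n levels off as the population grows, which is the point about population size dropping out. Under these assumptions a sample of 1000 gives a margin of error of roughly 3.1%.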
And while the answers to polling are typically multiple choice, and the data presented in our discussion are "highly variable numbers," fortunately we can still build confidence intervals around "highly variable numbers." That's the whole point of building confidence intervals.
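As a quick illustration that the same machinery works on continuous data, here is a minimal sketch of a 95% confidence interval for a mean. The values are simulated, not real hockey data, and the helper name, the normal approximation with z = 1.96, and the spread I chose are all assumptions on my part. For small samples a t critical value would be more appropriate.

```python
import math
import random
import statistics

def mean_confidence_interval(values, z=1.96):
    """Approximate confidence interval for the mean of continuous data.

    Uses the normal approximation (z = 1.96 for ~95%), which is an
    assumption that is reasonable for large samples.
    """
    n = len(values)
    mean = statistics.fmean(values)
    # Standard error of the mean: sample standard deviation / sqrt(n).
    se = statistics.stdev(values) / math.sqrt(n)
    return mean - z * se, mean + z * se

if __name__ == "__main__":
    rng = random.Random(0)
    # Simulated, made-up "highly variable" values (mean 50, sd 10).
    small = [rng.gauss(50, 10) for _ in range(100)]
    large = [rng.gauss(50, 10) for _ in range(1000)]
    for name, sample in (("n = 100", small), ("n = 1000", large)):
        lo, hi = mean_confidence_interval(sample)
        print(f"{name}: ({lo:.2f}, {hi:.2f}), width {hi - lo:.2f}")
```

The noisier the data, the wider the interval, but a larger sample still narrows it by about the square root of n.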
Confidence interval