Predictive of what? All play? I don't think so, and I haven't seen anything showing as much. Most of the support for using 5-on-5 stats has been fairly ad hoc reasoning: "the game is played 5 on 5", for example. PP ability is a skill. There may be greater variance in PP metrics, but that's not quite the same thing as them not being as predictive.
Interestingly, predictiveness is what advanced stats in hockey most struggles with.
Getting a bit off topic with this discussion, but anyway.
The more something is driven by individual ability, the less variance (across the set of all players in the league) you should expect from Period N to Period N+1, from teammate changes, from coaching changes, etc. This is inherently linked to predictability. As I implied, 5v5 metrics show less variance from Period N to Period N+1 than PP metrics do. This doesn't mean that the PP involves no individual ability, nor that 5v5 involves only individual ability. What it means is that if you want to predict Period N+1 metrics using Period N, you'll have more success predicting 5v5 metrics than PP metrics. Therefore, by giving more weight to 5v5 metrics in predictive models, you'll perform better in your prediction endeavours.
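To make that variance point concrete, here's a minimal simulation (all numbers are made up for illustration, not real NHL data): each player's observed rate is true skill plus sampling noise, and because PP minutes are much scarcer, the "PP" metric gets larger noise relative to the spread in skill. The period-to-period correlation, a common proxy for repeatability, comes out higher for the low-noise "5v5" metric.

```python
import random

random.seed(42)

def simulate(n_players, skill_sd, noise_sd):
    """Observed metric = true skill + independent sampling noise, two periods."""
    skills = [random.gauss(0, skill_sd) for _ in range(n_players)]
    period1 = [s + random.gauss(0, noise_sd) for s in skills]
    period2 = [s + random.gauss(0, noise_sd) for s in skills]
    return period1, period2

def pearson(xs, ys):
    """Pearson correlation coefficient, computed from scratch."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

# Hypothetical noise levels: 5v5 gets many more minutes, so its sampling
# noise is small relative to the skill spread; the PP's is much larger.
ev_r = pearson(*simulate(300, skill_sd=1.0, noise_sd=0.5))
pp_r = pearson(*simulate(300, skill_sd=1.0, noise_sd=2.0))
print(f"5v5 period-to-period r: {ev_r:.2f}")  # higher
print(f"PP  period-to-period r: {pp_r:.2f}")  # lower
```

Same underlying skill distribution in both cases; only the noise differs, and that alone is enough to make one metric much more repeatable than the other.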
The other reasoning you mention regarding 5v5 is also valid in a way, and more commonly used because it's repeated ad nauseam by lots of people. While the per-minute impact of 5v5 is smaller than that of the PP or PK, the fact that it's the situation most of the game is played in makes it the most important one. That's why you're much more likely to see a good-5v5/bad-PP team do better than a bad-5v5/good (even great) PP team. Correlating 5v5 goal differential, PP goals for, and PK goals against with wins or total points supports this statement.
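The minutes-played argument can be sketched the same way, with purely synthetic team seasons (the weights below are assumptions, not measured values): if 5v5 minutes carry most of the weight in standings points, its goal differential will correlate with points much more strongly than PP goals for does.

```python
import random

random.seed(0)

def pearson(xs, ys):
    """Pearson correlation coefficient, computed from scratch."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

ev_gd, pp_gf, points = [], [], []
for _ in range(320):  # hypothetical: ten seasons of a 32-team league
    ev = random.gauss(0, 1)  # team 5v5 goal differential (standardized)
    pp = random.gauss(0, 1)  # team PP goals for (standardized)
    # Standings points: 5v5 gets most of the minutes, so most of the weight.
    pts = 0.8 * ev + 0.2 * pp + random.gauss(0, 0.5)
    ev_gd.append(ev)
    pp_gf.append(pp)
    points.append(pts)

ev_corr = pearson(ev_gd, points)
pp_corr = pearson(pp_gf, points)
print(f"corr(5v5 GD, points): {ev_corr:.2f}")  # stronger
print(f"corr(PP GF, points):  {pp_corr:.2f}")  # weaker
```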
Don't get me wrong, though: I'm fully aware exceptions happen. That's an inherent limitation of statistics in general, in that basic stats, advanced statistical models, and machine learning models alike perform worse the farther you get from the mean of the distribution you're analysing.
Regarding predictiveness, there are several aspects you can attempt to predict, such as individual performance, impact on teammates, or team performance. And for each, there are a lot of things that can be predicted, whether it be individual goal scoring, overall game impact, individual points, defensive impact, etc. The thing is, it's true that predictiveness is something advanced stats struggle with - but it also happens to be what everything else (eye test, basic stats alone, etc.) struggles with. There is no denying that hockey is a fast-paced game involving many components that each have a small impact and can be hard to predict. It also involves a lot of unpredictable factors, such as coaching, impact from teammates, injuries, aging (both in progression and regression), etc. I remember reading an article that concluded the highest success rate one could consistently attain at predicting just game outcomes (win or lose) is about 65%, and we still haven't reached that ceiling.
Despite that, models that use advanced statistics fare better than those that use basic statistics alone. You're going to be more successful, for instance, at predicting Period N+1 iGF using Period N iGF and ixGF than using Period N iGF alone. He stopped doing this, but Manny from Corsica used to run a "models tournament" up until 1-2 years ago, and several models got about a 58-59% success rate, which is really good. For reference, the baseline (always picking the home team) yields about a 55% success rate. It's almost impossible to get a comparable set of picks from "hockey experts", but I'm ready to bet they don't perform much better than 58-59%, if they even reach that rate. There's still a lot of room for improvement, obviously, which will come as technology improves.
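The baseline-versus-model gap can be illustrated with a toy simulation (the home-ice edge and skill spread below are invented numbers, only meant to reproduce the rough 55% vs. high-50s pattern): picking the home team every time already wins a majority of games, and a model that also accounts for team strength does a few points better.

```python
import math
import random

random.seed(7)

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

HOME_EDGE = 0.2  # hypothetical home-ice advantage, in log-odds

games = []
for _ in range(20000):
    skill_diff = random.gauss(0, 0.3)          # home minus away team strength
    p_home = sigmoid(HOME_EDGE + skill_diff)   # true home win probability
    home_won = random.random() < p_home
    games.append((skill_diff, home_won))

# Baseline: always pick the home team.
baseline = sum(won for _, won in games) / len(games)

# "Model": pick home only when its estimated edge (skill + home ice) is positive.
model = sum((sigmoid(HOME_EDGE + d) > 0.5) == won for d, won in games) / len(games)

print(f"always-home accuracy: {baseline:.3f}")
print(f"model accuracy:       {model:.3f}")
```

The model only disagrees with the baseline on games where the away team looks stronger than home ice is worth, which is exactly where the extra information pays off.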