I wonder: since the NHL does stat corrections, and they show up in the HTMLs (I mostly work with the HTMLs), do the corrections make it into the JSONs as well?
As for HTML parsing, I haven't had any problems using Perl's HTML::TreeBuilder, with the exception of some really old reports that were truly broken beyond repair. No cleanup needed beyond basic washing out.
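For illustration, here's the same tolerant-parsing idea sketched in Python's stdlib `html.parser` (the author uses Perl's HTML::TreeBuilder; this is just an analogous stream parser that shrugs off unclosed tags, with a made-up table snippet):

```python
from html.parser import HTMLParser

class CellCollector(HTMLParser):
    """Collect text from <td> cells; tolerant of unclosed tags,
    much as HTML::TreeBuilder is."""
    def __init__(self):
        super().__init__()
        self.in_td = False
        self.cells = []

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self.in_td = True
            self.cells.append("")  # start a new cell

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_td = False

    def handle_data(self, data):
        if self.in_td:
            self.cells[-1] += data.strip()

# Deliberately sloppy markup: unclosed <td> and <tr> tags,
# the kind of thing older report pages contain.
broken = "<table><tr><td>188<td>STOP<tr><td>189<td>FAC</table>"
p = CellCollector()
p.feed(broken)
print(p.cells)  # ['188', 'STOP', '189', 'FAC']
```

The stream-based parser never needs the document to be well-formed, which is why this style survives broken markup that a strict XML parser would reject.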
I would bet that both the JSON and the HTML are built from the same raw underlying data. Further, I think special partners have access to that raw data.
The evidence: for games such as 2008-2009 Regular Season Game 259 (two away players with jersey #5) and Game 1077 (two away players with jersey #23), the data problems were apparently enough to keep the HTML from being built at all. The data DOES exist, though, since the ESPN Gamecast feed contains the full play-by-play. (2008-2009 was before play-by-play was included in the NHL's JSON live/feed file; that only started in 2010-2011.)
And clock discontinuities appear in both the HTML and the JSON. Example: 2016-2017 Playoffs Game 0136, where Event #188 is a stoppage with 13:55 left in the 2nd period and Event #189 is a faceoff with 13:51 left in the 2nd (Events #191 and #192 in the JSON).
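Discontinuities like that one are easy to flag with a linear scan over the event list. A minimal Python sketch, using the Game 0136 numbers above (the tuple layout and event-type strings are invented for the example, not the actual feed schema):

```python
def clock_jumps(events):
    """Flag consecutive event pairs within the same period where the
    clock ran down across a stoppage -- it should have been frozen.
    Each event is (event_id, period, time_remaining_sec, type);
    this layout is made up for the sketch."""
    jumps = []
    for prev, cur in zip(events, events[1:]):
        if prev[1] == cur[1] and prev[3] == "STOP" and cur[2] < prev[2]:
            jumps.append((prev[0], cur[0], prev[2] - cur[2]))
    return jumps

# The Game 0136 example: stoppage at 13:55 of the 2nd period,
# then a faceoff at 13:51 -- four seconds vanished while play was dead.
events = [
    (187, 2, 14 * 60 + 10, "SHOT"),
    (188, 2, 13 * 60 + 55, "STOP"),
    (189, 2, 13 * 60 + 51, "FAC"),
]
print(clock_jumps(events))  # [(188, 189, 4)]
```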
Also, while I noted that there is stuff in the JSON that isn't in the HTML, it goes the other way, too. The HTML lists the players on the ice for each team at each event; the live/feed JSON doesn't. (The live.nhle.com/.../PlayByPlay.json DOES have that information, but it does NOT have all the events, only some of them.)
Bottom line: if you want as complete a picture as possible, you need to parse and consolidate information from many sources.
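The consolidation step above can be sketched as a join on (period, time, event type), since event IDs differ between feeds (as the Game 0136 example shows, #188/#189 in HTML vs. #191/#192 in JSON). All field names and the sample data here are invented for illustration:

```python
def consolidate(full_events, partial_events_with_onice):
    """Attach on-ice player lists from a partial feed (e.g. one like
    PlayByPlay.json that has on-ice info but not every event) to the
    event list of a fuller feed. Match on (period, time, type) because
    event IDs are not comparable across feeds."""
    onice = {
        (e["period"], e["time"], e["type"]): e["on_ice"]
        for e in partial_events_with_onice
    }
    merged = []
    for e in full_events:
        key = (e["period"], e["time"], e["type"])
        merged.append({**e, "on_ice": onice.get(key)})  # None if absent
    return merged

full = [
    {"id": 191, "period": 2, "time": "13:55", "type": "STOP"},
    {"id": 192, "period": 2, "time": "13:51", "type": "FAC"},
]
partial = [
    {"period": 2, "time": "13:51", "type": "FAC",
     "on_ice": {"away": [23, 5, 88], "home": [97, 71, 13]}},
]
for e in consolidate(full, partial):
    print(e["id"], e["on_ice"])
```

Events with no match keep `on_ice` as `None`, which makes it easy to see afterwards which events still need data from yet another source.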
Ultimately, the more sources the better; hence my question about whether anyone has additional feed URLs beyond the ones in my post...