Each time the league owner sends in new scores, it simulates the rest of the season by randomly picking scores for each remaining game.
Randomly picking scores is statistical garbage. You have each team's GF/G and GA/G for both the current season and the prior season; it's not difficult to come up with an expected GF/G and GA/G and adjust them going forward based on simulated results.
That is what you should be using to simulate scores, not numbers picked out of the air. If you want to get really fancy, you could derive this from each team's roster and the individual players' expected GF (again, using current and past seasons) for a more precise calculation, but that's getting into statistical theory. You could also simulate roster movements (recalls, games missed, etc.), adjust for who's playing in goal (do we really think the Blues are as likely to win with Allen as with Johnson?), and adjust for hot/cold streaks, but that's getting way into the details.
But if you're trying to argue you've got a great model that's super reliable for projecting the rest of the season, it seems kind of stupid not to include those kinds of things in it.
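A rate-based simulation along these lines is not much code. A minimal sketch, with every rate and weight invented purely for illustration (the 70/30 season blend and the league average are assumptions, not anything the tool actually does): blend each team's current- and prior-season GF/G and GA/G into an expected scoring rate for the matchup, then draw scores from a Poisson distribution instead of picking them out of the air.

```python
import math
import random

def poisson(lam, rng=random):
    """Knuth's inversion method for a Poisson-distributed draw."""
    L = math.exp(-lam)
    k, p = 0, 1.0
    while p > L:
        k += 1
        p *= rng.random()
    return k - 1

def blend(current, prior, w=0.7):
    """Weight current-season rate over prior-season rate.
    The 70/30 split is an illustrative assumption."""
    return w * current + (1 - w) * prior

def expected_goals(att_gfpg, def_gapg, league_avg=3.0):
    """Scale the attacker's scoring rate by how leaky the defense
    is relative to a (hypothetical) league-average goals per game."""
    return att_gfpg * def_gapg / league_avg

def simulate_game(home, away, rng=random):
    """Each team dict carries (current, prior) GF/G and GA/G.
    Returns a (home_goals, away_goals) tuple."""
    home_rate = expected_goals(blend(*home["gf"]), blend(*away["ga"]))
    away_rate = expected_goals(blend(*away["gf"]), blend(*home["ga"]))
    return poisson(home_rate, rng), poisson(away_rate, rng)

blues = {"gf": (3.1, 2.9), "ga": (2.6, 2.8)}   # made-up rates
hawks = {"gf": (2.7, 3.0), "ga": (3.2, 3.1)}
print(simulate_game(blues, hawks, random.Random(1)))
```

Adjusting the expected rates as simulated results accumulate is then just a matter of feeding each simulated game back into the blended averages.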
The weighted method takes the opponent's record and home-field advantage into account when randomly picking scores, so the better team is more likely to win.
See the point above. It's not difficult to break this down into home GF/G and away GF/G, though for individual players it starts to run into credibility (sample-size) issues. But really, you should be using that measure instead of just W/L records to determine chances of winning.
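Folding home advantage into a rate-based approach just means keeping a venue-specific GF/G for each team instead of a single season-wide number. A hypothetical sketch (the rates, sample size, and seed are all invented for illustration) that estimates a home team's win chance directly from those split rates:

```python
import math
import random

def poisson(lam, rng):
    """Knuth's inversion sampler for a Poisson-distributed draw."""
    L = math.exp(-lam)
    k, p = 0, 1.0
    while p > L:
        k += 1
        p *= rng.random()
    return k - 1

def home_win_prob(home_gfpg_at_home, away_gfpg_on_road, n=20000, seed=7):
    """Estimate P(home regulation win) and P(regulation tie) by
    simulating n games where each team's expected goals come from
    its venue-specific GF/G. Inputs are hypothetical rates."""
    rng = random.Random(seed)
    wins = ties = 0
    for _ in range(n):
        h = poisson(home_gfpg_at_home, rng)
        a = poisson(away_gfpg_on_road, rng)
        if h > a:
            wins += 1
        elif h == a:
            ties += 1
    return wins / n, ties / n

# A team scoring 3.4/game at home vs. one scoring 2.5/game on the road:
p_win, p_tie = home_win_prob(3.4, 2.5)
```

The point is that the win probability falls out of the venue-split scoring rates themselves, rather than being bolted on as a W/L-record weighting after the fact.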
The 50/50 method gives each opponent an equal chance of winning each game. Both methods let an appropriate percent of games end in a tie or go into overtime in leagues where that matters.
An "appropriate percent". Oh, OK. It's the equivalent of saying "I need X games to go into OT, so ... this batch will do" without regard for how likely it is for any given team to play OT games. This is actually one of my lesser objections, because
somehow OT games have to get determined - but I'd like for simulated results to let them happen, not just have the OT Fairy descend and choose select games for it.
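In a rate-based simulation, overtime falls out of the model for free: a regulation tie is just the event that both teams' Poisson score draws land on the same number, and its probability varies with each matchup's scoring rates instead of being a fixed quota. A small illustration (the rates are hypothetical) showing that low-scoring matchups naturally produce more OT than high-scoring ones:

```python
import math

def regulation_tie_prob(lam_home, lam_away, max_goals=25):
    """Exact probability that two independent Poisson-distributed
    scores are equal, i.e. the game reaches overtime naturally.
    Truncated at max_goals, which is far into the negligible tail."""
    def pmf(lam, k):
        return math.exp(-lam) * lam ** k / math.factorial(k)
    return sum(pmf(lam_home, k) * pmf(lam_away, k)
               for k in range(max_goals + 1))

# Two defensive teams vs. two run-and-gun teams (made-up rates):
low_scoring  = regulation_tie_prob(2.2, 2.2)
high_scoring = regulation_tie_prob(3.8, 3.8)
```

No OT Fairy required: the tie rate is a consequence of the scoring model, different for every matchup.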
It repeats this random playing out of the season millions of times
What absolute crap. At the end of the day, you're trying to figure out the percentage chance of some event happening and be certain to within some threshold. That has much more to do with how reliable the model underlying your work is, and the assumptions contained within that model, than with "ooh, I ran this eleventy million times!" If you need that many runs for this model, either you have volatility in your results [which should cause you to re-evaluate your model] or you're wasting time on unnecessary work. Even at this point in the season, results should start settling within several thousand simulations, but they're still subject to considerable error because of how much of the season remains [and all the unknowns not being contemplated in the model]. Focus on that, not on running your model zillions of times in some false quest for accuracy.
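The "several thousand" figure follows from the standard error of an estimated proportion: for a probability p estimated from n independent simulated seasons, the Monte Carlo error shrinks like sqrt(p(1-p)/n), so going from a few thousand runs to millions buys almost nothing. A quick check (the 30% playoff-odds figure is just an example value):

```python
import math

def monte_carlo_margin(p, n, z=1.96):
    """95% margin of error for a probability p estimated from
    n independent simulation runs."""
    return z * math.sqrt(p * (1 - p) / n)

# Estimating a ~30% playoff chance:
for n in (5_000, 50_000, 4_000_000):
    print(n, round(monte_carlo_margin(0.30, n), 4))
```

At 5,000 runs the estimate is already within about +/-1.3 points; 4 million runs tighten that to a few hundredths of a point, which is meaningless next to the model error from everything the simulation ignores.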
To help flesh out each team’s highest and lowest possible seeds, I force them to win or lose all their remaining games for a small percentage of the simulation runs.
Yeah, sure, someone might be able to reel off 61 wins in a row. The statistical likelihood of that is beyond incredibly small. Plus, it looks contrived and artificial to say "in 4 million simulations it was more likely for [team] to accumulate 22-69 points than to finish with 70-75 points." Really? Just say "here are the maximum/minimum points achievable" and simulate results using the assumptions going in. If someone really does reel off the remaining 61 games, fantastic - but let it happen via the model, don't artificially cram it in. Quit being cute trying to look ultra-accurate.
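The maximum and minimum achievable points are trivial arithmetic; no forced simulation runs are needed to report them. A sketch (assumes standard 2-point wins and ignores any OT/shootout loser point for simplicity - the 30 points / 61 games remaining are example numbers):

```python
def point_bounds(current_points, games_remaining, win_pts=2, loss_pts=0):
    """Best and worst possible season point totals. Assumes 2 points
    per win and 0 per loss; leagues with an OT loser point would
    raise the floor accordingly."""
    best = current_points + games_remaining * win_pts
    worst = current_points + games_remaining * loss_pts
    return best, worst

best, worst = point_bounds(current_points=30, games_remaining=61)
```

Publish those bounds as plain facts and let the simulated distribution between them come entirely from the model's own assumptions.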