News Article: Estimating Player Contribution (w/) Regularized Logistic Regression (~Dats is best)

Fugu

RIP Barb
Nov 26, 2004
36,952
220
϶(°o°)ϵ
Hat tip to Kukla's Korner for the articles. I did some further digging to find the original paper (PDF): Jensen_2013_Estimating_Player_1.pdf

Estimating Player Contribution in Hockey with Regularized Logistic Regression

Robert B. Gramacy, Matthew A. Taddy, Shane T. Jensen
(Submitted on 22 Sep 2012 (v1), last revised 12 Jan 2013 (this version, v2))
We present a regularized logistic regression model for evaluating player contributions in hockey. The traditional metric for this purpose is the plus-minus statistic, which allocates a single unit of credit (for or against) to each player on the ice for a goal. However, plus-minus scores measure only the marginal effect of players, do not account for sample size, and provide a very noisy estimate of performance. We investigate a related regression problem: what does each player on the ice contribute, beyond aggregate team performance and other factors, to the odds that a given goal was scored by their team? Due to the large-p (number of players) and imbalanced design setting of hockey analysis, a major part of our contribution is a careful treatment of prior shrinkage in model estimation. We showcase two recently developed techniques -- for posterior maximization or simulation -- that make such analysis feasible. Each approach is accompanied with publicly available software and we include the simple commands used in our analysis. Our results show that most players do not stand out as measurably strong (positive or negative) contributors. This allows the stars to really shine, reveals diamonds in the rough overlooked by earlier analyses, and argues that some of the highest paid players in the league are not making contributions worth their expense.
...
An exception is Pavel Datsyuk, who stands out as the league's very best, having a coefficient that is unmoved even after considering the strong team effect of his Red Wings.

Yes, Sid and Geno are not worth their pay, according to this metric. :)


PRNewswire story that links to a release from U of Chicago:
http://www.prnewswire.com/news-rele...w-research-from-chicago-booth-210979841.html?

NHL teams currently assign a plus-minus value to players, counting the goals scored while players are on the ice and comparing them with goals given up. This, the researchers argue, flatters some players' statistics, while undervaluing others. A player could theoretically score lots of goals but still have a negative plus-minus value if the opposition scored more.
To correct this imbalance, the researchers created what they call a regularized logistic regression model. They applied this new measure to data from four NHL seasons, 2007-11, and found that far fewer players stood out from their teams' average performances.
"A better measure of performance would be the partial effect of each player, having controlled for the contributions of teammates, opponents and possibly other variables," they write in their paper, "Estimating Player Contribution in Hockey with Regularized Logistic Regression," published in the Journal of Quantitative Analysis in Sports.
Hockey fans may find some of the researchers' results surprising.
For example, the Pittsburgh Penguins' Sidney Crosby is considered by many to be the best player in the NHL. But using the more precise measure shows that he made a much smaller contribution to goals than his plus-minus rating suggests. And by the authors' estimates, some other players stuck out as undervalued. The Detroit Red Wings' Pavel Datsyuk was actually the league's best player, by the new metric.
Stanley Cup fans on both sides may view their teams' captains as standout players. But the research suggests that even beloved Chicago Blackhawks captain Jonathan Toews and Boston Bruins captain Zdeno Chara have numbers that get a boost from their strong teams. "Jonathan Toews' and Zdeno Chara's effects show similar behavior, the latter having no player-team effect," the authors note. "As [Crosby, Toews and Chara] captain their respective (consistently competitive) teams, we should perhaps not be surprised that team success is so tightly coupled to player success in these cases."
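
For anyone who wants to see the plus-minus bookkeeping the release describes, here is a minimal sketch. The player names and the goal log are made up for illustration, and it ignores the special-teams exclusions the real NHL stat applies. It shows how a team's leading scorer can still end up with a negative rating.

```python
from collections import Counter

# Hypothetical goal log: (on ice for the scoring team,
#                         on ice for the team scored on, goal scorer).
goals = [
    ({"Sniper", "Grinder"}, {"Rival1", "Rival2"}, "Sniper"),
    ({"Sniper", "Grinder"}, {"Rival1", "Rival2"}, "Sniper"),
    ({"Sniper", "Grinder"}, {"Rival1", "Rival2"}, "Sniper"),
    ({"Rival1", "Rival2"}, {"Sniper"}, "Rival1"),
    ({"Rival1", "Rival2"}, {"Sniper"}, "Rival1"),
    ({"Rival1", "Rival2"}, {"Sniper"}, "Rival2"),
    ({"Rival1", "Rival2"}, {"Sniper"}, "Rival2"),
]


def plus_minus(goal_log):
    """+1 for everyone on the ice for a goal for, -1 for a goal against."""
    score = Counter()
    for on_for, on_against, _scorer in goal_log:
        for player in on_for:
            score[player] += 1
        for player in on_against:
            score[player] -= 1
    return dict(score)


def goals_scored(goal_log):
    """Count goals per scorer, ignoring who else was on the ice."""
    return dict(Counter(scorer for _, _, scorer in goal_log))


ratings = plus_minus(goals)
tallies = goals_scored(goals)
print(ratings)
print(tallies)
```

In this toy log Sniper scores the most goals but is on the ice for more goals against than for, while Grinder gets full credit for goals he never scored. That is the kind of teammate dependence the researchers are trying to remove.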
U of Chicago article:
http://www.chicagobooth.edu/about/newsroom/news/2013/2013-02-13-hockey

While baseball has been transformed by statisticians, hockey remains less affected. That is partly due to the fact that baseball generates more data than hockey does. Moreover, in hockey, it’s far more difficult to isolate individual performance.
...
While its simple formulation is appealing, the plus-minus statistic has important flaws, according to the researchers. A key weakness is that a player’s plus-minus score depends partly on the performance of his teammates and opponents, which makes evaluating a player’s performance based on his own abilities more challenging.
...
Through new techniques developed in the study, Gramacy, Jensen, and Taddy were able to come up with a more precise measure of performance—one that can isolate each player’s unique contribution to a goal. Using a type of statistical analysis called regularized logistic regression, which can estimate the credit or blame that should be apportioned to each player every time a goal is scored, they drew conclusions about player performance that were markedly different from traditional plus-minus figures. When the authors applied this new performance measure to data from four regular NHL seasons (2007 to 2011), they found that far fewer players stood out from their team’s average performance, and they were able to identify overvalued and undervalued players.
For example, the Pittsburgh Penguins’ Sidney Crosby is considered by many to be the best player in the NHL. But using the more precise measure shows that he made a much smaller contribution to goals than his plus-minus rating suggests. The same was true of Alex Ovechkin of the Washington Capitals, who had the largest plus-minus statistic of the league. Evgeni Malkin of the Pittsburgh Penguins and Tampa Bay Lightning’s Vincent Lecavalier both received huge salaries, but the authors’ estimates show that these players did not make significant contributions to goals after taking team effects into account.
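
For the mathematically inclined, one way to write down the kind of model these articles describe is sketched here. The notation is my own assumption and not copied from the paper: y_i is 1 if goal i was scored by the home team, q_i is the probability of that, u_{it} and x_{ij} code which teams and players were on the ice, and \lambda is a single penalty weight standing in for the paper's more careful prior treatment.

```latex
\log\frac{q_i}{1-q_i} \;=\; \alpha + \sum_{t} u_{it}\,\gamma_t + \sum_{j} x_{ij}\,\beta_j,
\qquad
x_{ij} =
\begin{cases}
+1 & \text{player } j \text{ on the ice for the home team at goal } i,\\
-1 & \text{player } j \text{ on the ice for the away team at goal } i,\\
\phantom{+}0 & \text{otherwise,}
\end{cases}

(\hat\alpha,\hat\gamma,\hat\beta) \;=\; \arg\min_{\alpha,\gamma,\beta}\;
-\sum_{i}\Bigl[\,y_i \log q_i + (1-y_i)\log(1-q_i)\Bigr] \;+\; \lambda \sum_{j} \lvert \beta_j \rvert .
```

The team terms \gamma_t soak up aggregate team strength, so a player's \beta_j only moves away from zero when the goals he is on the ice for cannot be explained by his team alone. The absolute-value penalty is what a Laplace prior on each \beta_j gives you under posterior maximization, and it is the reason most players can end up with a coefficient of exactly zero.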
 

Kronwalled55

Detroit vs. Everybody
Jan 7, 2011
6,914
897
Atlanta, GA
Is there a list of players they ranked the best? It's one thing to say Datsyuk contributes the most, but I would like to see who else fits this mold as well.

I'm not a fan of sabermetric stats, so I'm naturally skeptical.
 

jaster

Take me off ignore, please.
Jun 8, 2007
13,291
8,533
The Booth school is filled with brilliant people. Smarter than anyone here. Bow down to your advanced stats overlords!

They appear to have done a really good job here though. They make it clear that their formula does not demonstrate an individual's overall value to a franchise, but solely how much they contribute to goal scoring, and how much they benefit from a strong team around them. Pretty solid work they did.
 

Fugu

Kronwalled55 said:
Is there a list of players they ranked the best? It's one thing to say Datsyuk contributes the most, but I would like to see who else fits this mold as well.

I'm not a fan of sabermetric stats, so I'm naturally skeptical.


You have to download the PDF and then enlarge. They have a couple of graphs showing several different ways of looking at it.
 

grsbmd

Registered User
Aug 2, 2006
296
25
My guess is that Datsyuk's good performance here is a combination of:

-- Great offense
-- Great defense
-- Lots of ice time
-- Lots of different linemates

I think what's really going on in their analysis is that Datsyuk is the player whose performance is least explained by his team. Even as they strengthen the prior that attributes a player's results to his team, Datsyuk consistently contributes to goals for his own team and keeps pucks out of his net. The fact that he gets a lot of ice time and plays with a lot of different linemates keeps his data from being sparse, which means that their model is less able to explain away his individual contributions.
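
To make the sparse-data point concrete, here is a toy sketch of L1-penalized logistic regression fit by proximal gradient descent. Everything in it is an assumption for illustration: the two player names, the made-up goal counts, the penalty `lam`, and the solver itself, which is not the software the authors used. Both players are on the ice for 75% "goals for", but one has 40 goals of evidence and the other only 4.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def soft_threshold(z, t):
    """Proximal step for an L1 penalty: shrink z toward zero by t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def fit_l1_logistic(rows, players, lam, step=0.05, iters=500):
    """Minimize negative log-likelihood + lam * sum(|beta|) by proximal gradient.

    rows: list of (on_ice, y) where on_ice is a set of player names and
    y is 1 if the goal was for their team, 0 if it was against.
    """
    beta = {p: 0.0 for p in players}
    for _ in range(iters):
        grad = {p: 0.0 for p in players}
        for on_ice, y in rows:
            q = sigmoid(sum(beta[p] for p in on_ice))
            for p in on_ice:
                grad[p] += q - y  # gradient of the logistic loss
        beta = {p: soft_threshold(beta[p] - step * grad[p], step * lam)
                for p in players}
    return beta


# Made-up data: same 75% rate, very different sample sizes.
rows = ([({"Regular"}, 1)] * 30 + [({"Regular"}, 0)] * 10
        + [({"Callup"}, 1)] * 3 + [({"Callup"}, 0)] * 1)

beta = fit_l1_logistic(rows, ["Regular", "Callup"], lam=3.0)
print(beta)
```

With this penalty the low-sample player is shrunk all the way to zero while the heavily used player keeps a positive coefficient, even though their raw rates are identical. Lots of ice time is what lets a real effect survive the shrinkage.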
 
