schuckers

Registered User
Feb 21, 2013
80
0
Other ways

It is possible to buy the raw XML feed, which I believe (I've never actually seen it) has additional information that scraping NHL.com won't give you. Here's a link to the one site I know of that sells the data: www.sportsnetworkdata.com

If you know R, the nhlscrapr package does the scraping for you. Again, kudos and thanks to Andrew Thomas and Sam Ventura for putting this package together.

-Schuckers
 

Doctor No

Registered User
Oct 26, 2005
9,250
3,971
hockeygoalies.org
R is a bit more esoteric than C++, but ultimately they're similar - if you know one, you can pick the other up pretty quickly.

The number of user-created packages in R makes it pretty nice for people that just like trying things.
 

hvs

Registered User
Jun 16, 2014
2
0
Saint Paul, MN
Lots of places

At Hockey-Reference.com we get it from a number of sources. We purchase data from Dan Diamond, who is the official stat source for the NHL, we get daily data from XML Team, and we have a number of volunteers that collect other types of data.
 

Eb

Registered User
Feb 27, 2011
7,806
610
Toronto
Lots of people speaking Greek in this thread.

Are R and C++ macros or something?
 

Czech Your Math

I am lizard king
Jan 25, 2006
5,169
303
bohemia
hvs said:
At Hockey-Reference.com we get it from a number of sources. We purchase data from Dan Diamond, who is the official stat source for the NHL, we get daily data from XML Team, and we have a number of volunteers that collect other types of data.

Any chance you will be listing leaders in ES assists/points for seasons and career? It seems that you have that data on the site now, which is great.
 

Mathletic

Registered User
Feb 28, 2002
15,777
407
Ste-Foy
There's the nhlscrapr package in R that will download everything for you. I used it before the NHL changed its policy, so I don't know if you can still download the whole thing. Otherwise, there's the XML package in R, which works great. I use C++ to format some of the data. I'm also working on an interface in Qt. That said, C++ isn't well suited for scraping stuff on the net, as it's a low-level language.
 

djdub

This Space for Rent
Oct 1, 2011
1,383
159
Calgary, AB
Has anyone heard how the big analytics sites are dealing with the NHL.com ToS change? Just business as usual? Is there a way to get a license from the NHL for scraping?

I'm currently working on a project and built my scraping program last year. It works great, but I'm a bit concerned about legal action now.
 

mkwong268

Registered User
Dec 30, 2011
122
0
djdub said:
Has anyone heard how the big analytics sites are dealing with the NHL.com ToS change? Just business as usual? Is there a way to get a license from the NHL for scraping?

I'm currently working on a project and built my scraping program last year. It works great, but I'm a bit concerned about legal action now.

The more likely outcome is getting IP banned from NHL.com. I don't see them bothering with much legal action beyond maybe some cease-and-desist letters.
 

FlowMaster

Registered User
Jan 28, 2009
520
203
BC
So what's the general consensus on how to obtain raw NHL data now with the new terms of service?

What are the pros and cons of using JSON vs. R?

Sorry if my questions aren't detailed enough, I'm just getting into this.
 

JetsFan815

Registered User
Jan 16, 2012
19,246
24,438
FlowMaster said:
So what's the general consensus on how to obtain raw NHL data now with the new terms of service?

What are the pros and cons of using JSON vs. R?

Sorry if my questions aren't detailed enough, I'm just getting into this.

Scraping NHL.com is still the way to go. I don't think the NHL is going to care if you follow scraping best practices and are not abusive.
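
For anyone wondering what "not abusive" can look like in practice, here is a minimal Python sketch of one common habit: leaving a gap between requests. The function name `fetch_politely`, the `fetch` callable, and the 2-second default are all assumptions made for illustration; nothing here comes from NHL.com or from any particular scraper mentioned in this thread.

```python
import time
from typing import Callable, Iterable, List


def fetch_politely(
    urls: Iterable[str],
    fetch: Callable[[str], str],
    min_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> List[str]:
    """Call fetch(url) for each URL, leaving at least min_delay seconds
    between the start of one request and the start of the next."""
    results = []
    last_request = None  # time the previous request started
    for url in urls:
        if last_request is not None:
            # Sleep only for whatever part of the delay has not elapsed yet.
            wait = min_delay - (clock() - last_request)
            if wait > 0:
                sleep(wait)
        last_request = clock()
        results.append(fetch(url))
    return results
```

You pass in whatever download function you already use as `fetch` (hypothetical here), and the helper only handles the pacing. Identifying yourself with a descriptive User-Agent and caching pages you have already downloaded are other habits usually counted as good practice.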

JSON and R serve two distinct purposes. JSON defines the format of the data, and R is a programming language that you use to extract interesting information from the data.
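
To make the distinction concrete, here is a small Python sketch (Python standing in for R, since any language with a JSON parser can play that role). The JSON string is the data format; the function is the program that pulls something out of it. The payload and its field names (`game_id`, `events`, `type`, `team`) are made up for illustration and are not the actual NHL.com schema.

```python
import json

# Made-up payload for illustration only; NOT the real NHL.com format.
raw = (
    '{"game_id": 1, "events": ['
    '{"type": "SHOT", "team": "A"}, '
    '{"type": "GOAL", "team": "A"}, '
    '{"type": "SHOT", "team": "B"}]}'
)


def count_events(raw_json: str, event_type: str) -> dict:
    """Parse a JSON string and count events of one type per team."""
    data = json.loads(raw_json)  # text in JSON format -> Python dicts and lists
    counts = {}
    for event in data["events"]:
        if event["type"] == event_type:
            counts[event["team"]] = counts.get(event["team"], 0) + 1
    return counts


shots = count_events(raw, "SHOT")
```

The same split applies in R: the file you download is JSON (or XML or HTML), and R is what you use to read it and compute on it.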
 

iamitter

Thornton's Hen
May 19, 2011
4,029
392
NYC
You don't need to use R to download the data, though.

I wrote my scraper in Java. I can probably put it on github if anyone here wants to use it.
 
