Stats help needed: how do I find out if a home/away PP% difference is correlated or random?

Rogie

ALIVE
May 17, 2013
1,742
235
Kyoungsan
I noticed that some teams (going by last year's numbers), especially the Leafs, have a large difference between their home PP% (20.0%) and away PP% (28.1%): a difference of about 8 percentage points.

This struck me as a large difference, and some teams also have large differences between their home and away PK%.

So, I often post the special teams numbers in our GDT, and I usually post the home/away numbers since, well, they seem to be more relevant than just the general special teams numbers.

I mentioned this difference in a post and an astute poster asked if it wasn't maybe just random differences.

So, it got me thinking about how to test this.

I've used SPSS in the past and used to be able to do multivariate analysis. That was a long time ago though.

I can still do a simple bi-variate correlation (or Pearson) just using Excel.

But, even if I don't have the software tools to do it, I want to try to figure out how I would test whether this difference is statistically significant or just random.

Any ideas on what the variables would be? Does this call for a nominal variable (home/away), or would it just be a matter of making a variable for the difference between each team's home and away PP% and then correlating that variable with another variable?

I've gotten a bit older (not beating up on myself either - I'm 65) and I think it would be good for my brain if I could understand what the statistical variables are and what is needed to figure this out. I've got a feeling it's not that hard, but my poor brain is just not getting it!

There are 41 home games for each team and a PP% - that's 2 variables there; I understand those okay.
There are 41 away games for the same teams and another PP% - these are 2 different variables.
There are the 82 games and the PP% overall; this variable would include the data in the other 2 variables - I get that.

Anyways, as some smarter stats people than I will be able to see, I'm missing something - possibly something easy after I see it.

What new variable(s) do I have to create, and then which variables would I run correlations on? And/or is it going to be a multivariate correlation?

Any ideas - any help.
Mark
 
May 31, 2006
10,457
1,320
You can test to see if the difference between the average home and away PP% is statistically significant (p-value) with a t-test. No need for a regression.
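
For anyone who wants to see what the paired version of that t-test actually computes, here is a minimal sketch in Python using only the standard library. The PP% values are made-up placeholders for five hypothetical teams (not real NHL data), and `paired_t_statistic` is just an illustrative helper name. It returns the t statistic and degrees of freedom; turning those into a p-value still needs a t-distribution table or a tool such as Excel or SPSS.

```python
import math
import statistics

def paired_t_statistic(home, away):
    """Return (mean difference, t statistic, degrees of freedom) for paired samples."""
    if len(home) != len(away):
        raise ValueError("home and away need one value per team")
    # One difference per team: home PP% minus away PP%.
    diffs = [h - a for h, a in zip(home, away)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (divides by n - 1)
    t_stat = mean_diff / (sd_diff / math.sqrt(n))
    return mean_diff, t_stat, n - 1

# Made-up PP% values for five hypothetical teams, NOT real NHL data.
home_pp = [20.0, 18.5, 22.1, 16.4, 19.8]
away_pp = [28.1, 17.0, 20.3, 18.9, 19.1]

mean_diff, t_stat, df = paired_t_statistic(home_pp, away_pp)
print(f"mean(home - away) = {mean_diff:.2f}, t = {t_stat:.3f}, df = {df}")
```

With a full league you would pass in one home value and one away value per team, in the same team order in both lists.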
 

Rogie

ALIVE
May 17, 2013
1,742
235
Kyoungsan
Thanks for the response. I think I did it correctly, using Excel, but I'm not sure.

Like you suggested, I used one array for each team's home PP% and a second array for each team's away PP%. I couldn't predict the direction, since some teams' PP is higher on the road and other teams' PP is higher at home, so I ran both a 1-tailed and a 2-tailed test. Perhaps this should have been the first tip-off to me that there is no relationship and that the differences are just random.
I chose 1 for the type (paired), which I think was correct because the data are paired: the same team's home and away PP%.

So, the tests returned .283 for 1 tail and .566 for 2 tails.

So, I think I can interpret this as the differences being random or just by chance. (The null hypothesis is that there should be no difference between the home and away PP%.)
If p less than 0.05 is statistically significant, then .283 (or .566 two-tailed) is not significant, so I can't reject the null hypothesis that the differences are due to chance.
I guess this wasn't so intuitive to me. Maybe I was trying to imagine reasons for the differences (less stress on the PP on the road, a different building, etc.), but when I think about it again, I guess I can't think of good (empirical) reasons why there should be a great difference between home and away PP%.
Thanks again for the help.

Any feedback is welcome and appreciated.

I've still got a feeling there's a reason for these differences though! But, yes, I know the numbers say there isn't!
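
One way to picture what "just by chance" means for paired data is a sign-flip permutation test. This is a different technique from the Excel t-test above, shown only as an illustration: if playing at home or away made no difference, each team's home-minus-away gap would be equally likely to come out positive or negative, so we can flip the signs every possible way and count how often the average gap is at least as large as the one observed. The sketch below is a rough version under those assumptions. The differences are made-up placeholders, `sign_flip_p_value` is a hypothetical helper name, and full enumeration is only practical for a handful of teams (for 30 teams you would sample random sign patterns instead).

```python
from itertools import product

def sign_flip_p_value(diffs):
    """Exact two-sided p-value from enumerating every sign pattern of the differences."""
    n = len(diffs)
    observed = abs(sum(diffs) / n)
    extreme = 0
    total = 0
    for signs in product((1, -1), repeat=n):
        flipped = abs(sum(s * d for s, d in zip(signs, diffs)) / n)
        # Count patterns at least as extreme as the observed mean difference
        # (small tolerance guards against floating-point ties).
        if flipped >= observed - 1e-12:
            extreme += 1
        total += 1
    return extreme / total

# Made-up home-minus-away PP% differences for five hypothetical teams.
diffs = [-8.1, 1.5, 1.8, -2.5, 0.7]
print(f"two-sided sign-flip p-value: {sign_flip_p_value(diffs):.3f}")
```

A large p-value here says the same thing as in the t-test: differences this big turn up often even when home and away are interchangeable, so the data do not rule out chance. It does not prove there is no effect.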
 
Mark
May 31, 2006
10,457
1,320
Are you only testing for one season? Add more seasons to increase your sample size and see if that's enough to reject the null hypothesis. After all, I do think that home-ice advantage when it comes to win/loss and goal differential is statistically significant. Maybe that holds for PP%.
 

Rogie

ALIVE
May 17, 2013
1,742
235
Kyoungsan
That's a good idea. Though, when I look back over the seasons for just one team (Leafs for example), there seems to be no pattern at all. They are like 5% better at home than away, then, the next season, it's the opposite, then, for 3 seasons in a row, they are better at home by a few % or more, and then, they are better on the road.
But, good idea to do the stats anyways; as I said it's good for my brain.

Also, I've now got SPSS (not saying how lol), so I can do all kinds of tests with that program, including t-tests.

Btw, the difference between the means was 0.5, i.e. half a percentage point. That's very, very small for what we are measuring, I think.

Using home as the 1st variable and away as the 2nd variable, the largest difference in favour of home was 13% (home PP% was 13% greater than away PP%), but in the opposite direction the largest difference was 8% (away PP% was 8% greater than home PP%).

So, does that perhaps indicate there is some directionality in the measures? Maybe it will become more apparent with a larger sample.

And, it does seem intuitive that if there is any advantage, it would maybe be at home. Although, in my mind (maybe confirmation bias), I was thinking the Leafs perhaps had a better PP on the road because they had less pressure on them to score.
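
The paired t-test looks at the league-wide average, so it doesn't directly say whether one team's gap (like the Leafs' 20.0% vs 28.1%) is bigger than chance would give. One common option for a single team is a two-proportion z-test on power-play goals and opportunities. Here is a minimal sketch, assuming the normal approximation is acceptable. The goal and opportunity counts are hypothetical, picked only so the percentages come out near 20.0% and 28.1% (the thread gives percentages, not counts), and `two_proportion_z_test` is an illustrative name.

```python
import math

def two_proportion_z_test(goals_a, opps_a, goals_b, opps_b):
    """Return (z, two-sided p-value) for the difference between two success rates."""
    p_a = goals_a / opps_a
    p_b = goals_b / opps_b
    # Pooled rate under the null hypothesis that both rates are the same.
    pooled = (goals_a + goals_b) / (opps_a + opps_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / opps_a + 1 / opps_b))
    z = (p_a - p_b) / se
    # Two-sided tail area of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts: 25 goals on 125 home chances (20.0%),
# 32 goals on 114 away chances (about 28.1%). NOT real Leafs data.
z, p_value = two_proportion_z_test(25, 125, 32, 114)
print(f"z = {z:.2f}, two-sided p = {p_value:.3f}")
```

To use it for real, swap in the team's actual home and away power-play goals and opportunities for the season.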
 
