Hardy Distribution
Type: mass
Pdf Caption: The horizontal axis represents the hole score n. The vertical axis represents the probability of the hole score given the par of the hole and the probabilities p = 0.20 and q = 0.10. The blue points represent the probabilities for a par three, the green points for a par four and the red points for a par five. The function is defined only at integer values of n. The connecting lines are only guides for the eye.
Cdf Caption: The horizontal axis represents the hole score n. The vertical axis represents the cumulative probability of the hole score given the par of the hole and the probabilities p = 0.20 and q = 0.10. The blue points represent the probabilities for a par three, the green points for a par four and the red points for a par five. The cumulative distribution function (CDF) is discontinuous at the integer values of n and flat everywhere else because a variable that is Hardy distributed takes on only integer values.
Notation: \operatorname{Hardy}(p,q;m)
Parameters: p,q\in(0,1); p+q\in(0,1); m=1,2,3,\ldots
Support: n\in\mathbb{N}_0
Pdf: For m odd: P\left(X=n\right)=\sum\binom{n-1}{n-j}q^{n-j}\left(A_{j,m}\ldots For m even: P\left(X=n\right)=\sum\binom{n-1}{n-j}q^{n-j}\left(A_{j,m}\ldots with A_{j,m}\ldots and B_{j,m}\ldots
Mean: -\sum\ldots
Mgf: For m odd: M_m\left(t\right)=\sum\ldots\right)e^{j\ldots} For m even: M_m\left(t\right)=\sum\ldots\right)e^{j\ldots} with X_{\ldots} and Y_{\ldots}
In probability theory and statistics, the Hardy distribution is a discrete probability distribution that expresses the probability of the hole score for a given golf player. It is based on Hardy's (Hardy, 1945) basic assumption that there are three types of shots: good (G), bad (B) and ordinary (O), with probabilities p, q and 1-p-q respectively. Hardy assigned a value of 2 to a good stroke, a value of 0 to a bad stroke and a value of 1 to a regular or ordinary stroke.
Once the sum of the values is greater than or equal to the par of the hole, the number of strokes played so far is the score achieved on that hole. A birdie on a par three could then have come about in three ways: OG, GO and GG, with probabilities (1-p-q)p, p(1-p-q) and p^2 respectively.
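The stopping rule above can be turned into a small computation. The following is a minimal sketch, not the closed-form expression of the distribution: it tracks the running sum of stroke values and records the probability that the hole ends on each stroke. The function name `hardy_pmf` and the cut-off `max_strokes` are assumptions made for illustration.

```python
def hardy_pmf(par, p, q, max_strokes=60):
    """Return {score: probability} for one hole under the G/B/O stroke model.

    Assumed stroke values, as described in the text: good = 2 (prob p),
    ordinary = 1 (prob 1 - p - q), bad = 0 (prob q). The hole ends as soon
    as the running sum reaches or exceeds `par`.
    """
    r = 1.0 - p - q  # probability of an ordinary stroke
    # state[s] = probability that the hole is still in play with running sum s
    state = [0.0] * par
    state[0] = 1.0
    pmf = {}
    for n in range(1, max_strokes + 1):
        new_state = [0.0] * par
        finished = 0.0
        for s, prob in enumerate(state):
            for value, stroke_prob in ((2, p), (1, r), (0, q)):
                if s + value >= par:
                    finished += prob * stroke_prob  # hole completed on stroke n
                else:
                    new_state[s + value] += prob * stroke_prob
        pmf[n] = finished
        state = new_state
    return pmf


# Birdie on a par three: OG, GO or GG, i.e. 2*(1-p-q)*p + p**2
birdie = hardy_pmf(3, 0.20, 0.10)[2]
```

With p = 0.20 and q = 0.10 the birdie probability on a par three is 2(0.70)(0.20) + 0.20^2, about 0.32, which is what the sketch returns up to floating-point rounding.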
A discrete random variable X is said to have a Hardy distribution, with parameters p, q and m, if it has a probability mass function given by (for m odd)
P\left(X=n\right)=\sum\binom{n-1}{n-j}q^{n-j}\left(A_{j,m}\ldots
and (for m even)
P\left(X=n\right)=\sum\binom{n-1}{n-j}q^{n-j}\left(A_{j,m}\ldots
with
A_{j,m}\ldots
and
B_{j,m}\ldots
where
m=1,2,\ldots
n=\frac{m}{2},\frac{m}{2}+1,\frac{m}{2}+2,\ldots for m even
n=\frac{m+1}{2},\frac{m+1}{2}+1,\frac{m+1}{2}+2,\ldots for m odd
0<p<1, 0<q<1, 0<p+q<1
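The binomial factor and the power of q in the sum can be motivated directly from the stroke values. What follows is a derivation sketch from those rules, not a transcription of the formulas above: the symbols r, \alpha_{j,m} and \beta_{j,m} are labels introduced here and need not coincide with the article's A_{j,m} and B_{j,m}. The idea is to condition on the number j of non-bad strokes (good or ordinary). The last stroke is never bad, so the n-j bad strokes fall among the first n-1 strokes.

```latex
% Sketch under the stated stroke values; r = 1 - p - q is the ordinary-stroke probability.
% Convention: \binom{a}{b} = 0 when b < 0 or b > a, and \beta_{j,m} = 0 when 2j \le m.
\begin{align*}
P(X=n) &= \sum_{j=\lceil m/2\rceil}^{m} \binom{n-1}{n-j}\, q^{\,n-j}\,
          \bigl(\alpha_{j,m}+\beta_{j,m}\bigr),\\
\alpha_{j,m} &= \binom{j}{m-j}\, p^{\,m-j}\, r^{\,2j-m}
   && \text{($j$ non-bad strokes summing to exactly $m$)},\\
\beta_{j,m} &= \binom{j-1}{m-j}\, p^{\,m-j+1}\, r^{\,2j-m-1}
   && \text{(first $j-1$ sum to $m-1$, then a good stroke)}.
\end{align*}
```

As a check, for m = 3 and n = 2 only j = 2 contributes, giving \alpha_{2,3} + \beta_{2,3} = 2pr + p^2, which matches the three birdie sequences OG, GO and GG.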
The moment generating function is given, for m odd and for m even respectively, by:
M_m\left(t\right)=\sum\left(X_{\ldots}+Y_{\ldots}\right)e^{j\ldots}
and
M_m\left(t\right)=\sum\left(X_{\ldots}+Y_{\ldots}\right)e^{j\ldots}
with
X_{\ldots}
and
Y_{\ldots}
Each raw moment and each central moment can be easily determined with the moment generating function, but the formulas involved are too large to present here.
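The mean and variance can also be obtained numerically without writing out the large formulas. The sketch below is one possible route, derived here from the stroke values and not taken from the source: it conditions on the number j of non-bad strokes, after which the total number of strokes is negative binomial with success probability 1-q. The function names and the weights for j are assumptions of this sketch.

```python
from math import comb


def nonbad_stroke_weights(m, p, q):
    """P(J = j): probability that exactly j non-bad strokes finish a par-m hole."""
    r = 1.0 - p - q  # probability of an ordinary stroke
    weights = {}
    for j in range((m + 1) // 2, m + 1):
        g = m - j  # good strokes needed for j strokes to sum to exactly m
        exact = comb(j, g) * p**g * r**(j - g)
        # overshoot: first j-1 strokes sum to m-1 (g of them good), then a good stroke
        over = comb(j - 1, g) * p**(g + 1) * r**(j - 1 - g) if j - 1 >= g else 0.0
        # divide by (1-q)**j because J is defined on the non-bad strokes only
        weights[j] = (exact + over) / (1.0 - q) ** j
    return weights


def hardy_mean_var(m, p, q):
    """Mean and variance of the hole score via the law of total variance."""
    w = nonbad_stroke_weights(m, p, q)
    mean_j = sum(j * wj for j, wj in w.items())
    var_j = sum(j * j * wj for j, wj in w.items()) - mean_j**2
    s = 1.0 - q  # probability that a stroke is not bad
    mean = mean_j / s                       # E[N | J=j] = j / s
    var = mean_j * q / s**2 + var_j / s**2  # Var(N | J=j) = j * q / s**2
    return mean, var


mean4, var4 = hardy_mean_var(4, 0.20, 0.10)
```

For m = 1 the sketch reduces to a geometric distribution with mean 1/(1-q), which is a convenient sanity check.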
For a par three:
For a par four:
Note the resemblance with P(T_3=n).
Note the resemblance with the formulas for P(T_3=n) and P(T_4=n).
When trying to construct a probability distribution in golf that describes the frequency distribution of the number of strokes on a hole, the simplest setup is to assume that there are only two types of strokes: a good stroke with a probability of p and a bad stroke with a probability of 1-p, where a good stroke counts 1 and a bad stroke counts 0.
Once the sum of the shot values equals the par of the hole, the number of strokes played is the score for the hole. With this setup a birdie is not possible, because the smallest number of strokes one can get is the par of the hole. Hardy (1945) probably realized this and therefore assumed not just two types of strokes, good (G) and bad (B), but three: good (G) with probability p, bad (B) with probability q and ordinary (O) with probability 1-p-q.
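The two-stroke setup described above is a negative binomial model, which makes the absence of birdies easy to see. The following is a minimal sketch under the assumption that a good stroke counts 1 and a bad stroke counts 0; the function name `two_stroke_pmf` is hypothetical.

```python
from math import comb


def two_stroke_pmf(n, par, p):
    """P(score = n) when each stroke is good (value 1, prob p) or bad (value 0, prob 1-p)."""
    if n < par:
        # fewer than `par` strokes can never sum to par, so no birdies
        return 0.0
    # the last stroke must be good; the other par-1 good strokes
    # are placed among the first n-1 strokes
    return comb(n - 1, par - 1) * p**par * (1.0 - p) ** (n - par)


# On a par three the best possible score is 3, reached with probability p**3
best = two_stroke_pmf(3, 3, 0.8)
```

Adding a third stroke type worth 2 is what makes scores below par possible.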
In fact, Hardy called a good shot a supershot and a bad shot a subshot. [1] Minton later called Hardy's supershot an excellent shot (E) and Hardy's subshot a bad shot (B). In this article, Minton's excellent shot is called a good shot (G).
Hardy assumed that the probability of a good stroke was equal to the probability of a bad stroke, namely p=q.
In retrospect, Hardy might well have been right, as the data in Table 2 in van der Ven (2013) show. This table shows the estimated p and q … p and q …
For the Hardy distribution the values of p and q …
The Hardy distribution gives the probability distribution of a single player's hole score. It takes several observations to perform a goodness-of-fit test to check whether the Hardy distribution applies. This can be done with a single individual by having the individual play the same hole multiple times. Goodness-of-fit tests assume pure replications, which means that the player's golfing ability should not change during repeated play of the hole. For example, there should be no ongoing learning process. Such effects cannot really be ruled out.
One way around this problem is to use multiple players who can be assumed to have approximately the same golf proficiency, such as the participants in professional golf tournaments (see PGA Tour). Before using a goodness-of-fit test, it should first be checked that the participants do indeed have approximately the same golf proficiency. This can be done separately for each hole by using, for example, the Pearson correlation coefficient between the hole score on the first day and the second day of a tournament. If there are no systematic differences between players, the correlation between the score achieved on a hole on Day 1 and the score achieved on that hole on Day 2 will not differ significantly from zero. This can easily be tested statistically.
In a study by van der Ven,[4] the results of a goodness-of-fit test of the Hardy distribution were reported using the hole-by-hole scores from the 2012 Open Championship played at the Royal Lytham & St Annes Golf Club. The distribution was tested separately for each hole. Pearson's chi-squared test was used to determine whether the observed sample frequencies of the hole scores differed significantly from the expected frequencies according to the Hardy distribution.
The fit between observed and expected frequencies was generally very satisfactory.
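The comparison of observed and expected frequencies can be sketched as follows. This is an illustration only: the scores are simulated from the stroke model with a seeded random generator and are not tournament data, p and q are treated as known instead of estimated, and all function names are assumptions of this sketch.

```python
import random
from collections import Counter


def simulate_hole_score(par, p, q, rng):
    """Play one hole: good = 2 (prob p), bad = 0 (prob q), ordinary = 1."""
    total, strokes = 0, 0
    while total < par:
        u = rng.random()
        total += 2 if u < p else (0 if u < p + q else 1)
        strokes += 1
    return strokes


def hardy_cell_probs(par, p, q, max_score):
    """Probabilities of each attainable score below max_score, plus a pooled tail cell."""
    r = 1.0 - p - q
    state = [0.0] * par  # state[s] = prob. of still playing with running sum s
    state[0] = 1.0
    probs = {}
    for n in range(1, max_score):
        new_state = [0.0] * par
        done = 0.0
        for s, pr in enumerate(state):
            for value, sp in ((2, p), (1, r), (0, q)):
                if s + value >= par:
                    done += pr * sp
                else:
                    new_state[s + value] += pr * sp
        if done > 0.0:
            probs[n] = done
        state = new_state
    probs[max_score] = sum(state)  # pooled cell: score of max_score or more
    return probs


def chi_squared_statistic(scores, par, p, q, max_score):
    """Pearson statistic: sum over cells of (observed - expected)**2 / expected."""
    probs = hardy_cell_probs(par, p, q, max_score)
    observed = Counter(min(s, max_score) for s in scores)
    n = len(scores)
    return sum((observed.get(c, 0) - n * pr) ** 2 / (n * pr) for c, pr in probs.items())


rng = random.Random(2012)  # fixed seed so the simulated sample is reproducible
scores = [simulate_hole_score(4, 0.20, 0.10, rng) for _ in range(500)]
stat = chi_squared_statistic(scores, 4, 0.20, 0.10, max_score=7)
```

The statistic is compared with a chi-squared distribution whose degrees of freedom equal the number of cells minus one, reduced by a further two when p and q are estimated from the same data.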
Notes