In statistics, the correlation ratio is a measure of the curvilinear relationship between the statistical dispersion within individual categories and the dispersion across the whole population or sample. The measure is defined as the ratio of two standard deviations representing these types of variation. The context here is the same as that of the intraclass correlation coefficient, whose value is the square of the correlation ratio.
Suppose each observation is $y_{xi}$, where $x$ indicates the category that the observation is in and $i$ is the label of the particular observation. Let $n_x$ be the number of observations in category $x$ and
$$\overline{y}_x = \frac{\sum_i y_{xi}}{n_x} \quad\text{and}\quad \overline{y} = \frac{\sum_x n_x \overline{y}_x}{\sum_x n_x},$$
where $\overline{y}_x$ is the mean of category $x$ and $\overline{y}$ is the mean of the whole population.
The correlation ratio $\eta$ (eta) is defined so as to satisfy
$$\eta^2 = \frac{\sum_x n_x (\overline{y}_x - \overline{y})^2}{\sum_{x,i} (y_{xi} - \overline{y})^2}$$
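The definition translates fairly directly into code. The following is a minimal sketch in plain Python, not a reference implementation: the function name `correlation_ratio_squared` and the choice of a mapping from category label to a list of observations are assumptions made for illustration, and every category is assumed to be non-empty.

```python
from typing import Mapping, Sequence


def correlation_ratio_squared(groups: Mapping[str, Sequence[float]]) -> float:
    """Return eta^2 for observations grouped by category (hypothetical helper)."""
    # n_x: number of observations in each category (assumed non-empty)
    sizes = {x: len(ys) for x, ys in groups.items()}
    n = sum(sizes.values())

    # Category means and the overall mean (size-weighted mean of category means)
    means = {x: sum(ys) / sizes[x] for x, ys in groups.items()}
    overall = sum(sizes[x] * means[x] for x in groups) / n

    # Numerator: weighted squared deviations of category means from the overall mean
    between = sum(sizes[x] * (means[x] - overall) ** 2 for x in groups)
    # Denominator: squared deviations of every observation from the overall mean
    total = sum((y - overall) ** 2 for ys in groups.values() for y in ys)

    if total == 0:
        raise ValueError("eta is undefined when all observations are equal")
    return between / total
```

For instance, `correlation_ratio_squared({"a": [1.0, 2.0], "b": [3.0, 4.0]})` has a numerator of 4 and a denominator of 5, so it evaluates to 0.8 up to floating-point rounding.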
which can be written as
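$$\eta^2 = \frac{\sigma_{\overline{y}}^2}{\sigma_y^2}, \quad\text{where}\quad \sigma_{\overline{y}}^2 = \frac{\sum_x n_x (\overline{y}_x - \overline{y})^2}{\sum_x n_x} \quad\text{and}\quad \sigma_y^2 = \frac{\sum_{x,i} (y_{xi} - \overline{y})^2}{n},$$
with $n = \sum_x n_x$; that is, the weighted variance of the category means divided by the variance of all samples.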
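The correlation ratio $\eta$ takes values between 0 and 1. The limit $\eta = 0$ represents the special case of no dispersion among the means of the different categories, while $\eta = 1$ refers to no dispersion within the respective categories. $\eta$ is undefined when all data points of the complete population take the same value.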
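That $\eta$ lies between 0 and 1 can be seen from the usual partition of the total sum of squares into a within-category part and a between-category part. The following is a short sketch in the notation of the definition, added here as an illustration rather than taken from the original text:

$$\sum_{x,i} (y_{xi} - \overline{y})^2 = \sum_{x,i} (y_{xi} - \overline{y}_x)^2 + \sum_x n_x (\overline{y}_x - \overline{y})^2 .$$

The cross term vanishes because $\sum_i (y_{xi} - \overline{y}_x) = 0$ within every category. Both terms on the right are non-negative, so the between-category term (the numerator of $\eta^2$) can never exceed the total (the denominator). It equals zero exactly when all category means coincide, and equals the total exactly when the within-category term is zero.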
Suppose there is a distribution of test scores in three topics (categories): Algebra (5 scores), Geometry (4 scores) and Statistics (6 scores).
Then the subject averages are 36, 33 and 78, with an overall average of 52.
The sums of squares of the differences from the subject averages are 1952 for Algebra, 308 for Geometry and 600 for Statistics, adding to 2860. The overall sum of squares of the differences from the overall average is 9640. The difference of 6780 between these is also the weighted sum of the squares of the differences between the subject averages and the overall average:
$$5(36-52)^2 + 4(33-52)^2 + 6(78-52)^2 = 6780.$$
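This gives
$$\eta^2 = \frac{6780}{9640} = 0.7033\ldots$$
suggesting that most of the overall dispersion is a result of differences between topics rather than within topics. Taking the square root gives
$$\eta = \sqrt{\frac{6780}{9640}} = 0.8386\ldots$$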
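The arithmetic of this example can be checked with a few lines of Python. This is only a sketch: the individual scores below are an assumption, chosen so that they reproduce the subject averages (36, 33, 78) and the within-subject sums of squares (1952, 308, 600) quoted in the example, and the variable names are illustrative.

```python
import math

# Assumed scores: values consistent with the averages and sums of squares
# quoted in the example (5 Algebra, 4 Geometry and 6 Statistics scores).
scores = {
    "Algebra": [45, 70, 29, 15, 21],
    "Geometry": [40, 20, 30, 42],
    "Statistics": [65, 95, 80, 70, 85, 73],
}

n = sum(len(ys) for ys in scores.values())
means = {s: sum(ys) / len(ys) for s, ys in scores.items()}
overall = sum(sum(ys) for ys in scores.values()) / n

# Sum of squared differences from each subject's own average
within = sum((y - means[s]) ** 2 for s, ys in scores.items() for y in ys)
# Weighted sum of squared differences between subject averages and overall average
between = sum(len(ys) * (means[s] - overall) ** 2 for s, ys in scores.items())
# Sum of squared differences from the overall average
total = sum((y - overall) ** 2 for ys in scores.values() for y in ys)

eta_squared = between / total
eta = math.sqrt(eta_squared)

print(means, overall)
print(within, between, total)
print(round(eta_squared, 4), round(eta, 4))
```

With these scores the within, between and total sums of squares come out as 2860, 6780 and 9640, matching the figures in the text, and the total is the sum of the other two.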
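For $\eta = 1$, the overall sample dispersion is due purely to dispersion among the categories and not at all to dispersion within the individual categories; this would be the case if, for example, every score within each topic were equal to that topic's average. The limit $\eta = 0$ refers to the case in which dispersion among the categories contributes nothing to the overall dispersion, which requires all category means to be the same.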
The correlation ratio was introduced by Karl Pearson as part of analysis of variance. Ronald Fisher commented:
"As a descriptive statistic the utility of the correlation ratio is extremely limited. It will be noticed that the number of degrees of freedom in the numerator of $\eta^2$ depends on the number of the arrays"[1]
to which Egon Pearson (Karl's son) responded by saying
"Again, a long-established method such as the use of the correlation ratio [§45 The "Correlation Ratio" η] is passed over in a few words without adequate description, which is perhaps hardly fair to the student who is given no opportunity of judging its scope for himself."[2]