Bennett, Alpert and Goldstein's S explained

Bennett, Alpert & Goldstein’s S is a statistical measure of inter-rater agreement. It was created by Bennett et al. in 1954.^[1]

Rationale for use

Bennett et al. argued that inter-rater reliability should be adjusted for the proportion of agreement that would be expected by chance, and that the adjusted value is a better measure than simple agreement between raters.^[2] They proposed an index that adjusts the proportion of rater agreement according to the number of categories employed.

Mathematical formulation

The formula for S is

S = \frac{Q P_a - 1}{Q - 1}

where Q is the number of categories and P_a is the proportion of agreement between raters.
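
The formula can be rearranged as S = (P_a - 1/Q) / (1 - 1/Q), which shows that the adjustment amounts to subtracting an agreement level of 1/Q and rescaling. A minimal Python sketch of the calculation for two raters follows; the function name `bennett_s` and the sample ratings are illustrative assumptions, not taken from the cited papers.

```python
def bennett_s(ratings_a, ratings_b, num_categories):
    """Compute S = (Q * P_a - 1) / (Q - 1) for two raters.

    ratings_a, ratings_b: equal-length sequences of category labels.
    num_categories: Q, the number of categories available to the raters.
    """
    if len(ratings_a) != len(ratings_b):
        raise ValueError("both raters must rate the same items")
    if len(ratings_a) == 0:
        raise ValueError("at least one rated item is required")
    if num_categories < 2:
        raise ValueError("S is undefined for fewer than two categories")

    # P_a: proportion of items on which the two raters agree.
    agreements = sum(a == b for a, b in zip(ratings_a, ratings_b))
    p_a = agreements / len(ratings_a)

    return (num_categories * p_a - 1) / (num_categories - 1)


# Hypothetical data: 10 items, 2 categories, raters agree on 8 items.
rater_a = ["yes", "no", "yes", "yes", "no", "yes", "no", "yes", "yes", "no"]
rater_b = ["yes", "no", "yes", "no", "no", "yes", "yes", "yes", "yes", "no"]
s_value = bennett_s(rater_a, rater_b, num_categories=2)
print(round(s_value, 3))  # P_a = 0.8 and Q = 2, so S is about 0.6
```

With perfect agreement (P_a = 1) the function returns 1, and when P_a equals 1/Q it returns 0.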

The variance of S is

\operatorname{Var}(S) = \left(\frac{Q}{Q-1}\right)^2 \frac{P_a(1-P_a)}{n-1}

where n is the number of items rated.
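
The variance can be computed the same way. This sketch assumes that n is the number of rated items and takes the numerator as P_a(1 - P_a), so that the variance is non-negative; the function name `bennett_s_variance` and the input values are illustrative assumptions.

```python
import math


def bennett_s_variance(p_a, num_categories, n):
    """Var(S) = (Q / (Q - 1))**2 * P_a * (1 - P_a) / (n - 1).

    p_a: observed proportion of agreement, between 0 and 1.
    num_categories: Q, the number of categories.
    n: number of rated items (assumed meaning of n).
    """
    if not 0.0 <= p_a <= 1.0:
        raise ValueError("p_a must lie between 0 and 1")
    if num_categories < 2:
        raise ValueError("at least two categories are required")
    if n < 2:
        raise ValueError("at least two rated items are required")

    scale = (num_categories / (num_categories - 1)) ** 2
    return scale * p_a * (1.0 - p_a) / (n - 1)


# Hypothetical values: P_a = 0.8, Q = 2, n = 10.
variance = bennett_s_variance(0.8, 2, 10)
standard_error = math.sqrt(variance)
print(round(variance, 4), round(standard_error, 4))
```

The square root of the variance gives a standard error for S, which is how such a variance is typically used in practice.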

Notes

This statistic is also known as Guilford’s G.^[3] Guilford was the first person to use the approach extensively in the determination of inter-rater reliability.

Notes and References

1. Bennett EM, Alpert R, Goldstein AC (1954). Communications through limited response questioning. Public Opinion Quarterly. 18 (3): 303–308. doi:10.1086/266520.
2. Warrens MJ (May 2012). The effect of combining categories on Bennett, Alpert and Goldstein's S. Statistical Methodology. 9 (3): 341–352. doi:10.1016/j.stamet.2011.09.001. hdl:1887/18383.
3. Holley JW, Guilford JP (1964). A note on the G index of agreement. Educ Psych Measurement. 24 (4): 749–753. doi:10.1177/001316446402400402. S2CID 143846590.