In statistics, Samuelson's inequality, named after the economist Paul Samuelson,[1] also called the Laguerre–Samuelson inequality,[2] [3] after the mathematician Edmond Laguerre, states that every member of a collection of real numbers $x_1, \ldots, x_n$ lies within $\sqrt{n-1}$ uncorrected sample standard deviations of their sample mean.
If we let
$$\overline{x} = \frac{x_1 + \cdots + x_n}{n}$$
be the sample mean and
$$s = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (x_i - \overline{x})^2}$$
be the standard deviation of the sample, then
$$\overline{x} - s\sqrt{n-1} \le x_j \le \overline{x} + s\sqrt{n-1} \quad \text{for } j = 1, \ldots, n.$$
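As a quick numerical illustration, the following minimal sketch computes the bounds $\overline{x} \pm s\sqrt{n-1}$ with the uncorrected ($1/n$) standard deviation and checks that every data point falls inside them. The function name `samuelson_bounds` and the sample values are chosen here purely for illustration.

```python
import math


def samuelson_bounds(xs):
    """Return (lower, upper) = mean -/+ s*sqrt(n-1).

    s is the uncorrected (divide-by-n) sample standard deviation.
    """
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two data points")
    mean = sum(xs) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    half_width = s * math.sqrt(n - 1)
    return mean - half_width, mean + half_width


# Arbitrary illustrative sample (not taken from any real data set).
data = [2.0, 3.0, 5.0, 7.0, 11.0]
lower, upper = samuelson_bounds(data)

# Every point must lie inside the closed interval [lower, upper].
assert all(lower <= x <= upper for x in data)
print(lower, upper)
```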
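Equality holds on the left (or right) for $x_j$ if and only if all the $n-1$ values $x_i$ other than $x_j$ are equal to each other and greater (smaller) than $x_j$.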
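The "if" direction of the equality condition can be checked by hand. As a sketch, suppose $x_j = a$ and the remaining $n-1$ values all equal $b$, with $s$ the uncorrected standard deviation; the symbols $a$ and $b$ are introduced here only for illustration.

```latex
\begin{align*}
\overline{x} &= \frac{a + (n-1)b}{n}, \qquad
x_j - \overline{x} = \frac{(n-1)(a-b)}{n}, \qquad
x_i - \overline{x} = \frac{b-a}{n} \quad (i \ne j), \\
s^2 &= \frac{1}{n}\left[\frac{(n-1)^2 (a-b)^2}{n^2} + (n-1)\,\frac{(a-b)^2}{n^2}\right]
     = \frac{(n-1)(a-b)^2}{n^2}, \\
s\sqrt{n-1} &= \frac{(n-1)\,|a-b|}{n} = |x_j - \overline{x}|.
\end{align*}
```

So $x_j$ sits exactly on the lower bound when $a < b$ and exactly on the upper bound when $a > b$.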
If you instead define
$$s = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \overline{x})^2}$$
then the inequality
$$\overline{x} - s\sqrt{n-1} \le x_j \le \overline{x} + s\sqrt{n-1}$$
still applies and can be slightly tightened to
$$\overline{x} - s\tfrac{n-1}{\sqrt{n}} \le x_j \le \overline{x} + s\tfrac{n-1}{\sqrt{n}}.$$
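The two forms for the corrected ($1/(n-1)$) standard deviation can be compared numerically. The sketch below uses `statistics.stdev` from the Python standard library, which applies the $n-1$ divisor; the function name `corrected_bounds` and the sample values are illustrative assumptions.

```python
import math
import statistics


def corrected_bounds(xs):
    """Return (loose, tight) intervals built from the corrected standard deviation.

    loose: mean -/+ s*sqrt(n-1)
    tight: mean -/+ s*(n-1)/sqrt(n)
    """
    n = len(xs)
    mean = sum(xs) / n
    s = statistics.stdev(xs)  # corrected: divides by n - 1
    loose = s * math.sqrt(n - 1)
    tight = s * (n - 1) / math.sqrt(n)
    return (mean - loose, mean + loose), (mean - tight, mean + tight)


data = [2.0, 3.0, 5.0, 7.0, 11.0]  # arbitrary illustrative sample
loose, tight = corrected_bounds(data)

# The tightened interval is contained in the loose one and still holds every point.
assert loose[0] <= tight[0] and tight[1] <= loose[1]
assert all(tight[0] <= x <= tight[1] for x in data)
```

The tightened interval is the same one obtained from the uncorrected standard deviation, because the corrected value equals the uncorrected one multiplied by $\sqrt{n/(n-1)}$.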
Chebyshev's inequality locates a certain fraction of the data within certain bounds, while Samuelson's inequality locates all the data points within certain bounds.
The bounds given by Chebyshev's inequality are unaffected by the number of data points, while for Samuelson's inequality the bounds loosen as the sample size increases. Thus for large enough data sets, Chebyshev's inequality is more useful.
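To make the comparison concrete, the sketch below contrasts the two multipliers of the standard deviation: Samuelson's $\sqrt{n-1}$, which grows with the sample size, and the Chebyshev multiplier $k = 1/\sqrt{1-p}$ that guarantees at least a fraction $p$ of the data within $k$ standard deviations. The function names and the choice $p = 0.95$ are illustrative assumptions.

```python
import math


def samuelson_k(n):
    """Half-width of Samuelson's interval in units of the uncorrected standard deviation."""
    if n < 2:
        raise ValueError("need at least two data points")
    return math.sqrt(n - 1)


def chebyshev_k(fraction):
    """Smallest k for which Chebyshev's inequality guarantees at least
    `fraction` of the data within k standard deviations (1 - 1/k^2 >= fraction)."""
    if not 0.0 <= fraction < 1.0:
        raise ValueError("fraction must be in [0, 1)")
    return 1.0 / math.sqrt(1.0 - fraction)


# Samuelson's multiplier grows with n; Chebyshev's depends only on the fraction.
for n in (5, 21, 100, 10_000):
    print(n, round(samuelson_k(n), 3), round(chebyshev_k(0.95), 3))
```

With $p = 0.95$ the Chebyshev multiplier is $\sqrt{20}$, so Samuelson's interval becomes the wider of the two once $n$ exceeds 21, although it still covers all the data rather than only 95% of it.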
Samuelson's inequality may be considered a reason why studentization of residuals should be done externally.
Samuelson was not the first to describe this relationship: the first was probably Laguerre in 1880 while investigating the roots (zeros) of polynomials.[2] [5] Consider a polynomial with all roots real:
$$a_0 x^n + a_1 x^{n-1} + \cdots + a_{n-1} x + a_n = 0$$
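Take $a_0 = 1$ (dividing through by the leading coefficient if necessary), write $x_1, \ldots, x_n$ for the roots, and let
$$t_1 = \sum x_i \quad \text{and} \quad t_2 = \sum x_i^2.$$
Then
$$a_1 = -\sum x_i = -t_1$$
and
$$a_2 = \sum_{i<j} x_i x_j = \frac{t_1^2 - t_2}{2}.$$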
In terms of the coefficients
$$t_2 = a_1^2 - 2a_2$$
Laguerre showed that the roots of this polynomial were bounded by
$$-\frac{a_1}{n} \pm b\sqrt{n-1}$$
where
$$b = \frac{\sqrt{n t_2 - t_1^2}}{n} = \frac{\sqrt{(n-1)a_1^2 - 2n a_2}}{n}$$
Inspection shows that $-\tfrac{a_1}{n}$ is the mean of the roots and that $b$ is the standard deviation of the roots.
Laguerre failed to notice this relationship with the means and standard deviations of the roots, being more interested in the bounds themselves. This relationship permits a rapid estimate of the bounds of the roots and may be of use in their location.
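As a rough sketch of such an estimate, the function below takes the degree $n$ and the coefficients $a_1$ and $a_2$ of a monic polynomial whose roots are assumed to be all real, and returns the interval $-a_1/n \pm b\sqrt{n-1}$ with $b = \sqrt{(n-1)a_1^2 - 2n a_2}\,/\,n$. The function name and the test polynomial are chosen here for illustration only.

```python
import math


def laguerre_root_bounds(a1, a2, n):
    """Interval containing every root of x^n + a1*x^(n-1) + a2*x^(n-2) + ...,
    assuming the polynomial is monic and all of its roots are real."""
    if n < 2:
        raise ValueError("degree must be at least 2")
    # (n-1)*a1^2 - 2*n*a2 equals n^2 times the variance of the roots,
    # so it cannot be negative when all roots are real.
    radicand = (n - 1) * a1 ** 2 - 2 * n * a2
    if radicand < 0:
        raise ValueError("negative radicand: the roots cannot all be real")
    b = math.sqrt(radicand) / n      # standard deviation of the roots
    center = -a1 / n                 # mean of the roots
    half_width = b * math.sqrt(n - 1)
    return center - half_width, center + half_width


# (x-1)(x-2)(x-3)(x-4) = x^4 - 10x^3 + 35x^2 - 50x + 24
lower, upper = laguerre_root_bounds(a1=-10, a2=35, n=4)
assert all(lower <= r <= upper for r in (1, 2, 3, 4))
print(lower, upper)
```

Only the two leading non-trivial coefficients are needed, which is what makes the estimate quick; the interval is typically wider than the actual span of the roots.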
When the coefficients $a_1$ and $a_2$