In statistics, inverse-variance weighting is a method of aggregating two or more random variables to minimize the variance of the weighted average. Each random variable is weighted in inverse proportion to its variance (i.e., proportional to its precision).
Given a sequence of independent observations y_i with variances \sigma_i^2, the inverse-variance weighted average is given by[1]

\hat{y} = \frac{\sum_i y_i/\sigma_i^2}{\sum_i 1/\sigma_i^2}.
Among all weighted averages, the inverse-variance weighted average has the least variance, given by

Var(\hat{y}) = \frac{1}{\sum_i 1/\sigma_i^2}.
If the variances of the measurements are all equal, then the inverse-variance weighted average becomes the simple average.
Inverse-variance weighting is typically used in statistical meta-analysis or sensor fusion to combine the results from independent measurements.
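As a concrete illustration, here is a minimal NumPy sketch of both formulas; the helper name inverse_variance_mean and the example numbers are illustrative assumptions, not from the source:

```python
import numpy as np

def inverse_variance_mean(y, var):
    """Combine independent estimates y with variances var.

    Returns the inverse-variance weighted average and its variance.
    """
    y = np.asarray(y, dtype=float)
    w = 1.0 / np.asarray(var, dtype=float)  # weights proportional to precision
    y_hat = np.sum(w * y) / np.sum(w)       # sum(y_i/sigma_i^2) / sum(1/sigma_i^2)
    var_hat = 1.0 / np.sum(w)               # 1 / sum(1/sigma_i^2)
    return y_hat, var_hat

# Three measurements of the same quantity with different uncertainties:
y_hat, var_hat = inverse_variance_mean([9.8, 9.9, 9.6], [0.01, 0.04, 0.25])
print(y_hat, var_hat)  # the combined variance is smaller than any input variance
```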
Suppose an experimenter wishes to measure the value of a quantity, say the acceleration due to gravity of Earth, whose true value happens to be \mu. A careful experimenter makes multiple measurements, which we denote with n random variables X_1, X_2, \ldots, X_n. If the measurements are noisy but unbiased, i.e., the measuring device does not systematically over- or underestimate the true value and the errors are scattered symmetrically, then the expectation value E[X_i] = \mu for all i. The scatter in the measurements is characterised by the variance Var(X_i) := \sigma_i^2, and, if the measurements are performed under identical conditions, all the \sigma_i are the same, which we shall denote by \sigma.

Given the n measurements, a typical estimator for \mu, denoted \hat{\mu}, is the simple average

\overline{X} = \frac{1}{n} \sum_i X_i.

Note that this empirical average is itself a random variable: its expectation value E[\overline{X}] is \mu, but it also has a scatter. If the individual measurements are uncorrelated, the square of the error in the estimate is given by

Var(\overline{X}) = \frac{1}{n^2} \sum_i \sigma_i^2 = \left(\frac{\sigma}{\sqrt{n}}\right)^2.

Hence, if all the \sigma_i are equal, the error in the estimate decreases with increasing n as 1/\sqrt{n}, making more observations preferable.
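The 1/\sqrt{n} behaviour is easy to check numerically. The following is a small Monte Carlo sketch; the true value, per-measurement scatter, and sample sizes are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 9.81, 0.3  # illustrative true value and per-measurement scatter

for n in (10, 100, 1000):
    # scatter of the simple average across 4000 repeated experiments
    means = rng.normal(mu, sigma, size=(4000, n)).mean(axis=1)
    print(n, means.std(), sigma / np.sqrt(n))  # empirical vs theoretical scatter
```

The empirical standard deviation of the simple average tracks \sigma/\sqrt{n} closely.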
Instead of n repeated measurements with one instrument, suppose the experimenter makes n measurements of the same quantity with n different instruments of varying quality. Then there is no reason to expect the different \sigma_i to be the same: some instruments may be noisier than others. In the example of measuring the acceleration due to gravity g, the different "instruments" could be a simple pendulum, an analysis of projectile motion, and so on. The simple average \overline{X} is then no longer an optimal estimator, since the error in \overline{X} may actually exceed the error in the least noisy measurement if the measurement errors differ widely. Instead of discarding the noisy measurements that increase the final error, the experimenter can combine all the measurements with weights that give more importance to the least noisy measurements and vice versa. Given the knowledge of \sigma_1^2, \sigma_2^2, \ldots, \sigma_n^2, an improved estimator for \mu can be obtained by minimising the error of the weighted mean

\hat{\mu} = \frac{\sum_i w_i X_i}{\sum_i w_i},

whose error is minimised for w_i = 1/\sigma_i^2. The variance of the estimator is

Var(\hat{\mu}) = \frac{\sum_i w_i^2 \sigma_i^2}{\left(\sum_i w_i\right)^2},

which for the optimal choice of weights becomes

Var(\hat{\mu}_{opt}) = \left(\sum_i \sigma_i^{-2}\right)^{-1}.

Note that since Var(\hat{\mu}_{opt}) < \min_j \sigma_j^2, the optimal estimator \hat{\mu}_{opt} has less error than any of the individual measurements; even a noisy measurement improves the error of the final estimator.
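A quick numerical comparison makes both claims concrete; the variances below are illustrative assumptions:

```python
import numpy as np

var = np.array([0.01, 0.04, 1.0])  # one instrument is much noisier than the others
n = len(var)

var_simple = var.sum() / n**2      # Var of simple average: (1/n^2) * sum(sigma_i^2)
var_opt = 1.0 / (1.0 / var).sum()  # Var of optimal estimator: (sum(sigma_i^-2))^-1

print(var_simple)  # ~0.117: worse than the best single instrument (0.01)
print(var_opt)     # ~0.0079: better than every individual measurement
```

The simple average is dragged down by the noisy instrument, while the inverse-variance weighted estimator beats even the best single measurement.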
Consider a generic weighted sum Y = \sum_i w_i X_i, where the weights w_i are normalised such that \sum_i w_i = 1. If the X_i are all independent, the variance of Y is given by

Var(Y) = \sum_i w_i^2 \sigma_i^2.
Hence, Var(Y) can be minimised over the weights subject to the constraint \sum_i w_i = 1, using a Lagrange multiplier w_0:

Var(Y) = \sum_i w_i^2 \sigma_i^2 - w_0 \left(\sum_i w_i - 1\right).
For k > 0,

0 = \frac{\partial}{\partial w_k} Var(Y) = 2 w_k \sigma_k^2 - w_0,

which implies that:

w_k = \frac{w_0/2}{\sigma_k^2}.
The main takeaway here is that w_k \propto 1/\sigma_k^2. Since \sum_i w_i = 1,

\frac{2}{w_0} = \sum_i \frac{1}{\sigma_i^2} := \frac{1}{\sigma_0^2}.
The individual normalised weights are:
w_k = \frac{1}{\sigma_k^2} \left(\sum_i \frac{1}{\sigma_i^2}\right)^{-1} = \frac{\sigma_0^2}{\sigma_k^2}.
It is easy to see that this extremum solution corresponds to the minimum from the second partial derivative test by noting that the variance is a quadratic function of the weights. Thus, the minimum variance of the estimator is then given by:
Var(Y) = \sum_i \frac{\sigma_0^4}{\sigma_i^4} \sigma_i^2 = \sigma_0^4 \sum_i \frac{1}{\sigma_i^2} = \frac{\sigma_0^4}{\sigma_0^2} = \sigma_0^2 = \frac{1}{\sum_i 1/\sigma_i^2}.
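As a sanity check of the derivation, the short sketch below (with illustrative variances) confirms that the normalised weights attain \sigma_0^2 and that other normalised weight vectors never do better:

```python
import numpy as np

rng = np.random.default_rng(1)
var = np.array([0.5, 0.2, 1.0, 0.1])     # illustrative sigma_i^2

w_opt = (1.0 / var) / (1.0 / var).sum()  # w_k = (1/sigma_k^2) / sum(1/sigma_i^2)
var_opt = (w_opt**2 * var).sum()         # should equal sigma_0^2

for _ in range(1000):
    w = rng.random(var.size)
    w /= w.sum()                         # any other normalised weight vector...
    assert (w**2 * var).sum() >= var_opt # ...has variance at least sigma_0^2

print(var_opt, 1.0 / (1.0 / var).sum())  # both print the same number
```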
For normally distributed random variables, the inverse-variance weighted average can also be derived as the maximum likelihood estimate for the true value. Furthermore, from a Bayesian perspective, the posterior distribution for the true value given normally distributed observations y_i and a flat prior is a normal distribution with the inverse-variance weighted average as its mean and Var(Y) = \sigma_0^2 as its variance.
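To make the maximum-likelihood connection explicit, here is the standard one-line argument, sketched for completeness. For independent X_i \sim N(\mu, \sigma_i^2), the log-likelihood is

\ln L(\mu) = -\sum_i \frac{(X_i - \mu)^2}{2\sigma_i^2} + const,

and setting \frac{d \ln L}{d\mu} = \sum_i \frac{X_i - \mu}{\sigma_i^2} = 0 yields

\hat{\mu} = \frac{\sum_i X_i/\sigma_i^2}{\sum_i 1/\sigma_i^2},

which is exactly the inverse-variance weighted average.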
For potentially correlated multivariate distributions, an equivalent argument leads to an optimal weighting based on the covariance matrices C_i of the individual vector-valued estimates x_i:

\hat{x} = \left(\sum_i C_i^{-1}\right)^{-1} \sum_i C_i^{-1} x_i,

\hat{C} = \left(\sum_i C_i^{-1}\right)^{-1}.
For multivariate distributions the term "precision-weighted" average is more commonly used.
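The following is a minimal NumPy sketch of this precision-weighted fusion; the helper name precision_weighted_fusion and the two-sensor numbers are illustrative assumptions:

```python
import numpy as np

def precision_weighted_fusion(xs, Cs):
    """Fuse independent vector estimates xs with covariance matrices Cs.

    C_hat = inv(sum_i inv(C_i)); x_hat = C_hat @ sum_i inv(C_i) @ x_i.
    """
    precisions = [np.linalg.inv(C) for C in Cs]          # precision matrices
    C_hat = np.linalg.inv(np.sum(precisions, axis=0))    # fused covariance
    x_hat = C_hat @ np.sum([P @ x for P, x in zip(precisions, xs)], axis=0)
    return x_hat, C_hat

# Two 2-D position estimates, e.g. from two sensors (numbers illustrative):
x1, C1 = np.array([1.0, 2.0]), np.diag([0.1, 0.4])
x2, C2 = np.array([1.2, 1.8]), np.diag([0.3, 0.1])
x_hat, C_hat = precision_weighted_fusion([x1, x2], [C1, C2])
print(x_hat)   # each component leans toward the more precise sensor
print(C_hat)   # smaller, in the matrix sense, than either C1 or C2
```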