Uniformly most powerful test explained

In statistical hypothesis testing, a uniformly most powerful (UMP) test is a hypothesis test which has the greatest power $1-\beta$ among all possible tests of a given size $\alpha$. For example, according to the Neyman–Pearson lemma, the likelihood-ratio test is UMP for testing simple (point) hypotheses.

Setting

Let $X$ denote a random vector (corresponding to the measurements), taken from a parametrized family of probability density functions or probability mass functions $f_\theta(x)$, which depends on the unknown deterministic parameter $\theta\in\Theta$. The parameter space $\Theta$ is partitioned into two disjoint sets $\Theta_0$ and $\Theta_1$. Let $H_0$ denote the hypothesis that $\theta\in\Theta_0$, and let $H_1$ denote the hypothesis that $\theta\in\Theta_1$. The binary test of hypotheses is performed using a test function $\varphi(x)$ with a rejection region $R$ (a subset of measurement space).

$$\varphi(x)=\begin{cases} 1 & \text{if } x\in R\\ 0 & \text{if } x\in R^c \end{cases}$$

meaning that $H_1$ is in force if the measurement $X\in R$ and that $H_0$ is in force if the measurement $X\in R^c$. Note that $R\cup R^c$ is a disjoint covering of the measurement space.

Formal definition

A test function $\varphi(x)$ is UMP of size $\alpha$ if for any other test function $\varphi'(x)$ satisfying

$$\sup_{\theta\in\Theta_0}\operatorname{E}[\varphi'(X)\mid\theta]=\alpha'\leq\alpha=\sup_{\theta\in\Theta_0}\operatorname{E}[\varphi(X)\mid\theta]$$

we have

$$\forall\theta\in\Theta_1,\quad \operatorname{E}[\varphi'(X)\mid\theta]=1-\beta'(\theta)\leq 1-\beta(\theta)=\operatorname{E}[\varphi(X)\mid\theta].$$
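The two quantities in this definition, the size (a supremum over the null set) and the power (an expectation under the alternative), can be computed directly in a simple case. The following is a minimal sketch, assuming a single observation X ~ N(θ, 1), the null set θ ≤ 0 and a threshold rejection region; the model, the grid and the names `power`, `size`, `null_grid` and `cutoff` are illustrative assumptions, not part of the definition.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution, used for its CDF


def power(theta: float, cutoff: float) -> float:
    """E[phi(X) | theta] for phi(x) = 1 if x > cutoff, assuming X ~ N(theta, 1)."""
    return 1.0 - STD_NORMAL.cdf(cutoff - theta)


def size(null_grid, cutoff: float) -> float:
    """Approximate the supremum of E[phi(X) | theta] over Theta_0 on a finite grid."""
    return max(power(theta, cutoff) for theta in null_grid)


# Assumed null set Theta_0 = {theta <= 0}, represented by a grid on [-2, 0].
null_grid = [-2.0 + k / 10 for k in range(21)]

# Cutoff chosen so that the rejection probability at theta = 0 equals 0.05.
cutoff = STD_NORMAL.inv_cdf(0.95)

alpha = size(null_grid, cutoff)       # size of the test
power_at_one = power(1.0, cutoff)     # power at the alternative theta = 1
```

In this assumed model the rejection probability increases with θ, so the supremum over the null set is attained at the boundary value θ = 0.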

The Karlin–Rubin theorem

The Karlin–Rubin theorem can be regarded as an extension of the Neyman–Pearson lemma for composite hypotheses.[1] Consider a scalar measurement having a probability density function parameterized by a scalar parameter $\theta$, and define the likelihood ratio $l(x)=f_{\theta_1}(x)/f_{\theta_0}(x)$. If $l(x)$ is monotone non-decreasing in $x$ for any pair $\theta_1\geq\theta_0$ (meaning that the greater $x$ is, the more likely $H_1$ is), then the threshold test:

$$\varphi(x)=\begin{cases} 1 & \text{if } x>x_0\\ 0 & \text{if } x<x_0 \end{cases}$$

where $x_0$ is chosen such that $\operatorname{E}_{\theta_0}\varphi(X)=\alpha$, is the UMP test of size $\alpha$ for testing

$$H_0:\theta\leq\theta_0 \quad\text{vs.}\quad H_1:\theta>\theta_0.$$

Note that exactly the same test is also UMP for testing

$$H_0:\theta=\theta_0 \quad\text{vs.}\quad H_1:\theta>\theta_0.$$
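The conditions of the theorem can be checked numerically in a concrete case. The sketch below assumes a normal location family X ~ N(θ, σ²) with known σ, which is a choice made here for illustration only; the function names `likelihood_ratio`, `threshold` and `rejection_probability` are hypothetical. It verifies on a grid that the likelihood ratio is non-decreasing in x and picks the threshold so that the rejection probability at the boundary value of the null equals α.

```python
from statistics import NormalDist


def likelihood_ratio(x, theta0, theta1, sigma=1.0):
    """l(x) = f_theta1(x) / f_theta0(x) for the assumed N(theta, sigma^2) family."""
    return NormalDist(theta1, sigma).pdf(x) / NormalDist(theta0, sigma).pdf(x)


def threshold(theta0, alpha, sigma=1.0):
    """x0 such that P(X > x0) = alpha when X ~ N(theta0, sigma^2)."""
    return NormalDist(theta0, sigma).inv_cdf(1.0 - alpha)


def rejection_probability(theta, x0, sigma=1.0):
    """E_theta[phi(X)] for the threshold test phi(x) = 1 if x > x0."""
    return 1.0 - NormalDist(theta, sigma).cdf(x0)


theta0, theta1, alpha = 0.0, 1.0, 0.05

# Likelihood ratio on a grid of x values in [-3, 3]; it should not decrease.
grid = [-3.0 + k / 10 for k in range(61)]
ratios = [likelihood_ratio(x, theta0, theta1) for x in grid]
is_monotone = all(a <= b for a, b in zip(ratios, ratios[1:]))

x0 = threshold(theta0, alpha)
size_at_boundary = rejection_probability(theta0, x0)   # equals alpha by construction
inside_null = rejection_probability(-1.0, x0)           # a value with theta < theta0
```

Because the rejection probability grows with θ in this family, values of θ below θ₀ reject less often than α, which is why the same threshold serves both the composite null θ ≤ θ₀ and the simple null θ = θ₀.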

Important case: exponential family

Although the Karlin–Rubin theorem may seem weak because of its restriction to a scalar parameter and a scalar measurement, it turns out that there exists a host of problems for which the theorem holds. In particular, the one-dimensional exponential family of probability density functions or probability mass functions with

$$f_\theta(x)=g(\theta)h(x)\exp(\eta(\theta)T(x))$$

has a monotone non-decreasing likelihood ratio in the sufficient statistic $T(x)$, provided that $\eta(\theta)$ is non-decreasing.
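The monotonicity claim follows in one step from the form of the density. Writing out the likelihood ratio for a pair θ₁ ≥ θ₀, the factor h(x) cancels:

```latex
l(x)=\frac{f_{\theta_1}(x)}{f_{\theta_0}(x)}
    =\frac{g(\theta_1)}{g(\theta_0)}
     \exp\bigl((\eta(\theta_1)-\eta(\theta_0))\,T(x)\bigr),
\qquad \theta_1\geq\theta_0 .
```

The ratio g(θ₁)/g(θ₀) is a positive constant in x, and η(θ₁) − η(θ₀) ≥ 0 when η is non-decreasing, so l is a non-decreasing function of T(x). As an illustration chosen here, the Poisson family f_θ(x) = e^{−θ} θ^x / x! fits this form with g(θ) = e^{−θ}, h(x) = 1/x!, η(θ) = ln θ and T(x) = x.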

Example

Let $X=(X_0,\ldots,X_{M-1})$ denote i.i.d. normally distributed $N$-dimensional random vectors with mean $\theta m$ and covariance matrix $R$. We then have

\begin{align}
f_\theta(X)={}&(2\pi)^{-MN/2}|R|^{-M/2}\exp\left\{-\frac{1}{2}\sum_{n=0}^{M-1}(X_n-\theta m)^TR^{-1}(X_n-\theta m)\right\}\\[4pt]
={}&(2\pi)^{-MN/2}|R|^{-M/2}\exp\left\{-\frac{1}{2}\sum_{n=0}^{M-1}\left(\theta^2m^TR^{-1}m\right)\right\}\\[4pt]
&\exp\left\{-\frac{1}{2}\sum_{n=0}^{M-1}X_n^TR^{-1}X_n\right\}\exp\left\{\theta m^TR^{-1}\sum_{n=0}^{M-1}X_n\right\}
\end{align}

which is exactly in the form of the exponential family shown in the previous section, with the sufficient statistic being

$$T(X)=m^TR^{-1}\sum_{n=0}^{M-1}X_n.$$

Thus, we conclude that the test

$$\varphi(T)=\begin{cases}1 & T>t_0\\ 0 & T<t_0\end{cases}$$

with $t_0$ chosen such that $\operatorname{E}_{\theta_0}\varphi(T)=\alpha$, is the UMP test of size $\alpha$ for testing $H_0:\theta\leqslant\theta_0$ vs. $H_1:\theta>\theta_0$.
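The threshold in this example can be written in closed form, because T is a linear function of Gaussian vectors. With w = R⁻¹m and q = mᵀR⁻¹m, T has mean Mθq and variance Mq; this distribution is derived here from the model above and is not stated in the original text. The following is a minimal sketch in pure Python; the helper names `solve`, `ump_threshold` and `statistic`, and the numerical values of m, R, M and the samples, are illustrative assumptions.

```python
from statistics import NormalDist


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def ump_threshold(m, cov, num_vectors, theta0, alpha):
    """Return (w, t0) with w = R^{-1} m and P(T > t0) = alpha at theta = theta0."""
    w = solve(cov, m)
    q = sum(wi * mi for wi, mi in zip(w, m))  # q = m^T R^{-1} m
    # Derived assumption: T ~ N(M * theta * q, M * q) under parameter theta.
    null_dist = NormalDist(num_vectors * theta0 * q, (num_vectors * q) ** 0.5)
    return w, null_dist.inv_cdf(1.0 - alpha)


def statistic(w, samples):
    """T(X) = m^T R^{-1} sum_n X_n, computed as w^T sum_n X_n (R is symmetric)."""
    return sum(sum(wi * xi for wi, xi in zip(w, x)) for x in samples)


m = [1.0, 1.0]                      # hypothetical known vector m
cov = [[2.0, 1.0], [1.0, 2.0]]      # hypothetical covariance matrix R
samples = [[1.0, 2.0], [3.0, 4.0]]  # hypothetical measurements, M = 2

w, t0 = ump_threshold(m, cov, num_vectors=len(samples), theta0=0.0, alpha=0.05)
reject = statistic(w, samples) > t0  # True means the test decides for H1
```

Solving the linear system Rw = m once avoids forming R⁻¹ explicitly; the decision then needs only the inner product of w with the sum of the measurements.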

Further discussion

Finally, we note that in general, UMP tests do not exist for vector parameters or for two-sided tests (a test in which one hypothesis lies on both sides of the alternative). The reason is that in these situations, the most powerful test of a given size for one possible value of the parameter (e.g. for $\theta_1$ where $\theta_1>\theta_0$) is different from the most powerful test of the same size for a different value of the parameter (e.g. for $\theta_2$ where $\theta_2<\theta_0$). As a result, no test is uniformly most powerful in these situations.
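The two-sided case can be illustrated with a small power comparison. This is a sketch under assumptions chosen here: a single observation X ~ N(θ, 1), the null θ = 0 against θ ≠ 0, and size 0.05. It compares three tests of the same size: a right-tailed test, a left-tailed test and an equal-tailed two-sided test. The function names are hypothetical.

```python
from statistics import NormalDist

Z = NormalDist()   # standard normal distribution
ALPHA = 0.05       # common size of all three tests at theta = 0


def power_right(theta: float) -> float:
    """Rejection probability of the test that rejects when x > z_{1-alpha}."""
    return 1.0 - Z.cdf(Z.inv_cdf(1.0 - ALPHA) - theta)


def power_left(theta: float) -> float:
    """Rejection probability of the test that rejects when x < z_{alpha}."""
    return Z.cdf(Z.inv_cdf(ALPHA) - theta)


def power_two_sided(theta: float) -> float:
    """Rejection probability of the test that rejects when |x| > z_{1-alpha/2}."""
    c = Z.inv_cdf(1.0 - ALPHA / 2.0)
    return (1.0 - Z.cdf(c - theta)) + Z.cdf(-c - theta)


# Power of each test at one alternative on each side of theta0 = 0.
at_plus_one = (power_right(1.0), power_two_sided(1.0), power_left(1.0))
at_minus_one = (power_right(-1.0), power_two_sided(-1.0), power_left(-1.0))
```

At θ = 1 the right-tailed test has the highest power of the three, and at θ = −1 the left-tailed test does. By the Neyman–Pearson lemma each one-sided test is most powerful against alternatives on its own side, so no single test of this size can dominate on both sides.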

Notes and References

Casella, G.; Berger, R.L. (2008), Statistical Inference, Brooks/Cole. (Theorem 8.3.17)
