In statistical decision theory, where we are faced with the problem of estimating a deterministic parameter (vector) $\theta \in \Theta$ from observations $x \in \mathcal{X}$, an estimator (estimation rule) $\delta^M$ is called minimax if its maximal risk is minimal among all estimators of $\theta$. In a sense this means that $\delta^M$ is an estimator which performs best in the worst possible case allowed in the problem.
Consider the problem of estimating a deterministic (not Bayesian) parameter $\theta \in \Theta$ from noisy or corrupt data $x \in \mathcal{X}$ related through the conditional probability distribution $P(x \mid \theta)$. Our goal is to find a "good" estimator $\delta(x)$ for estimating the parameter $\theta$, one that minimizes some given risk function $R(\theta, \delta)$. Here the risk function (strictly speaking a functional, since $R$ is a function of a function) is the expectation of some loss function $L(\theta, \delta)$ with respect to $P(x \mid \theta)$. A popular example of a loss function is the squared error loss $L(\theta, \delta) = \|\theta - \delta\|^2$, for which the risk function is the mean squared error (MSE).
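As a concrete illustration of the risk as an expectation of a loss, the following sketch (an assumed toy model with $x \sim N(\theta, \sigma^2)$ and the estimator $\delta(x) = x$; not taken from any reference) approximates the MSE risk by Monte Carlo averaging.

```python
import numpy as np

def mse_risk(delta, theta, sigma=1.0, n_trials=100_000, seed=0):
    """Monte Carlo approximation of R(theta, delta) = E[(theta - delta(x))^2]."""
    rng = np.random.default_rng(seed)
    x = rng.normal(loc=theta, scale=sigma, size=n_trials)   # x ~ P(x | theta)
    return np.mean((theta - delta(x)) ** 2)

identity = lambda x: x                    # the estimator delta(x) = x
print(mse_risk(identity, theta=2.0))      # close to sigma^2 = 1.0, for any theta
```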
Unfortunately, in general, the risk cannot be minimized directly, since it depends on the unknown parameter $\theta$ itself (if we knew the actual value of $\theta$, we would not need to estimate it). Therefore additional criteria for finding an optimal estimator in some sense are required. One such criterion is the minimax criterion.
Definition: An estimator $\delta^M : \mathcal{X} \rightarrow \Theta$ is called minimax with respect to a risk function $R(\theta, \delta)$ if it achieves the smallest maximum risk among all estimators, that is, if it satisfies

$$\sup_\theta R(\theta, \delta^M) = \inf_\delta \sup_\theta R(\theta, \delta).$$
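To make the sup/inf in this definition tangible, here is a minimal toy computation (the risk values below are made up purely for illustration): with finitely many candidate estimators and a finite parameter grid, the minimax rule is simply the one whose worst-case row of the risk table is smallest.

```python
import numpy as np

# risk[i, j] = R(theta_j, delta_i) for three hypothetical estimators and two
# hypothetical parameter values (assumed numbers, for illustration only)
risk = np.array([[1.0, 4.0],
                 [3.0, 3.0],
                 [5.0, 1.0]])

worst_case = risk.max(axis=1)         # sup over theta, one value per estimator
minimax_index = worst_case.argmin()   # inf over the candidate estimators
print(minimax_index, worst_case[minimax_index])   # -> 1, 3.0
```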
Logically, an estimator is minimax when it is the best in the worst case. Continuing this logic, a minimax estimator should be a Bayes estimator with respect to a least favorable prior distribution of $\theta$. To demonstrate this notion, denote the average risk of the Bayes estimator $\delta_\pi$ with respect to a prior distribution $\pi$ as

$$r_\pi = \int R(\theta, \delta_\pi)\, d\pi(\theta).$$
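The average risk $r_\pi$ can be approximated numerically as well. The sketch below (an assumed binomial model with a Beta prior, used only for illustration) draws $\theta$ from the prior, then data from $P(x \mid \theta)$, and averages the squared error of the posterior-mean (Bayes) estimator.

```python
import numpy as np

def average_risk(delta, prior_a, prior_b, n=10, n_trials=200_000, seed=0):
    """Monte Carlo approximation of r_pi for a Binomial(n, theta) model with a Beta prior."""
    rng = np.random.default_rng(seed)
    theta = rng.beta(prior_a, prior_b, size=n_trials)   # theta ~ pi
    x = rng.binomial(n, theta)                          # x ~ P(x | theta)
    return np.mean((delta(x, n) - theta) ** 2)

# posterior mean under a Beta(1, 1) prior, i.e. the Bayes estimator for squared error
posterior_mean = lambda x, n, a=1.0, b=1.0: (x + a) / (n + a + b)
print(average_risk(posterior_mean, prior_a=1.0, prior_b=1.0))
```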
Definition: A prior distribution $\pi$ is called least favorable if for every other distribution $\pi'$ the average risk satisfies $r_\pi \geq r_{\pi'}$.
Theorem 1: If $r_\pi = \sup_\theta R(\theta, \delta_\pi)$, then:
1. $\delta_\pi$ is minimax.
2. If $\delta_\pi$ is a unique Bayes estimator, it is also the unique minimax estimator.
3. $\pi$ is least favorable.
Corollary: If a Bayes estimator has constant risk, it is minimax. Note that this is not a necessary condition.
Example 1: Unfair coin[2]: Consider the problem of estimating the "success" rate of a binomial variable, $x \sim B(n, \theta)$. This may be viewed as estimating the rate at which an unfair coin falls on "heads" or "tails". In this case the Bayes estimator with respect to a Beta-distributed prior, $\theta \sim \text{Beta}(\sqrt{n}/2, \sqrt{n}/2)$, is

$$\delta^M = \frac{x + \sqrt{n}/2}{n + \sqrt{n}},$$

with constant Bayes risk

$$r = \frac{1}{4(1 + \sqrt{n})^2}$$

and, according to the Corollary, is minimax.
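The constant-risk claim is easy to check numerically. The following sketch (illustrative, with the assumed choice $n = 25$) computes the exact risk of $\delta^M$ at several values of $\theta$ by summing over the binomial distribution and compares it with $1/(4(1+\sqrt{n})^2)$.

```python
import numpy as np
from math import comb

n = 25
delta = lambda x: (x + np.sqrt(n) / 2) / (n + np.sqrt(n))

def exact_risk(theta):
    # E[(delta(x) - theta)^2], summing over all possible outcomes x = 0..n
    x = np.arange(n + 1)
    pmf = np.array([comb(n, k) for k in x]) * theta**x * (1.0 - theta)**(n - x)
    return float(np.sum(pmf * (delta(x) - theta) ** 2))

print([round(exact_risk(t), 6) for t in (0.1, 0.3, 0.5, 0.9)])
print(1 / (4 * (1 + np.sqrt(n)) ** 2))   # same value: the risk does not depend on theta
```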
Definition: A sequence of prior distributions $\pi_n$ is called least favorable if for any other distribution $\pi'$,

$$\lim_{n \to \infty} r_{\pi_n} \geq r_{\pi'}.$$
Theorem 2: If there are a sequence of priors $\pi_n$ and an estimator $\delta$ such that $\sup_\theta R(\theta, \delta) = \lim_{n \to \infty} r_{\pi_n}$, then:
1. $\delta$ is minimax.
2. The sequence $\pi_n$ is least favorable.
Notice that no uniqueness is guaranteed here. For example, the ML estimator of a normal mean (Example 2 below) may be attained as the limit of Bayes estimators with respect to a uniform prior $\pi_n \sim U[-n, n]$ as well as with respect to a normal prior $\pi_n \sim N(0, n\sigma^2)$.
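For intuition about Theorem 2, the following sketch (a scalar version of the normal-prior sequence, assuming $\sigma^2 = 1$) tabulates the Bayes estimator and Bayes risk under $\pi_n \sim N(0, n\sigma^2)$: the estimator $\frac{n}{n+1}x$ tends to the ML estimator $x$, and $r_{\pi_n} = \sigma^2 n/(n+1)$ tends to $\sigma^2$, the constant risk of the ML estimator.

```python
# Scalar model x ~ N(theta, sigma^2) with prior pi_n = N(0, n * sigma^2):
# the Bayes estimator is delta_n(x) = n/(n+1) * x and its Bayes risk is
# r_{pi_n} = sigma^2 * n/(n+1); both converge as n grows.
sigma2 = 1.0
for n in (1, 10, 100, 1000):
    shrinkage = n / (n + 1)              # delta_n(x) = shrinkage * x
    bayes_risk = sigma2 * n / (n + 1)    # r_{pi_n}
    print(n, round(shrinkage, 4), round(bayes_risk, 4))
# shrinkage -> 1 and r_{pi_n} -> sigma^2 = sup_theta R(theta, x), as in Theorem 2
```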
Example 2: Consider the problem of estimating the mean of a $p$-dimensional Gaussian random vector, $x \sim N(\theta, I_p \sigma^2)$. The maximum likelihood (ML) estimator for $\theta$ in this case is simply $\delta^{ML} = x$, and its risk is

$$R(\theta, \delta^{ML}) = E\{\|\delta^{ML} - \theta\|^2\} = \sum_{i=1}^{p} E\{(x_i - \theta_i)^2\} = p\sigma^2.$$
The risk is constant, but the ML estimator is actually not a Bayes estimator, so the Corollary of Theorem 1 does not apply. However, the ML estimator is the limit of the Bayes estimators with respect to the prior sequence $\pi_n \sim N(0, n\sigma^2)$, and hence is indeed minimax according to Theorem 2. Nonetheless, minimaxity does not always imply admissibility. In fact, in this example the ML estimator is known to be inadmissible whenever $p > 2$: the James–Stein estimator dominates the ML estimator whenever $p > 2$. Though both estimators have the same risk $p\sigma^2$ as $\|\theta\| \to \infty$, and both are minimax, the James–Stein estimator has smaller risk for any finite $\|\theta\|$.
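A Monte Carlo sketch of this comparison (assuming known $\sigma = 1$, $p = 10$, and the standard James–Stein form $(1 - (p-2)\sigma^2/\|x\|^2)\,x$) illustrates the dominance for small $\|\theta\|$ and the nearly matching risks when $\|\theta\|$ is large.

```python
import numpy as np

def risks(theta, sigma=1.0, n_trials=50_000, seed=0):
    """Monte Carlo risks of the ML and James-Stein estimators at a given theta."""
    p = theta.size
    rng = np.random.default_rng(seed)
    x = rng.normal(theta, sigma, size=(n_trials, p))
    ml = x                                                    # delta_ML(x) = x
    shrink = 1.0 - (p - 2) * sigma**2 / np.sum(x**2, axis=1, keepdims=True)
    js = shrink * x                                           # James-Stein estimate
    risk = lambda est: np.mean(np.sum((est - theta) ** 2, axis=1))
    return risk(ml), risk(js)

theta = np.zeros(10)            # p = 10, true mean at the origin
print(risks(theta))             # ML risk ~ 10, James-Stein risk much smaller
theta_far = 20.0 * np.ones(10)
print(risks(theta_far))         # risks nearly equal when ||theta|| is large
```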
In general, it is difficult, and often even impossible, to determine the minimax estimator. Nonetheless, in many cases a minimax estimator has been determined.
Example 3: Bounded normal mean: When estimating the mean of a normal vector $x \sim N(\theta, I_n \sigma^2)$, where it is known that $\|\theta\|^2 \leq M$, the Bayes estimator with respect to a prior that is uniformly distributed on the edge of the bounding sphere is known to be minimax whenever $M \leq n$. The analytical expression for this estimator is

$$\delta^M = \frac{n J_{n+1}(n\|x\|)}{\|x\|\, J_n(n\|x\|)}\, x,$$

where $J_n(t)$ is the modified Bessel function of the first kind of order $n$.
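For completeness, a direct evaluation of this expression (a sketch only, assuming $\sigma = 1$ and using scipy.special.iv for the modified Bessel function of the first kind) could look as follows.

```python
import numpy as np
from scipy.special import iv   # modified Bessel function of the first kind

def bounded_mean_estimator(x):
    """Evaluate the closed-form expression above for an observation vector x."""
    n = x.size
    r = np.linalg.norm(x)
    return (n * iv(n + 1, n * r)) / (r * iv(n, n * r)) * x

x = np.array([0.3, -0.2, 0.1])   # an illustrative observation
print(bounded_mean_estimator(x))
```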
The difficulty of determining the exact minimax estimator has motivated the study of estimators that are approximately minimax: an estimator $\delta'$ is called $c$-minimax if it satisfies, for some constant $c$,

$$\sup_{\theta \in \Theta} R(\theta, \delta') \leq c \inf_\delta \sup_\theta R(\theta, \delta).$$
For many estimation problems, especially in the non-parametric estimation setting, various approximate minimax estimators have been established. The design of the approximate minimax estimator is intimately related to the geometry of $\Theta$, such as its metric entropy number.
Sometimes, a minimax estimator may take the form of a randomised decision rule. An example is shown on the left. The parameter space has just two elements and each point on the graph corresponds to the risk of a decision rule: the x-coordinate is the risk when the parameter is $\theta_1$ and the y-coordinate is the risk when the parameter is $\theta_2$. In this decision problem, the minimax estimator is a randomised rule that uses $\delta_1$ with probability $1 - p$ and $\delta_2$ with probability $p$, so that its risk point is the corresponding convex combination of the two deterministic risk points.
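A toy computation (with assumed risk values for the two deterministic rules) shows how the mixing probability $p$ is chosen: the randomized rule's risk point is the convex combination of the two deterministic risk points, and $p$ is picked to minimize the larger of the two coordinates, i.e. the worst-case risk.

```python
import numpy as np

r1 = np.array([1.0, 4.0])   # (R(theta_1, delta_1), R(theta_2, delta_1)) -- assumed values
r2 = np.array([3.0, 2.0])   # (R(theta_1, delta_2), R(theta_2, delta_2)) -- assumed values

ps = np.linspace(0.0, 1.0, 1001)
worst = np.array([np.max((1 - p) * r1 + p * r2) for p in ps])
p_star = ps[worst.argmin()]
print(p_star, worst.min())  # -> ~0.75, 2.5: the mixture beats both pure rules (worst risks 4 and 3)
```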
Robust optimization is an approach to solving optimization problems under uncertainty in the knowledge of the underlying parameters. For instance, MMSE Bayesian estimation of a parameter requires knowledge of the parameter's correlation function. If this correlation function is not perfectly known, a popular minimax robust optimization approach is to define a set characterizing the uncertainty about the correlation function, and then to pursue a minimax optimization over the uncertainty set and over the estimator, respectively. Similar minimax optimizations can be pursued to make estimators robust to other imprecisely known parameters; such techniques have been studied, for instance, in the area of signal processing.
R. Fandom Noubiap and W. Seidel (2001) developed an algorithm for computing a Gamma-minimax decision rule when Gamma is given by a finite number of generalized moment conditions. Such a decision rule minimizes the maximum of the integrals of the risk function with respect to all distributions in Gamma. Gamma-minimax decision rules are of interest in robustness studies in Bayesian statistics.
J. O. Berger (1985). Statistical Decision Theory and Bayesian Analysis (2nd ed.). New York: Springer-Verlag. xv+425 pp. ISBN 0-387-96098-8. MR 0580664.