In probability theory and statistics, empirical likelihood (EL) is a nonparametric method for estimating the parameters of statistical models. It requires fewer assumptions about the error distribution while retaining some of the merits of likelihood-based inference. The estimation method requires that the data are independent and identically distributed (iid). It performs well even when the distribution is asymmetric or censored.[1] EL methods can also handle constraints and prior information on parameters. Art Owen pioneered work in this area with his 1988 paper.[2]
Given a set of n i.i.d. realizations y_i of random variables Y_i, the empirical distribution function is

\hat{F}(y) := \sum_{i=1}^n \pi_i I(Y_i < y),

with the indicator function I and the (normalized) weights \pi_i. The empirical likelihood is then

L := \prod_{i=1}^n \frac{\hat{F}(y_i) - \hat{F}(y_i - \delta y)}{\delta y},

where \delta y is a small increment.
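As a concrete illustration, the following Python sketch evaluates \hat{F} and the resulting log-likelihood under the common choice of uniform weights \pi_i = 1/n (the helper name ecdf is illustrative, not part of any standard library); since \hat{F} places mass \pi_i on each observation, the product of its jumps at the sample points reduces to \prod_i \pi_i.

```python
import numpy as np

def ecdf(y_sample, weights, t):
    """Evaluate F_hat(t) = sum_i pi_i * I(y_i < t)."""
    return float(weights[y_sample < t].sum())

# i.i.d. sample with uniform weights pi_i = 1/n
rng = np.random.default_rng(0)
y = rng.normal(size=20)
n = len(y)
pi = np.full(n, 1.0 / n)

print(ecdf(y, pi, 0.0))      # estimate of P(Y < 0)

# F_hat jumps by pi_i at each observed y_i, so the empirical log-likelihood
# is sum_i log(pi_i) = -n log n under uniform weights.
print(np.sum(np.log(pi)))
```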
Empirical likelihood estimation can be augmented with side information by imposing further constraints on the empirical distribution function (similar to the generalized estimating equations approach). For example, a moment condition such as the following can be incorporated using a Lagrange multiplier:
E[h(Y;\theta)] = \int_{-\infty}^{\infty} h(y;\theta)\, dF(y) = 0,

with the sample counterpart

\hat{E}[h(y;\theta)] = \sum_{i=1}^n \pi_i h(y_i;\theta) = 0.
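For example, with the estimating function h(y;\theta) = y - \theta (the moment condition defining the mean), the sample constraint reads \sum_{i=1}^n \pi_i (y_i - \theta) = 0, i.e. the weighted sample mean under the \pi_i is forced to equal \theta.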
With similar constraints, we could also model correlation.
The empirical-likelihood method can also be employed for discrete distributions.[4] Given

p_i := \hat{F}(y_i) - \hat{F}(y_i - \delta y), \qquad i = 1, \ldots, n,

such that p_i \geq 0 and \sum_{i=1}^n p_i = 1, the empirical likelihood is again

L(p_1, \ldots, p_n) = \prod_{i=1}^n p_i.
Using the Lagrangian multiplier method to maximize the logarithm of the empirical likelihood subject to the trivial normalization constraint, we find p_i = 1/n, so that \hat{F} is simply the empirical distribution function.
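Explicitly, this follows from a one-line Lagrange-multiplier calculation: setting the derivative of

\sum_{i=1}^n \ln p_i + \mu\left(1 - \sum_{i=1}^n p_i\right)

with respect to p_i to zero gives 1/p_i = \mu, so all p_i are equal, and the normalization \sum_i p_i = 1 then forces \mu = n and p_i = 1/n.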
EL estimates are calculated by maximizing the empirical likelihood function (see above) subject to constraints based on the estimating function and the trivial assumption that the probability weights of the likelihood function sum to 1.[5] This procedure is represented as:
\max_{\pi_i,\theta} \ln L = \max_{\pi_i,\theta} \sum_{i=1}^n \ln \pi_i

s.t.

\sum_{i=1}^n \pi_i = 1, \qquad \sum_{i=1}^n \pi_i h(y_i;\theta) = 0, \qquad 0 \le \pi_i \quad \forall i \in [1..n].
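For a fixed value of \theta and a scalar estimating function, the inner maximization over the weights can be carried out numerically. A minimal Python sketch (the helper el_weights is illustrative, not a library routine), using the standard Lagrangian solution \pi_i = 1/(n(1 + \tau h(y_i;\theta))) derived below:

```python
import numpy as np
from scipy.optimize import brentq

def el_weights(h):
    """Maximize sum_i log(pi_i) s.t. sum_i pi_i = 1 and sum_i pi_i h_i = 0.
    Standard Lagrangian solution: pi_i = 1 / (n * (1 + tau * h_i))."""
    n = len(h)
    if h.min() >= 0 or h.max() <= 0:
        raise ValueError("0 must lie strictly inside the range of h")
    # tau must keep every 1 + tau*h_i positive
    lo, hi = -1.0 / h.max(), -1.0 / h.min()
    eps = 1e-8 * (hi - lo)
    score = lambda tau: np.sum(h / (1.0 + tau * h))  # derivative of sum_i log(1 + tau*h_i)
    tau = brentq(score, lo + eps, hi - eps)
    return 1.0 / (n * (1.0 + tau * h))

# Example: reweight an exponential sample so that its weighted mean is theta = 1.8
rng = np.random.default_rng(1)
y = rng.exponential(scale=2.0, size=100)
pi = el_weights(y - 1.8)
print(pi.sum(), np.sum(pi * y))  # ~1.0 and ~1.8: both constraints are satisfied
```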
The value of the parameter \theta can be found by maximizing the Lagrangian function
\mathcal{L} = \sum_{i=1}^n \ln \pi_i + \mu\left(1 - \sum_{i=1}^n \pi_i\right) - n\tau' \sum_{i=1}^n \pi_i h(y_i;\theta).
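Setting the derivative with respect to \pi_i to zero gives 1/\pi_i - \mu - n\tau' h(y_i;\theta) = 0; multiplying by \pi_i, summing over i and using the two constraints yields \mu = n, so that

\pi_i = \frac{1}{n\left(1 + \tau' h(y_i;\theta)\right)},

with the multiplier \tau determined by the moment constraint \sum_i \pi_i h(y_i;\theta) = 0.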
There is a clear analogy between this maximization problem and the one solved for maximum entropy.
The parameters \pi_i are nuisance parameters.
An empirical likelihood ratio function is defined and used to obtain confidence intervals for the parameter of interest \theta, similar to parametric likelihood ratio confidence intervals.[7][8] Let L(F) be the empirical likelihood of the distribution function F; the empirical likelihood ratio is then

R(F) = L(F)/L(F_n),

where F_n is the empirical distribution function.
Consider sets of the form

C = \{ T(F) \mid R(F) \geq r \}.

Under such conditions a test of T(F) = t rejects when t does not belong to C, that is, when no distribution F with T(F) = t has likelihood L(F) \geq r L(F_n).

The central result is for the mean of X. Clearly, some restrictions on F are needed, or else C = \mathbb{R}^p whenever r < 1. To see this, let

F = \epsilon \delta_x + (1 - \epsilon) F_n.

If \epsilon is small enough and \epsilon > 0, then R(F) \geq r. But then, as x ranges through \mathbb{R}^p, so does the mean of F, and hence C = \mathbb{R}^p. The problem disappears if attention is restricted to distributions F \ll F_n, i.e. distributions that place positive probability only on the sample points: any candidate value t is then tested through finitely many weights, and the construction of C becomes a finite-dimensional problem.
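Owen's central result for the mean states that, under mild moment conditions, -2 \ln R converges in distribution to a \chi^2 law, which calibrates such sets exactly as in the parametric likelihood-ratio case. A minimal Python sketch of a confidence interval for a scalar mean, assuming this \chi^2_1 calibration and using the standard dual form of the inner maximization (the function name neg2_log_elr_mean is illustrative):

```python
import numpy as np
from scipy.optimize import brentq
from scipy.stats import chi2

def neg2_log_elr_mean(y, mu):
    """-2 log R(mu) for the mean, via pi_i = 1 / (n * (1 + tau * (y_i - mu)))."""
    h = y - mu
    if h.min() >= 0 or h.max() <= 0:     # mu outside the sample range: R(mu) = 0
        return np.inf
    lo, hi = -1.0 / h.max(), -1.0 / h.min()
    eps = 1e-8 * (hi - lo)
    tau = brentq(lambda t: np.sum(h / (1.0 + t * h)), lo + eps, hi - eps)
    return 2.0 * np.sum(np.log1p(tau * h))

rng = np.random.default_rng(2)
y = rng.exponential(scale=2.0, size=100)

# 95% confidence set: all mu with -2 log R(mu) below the chi-square(1) quantile
cutoff = chi2.ppf(0.95, df=1)
grid = np.linspace(y.min(), y.max(), 2000)
inside = [mu for mu in grid if neg2_log_elr_mean(y, mu) <= cutoff]
print(min(inside), max(inside))          # approximate EL confidence interval for E[Y]
```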
The use of empirical likelihood is not limited to confidence intervals. In efficient quantile regression, an EL-based categorization[9] procedure helps determine the shape of the true discrete distribution at level p, and also provides a way of formulating a consistent estimator. In addition, EL can be used in place of parametric likelihood to form model selection criteria.[10] Empirical likelihood can naturally be applied in survival analysis[11] or regression problems.[12]