Hájek projection explained

In statistics, the Hájek projection of a random variable $T$ on a set of independent random vectors $X_1,\dots,X_n$ is a particular measurable function of $X_1,\dots,X_n$ that, loosely speaking, captures the variation of $T$ in an optimal way. It is named after the Czech statistician Jaroslav Hájek.

Definition

Given a random variable $T$ and a set of independent random vectors $X_1,\dots,X_n$, the Hájek projection $\hat{T}$ of $T$ onto $\{X_1,\dots,X_n\}$ is given by[1]

$$\hat{T} = \operatorname{E}(T) + \sum_{i=1}^{n}\left[\operatorname{E}(T\mid X_i) - \operatorname{E}(T)\right] = \sum_{i=1}^{n}\operatorname{E}(T\mid X_i) - (n-1)\operatorname{E}(T)$$
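To make the definition concrete, the following is a minimal sketch that evaluates both forms of the formula by exact enumeration. It assumes the $X_i$ are independent and take finitely many values. The function name `hajek_projection`, the `dists` argument and the example statistic $T = X_1X_2 + X_3$ are illustrative choices, not part of the cited source.

```python
from fractions import Fraction
from itertools import product

def hajek_projection(T, dists):
    """Build the Hajek projection of T(x_1, ..., x_n) by exact enumeration.

    dists[i] maps each possible value of X_i to its probability. The X_i are
    assumed independent, so joint probabilities are products of marginals.
    """
    n = len(dists)
    outcomes = list(product(*(d.keys() for d in dists)))

    def prob(x):
        p = Fraction(1)
        for d, xi in zip(dists, x):
            p *= d[xi]
        return p

    mean_T = sum(prob(x) * T(*x) for x in outcomes)  # E(T)

    def cond_mean(i, xi):
        # E(T | X_i = xi): sum over outcomes with x_i fixed, then renormalise.
        return sum(prob(x) * T(*x) for x in outcomes if x[i] == xi) / dists[i][xi]

    def T_hat(*x):
        # First form: E(T) + sum_i [E(T | X_i) - E(T)]
        return mean_T + sum(cond_mean(i, x[i]) - mean_T for i in range(n))

    def T_hat_alt(*x):
        # Second form: sum_i E(T | X_i) - (n - 1) E(T)
        return sum(cond_mean(i, x[i]) for i in range(n)) - (n - 1) * mean_T

    return T_hat, T_hat_alt, mean_T

# Hypothetical example: three independent fair bits and T = X1*X2 + X3.
fair_bit = {0: Fraction(1, 2), 1: Fraction(1, 2)}
T_hat, T_hat_alt, mean_T = hajek_projection(lambda a, b, c: a * b + c, [fair_bit] * 3)
```

For this example the projection works out to $X_1/2 + X_2/2 + X_3 - 1/4$. The product term, which depends jointly on two coordinates, is replaced by an additive function of the individual coordinates.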

Properties

The Hájek projection $\hat{T}$ is an $L^2$ projection of $T$ onto the linear subspace of all random variables of the form $\sum_{i=1}^{n} g_i(X_i)$, where $g_i:\mathbb{R}^d\to\mathbb{R}$ are arbitrary measurable functions such that $\operatorname{E}(g_i^2(X_i))<\infty$ for all $i=1,\dots,n$.

$\operatorname{E}(\hat{T}\mid X_i)=\operatorname{E}(T\mid X_i)$ for each $i$, and hence $\operatorname{E}(\hat{T})=\operatorname{E}(T)$.
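The identity for the conditional expectations follows from the definition together with independence. A short derivation, using only the notation already introduced and assuming $T$ is integrable, might run as follows.

```latex
% For j \neq i, X_j is independent of X_i, so
% E( E(T | X_j) | X_i ) = E( E(T | X_j) ) = E(T).
\begin{align*}
\operatorname{E}(\hat{T}\mid X_i)
  &= \operatorname{E}(T) + \sum_{j=1}^{n}\Bigl[\operatorname{E}\bigl(\operatorname{E}(T\mid X_j)\mid X_i\bigr) - \operatorname{E}(T)\Bigr] \\
  &= \operatorname{E}(T) + \bigl[\operatorname{E}(T\mid X_i) - \operatorname{E}(T)\bigr]
     + \sum_{j\neq i}\bigl[\operatorname{E}(T) - \operatorname{E}(T)\bigr] \\
  &= \operatorname{E}(T\mid X_i).
\end{align*}
```

Taking expectations on both sides gives $\operatorname{E}(\hat{T})=\operatorname{E}(T)$. If in addition $T$ has a finite second moment, then for any square-integrable $g(X_i)$ we get $\operatorname{E}[(T-\hat{T})\,g(X_i)] = \operatorname{E}[g(X_i)\,\operatorname{E}(T-\hat{T}\mid X_i)] = 0$, which is the orthogonality condition behind the $L^2$ projection property.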

Under suitable conditions, the asymptotic distributions of a sequence of statistics $T_n=T_n(X_1,\dots,X_n)$ and of the sequence of its Hájek projections $\hat{T}_n=\hat{T}_n(X_1,\dots,X_n)$ coincide. Namely, if $\operatorname{Var}(T_n)/\operatorname{Var}(\hat{T}_n)\to 1$, then

$$\frac{T_n-\operatorname{E}(T_n)}{\sqrt{\operatorname{Var}(T_n)}}-\frac{\hat{T}_n-\operatorname{E}(\hat{T}_n)}{\sqrt{\operatorname{Var}(\hat{T}_n)}}$$

converges to zero in probability.
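As a rough numerical illustration of this convergence, the sketch below takes $T_n$ to be the unbiased sample variance of i.i.d. standard normal observations. This is an assumption made purely for the example: in that setting the Hájek projection reduces to the mean of the squared observations, with $\operatorname{Var}(T_n)=2/(n-1)$ and $\operatorname{Var}(\hat{T}_n)=2/n$, so the variance ratio tends to 1. The function names `standardized_gap` and `mean_abs_gap` are hypothetical.

```python
import math
import random
import statistics

def standardized_gap(n, rng):
    """Difference of the two standardized statistics for one N(0, 1) sample."""
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    t = statistics.variance(xs)           # T_n: unbiased sample variance
    t_hat = sum(x * x for x in xs) / n    # Hajek projection, assuming N(0, 1) data
    # Both statistics have mean 1; the variances below hold for N(0, 1) data.
    z = (t - 1.0) / math.sqrt(2.0 / (n - 1))
    z_hat = (t_hat - 1.0) / math.sqrt(2.0 / n)
    return z - z_hat

def mean_abs_gap(n, reps=200, seed=0):
    """Monte Carlo average of |standardized gap| over `reps` samples of size n."""
    rng = random.Random(seed)
    return sum(abs(standardized_gap(n, rng)) for _ in range(reps)) / reps
```

Calling `mean_abs_gap(20)` and `mean_abs_gap(2000)` should show the average gap shrinking as $n$ grows, consistent with convergence to zero in probability. A simulation of this kind only illustrates the statement and does not prove it.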

Notes and References

  1. van der Vaart, Aad W. Asymptotic Statistics. Cambridge University Press, 2012. ISBN 9780511802256. OCLC 928629884.