Donsker's theorem

In probability theory, Donsker's theorem (also known as Donsker's invariance principle, or the functional central limit theorem), named after Monroe D. Donsker, is a functional extension of the central limit theorem for empirical distribution functions. Specifically, the theorem states that an appropriately centered and scaled version of the empirical distribution function converges to a Gaussian process.

Let X_1, X_2, X_3, \ldots be a sequence of independent and identically distributed (i.i.d.) random variables with mean 0 and variance 1, and let S_n := \sum_{i=1}^n X_i. The stochastic process S := (S_n)_{n \in \mathbb{N}} is known as a random walk. Define the diffusively rescaled random walk (partial-sum process) by

W^{(n)}(t) := \frac{S_{\lfloor nt \rfloor}}{\sqrt{n}}, \qquad t \in [0,1].
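The rescaling can be checked numerically. Below is a minimal Monte Carlo sketch (function names are illustrative, not from any library) that evaluates W^{(n)}(1) over many independent walks with ±1 increments, which have mean 0 and variance 1 as the setup requires:

```python
import math
import random

def rescaled_walk_at(n, t, rng):
    """W^(n)(t) = S_{floor(n t)} / sqrt(n) for one path of a +/-1 random walk.

    The +/-1 increments have mean 0 and variance 1, matching the theorem's setup.
    """
    k = math.floor(n * t)
    s = sum(rng.choice([-1, 1]) for _ in range(k))
    return s / math.sqrt(n)

rng = random.Random(0)

# By the CLT, W^(n)(1) should be approximately standard normal for large n,
# so the sample mean and variance should be near 0 and 1 respectively.
samples = [rescaled_walk_at(400, 1.0, rng) for _ in range(2000)]
mean = sum(samples) / len(samples)
var = sum((x - mean) ** 2 for x in samples) / len(samples)
```

Across the 2000 replications the empirical mean and variance sit close to 0 and 1, as the central limit theorem predicts for W^{(n)}(1).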

The central limit theorem asserts that W^{(n)}(1) converges in distribution to a standard Gaussian random variable W(1) as n \to \infty. Donsker's invariance principle[1] extends this convergence to the whole function W^{(n)} := (W^{(n)}(t))_{t \in [0,1]}. More precisely, in its modern form, Donsker's invariance principle states that: as random variables taking values in the Skorokhod space \mathcal{D}[0,1], the random function W^{(n)} converges in distribution to a standard Brownian motion W := (W(t))_{t \in [0,1]} as n \to \infty.
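Because the convergence is functional, it carries over to continuous functionals of the path, such as the running maximum. A quick Monte Carlo sketch (illustrative, not from any library): by the continuous mapping theorem, \max_t W^{(n)}(t) converges in law to \max_t W(t), whose law by the reflection principle equals that of |W(1)|, with median equal to the 0.75 standard-normal quantile, about 0.6745.

```python
import math
import random

def max_of_rescaled_walk(n, rng):
    """max_t W^(n)(t): running maximum of the partial sums, divided by sqrt(n)."""
    s = 0
    best = 0
    for _ in range(n):
        s += rng.choice([-1, 1])
        best = max(best, s)
    return best / math.sqrt(n)

rng = random.Random(1)

# The reflection principle gives max_t W(t) equal in law to |W(1)|,
# whose median is the 0.75 normal quantile, roughly 0.6745.
samples = sorted(max_of_rescaled_walk(400, rng) for _ in range(2000))
median = samples[len(samples) // 2]
```

The sample median lands near 0.67, up to Monte Carlo noise and the discreteness of the ±1 walk.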

Formal statement

Let F_n be the empirical distribution function of the sequence of i.i.d. random variables X_1, X_2, X_3, \ldots with distribution function F. Define the centered and scaled version of F_n by

G_n(x) := \sqrt{n}\,(F_n(x) - F(x)),

indexed by x \in \mathbb{R}. By the classical central limit theorem, for fixed x, the random variable G_n(x) converges in distribution to a Gaussian (normal) random variable G(x) with zero mean and variance F(x)(1 - F(x)) as the sample size n grows.

Donsker's theorem states that the sequence G_n, as random elements of the Skorokhod space \mathcal{D}(-\infty, \infty), converges in distribution to a Gaussian process G with zero mean and covariance given by

\operatorname{cov}[G(s), G(t)] = \operatorname{E}[G(s)G(t)] = \min\{F(s), F(t)\} - F(s)F(t).

The process G(x) can be written as B(F(x)) where B is a standard Brownian bridge on the unit interval.
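The pointwise variance F(x)(1 - F(x)) is easy to verify by simulation. A minimal sketch (names are illustrative), using Uniform[0,1] data so that F(x) = x and the predicted variance at x = 0.3 is 0.3 · 0.7 = 0.21:

```python
import math
import random

def empirical_G(sample, x, F_x):
    """G_n(x) = sqrt(n) * (F_n(x) - F(x)) for one i.i.d. sample."""
    n = len(sample)
    Fn_x = sum(1 for s in sample if s <= x) / n
    return math.sqrt(n) * (Fn_x - F_x)

rng = random.Random(2)

# Uniform[0,1] data, so F(x) = x; at x = 0.3 the limit G(x) = B(F(x))
# is Gaussian with mean 0 and variance F(x)(1 - F(x)) = 0.21.
x, F_x = 0.3, 0.3
vals = [empirical_G([rng.random() for _ in range(200)], x, F_x)
        for _ in range(3000)]
mean = sum(vals) / len(vals)
var = sum((v - mean) ** 2 for v in vals) / len(vals)
```

The empirical mean and variance of G_n(0.3) come out near 0 and 0.21, matching the Brownian-bridge marginal B(0.3).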

Proof sketch

For continuous probability distributions, the theorem reduces to the case where the distribution is uniform on [0,1] by the inverse (probability integral) transform.

Given any finite sequence of times 0 < t_1 < t_2 < \cdots < t_n < 1, the random variable N F_N(t_1) is binomially distributed with mean N t_1 and variance N t_1 (1 - t_1).

Similarly, the joint distribution of F_N(t_1), F_N(t_2), \ldots, F_N(t_n) is multinomial. The central limit approximation for multinomial distributions then shows that the vector \sqrt{N}(F_N(t_i) - t_i) converges in distribution, as N \to \infty, to a Gaussian vector whose covariance matrix has entries \min(t_i, t_j) - t_i t_j, which is precisely the covariance matrix of the Brownian bridge.
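The covariance formula above can be checked empirically. A small Monte Carlo sketch (all names illustrative), estimating the covariance of the empirical process at two times t_1 = 0.25 and t_2 = 0.6, where the predicted value is min(t_1, t_2) - t_1 t_2 = 0.25 - 0.15 = 0.10:

```python
import math
import random

rng = random.Random(3)

# Times at which to evaluate the empirical process; the Brownian-bridge
# covariance min(t1, t2) - t1 * t2 equals 0.25 - 0.15 = 0.10 here.
t1, t2 = 0.25, 0.6
N, reps = 300, 4000

pairs = []
for _ in range(reps):
    u = [rng.random() for _ in range(N)]  # Uniform[0,1] sample
    g1 = math.sqrt(N) * (sum(1 for x in u if x <= t1) / N - t1)
    g2 = math.sqrt(N) * (sum(1 for x in u if x <= t2) / N - t2)
    pairs.append((g1, g2))

m1 = sum(p[0] for p in pairs) / reps
m2 = sum(p[1] for p in pairs) / reps
cov = sum((p[0] - m1) * (p[1] - m2) for p in pairs) / reps
```

The estimated covariance clusters around 0.10; in fact the identity holds exactly for every N, since Cov(1{U ≤ t_1}, 1{U ≤ t_2}) = t_1 - t_1 t_2 for U uniform on [0,1].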

History and related results

Kolmogorov (1933) showed that when F is continuous, the supremum \sup_t G_n(t) and the supremum of the absolute value, \sup_t |G_n(t)|, converge in distribution to the laws of the same functionals of the Brownian bridge B(t); see the Kolmogorov–Smirnov test. In 1949 Doob asked whether the convergence in distribution held for more general functionals, thus formulating a problem of weak convergence of random functions in a suitable function space.[2]

In 1952 Donsker stated and proved (not quite correctly)[3] a general extension for the Doob–Kolmogorov heuristic approach. In the original paper, Donsker proved that the convergence in law of Gn to the Brownian bridge holds for Uniform[0,1] distributions with respect to uniform convergence in t over the interval [0,1].[4]

However, Donsker's formulation was not quite correct because of the problem of measurability of functionals of discontinuous processes. In 1956 Skorokhod and Kolmogorov defined a separable metric d, called the Skorokhod metric, on the space of càdlàg functions on [0,1], such that convergence in d to a continuous function is equivalent to convergence in the sup norm, and showed that G_n converges in law in \mathcal{D}[0,1] to the Brownian bridge.

Later, Dudley reformulated Donsker's result to avoid the problem of measurability and the need for the Skorokhod metric. One can prove that there exist X_i, i.i.d. uniform on [0,1], and a sequence of sample-continuous Brownian bridges B_n, such that \|G_n - B_n\|_\infty is measurable and converges in probability to 0. An improved version of this result, providing more detail on the rate of convergence, is the Komlós–Major–Tusnády approximation.


Notes and References

  1. Donsker, M. D. (1951). "An invariance principle for certain probability limit theorems". Memoirs of the American Mathematical Society. 6. MR 40613.
  2. Doob, Joseph L. (1949). "Heuristic approach to the Kolmogorov–Smirnov theorems". Annals of Mathematical Statistics. 20 (3): 393–403. doi:10.1214/aoms/1177729991. MR 30732. Zbl 0035.08901.
  3. Dudley, R. M. (1999). Uniform Central Limit Theorems. Cambridge University Press. ISBN 978-0-521-46102-3.
  4. Donsker, M. D. (1952). "Justification and extension of Doob's heuristic approach to the Kolmogorov–Smirnov theorems". Annals of Mathematical Statistics. 23 (2): 277–281. doi:10.1214/aoms/1177729445. MR 47288. Zbl 0046.35103.