Probability integral transform explained

In probability theory, the probability integral transform (also known as universality of the uniform) refers to the result that data values modeled as random variables from any given continuous distribution can be converted to random variables having a standard uniform distribution. This holds exactly provided that the distribution used is the true distribution of the random variables; if the distribution is one fitted to the data, the result holds approximately in large samples.

The result is sometimes modified or extended so that the result of the transformation is a standard distribution other than the uniform distribution, such as the exponential distribution.
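As a hedged illustration of such an extension: if U = F(X) is standard uniform, then -ln(1 - U) has a unit-mean exponential distribution. The sketch below assumes a standard normal X and uses only Python's standard library; the helper name `to_unit_exponential` is hypothetical and chosen for this example.

```python
import math
import random
from statistics import NormalDist, fmean

def to_unit_exponential(x, cdf):
    """Map x to -log(1 - F(x)); if x is drawn from the continuous CDF F,
    the result follows an exponential distribution with unit mean."""
    # Clamp just below 1 so the logarithm stays finite for extreme x.
    u = min(cdf(x), 1.0 - 2.0 ** -53)
    return -math.log1p(-u)

rng = random.Random(0)                      # fixed seed for reproducibility
std_normal = NormalDist(0.0, 1.0)           # assumed true distribution of X
sample = [rng.gauss(0.0, 1.0) for _ in range(50_000)]
transformed = [to_unit_exponential(x, std_normal.cdf) for x in sample]

# A unit exponential has mean 1, so the sample mean should be close to 1.
mean_e = fmean(transformed)
print(f"sample mean of transformed values: {mean_e:.3f}")
```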

The transform was introduced by Ronald Fisher in his 1932 edition of the book Statistical Methods for Research Workers.[1]

Applications

One use for the probability integral transform in statistical data analysis is to provide the basis for testing whether a set of observations can reasonably be modelled as arising from a specified distribution. Specifically, the probability integral transform is applied to construct an equivalent set of values, and a test is then made of whether a uniform distribution is appropriate for the constructed dataset. Examples of this are P–P plots and Kolmogorov–Smirnov tests.
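The testing procedure described above can be sketched as follows. This is a minimal illustration rather than a full test implementation: it assumes simulated standard normal data, compares a correct model with a deliberately shifted one, and uses the asymptotic 5% Kolmogorov critical value 1.358/sqrt(n). The function name `ks_statistic_uniform` is hypothetical.

```python
import math
import random
from statistics import NormalDist

def ks_statistic_uniform(u_values):
    """Kolmogorov-Smirnov distance between the empirical CDF of u_values
    and the CDF of the Uniform(0,1) distribution."""
    u_sorted = sorted(u_values)
    n = len(u_sorted)
    d = 0.0
    for i, u in enumerate(u_sorted):
        # The empirical CDF jumps from i/n to (i+1)/n at u; check both sides.
        d = max(d, (i + 1) / n - u, u - i / n)
    return d

rng = random.Random(1)
data = [rng.gauss(0.0, 1.0) for _ in range(2000)]   # simulated observations

correct_model = NormalDist(0.0, 1.0)   # the distribution that generated the data
wrong_model = NormalDist(0.5, 1.0)     # a misspecified alternative (assumption)

# Apply the probability integral transform under each candidate model,
# then measure how far the transformed values are from uniformity.
d_correct = ks_statistic_uniform([correct_model.cdf(x) for x in data])
d_wrong = ks_statistic_uniform([wrong_model.cdf(x) for x in data])

critical = 1.358 / math.sqrt(len(data))  # asymptotic 5% critical value
print(f"correct model: D = {d_correct:.4f}")
print(f"wrong model:   D = {d_wrong:.4f}")
print(f"5% critical value: {critical:.4f}")
```

Under the misspecified model the statistic is far above the critical value, so uniformity of the transformed values is rejected; under the correct model the statistic is typically below it.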

A second use for the transformation is in the theory related to copulas, which are a means of both defining and working with distributions for statistically dependent multivariate data. Here the problem of defining or manipulating a joint probability distribution for a set of random variables is simplified or reduced in apparent complexity by applying the probability integral transform to each of the components and then working with a joint distribution for which the marginal variables have uniform distributions.
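A small sketch of the component-wise transform, assuming a bivariate normal pair with correlation 0.8 (an illustrative choice, not from the source). Applying the standard normal CDF to each component gives variables with uniform margins that remain dependent. The helper names `correlated_normal_pair` and `pearson` are hypothetical.

```python
import math
import random
from statistics import NormalDist, fmean

def correlated_normal_pair(rng, rho):
    """Draw (z1, z2), each standard normal, with correlation rho."""
    z1 = rng.gauss(0.0, 1.0)
    z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
    return z1, z2

def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    mx, my = fmean(xs), fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(2)
phi = NormalDist(0.0, 1.0).cdf
pairs = [correlated_normal_pair(rng, 0.8) for _ in range(20_000)]

# Probability integral transform applied to each component separately.
u1 = [phi(z1) for z1, _ in pairs]
u2 = [phi(z2) for _, z2 in pairs]

mean_u1, mean_u2 = fmean(u1), fmean(u2)   # both should be near 0.5
corr_u = pearson(u1, u2)                  # dependence survives the transform
print(f"means: {mean_u1:.3f}, {mean_u2:.3f}; correlation: {corr_u:.3f}")
```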

A third use is based on applying the inverse of the probability integral transform to convert random variables from a uniform distribution to have a selected distribution: this is known as inverse transform sampling.
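Inverse transform sampling can be sketched for a case where the inverse CDF has a closed form. The example assumes an exponential target with rate 2 (an illustrative parameter), for which the inverse CDF is -ln(1 - u)/rate. The function name `sample_exponential` is hypothetical.

```python
import math
import random
from statistics import fmean

def sample_exponential(rng, rate):
    """Draw from an exponential distribution with the given rate by
    applying its inverse CDF to a Uniform(0,1) draw."""
    u = rng.random()                   # u lies in [0, 1)
    return -math.log1p(-u) / rate      # inverse CDF: -ln(1 - u) / rate

rng = random.Random(3)
draws = [sample_exponential(rng, 2.0) for _ in range(50_000)]

# The exponential distribution with rate 2 has mean 1/2.
mean_draws = fmean(draws)
print(f"sample mean: {mean_draws:.3f}")
```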

Statement

Suppose that a random variable $X$ has a continuous distribution for which the cumulative distribution function (CDF) is $F_X$. Then the random variable $Y$ defined as $Y := F_X(X)$ has a standard uniform distribution.[2][3]

Equivalently, if $\mu$ is the uniform measure on $[0,1]$, the distribution of $X$ on $\mathbb{R}$ is the pushforward measure $\mu \circ F_X^{-1}$.

Proof

Given any continuous random variable $X$, define $Y = F_X(X)$. Given $y \in [0,1]$, if $F_X^{-1}(y)$ exists (i.e., if there exists a unique $x$ such that $F_X(x) = y$), then:

\begin{align}
F_Y(y) &= \operatorname{P}(Y \leq y) \\
&= \operatorname{P}(F_X(X) \leq y) \\
&= \operatorname{P}(X \leq F_X^{-1}(y)) \\
&= F_X(F_X^{-1}(y)) \\
&= y
\end{align}

If $F_X^{-1}(y)$ does not exist, then it can be replaced in this proof by the function $\chi$, where we define $\chi(0) = -\infty$, $\chi(1) = \infty$, and $\chi(y) \equiv \inf\{x : F_X(x) \geq y\}$ for $y \in (0,1)$, with the same result that $F_Y(y) = y$. Thus, $F_Y$ is just the CDF of a $\mathrm{Uniform}(0,1)$ random variable, so that $Y$ has a uniform distribution on the interval $[0,1]$.
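To make the generalized inverse concrete, the sketch below approximates inf{x : F(x) >= y} by bisection for a continuous CDF with a flat stretch, where the ordinary inverse is not unique. The CDF `cdf_with_plateau` (an equal mixture of uniform distributions on [0,1] and [2,3]) and the function name `generalized_inverse` are assumptions made for this example.

```python
def generalized_inverse(cdf, y, lo, hi, tol=1e-10):
    """Approximate inf{x : cdf(x) >= y} by bisection.

    Assumes cdf is nondecreasing and that cdf(lo) < y <= cdf(hi)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cdf(mid) >= y:
            hi = mid      # mid belongs to the set, so the infimum is <= mid
        else:
            lo = mid      # mid is below the set, so the infimum is > mid
    return hi

def cdf_with_plateau(x):
    """Continuous CDF that is constant (equal to 1/2) on the interval [1, 2]."""
    if x <= 0.0:
        return 0.0
    if x <= 1.0:
        return 0.5 * x
    if x <= 2.0:
        return 0.5
    if x <= 3.0:
        return 0.5 * (x - 1.0)
    return 1.0

# Every x in [1, 2] satisfies F(x) = 1/2, so no unique inverse exists there;
# the generalized inverse picks the left endpoint of the plateau.
chi_half = generalized_inverse(cdf_with_plateau, 0.5, -1.0, 4.0)
chi_quarter = generalized_inverse(cdf_with_plateau, 0.25, -1.0, 4.0)
print(f"chi(0.5) ~ {chi_half:.6f}, chi(0.25) ~ {chi_quarter:.6f}")
```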

Examples

For a first, illustrative example, let $X$ be a random variable with a standard normal distribution $\mathcal{N}(0,1)$. Then its CDF is

$$\Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^2/2} \, dt = \frac12 \Big[\, 1 + \operatorname{erf}\Big(\frac{x}{\sqrt{2}}\Big) \,\Big], \quad x \in \mathbb{R},$$

where $\operatorname{erf}$ is the error function. Then the new random variable $Y$, defined by $Y := \Phi(X)$, is uniformly distributed.
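A quick simulation can illustrate this example. The sketch below evaluates the standard normal CDF through the error function, transforms simulated standard normal draws, and tabulates the share of transformed values in each tenth of [0,1]; each share should be close to 0.1. The sample size, seed, and function name `std_normal_cdf` are illustrative assumptions.

```python
import math
import random

def std_normal_cdf(x):
    """Standard normal CDF expressed through the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

rng = random.Random(4)
n = 100_000
y = [std_normal_cdf(rng.gauss(0.0, 1.0)) for _ in range(n)]

# Count how many transformed values fall in each interval [k/10, (k+1)/10).
counts = [0] * 10
for v in y:
    counts[min(int(v * 10), 9)] += 1   # min() guards the edge case v == 1.0
fractions = [c / n for c in counts]

for k, f in enumerate(fractions):
    print(f"[{k / 10:.1f}, {(k + 1) / 10:.1f}): {f:.4f}")
```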

As a second example, if $X$ has an exponential distribution with unit mean, then its CDF is $F(x) = 1 - \exp(-x)$ for $x \geq 0$, and the immediate result of the probability integral transform is that $Y = 1 - \exp(-X)$ has a uniform distribution. Moreover, by symmetry of the uniform distribution, $Z = \exp(-X)$ also has a uniform distribution.
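The symmetry argument can also be checked directly. A short derivation, using only the unit-mean exponential CDF F(x) = 1 - exp(-x) and the continuity of X, for z in (0,1):

```latex
\begin{align}
\operatorname{P}(Z \leq z) &= \operatorname{P}(\exp(-X) \leq z) \\
&= \operatorname{P}(X \geq -\ln z) \\
&= 1 - F(-\ln z) \\
&= \exp(\ln z) \\
&= z
\end{align}
```

This is the CDF of the standard uniform distribution, in agreement with the observation that Z = 1 - Y and that 1 - U is standard uniform whenever U is.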

References

  1. David, F. N.; Johnson, N. L. (1948). "The Probability Integral Transformation When Parameters are Estimated from the Sample". Biometrika. 35 (1/2): 182. doi:10.2307/2332638. JSTOR 2332638.
  2. Dodge, Y. (2006). The Oxford Dictionary of Statistical Terms. Oxford University Press.
  3. Casella, George; Berger, Roger L. (2002). Statistical Inference (2nd ed.). Theorem 2.1.10, p. 54.