In probability theory and statistics, the generalized extreme value (GEV) distribution[1] is a family of continuous probability distributions developed within extreme value theory to combine the Gumbel, Fréchet and Weibull families, also known as type I, II and III extreme value distributions. By the extreme value theorem, the GEV distribution is the only possible limit distribution of properly normalized maxima of a sequence of independent and identically distributed random variables.[2] Note that a limit distribution needs to exist, which requires regularity conditions on the tail of the distribution. Despite this, the GEV distribution is often used as an approximation to model the maxima of long (finite) sequences of random variables.
In some fields of application the generalized extreme value distribution is known as the Fisher–Tippett distribution, named after Ronald Fisher and L. H. C. Tippett, who recognised the three different forms outlined below. However, usage of this name is sometimes restricted to mean the special case of the Gumbel distribution. The origin of the common functional form for all three distributions dates back to at least Jenkinson, A. F. (1955),[3] though allegedly[4] it could also have been given by von Mises, R. (1936).[5]
Using the standardized variable s\equiv\tfrac{ x-\mu }{\sigma} , where \mu , the location parameter, can be any real number and \sigma>0 is the scale parameter, the cumulative distribution function of the GEV distribution is

F( s; \xi )=\exp\left(-\left(1+\xi s\right)^{-1/\xi}\right) ,

where \xi , the shape parameter, can be any real number. Thus, for \xi>0 , the expression is valid for s>-\tfrac{ 1 }{\xi} , while for \xi<0 it is valid for s<-\tfrac{ 1 }{\xi}~. In the first case, -\tfrac{ 1 }{\xi} is the negative, lower end-point, where F is 0; in the second case, -\tfrac{ 1 }{\xi} is the positive, upper end-point, where F is 1. For \xi=0 the expression is formally undefined and is replaced by its limit as \xi\to0 ,

F( s; 0 )=\exp\left(-\exp(-s)\right) ,

in which case s can be any real number.

In the special case of x=\mu , one has s=0 and F( 0; \xi )=e^{-1}\approx0.368 , whatever the values of \xi and \sigma .
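As an illustration, here is a minimal Python sketch of this cumulative distribution function; the parameter values in the cross-check are arbitrary, and the comparison uses SciPy's genextreme, whose shape parameter follows the opposite sign convention, c = −ξ.

```python
import numpy as np
from scipy.stats import genextreme

def gev_cdf(x, mu=0.0, sigma=1.0, xi=0.0):
    """CDF of GEV(mu, sigma, xi) as defined above (illustrative sketch)."""
    s = (x - mu) / sigma
    if xi == 0.0:
        return np.exp(-np.exp(-s))          # limiting (Gumbel) case xi -> 0
    t = 1.0 + xi * s
    if t <= 0.0:                            # outside the support:
        return 0.0 if xi > 0 else 1.0       # F = 0 below, F = 1 above the end-point
    return np.exp(-t ** (-1.0 / xi))

# At x = mu the value is exp(-1) for any xi, and results match SciPy with c = -xi.
for xi in (-0.4, 0.0, 0.7):
    print(gev_cdf(0.5, mu=0.5, sigma=2.0, xi=xi),   # = exp(-1) ~ 0.368
          gev_cdf(1.2, 0.5, 2.0, xi),
          genextreme.cdf(1.2, -xi, loc=0.5, scale=2.0))
```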
The probability density function of the standardized distribution is

f( s; \xi )=\left(1+\xi s\right)^{-1/\xi-1}\exp\left(-\left(1+\xi s\right)^{-1/\xi}\right) ,

again valid for s>-\tfrac{ 1 }{\xi} in the case \xi>0 , and for s<-\tfrac{ 1 }{\xi} in the case \xi<0~. The density is zero outside the relevant range. In the case \xi=0 the density is positive on the whole real line and equals

f( s; 0 )=\exp(-s)\exp\left(-\exp(-s)\right)~.
Since the cumulative distribution function is invertible, the quantile function for the GEV distribution has an explicit expression, namely

Q( p; \mu, \sigma, \xi )=\mu+\frac{\sigma}{\xi}\left[\left(-\ln p\right)^{-\xi}-1\right] ~~for~~p\in( 0 , 1 ) ,

(interpreted for \xi=0 as its limit Q( p; \mu, \sigma, 0 )=\mu-\sigma\ln\left(-\ln p\right) ), and therefore the quantile density function, q\equiv\tfrac{ d Q }{ d p } ,

q( p; \sigma, \xi )=\frac{\sigma}{\left(-\ln p\right)^{\xi+1}\, p} ~~for~~p\in( 0 , 1 ) ,

valid for \sigma>0 and any real \xi~.
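A minimal Python sketch of this quantile function (with arbitrary illustrative parameters), cross-checked against SciPy's percent-point function, again with the shape convention c = −ξ:

```python
import numpy as np
from scipy.stats import genextreme

def gev_quantile(p, mu=0.0, sigma=1.0, xi=0.0):
    """Q(p; mu, sigma, xi) from the closed form above (illustrative sketch)."""
    if xi == 0.0:
        return mu - sigma * np.log(-np.log(p))       # Gumbel limit
    return mu + sigma * ((-np.log(p)) ** (-xi) - 1.0) / xi

for xi in (-0.3, 0.0, 0.5):
    print(gev_quantile(0.99, mu=1.0, sigma=2.0, xi=xi),
          genextreme.ppf(0.99, -xi, loc=1.0, scale=2.0))
```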
Some simple statistics of the distribution are:

\operatorname{E}(X)=\mu+\frac{\sigma}{\xi}\left(g_1-1\right) ~~for~~ \xi\neq0,~\xi<1 ,

\operatorname{Var}(X)=\frac{\sigma^2}{\xi^2}\left(g_2-g_1^2\right) ,

\operatorname{Mode}(X)=\mu+\frac{\sigma}{\xi}\left[(1+\xi)^{-\xi}-1\right].

The skewness is, for \xi>0 ,

\operatorname{skewness}(X)=\frac{g_3-3 g_1 g_2+2 g_1^3}{\left(g_2-g_1^2\right)^{3/2}}~.

For \xi<0 , the sign of the numerator is reversed.

The excess kurtosis is:

\operatorname{kurtosis\ excess}(X)=\frac{g_4-4 g_1 g_3+6 g_2 g_1^2-3 g_1^4}{\left(g_2-g_1^2\right)^{2}}-3~,

where g_k=\Gamma(1-k \xi) , k=1,2,3,4 , and \Gamma(t) is the gamma function.
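As a numerical illustration of these formulas, the following sketch evaluates them with the gamma function for an arbitrary \xi<1/2 (so that the mean and variance exist) and compares the first two moments against SciPy (shape convention c = −ξ):

```python
from scipy.special import gamma
from scipy.stats import genextreme

mu, sigma, xi = 0.5, 2.0, 0.2                  # arbitrary example with xi < 1/2
g1, g2 = gamma(1 - xi), gamma(1 - 2 * xi)      # g_k = Gamma(1 - k*xi)

mean = mu + sigma * (g1 - 1) / xi
var = sigma**2 * (g2 - g1**2) / xi**2
mode = mu + sigma * ((1 + xi)**(-xi) - 1) / xi

print((mean, var), genextreme.stats(-xi, loc=mu, scale=sigma, moments="mv"))
print("mode:", mode)
```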
The shape parameter \xi governs the tail behaviour of the distribution. The sub-families defined by \xi=0 , \xi>0 and \xi<0 correspond, respectively, to the Gumbel, Fréchet and Weibull families, whose cumulative distribution functions are displayed below.

Type I or Gumbel extreme value distribution, case ~\xi=0 , for all x\in\left( -\infty , +\infty \right) :

F( x; \mu, \sigma, 0 )=\exp\left(-\exp\left(-\frac{ x-\mu }{\sigma}\right)\right)~.

Type II or Fréchet extreme value distribution, case ~\xi>0 , for x\in\left( \mu-\tfrac{\sigma}{ \xi } , +\infty \right) :

Let \alpha\equiv\tfrac{ 1 }{\xi}>0 and y\equiv1+\tfrac{\xi}{\sigma}(x-\mu) ; then

F( x; \mu, \sigma, \xi )=\begin{cases}0 & y\leq0 ~\text{or equivalently}~ x\leq\mu-\tfrac{\sigma}{ \xi }\\ \exp\left(-\frac{ 1 }{ y^{\alpha} }\right) & y>0 ~\text{or equivalently}~ x>\mu-\tfrac{\sigma}{ \xi }~.\end{cases}

Type III or reversed Weibull extreme value distribution, case ~\xi<0 , for x\in\left(-\infty , \mu+\tfrac{\sigma}{ | \xi | } \right) :

Let \alpha\equiv-\tfrac{1}{ \xi }>0 and y\equiv1-\tfrac{ | \xi | }{\sigma}(x-\mu) ; then

F( x; \mu, \sigma, \xi )=\begin{cases}\exp\left(-y^{\alpha}\right) & y>0 ~\text{or equivalently}~ x<\mu+\tfrac{\sigma}{ | \xi | }\\ 1 & y\leq0 ~\text{or equivalently}~ x\geq\mu+\tfrac{\sigma}{ | \xi | }~.\end{cases}
The subsections below remark on properties of these distributions.
The theory here relates to data maxima, and the distribution being discussed is an extreme value distribution for maxima. A generalised extreme value distribution for data minima can be obtained, for example, by substituting -x for x in the distribution function and subtracting the cumulative distribution from one: that is, by replacing F(x) with 1-F(-x) . Doing so yields yet another family of distributions.
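For example, one can fit a GEV to block minima by negating the data, fitting the maxima form, and then using 1-F(-x) . Below is a toy sketch with synthetic data, assuming SciPy's genextreme.fit as the fitting routine:

```python
import numpy as np
from scipy.stats import genextreme

rng = np.random.default_rng(0)
block_minima = rng.normal(size=(500, 365)).min(axis=1)   # toy "annual minima"

# Fit the maxima-form GEV to the negated minima ...
c, loc, scale = genextreme.fit(-block_minima)
# ... then the distribution of the minima is P(min <= x) = 1 - F(-x).
xs = np.linspace(block_minima.min(), block_minima.max(), 5)
print(1 - genextreme.cdf(-xs, c, loc=loc, scale=scale))
```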
The ordinary Weibull distribution arises in reliability applications and is obtained from the distribution here by using the variable t=\mu-x , which gives a strictly positive support, in contrast to the use in the formulation of extreme value theory here. This arises because the ordinary Weibull distribution is used for data minima rather than data maxima.
Note the differences in the ranges of interest for the three extreme value distributions: the Gumbel is unlimited, the Fréchet has a lower limit, and the reversed Weibull has an upper limit. More precisely, univariate extreme value theory describes which of the three is the limiting law according to the initial law of X , and in particular depending on its tail.
One can link the type I to types II and III in the following way: if the cumulative distribution function of some random variable X is of type II, with the positive numbers as support, i.e. F( x; 0, \sigma, \alpha ) , then the cumulative distribution function of \ln X is of type I, namely F( x; \ln\sigma, \tfrac{1}{ \alpha }, 0 )~. Similarly, if the cumulative distribution function of X is of type III, with the negative numbers as support, i.e. F( x; 0, \sigma, -\alpha ) , then the cumulative distribution function of \ln(-X) is of type I, namely F( x; -\ln\sigma, \tfrac{ 1 }{\alpha}, 0 )~.
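A minimal numerical check of the type II to type I link, assuming the classical Fréchet form \exp(-(x/\sigma)^{-\alpha}) with positive support (SciPy's invweibull) and arbitrary parameter values:

```python
import numpy as np
from scipy import stats

alpha, sigma = 3.0, 2.0                      # arbitrary Fréchet shape and scale
x = stats.invweibull(c=alpha, scale=sigma).rvs(size=100_000, random_state=0)

# ln X should then be Gumbel (type I) with location ln(sigma) and scale 1/alpha.
print(stats.kstest(np.log(x), stats.gumbel_r(loc=np.log(sigma), scale=1 / alpha).cdf))
```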
Multinomial logit models, and certain other types of logistic regression, can be phrased as latent variable models with error variables distributed as Gumbel distributions (type I generalized extreme value distributions). This phrasing is common in the theory of discrete choice models, which include logit models, probit models, and various extensions of them, and derives from the fact that the difference of two type-I GEV-distributed variables follows a logistic distribution, of which the logit function is the quantile function. The type-I GEV distribution thus plays the same role in these logit models as the normal distribution does in the corresponding probit models.
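A quick simulation sketch of the fact used here, that the difference of two independent Gumbel variables with a common scale is logistic (parameter values arbitrary):

```python
from scipy import stats

beta, a_x, a_y = 1.5, 0.3, -0.7              # common scale and two locations
x = stats.gumbel_r(loc=a_x, scale=beta).rvs(size=200_000, random_state=1)
y = stats.gumbel_r(loc=a_y, scale=beta).rvs(size=200_000, random_state=2)

# X - Y should follow Logistic(a_x - a_y, beta)
print(stats.kstest(x - y, stats.logistic(loc=a_x - a_y, scale=beta).cdf))
```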
The cumulative distribution function of the generalized extreme value distribution solves the stability postulate equation. The generalized extreme value distribution is a special case of a max-stable distribution, and is a transformation of a min-stable distribution.
Let \left\{ X_i \mid 1\le i\le n \right\} be independent and identically distributed standard normal random variables. Then, approximately, \max\{ X_i \mid 1\le i\le n \}\sim\operatorname{GEV}(\mu_n,\sigma_n,0) , where

\begin{align} \mu_n&=\Phi^{-1}\left(1-\frac{ 1 }{ n }\right)\\ \sigma_n&=\Phi^{-1}\left(1-\frac{ 1 }{ n e }\right)-\Phi^{-1}\left(1-\frac{ 1 }{ n }\right)~, \end{align}

and \Phi^{-1} denotes the quantile function of the standard normal distribution.
This allows us to estimate, for example, the mean of \max\{ X_i \mid 1\le i\le n \} from the mean of the GEV distribution:

\begin{align} \operatorname{E}\left\{ \max\left\{ X_i \mid 1\le i\le n \right\} \right\} & \approx \mu_n+\gamma_{\mathrm E}\,\sigma_n\\ &=\left(1-\gamma_{\mathrm E}\right)\Phi^{-1}\left(1-\frac{ 1 }{ n }\right)+\gamma_{\mathrm E}\,\Phi^{-1}\left(1-\frac{ 1 }{ e\,n }\right)\\ &=\sqrt{\log\left(\frac{ n^2 }{ 2\pi\,\log\left(\frac{ n^2 }{ 2\pi }\right) }\right)~}\cdot\left(1+\frac{\gamma_{\mathrm E}}{\log n}+o\left(\frac{ 1 }{\log n}\right)\right) , \end{align}

where \gamma_{\mathrm E} is the Euler–Mascheroni constant.
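As a sanity check of this approximation, the following sketch compares \mu_n+\gamma_{\mathrm E}\sigma_n with a Monte Carlo estimate of the mean maximum of n standard normal variables (block size and sample count are arbitrary):

```python
import numpy as np
from scipy.stats import norm

n = 10_000                                   # arbitrary block size
gamma_e = 0.5772156649015329                 # Euler–Mascheroni constant
mu_n = norm.ppf(1 - 1 / n)
sigma_n = norm.ppf(1 - 1 / (n * np.e)) - mu_n
approx_mean = mu_n + gamma_e * sigma_n

rng = np.random.default_rng(0)
mc_mean = rng.standard_normal((1_000, n)).max(axis=1).mean()
print(approx_mean, mc_mean)
```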
The GEV distribution is related to several other distributions:

1. If X\sim\operatorname{GEV}(\mu,\sigma,\xi) then m X+b\sim\operatorname{GEV}(m\mu+b,\ m\sigma,\ \xi) .
2. If X\sim\operatorname{Gumbel}(\mu, \sigma) (Gumbel distribution) then X\sim\operatorname{GEV}(\mu,\sigma,0) .
3. If X\sim\operatorname{Weibull}(\sigma,\mu) (Weibull distribution) then \mu\left(1-\sigma\log\tfrac{X}{\sigma}\right)\sim\operatorname{GEV}(\mu,\sigma,0) .
4. If X\sim\operatorname{GEV}(\mu,\sigma,0) then \sigma\exp\left(-\tfrac{X-\mu}{\mu\sigma}\right)\sim\operatorname{Weibull}(\sigma,\mu) (Weibull distribution).
5. If X\sim\operatorname{Exponential}(1) (exponential distribution) then \mu-\sigma\log X\sim\operatorname{GEV}(\mu,\sigma,0) .
6. If X\sim\operatorname{Gumbel}(\alpha_X,\beta) and Y\sim\operatorname{Gumbel}(\alpha_Y,\beta) are independent, then X-Y\sim\operatorname{Logistic}(\alpha_X-\alpha_Y,\beta) (logistic distribution).
7. If X and Y\sim\operatorname{Gumbel}(\alpha,\beta) then X+Y\nsim\operatorname{Logistic}(2\alpha,\beta) . Note that \operatorname{E}\{ X+Y \}=2\alpha+2\beta\gamma_{\mathrm E}\neq2\alpha=\operatorname{E}\left\{ \operatorname{Logistic}(2\alpha,\beta) \right\}~.
Two of these relations can be verified directly. For relation 3, let X\sim\operatorname{Weibull}(\sigma,\mu) ; then the cumulative distribution function of g(X)=\mu\left(1-\sigma\log\tfrac{ X }{\sigma}\right) is:

\begin{align} \operatorname{P}\left\{ \mu\left(1-\sigma\log\tfrac{ X }{\sigma}\right)<x \right\}&=\operatorname{P}\left\{ \log\tfrac{ X }{\sigma}>\tfrac{ 1-x/\mu }{\sigma} \right\}\\ &\quad\text{(since the logarithm is increasing)}\\ &=\operatorname{P}\left\{ X>\sigma\exp\left[\tfrac{ 1-x/\mu }{\sigma}\right] \right\}\\ &=\exp\left(-\left(\cancel{\sigma}\exp\left[\tfrac{ 1-x/\mu }{\sigma}\right]\cdot\cancel{\tfrac{ 1 }{\sigma}}\right)^{\mu}\right)\\ &=\exp\left(-\exp\left[\tfrac{ \mu-x }{\sigma}\right]\right)\\ &=\exp\left(-\exp\left[-s\right]\right) , ~\text{where}~ s\equiv\tfrac{ x-\mu }{\sigma} ; \end{align}

which is the cumulative distribution function of \operatorname{GEV}(\mu,\sigma,0)~.
Similarly, for relation 5, let X\sim\operatorname{Exponential}(1) ; then the cumulative distribution function of g(X)=\mu-\sigma\log X is:

\begin{align} \operatorname{P}\left\{ \mu-\sigma\log X<x \right\}&=\operatorname{P}\left\{ \log X>\tfrac{ \mu-x }{\sigma} \right\}\\ &\quad\text{(since the logarithm is increasing)}\\ &=\operatorname{P}\left\{ X>\exp\left(\tfrac{ \mu-x }{\sigma}\right) \right\}\\ &=\exp\left[-\exp\left(\tfrac{ \mu-x }{\sigma}\right)\right]\\ &=\exp\left[-\exp(-s)\right] , ~\text{where}~ s\equiv\tfrac{ x-\mu }{\sigma} ; \end{align}

which is the cumulative distribution function of \operatorname{GEV}(\mu,\sigma,0)~.
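A simulation sketch of the Weibull relation just proved, assuming SciPy's weibull_min as the \operatorname{Weibull}(\sigma,\mu) form with scale \sigma and shape \mu (values arbitrary):

```python
import numpy as np
from scipy import stats

sigma, mu = 2.0, 3.0                          # Weibull scale and shape (mu > 0)
x = stats.weibull_min(c=mu, scale=sigma).rvs(size=200_000, random_state=0)

g = mu * (1 - sigma * np.log(x / sigma))
# g should follow GEV(mu, sigma, 0), i.e. Gumbel(mu, sigma)
print(stats.kstest(g, stats.gumbel_r(loc=mu, scale=sigma).cdf))
```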