Hypoexponential distribution explained

In probability theory, the hypoexponential distribution or generalized Erlang distribution is a continuous distribution that has found use in the same fields as the Erlang distribution, such as queueing theory, teletraffic engineering and, more generally, stochastic processes. It is called the hypoexponential distribution because it has a coefficient of variation less than one, in contrast to the hyper-exponential distribution, whose coefficient of variation is greater than one, and the exponential distribution, whose coefficient of variation is exactly one.

Overview

The Erlang distribution is the distribution of a sum of $k$ independent exponential random variables that all share the same rate $\lambda$. The hypoexponential distribution generalizes this to a sum of $k$ exponential random variables in which the $i$th has its own rate $\lambda_i$. If $\boldsymbol{X}_1,\dots,\boldsymbol{X}_k$ are independent exponential random variables with rates $\lambda_1,\dots,\lambda_k$, then the random variable

$$\boldsymbol{X}=\sum_{i=1}^{k}\boldsymbol{X}_i$$

is hypoexponentially distributed. The hypoexponential has a minimum coefficient of variation of $1/\sqrt{k}$ (equivalently, a minimum squared coefficient of variation of $1/k$), attained when all rates are equal.
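As a small illustration of the definition, the following sketch computes the mean, variance and coefficient of variation of a sum of independent exponentials directly from the rates, and draws samples by adding exponential variates. The function names `hypoexp_moments` and `sample_hypoexp` and the example rates are illustrative choices, not part of any library.

```python
import math
import random


def hypoexp_moments(rates):
    """Return (mean, variance, coefficient of variation) for Hypo(rates).

    Relies on independence: the means and the variances of the
    exponential stages simply add.
    """
    if not rates or any(r <= 0 for r in rates):
        raise ValueError("rates must be a non-empty sequence of positive numbers")
    mean = sum(1.0 / r for r in rates)
    variance = sum(1.0 / r**2 for r in rates)
    return mean, variance, math.sqrt(variance) / mean


def sample_hypoexp(rates, rng=random):
    """Draw one sample as the sum of independent exponential variates."""
    return sum(rng.expovariate(r) for r in rates)


if __name__ == "__main__":
    rates = [1.0, 2.0, 4.0]  # hypothetical example rates
    mean, variance, cv = hypoexp_moments(rates)
    print(f"mean={mean:.4f} variance={variance:.4f} cv={cv:.4f}")
    draws = [sample_hypoexp(rates) for _ in range(10_000)]
    print(f"sample mean={sum(draws) / len(draws):.4f}")
```

For equal rates (the Erlang case) the returned coefficient of variation is $1/\sqrt{k}$; unequal rates give a larger value that still stays below one for $k\geq 2$.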

Relation to the phase-type distribution

As a result of the definition, it is easier to consider this distribution as a special case of the phase-type distribution.[1] The phase-type distribution is the time to absorption of a finite-state Markov process. If we have a $(k+1)$-state process, where the first $k$ states are transient and state $k+1$ is absorbing, then the time from the start of the process until the absorbing state is reached is phase-type distributed. This becomes the hypoexponential if we start in state 1 and move skip-free from state $i$ to state $i+1$ with rate $\lambda_i$, until state $k$ transitions with rate $\lambda_k$ to the absorbing state $k+1$. This can be written in the form of a subgenerator matrix,

$$\left[\begin{matrix}-\lambda_1&\lambda_1&0&\dots&0&0\\ 0&-\lambda_2&\lambda_2&\ddots&0&0\\ \vdots&\ddots&\ddots&\ddots&\ddots&\vdots\\ 0&0&\ddots&-\lambda_{k-2}&\lambda_{k-2}&0\\ 0&0&\dots&0&-\lambda_{k-1}&\lambda_{k-1}\\ 0&0&\dots&0&0&-\lambda_k\end{matrix}\right].$$

For simplicity, denote the above matrix $\Theta\equiv\Theta(\lambda_1,\dots,\lambda_k)$. If the vector of probabilities of starting in each of the $k$ states is $\boldsymbol{\alpha}=(1,0,\dots,0)$, then

$$\operatorname{Hypo}(\lambda_1,\dots,\lambda_k)=\operatorname{PH}(\boldsymbol{\alpha},\Theta).$$
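The phase-type representation can be made concrete with a short sketch that builds $\Theta$ and $\boldsymbol{\alpha}$ as plain nested lists. It assumes the usual subgenerator convention (diagonal entries $-\lambda_i$, superdiagonal entries $\lambda_i$); the helper names `subgenerator` and `initial_vector` are hypothetical.

```python
def subgenerator(rates):
    """Build the k x k subgenerator matrix Theta for Hypo(rates).

    Row i has -rates[i] on the diagonal and rates[i] on the superdiagonal.
    The last row keeps only the diagonal entry, because state k leaves
    the transient states and enters the absorbing state.
    """
    k = len(rates)
    theta = [[0.0] * k for _ in range(k)]
    for i, rate in enumerate(rates):
        theta[i][i] = -rate
        if i + 1 < k:
            theta[i][i + 1] = rate
    return theta


def initial_vector(k):
    """Initial distribution alpha = (1, 0, ..., 0): always start in state 1."""
    return [1.0] + [0.0] * (k - 1)


if __name__ == "__main__":
    for row in subgenerator([1.0, 2.0, 4.0]):  # hypothetical example rates
        print(row)
    print(initial_vector(3))
```

Every row of $\Theta$ sums to zero except the last, whose row sum $-\lambda_k$ is the negative of the rate of absorption.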

Two parameter case

Where the distribution has two parameters ($\lambda_1\neq\lambda_2$), the explicit forms of the probability functions and the associated statistics are:[2]

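CDF:

$$F(x)=1-\frac{\lambda_2}{\lambda_2-\lambda_1}e^{-\lambda_1 x}+\frac{\lambda_1}{\lambda_2-\lambda_1}e^{-\lambda_2 x}$$

PDF:

$$f(x)=\frac{\lambda_1\lambda_2}{\lambda_1-\lambda_2}\left(e^{-x\lambda_2}-e^{-x\lambda_1}\right)$$

Mean:

$$\frac{1}{\lambda_1}+\frac{1}{\lambda_2}$$

Variance:

$$\frac{1}{\lambda_1^2}+\frac{1}{\lambda_2^2}$$

Coefficient of variation:

$$\frac{\sqrt{\lambda_1^2+\lambda_2^2}}{\lambda_1+\lambda_2}$$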

The coefficient of variation is always less than 1.
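The two-parameter closed forms translate directly into code. The sketch below assumes the standard expressions $F(x)=1-\frac{\lambda_2}{\lambda_2-\lambda_1}e^{-\lambda_1 x}+\frac{\lambda_1}{\lambda_2-\lambda_1}e^{-\lambda_2 x}$ and $f(x)=\frac{\lambda_1\lambda_2}{\lambda_1-\lambda_2}(e^{-x\lambda_2}-e^{-x\lambda_1})$; the function names are illustrative only.

```python
import math


def hypoexp2_cdf(x, lam1, lam2):
    """CDF of the two-parameter hypoexponential; requires lam1 != lam2."""
    if lam1 == lam2:
        raise ValueError("the closed form requires distinct rates")
    if x < 0:
        return 0.0
    return (1.0
            - lam2 / (lam2 - lam1) * math.exp(-lam1 * x)
            + lam1 / (lam2 - lam1) * math.exp(-lam2 * x))


def hypoexp2_pdf(x, lam1, lam2):
    """Density of the two-parameter hypoexponential; requires lam1 != lam2."""
    if lam1 == lam2:
        raise ValueError("the closed form requires distinct rates")
    if x < 0:
        return 0.0
    return lam1 * lam2 / (lam1 - lam2) * (math.exp(-lam2 * x) - math.exp(-lam1 * x))


def hypoexp2_cv(lam1, lam2):
    """Coefficient of variation sqrt(lam1^2 + lam2^2) / (lam1 + lam2)."""
    return math.sqrt(lam1**2 + lam2**2) / (lam1 + lam2)


if __name__ == "__main__":
    lam1, lam2 = 1.0, 2.0  # hypothetical example rates
    print(hypoexp2_cdf(0.7, lam1, lam2), hypoexp2_pdf(0.7, lam1, lam2))
    print(hypoexp2_cv(lam1, lam2))
```

Since $\lambda_1^2+\lambda_2^2<(\lambda_1+\lambda_2)^2$ for positive rates, `hypoexp2_cv` always returns a value below one.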

Given the sample mean ($\bar{x}$) and sample coefficient of variation ($c$), the parameters $\lambda_1$ and $\lambda_2$ can be estimated as follows:

$$\lambda_1=\frac{2}{\bar{x}}\left[1+\sqrt{1+2(c^2-1)}\right]^{-1}$$

$$\lambda_2=\frac{2}{\bar{x}}\left[1-\sqrt{1+2(c^2-1)}\right]^{-1}$$

These estimators can be derived from the method of moments by setting

$$\frac{1}{\lambda_1}+\frac{1}{\lambda_2}=\bar{x}$$

and

$$\frac{\sqrt{\lambda_1^2+\lambda_2^2}}{\lambda_1+\lambda_2}=c.$$

The resulting parameters $\lambda_1$ and $\lambda_2$ are real values if $c^2\in[0.5,1]$.
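A minimal sketch of the moment-matching estimator follows. The function name `fit_hypoexp2` is hypothetical, and the sketch makes one assumption beyond the text: it rejects $c^2=1$, because the bracket $1-\sqrt{1+2(c^2-1)}$ is then zero and the second rate would be infinite.

```python
import math


def fit_hypoexp2(mean, cv):
    """Method-of-moments estimates (lam1, lam2) from a sample mean and CV.

    Assumes mean > 0 and 0.5 <= cv**2 < 1 (cv**2 == 1 is excluded here
    because it would require an infinite second rate).
    """
    c2 = cv * cv
    if mean <= 0 or not (0.5 <= c2 < 1.0):
        raise ValueError("need mean > 0 and 0.5 <= cv^2 < 1")
    # max() guards against a tiny negative argument from rounding at cv^2 = 0.5
    root = math.sqrt(max(0.0, 1.0 + 2.0 * (c2 - 1.0)))
    lam1 = 2.0 / (mean * (1.0 + root))
    lam2 = 2.0 / (mean * (1.0 - root))
    return lam1, lam2


if __name__ == "__main__":
    # Moments of Hypo(1, 3): mean 4/3 and CV sqrt(10)/4.
    print(fit_hypoexp2(4.0 / 3.0, math.sqrt(10.0) / 4.0))
```

With these formulas `lam1` is always the smaller of the two rates, since it corresponds to the stage with the larger mean.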

Characterization

A random variable $\boldsymbol{X}\sim\operatorname{Hypo}(\lambda_1,\dots,\lambda_k)$ has cumulative distribution function given by

$$F(x)=1-\boldsymbol{\alpha}e^{x\Theta}\boldsymbol{1}$$

and density function

$$f(x)=-\boldsymbol{\alpha}e^{x\Theta}\Theta\boldsymbol{1},$$

where $\boldsymbol{1}$ is a column vector of ones of size $k$ and $e^{A}$ is the matrix exponential of $A$. When $\lambda_i\neq\lambda_j$ for all $i\neq j$, the density function can be written as

$$f(x)=\sum_{i=1}^{k}\lambda_i e^{-x\lambda_i}\left(\prod_{j=1,j\neq i}^{k}\frac{\lambda_j}{\lambda_j-\lambda_i}\right)=\sum_{i=1}^{k}\ell_i(0)\lambda_i e^{-x\lambda_i}$$

where $\ell_1(x),\dots,\ell_k(x)$ are the Lagrange basis polynomials associated with the points $\lambda_1,\dots,\lambda_k$.
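For pairwise distinct rates, the density can be evaluated from the Lagrange weights $\ell_i(0)=\prod_{j\neq i}\lambda_j/(\lambda_j-\lambda_i)$. The following sketch uses illustrative function names and assumes that all rates are positive and distinct.

```python
import math


def lagrange_weights(rates):
    """Return l_i(0) = prod_{j != i} rates[j] / (rates[j] - rates[i])."""
    if len(set(rates)) != len(rates):
        raise ValueError("all rates must be distinct")
    weights = []
    for i, lam_i in enumerate(rates):
        w = 1.0
        for j, lam_j in enumerate(rates):
            if j != i:
                w *= lam_j / (lam_j - lam_i)
        weights.append(w)
    return weights


def hypoexp_pdf(x, rates):
    """Density of Hypo(rates) at x as a weighted sum of exponential densities."""
    if x < 0:
        return 0.0
    weights = lagrange_weights(rates)
    return sum(w * lam * math.exp(-lam * x) for w, lam in zip(weights, rates))


if __name__ == "__main__":
    rates = [1.0, 2.0, 4.0]  # hypothetical example rates
    print(lagrange_weights(rates))
    print(hypoexp_pdf(0.5, rates))
```

The weights sum to one but are not all positive, so the expression is a signed mixture of exponential densities. When rates are close together, the differences $\lambda_j-\lambda_i$ in the denominators can cause loss of precision.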

The distribution has Laplace transform

$$\mathcal{L}\{f(x)\}=-\boldsymbol{\alpha}(sI-\Theta)^{-1}\Theta\boldsymbol{1},$$

which can be used to find moments,

$$E[X^n]=(-1)^n n!\,\boldsymbol{\alpha}\Theta^{-n}\boldsymbol{1}.$$
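The moment formula can be evaluated without forming $\Theta^{-n}$ explicitly. The sketch below assumes $\Theta$ is the upper-bidiagonal subgenerator with $-\lambda_i$ on the diagonal and $\lambda_i$ on the superdiagonal, and $\boldsymbol{\alpha}=(1,0,\dots,0)$; it solves $\Theta y=v$ repeatedly by back-substitution. The helper names are hypothetical.

```python
import math


def solve_theta(rates, v):
    """Solve Theta @ y = v by back-substitution.

    Row i of Theta reads -rates[i] * y[i] + rates[i] * y[i + 1] = v[i];
    the last row reads -rates[k - 1] * y[k - 1] = v[k - 1].
    """
    k = len(rates)
    y = [0.0] * k
    y[k - 1] = -v[k - 1] / rates[k - 1]
    for i in range(k - 2, -1, -1):
        y[i] = y[i + 1] - v[i] / rates[i]
    return y


def hypoexp_moment(rates, n):
    """n-th raw moment E[X^n] = (-1)^n n! alpha Theta^{-n} 1 with alpha = e_1."""
    y = [1.0] * len(rates)
    for _ in range(n):
        y = solve_theta(rates, y)  # after the loop, y = Theta^{-n} 1
    return (-1) ** n * math.factorial(n) * y[0]


if __name__ == "__main__":
    rates = [1.0, 2.0, 4.0]  # hypothetical example rates
    print(hypoexp_moment(rates, 1), hypoexp_moment(rates, 2))
```

For $n=1$ this reproduces the mean $\sum_i 1/\lambda_i$, and the second moment minus the squared mean gives the variance $\sum_i 1/\lambda_i^2$.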

General case

In the general case, where there are $a$ distinct sums of exponential distributions with rates $\lambda_1,\lambda_2,\dots,\lambda_a$ and the number of terms in each sum equal to $r_1,r_2,\dots,r_a$ respectively, the cumulative distribution function for $t\geq0$ is given by

$$F(t)=1-\left(\prod_{j=1}^{a}\lambda_j^{r_j}\right)\sum_{k=1}^{a}\sum_{l=1}^{r_k}\frac{\Psi_{k,l}(-\lambda_k)\,t^{r_k-l}\exp(-\lambda_k t)}{(r_k-l)!\,(l-1)!},$$

with

$$\Psi_{k,l}(x)=-\frac{\partial^{l-1}}{\partial x^{l-1}}\left(\prod_{j=0,j\neq k}^{a}\left(\lambda_j+x\right)^{-r_j}\right),$$

with the additional convention $\lambda_0=0,r_0=1$.[3]
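As a consistency check, the general formula can be worked through for $a=2$ and $r_1=r_2=1$. This assumes the formula in the form $F(t)=1-\left(\prod_j\lambda_j^{r_j}\right)\sum_k\sum_l \Psi_{k,l}(-\lambda_k)\,t^{r_k-l}e^{-\lambda_k t}/\big((r_k-l)!\,(l-1)!\big)$ with $\lambda_0=0$ and $r_0=1$. Only the $l=1$ terms appear, so no derivatives are needed, and the result is the two-parameter CDF:

```latex
\begin{aligned}
\Psi_{1,1}(x) &= -\frac{1}{x\,(\lambda_2+x)}
  \quad\Longrightarrow\quad
  \Psi_{1,1}(-\lambda_1) = \frac{1}{\lambda_1(\lambda_2-\lambda_1)},\\
\Psi_{2,1}(x) &= -\frac{1}{x\,(\lambda_1+x)}
  \quad\Longrightarrow\quad
  \Psi_{2,1}(-\lambda_2) = \frac{1}{\lambda_2(\lambda_1-\lambda_2)},\\
F(t) &= 1-\lambda_1\lambda_2\left[
  \frac{e^{-\lambda_1 t}}{\lambda_1(\lambda_2-\lambda_1)}
  +\frac{e^{-\lambda_2 t}}{\lambda_2(\lambda_1-\lambda_2)}\right]\\
 &= 1-\frac{\lambda_2}{\lambda_2-\lambda_1}e^{-\lambda_1 t}
  +\frac{\lambda_1}{\lambda_2-\lambda_1}e^{-\lambda_2 t}.
\end{aligned}
```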

Uses

This distribution has been used in population genetics,[4] cell biology,[5][6] and queueing theory.[7][8]

Notes and References

  1. Legros, Benjamin; Jouini, Oualid (2015). "A linear algebraic approach for the computation of sums of Erlang random variables". Applied Mathematical Modelling. 39 (16): 4971–4977. doi:10.1016/j.apm.2015.04.013.
  2. Bolch, Gunter; Greiner, Stefan; de Meer, Hermann; Trivedi, Kishor S. (2006). Queuing Networks and Markov Chains: Modeling and Performance Evaluation with Computer Science Applications (2nd ed.). Wiley. pp. 24–25. doi:10.1002/0471791571. ISBN 978-0-471-79157-7.
  3. Amari, Suprasad V.; Misra, Ravindra B. (1997). "Closed-form expressions for distribution of sum of exponential random variables". IEEE Transactions on Reliability. 46 (4): 519–522. doi:10.1109/24.693785.
  4. Strimmer, Korbinian; Pybus, Oliver G. (2001). "Exploring the demographic history of DNA sequences using the generalized skyline plot". 18 (12): 2298–2305. doi:10.1093/oxfordjournals.molbev.a003776. PMID 11719579.
  5. Yates, Christian A.; Ford, Matthew J.; Mort, Richard L. (2017). "A multi-stage representation of cell proliferation as a Markov process". Bulletin of Mathematical Biology. 79 (12): 2905–2928. arXiv:1705.09718. doi:10.1007/s11538-017-0356-4. PMC 5709504. PMID 29030804.
  6. Gavagnin, Enrico; Ford, Matthew J.; Mort, Richard L.; Rogers, Tim; Yates, Christian A. (2019). "The invasion speed of cell migration models with realistic cell cycle time distributions". 481: 91–99. arXiv:1806.03140. doi:10.1016/j.jtbi.2018.09.010. PMID 30219568.
  7. Călinescu, Malenia (August 2009). "Forecasting and capacity planning for ambulance services". Faculty of Sciences. Archived at https://web.archive.org/web/20100215173841/http://www.few.vu.nl/en/Images/stageverslag-calinescu_tcm39-105827.pdf on 15 February 2010.
  8. Bekker, René; Koeleman, Paulien M. (2011). "Scheduling admissions and reducing variability in bed demand". Health Care Management Science. 14 (3): 237–249. doi:10.1007/s10729-011-9163-x. PMC 3158339. PMID 21667090.