Compound Poisson distribution explained

In probability theory, a compound Poisson distribution is the probability distribution of the sum of a number of independent identically-distributed random variables, where the number of terms to be added is itself a Poisson-distributed variable. The result can be either a continuous or a discrete distribution.

Definition

Suppose that

N\sim\operatorname{Poisson}(λ),

i.e., N is a random variable whose distribution is a Poisson distribution with expected value λ, and that

X_1,X_2,X_3,...

are identically distributed random variables that are mutually independent and also independent of N. Then the probability distribution of the sum of N i.i.d. random variables

Y=\sum_{n=1}^N X_n

is a compound Poisson distribution.

In the case N = 0, this is a sum of 0 terms, so the value of Y is 0. Hence the conditional distribution of Y given that N = 0 is a degenerate distribution.

The compound Poisson distribution is obtained by marginalising the joint distribution of (Y,N) over N, and this joint distribution can be obtained by combining the conditional distribution Y | N with the marginal distribution of N.
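The two-stage construction (draw N, then sum N independent copies of X) translates directly into a sampler. The following is a minimal sketch using only Python's standard library; the function names, the rate λ = 3 and the exponential summands with mean 2 are illustrative assumptions, not part of the definition.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw N ~ Poisson(lam) by counting unit-rate exponential arrivals in [0, lam]."""
    count = 0
    arrival = rng.expovariate(1.0)
    while arrival <= lam:
        count += 1
        arrival += rng.expovariate(1.0)
    return count


def sample_compound_poisson(lam, draw_x, rng):
    """Draw Y = X_1 + ... + X_N; the empty sum (N = 0) gives Y = 0."""
    n = sample_poisson(lam, rng)
    return sum(draw_x(rng) for _ in range(n))


rng = random.Random(0)  # fixed seed so the run is reproducible
# Illustrative assumption: X is exponential with rate 0.5 (mean 2).
samples = [
    sample_compound_poisson(3.0, lambda r: r.expovariate(0.5), rng)
    for _ in range(20000)
]
sample_mean = sum(samples) / len(samples)
frac_zero = sum(1 for y in samples if y == 0) / len(samples)
print(sample_mean)                # should be near 3 * 2 = 6
print(frac_zero, math.exp(-3.0))  # for continuous X, P(Y = 0) = P(N = 0) = exp(-lam)
```

Passing the summand sampler as a function keeps the sketch independent of any particular distribution of X.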

Properties

The expected value and the variance of the compound distribution can be derived in a simple way from the law of total expectation and the law of total variance. Thus

\operatorname{E}(Y)=\operatorname{E}\left[\operatorname{E}(Y\mid N)\right]=\operatorname{E}\left[N\operatorname{E}(X)\right]=\operatorname{E}(N)\operatorname{E}(X),

\begin{align} \operatorname{Var}(Y)&=\operatorname{E}\left[\operatorname{Var}(Y\mid N)\right]+\operatorname{Var}\left[\operatorname{E}(Y\mid N)\right]=\operatorname{E}\left[N\operatorname{Var}(X)\right]+\operatorname{Var}\left[N\operatorname{E}(X)\right]\\[6pt] &=\operatorname{E}(N)\operatorname{Var}(X)+\left(\operatorname{E}(X)\right)^2\operatorname{Var}(N). \end{align}

Then, since E(N) = Var(N) if N is Poisson-distributed, these formulae can be reduced to

\operatorname{E}(Y)=\operatorname{E}(N)\operatorname{E}(X)=λ\operatorname{E}(X),

\operatorname{Var}(Y)=\operatorname{E}(N)\left(\operatorname{Var}(X)+(\operatorname{E}(X))^2\right)=\operatorname{E}(N)\operatorname{E}(X^2)=λ\operatorname{E}(X^2).
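As a quick numerical check of these formulae, the sketch below evaluates both the general expression from the law of total variance and the Poisson-specific reduction. The helper names and the example values (λ = 3, X with mean 2 and variance 4, as for an exponential distribution with mean 2) are assumptions chosen for illustration.

```python
def compound_mean_var(mean_n, var_n, mean_x, var_x):
    """Mean and variance of Y for a general counting variable N (law of total variance)."""
    mean_y = mean_n * mean_x
    var_y = mean_n * var_x + mean_x ** 2 * var_n
    return mean_y, var_y


def compound_poisson_mean_var(lam, mean_x, var_x):
    """Poisson case: E(Y) = lam * E(X) and Var(Y) = lam * E(X^2)."""
    second_moment_x = var_x + mean_x ** 2  # E(X^2) = Var(X) + E(X)^2
    return lam * mean_x, lam * second_moment_x


lam, mean_x, var_x = 3.0, 2.0, 4.0
general = compound_mean_var(lam, lam, mean_x, var_x)  # E(N) = Var(N) = lam
reduced = compound_poisson_mean_var(lam, mean_x, var_x)
print(general, reduced)  # both are (6.0, 24.0)
```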

The probability distribution of Y can be determined in terms of characteristic functions:

\varphi_Y(t)=\operatorname{E}(e^{itY})=\operatorname{E}\left(\left(\operatorname{E}(e^{itX}\mid N)\right)^N\right)=\operatorname{E}\left((\varphi_X(t))^N\right),

and hence, using the probability-generating function of the Poisson distribution, we have

\varphi_Y(t)=\mathrm{e}^{λ(\varphi_X(t)-1)}.
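A simple sanity check of this identity takes X identically equal to 1, so that Y = N is itself Poisson and its characteristic function can also be computed directly from the Poisson probabilities. The sketch below compares the two; the function names and the truncation of the series at 100 terms are illustrative assumptions.

```python
import cmath
import math


def compound_poisson_cf(t, lam, cf_x):
    """phi_Y(t) = exp(lam * (phi_X(t) - 1))."""
    return cmath.exp(lam * (cf_x(t) - 1))


def poisson_cf_by_series(t, lam, terms=100):
    """E(exp(i t N)) summed directly over the Poisson probabilities."""
    total = 0j
    prob = math.exp(-lam)  # P(N = 0)
    for n in range(terms):
        total += prob * cmath.exp(1j * t * n)
        prob *= lam / (n + 1)  # P(N = n + 1) obtained from P(N = n)
    return total


lam, t = 2.5, 0.7
via_formula = compound_poisson_cf(t, lam, lambda s: cmath.exp(1j * s))  # X = 1
via_series = poisson_cf_by_series(t, lam)
print(abs(via_formula - via_series))  # difference should be at rounding level
```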

An alternative approach is via cumulant generating functions:

K_Y(t)=\ln\operatorname{E}[e^{tY}]=\ln\operatorname{E}[\operatorname{E}[e^{tY}\mid N]]=\ln\operatorname{E}[e^{NK_X(t)}]=K_N(K_X(t)).

Via the law of total cumulance it can be shown that, if the mean of the Poisson distribution λ = 1, the cumulants of Y are the same as the moments of X₁.
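Since the Poisson cumulant generating function is K_N(s) = λ(e^s − 1), the composition K_N(K_X(t)) equals λ(M_X(t) − 1), where M_X is the moment generating function of X, so the n-th cumulant of Y is λ E(X^n). The sketch below checks the first two cumulants numerically by finite differences for a hypothetical two-point distribution of X; all names and values are illustrative assumptions.

```python
import math

# Illustrative assumption: X is 1 with probability 0.6 and 2 with probability 0.4.
pmf_x = {1: 0.6, 2: 0.4}
lam = 1.0  # with lam = 1 the cumulants of Y equal the raw moments of X


def mgf_x(t):
    """Moment generating function of X."""
    return sum(p * math.exp(t * k) for k, p in pmf_x.items())


def cgf_y(t):
    """K_Y(t) = K_N(K_X(t)) = lam * (M_X(t) - 1)."""
    return lam * (mgf_x(t) - 1.0)


def raw_moment_x(n):
    """E(X^n) computed from the probability mass function."""
    return sum(p * k ** n for k, p in pmf_x.items())


h = 1e-3
# Central finite differences approximate the first two derivatives of K_Y at 0.
kappa_1 = (cgf_y(h) - cgf_y(-h)) / (2 * h)
kappa_2 = (cgf_y(h) - 2 * cgf_y(0.0) + cgf_y(-h)) / h ** 2
print(kappa_1, lam * raw_moment_x(1))
print(kappa_2, lam * raw_moment_x(2))
```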

Every infinitely divisible probability distribution is a limit of compound Poisson distributions.^[1] Moreover, every compound Poisson distribution is infinitely divisible by definition.

Discrete compound Poisson distribution

When

X_1,X_2,X_3,\ldots

are positive integer-valued i.i.d. random variables with

P(X_1=k)=\alpha_k,\quad k=1,2,\ldots,

this compound Poisson distribution is called a discrete compound Poisson distribution^[2] ^[3] ^[4] (or stuttering-Poisson distribution^[5]). A discrete random variable Y whose probability generating function satisfies

P_Y(z)=\sum_{i=0}^\infty P(Y=i)z^i=\exp\left(\sum_{k=1}^\infty \alpha_k λ(z^k-1)\right),\quad |z|\le 1,

is said to have a discrete compound Poisson (DCP) distribution with parameters

(\alpha_1 λ,\alpha_2 λ,\ldots)\in\mathbb{R}^\infty

(where

\sum_{i=1}^\infty \alpha_i = 1,

with \alpha_i \ge 0 and λ > 0), which is denoted by

Y\sim\operatorname{DCP}(λ\alpha_1,λ\alpha_2,\ldots).

Moreover, if

Y\sim\operatorname{DCP}(λ\alpha_1,\ldots,λ\alpha_r),

we say Y has a discrete compound Poisson distribution of order r. When r=1,2, the DCP becomes the Poisson distribution and the Hermite distribution, respectively. When r=3,4, the DCP becomes the triple stuttering-Poisson distribution and the quadruple stuttering-Poisson distribution, respectively.^[6] Other special cases include the shifted geometric distribution, the negative binomial distribution, the geometric Poisson distribution, the Neyman type A distribution, and the Luria–Delbrück distribution of the Luria–Delbrück experiment. For more special cases of the DCP, see the review paper^[7] and references therein.
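Differentiating the probability generating function P_Y(z) and comparing coefficients of powers of z gives the recursion n P(Y=n) = Σ_{k=1}^{n} k α_k λ P(Y=n−k), with P(Y=0) = e^{−λ}, which lets the probabilities of a DCP distribution of finite order be computed directly. The following sketch implements it; the function name `dcp_pmf` and the example parameters are assumptions made for illustration.

```python
import math


def dcp_pmf(rates, n_max):
    """Return [P(Y=0), ..., P(Y=n_max)], where rates[k-1] = alpha_k * lam."""
    lam = sum(rates)
    pmf = [math.exp(-lam)]  # P(Y = 0) = exp(-lam)
    for n in range(1, n_max + 1):
        upper = min(n, len(rates))
        total = sum(k * rates[k - 1] * pmf[n - k] for k in range(1, upper + 1))
        pmf.append(total / n)
    return pmf


# Order r = 1 reduces to the Poisson distribution, here with mean 2.
poisson_like = dcp_pmf([2.0], 10)
print(poisson_like[3], math.exp(-2.0) * 2.0 ** 3 / math.factorial(3))

# Order r = 2 (Hermite case) with illustrative rates alpha_1*lam = 1.0 and alpha_2*lam = 0.5.
hermite_like = dcp_pmf([1.0, 0.5], 60)
print(sum(hermite_like))                               # close to 1
print(sum(n * p for n, p in enumerate(hermite_like)))  # close to 1*1.0 + 2*0.5 = 2
```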

Feller's characterization of the compound Poisson distribution states that a non-negative integer-valued random variable X is infinitely divisible if and only if its distribution is a discrete compound Poisson distribution.^[8] The negative binomial distribution is discrete infinitely divisible, i.e., if X has a negative binomial distribution, then for any positive integer n, there exist discrete i.i.d. random variables X_1, ..., X_n whose sum has the same distribution as X. The shifted geometric distribution is a discrete compound Poisson distribution since it is a trivial case of the negative binomial distribution.

This distribution can model batch arrivals (such as in a bulk queue^[5] ^[9]). The discrete compound Poisson distribution is also widely used in actuarial science for modelling the distribution of the total claim amount.^[3]

When some \alpha_k are negative, the distribution is called a discrete pseudo compound Poisson distribution.^[3] Any discrete random variable Y whose probability generating function satisfies

G_Y(z)=\sum_{i=0}^\infty P(Y=i)z^i=\exp\left(\sum_{k=1}^\infty \alpha_k λ(z^k-1)\right),\quad |z|\le 1,

is said to have a discrete pseudo compound Poisson distribution with parameters

(λ_1,λ_2,\ldots)=:(\alpha_1 λ,\alpha_2 λ,\ldots)\in\mathbb{R}^\infty,

where

\sum_{i=1}^\infty \alpha_i = 1

and

\sum_{i=1}^\infty |\alpha_i| < \infty,

with \alpha_i\in\mathbb{R} and λ>0.
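One illustration, offered here as a worked assumption rather than taken from the text, is the Bernoulli distribution with success probability p < 1/2: expanding ln(1 − p + pz) as a power series in z gives α_k λ = (−1)^{k+1} θ^k / k with θ = p/(1 − p), so the even-indexed parameters are negative. The sketch below feeds these parameters into the coefficient recursion implied by the probability generating function, n P(Y=n) = Σ_k k α_k λ P(Y=n−k), and recovers P(Y=0) = 1 − p, P(Y=1) = p and zero elsewhere, up to the truncation of the series at a hypothetical 60 terms.

```python
import math


def pseudo_dcp_pmf(rates, n_max):
    """Probabilities from exp(sum_k rates[k-1] * (z**k - 1)); rates may be negative."""
    pmf = [math.exp(-sum(rates))]
    for n in range(1, n_max + 1):
        upper = min(n, len(rates))
        total = sum(k * rates[k - 1] * pmf[n - k] for k in range(1, upper + 1))
        pmf.append(total / n)
    return pmf


p = 0.3
theta = p / (1.0 - p)
# alpha_k * lam = (-1)**(k+1) * theta**k / k, truncated at 60 terms (assumption).
rates = [(-1) ** (k + 1) * theta ** k / k for k in range(1, 61)]
pmf = pseudo_dcp_pmf(rates, 5)
print(pmf[0], pmf[1])                # approximately 1 - p and p
print(max(abs(q) for q in pmf[2:]))  # approximately 0
```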

Compound Poisson Gamma distribution

If X has a gamma distribution, of which the exponential distribution is a special case, then the conditional distribution of Y | N (for N ≥ 1) is again a gamma distribution. The marginal distribution of Y is a Tweedie distribution^[10] with variance power 1 < p < 2 (the proof is by comparison of characteristic functions). To be more explicit, if

N\sim\operatorname{Poisson}(λ),

and

X_i\sim\operatorname{\Gamma}(\alpha,\beta)

i.i.d., then the distribution of

Y=\sum_{i=1}^N X_i

is a reproductive exponential dispersion model

ED(\mu,\sigma^2)

with

\begin{align} \operatorname{E}[Y]&=λ\frac{\alpha}{\beta}=:\mu,\\[4pt] \operatorname{Var}[Y]&=λ\frac{\alpha(1+\alpha)}{\beta^2}=:\sigma^2\mu^p. \end{align}

The mapping from the Tweedie parameters

\mu,\sigma^2,p

to the Poisson and gamma parameters

λ,\alpha,\beta

is the following:

\begin{align} λ&=\frac{\mu^{2-p}}{(2-p)\sigma^2}, \\[4pt] \alpha&=\frac{2-p}{p-1}, \\[4pt] \beta&=\frac{\mu^{1-p}}{(p-1)\sigma^2}. \end{align}
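The parameter mapping can be checked by substituting it back into the mean and variance formulae. The sketch below does this numerically; the function names and the example values μ = 10, σ² = 2, p = 1.5 are illustrative assumptions.

```python
def tweedie_to_poisson_gamma(mu, sigma2, p):
    """Map Tweedie (mu, sigma^2, p), with 1 < p < 2, to (lam, alpha, beta)."""
    if not 1.0 < p < 2.0:
        raise ValueError("the compound Poisson-gamma case requires 1 < p < 2")
    lam = mu ** (2.0 - p) / ((2.0 - p) * sigma2)
    alpha = (2.0 - p) / (p - 1.0)
    beta = mu ** (1.0 - p) / ((p - 1.0) * sigma2)
    return lam, alpha, beta


def poisson_gamma_mean_var(lam, alpha, beta):
    """E[Y] = lam*alpha/beta and Var[Y] = lam*alpha*(1+alpha)/beta**2."""
    return lam * alpha / beta, lam * alpha * (1.0 + alpha) / beta ** 2


mu, sigma2, p = 10.0, 2.0, 1.5  # illustrative values
lam, alpha, beta = tweedie_to_poisson_gamma(mu, sigma2, p)
mean_y, var_y = poisson_gamma_mean_var(lam, alpha, beta)
print(mean_y, mu)               # the mean mu is recovered
print(var_y, sigma2 * mu ** p)  # the variance equals sigma^2 * mu^p
```

Here β is the rate (inverse scale) of the gamma distribution, consistent with the mean α/β used in the formulae.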

Compound Poisson processes

See main article: Compound Poisson process.

A compound Poisson process with rate

λ>0

and jump size distribution G is a continuous-time stochastic process

\{Y(t):t\geq0\}

given by

Y(t)=\sum_{i=1}^{N(t)} D_i,

where the sum is by convention equal to zero as long as N(t) = 0. Here,

\{N(t):t\geq0\}

is a Poisson process with rate λ, and

\{D_i:i\geq1\}

are independent and identically distributed random variables, with distribution function G, which are also independent of

\{N(t):t\geq0\}.

^[11]
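A path of such a process can be simulated by drawing exponential waiting times between the Poisson events and adding one jump at each event. The sketch below is a minimal illustration using Python's standard library; the function names, the rate 2, the time horizon 5 and the uniform jump distribution are assumptions chosen for the example.

```python
import random


def compound_poisson_path(rate, horizon, draw_jump, rng):
    """Return the jump times in [0, horizon] and the value of Y just after each jump."""
    times, values = [], []
    t, y = 0.0, 0.0
    while True:
        t += rng.expovariate(rate)  # exponential waiting time between Poisson events
        if t > horizon:
            break
        y += draw_jump(rng)  # add the jump D_i
        times.append(t)
        values.append(y)
    return times, values


def value_at(times, values, t):
    """Y(t): the running sum of the jumps that occurred up to time t (0 if none)."""
    y = 0.0
    for jump_time, level in zip(times, values):
        if jump_time > t:
            break
        y = level
    return y


rng = random.Random(42)
# Illustrative assumption: rate 2 and jumps uniform on [0, 1], so G is Uniform(0, 1).
finals = []
for _ in range(5000):
    times, values = compound_poisson_path(2.0, 5.0, lambda r: r.random(), rng)
    finals.append(value_at(times, values, 5.0))
mean_final = sum(finals) / len(finals)
print(mean_final)  # should be near rate * t * E(D) = 2 * 5 * 0.5 = 5
```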

The discrete version of the compound Poisson process can be used in survival analysis for frailty models.^[12]

Applications

A compound Poisson distribution, in which the summands have an exponential distribution, was used by Revfeim to model the distribution of the total rainfall in a day, where each day contains a Poisson-distributed number of events each of which provides an amount of rainfall which has an exponential distribution.^[13] Thompson applied the same model to monthly total rainfalls.^[14]

There have been applications to insurance claims^[15] ^[16] and x-ray computed tomography.^[17] ^[18] ^[19]

Notes and References

1. Lukacs, E. (1970). Characteristic Functions. London: Griffin. ISBN 0-85264-170-2.
2. Johnson, N. L.; Kemp, A. W.; Kotz, S. (2005). Univariate Discrete Distributions (3rd ed.). Wiley.
3. Zhang, Huiming; Liu, Yunxiao; Li, Bo (2014). "Notes on discrete compound Poisson model with applications to risk theory". Insurance: Mathematics and Economics. 59: 325–336. doi:10.1016/j.insmatheco.2014.09.012.
4. Zhang, Huiming; Li, Bo (2016). "Characterizations of discrete compound Poisson distributions". Communications in Statistics - Theory and Methods. 45 (22): 6789–6802. doi:10.1080/03610926.2014.901375. S2CID 125475756.
5. Kemp, C. D. (1967). "'Stuttering – Poisson' distributions". Journal of the Statistical and Social Enquiry of Ireland. 21 (5): 151–157. hdl:2262/6987.
6. Patel, Y. C. (1976). "Estimation of the parameters of the triple and quadruple stuttering-Poisson distributions". Technometrics. 18 (1): 67–73.
7. Wimmer, G.; Altmann, G. (1996). "The multiple Poisson distribution, its characteristics and a variety of forms". Biometrical Journal. 38 (8): 995–1011.
8. Feller, W. (1968). An Introduction to Probability Theory and its Applications. Vol. I (3rd ed.). New York: Wiley.
9. Adelson, R. M. (1966). "Compound Poisson Distributions". Journal of the Operational Research Society. 17 (1): 73–75. doi:10.1057/jors.1966.8.
10. Jørgensen, Bent (1997). The Theory of Dispersion Models. Chapman & Hall. ISBN 978-0412997112.
11. Ross, S. M. (2007). Introduction to Probability Models (9th ed.). Boston: Academic Press. ISBN 978-0-12-598062-3.
12. Ata, N.; Özel, G. (2013). "Survival functions for the frailty models based on the discrete compound Poisson process". Journal of Statistical Computation and Simulation. 83 (11): 2105–2116. doi:10.1080/00949655.2012.679943. S2CID 119851120.
13. Revfeim, K. J. A. (1984). "An initial model of the relationship between rainfall events and daily rainfalls". Journal of Hydrology. 75 (1–4): 357–364. doi:10.1016/0022-1694(84)90059-3. Bibcode:1984JHyd...75..357R.
14. Thompson, C. S. (1984). "Homogeneity analysis of a rainfall series: an application of the use of a realistic rainfall model". J. Climatology. 4 (6): 609–619. doi:10.1002/joc.3370040605. Bibcode:1984IJCli...4..609T.
15. Jørgensen, Bent; Paes De Souza, Marta C. (January 1994). "Fitting Tweedie's compound poisson model to insurance claims data". Scandinavian Actuarial Journal. 1994 (1): 69–93. doi:10.1080/03461238.1994.10413930.
16. Smyth, Gordon K.; Jørgensen, Bent (29 August 2014). "Fitting Tweedie's Compound Poisson Model to Insurance Claims Data: Dispersion Modelling". ASTIN Bulletin. 32 (1): 143–157. doi:10.2143/AST.32.1.1020.
17. Whiting, Bruce R. (3 May 2002). Antonuk, Larry E.; Yaffe, Martin J. (eds.). "Signal statistics in x-ray computed tomography". Medical Imaging 2002: Physics of Medical Imaging. 4682. International Society for Optics and Photonics: 53–60. doi:10.1117/12.465601. Bibcode:2002SPIE.4682...53W. S2CID 116487704.
18. Elbakri, Idris A.; Fessler, Jeffrey A. (16 May 2003). Fitzpatrick, J. Michael; Sonka, Milan (eds.). "Efficient and accurate likelihood for iterative image reconstruction in x-ray computed tomography". Medical Imaging 2003: Image Processing. 5032. SPIE: 1839–1850. doi:10.1117/12.480302. Bibcode:2003SPIE.5032.1839E. S2CID 12215253. CiteSeerX 10.1.1.419.3752.
19. Whiting, Bruce R.; Massoumzadeh, Parinaz; Earl, Orville A.; O'Sullivan, Joseph A.; Snyder, Donald L.; Williamson, Jeffrey F. (24 August 2006). "Properties of preprocessed sinogram data in x-ray computed tomography". Medical Physics. 33 (9): 3290–3303. doi:10.1118/1.2230762. PMID 17022224. Bibcode:2006MedPh..33.3290W.