In probability theory and statistics, the inverse gamma distribution is a two-parameter family of continuous probability distributions on the positive real line; it is the distribution of the reciprocal of a variable distributed according to the gamma distribution.
Perhaps the chief use of the inverse gamma distribution is in Bayesian statistics, where the distribution arises as the marginal posterior distribution for the unknown variance of a normal distribution, if an uninformative prior is used, and as an analytically tractable conjugate prior, if an informative prior is required. It is common among some Bayesians to consider an alternative parametrization of the normal distribution in terms of the precision, defined as the reciprocal of the variance, which allows the gamma distribution to be used directly as a conjugate prior. Other Bayesians prefer to parametrize the inverse gamma distribution differently, as a scaled inverse chi-squared distribution.
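As an illustration of the conjugacy described above, the following sketch performs the standard inverse-gamma update for the variance of a normal distribution with known mean. It assumes NumPy/SciPy, where `scipy.stats.invgamma` takes the shape as `a` and the scale as `scale`; the data, seed, and hyperparameter values are arbitrary.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
mu, true_var = 2.0, 1.5
x = rng.normal(mu, np.sqrt(true_var), size=50)        # simulated observations

# Inverse-gamma prior on the variance: sigma^2 ~ Inv-Gamma(alpha0, beta0)
alpha0, beta0 = 3.0, 2.0                               # assumed hyperparameters

# Conjugate update with known mean mu:
#   alpha_n = alpha0 + n/2,  beta_n = beta0 + (1/2) * sum((x_i - mu)^2)
n = x.size
alpha_n = alpha0 + n / 2
beta_n = beta0 + 0.5 * np.sum((x - mu) ** 2)

posterior = stats.invgamma(a=alpha_n, scale=beta_n)
print("posterior mean of the variance:", posterior.mean())   # beta_n / (alpha_n - 1)
```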
The inverse gamma distribution's probability density function is defined over the support $x > 0$:
$$f(x;\alpha,\beta) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\left(\frac{1}{x}\right)^{\alpha+1}\exp\left(-\frac{\beta}{x}\right),$$
with shape parameter $\alpha$ and scale parameter $\beta$. Here $\Gamma(\cdot)$ denotes the gamma function.
Unlike the gamma distribution, which contains a somewhat similar exponential term, here $\beta$ is a scale parameter, since the density function satisfies
$$f(x;\alpha,\beta) = \frac{f(x/\beta;\alpha,1)}{\beta}.$$
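As a numerical sanity check of the density and the scale property, here is a minimal sketch assuming NumPy/SciPy; in `scipy.stats.invgamma` the shape `a` plays the role of $\alpha$ and `scale` the role of $\beta$, and the grid of evaluation points is arbitrary.

```python
import numpy as np
from scipy.special import gamma
from scipy import stats

alpha, beta = 3.0, 2.0
x = np.linspace(0.1, 5.0, 200)

# Density written out directly from the formula above
pdf = beta**alpha / gamma(alpha) * x**(-alpha - 1) * np.exp(-beta / x)

# Agreement with scipy's parametrization (shape a, scale beta)
assert np.allclose(pdf, stats.invgamma.pdf(x, a=alpha, scale=beta))

# beta is a scale parameter: f(x; alpha, beta) == f(x/beta; alpha, 1) / beta
assert np.allclose(pdf, stats.invgamma.pdf(x / beta, a=alpha) / beta)
```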
The cumulative distribution function is the regularized gamma function
$$F(x;\alpha,\beta) = \frac{\Gamma\!\left(\alpha, \frac{\beta}{x}\right)}{\Gamma(\alpha)} = Q\!\left(\alpha, \frac{\beta}{x}\right),$$
where the numerator is the upper incomplete gamma function and the denominator is the gamma function. Many math packages allow direct computation of $Q$, the regularized gamma function.
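For example, the direct computation of $Q$ can be done with `scipy.special.gammaincc`, which implements the regularized upper incomplete gamma function; a minimal sketch with arbitrary parameter values:

```python
import numpy as np
from scipy import stats
from scipy.special import gammaincc   # regularized upper incomplete gamma Q(a, x)

alpha, beta = 3.0, 2.0
x = np.linspace(0.1, 10.0, 200)

# F(x; alpha, beta) = Q(alpha, beta / x)
cdf = gammaincc(alpha, beta / x)

assert np.allclose(cdf, stats.invgamma.cdf(x, a=alpha, scale=beta))
```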
Provided that $\alpha > n$, the $n$-th moment of the inverse gamma distribution is given by
$$\operatorname{E}[X^n] = \beta^n\,\frac{\Gamma(\alpha-n)}{\Gamma(\alpha)} = \frac{\beta^n}{(\alpha-1)\cdots(\alpha-n)}.$$
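A short check of the first two moments implied by this formula, assuming SciPy's `a`/`scale` parametrization and a shape parameter large enough for the moments to exist; the values are arbitrary:

```python
from scipy import stats

alpha, beta = 5.0, 2.0
X = stats.invgamma(a=alpha, scale=beta)

mean_formula = beta / (alpha - 1)                         # n = 1
second_moment = beta**2 / ((alpha - 1) * (alpha - 2))     # n = 2
var_formula = second_moment - mean_formula**2

print(mean_formula, X.mean())   # should agree
print(var_formula, X.var())     # should agree
```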
The inverse gamma distribution has characteristic function
$$\varphi(t) = \frac{2\left(-\mathrm{i}\beta t\right)^{\alpha/2}}{\Gamma(\alpha)}\,K_\alpha\!\left(\sqrt{-4\mathrm{i}\beta t}\right),$$
where $K_\alpha$ denotes the modified Bessel function of the second kind.
For $\alpha > 0$ and $\beta > 0$,
$$\operatorname{E}[\ln(X)] = \ln(\beta) - \psi(\alpha)$$
and
$$\operatorname{E}[X^{-1}] = \frac{\alpha}{\beta}.$$
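Both expectations are easy to confirm by simulation; the sketch below compares Monte Carlo averages with the closed forms, assuming SciPy and an arbitrary sample size and seed:

```python
import numpy as np
from scipy import stats
from scipy.special import digamma

rng = np.random.default_rng(0)
alpha, beta = 3.0, 2.0
samples = stats.invgamma.rvs(a=alpha, scale=beta, size=200_000, random_state=rng)

print(np.mean(np.log(samples)), np.log(beta) - digamma(alpha))  # E[ln X]
print(np.mean(1.0 / samples), alpha / beta)                     # E[1/X]
```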
The information entropy is
\begin{align}
\operatorname{H}(X) &= \operatorname{E}[-\ln(p(X))] \\
&= \operatorname{E}\left[-\alpha\ln(\beta) + \ln(\Gamma(\alpha)) + (\alpha+1)\ln(X) + \frac{\beta}{X}\right] \\
&= -\alpha\ln(\beta) + \ln(\Gamma(\alpha)) + (\alpha+1)\ln(\beta) - (\alpha+1)\psi(\alpha) + \alpha \\
&= \alpha + \ln(\beta\,\Gamma(\alpha)) - (\alpha+1)\psi(\alpha),
\end{align}
where $\psi(\alpha)$ is the digamma function.
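The closed form can be compared against SciPy's entropy computation for the same distribution; a minimal sketch with arbitrary parameter values:

```python
import numpy as np
from scipy import stats
from scipy.special import gammaln, digamma

alpha, beta = 3.0, 2.0

# H = alpha + ln(beta * Gamma(alpha)) - (alpha + 1) * psi(alpha)
H_formula = alpha + np.log(beta) + gammaln(alpha) - (alpha + 1) * digamma(alpha)

print(H_formula, stats.invgamma(a=alpha, scale=beta).entropy())  # should agree
```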
The Kullback–Leibler divergence of Inverse-Gamma($\alpha_p$, $\beta_p$) from Inverse-Gamma($\alpha_q$, $\beta_q$) is the same as the KL-divergence of Gamma($\alpha_p$, $\beta_p$) from Gamma($\alpha_q$, $\beta_q$):
$$D_{\mathrm{KL}}(\alpha_p,\beta_p;\,\alpha_q,\beta_q) = \operatorname{E}\left[\log\frac{\rho(X)}{\pi(X)}\right] = \operatorname{E}\left[\log\frac{\rho(1/Y)}{\pi(1/Y)}\right] = \operatorname{E}\left[\log\frac{\rho_G(Y)}{\pi_G(Y)}\right],$$
where $\rho, \pi$ are the probability density functions of the two inverse gamma distributions, $\rho_G, \pi_G$ are the probability density functions of the corresponding gamma distributions, and $Y$ is Gamma($\alpha_p$, $\beta_p$) distributed. Explicitly,
$$D_{\mathrm{KL}}(\alpha_p,\beta_p;\,\alpha_q,\beta_q) = (\alpha_p-\alpha_q)\psi(\alpha_p) - \log\Gamma(\alpha_p) + \log\Gamma(\alpha_q) + \alpha_q\left(\log\beta_p - \log\beta_q\right) + \alpha_p\,\frac{\beta_q-\beta_p}{\beta_p}.$$
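A small helper implementing this closed form with SciPy's special functions; the name `kl_invgamma` and the test values are illustrative only:

```python
import numpy as np
from scipy.special import gammaln, digamma

def kl_invgamma(alpha_p, beta_p, alpha_q, beta_q):
    """KL divergence of Inv-Gamma(alpha_p, beta_p) from Inv-Gamma(alpha_q, beta_q),
    using the closed form above (identical to the gamma-gamma divergence)."""
    return ((alpha_p - alpha_q) * digamma(alpha_p)
            - gammaln(alpha_p) + gammaln(alpha_q)
            + alpha_q * (np.log(beta_p) - np.log(beta_q))
            + alpha_p * (beta_q - beta_p) / beta_p)

print(kl_invgamma(3.0, 2.0, 3.0, 2.0))  # 0 when the two distributions coincide
print(kl_invgamma(3.0, 2.0, 4.0, 1.0))  # positive otherwise
```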
The inverse gamma distribution is related to other distributions as follows:

If $X \sim \text{Inv-Gamma}(\alpha, \beta)$, then $kX \sim \text{Inv-Gamma}(\alpha, k\beta)$ for $k > 0$.
If $X \sim \text{Inv-Gamma}(\alpha, \tfrac{1}{2})$, then $X \sim \text{Inv-}\chi^2(2\alpha)$ (inverse-chi-squared distribution).
If $X \sim \text{Inv-Gamma}(\tfrac{\alpha}{2}, \tfrac{1}{2})$, then $X \sim \text{Scaled Inv-}\chi^2(\alpha, \tfrac{1}{\alpha})$ (scaled inverse chi-squared distribution).
If $X \sim \text{Inv-Gamma}(\tfrac{1}{2}, \tfrac{c}{2})$, then $X \sim \text{Levy}(0, c)$ (Lévy distribution).
If $X \sim \text{Inv-Gamma}(1, c)$, then $\tfrac{1}{X} \sim \text{Exp}(c)$ (exponential distribution with rate $c$).
If $X \sim \text{Gamma}(\alpha, \beta)$ (gamma distribution with rate parameter $\beta$), then $\tfrac{1}{X} \sim \text{Inv-Gamma}(\alpha, \beta)$; see the derivation and the numerical sketch below.
If $X \sim \text{Gamma}(k, \theta)$ (gamma distribution with scale parameter $\theta$), then $1/X \sim \text{Inv-Gamma}(k, 1/\theta)$.
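The reciprocal relationship between the gamma and inverse gamma distributions can be checked by simulation; the sketch below assumes SciPy and uses the fact that a gamma rate of $\beta$ corresponds to `scale=1/beta` in NumPy/SciPy (sample size and seed are arbitrary):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha, beta = 3.0, 2.0

# X ~ Gamma(alpha, rate=beta), i.e. scale = 1/beta in scipy's parametrization
x = stats.gamma.rvs(a=alpha, scale=1.0 / beta, size=100_000, random_state=rng)

# 1/X should follow Inv-Gamma(alpha, beta) with scale parameter beta
ks = stats.kstest(1.0 / x, stats.invgamma(a=alpha, scale=beta).cdf)
print(ks.statistic, ks.pvalue)   # small statistic / non-small p-value expected
```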
Let $X \sim \text{Gamma}(\alpha, \beta)$, so that its density function is
$$f_X(x) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\,x^{\alpha-1} e^{-\beta x}, \qquad x > 0.$$
Note that here $\beta$ is a rate parameter.

Define the transformation $Y = g(X) = \tfrac{1}{X}$. Then the density of $Y$ is
\begin{align}
f_Y(y) &= f_X\left(g^{-1}(y)\right)\left|\frac{d}{dy}\,g^{-1}(y)\right| \\[6pt]
&= \frac{\beta^{\alpha}}{\Gamma(\alpha)}\left(\frac{1}{y}\right)^{\alpha-1}\exp\left(\frac{-\beta}{y}\right)\frac{1}{y^2} \\[6pt]
&= \frac{\beta^{\alpha}}{\Gamma(\alpha)}\left(\frac{1}{y}\right)^{\alpha+1}\exp\left(\frac{-\beta}{y}\right) \\[6pt]
&= \frac{\beta^{\alpha}}{\Gamma(\alpha)}\,y^{-\alpha-1}\exp\left(\frac{-\beta}{y}\right).
\end{align}
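The change of variables above can also be verified numerically; a minimal sketch assuming SciPy, where the gamma density with rate $\beta$ is obtained by setting `scale=1/beta`:

```python
import numpy as np
from scipy import stats

alpha, beta = 3.0, 2.0
y = np.linspace(0.1, 5.0, 200)

# f_Y(y) = f_X(1/y) * |d/dy (1/y)| = f_X(1/y) / y^2
f_y = stats.gamma.pdf(1.0 / y, a=alpha, scale=1.0 / beta) / y**2

assert np.allclose(f_y, stats.invgamma.pdf(y, a=alpha, scale=beta))
```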
Note that $\beta$ is the rate parameter from the perspective of the gamma distribution, but it is the scale parameter from the perspective of the inverse gamma distribution. This can be demonstrated by checking that $\beta$ satisfies the condition for being a scale parameter: rescaling by $\beta$ recovers the standard ($\beta = 1$) density,
\begin{align}
\beta\,f_\beta(\beta y) &= \beta\,\frac{\beta^{\alpha}}{\Gamma(\alpha)}\left(\beta y\right)^{-\alpha-1}\exp\left(\frac{-\beta}{\beta y}\right) \\[6pt]
&= \frac{1}{\Gamma(\alpha)}\,y^{-\alpha-1}\exp\left(\frac{-1}{y}\right) \\[6pt]
&= f_{\beta=1}(y).
\end{align}
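The same scale-parameter identity can be confirmed numerically; a short sketch assuming SciPy's `invgamma`, with `scale` playing the role of $\beta$ and an arbitrary grid of points:

```python
import numpy as np
from scipy import stats

alpha, beta = 3.0, 2.0
y = np.linspace(0.1, 5.0, 200)

# beta * f_beta(beta * y) should reduce to the density with beta = 1
lhs = beta * stats.invgamma.pdf(beta * y, a=alpha, scale=beta)
rhs = stats.invgamma.pdf(y, a=alpha)   # scale defaults to 1

assert np.allclose(lhs, rhs)
```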