In probability and statistics, the generalized beta (GB) distribution[1] is a continuous probability distribution with four shape parameters. It is customary to make the scale parameter explicit as a fifth parameter, while the location parameter is usually left implicit. The family includes more than thirty named distributions as limiting or special cases. It has been used in modeling income distributions and stock returns, as well as in regression analysis. The exponential generalized beta (EGB) distribution follows directly from the GB and generalizes other common distributions.
A generalized beta random variable, Y, is defined by the following probability density function:
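$$GB(y;a,b,c,p,q)=\frac{|a|\,y^{ap-1}\left(1-(1-c)(y/b)^{a}\right)^{q-1}}{b^{ap}B(p,q)\left(1+c(y/b)^{a}\right)^{p+q}}\quad\text{for }0<y^{a}<\frac{b^{a}}{1-c},$$
and zero otherwise. Here the parameters satisfy $a\ne 0$, $0\le c\le 1$, and $b$, $p$, and $q$ are positive. The function $B(p,q)$ is the beta function. The parameter $b$ is the scale parameter and can be set to $1$ without loss of generality, but it is usually made explicit as above, while the location parameter is left implicit and set to $0$.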
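As a minimal sketch (not taken from the cited references), the density above can be evaluated with the Python standard library alone. The function names `log_beta`, `gb_pdf`, and `midpoint_integral` and the parameter values are illustrative assumptions. The check confirms numerically that the density integrates to approximately one on its support when $a>0$ and $c<1$.

```python
import math


def log_beta(p, q):
    """Natural log of the beta function B(p, q)."""
    return math.lgamma(p) + math.lgamma(q) - math.lgamma(p + q)


def gb_pdf(y, a, b, c, p, q):
    """GB density at y; zero outside the support 0 < y^a < b^a / (1 - c)."""
    if y <= 0:
        return 0.0
    t = (y / b) ** a
    if c < 1 and (1 - c) * t >= 1:
        return 0.0
    log_num = (math.log(abs(a)) + (a * p - 1) * math.log(y)
               + (q - 1) * math.log1p(-(1 - c) * t))
    log_den = a * p * math.log(b) + log_beta(p, q) + (p + q) * math.log1p(c * t)
    return math.exp(log_num - log_den)


def midpoint_integral(f, lo, hi, n=20000):
    """Composite midpoint rule; adequate for smooth integrands."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))


# Illustrative parameter values (an assumption, chosen so the density is smooth).
a, b, c, p, q = 2.0, 1.5, 0.5, 2.0, 3.0
upper = b * (1 - c) ** (-1 / a)  # upper end of the support for a > 0, c < 1
total = midpoint_integral(lambda y: gb_pdf(y, a, b, c, p, q), 0.0, upper)
print(total)  # should be close to 1
```

Working on the log scale avoids overflow in the beta function for large $p$ or $q$; the sketch does not guard against overflow of $(y/b)^{a}$ for extreme arguments.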
It can be shown that the hth moment can be expressed as follows:
$$\operatorname{E}_{GB}(Y^{h})=\frac{b^{h}B(p+h/a,q)}{B(p,q)}\,{}_{2}F_{1}\begin{bmatrix} p+h/a,\;h/a;\;c\\ p+q+h/a; \end{bmatrix},$$
where ${}_{2}F_{1}$ denotes the hypergeometric series.
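The moment formula can be checked numerically. The sketch below sums the hypergeometric series directly (it converges for $c<1$) and compares the result with a midpoint-rule integral of $y^{h}\,GB(y)$. All function names and parameter values are illustrative assumptions, and the numerical integration assumes $a>0$ and $c<1$.

```python
import math


def log_beta(p, q):
    return math.lgamma(p) + math.lgamma(q) - math.lgamma(p + q)


def hyp2f1(A, B, C, z, tol=1e-15, max_terms=10000):
    """Gauss series sum_k (A)_k (B)_k / (C)_k * z^k / k!, valid for |z| < 1."""
    term = 1.0
    total = 1.0
    for k in range(max_terms):
        term *= (A + k) * (B + k) / ((C + k) * (k + 1)) * z
        total += term
        if abs(term) <= tol * abs(total):
            break
    return total


def gb_moment(h, a, b, c, p, q):
    """h-th GB moment from the closed form; assumes p + h/a > 0 and c < 1."""
    s = h / a
    ratio = math.exp(log_beta(p + s, q) - log_beta(p, q))
    return b ** h * ratio * hyp2f1(p + s, s, p + q + s, c)


def gb_moment_numeric(h, a, b, c, p, q, n=20000):
    """Same moment by midpoint integration of y^h * GB(y), for a > 0 and c < 1."""
    upper = b * (1 - c) ** (-1 / a)
    step = upper / n
    total = 0.0
    for i in range(n):
        y = (i + 0.5) * step
        t = (y / b) ** a
        log_pdf = (math.log(a) + (a * p - 1) * math.log(y)
                   + (q - 1) * math.log1p(-(1 - c) * t)
                   - a * p * math.log(b) - log_beta(p, q)
                   - (p + q) * math.log1p(c * t))
        total += y ** h * math.exp(log_pdf)
    return total * step


# Illustrative parameter values (an assumption).
a, b, c, p, q = 2.0, 1.5, 0.5, 2.0, 3.0
closed = gb_moment(1, a, b, c, p, q)
numeric = gb_moment_numeric(1, a, b, c, p, q)
print(closed, numeric)  # the two values should agree closely
```

For $c=0$ the series equals one, so the same function also returns the GB1 moments.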
The generalized beta encompasses many distributions as limiting or special cases. These are depicted in the GB distribution tree shown above. Listed below are its three direct descendants, or sub-families.
The generalized beta of the first kind is defined by the following pdf:
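$$GB1(y;a,b,p,q)=\frac{|a|\,y^{ap-1}\left(1-(y/b)^{a}\right)^{q-1}}{b^{ap}B(p,q)}$$
for $0<y^{a}<b^{a}$, where $b$, $p$, and $q$ are positive. It is easily verified that
$$GB1(y;a,b,p,q)=GB(y;a,b,c=0,p,q).$$
The moments of the GB1 are given by
$$\operatorname{E}_{GB1}(Y^{h})=\frac{b^{h}B(p+h/a,q)}{B(p,q)}.$$
The GB1 includes the beta of the first kind (B1), the generalized gamma (GG), and the Pareto distribution as special or limiting cases:
$$B1(y;b,p,q)=GB1(y;a=1,b,p,q),$$
$$GG(y;a,\beta,p)=\lim_{q\to\infty}GB1(y;a,b=q^{1/a}\beta,p,q),$$
$$PARETO(y;b,p)=GB1(y;a=-1,b,p,q=1).$$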
The GB2 is defined by the following pdf:
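$$GB2(y;a,b,p,q)=\frac{|a|\,y^{ap-1}}{b^{ap}B(p,q)\left(1+(y/b)^{a}\right)^{p+q}}$$
for $0<y<\infty$ and zero otherwise. One can verify that
$$GB2(y;a,b,p,q)=GB(y;a,b,c=1,p,q).$$
The moments of the GB2 are given by
$$\operatorname{E}_{GB2}(Y^{h})=\frac{b^{h}B(p+h/a,q-h/a)}{B(p,q)}.$$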
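As a worked check, the GB2 moments can be recovered from the general GB moment formula by setting $c=1$ and applying Gauss's summation theorem for ${}_{2}F_{1}$ at unit argument. The shorthand $A$, $B_{0}$, $C$ for the hypergeometric parameters is introduced here only for illustration, and the result requires $-p<h/a<q$.

```latex
\begin{align}
{}_{2}F_{1}(A,B_{0};C;1) &= \frac{\Gamma(C)\,\Gamma(C-A-B_{0})}{\Gamma(C-A)\,\Gamma(C-B_{0})},
  \qquad C-A-B_{0}>0, \\
A = p+\tfrac{h}{a},\quad B_{0} = \tfrac{h}{a},\quad C = p+q+\tfrac{h}{a}
  &\;\Longrightarrow\; C-A-B_{0} = q-\tfrac{h}{a},\quad C-A = q,\quad C-B_{0} = p+q, \\
\operatorname{E}_{GB}(Y^{h})\big|_{c=1}
  &= b^{h}\,\frac{\Gamma(p+h/a)\,\Gamma(q)}{\Gamma(p+q+h/a)}\cdot
     \frac{\Gamma(p+q)}{\Gamma(p)\,\Gamma(q)}\cdot
     \frac{\Gamma(p+q+h/a)\,\Gamma(q-h/a)}{\Gamma(q)\,\Gamma(p+q)} \\
  &= b^{h}\,\frac{\Gamma(p+h/a)\,\Gamma(q-h/a)}{\Gamma(p)\,\Gamma(q)}
   = \frac{b^{h}B(p+h/a,\,q-h/a)}{B(p,q)}.
\end{align}
```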
The GB2 is also known as the generalized beta prime (Patil, Boswell, and Ratnaparkhi, 1984),[2] the transformed beta (Venter, 1983),[3] and the generalized F (Kalbfleisch and Prentice, 1980),[4] and is a special case (μ≡0) of the Feller-Pareto distribution (Arnold, 1983).[5] The GB2 nests common distributions such as the generalized gamma (GG), Burr type 3, Burr type 12, Dagum, lognormal, Weibull, gamma, Lomax, F statistic, Fisk or Rayleigh, chi-square, half-normal, half-Student's t, exponential, asymmetric log-Laplace, log-Laplace, power function, and the log-logistic.[6]
The beta family of distributions (B) is defined by:[1]
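$$B(y;b,c,p,q)=\frac{y^{p-1}\left(1-(1-c)(y/b)\right)^{q-1}}{b^{p}B(p,q)\left(1+c(y/b)\right)^{p+q}}$$
for $0<y<b/(1-c)$ and zero otherwise. Its relation to the GB is
$$B(y;b,c,p,q)=GB(y;a=1,b,c,p,q).$$
Setting $c=0$ and $b=1$ yields the standard two-parameter beta distribution.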
The generalized gamma distribution (GG) is a limiting case of the GB2. Its PDF is defined by:[8]
$$GG(y;a,\beta,p)=\lim_{q\to\infty}GB2(y;a,b=q^{1/a}\beta,p,q)=\frac{|a|\,y^{ap-1}e^{-(y/\beta)^{a}}}{\beta^{ap}\Gamma(p)},$$
with the $h$th moments given by
$$\operatorname{E}(Y_{GG}^{h})=\frac{\beta^{h}\Gamma(p+h/a)}{\Gamma(p)}.$$
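The limit can be illustrated numerically: with $b=q^{1/a}\beta$, the GB2 density approaches the GG density as $q$ grows. The following is a rough sketch using only the standard library; the function names, the evaluation point, and the parameter values are assumptions made for illustration.

```python
import math


def gb2_log_pdf(y, a, b, p, q):
    """Log of the GB2 density at y > 0."""
    log_beta = math.lgamma(p) + math.lgamma(q) - math.lgamma(p + q)
    return (math.log(abs(a)) + (a * p - 1) * math.log(y)
            - a * p * math.log(b) - log_beta
            - (p + q) * math.log1p((y / b) ** a))


def gg_log_pdf(y, a, beta, p):
    """Log of the generalized gamma density at y > 0."""
    return (math.log(abs(a)) + (a * p - 1) * math.log(y)
            - (y / beta) ** a - a * p * math.log(beta) - math.lgamma(p))


def relative_error(q, y, a, beta, p):
    """Relative gap between GB2 with b = q^(1/a) * beta and the GG limit."""
    b = q ** (1 / a) * beta
    approx = math.exp(gb2_log_pdf(y, a, b, p, q))
    target = math.exp(gg_log_pdf(y, a, beta, p))
    return abs(approx - target) / target


# Illustrative values (an assumption).
a, beta, p, y = 1.5, 2.0, 0.8, 1.7
for q in (1e1, 1e3, 1e6):
    print(q, relative_error(q, y, a, beta, p))  # the gap shrinks as q grows
```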
As noted earlier, the GB distribution family tree visually depicts the special and limiting cases (see McDonald and Xu (1995)).
The Pareto (PA) distribution is the following limiting case of the generalized gamma:
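$$PA(y;\beta,\theta)=\lim_{a\to-\infty}GG(y;a,\beta,p=-\theta/a)=\lim_{a\to-\infty}\left(\frac{\theta y^{-\theta-1}e^{-(y/\beta)^{a}}}{\beta^{-\theta}(-\theta/a)\Gamma(-\theta/a)}\right)=\lim_{a\to-\infty}\left(\frac{\theta y^{-\theta-1}e^{-(y/\beta)^{a}}}{\beta^{-\theta}\Gamma(1-\theta/a)}\right)=\frac{\theta y^{-\theta-1}}{\beta^{-\theta}}$$
for $\beta<y$ and $\theta>0$.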
The power (P) distribution is the following limiting case of the generalized gamma:
$$P(y;\beta,\theta)=\lim_{a\to\infty}GG(y;a=\theta/p,\beta,p)=\lim_{a\to\infty}\frac{|\theta/p|\,y^{\theta-1}e^{-(y/\beta)^{a}}}{\beta^{\theta}\Gamma(p)}=\lim_{a\to\infty}\frac{\theta y^{\theta-1}}{p\Gamma(p)\beta^{\theta}}e^{-(y/\beta)^{a}}$$
$$=\lim_{a\to\infty}\frac{\theta y^{\theta-1}}{\Gamma(p+1)\beta^{\theta}}e^{-(y/\beta)^{a}}=\lim_{a\to\infty}\frac{\theta y^{\theta-1}}{\Gamma(\theta/a+1)\beta^{\theta}}e^{-(y/\beta)^{a}}=\frac{\theta y^{\theta-1}}{\beta^{\theta}},$$
with $0\leq y\leq\beta$ and $\theta>0$.
The asymmetric log-Laplace distribution (also referred to as the double Pareto distribution [9]) is defined by:[10]
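$$ALL(y;b,\lambda_{1},\lambda_{2})=\lim_{a\to\infty}GB2(y;a,b,p=\lambda_{1}/a,q=\lambda_{2}/a)=\frac{\lambda_{1}\lambda_{2}}{y(\lambda_{1}+\lambda_{2})}\begin{cases}\left(\dfrac{y}{b}\right)^{\lambda_{1}}&\text{for }0<y<b\\[1ex]\left(\dfrac{b}{y}\right)^{\lambda_{2}}&\text{for }y\ge b\end{cases}$$
where the $h$th moments are given by
$$\operatorname{E}(Y_{ALL}^{h})=\frac{b^{h}\lambda_{1}\lambda_{2}}{(\lambda_{1}+h)(\lambda_{2}-h)}.$$
When $\lambda_{1}=\lambda_{2}$, this is equivalent to the log-Laplace distribution.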
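As a worked check (a sketch added for illustration, not taken from the cited sources), the moment formula for the asymmetric log-Laplace follows by integrating the two branches of the density separately. The integrals converge for $-\lambda_{1}<h<\lambda_{2}$, and setting $h=0$ confirms that the density integrates to one.

```latex
\begin{align}
\operatorname{E}(Y_{ALL}^{h})
  &= \frac{\lambda_{1}\lambda_{2}}{\lambda_{1}+\lambda_{2}}
     \left[\, b^{-\lambda_{1}}\int_{0}^{b} y^{h+\lambda_{1}-1}\,dy
          \;+\; b^{\lambda_{2}}\int_{b}^{\infty} y^{h-\lambda_{2}-1}\,dy \right] \\
  &= \frac{\lambda_{1}\lambda_{2}}{\lambda_{1}+\lambda_{2}}
     \left[\, \frac{b^{h}}{\lambda_{1}+h} + \frac{b^{h}}{\lambda_{2}-h} \right] \\
  &= \frac{\lambda_{1}\lambda_{2}}{\lambda_{1}+\lambda_{2}}\cdot
     \frac{b^{h}(\lambda_{1}+\lambda_{2})}{(\lambda_{1}+h)(\lambda_{2}-h)}
   = \frac{b^{h}\lambda_{1}\lambda_{2}}{(\lambda_{1}+h)(\lambda_{2}-h)}.
\end{align}
```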
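Letting $Y\sim GB(y;a,b,c,p,q)$, the random variable $Z=\ln(Y)$, with the re-parametrization $\delta=\ln(b)$ and $\sigma=1/a$, is distributed as an exponential generalized beta (EGB) with the following pdf:
$$EGB(z;\delta,\sigma,c,p,q)=\frac{e^{p(z-\delta)/\sigma}\left(1-(1-c)e^{(z-\delta)/\sigma}\right)^{q-1}}{|\sigma|B(p,q)\left(1+ce^{(z-\delta)/\sigma}\right)^{p+q}}$$
for $-\infty<\frac{z-\delta}{\sigma}<\ln\left(\frac{1}{1-c}\right)$, and zero otherwise. The parameter $\delta=\ln(b)$ is the location parameter of the EGB (while $b$ is the scale parameter of the GB), and $\sigma=1/a$ is the scale parameter of the EGB (while $a$ is a shape parameter of the GB).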
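The EGB density is the GB density after the change of variables $z=\ln y$, so it should satisfy $EGB(z)=GB(e^{z})\,e^{z}$ with $\delta=\ln b$ and $\sigma=1/a$. The sketch below checks this identity at one point; the function names and numeric values are illustrative assumptions.

```python
import math


def log_beta(p, q):
    return math.lgamma(p) + math.lgamma(q) - math.lgamma(p + q)


def gb_pdf(y, a, b, c, p, q):
    """GB density; zero outside 0 < y^a < b^a / (1 - c)."""
    if y <= 0:
        return 0.0
    t = (y / b) ** a
    if c < 1 and (1 - c) * t >= 1:
        return 0.0
    log_num = (math.log(abs(a)) + (a * p - 1) * math.log(y)
               + (q - 1) * math.log1p(-(1 - c) * t))
    log_den = a * p * math.log(b) + log_beta(p, q) + (p + q) * math.log1p(c * t)
    return math.exp(log_num - log_den)


def egb_pdf(z, delta, sigma, c, p, q):
    """EGB density; zero outside (z - delta) / sigma < ln(1 / (1 - c))."""
    u = (z - delta) / sigma
    e = math.exp(u)
    if c < 1 and (1 - c) * e >= 1:
        return 0.0
    log_num = p * u + (q - 1) * math.log1p(-(1 - c) * e)
    log_den = math.log(abs(sigma)) + log_beta(p, q) + (p + q) * math.log1p(c * e)
    return math.exp(log_num - log_den)


# Illustrative values (an assumption).
a, b, c, p, q = 2.0, 1.5, 0.5, 2.0, 3.0
z = 0.2
lhs = egb_pdf(z, math.log(b), 1 / a, c, p, q)
rhs = gb_pdf(math.exp(z), a, b, c, p, q) * math.exp(z)  # Jacobian dy/dz = e^z
print(lhs, rhs)  # the two values should agree
```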
Included is a figure showing the relationship between the EGB and its special and limiting cases.[11]
Using similar notation as above, the moment-generating function of the EGB can be expressed as follows:
$$M_{EGB}(Z)=\frac{e^{\delta t}B(p+t\sigma,q)}{B(p,q)}\,{}_{2}F_{1}\begin{bmatrix} p+t\sigma,\;t\sigma;\;c\\ p+q+t\sigma; \end{bmatrix}.$$
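A multivariate generalized beta pdf extends the univariate distributions listed above. For $n$ variables $y=(y_{1},...,y_{n})$, define $1\times n$ parameter vectors by $a=(a_{1},...,a_{n})$, $b=(b_{1},...,b_{n})$, $c=(c_{1},...,c_{n})$, and $p=(p_{1},...,p_{n})$, where each $b_{i}$ and $p_{i}$ is positive and $0\le c_{i}\le 1$. The parameter $q$ is assumed to be positive, and the function $B(p_{1},...,p_{n},q)$ is defined as
$$B(p_{1},...,p_{n},q)=\frac{\Gamma(p_{1})\cdots\Gamma(p_{n})\Gamma(q)}{\Gamma(\bar{p}+q)}\quad\text{for }\bar{p}=\sum_{i=1}^{n}p_{i}.$$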
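The pdf of the multivariate generalized beta ($MGB$) may be written as follows:
$$MGB(y;a,b,p,q,c)=\frac{\left(\prod_{i=1}^{n}|a_{i}|\,y_{i}^{a_{i}p_{i}-1}\right)\left(1-\sum_{i=1}^{n}(1-c_{i})\left(\frac{y_{i}}{b_{i}}\right)^{a_{i}}\right)^{q-1}}{\left(\prod_{i=1}^{n}b_{i}^{a_{i}p_{i}}\right)B(p_{1},...,p_{n},q)\left(1+\sum_{i=1}^{n}c_{i}\left(\frac{y_{i}}{b_{i}}\right)^{a_{i}}\right)^{\bar{p}+q}}$$
where $0<\sum_{i=1}^{n}(1-c_{i})\left(\frac{y_{i}}{b_{i}}\right)^{a_{i}}<1$ for $0\le c_{i}<1$, and $0<y_{i}$ when $c_{i}=1$.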
Like the univariate generalized beta distribution, the multivariate generalized beta includes several distributions in its family as special cases. By imposing certain constraints on the parameter vectors, the following distributions can be easily derived.[12]
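When each $c_{i}$ is equal to 0, the MGB function simplifies to the multivariate generalized beta of the first kind (MGB1), which is defined by:
$$MGB1(y;a,b,p,q)=\frac{\left(\prod_{i=1}^{n}|a_{i}|\,y_{i}^{a_{i}p_{i}-1}\right)\left(1-\sum_{i=1}^{n}\left(\frac{y_{i}}{b_{i}}\right)^{a_{i}}\right)^{q-1}}{\left(\prod_{i=1}^{n}b_{i}^{a_{i}p_{i}}\right)B(p_{1},...,p_{n},q)}$$
where $0<\sum_{i=1}^{n}\left(\frac{y_{i}}{b_{i}}\right)^{a_{i}}<1$.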
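In the case where each $c_{i}$ is equal to 1, the MGB simplifies to the multivariate generalized beta of the second kind (MGB2), with the pdf defined below:
$$MGB2(y;a,b,p,q)=\frac{\prod_{i=1}^{n}|a_{i}|\,y_{i}^{a_{i}p_{i}-1}}{\left(\prod_{i=1}^{n}b_{i}^{a_{i}p_{i}}\right)B(p_{1},...,p_{n},q)\left(1+\sum_{i=1}^{n}\left(\frac{y_{i}}{b_{i}}\right)^{a_{i}}\right)^{\bar{p}+q}}$$
when $0<y_{i}$ for all $y_{i}$.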
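The multivariate generalized gamma (MGG) pdf can be derived from the MGB pdf by substituting $b_{i}=\beta_{i}q^{1/a_{i}}$ and taking the limit as $q\to\infty$, yielding
$$MGG(y;a,\beta,p)=\left(\prod_{i=1}^{n}\frac{|a_{i}|\,y_{i}^{a_{i}p_{i}-1}}{\beta_{i}^{a_{i}p_{i}}\Gamma(p_{i})}\right)e^{-\sum_{i=1}^{n}\left(\frac{y_{i}}{\beta_{i}}\right)^{a_{i}}}=\prod_{i=1}^{n}GG(y_{i};a_{i},\beta_{i},p_{i}),$$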
which is the product of independently but not necessarily identically distributed generalized gamma random variables.
Similar pdfs can be constructed for other variables in the family tree shown above, simply by placing an M in front of each pdf name and finding the appropriate limiting and special cases of the MGB as indicated by the constraints and limits of the univariate distribution. Additional multivariate pdfs in the literature include the Dirichlet distribution (standard form) given by $MGB1(y;a=1,b=1,p,q)$, the multivariate inverted beta and inverted Dirichlet (Dirichlet type 2) distribution given by $MGB2(y;a=1,b=1,p,q)$, and the multivariate Burr distribution given by $MGB2(y;a,b,p,q=1)$.
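The Dirichlet special case can be checked numerically. The sketch below assumes the MGB1 density has the form $\left(\prod_{i}|a_{i}|y_{i}^{a_{i}p_{i}-1}\right)\left(1-\sum_{i}(y_{i}/b_{i})^{a_{i}}\right)^{q-1}\big/\left(\left(\prod_{i}b_{i}^{a_{i}p_{i}}\right)B(p_{1},...,p_{n},q)\right)$ and compares it, at $a=1$ and $b=1$, with the standard Dirichlet density on the $(n+1)$-component simplex with parameters $(p_{1},...,p_{n},q)$. Function names and numeric values are illustrative assumptions.

```python
import math


def mgb1_pdf(y, a, b, p, q):
    """MGB1 density under the form assumed in the accompanying text."""
    if min(y) <= 0:
        return 0.0
    s = sum((yi / bi) ** ai for yi, ai, bi in zip(y, a, b))
    if s >= 1:
        return 0.0
    log_B = (sum(math.lgamma(pi) for pi in p) + math.lgamma(q)
             - math.lgamma(sum(p) + q))
    log_num = sum(math.log(abs(ai)) + (ai * pi - 1) * math.log(yi)
                  for yi, ai, pi in zip(y, a, p)) + (q - 1) * math.log1p(-s)
    log_den = sum(ai * pi * math.log(bi) for ai, bi, pi in zip(a, b, p)) + log_B
    return math.exp(log_num - log_den)


def dirichlet_pdf(x, alpha):
    """Standard Dirichlet density for a point x on the simplex (sum(x) == 1)."""
    log_norm = math.lgamma(sum(alpha)) - sum(math.lgamma(al) for al in alpha)
    return math.exp(log_norm + sum((al - 1) * math.log(xi)
                                   for xi, al in zip(x, alpha)))


# Illustrative point and parameters (an assumption).
y = (0.2, 0.3)
p, q = (2.0, 3.0), 4.0
via_mgb1 = mgb1_pdf(y, (1.0, 1.0), (1.0, 1.0), p, q)
via_dirichlet = dirichlet_pdf(y + (1 - sum(y),), p + (q,))
print(via_mgb1, via_dirichlet)  # the two values should agree
```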
The marginal density functions of the MGB1 and MGB2, respectively, are the generalized beta distributions of the first and second kind, and are given as follows:
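$$GB1(y_{i};a_{i},b_{i},p_{i},\bar{p}-p_{i}+q)=\frac{|a_{i}|\,y_{i}^{a_{i}p_{i}-1}\left(1-\left(\frac{y_{i}}{b_{i}}\right)^{a_{i}}\right)^{\bar{p}-p_{i}+q-1}}{b_{i}^{a_{i}p_{i}}B(p_{i},\bar{p}-p_{i}+q)}$$
$$GB2(y_{i};a_{i},b_{i},p_{i},q)=\frac{|a_{i}|\,y_{i}^{a_{i}p_{i}-1}}{b_{i}^{a_{i}p_{i}}B(p_{i},q)\left(1+\left(\frac{y_{i}}{b_{i}}\right)^{a_{i}}\right)^{p_{i}+q}}$$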
The flexibility provided by the GB family is used in modeling the distribution of:
Applications involving members of the EGB family include:[1] [6]
The GB2 and several of its special and limiting cases have been widely used as models for the distribution of income. For some early examples see Thurow (1970),[13] Dagum (1977),[14] Singh and Maddala (1976),[15] and McDonald (1984).[6] Maximum likelihood estimations using individual, grouped, or top-coded data are easily performed with these distributions.
Measures of inequality, such as the Gini index (G), Pietra index (P), and Theil index (T) can be expressed in terms of the distributional parameters, as given by McDonald and Ransom (2008):[16]
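\begin{align}G&=\left(\frac{1}{2\mu}\right)\operatorname{E}(|Y-X|),\\ P&=\left(\frac{1}{2\mu}\right)\operatorname{E}(|Y-\mu|),\\ T&=\operatorname{E}\left(\frac{Y}{\mu}\ln\frac{Y}{\mu}\right),\end{align}
where $\mu=\operatorname{E}(Y)$ and $X$ is an independent copy of $Y$.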
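As a rough numerical illustration, the Gini index of a non-negative variable with mean $\mu$ can be computed from its survival function $S(y)=1-F(y)$, using the standard identity $G=\operatorname{E}|Y-X|/(2\mu)=1-\int_{0}^{\infty}S(y)^{2}\,dy\big/\int_{0}^{\infty}S(y)\,dy$. The sketch applies it to the Weibull distribution (one of the special cases listed earlier), for which the integrals give $G=1-2^{-1/a}$. The function names, truncation points, and grid size are assumptions.

```python
import math


def gini_from_survival(S, upper, n=100000):
    """G = 1 - int S(y)^2 dy / int S(y) dy on [0, upper], by the midpoint rule.

    `upper` must be large enough that S(upper) is negligible.
    """
    h = upper / n
    num = 0.0
    den = 0.0
    for i in range(n):
        s = S((i + 0.5) * h)
        num += s * s
        den += s
    return 1.0 - num / den


def weibull_survival(a, beta):
    """Survival function of the Weibull, i.e. the GG with p = 1 and a > 0."""
    return lambda y: math.exp(-(y / beta) ** a)


g_weibull = gini_from_survival(weibull_survival(2.0, 1.0), upper=10.0)
g_exponential = gini_from_survival(weibull_survival(1.0, 1.0), upper=50.0)
print(g_weibull, 1 - 2 ** -0.5)
print(g_exponential, 0.5)
```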
The hazard function, h(s), where f(s) is a pdf and F(s) the corresponding cdf, is defined by
$$h(s)=\frac{f(s)}{1-F(s)}.$$
Hazard functions are useful in many applications, such as modeling unemployment duration, the failure time of products or life expectancy. Taking a specific example, if s denotes the length of life, then h(s) is the rate of death at age s, given that an individual has lived up to age s. The shape of the hazard function for human mortality data might appear as follows: decreasing mortality in the first few months of life, then a period of relatively constant mortality and finally an increasing probability of death at older ages.
Special cases of the generalized beta distribution offer more flexibility in modeling the shape of the hazard function, which can call for "∪" or "∩" shapes or strictly increasing (denoted by I) or decreasing (denoted by D) curves. The generalized gamma is "∪"-shaped for a>1 and p<1/a, "∩"-shaped for a<1 and p>1/a, I-shaped for a>1 and p>1/a, and D-shaped for a<1 and p<1/a.[17] This is summarized in the figure below.[18] [19]
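The hazard of the generalized gamma can be evaluated directly from $h(s)=f(s)/(1-F(s))$. The sketch below assumes $a>0$, in which case the cdf is the regularized lower incomplete gamma function evaluated at $(s/\beta)^{a}$, and computes that function from its power series, which is adequate for moderate arguments only. The function names and parameter values are illustrative assumptions. With $p=1$ the GG is a Weibull, whose hazard $(a/\beta)(s/\beta)^{a-1}$ gives a closed-form check of the increasing and decreasing cases.

```python
import math


def reg_lower_gamma(p, x, tol=1e-15, max_terms=1000):
    """Regularized lower incomplete gamma P(p, x) by its power series."""
    if x <= 0:
        return 0.0
    term = 1.0
    total = 1.0
    for k in range(1, max_terms):
        term *= x / (p + k)
        total += term
        if term <= tol * total:
            break
    return math.exp(p * math.log(x) - x - math.lgamma(p + 1)) * total


def gg_hazard(s, a, beta, p):
    """Hazard f(s) / (1 - F(s)) of the generalized gamma, assuming a > 0, s > 0."""
    x = (s / beta) ** a
    log_pdf = (math.log(a) + (a * p - 1) * math.log(s) - x
               - a * p * math.log(beta) - math.lgamma(p))
    return math.exp(log_pdf) / (1.0 - reg_lower_gamma(p, x))


# a > 1, p > 1/a: increasing hazard (Weibull with a = 2 has h(s) = 2s).
print([gg_hazard(s, 2.0, 1.0, 1.0) for s in (0.5, 1.0, 1.5)])
# a < 1, p < 1/a: decreasing hazard (Weibull with a = 0.5).
print([gg_hazard(s, 0.5, 1.0, 1.0) for s in (0.5, 1.0, 4.0)])
# a > 1, p < 1/a: the hazard first falls and then rises.
print([gg_hazard(s, 2.0, 1.0, 0.25) for s in (0.1, 0.5, 1.0)])
```

For large $(s/\beta)^{a}$ the subtraction $1-F(s)$ loses precision, so a production implementation would evaluate the upper incomplete gamma function directly instead.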