In probability theory and statistics, the negative binomial distribution is a discrete probability distribution that models the number of failures in a sequence of independent and identically distributed Bernoulli trials before a specified (non-random) number of successes (denoted r) occurs. For example, with r = 3, it describes the probability distribution of the number of failures observed before the third success.
An alternative formulation is to model the number of total trials (instead of the number of failures). In fact, for a specified (non-random) number of successes (r), the number of failures (n − r) is random because the number of total trials (n) is random. For example, we could use the negative binomial distribution to model the number of days n (random) a certain machine works (specified by r) before it breaks down.
The Pascal distribution (after Blaise Pascal) and Polya distribution (for George Pólya) are special cases of the negative binomial distribution. A convention among engineers, climatologists, and others is to use "negative binomial" or "Pascal" for the case of an integer-valued stopping-time parameter (r), and to use "Polya" for the real-valued case.
For occurrences of associated discrete events, like tornado outbreaks, the Polya distributions can be used to give more accurate models than the Poisson distribution by allowing the mean and variance to be different, unlike the Poisson. The negative binomial distribution has a variance $\mu/p$, with the distribution becoming identical to the Poisson in the limit $p \to 1$ for a given mean $\mu$ (i.e., as failures become increasingly rare).
The term "negative binomial" is likely due to the fact that a certain binomial coefficient that appears in the formula for the probability mass function of the distribution can be written more simply with negative numbers.[3]
Imagine a sequence of independent Bernoulli trials: each trial has two potential outcomes called "success" and "failure." In each trial the probability of success is $p$ and the probability of failure is $1-p$. We observe this sequence until a predefined number $r$ of successes occurs. Then the random number of observed failures, $X$, follows the negative binomial (or Pascal) distribution:
$$X \sim \operatorname{NB}(r, p).$$
The probability mass function of the negative binomial distribution is
$$f(k; r, p) \equiv \Pr(X = k) = \binom{k+r-1}{k}\,(1-p)^k\,p^r,$$
where r is the number of successes, k is the number of failures, and p is the probability of success on each trial.
Here, the quantity in parentheses is the binomial coefficient, and is equal to
$$\binom{k+r-1}{k} = \frac{(k+r-1)!}{(r-1)!\,k!} = \frac{(k+r-1)(k+r-2)\dotsm r}{k!} = \frac{\Gamma(k+r)}{k!\,\Gamma(r)}.$$
There are k failures chosen from k + r − 1 trials rather than k + r because the last of the k + r trials is by definition a success.
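The probability mass function above is straightforward to evaluate directly. The following is a minimal sketch (not from the original text) that computes it from the formula and, assuming SciPy is available, cross-checks against scipy.stats.nbinom, which uses the same failures-before-the-r-th-success convention.

```python
# Evaluate f(k; r, p) = C(k+r-1, k) * (1-p)^k * p^r directly and compare
# with scipy.stats.nbinom (same convention: failures before the r-th success).
from math import comb

from scipy.stats import nbinom

def nb_pmf(k: int, r: int, p: float) -> float:
    """Probability of exactly k failures before the r-th success."""
    return comb(k + r - 1, k) * (1 - p) ** k * p ** r

r, p = 3, 0.6
for k in range(5):
    print(f"k={k}: direct={nb_pmf(k, r, p):.6f}, scipy={nbinom.pmf(k, r, p):.6f}")
```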
This quantity can alternatively be written in the following manner, explaining the name "negative binomial":
$$\binom{k+r-1}{k} = \frac{(k+r-1)\dotsm r}{k!} = (-1)^k\,\frac{\overbrace{(-r)(-r-1)(-r-2)\dotsm(-r-k+1)}^{k\ \text{factors}}}{k!} = (-1)^k\binom{-r}{k}.$$
Note that by the last expression and the binomial series, for every $0 \le p < 1$ and $q = 1-p$,
$$p^{-r} = (1-q)^{-r} = \sum_{k=0}^{\infty}\binom{-r}{k}(-q)^k = \sum_{k=0}^{\infty}\binom{k+r-1}{k}\,q^k,$$
hence the terms of the probability mass function indeed add up to one as below.
$$\sum_{k=0}^{\infty}\binom{k+r-1}{k}\,(1-p)^k\,p^r = p^{-r}\,p^r = 1.$$
To understand the above definition of the probability mass function, note that the probability of every specific sequence of r successes and k failures is $p^r(1-p)^k$, because the outcomes of the k + r trials are independent. Since the rth success always comes last, it remains to choose the k trials with failures out of the remaining k + r − 1 trials. The above binomial coefficient, due to its combinatorial interpretation, gives precisely the number of all these sequences of length k + r − 1.
The cumulative distribution function can be expressed in terms of the regularized incomplete beta function:
$$F(k; r, p) \equiv \Pr(X \le k) = I_p(r, k+1),$$
where $I_p$ denotes the regularized incomplete beta function. (In the mean parameterization, $p = r/(r+\mu)$, where $\mu$ is the mean of the distribution.)
It can also be expressed in terms of the cumulative distribution function of the binomial distribution:[4]
$$F(k; r, p) = F_{\text{binomial}}(k;\, n = k+r,\, 1-p).$$
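As a quick check of these two identities, the following sketch (assuming SciPy is available) compares the negative binomial CDF with the regularized incomplete beta function and with the binomial CDF, using the same conventions as above.

```python
# Check F(k; r, p) = I_p(r, k+1) = F_binomial(k; n = k+r, 1-p) numerically.
from scipy.special import betainc
from scipy.stats import binom, nbinom

r, p, k = 5, 0.6, 7
cdf_direct   = nbinom.cdf(k, r, p)         # Pr(X <= k)
cdf_beta     = betainc(r, k + 1, p)        # regularized incomplete beta I_p(r, k+1)
cdf_binomial = binom.cdf(k, k + r, 1 - p)  # binomial CDF with n = k+r, prob 1-p
print(cdf_direct, cdf_beta, cdf_binomial)  # all three should agree
```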
Some sources may define the negative binomial distribution slightly differently from the primary one here. The most common variations are where the random variable X is counting different things. These variations can be seen in the table here:
|  | X is counting... | Probability mass function | Support |
|---|---|---|---|
| 1 | k failures, given r successes | $\binom{k+r-1}{k}\,p^r(1-p)^k$ [5][6][7][8][9][10] | $k = 0, 1, 2, \ldots$ |
| 2 | n trials, given r successes | $\binom{n-1}{r-1}\,p^r(1-p)^{n-r}$ [11][12][13] | $n = r, r+1, r+2, \ldots$ |
| 3 | n trials, given r failures | $\binom{n-1}{r-1}\,p^{n-r}(1-p)^{r}$ | $n = r, r+1, r+2, \ldots$ |
| 4 | k successes, given r failures | $\binom{k+r-1}{k}\,p^k(1-p)^{r}$ | $k = 0, 1, 2, \ldots$ |
| – | k successes, given n trials (this is the binomial distribution, not the negative binomial) | $\binom{n}{k}\,p^k(1-p)^{n-k}$ | $k = 0, 1, 2, \ldots, n$ |
The definition of the negative binomial distribution can be extended to the case where the parameter r takes on a positive real value. This amounts to extending the binomial coefficient to real arguments via the gamma function:
$$\binom{k+r-1}{k} = \frac{(k+r-1)(k+r-2)\dotsm r}{k!} = \frac{\Gamma(k+r)}{k!\,\Gamma(r)}.$$
After substituting this expression in the original definition, we say that X has a negative binomial (or Pólya) distribution if it has a probability mass function:
$$f(k; r, p) \equiv \Pr(X = k) = \frac{\Gamma(k+r)}{k!\,\Gamma(r)}\,(1-p)^k\,p^r \qquad \text{for } k = 0, 1, 2, \dotsc$$
Here r is a real, positive number.
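For illustration, here is a small sketch (with arbitrary example values) of the gamma-function form of the probability mass function, which remains valid when r is a positive real number; it is computed in log-space with lgamma for numerical stability.

```python
# Gamma(k+r) / (k! * Gamma(r)) * (1-p)^k * p^r, valid for real r > 0.
from math import exp, lgamma, log

def nb_pmf_real_r(k: int, r: float, p: float) -> float:
    log_coeff = lgamma(k + r) - lgamma(k + 1) - lgamma(r)  # lgamma(k+1) = log(k!)
    return exp(log_coeff + k * log(1 - p) + r * log(p))

print(nb_pmf_real_r(k=4, r=2.5, p=0.3))  # r need not be an integer
```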
In negative binomial regression,[14] the distribution is specified in terms of its mean, $m$, which is then related to explanatory variables as in linear regression or other generalized linear models. From the expression for the mean m, one can derive $p = \frac{r}{r+m}$ and $1-p = \frac{m}{r+m}$. Then, substituting these expressions into the probability mass function for real-valued r yields this parametrization of the probability mass function in terms of m:
$$\Pr(X = k) = \frac{\Gamma(r+k)}{k!\,\Gamma(r)}\left(\frac{r}{r+m}\right)^{r}\left(\frac{m}{r+m}\right)^{k} \qquad \text{for } k = 0, 1, 2, \dotsc$$
Sometimes the distribution is parameterized in terms of its mean μ and variance σ²:
$$\begin{align}
p &= \frac{\mu}{\sigma^2}, \\[6pt]
r &= \frac{\mu^2}{\sigma^2 - \mu}, \\[6pt]
\Pr(X = k) &= \binom{k + \frac{\mu^2}{\sigma^2-\mu} - 1}{k}\left(1 - \frac{\mu}{\sigma^2}\right)^{k}\left(\frac{\mu}{\sigma^2}\right)^{\mu^2/(\sigma^2-\mu)}, \\[6pt]
\operatorname{E}(X) &= \mu, \\[3pt]
\operatorname{Var}(X) &= \sigma^2.
\end{align}$$
Another popular parameterization uses r and the failure odds β:
$$\begin{align}
p &= \frac{1}{1+\beta}, \\[6pt]
\Pr(X = k) &= \binom{k+r-1}{k}\left(\frac{\beta}{1+\beta}\right)^{k}\left(\frac{1}{1+\beta}\right)^{r}, \\[6pt]
\operatorname{E}(X) &= r\beta, \\[3pt]
\operatorname{Var}(X) &= r\beta(1+\beta).
\end{align}$$
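The following helper sketch (names and values are illustrative, not from the text) converts the alternative parameterizations above, i.e. mean and variance, mean and size, and failure odds, back to the basic (r, p) pair, so the same distribution can be evaluated in any of them.

```python
# Convert alternative parameterizations of the negative binomial to (r, p).
def from_mean_variance(mu: float, sigma2: float) -> tuple[float, float]:
    """Requires sigma2 > mu (overdispersion)."""
    return mu ** 2 / (sigma2 - mu), mu / sigma2

def from_mean_size(m: float, r: float) -> tuple[float, float]:
    """Negative-binomial-regression style parameterization (mean m, size r)."""
    return r, r / (r + m)

def from_failure_odds(r: float, beta: float) -> tuple[float, float]:
    return r, 1.0 / (1.0 + beta)

# These three calls describe the same distribution (r = 8/3, p = 0.4):
print(from_mean_variance(mu=4.0, sigma2=10.0))
print(from_mean_size(m=4.0, r=8.0 / 3.0))
print(from_failure_odds(r=8.0 / 3.0, beta=1.5))
```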
Hospital length of stay is an example of real-world data that can be modelled well with a negative binomial distribution via negative binomial regression.[16] [17]
Pat Collis is required to sell candy bars to raise money for the 6th grade field trip. Pat is (somewhat harshly) not supposed to return home until five candy bars have been sold. So the child goes door to door, selling candy bars. At each house, there is a 0.6 probability of selling one candy bar and a 0.4 probability of selling nothing.
What's the probability of selling the last candy bar at the nth house?
Successfully selling candy enough times is what defines our stopping criterion (as opposed to failing to sell it), so k in this case represents the number of failures and r represents the number of successes. Recall that the NegBin(r, p) distribution describes the probability of k failures and r successes in k + r Bernoulli(p) trials with success on the last trial. Selling five candy bars means getting five successes. The number of trials (i.e. houses) this takes is therefore k + 5 = n. The random variable we are interested in is the number of houses, so we substitute k = n − 5 into a NegBin(5, 0.6) mass function and obtain the following mass function of the distribution of houses (for n ≥ 5):
$$f(n) = \binom{(n-5)+5-1}{n-5}\,(1-0.6)^{n-5}\,0.6^{5} = \binom{n-1}{n-5}\,3^5\,\frac{2^{n-5}}{5^{n}}.$$
What's the probability that Pat finishes on the tenth house?
f(10)=0.1003290624.
What's the probability that Pat finishes on or before reaching the eighth house?
To finish on or before the eighth house, Pat must finish at the fifth, sixth, seventh, or eighth house. Sum those probabilities:
f(5)=0.07776
f(6)=0.15552
f(7)=0.18662
f(8)=0.17418
$$\sum_{j=5}^{8} f(j) = 0.59408.$$
What's the probability that Pat exhausts all 30 houses that happen to stand in the neighborhood?
This can be expressed as the probability that Pat does not finish on the fifth through the thirtieth house:
$$1 - \sum_{j=5}^{30} f(j) = 1 - I_{0.6}(5, 30-5+1) \approx 1 - 0.9999998 = 0.0000002.$$
Because of the rather high probability that Pat will sell to each house (60 percent), the probability of her NOT fulfilling her quest is vanishingly slim.
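Assuming SciPy is available, the worked example can be checked numerically; scipy.stats.nbinom counts failures before the r-th success, so the house number n corresponds to k = n − 5.

```python
# Verify the candy-bar example: r = 5 sales needed, success probability 0.6.
from scipy.stats import nbinom

r, p = 5, 0.6

# Probability of finishing exactly at the 10th house: Pr(X = 10 - 5).
print(nbinom.pmf(10 - 5, r, p))   # ~0.10033

# Probability of finishing on or before the 8th house: Pr(X <= 3).
print(nbinom.cdf(8 - 5, r, p))    # ~0.5941

# Probability of not finishing within all 30 houses: Pr(X > 25).
print(nbinom.sf(30 - 5, r, p))    # a vanishingly small probability
```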
The expected total number of trials needed to see r successes is $\frac{r}{p}$. Thus, the expected number of failures is
$$\operatorname{E}[\operatorname{NB}(r,p)] = \frac{r}{p} - r = \frac{r(1-p)}{p}.$$
The expected total number of failures in a negative binomial distribution with parameters (r, p) is r(1 − p)/p. To see this, imagine an experiment simulating the negative binomial is performed many times. That is, a set of trials is performed until r successes are obtained, then another set of trials, and then another, etc. Write down the number of trials performed in each experiment, a, b, c, ..., and set a + b + c + ... = N. Now we would expect about Np successes in total. Say the experiment was performed n times. Then there are nr successes in total. So we would expect Np = nr, hence N/n = r/p. See that N/n is just the average number of trials per experiment. That is what we mean by "expectation." The average number of failures per experiment is N/n − r = r/p − r = r(1 − p)/p, which agrees with the mean stated above.
A rigorous derivation can be done by representing the negative binomial distribution as a sum of waiting times. Let $X_r \sim \operatorname{NB}(r, p)$ with the convention above, so that $X_r$ counts the number of failures observed before the $r$-th success, where $p$ is the probability of success on each trial. Then we can write
$$X_r = Y_1 + Y_2 + \cdots + Y_r,$$
where $Y_i \sim \operatorname{Geom}(p)$ is the number of failures between the $(i-1)$-th success and the $i$-th success. By linearity of expectation,
$$\operatorname{E}[X_r] = \operatorname{E}[Y_1] + \operatorname{E}[Y_2] + \cdots + \operatorname{E}[Y_r] = \frac{r(1-p)}{p},$$
where the last equality uses $\operatorname{E}[Y_i] = (1-p)/p$.
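A small simulation sketch (using NumPy, with illustrative parameter values) supports this derivation: summing r independent geometric waiting times reproduces the stated mean, and also the variance given in the next paragraph.

```python
# The number of failures before the r-th success is a sum of r iid geometric
# waiting times, so its mean should be close to r(1-p)/p and its variance
# close to r(1-p)/p^2.
import numpy as np

rng = np.random.default_rng(0)
r, p, n_sims = 4, 0.3, 200_000

# np.random.Generator.geometric counts trials up to and including the first
# success, so "failures before the success" is that count minus one.
failures = rng.geometric(p, size=(n_sims, r)) - 1
x = failures.sum(axis=1)

print(x.mean(), r * (1 - p) / p)       # both close to 9.33
print(x.var(), r * (1 - p) / p ** 2)   # both close to 31.1
```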
When counting the number of failures before the r-th success, the variance is r(1 − p)/p². When counting the number of successes before the r-th failure, as in alternative formulation (4) above, the variance is rp/(1 − p)².
Suppose Y is a random variable with a binomial distribution with parameters n and p. Assume p + q = 1, with p, q ≥ 0, then
$$1 = 1^n = (p+q)^n.$$
Using Newton's binomial theorem, this can equally be written as:
$$(p+q)^n = \sum_{k=0}^{\infty}\binom{n}{k}\,p^k q^{n-k},$$
in which the upper bound of summation is infinite. In this case, the binomial coefficient
$$\binom{n}{k} = \frac{n(n-1)(n-2)\dotsm(n-k+1)}{k!}$$
is defined when n is a real number, instead of just a positive integer. But in our case of the binomial distribution it is zero when k > n. We can then say, for example
$$(p+q)^{8.3} = \sum_{k=0}^{\infty}\binom{8.3}{k}\,p^k q^{8.3-k}.$$
Now suppose r > 0 and we use a negative exponent:
$$1 = p^r \cdot p^{-r} = p^r (1-q)^{-r} = p^r \sum_{k=0}^{\infty}\binom{-r}{k}(-q)^k.$$
Then all of the terms are positive, and the term
$$p^r \binom{-r}{k}(-q)^k$$
is just the probability that the number of failures before the rth success is equal to k, provided r is an integer. (If r is a negative non-integer, so that the exponent is a positive non-integer, then some of the terms in the sum above are negative, so we do not have a probability distribution on the set of all nonnegative integers.)
Now we also allow non-integer values of r. Then we have a proper negative binomial distribution, which generalizes the Pascal distribution and coincides with it when r happens to be a positive integer.
Recall from above that
The sum of independent negative-binomially distributed random variables r1 and r2 with the same value for parameter p is negative-binomially distributed with the same p but with r-value r1 + r2.
This property persists when the definition is thus generalized, and affords a quick way to see that the negative binomial distribution is infinitely divisible.
The following recurrence relations hold:
For the probability mass function
$$\begin{cases} (k+1)\Pr(X=k+1) - (1-p)\,(k+r)\Pr(X=k) = 0, \\[5pt] \Pr(X=0) = p^r. \end{cases}$$
For the moments
$$m_k = \operatorname{E}(X^k), \qquad m_{k+1} = rP\,m_k + (P^2+P)\frac{dm_k}{dP}, \qquad P := \frac{1-p}{p}, \qquad m_0 = 1.$$
For the cumulants
$$\kappa_{k+1} = Q(Q-1)\frac{d\kappa_k}{dQ}, \qquad Q := \frac{1}{p}, \qquad \kappa_1 = r(Q-1).$$
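As an illustration, the recurrence for the probability mass function can be iterated directly; the sketch below (illustrative values, assuming SciPy for the cross-check) starts from Pr(X = 0) = p^r.

```python
# Iterate (k+1) Pr(X=k+1) = (1-p)(k+r) Pr(X=k) starting from Pr(X=0) = p^r,
# and compare with scipy.stats.nbinom.
from scipy.stats import nbinom

def nb_pmf_by_recurrence(k_max: int, r: float, p: float) -> list[float]:
    probs = [p ** r]
    for k in range(k_max):
        probs.append((1 - p) * (k + r) / (k + 1) * probs[-1])
    return probs

r, p = 3, 0.6
print(nb_pmf_by_recurrence(4, r, p))
print(list(nbinom.pmf(range(5), r, p)))  # should match
```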
The geometric distribution (on {0, 1, 2, ...}) is a special case of the negative binomial distribution, with
$$\operatorname{Geom}(p) = \operatorname{NB}(1, p).$$
Consider a sequence of negative binomial random variables where the stopping parameter r goes to infinity, while the probability p of success in each trial goes to one, in such a way as to keep the mean of the distribution (i.e. the expected number of failures) constant. Denoting this mean as λ, the parameter p will be p = r/(r + λ)
$$\begin{align}
\text{Mean:}\quad & \lambda = \frac{(1-p)r}{p} \;\Rightarrow\; p = \frac{r}{r+\lambda}, \\
\text{Variance:}\quad & \lambda\left(1 + \frac{\lambda}{r}\right) > \lambda, \quad \text{thus always overdispersed.}
\end{align}$$
Under this parametrization the probability mass function will be
$$f(k; r, p) = \frac{\Gamma(k+r)}{k!\,\Gamma(r)}\,(1-p)^k\,p^r = \frac{\lambda^k}{k!} \cdot \frac{\Gamma(r+k)}{\Gamma(r)\,(r+\lambda)^k} \cdot \frac{1}{\left(1 + \frac{\lambda}{r}\right)^{r}}.$$
Now if we consider the limit as r → ∞, the second factor will converge to one, and the third to the exponential function:
$$\lim_{r\to\infty} f(k; r, p) = \frac{\lambda^k}{k!} \cdot 1 \cdot \frac{1}{e^{\lambda}},$$
which is the mass function of a Poisson-distributed random variable with expected value λ.
In other words, the alternatively parameterized negative binomial distribution converges to the Poisson distribution and r controls the deviation from the Poisson. This makes the negative binomial distribution suitable as a robust alternative to the Poisson, which approaches the Poisson for large r, but which has larger variance than the Poisson for small r.
$$\operatorname{Poisson}(\lambda) = \lim_{r\to\infty}\operatorname{NB}\left(r, \frac{r}{r+\lambda}\right).$$
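A short numerical sketch (assuming SciPy is available, with λ = 3 as an arbitrary example) illustrates this convergence: the maximum difference between the NB and Poisson mass functions shrinks as r grows.

```python
# NB(r, r/(r+lambda)) approaches Poisson(lambda) as r grows, lambda fixed.
import numpy as np
from scipy.stats import nbinom, poisson

lam = 3.0
ks = np.arange(15)
for r in (1, 10, 100, 1000):
    p = r / (r + lam)
    max_diff = np.max(np.abs(nbinom.pmf(ks, r, p) - poisson.pmf(ks, lam)))
    print(f"r={r:5d}: max PMF difference = {max_diff:.2e}")  # shrinks with r
```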
The negative binomial distribution also arises as a continuous mixture of Poisson distributions (i.e. a compound probability distribution) where the mixing distribution of the Poisson rate is a gamma distribution. That is, we can view the negative binomial as a Poisson(λ) distribution, where λ is itself a random variable, distributed as a gamma distribution with shape r and scale θ = (1 − p)/p, or correspondingly rate β = p/(1 − p).
To display the intuition behind this statement, consider two independent Poisson processes, "Success" and "Failure", with intensities p and 1 − p. Together, the Success and Failure processes are equivalent to a single Poisson process of intensity 1, where an occurrence of the process is a success if a corresponding independent coin toss comes up heads with probability p; otherwise, it is a failure. If r is a counting number, the coin tosses show that the count of successes before the rth failure follows a negative binomial distribution with parameters r and p. The count is also, however, the count of the Success Poisson process at the random time T of the rth occurrence in the Failure Poisson process. The Success count follows a Poisson distribution with mean pT, where T is the waiting time for r occurrences in a Poisson process of intensity 1 − p, i.e., T is gamma-distributed with shape parameter r and intensity 1 − p. Thus, the negative binomial distribution is equivalent to a Poisson distribution with mean pT, where the random variate T is gamma-distributed with shape parameter r and intensity 1 − p. The preceding paragraph follows, because λ = pT is gamma-distributed with shape parameter r and intensity (1 − p)/p.
The following formal derivation (which does not depend on r being a counting number) confirms the intuition.
$$\begin{align}
& \int_0^\infty f_{\operatorname{Poisson}(\lambda)}(k) \times f_{\operatorname{Gamma}\left(r,\,\frac{p}{1-p}\right)}(\lambda)\,d\lambda \\[8pt]
={} & \int_0^\infty \frac{\lambda^k}{k!}\,e^{-\lambda} \times \frac{1}{\Gamma(r)}\left(\frac{p}{1-p}\,\lambda\right)^{r-1} e^{-\frac{p}{1-p}\lambda}\left(\frac{p}{1-p}\right) d\lambda \\[8pt]
={} & \left(\frac{p}{1-p}\right)^{r} \frac{1}{k!\,\Gamma(r)} \int_0^\infty \lambda^{r+k-1}\,e^{-\lambda/(1-p)}\,d\lambda \\[8pt]
={} & \left(\frac{p}{1-p}\right)^{r} \frac{1}{k!\,\Gamma(r)}\,\Gamma(r+k)\,(1-p)^{k+r} \int_0^\infty f_{\operatorname{Gamma}\left(k+r,\,\frac{1}{1-p}\right)}(\lambda)\,d\lambda \\[8pt]
={} & \frac{\Gamma(r+k)}{k!\,\Gamma(r)}\,(1-p)^k\,p^r \\[8pt]
={} & f(k; r, p).
\end{align}$$
Because of this, the negative binomial distribution is also known as the gamma–Poisson (mixture) distribution. The negative binomial distribution was originally derived as a limiting case of the gamma-Poisson distribution.[18]
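A simulation sketch of the mixture representation (illustrative parameters, assuming NumPy and SciPy): drawing the Poisson rate from a gamma distribution with shape r and scale (1 − p)/p and then a Poisson count reproduces the NB(r, p) probabilities.

```python
# Gamma-Poisson mixture: lambda ~ Gamma(shape=r, scale=(1-p)/p), X ~ Poisson(lambda).
import numpy as np
from scipy.stats import nbinom

rng = np.random.default_rng(1)
r, p, n_sims = 2.5, 0.4, 300_000

lam = rng.gamma(shape=r, scale=(1 - p) / p, size=n_sims)
counts = rng.poisson(lam)

for k in range(5):
    print(f"k={k}: simulated {np.mean(counts == k):.4f} vs exact {nbinom.pmf(k, r, p):.4f}")
```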
If Y_r is a random variable following the negative binomial distribution with parameters r and p, and support {0, 1, 2, ...}, then Y_r is a sum of r independent variables following the geometric distribution (on {0, 1, 2, ...}) with parameter p. As a result of the central limit theorem, Y_r (properly scaled and shifted) is therefore approximately normal for sufficiently large r.
Furthermore, if Bs+r is a random variable following the binomial distribution with parameters s + r and p, then
$$\begin{align}
\Pr(Y_r \le s) &= 1 - I_{1-p}(s+1, r) \\[5pt]
&= 1 - I_{1-p}((s+r)-(r-1), (r-1)+1) \\[5pt]
&= 1 - \Pr(B_{s+r} \le r-1) \\[5pt]
&= \Pr(B_{s+r} \ge r) \\[5pt]
&= \Pr(\text{after } s+r \text{ trials, there are at least } r \text{ successes}).
\end{align}$$
In this sense, the negative binomial distribution is the "inverse" of the binomial distribution.
The sum of independent negative-binomially distributed random variables r1 and r2 with the same value for parameter p is negative-binomially distributed with the same p but with r-value r1 + r2.
The negative binomial distribution is infinitely divisible, i.e., if Y has a negative binomial distribution, then for any positive integer n, there exist independent identically distributed random variables Y1, ..., Yn whose sum has the same distribution that Y has.
The negative binomial distribution NB(r,p) can be represented as a compound Poisson distribution: Let $(Y_n)_{n\in\mathbb{N}}$ denote a sequence of independent and identically distributed random variables, each one having the logarithmic series distribution Log(p), with probability mass function
$$f(k) = \frac{-p^k}{k\,\ln(1-p)}, \qquad k \in \mathbb{N}.$$
Let N be a random variable, independent of the sequence, and suppose that N has a Poisson distribution with mean $\lambda = -r\,\ln(1-p)$. Then the random sum
$$X = \sum_{n=1}^{N} Y_n$$
is NB(r,p)-distributed. To prove this, we calculate the probability generating function GX of X, which is the composition of the probability generating functions GN and GY1. Using
$$G_N(z) = \exp(\lambda(z-1)), \qquad z \in \mathbb{R},$$
and
$$G_{Y_1}(z) = \frac{\ln(1-pz)}{\ln(1-p)}, \qquad |z| < \frac{1}{p},$$
we obtain
$$\begin{align}
G_X(z) &= G_N(G_{Y_1}(z)) \\[4pt]
&= \exp\left(\lambda\left(\frac{\ln(1-pz)}{\ln(1-p)} - 1\right)\right) \\[4pt]
&= \exp\bigl(-r(\ln(1-pz) - \ln(1-p))\bigr) \\[4pt]
&= \left(\frac{1-p}{1-pz}\right)^{r}, \qquad |z| < \frac{1}{p},
\end{align}$$
which is the probability generating function of the NB(r,p) distribution.
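The compound Poisson representation can also be checked by simulation. The sketch below (illustrative parameters) uses NumPy's logseries sampler for the Log(p) terms; note that SciPy's nbinom is parameterized by the success probability, so under this subsection's convention the comparison is against nbinom(r, 1 − p).

```python
# Compound Poisson check: N ~ Poisson(-r*ln(1-p)), X = sum of N Log(p) terms.
import numpy as np
from scipy.stats import nbinom

rng = np.random.default_rng(2)
r, p, n_sims = 3, 0.35, 100_000

lam = -r * np.log(1 - p)
n_terms = rng.poisson(lam, size=n_sims)
x = np.array([rng.logseries(p, size=n).sum() if n > 0 else 0 for n in n_terms])

for k in range(5):
    # In scipy's convention this sum is nbinom(r, 1-p)-distributed.
    print(f"k={k}: simulated {np.mean(x == k):.4f} vs exact {nbinom.pmf(k, r, 1 - p):.4f}")
```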
The following table describes four distributions related to the number of successes in a sequence of draws:
|  | With replacements | No replacements |
|---|---|---|
| Given number of draws | binomial distribution | hypergeometric distribution |
| Given number of failures | negative binomial distribution | negative hypergeometric distribution |
The negative binomial, along with the Poisson and binomial distributions, is a member of the (a,b,0) class of distributions. All three of these distributions are special cases of the Panjer distribution. They are also members of a natural exponential family.
Suppose p is unknown and an experiment is conducted where it is decided ahead of time that sampling will continue until r successes are found. A sufficient statistic for the experiment is k, the number of failures.
In estimating p, the minimum variance unbiased estimator is
$$\widehat{p} = \frac{r-1}{r+k-1}.$$
When r is known, the maximum likelihood estimate of p is
$$\widetilde{p} = \frac{r}{r+k},$$
but this is a biased estimate. Its inverse (r + k)/r, is an unbiased estimate of 1/p, however.[19]
When r is unknown, the maximum likelihood estimator for p and r together only exists for samples for which the sample variance is larger than the sample mean.[20] The likelihood function for N iid observations (k1, ..., kN) is
$$L(r, p) = \prod_{i=1}^{N} f(k_i; r, p),$$
from which we calculate the log-likelihood function
$$\ell(r, p) = \sum_{i=1}^{N}\ln(\Gamma(k_i + r)) - \sum_{i=1}^{N}\ln(k_i!) - N\ln(\Gamma(r)) + \sum_{i=1}^{N}k_i\ln(1-p) + Nr\ln(p).$$
To find the maximum we take the partial derivatives with respect to r and p and set them equal to zero:
$$\frac{\partial\ell(r,p)}{\partial p} = -\left[\sum_{i=1}^{N}k_i\,\frac{1}{1-p}\right] + Nr\,\frac{1}{p} = 0,$$
$$\frac{\partial\ell(r,p)}{\partial r} = \left[\sum_{i=1}^{N}\psi(k_i + r)\right] - N\psi(r) + N\ln(p) = 0,$$
where $\psi(k) = \frac{\Gamma'(k)}{\Gamma(k)}$ is the digamma function.
Solving the first equation for p gives:
$$p = \frac{Nr}{Nr + \sum_{i=1}^{N}k_i}.$$
Substituting this in the second equation gives:
$$\frac{\partial\ell(r,p)}{\partial r} = \left[\sum_{i=1}^{N}\psi(k_i + r)\right] - N\psi(r) + N\ln\left(\frac{r}{r + \sum_{i=1}^{N}k_i/N}\right) = 0.$$
This equation cannot be solved for r in closed form. If a numerical solution is desired, an iterative technique such as Newton's method can be used. Alternatively, the expectation–maximization algorithm can be used.
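A possible numerical sketch of this fit (function names and brackets are illustrative, not a canonical implementation): for a trial value of r, p is profiled out as p = Nr/(Nr + Σk_i), and the remaining score equation in r is solved with a one-dimensional root finder, assuming the sample is overdispersed so that a root exists.

```python
# Maximum-likelihood fit of (r, p) from iid counts k_1, ..., k_N.
import numpy as np
from scipy.optimize import brentq
from scipy.special import digamma

def fit_negative_binomial(ks: np.ndarray) -> tuple[float, float]:
    ks = np.asarray(ks, dtype=float)
    n, total = ks.size, ks.sum()

    def score_r(r: float) -> float:
        # dl/dr with p already profiled out as p = n*r / (n*r + total)
        p = n * r / (n * r + total)
        return digamma(ks + r).sum() - n * digamma(r) + n * np.log(p)

    r_hat = brentq(score_r, 1e-6, 1e6)   # assumes a sign change in this bracket
    p_hat = n * r_hat / (n * r_hat + total)
    return r_hat, p_hat

rng = np.random.default_rng(3)
sample = rng.negative_binomial(n=4, p=0.35, size=2000)
print(fit_negative_binomial(sample))     # should be near (4, 0.35)
```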
For the special case where r is an integer, the negative binomial distribution is known as the Pascal distribution. It is the probability distribution of a certain number of failures and successes in a series of independent and identically distributed Bernoulli trials. For k + r Bernoulli trials with success probability p, the negative binomial gives the probability of k successes and r failures, with a failure on the last trial. In other words, the negative binomial distribution is the probability distribution of the number of successes before the rth failure in a Bernoulli process, with probability p of successes on each trial. A Bernoulli process is a discrete time process, and so the number of trials, failures, and successes are integers.
Consider the following example. Suppose we repeatedly throw a die, and consider a 1 to be a failure. The probability of success on each trial is 5/6. The number of successes before the third failure belongs to the infinite set {0, 1, 2, 3, ...}. That number of successes is a negative-binomially distributed random variable.
When r = 1 we get the probability distribution of number of successes before the first failure (i.e. the probability of the first failure occurring on the (k + 1)st trial), which is a geometric distribution:
$$f(k; r, p) = (1-p)\cdot p^k.$$
Recent findings suggest that the waiting time in a Bernoulli process is strongly related to fractals and to the Dirichlet function. Probability distributions with fractal properties related to the Dirichlet function can be derived from recurrent processes generated by uniform discrete distributions; such uniform discrete distributions can be digits of pi, rolls of a fair die, or live casino spins. Consider the following waiting time in a Bernoulli process: a random variable C is repeatedly sampled N times from a discrete uniform distribution (for instance, integer values ranging from 1 to 10). Moments of occurrence T signify when the sampled value of C repeats, that is, when two neighbouring samples are equal; there are M such moments, with M less than N. Subsequently, define S as the interval between successive moments T, representing the waiting time for such an event to occur. Finally, introduce Z as the difference of logarithms of successive waiting times S. The random variable Z displays fractal properties, with a shape distribution resembling Thomae's or the Dirichlet function.[21]
The negative binomial distribution, especially in its alternative parameterization described above, can be used as an alternative to the Poisson distribution. It is especially useful for discrete data over an unbounded positive range whose sample variance exceeds the sample mean. In such cases, the observations are overdispersed with respect to a Poisson distribution, for which the mean is equal to the variance. Hence a Poisson distribution is not an appropriate model. Since the negative binomial distribution has one more parameter than the Poisson, the second parameter can be used to adjust the variance independently of the mean. See Cumulants of some discrete probability distributions.
An application of this is to annual counts of tropical cyclones in the North Atlantic or to monthly to 6-monthly counts of wintertime extratropical cyclones over Europe, for which the variance is greater than the mean.[22] [23] [24] In the case of modest overdispersion, this may produce substantially similar results to an overdispersed Poisson distribution.[25] [26]
Negative binomial modeling is widely employed in ecology and biodiversity research for analyzing count data where overdispersion is very common. This is because overdispersion is indicative of biological aggregation, such as species or communities forming clusters. Ignoring overdispersion can lead to significantly inflated model parameters, resulting in misleading statistical inferences. The negative binomial distribution effectively addresses overdispersed counts by permitting the variance to vary quadratically with the mean. An additional dispersion parameter governs the slope of the quadratic term, determining the severity of overdispersion. The model's quadratic mean-variance relationship proves to be a realistic approach for handling overdispersion, as supported by empirical evidence from many studies. Overall, the NB model offers two attractive features: (1) the convenient interpretation of the dispersion parameter as an index of clustering or aggregation, and (2) its tractable form, featuring a closed expression for the probability mass function.[27]
In genetics, the negative binomial distribution is commonly used to model data in the form of discrete sequence read counts from high-throughput RNA and DNA sequencing experiments.[28] [29] [30] [31]
In epidemiology of infectious diseases, the negative binomial has been used as a better option than the Poisson distribution to model overdispersed counts of secondary infections from one infected case (super-spreading events).[32]
The negative binomial distribution has been the most effective statistical model for a broad range of multiplicity observations in particle collision experiments, e.g., $p\bar{p}$, $hh$, $hA$, $AA$, and $e^+e^-$ collisions. In these applications the distribution is commonly characterized by the parameter $k$ together with the mean numbers of trials and successes, $\langle n\rangle$ and $\langle r\rangle$, which satisfy
$$\langle n\rangle - \langle r\rangle = k, \qquad \langle p\rangle = \frac{\langle r\rangle}{\langle n\rangle} \;\implies\; \langle n\rangle = \frac{k}{1 - \langle p\rangle}, \quad \langle r\rangle = \frac{k\,\langle p\rangle}{1 - \langle p\rangle},$$
an isomorphic set of equations can be identified with the parameters of a relativistic current density of a canonical ensemble of massive particles, via
$$c^2\langle\rho^2\rangle - \langle j^2\rangle = c^2\rho_0^2, \qquad \langle\beta_v^2\rangle = \frac{\langle j^2\rangle}{c^2\langle\rho^2\rangle} \;\implies\; c^2\langle\rho^2\rangle = \frac{c^2\rho_0^2}{1 - \langle\beta_v^2\rangle}, \quad \langle j^2\rangle = \frac{c^2\rho_0^2\,\langle\beta_v^2\rangle}{1 - \langle\beta_v^2\rangle},$$
where $\rho_0$ is the rest-frame density, $\langle\rho^2\rangle$ and $\langle j^2\rangle$ are the mean squared density and current, $\langle\beta_v^2\rangle = \langle v^2\rangle/c^2$ with $\langle v^2\rangle$ the mean squared velocity, and $c$ is the speed of light. The correspondence is given by the identification
$$c^2\rho_0^2 \mapsto k, \qquad \langle\beta_v^2\rangle \mapsto \langle p\rangle, \qquad c^2\langle\rho^2\rangle \mapsto \langle n\rangle, \qquad \langle j^2\rangle \mapsto \langle r\rangle.$$
A rigorous alternative proof of the above correspondence has also been demonstrated through quantum mechanics via the Feynman path integral.
This distribution was first studied in 1713 by Pierre Remond de Montmort in his Essay d'analyse sur les jeux de hazard, as the distribution of the number of trials required in an experiment to obtain a given number of successes.[46] It had previously been mentioned by Pascal.[47]