In statistics and econometrics, the multivariate probit model is a generalization of the probit model used to estimate several correlated binary outcomes jointly. For example, if it is believed that the decisions of sending at least one child to public school and that of voting in favor of a school budget are correlated (both decisions are binary), then the multivariate probit model would be appropriate for jointly predicting these two choices on an individual-specific basis. J.R. Ashford and R.R. Sowden initially proposed an approach for multivariate probit analysis.[1] Siddhartha Chib and Edward Greenberg extended this idea and also proposed simulation-based inference methods for the multivariate probit model which simplified and generalized parameter estimation.[2]
In the ordinary probit model, there is only one binary dependent variable Y, and so only one latent variable Y^* is used. In contrast, in the bivariate probit model there are two binary dependent variables Y_1 and Y_2, so there are two latent variables Y_1^* and Y_2^*. It is assumed that each observed variable takes on the value 1 if and only if its underlying continuous latent variable takes a positive value:
Y_1 = \begin{cases} 1 & \text{if } Y_1^* > 0, \\ 0 & \text{otherwise}, \end{cases}
\qquad
Y_2 = \begin{cases} 1 & \text{if } Y_2^* > 0, \\ 0 & \text{otherwise}, \end{cases}
with
\begin{cases} Y_1^* = X_1\beta_1 + \varepsilon_1, \\ Y_2^* = X_2\beta_2 + \varepsilon_2, \end{cases}
and
\begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \end{bmatrix} \mid X \sim \mathcal{N} \left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 & \rho \\ \rho & 1 \end{bmatrix} \right).
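To make the data-generating process concrete, the model above can be simulated directly. The following Python sketch is a minimal illustration; the sample size, coefficient values, and correlation are arbitrary choices for the example, not values from the text:

import numpy as np

rng = np.random.default_rng(0)
n = 5000
rho = 0.5                      # assumed error correlation
beta1 = np.array([0.5, 1.0])   # assumed coefficients, equation 1
beta2 = np.array([-0.2, 0.8])  # assumed coefficients, equation 2

# Design matrix with an intercept and one regressor (shared across equations for simplicity).
X = np.column_stack([np.ones(n), rng.normal(size=n)])

# Correlated errors drawn from the bivariate standard normal above.
eps = rng.multivariate_normal([0.0, 0.0], [[1.0, rho], [rho, 1.0]], size=n)

# Latent variables and the observed binary outcomes.
y1 = (X @ beta1 + eps[:, 0] > 0).astype(int)
y2 = (X @ beta2 + eps[:, 1] > 0).astype(int)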
Fitting the bivariate probit model involves estimating the values of \beta_1, \beta_2, and \rho. To do so, the likelihood of the model has to be maximized. This likelihood is
\begin{align}
L(\beta_1,\beta_2) = \Big( \prod & P(Y_1=1, Y_2=1 \mid \beta_1,\beta_2)^{Y_1 Y_2} \, P(Y_1=0, Y_2=1 \mid \beta_1,\beta_2)^{(1-Y_1) Y_2} \\[8pt]
& {} \cdot P(Y_1=1, Y_2=0 \mid \beta_1,\beta_2)^{Y_1 (1-Y_2)} \, P(Y_1=0, Y_2=0 \mid \beta_1,\beta_2)^{(1-Y_1)(1-Y_2)} \Big)
\end{align}
Substituting the latent variables Y_1^* and Y_2^* into the probability functions and taking logs gives
\begin{align}
\sum & \Big( Y_1 Y_2 \ln P(\varepsilon_1 > -X_1\beta_1, \varepsilon_2 > -X_2\beta_2) \\[4pt]
& {} + (1-Y_1) Y_2 \ln P(\varepsilon_1 < -X_1\beta_1, \varepsilon_2 > -X_2\beta_2) \\[4pt]
& {} + Y_1 (1-Y_2) \ln P(\varepsilon_1 > -X_1\beta_1, \varepsilon_2 < -X_2\beta_2) \\[4pt]
& {} + (1-Y_1)(1-Y_2) \ln P(\varepsilon_1 < -X_1\beta_1, \varepsilon_2 < -X_2\beta_2) \Big).
\end{align}
After some rewriting, the log-likelihood function becomes:
\begin{align}
\sum & \Big( Y_1 Y_2 \ln \Phi(X_1\beta_1, X_2\beta_2, \rho) \\[4pt]
& {} + (1-Y_1) Y_2 \ln \Phi(-X_1\beta_1, X_2\beta_2, -\rho) \\[4pt]
& {} + Y_1 (1-Y_2) \ln \Phi(X_1\beta_1, -X_2\beta_2, -\rho) \\[4pt]
& {} + (1-Y_1)(1-Y_2) \ln \Phi(-X_1\beta_1, -X_2\beta_2, \rho) \Big).
\end{align}
Note that \Phi is the cumulative distribution function of the bivariate normal distribution, and that Y_1 and Y_2 in the log-likelihood function are the observed variables, each equal to one or zero.
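The four-term expression above maps directly into code. Below is a minimal Python sketch, assuming NumPy and SciPy are available; scipy.stats.multivariate_normal.cdf plays the role of \Phi, while the function name bivariate_probit_loglik and the packing of the parameter vector are choices made for this illustration:

import numpy as np
from scipy.stats import multivariate_normal

def bivariate_probit_loglik(params, X1, X2, y1, y2):
    # params packs (beta1, beta2, rho).
    k1, k2 = X1.shape[1], X2.shape[1]
    beta1, beta2, rho = params[:k1], params[k1:k1 + k2], params[-1]
    xb1, xb2 = X1 @ beta1, X2 @ beta2

    # Phi(a, b, r): bivariate standard normal CDF with correlation r,
    # evaluated observation by observation.
    def Phi(a, b, r):
        cov = [[1.0, r], [r, 1.0]]
        return np.array([multivariate_normal.cdf([ai, bi], mean=[0.0, 0.0], cov=cov)
                         for ai, bi in zip(a, b)])

    return np.sum(y1 * y2 * np.log(Phi(xb1, xb2, rho))
                  + (1 - y1) * y2 * np.log(Phi(-xb1, xb2, -rho))
                  + y1 * (1 - y2) * np.log(Phi(xb1, -xb2, -rho))
                  + (1 - y1) * (1 - y2) * np.log(Phi(-xb1, -xb2, rho)))

Maximizing this function (for example, by passing its negative to scipy.optimize.minimize) yields the estimates; in practice \rho is usually reparameterized, e.g. through arctanh, to keep it inside (-1, 1) during optimization.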
For the general case, \mathbf{y}_i = (y_1, \ldots, y_J), \ (i = 1, \ldots, N), where the subscript j = 1, \ldots, J indexes the choices and i indexes the individuals or observations, the probability of observing choice \mathbf{y}_i is
\begin{align}
\Pr(\mathbf{y}_i \mid \mathbf{X}_i\beta, \Sigma) &= \int_{A_J} \cdots \int_{A_1} f_N(\mathbf{y}_i^* \mid \mathbf{X}_i\beta, \Sigma) \, dy_1^* \cdots dy_J^* \\
\Pr(\mathbf{y}_i \mid \mathbf{X}_i\beta, \Sigma) &= \int \mathbf{1}_{\mathbf{y}^* \in A} \, f_N(\mathbf{y}_i^* \mid \mathbf{X}_i\beta, \Sigma) \, d\mathbf{y}_i^*
\end{align}
where A = A_1 \times \cdots \times A_J, with f_N the density of the J-dimensional normal distribution of the latent variables, and
A_j = \begin{cases} (-\infty, 0] & y_j = 0, \\ (0, \infty) & y_j = 1. \end{cases}
The log-likelihood function in this case would be
\sum_{i=1}^{N} \log \Pr(\mathbf{y}_i \mid \mathbf{X}_i\beta, \Sigma).
Except for J \leq 2, evaluating this likelihood requires computing J-dimensional normal orthant probabilities, which have no closed form and must be approximated numerically, in practice usually by simulation methods such as the GHK algorithm.
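As a rough illustration of the simulation idea, the orthant probability for a single observation can be approximated with a crude Monte Carlo frequency estimator, of which the GHK algorithm is a lower-variance refinement. In the Python sketch below, the dimension, the parameter values, and the name mc_orthant_probability are all invented for the example:

import numpy as np

def mc_orthant_probability(y, mean, cov, draws=100_000, seed=0):
    # Crude Monte Carlo estimate of Pr(y | mean, cov) for one observation
    # y in {0, 1}^J: count how often a draw of the latent vector
    # y* ~ N(mean, cov) falls in the orthant A = A_1 x ... x A_J,
    # where A_j is (-inf, 0] if y_j = 0 and (0, inf) if y_j = 1.
    rng = np.random.default_rng(seed)
    y_star = rng.multivariate_normal(mean, cov, size=draws)
    inside = ((y_star > 0) == y.astype(bool)).all(axis=1)
    return inside.mean()

# Example with J = 3 choices (all numbers are illustrative).
y_i = np.array([1, 0, 1])
xb_i = np.array([0.3, -0.5, 0.2])   # X_i beta for this observation
Sigma = np.array([[1.0, 0.4, 0.2],
                  [0.4, 1.0, 0.1],
                  [0.2, 0.1, 1.0]])
print(mc_orthant_probability(y_i, xb_i, Sigma))

Summing the logs of such estimates over observations gives a simulated log-likelihood, but the crude frequency estimator is noisy and can return exactly zero, which is why smoother simulators such as GHK are preferred in practice.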