In computational statistics, the pseudo-marginal Metropolis–Hastings algorithm[1] is a Monte Carlo method to sample from a probability distribution $\pi(\theta)$. It is an instance of the popular Metropolis–Hastings algorithm that extends its use to cases where the target density is not available analytically. It relies on the fact that the Metropolis–Hastings algorithm can still sample from the correct target distribution if the target density in the acceptance ratio is replaced by an estimate. It is especially popular in Bayesian statistics, where it is applied if the likelihood function is not tractable (see example below).
See also: Metropolis–Hastings algorithm.
Given a current state $\theta_n$, a new state $\theta' \sim Q(\cdot \mid \theta_n)$ is proposed from a proposal distribution $Q$ and accepted, that is $\theta_{n+1} = \theta'$, with probability
\[
a(\theta_n, \theta') = \min\left(1,\; \frac{\pi(\theta')}{\pi(\theta_n)}\,\frac{Q(\theta_n \mid \theta')}{Q(\theta' \mid \theta_n)}\right);
\]
otherwise the old state is kept, that is, $\theta_{n+1} = \theta_n$.
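The accept/reject step above can be sketched as follows. This is a minimal illustration, not a reference implementation; the target, the random-walk proposal, and all function names are assumptions chosen for the example, and the computation is done on the log scale for numerical stability.

```python
import numpy as np

def metropolis_hastings_step(theta_n, log_pi, propose, log_q, rng):
    """One Metropolis-Hastings transition.

    log_pi  -- log of the target density pi (assumed evaluable)
    propose -- draws theta' ~ Q(. | theta_n)
    log_q   -- log Q(theta_to | theta_from)
    """
    theta_prop = propose(theta_n, rng)
    # log acceptance ratio:
    # log pi(theta') - log pi(theta_n) + log Q(theta_n|theta') - log Q(theta'|theta_n)
    log_a = (log_pi(theta_prop) - log_pi(theta_n)
             + log_q(theta_n, theta_prop) - log_q(theta_prop, theta_n))
    if np.log(rng.uniform()) < log_a:
        return theta_prop  # accept: theta_{n+1} = theta'
    return theta_n         # reject: theta_{n+1} = theta_n

# Usage: sample a standard normal target with a Gaussian random-walk proposal.
rng = np.random.default_rng(0)
log_pi = lambda t: -0.5 * t**2                       # N(0, 1) up to a constant
propose = lambda t, rng: t + rng.normal(scale=1.0)
log_q = lambda t_to, t_from: -0.5 * (t_to - t_from)**2  # symmetric, cancels anyway
chain = [0.0]
for _ in range(5000):
    chain.append(metropolis_hastings_step(chain[-1], log_pi, propose, log_q, rng))
```

Because the proposal here is symmetric, the $Q$-ratio in the acceptance probability equals one and could be dropped; it is kept to match the general formula.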
If the density $\pi$ is not available analytically, suppose that for every $\theta$ a non-negative, unbiased estimator $\hat{\pi}_\theta$ of $\pi(\theta)$ can be computed, i.e. $E[\hat{\pi}_\theta] = \pi(\theta)$. The algorithm is then modified as follows. Given the current state $\theta_n$ together with its stored density estimate $\hat{\pi}_{\theta_n}$, a new state $\theta' \sim Q(\cdot \mid \theta_n)$ is proposed and a fresh estimate $\hat{\pi}_{\theta'}$ is computed. The proposal is accepted, that is $\theta_{n+1} = \theta'$, with probability
\[
a(\theta_n, \theta') = \min\left(1,\; \frac{\hat{\pi}_{\theta'}}{\hat{\pi}_{\theta_n}}\,\frac{Q(\theta_n \mid \theta')}{Q(\theta' \mid \theta_n)}\right);
\]
otherwise the old state is kept, that is, $\theta_{n+1} = \theta_n$. Crucially, on rejection the old estimate $\hat{\pi}_{\theta_n}$ is carried over rather than recomputed; with this convention the resulting chain still has $\pi$ as its stationary distribution.
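The pseudo-marginal step differs from the exact one only in substituting the estimates and in carrying the current estimate through rejections. A minimal sketch, again with assumed names and a synthetic example in which the "estimate" is the true normal density multiplied by independent log-normal noise with mean one (so it is non-negative and unbiased):

```python
import numpy as np

def pseudo_marginal_step(theta_n, log_pi_hat_n, estimate_log_pi, propose, log_q, rng):
    """One pseudo-marginal MH transition.

    estimate_log_pi -- returns log of a non-negative *unbiased* estimate of pi(theta)
                       (unbiasedness is required of pi_hat itself, not of its log;
                        logs are used here only for numerical stability)
    """
    theta_prop = propose(theta_n, rng)
    log_pi_hat_prop = estimate_log_pi(theta_prop, rng)   # fresh estimate at theta'
    log_a = (log_pi_hat_prop - log_pi_hat_n
             + log_q(theta_n, theta_prop) - log_q(theta_prop, theta_n))
    if np.log(rng.uniform()) < log_a:
        return theta_prop, log_pi_hat_prop   # accept: keep the new estimate
    return theta_n, log_pi_hat_n             # reject: keep old state AND old estimate

# Usage: target N(0,1); pi_hat(theta) = pi(theta) * W with
# W ~ LogNormal(-s^2/2, s), so E[W] = 1 and E[pi_hat(theta)] = pi(theta).
rng = np.random.default_rng(1)
s = 0.2
def estimate_log_pi(theta, rng):
    return -0.5 * theta**2 + rng.normal(-0.5 * s**2, s)
propose = lambda t, rng: t + rng.normal(scale=1.0)
log_q = lambda t_to, t_from: 0.0  # symmetric proposal: the Q-ratio cancels
theta, log_pi_hat = 0.0, estimate_log_pi(0.0, rng)
chain = []
for _ in range(5000):
    theta, log_pi_hat = pseudo_marginal_step(
        theta, log_pi_hat, estimate_log_pi, propose, log_q, rng)
    chain.append(theta)
```

In practice the variance of $\hat{\pi}_\theta$ matters: very noisy estimates make the chain "stick" after an accidentally large estimate is accepted, so the noise level (here $s$) trades off estimation cost against mixing.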
In Bayesian statistics the target of inference is the posterior distribution
\[
p(\theta \mid y) = \frac{p_\theta(y)\, p(\theta)}{p(y)},
\]
where $p_\theta$ denotes the likelihood, $p$ the prior and $p(y)$ the marginal likelihood of the data. Evaluating the acceptance ratio requires the likelihood $p_\theta(y)$ for arbitrary values of $\theta$, which is not always available in closed form.
Consider a model consisting of i.i.d. latent real-valued random variables $Z_1, \ldots, Z_n$ with $Z_i \sim f_\theta(\cdot)$, observed only through noisy measurements $Y_i \mid Z_i = z \sim g_\theta(\cdot \mid z)$, where $g_\theta$ is a conditional observation density. Given observed data $y_1, \ldots, y_n$ and a prior $p(\theta)$, the posterior satisfies
\[
p(\theta \mid y_1, \ldots, y_n) \propto p_\theta(y_1, \ldots, y_n)\, p(\theta),
\]
so we need to find the likelihood function $p_\theta(y_1, \ldots, y_n)$. The likelihood of a single observation $y$ is
\[
p_\theta(y) = \int g_\theta(y \mid z)\, f_\theta(z)\, dz,
\]
and the joint likelihood of the observed data $y_1, \ldots, y_n$ is
\[
p_\theta(y_1, \ldots, y_n) = \prod_{i=1}^{n} p_\theta(y_i) = \prod_{i=1}^{n} \int g_\theta(y_i \mid z_i)\, f_\theta(z_i)\, dz_i.
\]
If the integral on the right-hand side is not analytically available, importance sampling can be used to estimate the likelihood. Introduce an auxiliary distribution $q$ such that $g_\theta(y \mid z)\, f_\theta(z) > 0 \Rightarrow q(z) > 0$ for all $z$; then
\[
\hat{p}_\theta(y_i) = \frac{1}{N} \sum_{k=1}^{N} \frac{g_\theta(y_i \mid Z_k)\, f_\theta(Z_k)}{q(Z_k)}, \qquad Z_k \overset{\text{i.i.d.}}{\sim} q(\cdot),
\]
is an unbiased estimator of $p_\theta(y_i)$.
Since the estimators for distinct observations use independent samples, their product is an unbiased estimator of the joint likelihood:
\[
\hat{p}_\theta(y_1, \ldots, y_n) = \prod_{i=1}^{n} \hat{p}_\theta(y_i) = \prod_{i=1}^{n} \frac{1}{N} \sum_{k=1}^{N} \frac{g_\theta(y_i \mid Z_{i,k})\, f_\theta(Z_{i,k})}{q(Z_{i,k})}, \qquad Z_{i,k} \overset{\text{i.i.d.}}{\sim} q(\cdot).
\]
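The single-observation estimator can be checked numerically. The concrete model below is an assumption made for illustration (it does not appear in the text): latent $Z_i \sim N(\theta, 1)$ and observations $Y_i \mid Z_i = z \sim N(z, 1)$, for which the exact likelihood $p_\theta(y) = N(y;\, \theta, \sqrt{2})$ is available in closed form and can be compared against the importance-sampling estimate.

```python
import numpy as np

def log_norm_pdf(x, mu, sigma):
    # log density of N(mu, sigma^2), vectorized over any argument
    return -0.5 * np.log(2 * np.pi * sigma**2) - 0.5 * (x - mu)**2 / sigma**2

def p_hat(theta, y_i, N, rng, q_mu=0.0, q_sigma=2.0):
    """Importance-sampling estimate of p_theta(y_i) with proposal q = N(q_mu, q_sigma^2).

    Implements (1/N) * sum_k g_theta(y_i | Z_k) f_theta(Z_k) / q(Z_k).
    """
    Z = rng.normal(q_mu, q_sigma, size=N)        # Z_k ~ q(.)
    log_w = (log_norm_pdf(y_i, Z, 1.0)           # g_theta(y_i | Z_k)
             + log_norm_pdf(Z, theta, 1.0)       # f_theta(Z_k)
             - log_norm_pdf(Z, q_mu, q_sigma))   # divided by q(Z_k)
    return np.mean(np.exp(log_w))

# Usage: compare the estimate with the exact single-observation likelihood.
rng = np.random.default_rng(2)
est = p_hat(theta=0.5, y_i=1.0, N=100_000, rng=rng)
exact = np.exp(log_norm_pdf(1.0, 0.5, np.sqrt(2.0)))
```

The proposal $q = N(0, 2^2)$ is wider than the integrand, which keeps the importance weights bounded here; a poorly matched $q$ would inflate the estimator's variance and, through the mechanism discussed above, degrade the mixing of the pseudo-marginal chain.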
Pseudo-marginal Metropolis–Hastings can be seen as a special case of so-called particle marginal Metropolis–Hastings algorithms. In the case of the latter, unbiased estimators of densities relating to static parameters in state-space models may be obtained using a particle filter. While the algorithm enables inference on both the joint space of static parameters and latent variables, when interest is only in the static parameters the algorithm is equivalent to a pseudo-marginal algorithm.[2]