In mathematics, Laplace's method, named after Pierre-Simon Laplace, is a technique used to approximate integrals of the form
$$\int_a^b e^{Mf(x)}\,dx,$$

where $f(x)$ is a twice-differentiable function, $M$ is a large number, and the endpoints $a$ and $b$ may be infinite.
In Bayesian statistics, Laplace's approximation can refer to either approximating the posterior normalizing constant with Laplace's method or approximating the posterior distribution with a Gaussian centered at the maximum a posteriori estimate.[1] [2] Laplace approximations are used in the integrated nested Laplace approximations method for fast approximations of Bayesian inference.
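As a concrete illustration of the Bayesian usage, the following Python sketch approximates a one-parameter posterior with a Gaussian centered at the MAP and estimates the normalizing constant with Laplace's method; the log-posterior here is purely hypothetical, chosen only for the example.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.integrate import quad

# Hypothetical unnormalized log-posterior for a single parameter theta
# (a Gaussian-like term times a logistic-flavoured prior; illustration only).
def log_post(theta):
    return -2.0 * (theta - 1.0) ** 2 - np.logaddexp(0.0, -theta)

# 1) MAP estimate: maximize the log-posterior.
res = minimize_scalar(lambda t: -log_post(t))
theta_map = res.x

# 2) Second derivative at the MAP via central differences.
eps = 1e-4
d2 = (log_post(theta_map + eps) - 2 * log_post(theta_map) + log_post(theta_map - eps)) / eps**2

# 3) Laplace approximation: posterior ~ N(theta_map, -1/d2) and
#    normalizing constant Z ~ exp(log_post(theta_map)) * sqrt(2*pi / -d2).
z_laplace = np.exp(log_post(theta_map)) * np.sqrt(2 * np.pi / -d2)
z_exact, _ = quad(lambda t: np.exp(log_post(t)), -np.inf, np.inf)
print(theta_map, z_laplace, z_exact)
```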
Suppose the function $f(x)$ has a unique global maximum at $x_0$. Let $M>0$ be a constant, and consider the following two functions:

$$g(x) = Mf(x), \qquad h(x) = e^{Mf(x)}.$$
Note that $x_0$ will be the global maximum of $g$ and $h$ as well. Now observe:

$$\frac{g(x_0)}{g(x)} = \frac{Mf(x_0)}{Mf(x)} = \frac{f(x_0)}{f(x)}, \qquad \frac{h(x_0)}{h(x)} = \frac{e^{Mf(x_0)}}{e^{Mf(x)}} = e^{M(f(x_0)-f(x))}.$$
As $M$ increases, the ratio for $h$ grows exponentially, while the ratio for $g$ does not change. Thus significant contributions to the integral of this function come only from points $x$ in a neighbourhood of $x_0$, which can then be estimated.
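A minimal numeric illustration of this contrast, using $f(x)=\cos x$ purely as an example:

```python
import numpy as np

# Illustrative f with a global maximum at x0 = 0 (any smooth peaked f would do).
f = np.cos
x0, x = 0.0, 0.5

for M in (1, 10, 100):
    ratio_g = (M * f(x0)) / (M * f(x))       # = f(x0)/f(x), independent of M
    ratio_h = np.exp(M * (f(x0) - f(x)))     # grows exponentially with M
    print(M, ratio_g, ratio_h)
```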
To state and motivate the method, one must make several assumptions. It is assumed that $x_0$ is not an endpoint of the interval of integration and that the values $f(x)$ cannot be very close to $f(x_0)$ unless $x$ is close to $x_0$.

$f(x)$ can be expanded around $x_0$ by Taylor's theorem,

$$f(x) = f(x_0) + f'(x_0)(x-x_0) + \frac{1}{2}f''(x_0)(x-x_0)^2 + R,$$

where $R = O\left((x-x_0)^3\right)$.
Since $f$ has a global maximum at $x_0$, and since $x_0$ is not an endpoint, it is a stationary point, so $f'(x_0) = 0$. Therefore, $f(x)$ may be approximated to quadratic order:

$$f(x) \approx f(x_0) + \frac{1}{2}f''(x_0)(x-x_0)^2.$$
Then, just one more step is needed to obtain a Gaussian. Since $x_0$ is a global maximum of $f$, by the definition of the second derivative $f''(x_0) \le 0$, so

$$f(x) \approx f(x_0) - \frac{1}{2}\left|f''(x_0)\right|(x-x_0)^2$$

for $x$ close to $x_0$, and the integral is approximated by

$$\int_a^b e^{Mf(x)}\,dx \approx e^{Mf(x_0)} \int_a^b e^{-\frac{1}{2}M\left|f''(x_0)\right|(x-x_0)^2}\,dx.$$
If $f''(x_0) < 0$, this latter integral can be approximated by a Gaussian integral over the whole real line, extending the limits of integration from $a$ and $b$ to $-\infty$ and $+\infty$, because when $M$ is large the integrand is sharply concentrated around $x_0$. This gives Laplace's approximation:

$$\int_a^b e^{Mf(x)}\,dx \approx e^{Mf(x_0)}\sqrt{\frac{2\pi}{M\left|f''(x_0)\right|}} \quad \text{as } M \to \infty.$$
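As a quick sanity check of this formula, the following Python sketch compares the Laplace approximation against direct numerical quadrature, using $f(x)=\cos x$ on $[-1,2]$ as an arbitrary example:

```python
import numpy as np
from scipy.integrate import quad

# Example integrand: f(x) = cos(x), global maximum on [-1, 2] at x0 = 0,
# with f''(x0) = -1.  Any smooth f with an interior maximum would work.
f = np.cos
a, b, x0, fpp = -1.0, 2.0, 0.0, -1.0

for M in (5, 50, 500):
    exact, _ = quad(lambda x: np.exp(M * f(x)), a, b)
    laplace = np.exp(M * f(x0)) * np.sqrt(2 * np.pi / (M * abs(fpp)))
    print(M, exact, laplace, laplace / exact)   # the ratio approaches 1 as M grows
```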
A generalization of this method and an extension to arbitrary precision are provided in the literature.
Suppose $f(x)$ is a twice continuously differentiable function on $[a,b]$, and there exists a unique point $x_0 \in (a,b)$ such that

$$f(x_0) = \max_{x\in[a,b]} f(x) \quad \text{and} \quad f''(x_0) < 0.$$

Then:

$$\lim_{n\to\infty} \frac{\int_a^b e^{nf(x)}\,dx}{e^{nf(x_0)}\sqrt{\dfrac{2\pi}{n\left(-f''(x_0)\right)}}} = 1.$$
Lower bound: Let $\varepsilon > 0$. Since $f''$ is continuous, there exists $\delta > 0$ such that if $|x_0 - c| < \delta$ then $f''(c) \ge f''(x_0) - \varepsilon$. By Taylor's theorem, for any $x \in (x_0-\delta, x_0+\delta)$,

$$f(x) \ge f(x_0) + \frac{1}{2}\left(f''(x_0) - \varepsilon\right)(x-x_0)^2.$$
Then we have the following lower bound:
$$\begin{aligned} \int_a^b e^{nf(x)}\,dx &\ge \int_{x_0-\delta}^{x_0+\delta} e^{nf(x)}\,dx \\ &\ge e^{nf(x_0)} \int_{x_0-\delta}^{x_0+\delta} e^{\frac{n}{2}\left(f''(x_0)-\varepsilon\right)(x-x_0)^2}\,dx \\ &= e^{nf(x_0)} \sqrt{\frac{1}{n\left(-f''(x_0)+\varepsilon\right)}} \int_{-\delta\sqrt{n(-f''(x_0)+\varepsilon)}}^{\delta\sqrt{n(-f''(x_0)+\varepsilon)}} e^{-\frac{1}{2}y^2}\,dy, \end{aligned}$$
where the last equality was obtained by a change of variables $y = \sqrt{n\left(-f''(x_0)+\varepsilon\right)}\,(x-x_0)$. Remember that $f''(x_0) < 0$, so we can take the square root of its negation.
If we divide both sides of the above inequality by

$$e^{nf(x_0)}\sqrt{\frac{2\pi}{n\left(-f''(x_0)\right)}}$$

and take the limit, we get

$$\lim_{n\to\infty} \frac{\int_a^b e^{nf(x)}\,dx}{e^{nf(x_0)}\sqrt{\dfrac{2\pi}{n\left(-f''(x_0)\right)}}} \ge \lim_{n\to\infty} \frac{1}{\sqrt{2\pi}} \int_{-\delta\sqrt{n(-f''(x_0)+\varepsilon)}}^{\delta\sqrt{n(-f''(x_0)+\varepsilon)}} e^{-\frac{1}{2}y^2}\,dy \; \sqrt{\frac{-f''(x_0)}{-f''(x_0)+\varepsilon}} = \sqrt{\frac{-f''(x_0)}{-f''(x_0)+\varepsilon}}.$$

Since this is true for arbitrary $\varepsilon$, we get the lower bound:

$$\lim_{n\to\infty} \frac{\int_a^b e^{nf(x)}\,dx}{e^{nf(x_0)}\sqrt{\dfrac{2\pi}{n\left(-f''(x_0)\right)}}} \ge 1.$$
Note that this proof also works when $a = -\infty$ or $b = \infty$ (or both).
Upper bound: The proof is similar to that of the lower bound, but there are a few inconveniences. Again we start by picking an $\varepsilon > 0$, but for the proof to work we need $\varepsilon$ small enough so that $f''(x_0) + \varepsilon < 0$. Then, as above, by continuity of $f''$ and Taylor's theorem we can find $\delta > 0$ so that if $|x - x_0| < \delta$, then

$$f(x) \le f(x_0) + \frac{1}{2}\left(f''(x_0) + \varepsilon\right)(x-x_0)^2.$$
Lastly, by our assumptions (assuming $a$ and $b$ are finite) there exists an $\eta > 0$ such that if $|x - x_0| \ge \delta$, then $f(x) \le f(x_0) - \eta$.
Then we can calculate the following upper bound:
$$\begin{aligned} \int_a^b e^{nf(x)}\,dx &\le \int_a^{x_0-\delta} e^{nf(x)}\,dx + \int_{x_0-\delta}^{x_0+\delta} e^{nf(x)}\,dx + \int_{x_0+\delta}^b e^{nf(x)}\,dx \\ &\le (b-a)\,e^{n(f(x_0)-\eta)} + \int_{x_0-\delta}^{x_0+\delta} e^{n\left(f(x_0)+\frac{1}{2}(f''(x_0)+\varepsilon)(x-x_0)^2\right)}\,dx \\ &\le (b-a)\,e^{n(f(x_0)-\eta)} + e^{nf(x_0)} \int_{x_0-\delta}^{x_0+\delta} e^{\frac{n}{2}\left(f''(x_0)+\varepsilon\right)(x-x_0)^2}\,dx \\ &\le (b-a)\,e^{n(f(x_0)-\eta)} + e^{nf(x_0)} \int_{-\infty}^{+\infty} e^{\frac{n}{2}\left(f''(x_0)+\varepsilon\right)(x-x_0)^2}\,dx \\ &\le (b-a)\,e^{n(f(x_0)-\eta)} + e^{nf(x_0)} \sqrt{\frac{2\pi}{n\left(-f''(x_0)-\varepsilon\right)}}. \end{aligned}$$
If we divide both sides of the above inequality by

$$e^{nf(x_0)}\sqrt{\frac{2\pi}{n\left(-f''(x_0)\right)}}$$

and take the limit, we get

$$\lim_{n\to\infty} \frac{\int_a^b e^{nf(x)}\,dx}{e^{nf(x_0)}\sqrt{\dfrac{2\pi}{n\left(-f''(x_0)\right)}}} \le \lim_{n\to\infty} \left[ (b-a)\,e^{-n\eta}\sqrt{\frac{n\left(-f''(x_0)\right)}{2\pi}} + \sqrt{\frac{-f''(x_0)}{-f''(x_0)-\varepsilon}} \right] = \sqrt{\frac{-f''(x_0)}{-f''(x_0)-\varepsilon}}.$$

Since $\varepsilon$ is arbitrary, we get the upper bound:

$$\lim_{n\to\infty} \frac{\int_a^b e^{nf(x)}\,dx}{e^{nf(x_0)}\sqrt{\dfrac{2\pi}{n\left(-f''(x_0)\right)}}} \le 1.$$
And combining this with the lower bound gives the result.
Note that the above proof obviously fails when $a = -\infty$ or $b = \infty$ (or both). To deal with these cases, we need some extra assumptions. A sufficient (not necessary) assumption is that for $n = 1$,

$$\int_a^b e^{nf(x)}\,dx < \infty,$$

and that the number $\eta$ as above exists (note that this must be an assumption in the case when the interval $[a,b]$ is infinite). The proof proceeds otherwise as above, but with a slightly different approximation of the integrals:

$$\int_a^{x_0-\delta} e^{nf(x)}\,dx + \int_{x_0+\delta}^b e^{nf(x)}\,dx \le \int_a^b e^{f(x)}\, e^{(n-1)(f(x_0)-\eta)}\,dx = e^{(n-1)(f(x_0)-\eta)} \int_a^b e^{f(x)}\,dx.$$

When we divide by

$$e^{nf(x_0)}\sqrt{\frac{2\pi}{n\left(-f''(x_0)\right)}},$$

we get for this term

$$e^{-(n-1)\eta}\, e^{-f(x_0)} \sqrt{\frac{n\left(-f''(x_0)\right)}{2\pi}} \int_a^b e^{f(x)}\,dx,$$

whose limit as $n\to\infty$ is $0$. The rest of the proof proceeds as above.

The given condition in the infinite interval case is, as said above, sufficient but not necessary. However, the condition is fulfilled in many, if not most, applications: the condition simply says that the integral we are studying must be well-defined (not infinite) and that the maximum of the function at $x_0$ must be a "true" maximum (the number $\eta > 0$ must exist). There is no need to demand that the integral is finite for $n = 1$; it is enough to demand that the integral is finite for some $n = N$.
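The rate of this convergence can be observed numerically. A small Python sketch, using $f(x) = -\cosh x$ on the whole real line as an arbitrary example (so that the exact integral is $2K_0(n)$, a modified Bessel function), shows the ratio approaching $1$:

```python
import numpy as np
from scipy.special import k0e   # exponentially scaled Bessel K0: k0e(n) = exp(n) * K0(n)

# Example: f(x) = -cosh(x) on (-inf, inf); maximum at x0 = 0, f(x0) = -1, f''(x0) = -1.
# Exact integral: int exp(-n*cosh(x)) dx = 2*K0(n); Laplace gives exp(-n)*sqrt(2*pi/n).
for n in (1, 10, 100, 1000):
    ratio = 2 * k0e(n) / np.sqrt(2 * np.pi / n)   # exact / Laplace; the exp(-n) factors cancel
    print(n, ratio)                               # tends to 1 as n grows
```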
This method relies on four basic concepts:

1. Relative error

The "approximation" in this method is related to the relative error and not the absolute error. Therefore, if we set
$$s = \sqrt{\frac{2\pi}{M\left|f''(x_0)\right|}},$$
the integral can be written as
$$\begin{aligned} \int_a^b e^{Mf(x)}\,dx &= s\,e^{Mf(x_0)} \frac{1}{s}\int_a^b e^{M(f(x)-f(x_0))}\,dx \\ &= s\,e^{Mf(x_0)} \int_{(a-x_0)/s}^{(b-x_0)/s} e^{M(f(sy+x_0)-f(x_0))}\,dy, \end{aligned}$$
where $s$ is a small number when $M$ is large, and the relative error is

$$\left| \int_{(a-x_0)/s}^{(b-x_0)/s} e^{M(f(sy+x_0)-f(x_0))}\,dy - 1 \right|.$$
Now, let us separate this integral into two parts: the region $y \in [-D_y, D_y]$ and the rest.
2. $e^{M(f(sy+x_0)-f(x_0))} \to e^{-\pi y^2}$ around the stationary point when $M$ is large enough
Let us look at the Taylor expansion of $M(f(x)-f(x_0))$ around $x_0$ and translate $x$ to $y$, since the comparison is done in $y$-space:

$$M\left(f(x)-f(x_0)\right) = \frac{Mf''(x_0)}{2}s^2y^2 + \frac{Mf'''(x_0)}{6}s^3y^3 + \cdots = -\pi y^2 + O\!\left(\frac{1}{\sqrt{M}}\right).$$

Note that $f'(x_0) = 0$ because $x_0$ is a maximum point. From this equation you will find that the terms beyond the second derivative in this Taylor expansion are suppressed as $\tfrac{1}{\sqrt{M}}$, so that $\exp\left(M(f(x)-f(x_0))\right)$ approaches the Gaussian function. Moreover,

$$\int_{-\infty}^{\infty} e^{-\pi y^2}\,dy = 1.$$
3. The larger $M$ is, the smaller the relevant range of $x$

Because we do the comparison in $y$-space, $y$ is fixed in $y \in [-D_y, D_y]$, which corresponds to $x \in [-sD_y, sD_y]$; since $s$ is inversely proportional to $\sqrt{M}$, the chosen region of $x$ gets smaller as $M$ increases.
4. If the integral in Laplace's method converges, the contribution of the region not around the stationary point to the relative error will tend to zero as $M$ grows.

Relying on the third concept, even if we choose a very large $D_y$, $sD_y$ will finally be a very small number when $M$ is increased to a huge value. Then, how can we guarantee that the integral of the rest will tend to $0$ when $M$ is large enough?
The basic idea is to find a function $m(x)$ with $m(x) \ge f(x)$ on the rest region whose integral $\int e^{Mm(x)}\,dx$ tends to zero as $M$ grows. Since $0 < e^{Mf(x)} \le e^{Mm(x)}$ there, the integral of $e^{Mf(x)}$ over that region will then tend to zero as well. For simplicity, choose $m(x)$ as the tangent line to $f(x)$ at the point $x = sD_y$.
If the interval of integration of this method is finite, we will find that no matter how $f(x)$ behaves on the rest region (as long as it is continuous there), it will always be smaller than the $m(x)$ chosen above when $M$ is large enough. It will be shown later that the integral of $e^{Mm(x)}$ tends to zero when $M$ is large enough.
If the interval of integration is infinite, $m(x)$ and $f(x)$ might always cross each other. If so, we cannot guarantee that the integral of $e^{Mf(x)}$ will finally tend to zero. For example, in the case of $f(x)=\tfrac{\sin(x)}{x}$, the integral $\int_0^\infty e^{Mf(x)}\,dx$ always diverges. Therefore, we need to require that $\int_d^\infty e^{Mf(x)}\,dx$ converges for the infinite-interval case. If so, this integral will tend to zero when $d$ is large enough, and we can choose this $d$ as the crossing point of $m(x)$ and $f(x)$.
You might ask why not require that $\int_d^\infty e^{f(x)}\,dx$ (without the factor $M$) converge. An example shows the reason. Suppose the rest part of $f(x)$ is $-\ln x$; then $e^{f(x)}=\tfrac{1}{x}$ and its integral diverges, whereas when $M=2$, $e^{Mf(x)}=\tfrac{1}{x^2}$ and its integral converges. So the integrals of some functions diverge when $M$ is not a large number, but converge when $M$ is large enough.
Based on these four concepts, we can derive the relative error of this method.
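Concept 2 above can also be checked numerically. The following Python sketch, again with the arbitrary example $f(x)=\cos x$ and $x_0=0$, measures how far the rescaled integrand is from the Gaussian $e^{-\pi y^2}$ as $M$ grows:

```python
import numpy as np

# Example: f(x) = cos(x) with maximum at x0 = 0 and f''(x0) = -1.
f, x0, fpp = np.cos, 0.0, -1.0
y = np.linspace(-3, 3, 601)          # compare on a fixed window in y-space

for M in (10, 100, 1000):
    s = np.sqrt(2 * np.pi / (M * abs(fpp)))
    rescaled = np.exp(M * (f(s * y + x0) - f(x0)))   # integrand written in y-space
    gaussian = np.exp(-np.pi * y ** 2)
    print(M, np.max(np.abs(rescaled - gaussian)))    # the difference shrinks as M grows
```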
Laplace's approximation is sometimes written as

$$\int_a^b h(x)\,e^{Mg(x)}\,dx \approx \sqrt{\frac{2\pi}{M\left|g''(x_0)\right|}}\; h(x_0)\, e^{Mg(x_0)} \quad \text{as } M \to \infty,$$

where $h$ is positive.
Importantly, the accuracy of the approximation depends on the variable of integration, that is, on what stays in $g(x)$ and what goes into $h(x)$.
First, use $x_0 = 0$ to denote the global maximum, which simplifies the derivation. We are interested in the relative error, written $|R|$:

$$\int_a^b h(x)\,e^{Mg(x)}\,dx = h(0)\,e^{Mg(0)}\,s \underbrace{\int_{a/s}^{b/s} \frac{h(sy)}{h(0)}\, e^{M\left[g(sy)-g(0)\right]}\,dy}_{1+R},$$

where

$$s \equiv \sqrt{\frac{2\pi}{M\left|g''(0)\right|}}.$$
So, if we let

$$A \equiv \frac{h(sy)}{h(0)}\, e^{M\left[g(sy)-g(0)\right]} \quad\text{and}\quad A_0 \equiv e^{-\pi y^2},$$

we can write

$$\left|R\right| = \left| \int_{a/s}^{b/s} A\,dy - \int_{-\infty}^{\infty} A_0\,dy \right|,$$

since

$$\int_{-\infty}^{\infty} A_0\,dy = 1.$$
For the upper bound, note that $|A+B| \le |A| + |B|$, so we can separate the integration into five parts of three different types $(a)$, $(b)$, $(c)$:

$$|R| < \underbrace{\left| \int_{D_y}^{\infty} A_0\,dy \right|}_{(a_1)} + \underbrace{\left| \int_{D_y}^{b/s} A\,dy \right|}_{(b_1)} + \underbrace{\left| \int_{-D_y}^{D_y} \left(A-A_0\right)dy \right|}_{(c)} + \underbrace{\left| \int_{a/s}^{-D_y} A\,dy \right|}_{(b_2)} + \underbrace{\left| \int_{-\infty}^{-D_y} A_0\,dy \right|}_{(a_2)},$$
where $(a_1)$ and $(a_2)$ are similar, so we only calculate $(a_1)$; likewise $(b_1)$ and $(b_2)$ are similar, so we only calculate $(b_1)$.
For $(a_1)$, after the change of variable $z \equiv \pi y^2$, we get

$$(a_1) = \left| \frac{1}{2\sqrt{\pi}} \int_{\pi D_y^2}^{\infty} e^{-z} z^{-1/2}\,dz \right| < \frac{e^{-\pi D_y^2}}{2\pi D_y}.$$

This means that as long as $D_y$ is large enough, this term tends to zero.
For $(b_1)$, we can get

$$(b_1) \le \left| \int_{D_y}^{b/s} \left[\frac{h(sy)}{h(0)}\right]_{\max} e^{Mm(sy)}\,dy \right|,$$

where $m(x) \ge g(x)-g(0)$ for $x \in [sD_y, b]$, $h(x)$ is positive, and $\left[\tfrac{h(sy)}{h(0)}\right]_{\max}$ is the maximum value of $h(x)/h(0)$ over this region. Choose $m(x)$ as the tangent to $g(x)-g(0)$ at the point $x = sD_y$, i.e.
$$m(sy) = g(sD_y) - g(0) + g'(sD_y)\left(sy - sD_y\right).$$
When $s$ or $D_y$ gets smaller, the region satisfying the above inequality gets larger. Therefore, if we want to find a suitable $m(x)$ to cover the whole $f(x)$ over the interval of $(b_1)$, $D_y$ has an upper limit. Besides, because the integration of $e^{-\alpha x}$ is simple, let us use it to estimate the relative error contributed by this $(b_1)$.
Based on Taylor expansion, we can get
$$\begin{aligned} M\left[g(sD_y)-g(0)\right] &= M\left[\frac{g''(0)}{2}s^2D_y^2 + \frac{g'''(\xi)}{6}s^3D_y^3\right] && \text{with } \xi \in [0, sD_y]\\ &= -\pi D_y^2 + \frac{(2\pi)^{3/2}\, g'''(\xi)\, D_y^3}{6\sqrt{M}\,\left|g''(0)\right|^{3/2}} \end{aligned}$$
and
$$\begin{aligned} Msg'(sD_y) &= Ms\left(g''(0)sD_y + \frac{g'''(\zeta)}{2}s^2D_y^2\right) && \text{with } \zeta \in [0, sD_y]\\ &= -2\pi D_y + \sqrt{\frac{2}{M}}\left(\frac{\pi}{\left|g''(0)\right|}\right)^{3/2} g'''(\zeta)\, D_y^2, \end{aligned}$$
and then substitute them back into the calculation of $(b_1)$. The remainders of these two expansions are both inversely proportional to $\sqrt{M}$, so we drop them to simplify the calculation (keeping them gives a sharper bound but a messier formula):
$$\begin{aligned} (b_1) &\le \left| \left[\frac{h(sy)}{h(0)}\right]_{\max} e^{-\pi D_y^2} \int_0^{b/s-D_y} e^{-2\pi D_y y}\,dy \right| \\ &\le \left| \left[\frac{h(sy)}{h(0)}\right]_{\max} e^{-\pi D_y^2}\, \frac{1}{2\pi D_y} \right|. \end{aligned}$$
Therefore, this term tends to zero when $D_y$ gets larger, but remember that the upper bound of $D_y$ must be respected in this calculation.
About the integration near $x = 0$, we can also use Taylor's theorem to calculate it. When $h'(0) \ne 0$,

$$(c) \le \int_{-D_y}^{D_y} e^{-\pi y^2} \left| \frac{s\,h'(\xi)}{h(0)}\, y \right| dy < \sqrt{\frac{2}{\pi M \left|g''(0)\right|}}\; \left|\frac{h'(\xi)}{h(0)}\right|_{\max} \left(1 - e^{-\pi D_y^2}\right),$$
and you can find that it is inversely proportional to the square root of $M$. In fact, $(c)$ behaves the same way when $h(x)$ is a constant.
Conclusively, the integral near the stationary point gets smaller as $\sqrt{M}$ gets larger, and the remaining parts tend to zero as long as $D_y$ is large enough; however, remember that $D_y$ has an upper limit, determined by whether the function $m(x)$ is always larger than $g(x)-g(0)$ on the rest region. As long as one $m(x)$ satisfying this condition can be found, the upper bound of $D_y$ can be chosen proportional to $\sqrt{M}$, since $m(x)$ is a tangent of $g(x)-g(0)$ at $x = sD_y$. So, the bigger $M$ is, the larger $D_y$ can be.
In the multivariate case, where $\mathbf{x}$ is a $d$-dimensional vector and $f(\mathbf{x})$ is a scalar function of $\mathbf{x}$, Laplace's approximation is usually written as

$$\int h(\mathbf{x})\, e^{Mf(\mathbf{x})}\,d\mathbf{x} \approx \left(\frac{2\pi}{M}\right)^{d/2} \frac{h(\mathbf{x}_0)\, e^{Mf(\mathbf{x}_0)}}{\left|-H(f)(\mathbf{x}_0)\right|^{1/2}} \quad \text{as } M \to \infty,$$

where $H(f)(\mathbf{x}_0)$ is the Hessian matrix of $f$ evaluated at $\mathbf{x}_0$ and $|\cdot|$ denotes the matrix determinant. Analogously to the univariate case, the Hessian is required to be negative definite.

By the way, although $\mathbf{x}$ denotes a $d$-dimensional vector, the term $d\mathbf{x}$ denotes an infinitesimal volume here, i.e. $d\mathbf{x} := dx_1\,dx_2 \cdots dx_d$.
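A minimal multivariate sketch in Python, with an arbitrary two-dimensional example $f(x,y) = -\cosh x - \cosh y - \tfrac{1}{2}xy$ and $h \equiv 1$ (both choices are purely illustrative):

```python
import numpy as np
from scipy.integrate import dblquad

# Arbitrary 2-D example: f(x, y) = -cosh(x) - cosh(y) - 0.5*x*y,
# global maximum at x0 = (0, 0) with Hessian H = [[-1, -0.5], [-0.5, -1]] (negative definite).
f = lambda x, y: -np.cosh(x) - np.cosh(y) - 0.5 * x * y
x0 = (0.0, 0.0)
H = np.array([[-1.0, -0.5], [-0.5, -1.0]])

for M in (5, 50):
    # Work with exp(M*(f - f(x0))) so the numbers stay in a sane range;
    # the peak is well inside the truncated domain [-4, 4] x [-4, 4].
    integrand = lambda y, x: np.exp(M * (f(x, y) - f(*x0)))
    exact, _ = dblquad(integrand, -4, 4, lambda x: -4, lambda x: 4)
    laplace = (2 * np.pi / M) / np.sqrt(np.linalg.det(-H))   # d = 2, h = 1, rescaled by exp(-M f(x0))
    print(M, exact, laplace, laplace / exact)
```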
See main article: Method of steepest descent.
In extensions of Laplace's method, complex analysis, and in particular Cauchy's integral formula, is used to find a contour of steepest descent for an (asymptotically with large M) equivalent integral, expressed as a line integral. In particular, if no point x0 where the derivative of $f$ vanishes exists on the real line, it may be necessary to deform the integration contour to an optimal one, where the above analysis is possible.
The appropriate formulation for the complex z-plane is

$$\int_a^b e^{Mf(z)}\,dz \approx \sqrt{\frac{2\pi}{-Mf''(z_0)}}\, e^{Mf(z_0)} \quad \text{as } M \to \infty$$
for a path passing through the saddle point at z0. Note the explicit appearance of a minus sign to indicate the direction of the second derivative: one must not take the modulus. Also note that if the integrand is meromorphic, one may have to add residues corresponding to poles traversed while deforming the contour (see for example section 3 of Okounkov's paper Symmetric functions and random partitions).
An extension of the steepest descent method is the so-called nonlinear stationary phase/steepest descent method. Here, instead of integrals, one needs to evaluate asymptotically solutions of Riemann–Hilbert factorization problems.
Given a contour C in the complex sphere, a function $f$ defined on that contour and a special point, say infinity, one seeks a function $M$ holomorphic away from the contour C, with prescribed jump across C, and with a given normalization at infinity. If $f$ and hence $M$ are matrices rather than scalars, this is a problem that in general does not admit an explicit solution.
An asymptotic evaluation is then possible along the lines of the linear stationary phase/steepest descent method. The idea is to reduce asymptotically the solution of the given Riemann–Hilbert problem to that of a simpler, explicitly solvable, Riemann–Hilbert problem. Cauchy's theorem is used to justify deformations of the jump contour.
The nonlinear stationary phase was introduced by Deift and Zhou in 1993, based on earlier work of Its. A (properly speaking) nonlinear steepest descent method was introduced by Kamvissis, K. McLaughlin and P. Miller in 2003, based on previous work of Lax, Levermore, Deift, Venakides and Zhou. As in the linear case, "steepest descent contours" solve a min-max problem. In the nonlinear case they turn out to be "S-curves" (defined in a different context back in the 80s by Stahl, Gonchar and Rakhmanov).
The nonlinear stationary phase/steepest descent method has applications to the theory of soliton equations and integrable models, random matrices and combinatorics.
In the generalization, evaluation of the integral is considered equivalent to finding the norm of the distribution with density $e^{Mf(x)}$.
Denoting the cumulative distribution by $F(x)$, if there is a diffeomorphic Gaussian distribution with density

$$e^{g-\frac{\gamma}{2}y^2},$$

the norm is given by

$$\sqrt{2\pi\gamma^{-1}}\,e^{g},$$

and the corresponding diffeomorphism is

$$y(x) = \frac{1}{\sqrt{\gamma}}\,\Phi^{-1}\!\left(\frac{F(x)}{F(\infty)}\right),$$

where $\Phi$ denotes the cumulative standard normal distribution function.
In general, any distribution diffeomorphic to the Gaussian distribution has density

$$e^{g-\frac{\gamma}{2}y^2(x)}\,y'(x),$$

and the median point is mapped to the median of the Gaussian distribution. Matching the logarithm of the density functions and their derivatives at the median point up to a given order yields a system of equations that determine the approximate values of $\gamma$ and $g$.
The approximation was introduced in 2019 by D. Makogon and C. Morais Smith primarily in the context of partition function evaluation for a system of interacting fermions.[5]
For complex integrals in the form

$$\frac{1}{2\pi i}\int_{c-i\infty}^{c+i\infty} g(s)\,e^{st}\,ds$$

with $t \gg 1$, we make the substitution $t = iu$ and the change of variable $s = c + ix$ to get the Laplace bilateral transform:
$$\frac{1}{2\pi}\int_{-\infty}^{\infty} g(c+ix)\,e^{-ux}e^{icu}\,dx.$$
We then split g(c + ix) in its real and complex part, after which we recover u = t/i. This is useful for inverse Laplace transforms, the Perron formula and complex integration.
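As a rough numerical illustration of this contour formula (not of any particular asymptotic regime), the following Python sketch evaluates the truncated Bromwich integral for the hypothetical choice $g(s) = 1/(s+1)$, whose inverse Laplace transform is $e^{-t}$, using the parametrization $s = c + ix$ described above:

```python
import numpy as np

# Bromwich/inverse-Laplace integral (1/(2*pi*i)) * int_{c-i inf}^{c+i inf} g(s) e^{s t} ds,
# truncated to |Im s| <= X and evaluated on a uniform grid with s = c + i*x.
g = lambda s: 1.0 / (s + 1.0)      # hypothetical example; the inverse transform is exp(-t)
c, t, X, n = 0.5, 2.0, 500.0, 1_000_001

x, dx = np.linspace(-X, X, n, retstep=True)
s = c + 1j * x
# ds = i*dx, and the i cancels the i in 1/(2*pi*i), leaving a plain 1/(2*pi) factor.
value = np.sum(g(s) * np.exp(s * t)) * dx / (2 * np.pi)
print(value.real, np.exp(-t))      # real parts agree; the imaginary part is ~ 0
```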
Laplace's method can be used to derive Stirling's approximation
$$N! \approx \sqrt{2\pi N}\left(\frac{N}{e}\right)^N$$
for a large integer N.
From the definition of the Gamma function, we have
$$N! = \Gamma(N+1) = \int_0^\infty e^{-x}x^N\,dx.$$
Now we change variables, letting $x = Nz$ so that $dx = N\,dz$. Substituting these values back in, we obtain
$$\begin{aligned} N! &= \int_0^\infty e^{-Nz}(Nz)^N N\,dz \\ &= N^{N+1}\int_0^\infty e^{-Nz}z^N\,dz \\ &= N^{N+1}\int_0^\infty e^{-Nz}e^{N\ln z}\,dz \\ &= N^{N+1}\int_0^\infty e^{N(\ln z - z)}\,dz. \end{aligned}$$
This integral has the form necessary for Laplace's method with

$$f(z) = \ln z - z,$$

which is twice differentiable:

$$f'(z) = \frac{1}{z} - 1, \qquad f''(z) = -\frac{1}{z^2}.$$
The maximum of $f(z)$ lies at $z_0 = 1$, and the second derivative of $f(z)$ has the value $-1$ at this point. Therefore, we obtain

$$N! \approx N^{N+1}\sqrt{\frac{2\pi}{N}}\,e^{-N} = \sqrt{2\pi N}\left(\frac{N}{e}\right)^N.$$
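A quick Python check of this approximation against the exact factorial:

```python
import math

for N in (5, 20, 100):
    stirling = math.sqrt(2 * math.pi * N) * (N / math.e) ** N
    exact = math.factorial(N)
    print(N, stirling / exact)   # relative accuracy improves as N grows (error roughly 1/(12N))
```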