In mathematics, Itô's lemma or Itô's formula (also called the Itô–Doeblin formula, especially in the French literature) is an identity used in Itô calculus to find the differential of a time-dependent function of a stochastic process. It serves as the stochastic calculus counterpart of the chain rule. It can be heuristically derived by forming the Taylor series expansion of the function up to its second derivatives and retaining terms up to first order in the time increment and second order in the Wiener process increment. The lemma is widely employed in mathematical finance, and its best known application is in the derivation of the Black–Scholes equation for option values.
Kiyoshi Itô published a proof of the formula in 1951.[1]
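Suppose we are given the stochastic differential equation
\[ dX_t = \mu_t\,dt + \sigma_t\,dB_t, \]
where $B_t$ is a Wiener process and the functions $\mu_t, \sigma_t$ are deterministic (not stochastic) functions of time. In general, it is not possible to write a solution $X_t$ directly in terms of $B_t.$ However, taking $X_0 = 0$, we can formally write an integral solution
\[ X_t = \int_0^t \mu_s\,ds + \int_0^t \sigma_s\,dB_s. \]
This expression lets us easily read off the mean and variance of $X_t$. First, every $dB_t$ individually has mean 0, so the expected value of $X_t$ is simply the integral of the drift function:
\[ \operatorname{E}[X_t] = \int_0^t \mu_s\,ds. \]
Similarly, because the $dB$ terms have variance $dt$ and are uncorrelated with one another, the variance of $X_t$ is the integral of the variance of each infinitesimal step of the random walk:
\[ \operatorname{Var}[X_t] = \int_0^t \sigma_s^2\,ds. \]
However, sometimes we are faced with a stochastic differential equation for a more complex process $Y_t,$ in which the process appears on both sides of the differential equation. That is, say
\[ dY_t = a_1(Y_t,t)\,dt + a_2(Y_t,t)\,dB_t \]
for some functions $a_1$ and $a_2.$ In this case, we cannot immediately write a formal solution as we did for the simpler case above. Instead, we hope to write the process $Y_t$ as a function of a simpler process $X_t$ taking the form above. That is, we want to identify three functions $f(t,x), \mu_t,$ and $\sigma_t,$ such that $Y_t=f(t,X_t)$ and $dX_t=\mu_t\,dt+\sigma_t\,dB_t.$ Itô's lemma is the tool used to find this transformation.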
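The mean and variance read off above can be checked numerically. The following is a minimal Monte Carlo sketch, not a definitive implementation: the coefficient functions `mu` and `sigma`, the grid size and the path count are hypothetical choices made only for illustration, and the process is assumed to start at 0.

```python
import math
import random

def simulate_X(T, n_steps, mu, sigma, rng):
    """One Euler sample of X_T = int_0^T mu(s) ds + int_0^T sigma(s) dB_s, X_0 = 0."""
    dt = T / n_steps
    x = 0.0
    for k in range(n_steps):
        s = k * dt
        dB = rng.gauss(0.0, math.sqrt(dt))  # Wiener increment ~ N(0, dt)
        x += mu(s) * dt + sigma(s) * dB
    return x

# Hypothetical deterministic coefficients (assumptions for this sketch).
def mu(s):
    return 1.0 + s        # int_0^1 mu(s) ds = 3/2

def sigma(s):
    return 2.0 * s        # int_0^1 sigma(s)^2 ds = 4/3

rng = random.Random(0)
T, n_steps, n_paths = 1.0, 100, 5000
samples = [simulate_X(T, n_steps, mu, sigma, rng) for _ in range(n_paths)]
sample_mean = sum(samples) / n_paths
sample_var = sum((x - sample_mean) ** 2 for x in samples) / (n_paths - 1)

print(sample_mean, 1.5)         # sample mean vs. integral of the drift
print(sample_var, 4.0 / 3.0)    # sample variance vs. integral of sigma^2
```

The two printed pairs should agree up to Monte Carlo and discretisation error; neither quantity requires knowing the path of $B_t$ itself.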
A formal proof of the lemma relies on taking the limit of a sequence of random variables. This approach is not presented here since it involves a number of technical details. Instead, we give a sketch of how one can derive Itô's lemma by expanding a Taylor series and applying the rules of stochastic calculus.
Suppose $X_t$ is an Itô drift-diffusion process that satisfies the stochastic differential equation
\[ dX_t = \mu_t\,dt + \sigma_t\,dB_t, \]
where $B_t$ is a Wiener process.
If $f(t,x)$ is a twice-differentiable scalar function, its expansion in a Taylor series is
\[ df = \frac{\partial f}{\partial t}\,dt + \frac{1}{2}\frac{\partial^2 f}{\partial t^2}\,dt^2 + \cdots + \frac{\partial f}{\partial x}\,dx + \frac{1}{2}\frac{\partial^2 f}{\partial x^2}\,dx^2 + \cdots. \]
Substituting $X_t$ for $x$ and therefore $\mu_t\,dt + \sigma_t\,dB_t$ for $dx$ gives
\[ df = \frac{\partial f}{\partial t}\,dt + \frac{1}{2}\frac{\partial^2 f}{\partial t^2}\,dt^2 + \cdots + \frac{\partial f}{\partial x}\left(\mu_t\,dt + \sigma_t\,dB_t\right) + \frac{1}{2}\frac{\partial^2 f}{\partial x^2}\left(\mu_t^2\,(dt)^2 + 2\mu_t\sigma_t\,dt\,dB_t + \sigma_t^2\,(dB_t)^2\right) + \cdots. \]
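In the limit $dt \to 0$, the terms $(dt)^2$ and $dt\,dB_t$ tend to zero faster than $(dB_t)^2$, which is $O(dt)$. Setting the $(dt)^2$ and $dt\,dB_t$ terms to zero, substituting $dt$ for $(dB_t)^2$ (due to the quadratic variation of a Wiener process), and collecting the $dt$ and $dB_t$ terms, we obtain
\[ df = \left(\frac{\partial f}{\partial t} + \mu_t\frac{\partial f}{\partial x} + \frac{\sigma_t^2}{2}\frac{\partial^2 f}{\partial x^2}\right)dt + \sigma_t\frac{\partial f}{\partial x}\,dB_t \]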
as required.
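The replacement of the squared Wiener increment by $dt$, which rests on the quadratic variation of a Wiener process, can be illustrated numerically. The sketch below is only an illustration under assumed settings (a uniform grid on $[0,T]$ with $T=1$, and a hypothetical helper name `quadratic_variation`): it sums squared simulated increments and shows the sum settling near $T$ as the grid is refined.

```python
import math
import random

def quadratic_variation(T, n_steps, rng):
    """Sum of squared Wiener increments over [0, T] on a uniform grid."""
    dt = T / n_steps
    return sum(rng.gauss(0.0, math.sqrt(dt)) ** 2 for _ in range(n_steps))

rng = random.Random(1)
T = 1.0
for n_steps in (100, 10_000, 100_000):
    # Each sum has mean T and standard deviation T * sqrt(2 / n_steps).
    print(n_steps, quadratic_variation(T, n_steps, rng))

qv_fine = quadratic_variation(T, 100_000, rng)
```

The fluctuation of the sum around $T$ shrinks like $\sqrt{2/n}$, which is why $(dB_t)^2$ behaves like the deterministic quantity $dt$ in the limit.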
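Suppose we know that $X_t, X_{t+dt}$ are two jointly Gaussian random variables, and $f$ is nonlinear but has a continuous second derivative. Then in general neither of $f(X_t), f(X_{t+dt})$ is Gaussian, and their joint distribution is not Gaussian either. However, since $X_{t+dt}\mid X_t$ is Gaussian, we might still hope that $f(X_{t+dt})\mid f(X_t)$ is Gaussian. This is not true when $dt$ is finite, but it becomes true when $dt$ becomes infinitesimal.
The key idea is that $X_{t+dt}=X_t+\mu_t\,dt+\sigma_t\,dB_t$ has a deterministic part and a noisy part. When $f$ is nonlinear, the noisy part makes a deterministic contribution. If $f$ is convex, this contribution is positive (by Jensen's inequality).
To find out how large the contribution is, we write $X_{t+dt}=X_t+\mu_t\,dt+\sigma_t\sqrt{dt}\,z$, where $z$ is a standard Gaussian, and perform a Taylor expansion:
\begin{align} f(X_{t+dt}) &= f(X_t) + f'(X_t)\mu_t\,dt + f'(X_t)\sigma_t\sqrt{dt}\,z + \frac{1}{2}f''(X_t)\left(\sigma_t^2 z^2\,dt + 2\mu_t\sigma_t z\,dt^{3/2} + \mu_t^2\,dt^2\right) + o(dt) \\ &= \left[f(X_t) + f'(X_t)\mu_t\,dt + \frac{1}{2}f''(X_t)\sigma_t^2\,dt + o(dt)\right] + \left[f'(X_t)\sigma_t\sqrt{dt}\,z + \frac{1}{2}f''(X_t)\sigma_t^2\left(z^2-1\right)dt + o(dt)\right]. \end{align}
This splits the increment into a deterministic part and a random part with mean zero. The random part is non-Gaussian, but its non-Gaussian components decay faster than the Gaussian component, so in the limit $dt\to0$ only the Gaussian part remains. The deterministic part contains the expected $f(X_t)+f'(X_t)\mu_t\,dt$, but also a contribution due to convexity, $\tfrac{1}{2}f''(X_t)\sigma_t^2\,dt$.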
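To understand why there should be a contribution due to convexity, consider the simplest case of a geometric Brownian walk (of the stock market): $S_{t+dt}=S_t\,e^{dB_t}$. In other words, $d(\ln S_t)=dB_t$. Let $X_t=\ln S_t$; then $S_t=e^{X_t}$, and $X_t$ is a Brownian walk. However, although the expectation of $X_t$ remains constant, the expectation of $S_t$ grows. Intuitively, this is because the downside is limited at zero while the upside is unlimited. That is, while $X_t$ is normally distributed, $S_t$ is log-normally distributed.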
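A minimal numerical sketch of this convexity effect, under the assumption (made here only for illustration) that $X_t \sim N(0,t)$ is a driftless Brownian motion started at 0 and $S_t = e^{X_t}$, so $S_0 = 1$. The sample mean of $X_t$ stays near 0, while the sample mean of $S_t$ sits near $e^{t/2}$, the mean of the corresponding log-normal distribution.

```python
import math
import random

rng = random.Random(2)
t, n_samples = 1.0, 200_000

# X_t ~ N(0, t): driftless Brownian motion at time t, started at 0.
xs = [rng.gauss(0.0, math.sqrt(t)) for _ in range(n_samples)]

mean_x = sum(xs) / n_samples                        # estimate of E[X_t]
mean_s = sum(math.exp(x) for x in xs) / n_samples   # estimate of E[S_t], S_t = exp(X_t)

print(mean_x)                     # near 0: the log-price has no drift
print(mean_s, math.exp(t / 2))    # near e^{t/2}: the price itself drifts upward
```

The gap between $e^{\operatorname{E}[X_t]} = 1$ and $\operatorname{E}[e^{X_t}]$ is exactly the Jensen-inequality contribution that the second-derivative term in Itô's lemma accounts for.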
In the following subsections we discuss versions of Itô's lemma for different types of stochastic processes.
In its simplest form, Itô's lemma states the following: for an Itô drift-diffusion process
\[ dX_t = \mu_t\,dt + \sigma_t\,dB_t \]
and any twice differentiable scalar function $f(t,x)$ of two real variables $t$ and $x$, one has
\[ df(t,X_t) = \left(\frac{\partial f}{\partial t} + \mu_t\frac{\partial f}{\partial x} + \frac{\sigma_t^2}{2}\frac{\partial^2 f}{\partial x^2}\right)dt + \sigma_t\frac{\partial f}{\partial x}\,dB_t. \]
This immediately implies that $f(t,X_t)$ is itself an Itô drift-diffusion process.
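As a quick worked instance of the formula (a sketch in which the choices $f(t,x)=x^2$, $\mu_t=0$, $\sigma_t=1$ and $B_0=0$ are assumptions made purely for illustration, so that $X_t=B_t$): the partial derivatives are $\partial f/\partial t=0$, $\partial f/\partial x=2x$ and $\partial^2 f/\partial x^2=2$, which gives
\begin{align} d\left(B_t^2\right) &= dt + 2B_t\,dB_t, \\ B_t^2 &= t + 2\int_0^t B_s\,dB_s. \end{align}
The extra $dt$ term, absent from the ordinary chain rule $d(x^2)=2x\,dx$, is the second-order correction.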
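In higher dimensions, if $\mathbf{X}_t=(X^1_t,X^2_t,\ldots,X^n_t)^T$ is a vector of Itô processes such that
\[ d\mathbf{X}_t=\boldsymbol{\mu}_t\,dt+\mathbf{G}_t\,d\mathbf{B}_t \]
for a vector $\boldsymbol{\mu}_t$ and matrix $\mathbf{G}_t$, Itô's lemma then states that
\begin{align} df(t,\mathbf{X}_t)&=\frac{\partial f}{\partial t}\,dt+\left(\nabla_{\mathbf{X}}f\right)^T d\mathbf{X}_t+\frac{1}{2}\left(d\mathbf{X}_t\right)^T\left(H_{\mathbf{X}}f\right)d\mathbf{X}_t,\\[4pt] &=\left\{\frac{\partial f}{\partial t}+\left(\nabla_{\mathbf{X}}f\right)^T\boldsymbol{\mu}_t+\frac{1}{2}\operatorname{Tr}\left[\mathbf{G}_t^T\left(H_{\mathbf{X}}f\right)\mathbf{G}_t\right]\right\}dt+\left(\nabla_{\mathbf{X}}f\right)^T\mathbf{G}_t\,d\mathbf{B}_t \end{align}
where $\nabla_{\mathbf{X}}f$ is the gradient of $f$ with respect to $\mathbf{X}$, $H_{\mathbf{X}}f$ is the Hessian matrix of $f$ with respect to $\mathbf{X}$, and $\operatorname{Tr}$ is the trace operator.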
We may also define functions on discontinuous stochastic processes.
Let $h$ be the jump intensity. The Poisson process model for jumps is that the probability of one jump in the interval $[t,t+\Delta t]$ is $h\,\Delta t$ plus higher order terms. The intensity $h$ could be a constant, a deterministic function of time, or a stochastic process. The survival probability $p_s(t)$ is the probability that no jump has occurred in the interval $[0,t]$. The change in the survival probability is
\[ dp_s(t)=-p_s(t)\,h(t)\,dt. \]
So
\[ p_s(t)=\exp\left(-\int_0^t h(u)\,du\right). \]
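The step from the differential relation to the exponential formula can be checked numerically. The following is a minimal sketch under an assumed, purely hypothetical deterministic intensity $h(t)=0.5+0.1t$: it integrates the differential relation with explicit Euler steps and compares the result with the closed-form exponential.

```python
import math

def hazard(t):
    """Hypothetical deterministic jump intensity h(t), for illustration only."""
    return 0.5 + 0.1 * t

def survival_euler(T, n_steps):
    """Integrate dp_s = -p_s * h * dt with explicit Euler steps, p_s(0) = 1."""
    dt = T / n_steps
    p = 1.0
    for k in range(n_steps):
        p -= p * hazard(k * dt) * dt
    return p

T = 2.0
p_numeric = survival_euler(T, 100_000)
# Closed form exp(-int_0^T h(u) du), with int_0^T (0.5 + 0.1 u) du = 0.5 T + 0.05 T^2.
p_exact = math.exp(-(0.5 * T + 0.05 * T ** 2))
print(p_numeric, p_exact)
```

For a stochastic intensity the same relation holds path by path, but the integral then has to be evaluated along each simulated path of $h$.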
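Let $S(t)$ be a discontinuous stochastic process. Write $S(t^-)$ for the value of $S$ as we approach $t$ from the left. Write $d_jS(t)$ for the non-infinitesimal change in $S(t)$ as a result of a jump. Then
\[ d_jS(t)=\lim_{\Delta t\to 0}\left(S(t+\Delta t)-S(t^-)\right). \]
Let $z$ be the magnitude of the jump and let $\eta(S(t^-),z)$ be the distribution of $z$. The expected magnitude of the jump is
\[ \operatorname{E}[d_jS(t)]=h(S(t^-))\,dt\int_z z\,\eta(S(t^-),z)\,dz. \]
Define $dJ_S(t)$, a compensated process and martingale, as
\[ dJ_S(t)=d_jS(t)-\operatorname{E}[d_jS(t)]=S(t)-S(t^-)-\left(h(S(t^-))\int_z z\,\eta\left(S(t^-),z\right)dz\right)dt. \]
Then
\[ d_jS(t)=\operatorname{E}[d_jS(t)]+dJ_S(t)=h(S(t^-))\left(\int_z z\,\eta(S(t^-),z)\,dz\right)dt+dJ_S(t). \]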
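Consider a function $g(S(t),t)$ of the jump process $dS(t)$. If $S(t)$ jumps by $\Delta s$ then $g(t)$ jumps by $\Delta g$. The jump $\Delta g$ is drawn from a distribution $\eta_g(\cdot)$ which may depend on $g(t^-)$ and $S(t^-)$. The jump part of $g$ is
\[ g(t)-g(t^-)=h(t)\,dt\int_{\Delta g}\Delta g\,\eta_g(\cdot)\,d\Delta g+dJ_g(t). \]
If $S$ contains drift, diffusion and jump parts, then Itô's lemma for $g(S(t),t)$ is
\[ dg(t)=\left(\frac{\partial g}{\partial t}+\mu\frac{\partial g}{\partial S}+\frac{\sigma^2}{2}\frac{\partial^2 g}{\partial S^2}+h(t)\int_{\Delta g}\left(\Delta g\,\eta_g(\cdot)\,d\Delta g\right)\right)dt+\frac{\partial g}{\partial S}\sigma\,dW(t)+dJ_g(t). \]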
Itô's lemma for a process which is the sum of a drift-diffusion process and a jump process is the sum of the Itô formulas for the individual parts.
Itô's lemma can also be applied to general $d$-dimensional semimartingales, which need not be continuous. In general, a semimartingale is a càdlàg process, and an additional term needs to be added to the formula to ensure that the jumps of the process are correctly given by Itô's lemma. For any càdlàg process $Y_t$, the left limit in $t$ is denoted by $Y_{t-}$, which is a left-continuous process. The jumps are written as $\Delta Y_t=Y_t-Y_{t-}$. Then, Itô's lemma states that if $X=(X^1,X^2,\ldots,X^d)$ is a $d$-dimensional semimartingale and $f$ is a twice continuously differentiable real-valued function on $\mathbb{R}^d$, with first and second partial derivatives denoted $f_i$ and $f_{i,j}$, then $f(X)$ is a semimartingale, and
\begin{align} f(X_t) &= f(X_0) +\sum_{i=1}^d\int_0^t f_i(X_{s-})\,dX^i_s + \frac{1}{2}\sum_{i,j=1}^d\int_0^t f_{i,j}(X_{s-})\,d[X^i,X^j]_s\\ &\quad + \sum_{s\le t}\left(\Delta f(X_s)-\sum_{i=1}^d f_i(X_{s-})\,\Delta X^i_s-\frac{1}{2}\sum_{i,j=1}^d f_{i,j}(X_{s-})\,\Delta X^i_s\,\Delta X^j_s\right). \end{align}
This differs from the formula for continuous semimartingales by the additional term summing over the jumps of $X$, which ensures that the jump of the right-hand side at time $t$ is $\Delta f(X_t)$.
There is also a version of this for a function $f$ that is twice continuously differentiable in space and once in time, evaluated at (potentially different) non-continuous semimartingales, which may be written as follows:
\begin{align} f(t,X^1_t,\ldots,X^d_t) ={}& f(0,X^1_0,\ldots,X^d_0) +\int_0^t \dot{f}(s_-,X^1_{s-},\ldots,X^d_{s-})\,ds\\ &{}+\sum_{i=1}^d\int_0^t f_i(s_-,X^1_{s-},\ldots,X^d_{s-})\,dX^{(c,i)}_s\\ &{}+\frac{1}{2}\sum_{i,j=1}^d\int_0^t f_{i,j}(s_-,X^1_{s-},\ldots,X^d_{s-})\,d\left[X^{(c,i)},X^{(c,j)}\right]_s\\ &{}+\sum_{0<s\leq t}\left[f(s,X^1_s,\ldots,X^d_s)-f(s_-,X^1_{s-},\ldots,X^d_{s-})\right] \end{align}
where $\dot{f}$ is the derivative of $f$ in its time argument and $X^{c,i}$ denotes the continuous part of the $i$-th semimartingale.
A process $S$ is said to follow a geometric Brownian motion with constant volatility $\sigma$ and constant drift $\mu$ if it satisfies the stochastic differential equation
\[ dS_t=\sigma S_t\,dB_t+\mu S_t\,dt \]
for a Brownian motion $B$. Applying Itô's lemma with $f(S_t)=\log(S_t)$ gives
\begin{align} df&=f'(S_t)\,dS_t+\frac{1}{2}f''(S_t)\,(dS_t)^2\\[4pt] &=\frac{1}{S_t}\,dS_t+\frac{1}{2}\left(-S_t^{-2}\right)\left(S_t^2\sigma^2\,dt\right)\\[4pt] &=\frac{1}{S_t}\left(\sigma S_t\,dB_t+\mu S_t\,dt\right)-\frac{1}{2}\sigma^2\,dt\\[4pt] &=\sigma\,dB_t+\left(\mu-\tfrac{\sigma^2}{2}\right)dt. \end{align}
It follows that
\[ \log(S_t)=\log(S_0)+\sigma B_t+\left(\mu-\tfrac{\sigma^2}{2}\right)t, \]
and exponentiating gives the expression for $S$,
\[ S_t=S_0\exp\left(\sigma B_t+\left(\mu-\tfrac{\sigma^2}{2}\right)t\right). \]
The correction term of $-\tfrac{\sigma^2}{2}$ corresponds to the difference between the median and mean of the log-normal distribution, or equivalently for this distribution, the geometric mean and arithmetic mean, with the median (geometric mean) being lower. This is due to the AM–GM inequality, and corresponds to the logarithm being concave (or convex upwards), so the correction term can accordingly be interpreted as a convexity correction. This is an infinitesimal version of the fact that the annualized return is less than the average return, with the difference proportional to the variance. See geometric moments of the log-normal distribution for further discussion.
The same factor of $\tfrac{\sigma^2}{2}$ appears in the $d_1$ and $d_2$ auxiliary variables of the Black–Scholes formula, and can be interpreted as a consequence of Itô's lemma.
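The role of the correction term can be seen in a small simulation. The sketch below is an illustration under assumed, hypothetical parameters, with `gbm_paths` a helper name introduced here. It discretises the geometric Brownian motion equation with the Euler–Maruyama scheme and compares the endpoint, on the same Brownian path, with the closed-form solution and with a naive exponential that omits the $-\tfrac{\sigma^2}{2}$ term.

```python
import math
import random

def gbm_paths(s0, mu, sigma, T, n_steps, rng):
    """Return (Euler-Maruyama S_T, closed-form S_T, naive S_T) on one Brownian path."""
    dt = T / n_steps
    s, b = s0, 0.0
    for _ in range(n_steps):
        dB = rng.gauss(0.0, math.sqrt(dt))
        s += mu * s * dt + sigma * s * dB   # dS = mu S dt + sigma S dB
        b += dB                             # accumulate B_T
    exact = s0 * math.exp(sigma * b + (mu - 0.5 * sigma ** 2) * T)
    naive = s0 * math.exp(sigma * b + mu * T)   # ordinary chain rule, no Ito correction
    return s, exact, naive

rng = random.Random(3)
s_euler, s_exact, s_naive = gbm_paths(s0=100.0, mu=0.05, sigma=0.2, T=1.0,
                                      n_steps=10_000, rng=rng)
print(s_euler, s_exact, s_naive)
```

With these parameters the discretised path should track the closed-form value closely, while the naive value is off by a factor of about $e^{\sigma^2 T/2}$.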
The Doléans-Dade exponential (or stochastic exponential) of a continuous semimartingale $X$ can be defined as the solution to the SDE $dY=Y\,dX$ with initial condition $Y_0=1$. It is sometimes denoted by $\mathcal{E}(X)$. Applying Itô's lemma with $f(Y)=\log(Y)$ gives
\begin{align} d\log(Y)&=\frac{1}{Y}\,dY-\frac{1}{2Y^2}\,d[Y]\\[6pt] &=dX-\tfrac{1}{2}\,d[X]. \end{align}
Exponentiating gives the solution
\[ Y_t=\exp\left(X_t-X_0-\tfrac{1}{2}[X]_t\right). \]
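As an illustrative check (assuming, for this sketch only, that $X_t=\sigma B_t+\mu t$ with constants $\sigma,\mu$ and a Wiener process $B$, so that $X_0=0$ and $[X]_t=\sigma^2 t$), the stochastic exponential becomes
\[ Y_t=\exp\left(\sigma B_t+\left(\mu-\tfrac{\sigma^2}{2}\right)t\right), \]
which is a geometric Brownian motion started at $Y_0=1$.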
Itô's lemma can be used to derive the Black–Scholes equation for an option.[2] Suppose a stock price follows a geometric Brownian motion given by the stochastic differential equation $dS=S(\sigma\,dB+\mu\,dt)$. Then, if the value of an option at time $t$ is $f(t,S_t)$, Itô's lemma gives
\[ df(t,S_t)=\left(\frac{\partial f}{\partial t}+\frac{1}{2}\left(S_t\sigma\right)^2\frac{\partial^2 f}{\partial S^2}\right)dt+\frac{\partial f}{\partial S}\,dS_t. \]
The term $\frac{\partial f}{\partial S}\,dS_t$ represents the change in value in time $dt$ of the trading strategy consisting of holding an amount $\frac{\partial f}{\partial S}$ of the stock. If this trading strategy is followed, and any cash held is assumed to grow at the risk-free rate $r$, then the total value $V$ of this portfolio satisfies the SDE
\[ dV_t=r\left(V_t-\frac{\partial f}{\partial S}S_t\right)dt+\frac{\partial f}{\partial S}\,dS_t. \]
This strategy replicates the option if $V=f(t,S)$. Combining these equations gives the celebrated Black–Scholes equation
\[ \frac{\partial f}{\partial t}+\frac{\sigma^2S^2}{2}\frac{\partial^2 f}{\partial S^2}+rS\frac{\partial f}{\partial S}-rf=0. \]
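One way to sanity-check the equation numerically is to plug in the standard Black–Scholes value of a European call and evaluate the left-hand side with central finite differences. This is a sketch under stated assumptions: the call formula is quoted rather than derived here, and the strike, maturity, rate, volatility, evaluation point and step sizes are hypothetical values chosen for illustration.

```python
import math

# Hypothetical contract and market parameters, for illustration only.
STRIKE, MATURITY, RATE, VOL = 100.0, 1.0, 0.03, 0.2

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def call_price(t, s):
    """Black-Scholes value of a European call at time t < MATURITY, stock price s."""
    tau = MATURITY - t
    d1 = (math.log(s / STRIKE) + (RATE + 0.5 * VOL ** 2) * tau) / (VOL * math.sqrt(tau))
    d2 = d1 - VOL * math.sqrt(tau)
    return s * norm_cdf(d1) - STRIKE * math.exp(-RATE * tau) * norm_cdf(d2)

def pde_residual(t, s, ht=1e-5, hs=1e-2):
    """f_t + (sigma^2 S^2 / 2) f_SS + r S f_S - r f, using central differences."""
    f = call_price(t, s)
    f_t = (call_price(t + ht, s) - call_price(t - ht, s)) / (2 * ht)
    f_s = (call_price(t, s + hs) - call_price(t, s - hs)) / (2 * hs)
    f_ss = (call_price(t, s + hs) - 2 * f + call_price(t, s - hs)) / hs ** 2
    return f_t + 0.5 * VOL ** 2 * s ** 2 * f_ss + RATE * s * f_s - RATE * f

residual = pde_residual(t=0.5, s=105.0)
print(residual)   # small compared with the individual terms of the equation
```

The drift $\mu$ of the stock does not appear anywhere in the check, mirroring its absence from the equation itself.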
Let $\mathbf{X}_t$ be a two-dimensional Itô process with SDE
\[ d\mathbf{X}_t=d\begin{pmatrix}X^1_t\\X^2_t\end{pmatrix}=\begin{pmatrix}\mu^1_t\\ \mu^2_t\end{pmatrix}dt+\begin{pmatrix}\sigma^1_t\\ \sigma^2_t\end{pmatrix}dB_t. \]
Then we can use the multi-dimensional form of Itô's lemma to find an expression for $d(X^1_tX^2_t)$.
We have $\boldsymbol{\mu}_t=\begin{pmatrix}\mu^1_t\\ \mu^2_t\end{pmatrix}$ and $\mathbf{G}=\begin{pmatrix}\sigma^1_t\\ \sigma^2_t\end{pmatrix}$.
We set $f(t,\mathbf{X}_t)=X^1_tX^2_t$ and observe that
\[ \frac{\partial f}{\partial t}=0,\quad \left(\nabla_{\mathbf{X}}f\right)^T=\begin{pmatrix}X^2_t & X^1_t\end{pmatrix},\quad H_{\mathbf{X}}f=\begin{pmatrix}0&1\\1&0\end{pmatrix}. \]
Substituting these values in the multi-dimensional version of the lemma gives us:
\begin{align} d(X^1_tX^2_t)&=df(t,\mathbf{X}_t)\\ &=0\cdot dt+\begin{pmatrix}X^2_t & X^1_t\end{pmatrix}d\mathbf{X}_t+\frac{1}{2}\begin{pmatrix}dX^1_t & dX^2_t\end{pmatrix}\begin{pmatrix}0&1\\1&0\end{pmatrix}\begin{pmatrix}dX^1_t\\ dX^2_t\end{pmatrix}\\ &=X^2_t\,dX^1_t+X^1_t\,dX^2_t+dX^1_t\,dX^2_t \end{align}
This is a generalisation of Leibniz's product rule to Itô processes, which are non-differentiable.
Further, using the second form of the multidimensional version above gives us
\begin{align} d(X^1_tX^2_t)&=\left\{0+\begin{pmatrix}X^2_t & X^1_t\end{pmatrix}\begin{pmatrix}\mu^1_t\\ \mu^2_t\end{pmatrix}+\frac{1}{2}\operatorname{Tr}\left[\begin{pmatrix}\sigma^1_t & \sigma^2_t\end{pmatrix}\begin{pmatrix}0&1\\1&0\end{pmatrix}\begin{pmatrix}\sigma^1_t\\ \sigma^2_t\end{pmatrix}\right]\right\}dt+\left(X^2_t\sigma^1_t+X^1_t\sigma^2_t\right)dB_t\\[5pt] &=\left(X^2_t\mu^1_t+X^1_t\mu^2_t+\sigma^1_t\sigma^2_t\right)dt+\left(X^2_t\sigma^1_t+X^1_t\sigma^2_t\right)dB_t \end{align}
so we see that the product $X^1_tX^2_t$ is itself an Itô drift-diffusion process.
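The product rule has an exact discrete analogue: for any increments, $\Delta(ab)=b\,\Delta a+a\,\Delta b+\Delta a\,\Delta b$. The sketch below is an illustration with hypothetical constant drifts, volatilities and starting values. It accumulates the three terms along a simulated path in which one Wiener increment drives both components, and also tracks the cross term on its own, which accounts for the $\sigma^1_t\sigma^2_t\,dt$ drift contribution.

```python
import math
import random

rng = random.Random(4)
T, n_steps = 1.0, 10_000
dt = T / n_steps
mu1, mu2 = 0.1, -0.2      # hypothetical constant drifts
sig1, sig2 = 0.3, 0.5     # hypothetical constant volatilities

x1, x2 = 1.0, 2.0         # hypothetical starting values
start_product = x1 * x2
ito_sum = 0.0             # accumulates X2 dX1 + X1 dX2 + dX1 dX2
cross_sum = 0.0           # accumulates dX1 dX2 alone
for _ in range(n_steps):
    dB = rng.gauss(0.0, math.sqrt(dt))   # one Wiener increment drives both components
    dx1 = mu1 * dt + sig1 * dB
    dx2 = mu2 * dt + sig2 * dB
    ito_sum += x2 * dx1 + x1 * dx2 + dx1 * dx2
    cross_sum += dx1 * dx2
    x1 += dx1
    x2 += dx2

print(x1 * x2 - start_product, ito_sum)   # equal up to floating-point rounding
print(cross_sum, sig1 * sig2 * T)         # cross term is close to sigma1 * sigma2 * T
```

Dropping the cross term, as the classical Leibniz rule would, leaves a discrepancy that does not vanish as the grid is refined.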
An idea by Hans Föllmer was to extend Itô's formula to functions with finite quadratic variation.[3]
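Let $f\in C^2$ be a real-valued function and $x:[0,\infty)\to\mathbb{R}$ a right-continuous function with left limits and finite quadratic variation $[x]$. Then
\begin{align}f(x_t)={}&f(x_0)+\int_0^t f'(x_{s-})\,dx_s+\frac{1}{2}\int_{]0,t]}f''(x_{s-})\,d[x,x]_s \\&+\sum_{0\leq s\leq t}\left(f(x_s)-f(x_{s-})-f'(x_{s-})\,\Delta x_s-\frac{1}{2}f''(x_{s-})\,(\Delta x_s)^2\right).\end{align}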
There exist a couple of extensions to infinite-dimensional spaces (e.g. Pardoux,[4] Gyöngy-Krylov,[5] Brzezniak-van Neerven-Veraar-Weis[6]).