Itô's lemma explained

In mathematics, Itô's lemma or Itô's formula (also called the Itô–Doeblin formula, especially in the French literature) is an identity used in Itô calculus to find the differential of a time-dependent function of a stochastic process. It serves as the stochastic calculus counterpart of the chain rule. It can be heuristically derived by forming the Taylor series expansion of the function up to its second derivatives and retaining terms up to first order in the time increment and second order in the Wiener process increment. The lemma is widely employed in mathematical finance, and its best known application is in the derivation of the Black–Scholes equation for option values.

Kiyoshi Itô published a proof of the formula in 1951.^[1]

Motivation

Suppose we are given the stochastic differential equation $dX_t = \mu_t\ dt + \sigma_t\ dB_t,$ where $B_t$ is a Wiener process and $\mu_t, \sigma_t$ are deterministic (not stochastic) functions of time. In general, it is not possible to write a solution $X_t$ directly in terms of $B_t.$

However, we can formally write an integral solution

X_t = \int_0^t \mu_s\ ds + \int_0^t \sigma_s\ dB_s.

This expression lets us easily read off the mean and variance of $X_t$ (since $X_t$ is Gaussian, these two quantities determine its distribution). First, notice that every $dB_t$ individually has mean 0, so the expected value of $X_t$ is simply the integral of the drift function:

\mathrm E[X_t]=\int_0^t \mu_s\ ds.

Similarly, because the $dB_t$ terms have variance $dt$ and no correlation with one another, the variance of $X_t$ is simply the integral of the variance of each infinitesimal step in the random walk:

\mathrm{Var}[X_t] = \int_0^t\sigma_s^2\ ds.
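The mean and variance formulas can be checked numerically. The following is a minimal sketch, not a definitive implementation: the coefficient functions `mu` and `sigma`, the step count, and the sample size are hypothetical choices made for illustration, and the integrals are approximated by a simple Euler sum with $X_0 = 0$.

```python
import math
import random


def simulate_X(mu, sigma, t_end, n_steps, rng):
    """One Euler sample of X_t = int mu_s ds + int sigma_s dB_s, with X_0 = 0."""
    dt = t_end / n_steps
    x = 0.0
    for k in range(n_steps):
        s = k * dt
        dB = rng.gauss(0.0, math.sqrt(dt))  # Wiener increment: mean 0, variance dt
        x += mu(s) * dt + sigma(s) * dB
    return x


# Hypothetical deterministic coefficients, chosen only for illustration.
def mu(s):
    return 2.0 * s        # integral of mu over [0, 1] is 1


def sigma(s):
    return 1.0 + s        # integral of sigma^2 over [0, 1] is 7/3


rng = random.Random(0)
samples = [simulate_X(mu, sigma, 1.0, 50, rng) for _ in range(10000)]
mean = sum(samples) / len(samples)
var = sum((x - mean) ** 2 for x in samples) / (len(samples) - 1)
print("sample mean:", mean, "sample variance:", var)
```

With these choices the sample mean should lie close to 1 and the sample variance close to 7/3, up to discretization and Monte Carlo error.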

However, sometimes we are faced with a stochastic differential equation for a more complex process $Y_t,$ in which the process appears on both sides of the differential equation. That is, say

dY_t = a_1(Y_t,t) \ dt + a_2(Y_t,t)\ dB_t,

for some functions $a_1$ and $a_2.$ In this case, we cannot immediately write a formal solution as we did for the simpler case above. Instead, we hope to write the process $Y_t$ as a function of a simpler process $X_t$ taking the form above. That is, we want to identify three functions $f(t,x), \mu_t,$ and $\sigma_t,$ such that $Y_t=f(t,X_t)$ and $dX_t=\mu_t\,dt+\sigma_t\,dB_t.$ In practice, Itô's lemma is used to find this transformation. Once we have transformed the problem into the simpler type of problem, we can determine the mean and higher moments of the process.

Informal derivation

A formal proof of the lemma relies on taking the limit of a sequence of random variables. This approach is not presented here since it involves a number of technical details. Instead, we give a sketch of how one can derive Itô's lemma by expanding a Taylor series and applying the rules of stochastic calculus.

Suppose $X_t$ is an Itô drift-diffusion process that satisfies the stochastic differential equation

dX_t=\mu_t\,dt+\sigma_t\,dB_t,

where $B_t$ is a Wiener process.

If $f(t,x)$ is a twice-differentiable scalar function, its expansion in a Taylor series is

df = \frac{\partial f}{\partial t}\,dt + \frac{1}{2}\frac{\partial^2 f}{\partial t^2}\,dt^2 + \cdots + \frac{\partial f}{\partial x}\,dx + \frac{1}{2}\frac{\partial^2 f}{\partial x^2}\,dx^2 + \cdots.

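Substituting $X_t$ for $x$ and therefore $\mu_t\,dt+\sigma_t\,dB_t$ for $dx$ gives

df = \frac{\partial f}{\partial t}\,dt + \frac{1}{2}\frac{\partial^2 f}{\partial t^2}\,dt^2 + \cdots + \frac{\partial f}{\partial x}(\mu_t\,dt+\sigma_t\,dB_t) + \frac{1}{2}\frac{\partial^2 f}{\partial x^2}\left(\mu_t^2\,(dt)^2 + 2\mu_t\sigma_t\,dt\,dB_t + \sigma_t^2\,(dB_t)^2\right) + \cdots.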

In the limit $dt\to0$, the terms $(dt)^2$ and $dt\,dB_t$ tend to zero faster than $(dB_t)^2$, which is $O(dt)$. Setting the $(dt)^2$ and $dt\,dB_t$ terms to zero, substituting $dt$ for $(dB_t)^2$ (due to the quadratic variation of a Wiener process), and collecting the $dt$ and $dB_t$ terms, we obtain

df = \left(\frac{\partial f}{\partial t} + \mu_t\frac{\partial f}{\partial x} + \frac{\sigma_t^2}{2}\frac{\partial^2 f}{\partial x^2}\right)dt + \sigma_t\frac{\partial f}{\partial x}\,dB_t

as required.
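The substitution of $dt$ for $(dB_t)^2$ is the step that separates Itô calculus from ordinary calculus, and it can be observed on a simulated path. The sketch below assumes a standard Wiener process ($\mu_t = 0$, $\sigma_t = 1$) and the test function $f(x) = x^2$, for which the lemma gives $d(B_t^2) = dt + 2B_t\,dB_t$; the grid size and seed are arbitrary illustrative choices.

```python
import math
import random

rng = random.Random(1)
T, n = 1.0, 20000
dt = T / n

B = 0.0
quad_var = 0.0       # running sum of (dB)^2, expected to approach T
ito_integral = 0.0   # running left-point sum of 2 * B * dB
for _ in range(n):
    dB = rng.gauss(0.0, math.sqrt(dt))
    quad_var += dB * dB
    ito_integral += 2.0 * B * dB   # integrand evaluated before the increment
    B += dB

# Ito's lemma for f(x) = x^2: B_T^2 = T + int_0^T 2 B dB
lhs = B * B
rhs = T + ito_integral
print("sum of (dB)^2:", quad_var)
print("B_T^2:", lhs, " T + 2*int B dB:", rhs)
```

The two printed quantities on the last line agree only because the extra $dt$ term is included; the ordinary chain rule would omit it and miss by roughly $T$.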

Geometric intuition

Suppose we know that $X_t, X_{t+dt}$ are two jointly Gaussian random variables and $f$ is nonlinear but has a continuous second derivative. Then in general neither of $f(X_t), f(X_{t+dt})$ is Gaussian, and their joint distribution is also not Gaussian. However, since $X_{t+dt}\mid X_t$ is Gaussian, we might still find that $f(X_{t+dt})\mid f(X_t)$ is Gaussian. This is not true when $dt$ is finite, but it becomes true when $dt$ becomes infinitesimal.

The key idea is that $X_{t+dt}=X_t+\mu_t\,dt+\sigma_t\,dB_t$ has a deterministic part and a noisy part. When $f$ is nonlinear, the noisy part has a deterministic contribution. If $f$ is convex, then the deterministic contribution is positive (by Jensen's inequality).

To find out how large the contribution is, we write $X_{t+dt}=X_t+\mu_t\,dt+\sigma_t\sqrt{dt}\,z$, where $z$ is a standard Gaussian, then perform Taylor expansion.

\begin{align}
f(X_{t+dt}) &= f(X_t) + f'(X_t)\mu_t\,dt + f'(X_t)\sigma_t\sqrt{dt}\,z + \frac12 f''(X_t)\left(\sigma_t^2 z^2\,dt + 2\mu_t\sigma_t z\,dt^{3/2} + \mu_t^2\,dt^2\right) + o(dt) \\
&= \left(f(X_t) + f'(X_t)\mu_t\,dt + \frac12 f''(X_t)\sigma_t^2\,dt + o(dt)\right) + \left(f'(X_t)\sigma_t\sqrt{dt}\,z + \frac12 f''(X_t)\sigma_t^2(z^2-1)\,dt + o(dt)\right)
\end{align}

We have split it into two parts, a deterministic part, and a random part with mean zero. The random part is non-Gaussian, but the non-Gaussian parts decay faster than the Gaussian part, and at the $dt\to0$ limit, only the Gaussian part remains. The deterministic part has the expected $f(X_t)+f'(X_t)\mu_t\,dt$, but also a part contributed by the convexity: $\frac12 f''(X_t)\sigma_t^2\,dt$.

To understand why there should be a contribution due to convexity, consider the simplest case of geometric Brownian motion (of the stock market), in which each step multiplies the price by a log-normal factor: $S_{t+dt}=S_te^{dB_t}$. In other words, $d(\ln S_t)=dB_t$. Let $X_t=\ln S_t$, then $S_t=e^{X_t}$, and $X_t$ is a Brownian motion. However, although the expectation of $X_t$ remains constant, the expectation of $S_t$ grows. Intuitively it is because the downside is limited at zero, but the upside is unlimited. That is, while $X_t$ is normally distributed, $S_t$ is log-normally distributed.
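The growth of the expectation under a convex map can be seen in a quick experiment. This sketch assumes $X_t \sim N(0, t)$ and $S_t = e^{X_t}$ with $S_0 = 1$; for this choice the convexity term $\tfrac12 f''\sigma^2\,dt$ with $f = \exp$ predicts $\mathrm E[S_t] = e^{t/2}$. The sample size and seed are illustrative assumptions.

```python
import math
import random

rng = random.Random(2)
t = 1.0
n_samples = 200000

# X_t is a centred Gaussian with variance t; S_t = exp(X_t) is log-normal.
xs = [rng.gauss(0.0, math.sqrt(t)) for _ in range(n_samples)]
mean_X = sum(xs) / n_samples
mean_S = sum(math.exp(x) for x in xs) / n_samples

print("mean of X_t:", mean_X)                        # stays near 0
print("mean of S_t:", mean_S, "vs", math.exp(t / 2)) # pushed up by convexity
```

The mean of $X_t$ stays near zero while the mean of $S_t$ sits well above $S_0 = 1$, which is the Jensen gap described above.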

Mathematical formulation of Itô's lemma

In the following subsections we discuss versions of Itô's lemma for different types of stochastic processes.

Itô drift-diffusion processes (due to: Kunita–Watanabe)

In its simplest form, Itô's lemma states the following: for an Itô drift-diffusion process

dX_t=\mu_t\,dt+\sigma_t\,dB_t

and any twice differentiable scalar function $f(t,x)$ of two real variables $t$ and $x$, one has

df(t,X_t)=\left(\frac{\partial f}{\partial t}+\mu_t\frac{\partial f}{\partial x}+\frac{\sigma_t^2}{2}\frac{\partial^2 f}{\partial x^2}\right)dt+\sigma_t\frac{\partial f}{\partial x}\,dB_t.

This immediately implies that $f(t,X_t)$ is itself an Itô drift-diffusion process.
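The formula can be read as a recipe: given $f$, $\mu_t$ and $\sigma_t$, it returns the drift and diffusion coefficients of $f(t,X_t)$. The helper below is a hypothetical illustration (the name `ito_coefficients` and the finite-difference step `h` are assumptions, not part of any library) that evaluates the recipe at a single point with numerical derivatives.

```python
def ito_coefficients(f, mu, sigma, t, x, h=1e-4):
    """Drift and diffusion coefficients of f(t, X_t) at the point (t, x).

    Partial derivatives are approximated by central finite differences,
    so the result is only reliable for smooth f.
    """
    f_t = (f(t + h, x) - f(t - h, x)) / (2 * h)
    f_x = (f(t, x + h) - f(t, x - h)) / (2 * h)
    f_xx = (f(t, x + h) - 2 * f(t, x) + f(t, x - h)) / (h * h)
    drift = f_t + mu * f_x + 0.5 * sigma ** 2 * f_xx
    diffusion = sigma * f_x
    return drift, diffusion


# Example: f(t, x) = x^2 with mu_t = 0.3 and sigma_t = 0.5, evaluated at x = 2.
drift, diffusion = ito_coefficients(lambda t, x: x * x, 0.3, 0.5, t=0.0, x=2.0)
print(drift, diffusion)
```

For $f(t,x) = x^2$ the lemma gives drift $2x\mu_t + \sigma_t^2$ and diffusion $2x\sigma_t$, so the numerical values should be close to 1.45 and 2.0 at this point.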

In higher dimensions, if $\mathbf{X}_t=(X_t^1,X_t^2,\ldots,X_t^n)^T$ is a vector of Itô processes such that

d\mathbf{X}_t=\boldsymbol{\mu}_t\,dt+\mathbf{G}_t\,d\mathbf{B}_t

for a vector $\boldsymbol{\mu}_t$ and matrix $\mathbf{G}_t$, Itô's lemma then states that

\begin{align}
df(t,\mathbf{X}_t) &= \frac{\partial f}{\partial t}\,dt+\left(\nabla_{\mathbf X}f\right)^T d\mathbf{X}_t+\frac{1}{2}\left(d\mathbf{X}_t\right)^T\left(H_{\mathbf X}f\right)d\mathbf{X}_t,\\[4pt]
&= \left\{\frac{\partial f}{\partial t}+\left(\nabla_{\mathbf X}f\right)^T\boldsymbol{\mu}_t+\frac{1}{2}\operatorname{Tr}\left[\mathbf{G}_t^T\left(H_{\mathbf X}f\right)\mathbf{G}_t\right]\right\}dt+\left(\nabla_{\mathbf X}f\right)^T\mathbf{G}_t\,d\mathbf{B}_t
\end{align}

where $\nabla_{\mathbf X}f$ is the gradient of $f$ w.r.t. $\mathbf{X}$, $H_{\mathbf X}f$ is the Hessian matrix of $f$ w.r.t. $\mathbf{X}$, and $\operatorname{Tr}$ is the trace operator.

Poisson jump processes

We may also define functions on discontinuous stochastic processes.

Let $h$ be the jump intensity. The Poisson process model for jumps is that the probability of one jump in the interval $[t,t+\Delta t]$ is $h\,\Delta t$ plus higher order terms. $h$ could be a constant, a deterministic function of time, or a stochastic process. The survival probability $p_s(t)$ is the probability that no jump has occurred in the interval $[0,t]$. The change in the survival probability is

dp_s(t)=-p_s(t)h(t)\,dt.

So

p_s(t)=\exp\left(-\int_0^t h(u)\,du\right).
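For a deterministic intensity, the survival probability is a one-line computation once the integral of $h$ is available. The sketch below is a minimal illustration that integrates $h$ with the trapezoidal rule; the function name `survival_probability` and the two example intensities are assumptions chosen for the example.

```python
import math


def survival_probability(h, t, n=1000):
    """exp(-int_0^t h(u) du), with the integral taken by the trapezoidal rule."""
    du = t / n
    integral = sum(0.5 * (h(k * du) + h((k + 1) * du)) * du for k in range(n))
    return math.exp(-integral)


# Constant intensity h = 0.5 on [0, 2]: the integral is 1.
p_const = survival_probability(lambda u: 0.5, 2.0)
# Intensity growing linearly in time, h(u) = u on [0, 2]: the integral is 2.
p_linear = survival_probability(lambda u: u, 2.0)
print(p_const, p_linear)
```

The two results should match $e^{-1}$ and $e^{-2}$ respectively, since the trapezoidal rule is exact for constant and linear integrands.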

Let $S(t)$ be a discontinuous stochastic process. Write $S(t^-)$ for the value of $S$ as we approach $t$ from the left. Write $d_jS(t)$ for the non-infinitesimal change in $S(t)$ as a result of a jump. Then

d_jS(t)=\lim_{\Delta t\to0}\left(S(t+\Delta t)-S(t^-)\right)

Let $z$ be the magnitude of the jump and let $\eta(S(t^-),z)$ be the distribution of $z$. The expected magnitude of the jump is

E[d_jS(t)]=h(S(t^-))\,dt\int_z z\,\eta(S(t^-),z)\,dz.

Define $dJ_S(t)$, a compensated process and martingale, as

dJ_S(t)=d_jS(t)-E[d_jS(t)]=S(t)-S(t^-)-\left(h(S(t^-))\int_z z\,\eta\left(S(t^-),z\right)dz\right)dt.

Then

d_jS(t)=E[d_jS(t)]+dJ_S(t)=h(S(t^-))\left(\int_z z\,\eta(S(t^-),z)\,dz\right)dt+dJ_S(t).

Consider a function $g(S(t),t)$ of the jump process $dS(t)$. If $S(t)$ jumps by $\Delta s$ then $g(t)$ jumps by $\Delta g$. $\Delta g$ is drawn from distribution $\eta_g(\cdot)$, which may depend on $g(t^-)$, $dg$ and $S(t^-)$. The jump part of $g$ is

g(t)-g(t^-)=h(t)\,dt\int_{\Delta g}\Delta g\,\eta_g(\cdot)\,d\Delta g+dJ_g(t).

If $S$ contains drift, diffusion and jump parts, then Itô's lemma for $g(S(t),t)$ is

dg(t)=\left(\frac{\partial g}{\partial t}+\mu\frac{\partial g}{\partial S}+\frac{\sigma^2}{2}\frac{\partial^2 g}{\partial S^2}+h(t)\int_{\Delta g}\left(\Delta g\,\eta_g(\cdot)\,d\Delta g\right)\right)dt+\frac{\partial g}{\partial S}\sigma\,dW(t)+dJ_g(t).

Itô's lemma for a process which is the sum of a drift-diffusion process and a jump process is the sum of the Itô's lemma formulas for the individual parts.

Non-continuous semimartingales

Itô's lemma can also be applied to general $d$-dimensional semimartingales, which need not be continuous. In general, a semimartingale is a càdlàg process, and an additional term needs to be added to the formula to ensure that the jumps of the process are correctly given by Itô's lemma. For any càdlàg process $Y_t$, the left limit in $t$ is denoted by $Y_{t-}$, which is a left-continuous process. The jumps are written as $\Delta Y_t=Y_t-Y_{t-}$. Then, Itô's lemma states that if $X=(X^1,X^2,\ldots,X^d)$ is a $d$-dimensional semimartingale and $f$ is a twice continuously differentiable real valued function on $\mathbb{R}^d$ then $f(X)$ is a semimartingale, and

\begin{align}
f(X_t) &= f(X_0)+\sum_{i=1}^d\int_0^t f_i(X_{s-})\,dX^i_s+\frac{1}{2}\sum_{i,j=1}^d\int_0^t f_{i,j}(X_{s-})\,d[X^i,X^j]_s\\
&\qquad+\sum_{s\le t}\left(\Delta f(X_s)-\sum_{i=1}^d f_i(X_{s-})\,\Delta X^i_s-\frac{1}{2}\sum_{i,j=1}^d f_{i,j}(X_{s-})\,\Delta X^i_s\,\Delta X^j_s\right).
\end{align}

This differs from the formula for continuous semimartingales by the additional term summing over the jumps of $X$, which ensures that the jump of the right hand side at time $t$ is $\Delta f(X_t)$.

Multiple non-continuous jump processes

There is also a version of this for a function $f$ that is twice continuously differentiable in space and once in time, evaluated at (potentially different) non-continuous semimartingales, which may be written as follows:

\begin{align}
f(t,X^1_t,\ldots,X^d_t) = {} & f(0,X^1_0,\ldots,X^d_0)+\int_0^t\dot{f}(s_-,X^1_{s_-},\ldots,X^d_{s_-})\,ds\\
& {}+\sum_{i=1}^d\int_0^t f_i(s_-,X^1_{s_-},\ldots,X^d_{s_-})\,dX^{(c,i)}_s\\
& {}+\frac{1}{2}\sum_{i_1,\ldots,i_d=1}^d\int_0^t f_{i_1,\ldots,i_d}(s_-,X^1_{s_-},\ldots,X^d_{s_-})\,dX^{(c,i_1)}_s\cdots X^{(c,i_d)}_s\\
& {}+\sum_{0<s\leq t}\left[f(s,X^1_s,\ldots,X^d_s)-f(s_-,X^1_{s_-},\ldots,X^d_{s_-})\right]
\end{align}

where $X^{c,i}$ denotes the continuous part of the $i$th semimartingale.

Examples

Geometric Brownian motion

A process $S$ follows a geometric Brownian motion with constant volatility $\sigma$ and constant drift $\mu$ if it satisfies the stochastic differential equation $dS_t=\sigma S_t\,dB_t+\mu S_t\,dt$, for a Brownian motion $B$. Applying Itô's lemma with $f(S_t)=\log(S_t)$ gives

\begin{align}
df &= f'(S_t)\,dS_t+\frac{1}{2}f''(S_t)(dS_t)^2\\[4pt]
&= \frac{1}{S_t}\,dS_t+\frac{1}{2}\left(-S_t^{-2}\right)\left(S_t^2\sigma^2\,dt\right)\\[4pt]
&= \frac{1}{S_t}\left(\sigma S_t\,dB_t+\mu S_t\,dt\right)-\frac{1}{2}\sigma^2\,dt\\[4pt]
&= \sigma\,dB_t+\left(\mu-\tfrac{\sigma^2}{2}\right)dt.
\end{align}

It follows that

\log(S_t)=\log(S_0)+\sigma B_t+\left(\mu-\tfrac{\sigma^2}{2}\right)t,

and exponentiating gives the expression for $S$,

S_t=S_0\exp\left(\sigma B_t+\left(\mu-\tfrac{\sigma^2}{2}\right)t\right).

The correction term of $-\tfrac{\sigma^2}{2}$ corresponds to the difference between the median and mean of the log-normal distribution, or equivalently for this distribution, the geometric mean and arithmetic mean, with the median (geometric mean) being lower. This is due to the AM–GM inequality, and corresponds to the logarithm being concave (or convex upwards), so the correction term can accordingly be interpreted as a convexity correction. This is an infinitesimal version of the fact that the annualized return is less than the average return, with the difference proportional to the variance. See geometric moments of the log-normal distribution for further discussion.

The same factor of $\tfrac{\sigma^2}{2}$ appears in the $d_1$ and $d_2$ auxiliary variables of the Black–Scholes formula, and can be interpreted as a consequence of Itô's lemma.
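The closed-form solution of geometric Brownian motion can be compared against a direct discretization of the SDE driven by the same Brownian increments. The following sketch uses an Euler–Maruyama step, which is a swapped-in numerical scheme rather than anything derived in the text; the parameter values and seed are illustrative assumptions.

```python
import math
import random

rng = random.Random(3)
S0, mu, sigma = 1.0, 0.1, 0.2   # hypothetical parameters
T, n = 1.0, 10000
dt = T / n

S_euler = S0
B = 0.0
for _ in range(n):
    dB = rng.gauss(0.0, math.sqrt(dt))
    # Euler-Maruyama step for dS = mu S dt + sigma S dB
    S_euler += mu * S_euler * dt + sigma * S_euler * dB
    B += dB

# Closed form from Ito's lemma, evaluated on the same Brownian path.
S_exact = S0 * math.exp(sigma * B + (mu - 0.5 * sigma ** 2) * T)
# The same expression without the -sigma^2/2 correction, for comparison.
S_naive = S0 * math.exp(sigma * B + mu * T)
print(S_euler, S_exact, S_naive)
```

On a fine grid the discretized path ends close to `S_exact`, while `S_naive` overshoots by the factor $e^{\sigma^2 T/2}$.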

Doléans-Dade exponential

The Doléans-Dade exponential (or stochastic exponential) of a continuous semimartingale $X$ can be defined as the solution to the SDE $dY=Y\,dX$ with initial condition $Y_0=1$. It is sometimes denoted by $\mathcal{E}(X)$. Applying Itô's lemma with $f(Y)=\log(Y)$ gives

\begin{align}
d\log(Y) &= \frac{1}{Y}\,dY-\frac{1}{2Y^2}\,d[Y]\\[6pt]
&= dX-\tfrac{1}{2}\,d[X].
\end{align}

Exponentiating gives the solution

Y_t=\exp\left(X_t-X_0-\tfrac{1}{2}[X]_t\right).

Black–Scholes formula

Itô's lemma can be used to derive the Black–Scholes equation for an option.^[2] Suppose a stock price follows a geometric Brownian motion given by the stochastic differential equation $dS=S(\sigma\,dB+\mu\,dt)$. Then, if the value of an option at time $t$ is $f(t,S_t)$, Itô's lemma gives

df(t,S_t)=\left(\frac{\partial f}{\partial t}+\frac{1}{2}\left(S_t\sigma\right)^2\frac{\partial^2 f}{\partial S^2}\right)dt+\frac{\partial f}{\partial S}\,dS_t.

The term $\frac{\partial f}{\partial S}\,dS_t$ represents the change in value in time $dt$ of the trading strategy consisting of holding an amount $\frac{\partial f}{\partial S}$ of the stock. If this trading strategy is followed, and any cash held is assumed to grow at the risk free rate $r$, then the total value $V$ of this portfolio satisfies the SDE

dV_t=r\left(V_t-\frac{\partial f}{\partial S}S_t\right)dt+\frac{\partial f}{\partial S}\,dS_t.

This strategy replicates the option if $V=f(t,S)$. Combining these equations gives the celebrated Black–Scholes equation

\frac{\partial f}{\partial t}+\frac{\sigma^2S^2}{2}\frac{\partial^2 f}{\partial S^2}+rS\frac{\partial f}{\partial S}-rf=0.
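One way to build confidence in the equation is to plug a known solution into it. The sketch below assumes the standard closed-form value of a European call (not derived in this article) and evaluates the left-hand side of the Black–Scholes equation with finite differences; the strike, maturity, rate, volatility, and step size are all illustrative assumptions.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def call_value(t, S, K=100.0, T=1.0, r=0.05, sigma=0.2):
    """Standard closed-form European call value (assumed, not derived here)."""
    tau = T - t
    d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return S * norm_cdf(d1) - K * math.exp(-r * tau) * norm_cdf(d2)


def pde_residual(f, t, S, r=0.05, sigma=0.2, h=1e-3):
    """Left-hand side of the Black-Scholes equation, via central differences."""
    f_t = (f(t + h, S) - f(t - h, S)) / (2 * h)
    f_S = (f(t, S + h) - f(t, S - h)) / (2 * h)
    f_SS = (f(t, S + h) - 2 * f(t, S) + f(t, S - h)) / (h * h)
    return f_t + 0.5 * sigma ** 2 * S ** 2 * f_SS + r * S * f_S - r * f(t, S)


residual = pde_residual(call_value, t=0.5, S=105.0)
print("PDE residual:", residual)
```

The residual should be zero up to finite-difference error, whereas dropping any single term of the equation leaves a residual of visibly larger size.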

Product rule for Itô processes

Let $\mathbf{X}_t$ be a two-dimensional Itô process with SDE:

d\mathbf{X}_t=d\begin{pmatrix}X_t^1\\X_t^2\end{pmatrix}=\begin{pmatrix}\mu_t^1\\\mu_t^2\end{pmatrix}dt+\begin{pmatrix}\sigma_t^1\\\sigma_t^2\end{pmatrix}dB_t

Then we can use the multi-dimensional form of Itô's lemma to find an expression for $d(X_t^1X_t^2)$.

We have $\boldsymbol{\mu}_t=\begin{pmatrix}\mu_t^1\\\mu_t^2\end{pmatrix}$ and $\mathbf{G}_t=\begin{pmatrix}\sigma_t^1\\\sigma_t^2\end{pmatrix}$.

We set $f(t,\mathbf{X}_t)=X_t^1X_t^2$ and observe that $\frac{\partial f}{\partial t}=0$, $(\nabla_{\mathbf X}f)^T=(X_t^2\ \ X_t^1)$ and $H_{\mathbf X}f=\begin{pmatrix}0&1\\1&0\end{pmatrix}$.

Substituting these values in the multi-dimensional version of the lemma gives us:

\begin{align}
d(X_t^1X_t^2) &= df(t,\mathbf{X}_t)\\
&= 0\cdot dt+(X_t^2\ \ X_t^1)\,d\mathbf{X}_t+\frac{1}{2}(dX_t^1\ \ dX_t^2)\begin{pmatrix}0&1\\1&0\end{pmatrix}\begin{pmatrix}dX_t^1\\dX_t^2\end{pmatrix}\\
&= X_t^2\,dX_t^1+X_t^1\,dX_t^2+dX_t^1\,dX_t^2
\end{align}

This is a generalisation of Leibniz's product rule to Itô processes, which are non-differentiable.

Further, using the second form of the multidimensional version above gives us

\begin{align}
d(X_t^1X_t^2) &= \left\{0+(X_t^2\ \ X_t^1)\begin{pmatrix}\mu_t^1\\\mu_t^2\end{pmatrix}+\frac{1}{2}\operatorname{Tr}\left[(\sigma_t^1\ \ \sigma_t^2)\begin{pmatrix}0&1\\1&0\end{pmatrix}\begin{pmatrix}\sigma_t^1\\\sigma_t^2\end{pmatrix}\right]\right\}dt+\left(X_t^2\sigma_t^1+X_t^1\sigma_t^2\right)dB_t\\[5pt]
&= \left(X_t^2\mu_t^1+X_t^1\mu_t^2+\sigma_t^1\sigma_t^2\right)dt+\left(X_t^2\sigma_t^1+X_t^1\sigma_t^2\right)dB_t
\end{align}

so we see that the product $X_t^1X_t^2$ is itself an Itô drift-diffusion process.
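The extra $dX_t^1\,dX_t^2$ term in the product rule can be measured on a simulated path. This sketch assumes constant coefficients and a single driving Brownian motion, as in the setup above; the numerical values are hypothetical. It accumulates the Leibniz part and the cross term separately, and the cross term should approach $\int_0^T\sigma_t^1\sigma_t^2\,dt$.

```python
import math
import random

rng = random.Random(4)
T, n = 1.0, 20000
dt = T / n
mu1, mu2, sig1, sig2 = 0.1, -0.2, 0.5, 0.8   # hypothetical constant coefficients

X1, X2 = 1.0, 2.0
start = X1 * X2
leibniz = 0.0   # running sum of X2 dX1 + X1 dX2 (classical product rule)
cross = 0.0     # running sum of dX1 dX2 (Ito correction)
for _ in range(n):
    dB = rng.gauss(0.0, math.sqrt(dt))   # one Brownian motion drives both components
    dX1 = mu1 * dt + sig1 * dB
    dX2 = mu2 * dt + sig2 * dB
    leibniz += X2 * dX1 + X1 * dX2
    cross += dX1 * dX2
    X1 += dX1
    X2 += dX2

change = X1 * X2 - start
print("change in product:", change)
print("Leibniz part:", leibniz, " cross term:", cross, " sig1*sig2*T:", sig1 * sig2 * T)
```

The change in the product equals the sum of the two accumulated parts exactly at the discrete level, and the cross term does not vanish as the grid is refined, unlike in ordinary calculus.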

Itô's formula for functions with finite quadratic variation

An idea by Hans Föllmer was to extend Itô's formula to functions with finite quadratic variation.^[3]

Let $f\in C^2$ be a real-valued function and $x:[0,\infty]\to\mathbb{R}$ a RCLL function with finite quadratic variation. Then

\begin{align}
f(x_t) = {} & f(x_0)+\int_0^t f'(x_{s-})\,dx_s+\frac{1}{2}\int_{]0,t]}f''(x_{s-})\,d[x,x]_s\\
& +\sum_{0\leq s\leq t}\left(f(x_s)-f(x_{s-})-f'(x_{s-})\Delta x_s-\frac{1}{2}f''(x_{s-})(\Delta x_s)^2\right).
\end{align}

Infinite-dimensional formulas

There exist a couple of extensions to infinite-dimensional spaces (e.g. Pardoux,^[4] Gyöngy-Krylov,^[5] Brzezniak-van Neerven-Veraar-Weis^[6]).

References

Kiyosi Itô (1944). Stochastic Integral. Proc. Imperial Acad. Tokyo 20, 519–524. This is the paper with the Itô formula.
Kiyosi Itô (1951). On stochastic differential equations. Memoirs, American Mathematical Society 4, 1–51.
Bernt Øksendal (2000). Stochastic Differential Equations. An Introduction with Applications, 5th edition, corrected 2nd printing. Springer. Sections 4.1 and 4.2.
Philip E Protter (2005). Stochastic Integration and Differential Equations, 2nd edition. Springer. Section 2.7.

External links

Derivation, Prof. Thayer Watkins
Informal proof, optiontutor

Notes and References

[1] Itô, Kiyoshi (1951). On a formula concerning stochastic differentials. Nagoya Math. J. 3, 55–65. doi:10.1017/S0027763000012216.
[2] Malliaris, A. G. (1982). Stochastic Methods in Economics and Finance. New York: North-Holland. pp. 220–223. ISBN 0-444-86201-3.
[3] Föllmer, Hans (1981). Calcul d'Ito sans probabilités. Séminaire de probabilités de Strasbourg 15, 143–144.
[4] Pardoux, Étienne (1974). Équations aux dérivées partielles stochastiques de type monotone. Séminaire Jean Leray 3.
[5] Gyöngy, István; Krylov, Nikolay Vladimirovich (1981). Ito formula in Banach spaces. In M. Arató, D. Vermes, A.V. Balakrishnan (eds.), Stochastic Differential Systems. Lecture Notes in Control and Information Sciences 36, 69–73. Springer, Berlin, Heidelberg. doi:10.1007/BFb0006409. ISBN 3-540-11038-0.
[6] Brzezniak, Zdzislaw; van Neerven, Jan M. A. M.; Veraar, Mark C.; Weis, Lutz (2008). Ito's formula in UMD Banach spaces and regularity of solutions of the Zakai equation. Journal of Differential Equations 245 (1), 30–58. doi:10.1016/j.jde.2008.03.026. arXiv:0804.0302.
