Polar factorization theorem

In optimal transport, a branch of mathematics, the polar factorization of vector fields is a basic result due to Brenier (1987),[1] with antecedents by Knott and Smith (1984)[2] and Rachev (1985).[3] It generalizes several classical results, including the polar decomposition of real matrices and the rearrangement of real-valued functions.

The theorem

Notation. Denote by $\xi_\#\mu$ the image measure of $\mu$ through the map $\xi$.

Definition: Measure-preserving map. Let $(X,\mu)$ and $(Y,\nu)$ be probability spaces and $\sigma : X \to Y$ a measurable map. Then $\sigma$ is said to be measure preserving if and only if $\sigma_\#\mu = \nu$, where $\sigma_\#\mu$ denotes the pushforward of $\mu$ by $\sigma$. Spelled out: for every $\nu$-measurable subset $\Omega$ of $Y$, the preimage $\sigma^{-1}(\Omega)$ is $\mu$-measurable and $\mu(\sigma^{-1}(\Omega)) = \nu(\Omega)$. The latter is equivalent to:

$$\int_X (f\circ\sigma)(x)\,\mu(dx) = \int_X (\sigma^* f)(x)\,\mu(dx) = \int_Y f(y)\,(\sigma_\#\mu)(dy) = \int_Y f(y)\,\nu(dy)$$

where $f$ is $\nu$-integrable and $f\circ\sigma$ is $\mu$-integrable.
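The integral characterization can be checked numerically on a simple case. The sketch below is only an illustration under stated assumptions: it takes $X = Y = [0,1]$ with $\mu = \nu$ the Lebesgue measure, and the map `sigma`, the test function `f`, and the midpoint-rule grid are all hypothetical choices that do not come from the text above. The map $\sigma(x) = |2x - 1|$ is not injective, yet it preserves the Lebesgue measure, so both integrals should agree up to quadrature error.

```python
import numpy as np

# Midpoint grid on [0, 1]; averaging over it approximates the Lebesgue integral.
n = 100_000
x = (np.arange(n) + 0.5) / n

def sigma(x):
    """Illustrative (assumed) measure-preserving map on [0, 1]: two-to-one, not injective."""
    return np.abs(2.0 * x - 1.0)

def f(y):
    """Arbitrary smooth test function, chosen only for this example."""
    return np.cos(3.0 * y) + y**2

# Left-hand side: integral of f(sigma(x)) against mu(dx).
lhs = np.mean(f(sigma(x)))
# Right-hand side: integral of f(y) against nu(dy).
rhs = np.mean(f(x))

print(f"integral of f o sigma: {lhs:.8f}")
print(f"integral of f:         {rhs:.8f}")
```

Replacing `sigma` with a map that does not preserve the Lebesgue measure, for example $x \mapsto x^2$, makes the two values differ.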

Theorem. Consider a map $\xi:\Omega\to\mathbb{R}^d$, where $\Omega$ is a convex subset of $\mathbb{R}^d$, and a measure $\mu$ on $\Omega$ that is absolutely continuous with respect to the Lebesgue measure. Assume that $\xi_\#\mu$ is also absolutely continuous. Then there exist a convex function $\varphi:\Omega\to\mathbb{R}$ and a map $\sigma:\Omega\to\Omega$ preserving $\mu$ such that

$$\xi = (\nabla\varphi)\circ\sigma.$$

In addition, $\nabla\varphi$ and $\sigma$ are uniquely defined almost everywhere.[4]

Applications and connections

Dimension 1

In dimension 1, when $\mu$ is the Lebesgue measure (that is, the uniform distribution) on the unit interval $[0,1]$, the result specializes to Ryff's theorem.[5] In this case the polar decomposition reduces to

$$\xi(t) = F_X^{-1}(\sigma(t)),$$

where $F_X$ is the cumulative distribution function of the random variable $X = \xi(U)$ and $U$ has a uniform distribution over $[0,1]$. The nondecreasing function $F_X^{-1}$ plays the role of $\nabla\varphi$. Here $F_X$ is assumed to be continuous, and $\sigma(t) = F_X(\xi(t))$ preserves the Lebesgue measure on $[0,1]$.
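A discrete analogue of the one-dimensional statement can be sketched with sorting. The example below is an illustration, not a construction taken from the cited references: the helper `polar_factorization_1d`, the grid, and the choice $\xi(t) = (2t-1)^2$ are assumptions made for the example. For this $\xi$, the random variable $\xi(U)$ has distribution function $\sqrt{x}$ on $[0,1]$, so the quantile function is $u \mapsto u^2$ and the measure-preserving factor is $t \mapsto |2t-1|$. On a grid, the sorted values of $\xi$ play the role of the quantile function and the ranks play the role of $\sigma$.

```python
import numpy as np

def polar_factorization_1d(xi_values):
    """Discrete sketch: split samples of xi into a nondecreasing rearrangement
    (stand-in for the quantile function) and a permutation of the grid indices
    (stand-in for the measure-preserving map)."""
    n = len(xi_values)
    order = np.argsort(xi_values, kind="stable")
    ranks = np.empty(n, dtype=int)
    ranks[order] = np.arange(n)        # ranks is the inverse permutation of order
    rearrangement = xi_values[order]   # sorted values, nondecreasing by construction
    return rearrangement, ranks

n = 1000
t = (np.arange(n) + 0.5) / n           # midpoint grid on [0, 1]
xi = (2.0 * t - 1.0) ** 2              # assumed example map

quantile, ranks = polar_factorization_1d(xi)
sigma = (ranks + 0.5) / n              # a permutation of the grid points

# xi = quantile o sigma holds exactly on the grid.
reconstruction = quantile[ranks]
# The discrete quantile function approximates u -> u**2 on the grid.
max_quantile_error = np.max(np.abs(quantile - t**2))

print("exact reconstruction:", np.array_equal(reconstruction, xi))
print("max deviation from u**2:", max_quantile_error)
```

Because `sigma` only permutes the grid points, it preserves the uniform measure on the grid, which mirrors the role of the Lebesgue-measure-preserving map in the continuous statement.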

Polar decomposition of matrices

When $\xi$ is a linear map and $\mu$ is the standard Gaussian distribution, the result coincides with the polar decomposition of matrices. Assume $\xi(x) = Mx$, where $M$ is an invertible $d\times d$ matrix, and let $\mu$ be the $\mathcal{N}(0, I_d)$ probability measure. The polar decomposition then reduces to

$$M = SO,$$

where $S$ is a symmetric positive definite matrix and $O$ is an orthogonal matrix. The connection with the polar factorization is given by $\varphi(x) = x^\top S x/2$, which is convex and satisfies $\nabla\varphi(x) = Sx$, and $\sigma(x) = Ox$, which preserves the $\mathcal{N}(0, I_d)$ measure.
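The factorization $M = SO$ can be computed numerically. The sketch below obtains it from the singular value decomposition, which is one standard route and is an assumption of this example rather than a method described in the text; the helper name `left_polar` and the sample matrix are hypothetical. Writing $M = U\Sigma V^\top$, one can take $S = U\Sigma U^\top$ and $O = UV^\top$.

```python
import numpy as np

def left_polar(M):
    """Return (S, O) with M = S @ O, S symmetric positive definite, O orthogonal.
    Computed from the SVD M = U diag(s) Vt; assumes M is square and invertible."""
    U, s, Vt = np.linalg.svd(M)
    S = (U * s) @ U.T   # U diag(s) U^T: scales the columns of U by the singular values
    O = U @ Vt          # product of orthogonal matrices, hence orthogonal
    return S, O

# Sample invertible matrix (determinant 7), chosen only for illustration.
M = np.array([[2.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 3.0]])

S, O = left_polar(M)

print("M == S O:        ", np.allclose(S @ O, M))
print("O orthogonal:    ", np.allclose(O @ O.T, np.eye(3)))
print("S symmetric:     ", np.allclose(S, S.T))
print("eigenvalues of S:", np.linalg.eigvalsh(S))
```

In the notation of the section, the gradient of $\varphi(x) = x^\top S x/2$ is $x \mapsto Sx$, and $x \mapsto Ox$ leaves the standard Gaussian measure invariant because its density depends only on $\|x\|$.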

Helmholtz decomposition

The result also makes it possible to recover the Helmholtz decomposition. Let $x \mapsto V(x)$ be a smooth vector field on $\Omega$. It can then be written in a unique way as

$$V = w + \nabla p,$$

where $p$ is a smooth real function defined on $\Omega$, unique up to an additive constant, and $w$ is a smooth divergence-free vector field, parallel to the boundary of $\Omega$.

The connection can be seen by assuming that $\mu$ is the Lebesgue measure on a compact set $\Omega\subset\mathbb{R}^n$ and by writing $\xi$ as a perturbation of the identity map,

$$\xi_\epsilon(x) = x + \epsilon V(x),$$

where $\epsilon$ is small. The polar decomposition of $\xi_\epsilon$ is given by $\xi_\epsilon = (\nabla\varphi_\epsilon)\circ\sigma_\epsilon$. Then, for any test function $f:\mathbb{R}^n\to\mathbb{R}$, the following holds:

$$\int_\Omega f(x + \epsilon V(x))\,dx = \int_\Omega f\big((\nabla\varphi_\epsilon)\circ\sigma_\epsilon(x)\big)\,dx = \int_\Omega f(\nabla\varphi_\epsilon(x))\,dx,$$

where the second equality uses the fact that $\sigma_\epsilon$ preserves the Lebesgue measure.

In fact, since $\varphi_0(x) = \tfrac{1}{2}\Vert x\Vert^2$, one can expand $\varphi_\epsilon(x) = \tfrac{1}{2}\Vert x\Vert^2 + \epsilon p(x) + O(\epsilon^2)$, and therefore $\nabla\varphi_\epsilon(x) = x + \epsilon\nabla p(x) + O(\epsilon^2)$. Expanding both sides of the identity above to first order in $\epsilon$ and comparing the terms of order $\epsilon$ gives

$$\int_\Omega \left(V(x) - \nabla p(x)\right)\cdot\nabla f(x)\,dx = 0$$

for any smooth function $f$, which implies that $w(x) = V(x) - \nabla p(x)$ is divergence-free.[6]

Notes and References

  1. Brenier, Yann (1991). "Polar factorization and monotone rearrangement of vector-valued functions". Communications on Pure and Applied Mathematics. 44 (4): 375–417. doi:10.1002/cpa.3160440402. Retrieved 16 April 2021.
  2. Knott, M.; Smith, C. S. (1984). "On the optimal mapping of distributions". Journal of Optimization Theory and Applications. 43: 39–49. doi:10.1007/BF00934745. S2CID 120208956. Retrieved 16 April 2021.
  3. Rachev, Svetlozar T. (1985). "The Monge–Kantorovich mass transference problem and its stochastic applications". Theory of Probability & Its Applications. 29 (4): 647–676. doi:10.1137/1129093. Retrieved 16 April 2021.
  4. Santambrogio, Filippo (2015). Optimal transport for applied mathematicians. New York: Birkhäuser. CiteSeerX 10.1.1.726.35.
  5. Ryff, John V. (1965). "Orbits of L1-Functions Under Doubly Stochastic Transformation". Transactions of the American Mathematical Society. 117: 92–100. doi:10.2307/1994198. JSTOR 1994198. Retrieved 16 April 2021.
  6. Villani, Cédric (2003). Topics in optimal transportation. American Mathematical Society.