Giry monad explained
In mathematics, the Giry monad is a construction that assigns to a measurable space a space of probability measures over it, equipped with a canonical sigma-algebra. It is one of the main examples of a probability monad.
It is implicitly used in probability theory whenever one considers probability measures which depend measurably on a parameter (giving rise to Markov kernels), or when one has probability measures over probability measures (such as in de Finetti's theorem).
Like many iterable constructions, it has the category-theoretic structure of a monad, on the category of measurable spaces.
Construction
The Giry monad, like every monad, consists of three structures:
- A functorial assignment, which in this case assigns to a measurable space $X$ a space of probability measures $PX$ over it;
- A natural map $\delta_X : X \to PX$ called the unit, which in this case assigns to each element of a space the Dirac measure over it;
- A natural map $\mathcal{E}_X : PPX \to PX$ called the multiplication, which in this case assigns to each probability measure over probability measures its expected value.
The space of probability measures
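Let $(X, \mathcal{F})$ be a measurable space. Denote by $PX$ the set of probability measures over $(X, \mathcal{F})$. We equip the set $PX$ with a sigma-algebra as follows. First of all, for every measurable set $A \in \mathcal{F}$, define the map $\varepsilon_A : PX \to \mathbb{R}$ by $p \mapsto p(A)$. We then define the sigma-algebra $\mathcal{PF}$ on $PX$ to be the smallest sigma-algebra which makes the maps $\varepsilon_A$ measurable, for all $A \in \mathcal{F}$ (where $\mathbb{R}$ is assumed equipped with the Borel sigma-algebra).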
Equivalently, $\mathcal{PF}$ can be defined as the smallest sigma-algebra on $PX$ which makes the maps
p \longmapsto \int_X f \, dp
measurable for all bounded measurable $f : X \to \mathbb{R}$.
The assignment
(X, \mathcal{F}) \mapsto (PX, \mathcal{PF})
is part of an endofunctor on the category of measurable spaces, usually denoted again by $P$. Its action on morphisms, i.e. on measurable maps, is via the pushforward of measures. Namely, given a measurable map $f : (X, \mathcal{F}) \to (Y, \mathcal{G})$, one assigns to $f$ the map
f_* : (PX, \mathcal{PF}) \to (PY, \mathcal{PG})
defined by
f_* p(B) = p(f^{-1}(B))
for all $p \in PX$ and all measurable sets $B \in \mathcal{G}$.
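For measures with finite support, the pushforward can be computed by moving point masses along the map. The following Python sketch is only an illustration under that finiteness assumption. The dictionary representation `{outcome: weight}` and the names `pushforward` and `measure_of` are hypothetical choices made here, not part of the construction itself.

```python
from collections import defaultdict
from fractions import Fraction


def pushforward(f, p):
    """Return f_* p for a finitely supported measure p = {outcome: weight}."""
    q = defaultdict(Fraction)
    for x, w in p.items():
        q[f(x)] += w  # the mass sitting at x is moved to f(x)
    return dict(q)


def measure_of(p, event):
    """Return p(A), where the event A is given as a predicate on outcomes."""
    return sum((w for x, w in p.items() if event(x)), Fraction(0))


def parity(k):
    """An example map f from {1, ..., 6} to {0, 1}."""
    return k % 2


die = {k: Fraction(1, 6) for k in range(1, 7)}  # a fair six-sided die
f_star_die = pushforward(parity, die)
```

In this example `f_star_die` gives mass 1/2 to each parity, and the defining identity $f_* p(B) = p(f^{-1}(B))$ can be checked for $B = \{1\}$ by comparing `measure_of(f_star_die, lambda y: y == 1)` with `measure_of(die, lambda x: parity(x) == 1)`.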
The Dirac delta map
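Given a measurable space $(X, \mathcal{F})$, the map
\delta : (X, \mathcal{F}) \to (PX, \mathcal{PF})
maps an element $x \in X$ to the Dirac measure $\delta_x \in PX$, defined on measurable subsets $A \in \mathcal{F}$ by
\delta_x(A) = 1_A(x) =
\begin{cases}
1 & \text{if } x \in A, \\
0 & \text{if } x \notin A.
\end{cases}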
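As a small illustration, the Dirac measure can be written down directly when measures are represented by their point masses. This is a sketch only: the dictionary representation and the helper names `dirac` and `measure_of` are assumptions made for the example.

```python
from fractions import Fraction


def dirac(x):
    """Dirac measure at x, represented as {outcome: weight}."""
    return {x: Fraction(1)}


def measure_of(p, subset):
    """Return p(A) for a subset A of outcomes."""
    return sum((w for x, w in p.items() if x in subset), Fraction(0))


inside = measure_of(dirac(3), {1, 2, 3})  # the point 3 lies in A
outside = measure_of(dirac(3), {1, 2})    # the point 3 does not lie in A
```

Here `inside` evaluates to 1 and `outside` to 0, matching the two cases of the indicator function.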
The expectation map
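Let $\mu \in PPX$, i.e. a probability measure over the probability measures over $(X, \mathcal{F})$. We define the probability measure $\mathcal{E}\mu \in PX$ by
\mathcal{E}\mu(A) = \int_{PX} p(A) \, \mu(dp)
for all measurable $A \in \mathcal{F}$. This gives a measurable, natural map
\mathcal{E} : (PPX, \mathcal{PPF}) \to (PX, \mathcal{PF}).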
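When the measure on measures has finite support, the integral defining the expectation map reduces to a weighted sum. The sketch below illustrates this special case only. The name `expectation_map` and the representation of a measure on measures as a list of `(weight, measure)` pairs are hypothetical choices for the example.

```python
from collections import defaultdict
from fractions import Fraction


def expectation_map(mu):
    """Flatten a finitely supported measure on measures.

    mu is a list of (weight, measure) pairs, and each measure is a dict
    {outcome: weight}. The result q satisfies q(A) = sum_i w_i * p_i(A).
    """
    q = defaultdict(Fraction)
    for w, p in mu:
        for x, px in p.items():
            q[x] += w * px
    return dict(q)


fair = {"H": Fraction(1, 2), "T": Fraction(1, 2)}
biased = {"H": Fraction(9, 10), "T": Fraction(1, 10)}

# Pick the fair coin with probability 1/3 and the biased one with 2/3.
mu = [(Fraction(1, 3), fair), (Fraction(2, 3), biased)]
flat = expectation_map(mu)
```

The flattened measure `flat` assigns probability 23/30 to heads and 7/30 to tails, which is the overall chance of each outcome when the coin itself is chosen at random.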
Example: mixture distributions
A mixture distribution, or more generally a compound distribution, can be seen as an application of the map $\mathcal{E}$. Let's see this for the case of a finite mixture. Let $p_1, \dots, p_n$ be probability measures on $(X, \mathcal{F})$, and consider the probability measure $q$ given by the mixture
q(A) = \sum_{i=1}^n w_i \, p_i(A)
for all measurable $A \in \mathcal{F}$, for some weights $w_i \ge 0$ satisfying $w_1 + \dots + w_n = 1$. We can view the mixture $q$ as the average $q = \mathcal{E}\mu$, where the measure on measures $\mu \in PPX$, which in this case is discrete, is given by
\mu = \sum_{i=1}^n w_i \, \delta_{p_i}.
More generally, the map $\mathcal{E} : PPX \to PX$ can be seen as the most general, non-parametric way to form arbitrary mixture or compound distributions.
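To see why a finite mixture is an instance of the expectation map, one can unfold the defining integral. The short derivation below assumes that $\mu$ is the discrete measure placing weight $w_i$ on each $p_i$, that is $\mu = \sum_{i} w_i \, \delta_{p_i}$, and uses the fact that integrating against a Dirac measure evaluates the integrand at that point:

```latex
\begin{aligned}
\mathcal{E}\mu(A)
  &= \int_{PX} p(A) \, \mu(dp) \\
  &= \sum_{i=1}^{n} w_i \int_{PX} p(A) \, \delta_{p_i}(dp) \\
  &= \sum_{i=1}^{n} w_i \, p_i(A) \\
  &= q(A).
\end{aligned}
```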
The triple $(P, \delta, \mathcal{E})$ is called the Giry monad.
Relationship with Markov kernels
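One of the properties of the sigma-algebra $\mathcal{PF}$ is that given measurable spaces $(X, \mathcal{F})$ and $(Y, \mathcal{G})$, we have a bijective correspondence between measurable functions $(X, \mathcal{F}) \to (PY, \mathcal{PG})$ and Markov kernels $(X, \mathcal{F}) \to (Y, \mathcal{G})$. This allows one to view a Markov kernel, equivalently, as a measurably parametrized probability measure.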
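In more detail, given a measurable function $f : (X, \mathcal{F}) \to (PY, \mathcal{PG})$, one can obtain the Markov kernel
f^\flat : (X, \mathcal{F}) \to (Y, \mathcal{G})
as follows,
f^\flat(B \mid x) = f(x)(B)
for every $x \in X$ and every measurable $B \in \mathcal{G}$ (note that $f(x) \in PY$ is a probability measure). Conversely, given a Markov kernel $k : (X, \mathcal{F}) \to (Y, \mathcal{G})$, one can form the measurable function
k^\sharp : (X, \mathcal{F}) \to (PY, \mathcal{PG})
mapping $x \in X$ to the probability measure $k^\sharp(x) \in PY$ defined by
k^\sharp(x)(B) = k(B \mid x)
for every measurable $B \in \mathcal{G}$. The two assignments are mutually inverse.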
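From the point of view of category theory, this correspondence can be interpreted as an adjunction
\mathrm{Hom}_{\mathrm{Meas}}(X, PY) \cong \mathrm{Hom}_{\mathrm{Stoch}}(X, Y)
between the category of measurable spaces and the category of Markov kernels. In particular, the category of Markov kernels can be seen as the Kleisli category of the Giry monad.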
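On finite spaces, a measurable function $X \to PY$ is simply a function returning a finitely supported measure, and composition in the Kleisli category becomes the Chapman-Kolmogorov sum. The sketch below illustrates this under the finiteness assumption. The names `kleisli` and `weather`, and the transition probabilities, are made up for the example.

```python
from collections import defaultdict
from fractions import Fraction


def kleisli(k, h):
    """Compose kernels k: X -> PY and h: Y -> PZ into a kernel X -> PZ.

    The composite sends x to the measure z |-> sum_y h(y)({z}) * k(x)({y}).
    """
    def composite(x):
        out = defaultdict(Fraction)
        for y, w in k(x).items():
            for z, v in h(y).items():
                out[z] += w * v
        return dict(out)
    return composite


def weather(today):
    """A toy Markov kernel on {"sun", "rain"}, written as a function X -> PX."""
    if today == "sun":
        return {"sun": Fraction(3, 4), "rain": Fraction(1, 4)}
    return {"sun": Fraction(1, 2), "rain": Fraction(1, 2)}


two_days = kleisli(weather, weather)  # two-step transition kernel
```

Calling `two_days("sun")` gives the distribution of the weather two days after a sunny day, here 11/16 for sun and 5/16 for rain. Representing a kernel as a function into measures is exactly the $k^\sharp$ point of view described above.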
Product distributions
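Given measurable spaces $(X, \mathcal{F})$ and $(Y, \mathcal{G})$, one can form the measurable space $(X \times Y, \mathcal{F} \times \mathcal{G})$ with the product sigma-algebra. Given probability measures $p \in PX$ and $q \in PY$, one can form the product measure $p \otimes q$ on $(X \times Y, \mathcal{F} \times \mathcal{G})$. This gives a natural, measurable map
(PX, \mathcal{PF}) \times (PY, \mathcal{PG}) \to (P(X \times Y), \mathcal{P(F \times G)})
usually denoted by $\nabla$ or by $\otimes$.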
The map
\nabla : PX \times PY \to P(X \times Y)
is in general not an isomorphism, since there are probability measures on $X \times Y$ which are not product distributions, for example in case of correlation. However, the maps
\nabla : PX \times PY \to P(X \times Y)
and the isomorphism $1 \cong P1$ (where $1$ denotes the one-point space) make the Giry monad a monoidal monad, and so in particular a commutative strong monad.
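The failure of surjectivity can be seen on a two-point space. The sketch below, restricted to finitely supported measures, uses the fact that a joint measure is a product measure exactly when it equals the product of its own marginals. The helper names `product` and `marginals` are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction


def product(p, q):
    """Product measure: the pair (x, y) gets mass p({x}) * q({y})."""
    return {(x, y): px * qy for x, px in p.items() for y, qy in q.items()}


def marginals(r):
    """Return the two marginals of a finitely supported measure on X x Y."""
    p, q = defaultdict(Fraction), defaultdict(Fraction)
    for (x, y), w in r.items():
        p[x] += w
        q[y] += w
    return dict(p), dict(q)


half = Fraction(1, 2)
coin = {0: half, 1: half}
independent = product(coin, coin)

# A perfectly correlated pair: both coordinates always agree.
correlated = {(0, 0): half, (1, 1): half}
is_product = product(*marginals(correlated)) == correlated
```

Both `independent` and `correlated` have fair-coin marginals, but `is_product` is `False`: the correlated measure is not in the image of the product map.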
Further properties
- If a measurable space $(X, \mathcal{F})$ is standard Borel, so is $(PX, \mathcal{PF})$. Therefore the Giry monad restricts to the full subcategory of standard Borel spaces.
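- The algebras for the Giry monad include the extended positive real line $[0, \infty]$, with the algebra structure map given by taking expected values. Explicitly, the structure map $e : P[0, \infty] \to [0, \infty]$ is given by
e(p) = \int_{[0, \infty)} x \, p(dx)
whenever $p$ is supported on $[0, \infty)$ and has finite expected value, and $e(p) = \infty$ otherwise.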
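For finitely supported measures on finite non-negative numbers, the two algebra laws for the expected value can be checked by hand: the mean of a Dirac measure is the point itself, and the mean of a flattened measure on measures equals the weighted average of the individual means. The sketch below illustrates this special case only and does not handle the value $\infty$. The names `dirac`, `expectation_map` and `mean` are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction


def dirac(x):
    """Dirac measure at x, represented as {outcome: weight}."""
    return {x: Fraction(1)}


def expectation_map(mu):
    """Flatten a list of (weight, measure) pairs into a single measure."""
    q = defaultdict(Fraction)
    for w, p in mu:
        for x, px in p.items():
            q[x] += w * px
    return dict(q)


def mean(p):
    """Structure map e(p) = sum over x of x * p({x}), for finite support."""
    return sum((w * x for x, w in p.items()), Fraction(0))


p1 = {0: Fraction(1, 2), 4: Fraction(1, 2)}  # mean 2
p2 = {1: Fraction(1, 4), 5: Fraction(3, 4)}  # mean 4
mu = [(Fraction(1, 3), p1), (Fraction(2, 3), p2)]

unit_law = mean(dirac(7)) == 7
lhs = mean(expectation_map(mu))                        # e applied to E(mu)
rhs = sum((w * mean(p) for w, p in mu), Fraction(0))   # e applied to (Pe)(mu)
```

Both `lhs` and `rhs` evaluate to 10/3, which is the finite-support form of the law of total expectation.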