In probability theory, regular conditional probability is a concept that formalizes the notion of conditioning on the outcome of a random variable. The resulting conditional probability distribution is a parametrized family of probability measures called a Markov kernel.
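Consider two random variables $X, Y : \Omega \to \mathbb{R}$. The conditional probability distribution of $Y$ given $X$ is a two-variable function
$$\kappa_{Y\mid X} : \mathbb{R} \times \mathcal{B}(\mathbb{R}) \to [0,1].$$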
If the random variable $X$ is discrete,
$$\kappa_{Y\mid X}(x,A) = P(Y\in A \mid X=x) = \begin{cases} \dfrac{P(Y\in A,\, X=x)}{P(X=x)} & \text{if } P(X=x)>0 \\[3pt] \text{arbitrary value} & \text{otherwise.} \end{cases}$$
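As a minimal sketch of the discrete case, the following Python snippet evaluates the kernel from a finite joint probability table. The table `joint`, the function name `kappa`, and the choice of 0 as the "arbitrary value" are assumptions made purely for illustration.

```python
from fractions import Fraction

# Hypothetical joint pmf P(X = x, Y = y); the numbers are illustrative only.
joint = {
    (0, 0): Fraction(1, 8), (0, 1): Fraction(3, 8),
    (1, 0): Fraction(1, 4), (1, 1): Fraction(1, 4),
}

def kappa(x, A, default=Fraction(0)):
    """Return P(Y in A | X = x) for the finite joint pmf above."""
    p_x = sum(p for (xi, _), p in joint.items() if xi == x)  # P(X = x)
    if p_x == 0:
        return default  # arbitrary value on the null event {X = x}
    p_joint = sum(p for (xi, yi), p in joint.items() if xi == x and yi in A)
    return p_joint / p_x

print(kappa(0, {1}))     # 3/4
print(kappa(1, {0, 1}))  # 1
print(kappa(2, {0}))     # 0 (the default, since P(X = 2) = 0)
```

Exact rational arithmetic is used so that the ratio can be compared with the formula by hand.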
If the random variables $X$, $Y$ are continuous with joint density $f_{X,Y}(x,y)$,
$$\kappa_{Y\mid X}(x,A) = \begin{cases} \dfrac{\int_A f_{X,Y}(x,y)\,dy}{\int_{\mathbb{R}} f_{X,Y}(x,y)\,dy} & \text{if } \int_{\mathbb{R}} f_{X,Y}(x,y)\,dy>0 \\[3pt] \text{arbitrary value} & \text{otherwise.} \end{cases}$$
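The continuous case can be approximated numerically. The sketch below assumes a hypothetical joint density $f_{X,Y}(x,y) = x + y$ on the unit square, restricts $A$ to an interval $[a,b]$, and uses a simple midpoint rule; none of these choices come from the text above.

```python
def f_xy(x, y):
    """Hypothetical joint density: x + y on the unit square, 0 elsewhere."""
    return x + y if 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0 else 0.0

def integrate(g, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def kappa(x, a, b, default=0.0):
    """Approximate P(Y in [a, b] | X = x) for the density above."""
    # The density vanishes outside [0, 1], so integrating over [0, 1]
    # stands in for the integral over the whole real line.
    marginal = integrate(lambda y: f_xy(x, y), 0.0, 1.0)
    if marginal <= 0.0:
        return default  # arbitrary value where the marginal density is zero
    return integrate(lambda y: f_xy(x, y), a, b) / marginal

print(kappa(0.5, 0.0, 0.5))  # close to 0.375
```

For this density the exact value is $(x/2 + 1/8)/(x + 1/2)$, which equals $0.375$ at $x = 0.5$.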
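A more general definition can be given in terms of conditional expectation. Consider a function $e_{Y\in A} : \mathbb{R} \to [0,1]$ satisfying
$$e_{Y\in A}(X(\omega)) = \operatorname{E}[1_{Y\in A} \mid X](\omega)$$
for almost all $\omega$. Then the conditional probability distribution is given by
$$\kappa_{Y\mid X}(x,A) = e_{Y\in A}(x).$$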
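As with conditional expectation, this can be further generalized to conditioning on a sigma algebra $\mathcal{F}$. In that case the conditional distribution is a function $\Omega \times \mathcal{B}(\mathbb{R}) \to [0,1]$:
$$\kappa_{Y\mid\mathcal{F}}(\omega,A) = \operatorname{E}[1_{Y\in A} \mid \mathcal{F}](\omega).$$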
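For working with $\kappa_{Y\mid X}$, it is important that it be regular, that is:
1. For almost all $x$, $A \mapsto \kappa_{Y\mid X}(x,A)$ is a probability measure.
2. For all $A$, $x \mapsto \kappa_{Y\mid X}(x,A)$ is a measurable function.
In other words, $\kappa_{Y\mid X}$ is a Markov kernel.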
The second condition holds trivially, but the proof of the first is more involved. It can be shown that if $Y$ is a random element $\Omega \to S$ in a Radon space $S$, there exists a $\kappa_{Y\mid X}$ that satisfies the first condition.
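On finite spaces both regularity conditions can be checked directly. The following sketch assumes a hypothetical kernel stored as a table of point masses `kernel[x][y]`; measurability in $x$ is automatic here because every function on a finite set is measurable, so only the probability-measure condition is tested.

```python
from fractions import Fraction

# Hypothetical kernel on finite spaces: kernel[x][y] = kappa(x, {y}).
kernel = {
    "a": {0: Fraction(1, 4), 1: Fraction(3, 4)},
    "b": {0: Fraction(1, 2), 1: Fraction(1, 2)},
}

def kappa(x, A):
    """kappa(x, A), obtained by summing the point masses of x over A."""
    return sum((p for y, p in kernel[x].items() if y in A), Fraction(0))

def is_probability_measure(x):
    """Check non-negativity and total mass one for A -> kappa(x, A)."""
    ys = set(kernel[x])
    return all(p >= 0 for p in kernel[x].values()) and kappa(x, ys) == 1

print(all(is_probability_measure(x) for x in kernel))  # True
```

Additivity holds by construction, since `kappa` sums point masses over disjoint outcomes.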
For discrete and continuous random variables, the conditional expectation can be expressed as
$$\begin{aligned} \operatorname{E}[Y\mid X=x] &= \sum_y y\,P(Y=y\mid X=x) \\ \operatorname{E}[Y\mid X=x] &= \int y\,f_{Y\mid X}(x,y)\,dy \end{aligned}$$
where $f_{Y\mid X}(x,y)$ is the conditional density of $Y$ given $X$.
This result can be extended to measure-theoretic conditional expectation using the regular conditional probability distribution:
$$\operatorname{E}[Y\mid X](\omega) = \int y\,\kappa_{Y\mid\sigma(X)}(\omega,dy).$$
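In the discrete case the integral against the kernel reduces to a finite sum. The sketch below, built on a hypothetical joint table `joint`, computes $\operatorname{E}[Y\mid X=x]$ and checks that averaging it over the distribution of $X$ recovers $\operatorname{E}[Y]$; the names `p_x` and `cond_exp` are illustrative.

```python
from fractions import Fraction

# Hypothetical joint pmf P(X = x, Y = y); the numbers are illustrative only.
joint = {
    (0, 0): Fraction(1, 8), (0, 1): Fraction(3, 8),
    (1, 0): Fraction(1, 4), (1, 1): Fraction(1, 4),
}

def p_x(x):
    """Marginal probability P(X = x)."""
    return sum(p for (xi, _), p in joint.items() if xi == x)

def cond_exp(x):
    """E[Y | X = x] = sum_y y * P(Y = y | X = x), assuming P(X = x) > 0."""
    return sum(y * p for (xi, y), p in joint.items() if xi == x) / p_x(x)

xs = sorted({x for x, _ in joint})
e_y_direct = sum(y * p for (_, y), p in joint.items())   # E[Y] from the joint pmf
e_y_tower = sum(p_x(x) * cond_exp(x) for x in xs)        # E[ E[Y | X] ]

print(cond_exp(0), cond_exp(1))  # 3/4 1/2
print(e_y_direct, e_y_tower)     # 5/8 5/8
```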
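Let $(\Omega,\mathcal{F},P)$ be a probability space, and let $T : \Omega \to E$ be a random variable, defined as a Borel-measurable function from $\Omega$ to its state space $(E,\mathcal{E})$. One should think of $T$ as a way to "disintegrate" the sample space $\Omega$ into $\{T^{-1}(x)\}_{x\in E}$. The disintegration theorem from measure theory then allows us to "disintegrate" the measure $P$ into a collection of measures, one for each $x\in E$.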
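Formally, a regular conditional probability is defined as a function $\nu : E \times \mathcal{F} \to [0,1]$, called a "transition probability", where:
1. For every $x\in E$, $\nu(x,\cdot)$ is a probability measure on $\mathcal{F}$. Thus we provide one measure for each $x\in E$.
2. For all $A\in\mathcal{F}$, $\nu(\cdot,A)$ (a mapping $E\to[0,1]$) is $\mathcal{E}$-measurable.
3. For all $A\in\mathcal{F}$ and all $B\in\mathcal{E}$,
$$P\big(A\cap T^{-1}(B)\big) = \int_B \nu(x,A)\,(P\circ T^{-1})(dx).$$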
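Here $P\circ T^{-1}$ is the pushforward measure $T_*P$, that is, the distribution of the random element $T$, and $x\in\operatorname{supp}T$, the support of $T_*P$. In particular, taking $B=E$ gives $A\cap T^{-1}(E)=A$, and so
$$P(A) = \int_E \nu(x,A)\,(P\circ T^{-1})(dx),$$
where $\nu(x,A)$ can be written, in more familiar notation, as $P(A\mid T=x)$.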
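On a finite sample space the defining identity can be verified by direct summation. The sketch below assumes a hypothetical six-point space with weights `P` and the map `T(w) = w % 3`; the integral against the pushforward measure becomes a sum over the values of `T`.

```python
from fractions import Fraction

# Hypothetical finite probability space: Omega = {0, ..., 5}, weights sum to 1.
P = {0: Fraction(1, 12), 1: Fraction(1, 6), 2: Fraction(1, 4),
     3: Fraction(1, 6), 4: Fraction(1, 4), 5: Fraction(1, 12)}

def T(w):
    """Hypothetical random variable with state space E = {0, 1, 2}."""
    return w % 3

E = sorted({T(w) for w in P})
# Pushforward measure (P o T^{-1})({x}) = P(T = x).
pushforward = {x: sum(p for w, p in P.items() if T(w) == x) for x in E}

def nu(x, A):
    """Transition probability nu(x, A) = P(A and T = x) / P(T = x)."""
    return sum(P[w] for w in A if T(w) == x) / pushforward[x]

def lhs(A, B):
    """P(A intersected with T^{-1}(B))."""
    return sum(P[w] for w in A if T(w) in B)

def rhs(A, B):
    """Integral of nu(x, A) over B against the pushforward measure."""
    return sum(nu(x, A) * pushforward[x] for x in B)

A, B = {0, 1, 2}, {0, 1}
print(lhs(A, B), rhs(A, B))  # 1/4 1/4
```

Every value of `T` has positive probability in this example, so no "arbitrary value" branch is needed.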
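Consider a Radon space $\Omega$ (that is, a probability measure defined on a Radon space endowed with the Borel sigma-algebra) and a real-valued random variable $T$. In this case there exists a regular conditional probability with respect to $T$, and the regular conditional probability of an event $A$ given a particular value $t$ of $T$ can alternatively be defined as
$$P(A\mid T=t) = \lim_{U\supset\{T=t\}} \frac{P(A\cap U)}{P(U)},$$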
where the limit is taken over the net of open neighborhoods U of t as they become smaller with respect to set inclusion. This limit is defined if and only if the probability space is Radon, and only in the support of T, as described in the article. This is the restriction of the transition probability to the support of T. To describe this limiting process rigorously:
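For every $\varepsilon>0$, there exists an open neighborhood $U$ of the event $\{T=t\}$ such that for every open $V$ with $\{T=t\}\subset V\subset U$,
$$\left|\frac{P(A\cap V)}{P(V)} - L\right| < \varepsilon,$$
where $L = P(A\mid T=t)$ is the limit.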
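The limiting process can be illustrated on a simple model in which the ratio has a closed form. The sketch below assumes $\Omega = [0,1]^2$ with the uniform measure, $T(\omega) = \omega_1$ and $A = \{\omega_2 \le \omega_1^2\}$, all of which are hypothetical choices; the neighborhoods are taken to be $V = \{|T - t| < \delta\}$.

```python
from fractions import Fraction

def ratio(t, delta):
    """P(A and V) / P(V) for V = {|T - t| < delta} in the model above."""
    lo, hi = t - delta, t + delta
    assert 0 <= lo and hi <= 1, "keep the neighborhood inside [0, 1]"
    p_av = (hi**3 - lo**3) / 3  # integral of s^2 over (lo, hi)
    p_v = hi - lo               # P(lo < T < hi)
    return p_av / p_v

t = Fraction(1, 2)
L = t**2  # candidate limit P(A | T = t) = t^2 in this model
for k in range(1, 5):
    delta = Fraction(1, 10**k)
    print(delta, ratio(t, delta) - L)  # the gap equals delta**2 / 3
```

In this model the gap is exactly $\delta^2/3$, so for any $\varepsilon > 0$ it suffices to take $\delta < \sqrt{3\varepsilon}$.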