The Hammersley–Clifford theorem is a result in probability theory, mathematical statistics and statistical mechanics that gives necessary and sufficient conditions under which a strictly positive probability distribution (of events in a probability space) can be represented as events generated by a Markov network (also known as a Markov random field). It is the fundamental theorem of random fields.[1] It states that a probability distribution that has a strictly positive mass or density satisfies one of the Markov properties with respect to an undirected graph G if and only if it is a Gibbs random field, that is, its density can be factorized over the cliques (or complete subgraphs) of the graph.
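In symbols (with notation chosen here for concreteness), the Gibbs factorization asserts that the joint distribution can be written as

\Pr(X = x) = \frac{1}{Z} \prod_{C \in \operatorname{cl}(G)} \phi_C(x_C), \qquad Z = \sum_x \prod_{C \in \operatorname{cl}(G)} \phi_C(x_C)

where \operatorname{cl}(G) is the set of cliques of G and each factor \phi_C > 0 depends only on the coordinates x_C of the variables in the clique C.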
The relationship between Markov and Gibbs random fields was initiated by Roland Dobrushin and Frank Spitzer in the context of statistical mechanics. The theorem is named after John Hammersley and Peter Clifford, who proved the equivalence in an unpublished paper in 1971. Simpler proofs using the inclusion–exclusion principle were given independently by Geoffrey Grimmett, Preston and Sherman in 1973, with a further proof by Julian Besag in 1974.
It is a trivial matter to show that a Gibbs random field satisfies every Markov property. The following example illustrates this fact:
Take the graph on the six variables A, B, C, D, E, F whose maximal cliques are \{A,B,D\}, \{A,C,D\}, \{C,D,F\} and \{C,E,F\}. A Gibbs random field over this graph has the form
\Pr(A,B,C,D,E,F) \propto f_1(A,B,D)\, f_2(A,C,D)\, f_3(C,D,F)\, f_4(C,E,F)

If the variables C and D are fixed, then the global Markov property requires that A,B \perp E,F \mid C,D (see conditional independence), since \{C,D\} forms a separating set between \{A,B\} and \{E,F\} in the graph. With C and D fixed,

\Pr(A,B,E,F \mid C=c, D=d) \propto [f_1(A,B,d)\, f_2(A,c,d)] \cdot [f_3(c,d,F)\, f_4(c,E,F)] = g_1(A,B)\, g_2(E,F)

where g_1(A,B) = f_1(A,B,d)\, f_2(A,c,d) and g_2(E,F) = f_3(c,d,F)\, f_4(c,E,F). Since the conditional distribution factorizes into a function of A,B times a function of E,F, this implies A,B \perp E,F \mid C,D.
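This factorization argument is easy to check numerically. Below is a minimal sketch in Python (the random factor tables, the variable ordering, and the use of NumPy are assumptions of this illustration, not part of the theorem): it draws strictly positive clique factors over six binary variables, conditions on C and D, and confirms that the conditional distribution is an outer product of an (A,B) term and an (E,F) term.

```python
import numpy as np

rng = np.random.default_rng(0)

# Strictly positive clique factors over six binary variables, drawn at
# random (the tables and the variable ordering are illustrative only).
f1 = rng.uniform(0.5, 2.0, size=(2, 2, 2))   # f1(A, B, D)
f2 = rng.uniform(0.5, 2.0, size=(2, 2, 2))   # f2(A, C, D)
f3 = rng.uniform(0.5, 2.0, size=(2, 2, 2))   # f3(C, D, F)
f4 = rng.uniform(0.5, 2.0, size=(2, 2, 2))   # f4(C, E, F)

# Joint Gibbs density over (A, B, C, D, E, F), normalized.
p = np.einsum("abd,acd,cdf,cef->abcdef", f1, f2, f3, f4)
p /= p.sum()

# Condition on C = c and D = d.
c, d = 0, 1
cond = p[:, :, c, d, :, :]
cond = cond / cond.sum()                      # Pr(A, B, E, F | C=c, D=d)

# Conditional independence of (A,B) and (E,F) given (C,D) holds exactly
# when the conditional equals the outer product of its two marginals.
m_ab = cond.sum(axis=(2, 3))                  # marginal over (A, B)
m_ef = cond.sum(axis=(0, 1))                  # marginal over (E, F)
assert np.allclose(cond, np.einsum("ab,ef->abef", m_ab, m_ef))
print("A,B and E,F are conditionally independent given C,D")
```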
To establish that every positive probability distribution that satisfies the local Markov property is also a Gibbs random field, the following lemma, which provides a means for combining different factorizations, needs to be proved:
Lemma 1
Let U be the set of all random variables under consideration, and let \Theta, \Phi_1, \Phi_2, \dots, \Phi_n \subseteq U and \Psi_1, \Psi_2, \dots, \Psi_m \subseteq U denote arbitrary sets of variables. (Here, given an arbitrary set of variables X, X will also denote an arbitrary assignment to the variables from X.)

If

\Pr(U) = f(\Theta) \prod_{i=1}^n g_i(\Phi_i) = \prod_{j=1}^m h_j(\Psi_j)

for functions f, g_1, g_2, \dots, g_n and h_1, h_2, \dots, h_m, then there exist functions h'_1, h'_2, \dots, h'_m and g'_1, g'_2, \dots, g'_n such that

\Pr(U) = \left( \prod_{j=1}^m h'_j(\Theta \cap \Psi_j) \right) \left( \prod_{i=1}^n g'_i(\Phi_i) \right)

In other words, \prod_{j=1}^m h_j(\Psi_j) provides a template for further factorizing f(\Theta).
Proof of Lemma 1

In order to use \prod_{j=1}^m h_j(\Psi_j) as a template to further factorize f(\Theta), all variables outside of \Theta need to be fixed. To this end, let \bar{\theta} be an arbitrary fixed assignment to the variables from U \setminus \Theta (the variables not in \Theta), and for an arbitrary set of variables X, let \bar{\theta}[X] denote the assignment \bar{\theta} restricted to the variables from X \setminus \Theta (the variables from X, excluding the variables from \Theta).
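For concreteness, here is one way the restriction notation can be realized in code (a minimal sketch; representing assignments as Python dictionaries is an assumption of this illustration):

```python
# A minimal sketch of the restriction notation, representing assignments
# as Python dicts (variable names and values here are hypothetical).
theta_bar = {"C": 0, "D": 1, "E": 0, "F": 1}   # an assignment to U \ Theta

def restrict(theta_bar, X, Theta):
    # theta_bar[X]: keep only the variables of X that lie outside Theta.
    return {v: theta_bar[v] for v in X if v not in Theta}

# With Theta = {A, B} and X = {A, C, D}, theta_bar[X] fixes only C and D.
print(restrict(theta_bar, {"A", "C", "D"}, {"A", "B"}))   # {'C': 0, 'D': 1}
```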
Moreover, to factorize only f(\Theta), the other factors g_1(\Phi_1), g_2(\Phi_2), \dots, g_n(\Phi_n) need to be rendered moot for the variables from \Theta. To do this, the factorization

\Pr(U) = f(\Theta) \prod_{i=1}^n g_i(\Phi_i)

will be re-expressed as

\Pr(U) = \left( f(\Theta) \prod_{i=1}^n g_i(\Phi_i \cap \Theta, \bar{\theta}[\Phi_i]) \right) \left( \prod_{i=1}^n \frac{g_i(\Phi_i)}{g_i(\Phi_i \cap \Theta, \bar{\theta}[\Phi_i])} \right)

(The strict positivity of the distribution guarantees that every denominator is nonzero, so the division is legitimate; this is where the positivity assumption enters the proof.)
For each i = 1, 2, \dots, n, the factor g_i(\Phi_i \cap \Theta, \bar{\theta}[\Phi_i]) is g_i(\Phi_i) with the variables outside of \Theta fixed to the values prescribed by \bar{\theta}.
Let

f'(\Theta) = f(\Theta) \prod_{i=1}^n g_i(\Phi_i \cap \Theta, \bar{\theta}[\Phi_i])

and

g'_i(\Phi_i) = \frac{g_i(\Phi_i)}{g_i(\Phi_i \cap \Theta, \bar{\theta}[\Phi_i])}

for each i = 1, 2, \dots, n, so that

\Pr(U) = f'(\Theta) \prod_{i=1}^n g'_i(\Phi_i) = \prod_{j=1}^m h_j(\Psi_j)
What is most important is that

g'_i(\Phi_i) = \frac{g_i(\Phi_i)}{g_i(\Phi_i \cap \Theta, \bar{\theta}[\Phi_i])} = 1

whenever the variables of \Phi_i outside of \Theta are fixed to the values from \bar{\theta}, since the numerator and denominator then coincide. In other words, every g'_i(\Phi_i) evaluates to 1 on assignments that agree with \bar{\theta} outside of \Theta.
Fixing all variables not in \Theta to the values of \bar{\theta} then gives

\Pr(\Theta, \bar{\theta}) = f'(\Theta) \prod_{i=1}^n g'_i(\Phi_i \cap \Theta, \bar{\theta}[\Phi_i]) = \prod_{j=1}^m h_j(\Psi_j \cap \Theta, \bar{\theta}[\Psi_j])
Since g'_i(\Phi_i \cap \Theta, \bar{\theta}[\Phi_i]) = 1 for every i, this reduces to

f'(\Theta) = \prod_{j=1}^m h_j(\Psi_j \cap \Theta, \bar{\theta}[\Psi_j])
Letting

h'_j(\Theta \cap \Psi_j) = h_j(\Psi_j \cap \Theta, \bar{\theta}[\Psi_j])

gives

f'(\Theta) = \prod_{j=1}^m h'_j(\Theta \cap \Psi_j)

and therefore

\Pr(U) = \left( \prod_{j=1}^m h'_j(\Theta \cap \Psi_j) \right) \left( \prod_{i=1}^n g'_i(\Phi_i) \right)

which establishes the lemma.
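The construction in this proof can be checked numerically on a small example. The following is a minimal sketch (the three binary variables, the reweighting function r, and the NumPy representation are assumptions of this illustration, not part of the proof): it builds two different factorizations of the same strictly positive joint, carries out the definitions of f', g'_1, h'_1 and h'_2 exactly as above with \Theta = \{a, b\}, and verifies the lemma's combined factorization.

```python
import numpy as np

rng = np.random.default_rng(0)

# Binary variables U = {a, b, c} and two factorizations of the same joint:
#   Pr(U) = f(a,b) * g1(b,c)      with Theta = {a, b}
#   Pr(U) = h1(a,b) * h2(b,c)     the "template"
h1 = rng.uniform(0.5, 2.0, size=(2, 2))       # h1(a, b)
h2 = rng.uniform(0.5, 2.0, size=(2, 2))       # h2(b, c)

# Fold the normalizing constant into h1 so Pr(U) = h1 * h2 holds exactly.
Z = np.einsum("ab,bc->", h1, h2)
h1 = h1 / Z
P = np.einsum("ab,bc->abc", h1, h2)           # Pr(a, b, c)

# A genuinely different (f, g) factorization of the same joint, obtained
# by an arbitrary positive reweighting r(b).
r = rng.uniform(0.5, 2.0, size=2)
f = h1 * r[None, :]                           # f(a, b)  = h1(a, b) r(b)
g1 = h2 / r[:, None]                          # g1(b, c) = h2(b, c) / r(b)
assert np.allclose(np.einsum("ab,bc->abc", f, g1), P)

# Lemma 1 construction: fix the one variable outside Theta, c = c_bar.
c_bar = 0
f_prime = f * g1[None, :, c_bar]              # f'(a,b)  = f(a,b) g1(b, c_bar)
g1_prime = g1 / g1[:, [c_bar]]                # g1'(b,c) = g1(b,c) / g1(b, c_bar)

# Template factors restricted to Theta:
h1_prime = h1                                 # Psi_1 = {a,b} lies inside Theta
h2_prime = h2[:, c_bar]                       # h2'(b) = h2(b, c_bar)

# f'(Theta) factorizes over the template, and the combined factorization
# recovers the joint, exactly as Lemma 1 asserts.
assert np.allclose(f_prime, h1_prime * h2_prime[None, :])
assert np.allclose(np.einsum("ab,b,bc->abc", h1_prime, h2_prime, g1_prime), P)
print("Lemma 1 factorization verified")
```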
Lemma 1 provides a means of combining two different factorizations of \Pr(U). The local Markov property implies that, for any random variable x \in U, there exist factors f_x and f_{-x} such that

\Pr(U) = f_x(x, \partial x)\, f_{-x}(U \setminus \{x\})

where \partial x denotes the neighbors of node x. Applying Lemma 1 repeatedly, each time taking \Theta to be the domain of a factor containing some variable x together with a non-neighbor of x and using the factorization f_x(x, \partial x)\, f_{-x}(U \setminus \{x\}) as the template, replaces that factor by factors whose domains are contained in \{x\} \cup \partial x and in U \setminus \{x\}, neither of which contains both x and a non-neighbor of x. Iterating until no factor contains a pair of non-adjacent variables leaves every factor supported on a set of pairwise adjacent variables, that is, on a clique; hence \Pr(U) eventually factors into a product of clique potentials.
End of Proof
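To illustrate one step of this process on the example graph above (an illustration added here, not part of the original proof): the neighbors of A are \partial A = \{B, C, D\}, so the local Markov property for A gives

\Pr(U) = f_A(A, B, C, D)\, f_{-A}(B, C, D, E, F)

The domain \{A, B, C, D\} of f_A is not a clique, since B and C are not adjacent. Taking \Theta = \{A, B, C, D\} and using the factorization induced by B, namely \Pr(U) = f_B(B, A, D)\, f_{-B}(A, C, D, E, F) (here \partial B = \{A, D\}), as the template, Lemma 1 replaces f_A by factors on \Theta \cap \{A, B, D\} = \{A, B, D\} and \Theta \cap (U \setminus \{B\}) = \{A, C, D\}, which are precisely two of the cliques in the Gibbs factorization displayed at the start of this section.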