A factor graph is a bipartite graph representing the factorization of a function. In probability theory and its applications, factor graphs are used to represent factorization of a probability distribution function, enabling efficient computations, such as the computation of marginal distributions through the sum–product algorithm. One of the important success stories of factor graphs and the sum–product algorithm is the decoding of capacity-approaching error-correcting codes, such as LDPC and turbo codes.
Factor graphs generalize constraint graphs. A factor whose value is either 0 or 1 is called a constraint. A constraint graph is a factor graph where all factors are constraints. The max-product algorithm for factor graphs can be viewed as a generalization of the arc-consistency algorithm for constraint processing.
Given a factorization of a function g(X_1, X_2, \dots, X_n),

g(X_1, X_2, \dots, X_n) = \prod_{j=1}^{m} f_j(S_j),

where S_j \subseteq \{X_1, X_2, \dots, X_n\}, the corresponding factor graph G = (X, F, E) consists of variable vertices X = \{X_1, X_2, \dots, X_n\}, factor vertices F = \{f_1, f_2, \dots, f_m\}, and edges E. The edges depend on the factorization as follows: there is an undirected edge between factor vertex f_j and variable vertex X_k if X_k \in S_j. The function is tacitly assumed to be real-valued: g(X_1, X_2, \dots, X_n) \in \mathbb{R}.
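The bipartite construction above can be sketched directly from the factor scopes S_j. The following is a minimal illustration, not a standard library API; the function and variable names are hypothetical.

```python
# Sketch: build the bipartite factor graph G = (X, F, E) from a factorization.
# Each factor f_j is described only by its scope S_j (the variables it touches).
# All names here (scopes, build_factor_graph) are illustrative.

def build_factor_graph(scopes):
    """scopes: dict mapping factor name -> set of variable names (the S_j)."""
    variables = set().union(*scopes.values())   # X = {X1, ..., Xn}
    factors = set(scopes)                       # F = {f1, ..., fm}
    # Edge (f_j, X_k) exists iff X_k is in S_j.
    edges = {(f, x) for f, s in scopes.items() for x in s}
    return variables, factors, edges

# Example: g(X1, X2, X3) = f1(X1) f2(X1, X2) f3(X1, X2) f4(X2, X3)
scopes = {"f1": {"X1"}, "f2": {"X1", "X2"},
          "f3": {"X1", "X2"}, "f4": {"X2", "X3"}}
X, F, E = build_factor_graph(scopes)
print(sorted(X))  # the three variable vertices
print(len(E))     # 7 edges: one per (factor, variable) incidence
```

Because the graph is bipartite, every edge pairs one factor vertex with one variable vertex; no edge ever connects two variables or two factors.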
Factor graphs can be combined with message passing algorithms to efficiently compute certain characteristics of the function g(X_1, X_2, \dots, X_n), such as the marginal distributions.
Consider a function that factorizes as follows:

g(X_1, X_2, X_3) = f_1(X_1) f_2(X_1, X_2) f_3(X_1, X_2) f_4(X_2, X_3),
with a corresponding factor graph shown on the right. Observe that the factor graph has a cycle. If we merge f_2(X_1, X_2) and f_3(X_1, X_2) into a single factor, the resulting factor graph will be a tree. This is an important distinction, as message passing algorithms are usually exact for trees but only approximate for graphs with cycles.
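The effect of merging f_2 and f_3 can be checked mechanically: a connected graph is a tree exactly when it has one fewer edge than vertices. The sketch below applies that criterion to the example (the helper name and factor tables are illustrative, and connectivity is assumed, which holds for both graphs here).

```python
# Check whether a connected factor graph is a tree: |E| == |V| - 1.
# Vertices are the variables plus the factors; each (factor, variable)
# incidence contributes one edge. Assumes the graph is connected.

def is_tree(scopes):
    variables = set().union(*scopes.values())
    n_vertices = len(variables) + len(scopes)
    n_edges = sum(len(s) for s in scopes.values())
    return n_edges == n_vertices - 1

original = {"f1": {"X1"}, "f2": {"X1", "X2"},
            "f3": {"X1", "X2"}, "f4": {"X2", "X3"}}
merged = {"f1": {"X1"}, "f23": {"X1", "X2"},  # f23 = f2 * f3
          "f4": {"X2", "X3"}}

print(is_tree(original))  # False: 7 vertices, 7 edges -> contains a cycle
print(is_tree(merged))    # True: 6 vertices, 5 edges -> a tree
```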
A popular message passing algorithm on factor graphs is the sum–product algorithm, which efficiently computes all the marginals of the individual variables of the function. In particular, the marginal of variable X_k is defined as

g_k(X_k) = \sum_{X_{\bar{k}}} g(X_1, X_2, \dots, X_n),

where the notation X_{\bar{k}} means that the summation goes over all the variables except X_k.
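The marginal definition can be checked by brute force on a small example: sum g over every assignment of the other variables. The factor tables below are arbitrary illustrative numbers for binary variables, not values from the text; the sum–product algorithm computes the same marginals without enumerating all assignments.

```python
from itertools import product

# Brute-force marginal g_k(X_k) for the example
# g(X1, X2, X3) = f1(X1) f2(X1, X2) f3(X1, X2) f4(X2, X3),
# with illustrative factor tables over binary variables.
f1 = {0: 1.0, 1: 2.0}
f2 = {(0, 0): 0.5, (0, 1): 1.0, (1, 0): 2.0, (1, 1): 0.5}
f3 = {(0, 0): 1.0, (0, 1): 3.0, (1, 0): 1.0, (1, 1): 1.0}
f4 = {(0, 0): 2.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 2.0}

def g(x1, x2, x3):
    return f1[x1] * f2[(x1, x2)] * f3[(x1, x2)] * f4[(x2, x3)]

def marginal(k):
    """g_k(X_k): sum g over every assignment of the other variables."""
    gk = {0: 0.0, 1: 0.0}
    for assignment in product((0, 1), repeat=3):
        gk[assignment[k]] += g(*assignment)
    return gk

# Every marginal sums to the same total: the sum of g over all assignments.
total = sum(g(*a) for a in product((0, 1), repeat=3))
print(marginal(0), total)
```

This enumeration costs exponential time in the number of variables; the point of the sum–product algorithm is to exploit the factorization so that tree-structured graphs can be marginalized in time linear in the number of edges.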
In practice, the sum–product algorithm is used for statistical inference, whereby g(X_1, X_2, \dots, X_n) is a joint distribution or a joint likelihood function, and the factorization depends on the conditional independencies among the variables.
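When g is an unnormalized joint, turning a marginal into a probability distribution just requires dividing by the partition function Z (the sum of g over all assignments). A minimal sketch, with hypothetical two-variable factor tables chosen to resemble a prior and a conditional:

```python
from itertools import product

# Illustrative unnormalized joint g(X1, X2) = f_a(X1) f_b(X1, X2),
# where f_a acts like a prior on X1 and f_b like p(X2 | X1).
f_a = {0: 0.3, 1: 0.7}
f_b = {(0, 0): 0.9, (0, 1): 0.1,
       (1, 0): 0.2, (1, 1): 0.8}

def g(x1, x2):
    return f_a[x1] * f_b[(x1, x2)]

# Partition function Z normalizes g into a probability distribution.
Z = sum(g(x1, x2) for x1, x2 in product((0, 1), repeat=2))
p_x2 = {v: sum(g(x1, v) for x1 in (0, 1)) / Z for v in (0, 1)}
print(p_x2)  # normalized marginal distribution of X2
```

Here Z happens to be 1 because the tables were chosen as proper distributions; with a likelihood function instead, Z would carry the normalization that inference needs.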
The Hammersley–Clifford theorem shows that other probabilistic models such as Bayesian networks and Markov networks can be represented as factor graphs; the latter representation is frequently used when performing inference over such networks using belief propagation. On the other hand, Bayesian networks are more naturally suited for generative models, as they can directly represent the causalities of the model.