The term Blahut–Arimoto algorithm is often used to refer to a class of algorithms for numerically computing either the information-theoretic capacity of a channel, the rate-distortion function of a source, or a source encoding (i.e. compression to remove redundancy). They are iterative algorithms that converge to one of the maxima of the optimization problem associated with these information-theoretic concepts.
For the case of channel capacity, the algorithm was independently invented by Suguru Arimoto[1] and Richard Blahut.[2] In addition, Blahut's treatment gives algorithms for computing rate-distortion and generalized capacity with input constraints (i.e. the capacity-cost function, analogous to rate-distortion). These algorithms are most applicable to the case of arbitrary finite-alphabet sources. Much work has been done to extend them to more general problem instances.[3][4] Recently, a version of the algorithm that accounts for continuous and multivariate outputs was proposed, with applications in cellular signaling. There also exists a version of the Blahut–Arimoto algorithm for directed information.[5]
A discrete memoryless channel (DMC) can be specified using two random variables $X, Y$ with alphabets $\mathcal{X}, \mathcal{Y}$, and a channel law given as a conditional probability distribution $p(y|x)$. The channel capacity is defined as

$$C := \sup_{p} I(X;Y).$$

Writing the cardinalities as $|\mathcal{X}| = n$ and $|\mathcal{Y}| = m$, the channel law $p_{Y|X}$ is an $n \times m$ matrix whose $i$-th row, $j$-th column entry we denote by $w_{ij} = p(y_j \mid x_i)$. The capacity can then be rewritten as a double maximization:

$$C = \max_{p} \max_{Q} \sum_{i=1}^{n} \sum_{j=1}^{m} p_i w_{ij} \log\left(\dfrac{Q_{ji}}{p_i}\right),$$

where $p$ and $Q$ range over the following sets: $p$ is a probability distribution on $X$, that is, $p = (p_1, p_2, \ldots, p_n)$ with $\sum_{i=1}^{n} p_i = 1$; and $Q = (q_{ji})$ is an $m \times n$ matrix that behaves like a transition matrix from $Y$ to $X$, i.e. for all $1 \leq i \leq n$, $1 \leq j \leq m$:

$$q_{ji} \geq 0, \qquad q_{ji} = 0 \Leftrightarrow w_{ij} = 0, \qquad \sum_{i=1}^{n} q_{ji} = 1.$$
Then, upon picking a random probability distribution $p^0 := (p^0_1, p^0_2, \ldots, p^0_n)$ on $X$, we can compute a sequence $(p^0, Q^0, p^1, Q^1, \ldots)$ iteratively as follows:

$$q^t_{ji} := \dfrac{p^t_i w_{ij}}{\sum_{k=1}^{n} p^t_k w_{kj}},$$

$$p^{t+1}_k := \dfrac{\prod_{j=1}^{m} \left(q^t_{jk}\right)^{w_{kj}}}{\sum_{i=1}^{n} \prod_{j=1}^{m} \left(q^t_{ji}\right)^{w_{ij}}},$$

for $t = 0, 1, 2, \ldots$.
Then, using the theory of optimization, specifically coordinate descent, Yeung[9] showed that the sequence indeed converges to the required maximum. That is,

$$\lim_{t \to \infty} \sum_{i=1}^{n} \sum_{j=1}^{m} p^t_i w_{ij} \log\left(\dfrac{Q^t_{ji}}{p^t_i}\right) = C.$$
So given a channel law $p(y|x)$, the capacity can be numerically estimated up to arbitrary precision.
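As a concrete illustration, the iteration above can be sketched in NumPy. The function name, tolerance, and iteration cap below are illustrative choices, not part of the original algorithm statement:

```python
import numpy as np

def blahut_arimoto_capacity(W, tol=1e-9, max_iter=10_000):
    """Estimate the capacity (bits per channel use) of a DMC whose
    transition matrix W has entries W[i, j] = p(y_j | x_i)."""
    n, m = W.shape
    p = np.full(n, 1.0 / n)                 # start from the uniform input distribution
    for _ in range(max_iter):
        # q_{ji} = p_i w_{ij} / sum_k p_k w_{kj}   (an m x n matrix)
        q = (p[:, None] * W).T
        q /= q.sum(axis=1, keepdims=True)
        # p^{t+1}_k proportional to prod_j (q_{jk})^{w_{kj}};
        # the np.where keeps log() finite where w_{ij} = 0 (those terms vanish)
        r = np.exp(np.sum(W * np.log(np.where(W > 0, q.T, 1.0)), axis=1))
        r /= r.sum()
        done = np.max(np.abs(r - p)) < tol
        p = r
        if done:
            break
    # evaluate the double-maximization objective at (p, Q)
    safe = np.where(W > 0, q.T, p[:, None])  # dummy values where w_{ij} = 0
    C = float(np.sum(p[:, None] * W * np.log2(safe / p[:, None])))
    return C, p
```

For example, on a binary symmetric channel with crossover probability 0.1 this returns roughly $1 - H_b(0.1) \approx 0.531$ bits per use, matching the closed-form capacity.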
Suppose we have a source $X$ with probability $p(x)$ of any given symbol. We wish to find an encoding $p(\hat{x}|x)$ that generates a compressed signal $\hat{X}$ from the original signal while minimizing the expected distortion $\langle d(x,\hat{x}) \rangle$, where the expectation is taken over the joint probability of $X$ and $\hat{X}$. We can find an encoding that locally minimizes the rate-distortion functional by repeating the following iteration until convergence:

$$p_{t+1}(\hat{x}) = \sum_{x} p(x)\, p_t(\hat{x}|x),$$

$$p_{t+1}(\hat{x}|x) = \dfrac{p_t(\hat{x}) \exp(-\beta d(x,\hat{x}))}{\sum_{\hat{x}'} p_t(\hat{x}') \exp(-\beta d(x,\hat{x}'))},$$

where $\beta$ is a parameter related to the slope of the rate-distortion curve that we are targeting, and thus controls how strongly we favor compression over fidelity.
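A minimal NumPy sketch of this iteration for a finite source alphabet, with the source distribution and distortion measure supplied as arrays (names and defaults are illustrative):

```python
import numpy as np

def blahut_arimoto_rd(px, d, beta, tol=1e-9, max_iter=10_000):
    """Compute one point on the rate-distortion curve of a discrete source.

    px[i]   -- probability of source symbol x_i
    d[i, k] -- distortion d(x_i, xhat_k)
    beta    -- trade-off parameter (related to the slope of R(D))
    Returns (rate in bits, expected distortion).
    """
    n, m = d.shape
    q = np.full(m, 1.0 / m)            # reproduction marginal p_t(xhat), start uniform
    A = np.exp(-beta * d)              # Boltzmann weights exp(-beta d(x, xhat))
    for _ in range(max_iter):
        # encoder update: p_t(xhat | x) proportional to p_t(xhat) exp(-beta d)
        cond = q[None, :] * A
        cond /= cond.sum(axis=1, keepdims=True)
        # marginal update: p_{t+1}(xhat) = sum_x p(x) p_t(xhat | x)
        q_new = px @ cond
        done = np.max(np.abs(q_new - q)) < tol
        q = q_new
        if done:
            break
    cond = q[None, :] * A
    cond /= cond.sum(axis=1, keepdims=True)
    joint = px[:, None] * cond
    D = float(np.sum(joint * d))                           # expected distortion
    R = float(np.sum(joint * np.log2(cond / q[None, :])))  # I(X; Xhat) in bits
    return R, D
```

For a uniform binary source with Hamming distortion and $\beta = \ln 9$, the fixed point gives $D = 0.1$ and $R = 1 - H_b(0.1) \approx 0.531$ bits, consistent with the known curve $R(D) = 1 - H_b(D)$ for this source.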