In information theory, dual total correlation,[1] information rate,[2] excess entropy,[3][4] or binding information[5] is one of several known non-negative generalizations of mutual information. While total correlation is bounded by the sum of the entropies of the $n$ elements, the dual total correlation is bounded by the joint entropy of the $n$ elements. Although well behaved, dual total correlation has received much less attention than the total correlation. A measure known as "TSE-complexity" defines a continuum between the total correlation and dual total correlation.[3]
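For a set of $n$ random variables $\{X_1,\ldots,X_n\}$, the dual total correlation $D(X_1,\ldots,X_n)$ is given by
\[
D(X_1,\ldots,X_n) = H\left(X_1,\ldots,X_n\right) - \sum_{i=1}^n H\left(X_i \mid X_1,\ldots,X_{i-1},X_{i+1},\ldots,X_n\right),
\]
where $H(X_1,\ldots,X_n)$ is the joint entropy of the variable set $\{X_1,\ldots,X_n\}$ and $H(X_i \mid \ldots)$ is the conditional entropy of the variable $X_i$ given the rest.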
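The definition can be evaluated directly for a small discrete distribution. The following is a minimal sketch, assuming the joint distribution is available as a Python dictionary mapping outcome tuples to probabilities. The function names (`entropy`, `marginal`, `dual_total_correlation`) and the two example distributions are illustrative choices, not taken from the cited sources. Entropies are measured in bits.

```python
from collections import defaultdict
from math import log2


def entropy(pmf):
    """Shannon entropy, in bits, of a pmf given as {outcome: probability}."""
    return -sum(p * log2(p) for p in pmf.values() if p > 0)


def marginal(pmf, keep):
    """Marginal pmf over the variable positions listed in `keep`."""
    out = defaultdict(float)
    for outcome, p in pmf.items():
        out[tuple(outcome[i] for i in keep)] += p
    return dict(out)


def dual_total_correlation(pmf):
    """D(X_1..X_n) = H(X_1..X_n) - sum_i H(X_i | all other variables)."""
    n = len(next(iter(pmf)))
    h_joint = entropy(pmf)
    residual = 0.0
    for i in range(n):
        others = [j for j in range(n) if j != i]
        # Chain rule: H(X_i | others) = H(all) - H(others).
        residual += h_joint - entropy(marginal(pmf, others))
    return h_joint - residual


# Hypothetical example: two fair independent bits and their XOR.
xor = {(a, b, a ^ b): 0.25 for a in (0, 1) for b in (0, 1)}
# Three independent fair bits share no information at all.
independent = {(a, b, c): 0.125
               for a in (0, 1) for b in (0, 1) for c in (0, 1)}

print(dual_total_correlation(xor))          # 2.0
print(dual_total_correlation(independent))  # 0.0
```

In the XOR example every variable is fully determined by the other two, so all conditional entropies vanish and $D$ equals the joint entropy of 2 bits.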
The dual total correlation normalized to the interval $[0,1]$ is simply the dual total correlation divided by its maximum value $H(X_1,\ldots,X_n)$:
\[
ND(X_1,\ldots,X_n) = \frac{D(X_1,\ldots,X_n)}{H(X_1,\ldots,X_n)} .
\]
Dual total correlation is non-negative and bounded above by the joint entropy $H(X_1,\ldots,X_n)$:
\[
0 \leq D(X_1,\ldots,X_n) \leq H(X_1,\ldots,X_n).
\]
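Dual total correlation also has a close relationship with total correlation, $C(X_1,\ldots,X_n)$: it can be written in terms of the total correlation of the whole set and of all its subsets of size $n-1$:
\[
D(\mathbf{X}) = (n-1)\,C(\mathbf{X}) - \sum_{i=1}^n C(\mathbf{X}^{-i}),
\]
where $\mathbf{X}=\{X_1,\ldots,X_n\}$ and $\mathbf{X}^{-i}=\{X_1,\ldots,X_{i-1},X_{i+1},\ldots,X_n\}$.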
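The relation between dual total correlation and the total correlations of the leave-one-out subsets can be checked numerically. The sketch below assumes the usual definition $C(\mathbf{X}) = \sum_i H(X_i) - H(\mathbf{X})$ and a joint pmf stored as a dictionary. All function names and the example distribution `pmf` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from math import log2


def entropy(pmf):
    """Shannon entropy, in bits, of a pmf given as {outcome: probability}."""
    return -sum(p * log2(p) for p in pmf.values() if p > 0)


def marginal(pmf, keep):
    """Marginal pmf over the variable positions listed in `keep`."""
    out = defaultdict(float)
    for outcome, p in pmf.items():
        out[tuple(outcome[i] for i in keep)] += p
    return dict(out)


def leave_out(n, i):
    """Indices 0..n-1 with index i removed."""
    return [j for j in range(n) if j != i]


def total_correlation(pmf):
    """C(X) = sum_i H(X_i) - H(X)."""
    n = len(next(iter(pmf)))
    return sum(entropy(marginal(pmf, [i])) for i in range(n)) - entropy(pmf)


def dtc_from_entropies(pmf):
    """D(X) = H(X) - sum_i H(X_i | rest), with H(X_i | rest) = H(X) - H(rest)."""
    n = len(next(iter(pmf)))
    h = entropy(pmf)
    return h - sum(h - entropy(marginal(pmf, leave_out(n, i))) for i in range(n))


def dtc_from_total_correlations(pmf):
    """D(X) = (n - 1) C(X) - sum_i C(X^{-i})."""
    n = len(next(iter(pmf)))
    subsets = sum(total_correlation(marginal(pmf, leave_out(n, i)))
                  for i in range(n))
    return (n - 1) * total_correlation(pmf) - subsets


# Hypothetical, arbitrary distribution over three binary variables.
pmf = {(0, 0, 0): 0.4, (0, 1, 1): 0.1, (1, 0, 1): 0.2,
       (1, 1, 0): 0.05, (1, 1, 1): 0.25}

# Both routes should agree up to floating-point rounding.
assert abs(dtc_from_entropies(pmf) - dtc_from_total_correlations(pmf)) < 1e-9
```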
Furthermore, the total correlation and dual total correlation are related by the following bounds:
\[
\frac{C(X_1,\ldots,X_n)}{n-1} \leq D(X_1,\ldots,X_n) \leq (n-1)\, C(X_1,\ldots,X_n).
\]
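As a worked illustration, both bounds can be attained with $n=3$ fair bits. The two example distributions are assumptions chosen for this note, not taken from the cited sources, and entropies are in bits. If $X_1$ and $X_2$ are independent and $X_3 = X_1 \oplus X_2$, then each variable is determined by the other two and
\[
C = 3\cdot 1 - 2 = 1, \qquad D = 2 - 0 = 2 = (n-1)\,C,
\]
so the upper bound holds with equality. If instead $X_1 = X_2 = X_3$, then
\[
C = 3\cdot 1 - 1 = 2, \qquad D = 1 - 0 = 1 = \frac{C}{n-1},
\]
so the lower bound holds with equality.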
Finally, the difference between the total correlation and the dual total correlation defines a measure of higher-order information-sharing, the O-information:[7]
\[
\Omega(\mathbf{X}) = C(\mathbf{X}) - D(\mathbf{X}).
\]
The O-information (first introduced as the "enigmatic information" by James and Crutchfield[8]) is a signed measure that quantifies the extent to which the information in a multivariate random variable is dominated by synergistic interactions (in which case $\Omega(\mathbf{X})<0$) or redundant interactions (in which case $\Omega(\mathbf{X})>0$).
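The sign convention can be illustrated on two small distributions. This is a minimal sketch under the assumption that the joint pmf is given as a dictionary of outcome tuples. The function names and both example distributions (an XOR triple and three copies of one bit) are hypothetical illustrations rather than examples from the cited sources.

```python
from collections import defaultdict
from math import log2


def entropy(pmf):
    """Shannon entropy, in bits, of a pmf given as {outcome: probability}."""
    return -sum(p * log2(p) for p in pmf.values() if p > 0)


def marginal(pmf, keep):
    """Marginal pmf over the variable positions listed in `keep`."""
    out = defaultdict(float)
    for outcome, p in pmf.items():
        out[tuple(outcome[i] for i in keep)] += p
    return dict(out)


def total_correlation(pmf):
    """C(X) = sum_i H(X_i) - H(X)."""
    n = len(next(iter(pmf)))
    return sum(entropy(marginal(pmf, [i])) for i in range(n)) - entropy(pmf)


def dual_total_correlation(pmf):
    """D(X) = H(X) - sum_i H(X_i | rest), with H(X_i | rest) = H(X) - H(rest)."""
    n = len(next(iter(pmf)))
    h = entropy(pmf)
    rest = lambda i: [j for j in range(n) if j != i]
    return h - sum(h - entropy(marginal(pmf, rest(i))) for i in range(n))


def o_information(pmf):
    """Omega(X) = C(X) - D(X); negative means synergy-dominated."""
    return total_correlation(pmf) - dual_total_correlation(pmf)


# Synergy: the third bit is the XOR of two independent fair bits.
xor = {(a, b, a ^ b): 0.25 for a in (0, 1) for b in (0, 1)}
# Redundancy: three identical copies of one fair bit.
copies = {(0, 0, 0): 0.5, (1, 1, 1): 0.5}

print(o_information(xor))     # -1.0
print(o_information(copies))  # 1.0
```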
Han (1978) originally defined the dual total correlation as
\begin{align}
& D(X_1,\ldots,X_n) \\[10pt]
\equiv{} & \left[ \sum_{i=1}^n H(X_1,\ldots,X_{i-1},X_{i+1},\ldots,X_n) \right] - (n-1)\, H(X_1,\ldots,X_n) .
\end{align}
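This definition is equivalent to the joint entropy minus the sum of the conditional entropies, since $H(X_i \mid X_1,\ldots,X_{i-1},X_{i+1},\ldots,X_n) = H(X_1,\ldots,X_n) - H(X_1,\ldots,X_{i-1},X_{i+1},\ldots,X_n)$:
\begin{align}
& D(X_1,\ldots,X_n) \\[10pt]
\equiv{} & \left[ \sum_{i=1}^n H(X_1,\ldots,X_{i-1},X_{i+1},\ldots,X_n) \right] - (n-1)\, H(X_1,\ldots,X_n) \\
={} & \left[ \sum_{i=1}^n H(X_1,\ldots,X_{i-1},X_{i+1},\ldots,X_n) \right] + (1-n)\, H(X_1,\ldots,X_n) \\
={} & H(X_1,\ldots,X_n) + \left[ \sum_{i=1}^n \bigl( H(X_1,\ldots,X_{i-1},X_{i+1},\ldots,X_n) - H(X_1,\ldots,X_n) \bigr) \right] \\
={} & H\left(X_1,\ldots,X_n\right) - \sum_{i=1}^n H\left(X_i \mid X_1,\ldots,X_{i-1},X_{i+1},\ldots,X_n\right) .
\end{align}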
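The equivalence of Han's form and the conditional-entropy form can also be checked numerically. The sketch below draws an arbitrary distribution over four binary variables with a fixed seed and compares the two expressions. The names `dtc_han` and `dtc_conditional`, the seed, and the choice of four variables are all assumptions made for illustration.

```python
import random
from collections import defaultdict
from itertools import product
from math import log2


def entropy(pmf):
    """Shannon entropy, in bits, of a pmf given as {outcome: probability}."""
    return -sum(p * log2(p) for p in pmf.values() if p > 0)


def marginal(pmf, keep):
    """Marginal pmf over the variable positions listed in `keep`."""
    out = defaultdict(float)
    for outcome, p in pmf.items():
        out[tuple(outcome[i] for i in keep)] += p
    return dict(out)


def leave_out(n, i):
    """Indices 0..n-1 with index i removed."""
    return [j for j in range(n) if j != i]


def dtc_han(pmf):
    """Han's form: sum_i H(X^{-i}) - (n - 1) H(X)."""
    n = len(next(iter(pmf)))
    subsets = sum(entropy(marginal(pmf, leave_out(n, i))) for i in range(n))
    return subsets - (n - 1) * entropy(pmf)


def dtc_conditional(pmf):
    """Conditional form: H(X) - sum_i H(X_i | rest)."""
    n = len(next(iter(pmf)))
    h = entropy(pmf)
    # H(X_i | rest) = H(X) - H(rest) by the chain rule.
    return h - sum(h - entropy(marginal(pmf, leave_out(n, i))) for i in range(n))


# Arbitrary (hypothetical) distribution over four binary variables.
rng = random.Random(0)
outcomes = list(product((0, 1), repeat=4))
weights = [rng.random() for _ in outcomes]
total = sum(weights)
pmf = {o: w / total for o, w in zip(outcomes, weights)}

# The two forms should agree up to floating-point rounding.
assert abs(dtc_han(pmf) - dtc_conditional(pmf)) < 1e-9
```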