In information theory, Shannon–Fano–Elias coding is a precursor to arithmetic coding, in which probabilities are used to determine codewords.[1] It is named for Claude Shannon, Robert Fano, and Peter Elias.
Given a discrete random variable X of ordered values to be encoded, let be the probability for any x in X. Define a function
\barF(x)=
\sum | |
xi<x |
p(xi)+
12 | |
p(x) |
Algorithm:
For each x in X,
Let Z be the binary expansion of
\barF(x)
Choose the length of the encoding of x,
L(x)
\left\lceillog2
1 | |
p(x) |
\right\rceil+1
Choose the encoding of x,
code(x)
L(x)
Let X =, with probabilities p = .
For A
\barF(A)=
12 | |
p(A) |
=
12 | |
⋅ |
13 | |
= |
0.1666\ldots
In binary, Z(A) = 0.0010101010...
L(A)=\left\lceillog2
1 | |||
|
\right\rceil+1=3
code(A) is 001
For B
\barF(B)=p(A)+
12 | |
p(B) |
=
13 | |
+ |
12 | |
⋅ |
14 | |
= |
0.4583333\ldots
In binary, Z(B) = 0.01110101010101...
L(B)=\left\lceillog2
1 | |||
|
\right\rceil+1=3
code(B) is 011
For C
\barF(C)=p(A)+p(B)+
12 | |
p(C) |
=
13 | |
+ |
14 | |
+ |
12 | |
⋅ |
16 | |
= |
0.66666\ldots
In binary, Z(C) = 0.101010101010...
L(C)=\left\lceillog2
1 | |||
|
\right\rceil+1=4
code(C) is 1010
For D
\barF(D)=p(A)+p(B)+p(C)+
12 | |
p(D) |
=
13 | |
+ |
14 | |
+ |
16 | |
+ |
12 | |
⋅ |
14 | |
= |
0.875
In binary, Z(D) = 0.111
L(D)=\left\lceillog2
1 | |||
|
\right\rceil+1=3
code(D) is 111
Shannon–Fano–Elias coding produces a binary prefix code, allowing for direct decoding.
Let bcode(x) be the rational number formed by adding a decimal point before a binary code. For example, if code(C) = 1010 then bcode(C) = 0.1010. For all x, if no y exists such that
\operatorname{bcode}(x)\le\operatorname{bcode}(y)<\operatorname{bcode}(x)+2-L(x)
By comparing F to the CDF of X, this property may be demonstrated graphically for Shannon–Fano–Elias coding.
By definition of L it follows that
2-L(x)\le
12 | |
p(x) |
\barF(y)-\operatorname{bcode}(y)\le2-L(y)
So the above graph demonstrates that the
\operatorname{bcode}(y)-\operatorname{bcode}(x)>p(x)\ge2-L(x)
The average code length is
LC(X)=\sumxp(x)L(x)=\sumxp(x)\left(\left\lceillog2
1 | |
p(x) |
\right\rceil+1\right)
H(X)+1\leLC(X)<H(X)+2