Binary Justesen Codes
Namesake: | Jørn Justesen
Block Length: | $n$
Message Length: | $k$
Rate: | $R = k/n$
Distance: | $\delta n$, where $\delta \geq (1-R-\epsilon)H^{-1}\left(\tfrac{1}{2}-\epsilon\right) \sim 0.11(1-R-\epsilon)$ for small $\epsilon > 0$
Alphabet Size: | 2
Notation: | $\left[n, k/2, \delta n\right]_2$
Properties: | constant rate, constant relative distance, constant alphabet size
In coding theory, Justesen codes form a class of error-correcting codes that have a constant rate, constant relative distance, and a constant alphabet size.
Before Justesen codes were discovered, no error-correcting code was known for which all three of these parameters were constant.
Subsequently, other error-correcting codes with this property have been discovered, for example expander codes. These codes have important applications in computer science, such as in the construction of small-bias sample spaces.
Justesen codes are derived as the code concatenation of a Reed–Solomon code and the Wozencraft ensemble.
The Reed–Solomon codes used achieve constant rate and constant relative distance at the expense of an alphabet size that is linear in the message length.
The Wozencraft ensemble is a family of codes that achieve constant rate and constant alphabet size, but the relative distance is only constant for most of the codes in the family.
The concatenation of the two codes first encodes the message using the Reed–Solomon code, and then encodes each symbol of the codeword further using a code from the Wozencraft ensemble – using a different code of the ensemble at each position of the codeword.
This is different from usual code concatenation where the inner codes are the same for each position. The Justesen code can be constructed very efficiently using only logarithmic space.
The Justesen code is the concatenation of an $(N,K,D)_{q^k}$ outer code $C_{out}$ and different $(n,k,d)_q$ inner codes $C_{in}^i$, for $1\le i\le N$.
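More precisely, the concatenation of these codes, denoted by $C_{out}\circ\left(C_{in}^1,\ldots,C_{in}^N\right)$, is defined as follows. Given a message $m\in[q^k]^K$, we compute the codeword produced by the outer code $C_{out}$:
$$C_{out}(m)=(c_1,c_2,\ldots,c_N).$$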
Then we apply each of the $N$ linear inner codes to the corresponding coordinate of that codeword to produce the final codeword; that is,
$$C_{out}\circ\left(C_{in}^1,\ldots,C_{in}^N\right)(m)=\left(C_{in}^1(c_1),C_{in}^2(c_2),\ldots,C_{in}^N(c_N)\right).$$
Looking back at the definitions of the outer code and the linear inner codes, this definition of the Justesen code makes sense because the codeword of the outer code is a vector with $N$ elements, and we have $N$ linear inner codes to apply to those $N$ elements.
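The position-dependent concatenation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `concatenate`, `toy_outer`, `toy_inner` and `bits` are hypothetical names, and the toy outer and inner maps are stand-ins chosen for brevity, not a Reed–Solomon code or a Wozencraft ensemble.

```python
from typing import Callable, List, Sequence

def concatenate(outer_encode: Callable[[Sequence[int]], Sequence[int]],
                inner_encoders: Sequence[Callable[[int], List[int]]],
                message: Sequence[int]) -> List[int]:
    """Apply the outer code, then the i-th inner code to the i-th outer symbol."""
    outer_codeword = outer_encode(message)
    if len(outer_codeword) != len(inner_encoders):
        raise ValueError("need exactly one inner code per outer position")
    result: List[int] = []
    for inner, symbol in zip(inner_encoders, outer_codeword):
        result.extend(inner(symbol))
    return result

def bits(x: int, width: int) -> List[int]:
    """Least-significant-bit-first expansion of x into `width` bits."""
    return [(x >> j) & 1 for j in range(width)]

# Toy stand-ins (hypothetical): the outer code repeats one 2-bit symbol
# three times; each position uses a different linear inner map to 4 bits.
def toy_outer(message: Sequence[int]) -> List[int]:
    return [message[0]] * 3

toy_inner = [
    lambda s: bits(s, 2) + bits(s, 2),         # (x, x)
    lambda s: bits(s, 2) + bits(s, 2)[::-1],   # (x, x with bits reversed)
    lambda s: bits(s, 2) + [0, 0],             # (x, 0)
]

codeword = concatenate(toy_outer, toy_inner, [0b10])
```

The only structural difference from ordinary concatenation is that `inner_encoders` is a list with one entry per outer position rather than a single function.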
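Here, for the Justesen code, the outer code $C_{out}$ is chosen to be a Reed–Solomon code over the field $\mathbb{F}_{q^k}$, evaluated over $\mathbb{F}_{q^k}-\{0\}$, of rate $R$, $0<R<1$.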
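The outer code $C_{out}$ has relative distance $\delta_{out}=1-R$ and block length $N=q^k-1$. The set of inner codes is the Wozencraft ensemble $\{C_{in}^{\alpha}\}_{\alpha\in\mathbb{F}_{q^k}^{*}}$.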
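To make the ensemble concrete, the following sketch enumerates it for a toy field. It assumes the standard definition of the Wozencraft ensemble, $C_{in}^{\alpha}(x)=(x,\alpha x)$ for $x\in\mathbb{F}_{q^k}$ read as a vector of length $2k$ over $\mathbb{F}_q$, and it further assumes $q=2$, $k=3$ and the irreducible polynomial $x^3+x+1$. The names `gf_mul`, `inner_encode` and `min_distance` are illustrative.

```python
K_BITS = 3          # k: the field is GF(2^3)
MODULUS = 0b1011    # x^3 + x + 1, an assumed irreducible polynomial over GF(2)

def gf_mul(a: int, b: int) -> int:
    """Multiply two elements of GF(2^3), each given as a 3-bit integer."""
    product = 0
    while b:
        if b & 1:
            product ^= a
        b >>= 1
        a <<= 1
        if a & (1 << K_BITS):   # reduce modulo the field polynomial
            a ^= MODULUS
    return product

def inner_encode(alpha: int, x: int) -> int:
    """Assumed inner code x -> (x, alpha * x), packed into 2k bits."""
    return (x << K_BITS) | gf_mul(alpha, x)

def min_distance(alpha: int) -> int:
    """Minimum weight of a nonzero codeword (equals the distance of a linear code)."""
    return min(bin(inner_encode(alpha, x)).count("1")
               for x in range(1, 1 << K_BITS))

# One inner code per nonzero field element, i.e. N = 2^3 - 1 = 7 codes.
distances = {alpha: min_distance(alpha) for alpha in range(1, 1 << K_BITS)}
```

Every code in this toy ensemble has rate $\tfrac{1}{2}$, but their distances differ from code to code, which is why the distance argument has to count how many inner codes are good rather than rely on each one.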
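As the linear codes in the Wozencraft ensemble have rate $\tfrac{1}{2}$, the Justesen code is the concatenated code $C^*=C_{out}\circ\left(C_{in}^1,\ldots,C_{in}^N\right)$ with rate $\tfrac{R}{2}$. The following theorem estimates the distance of the concatenated code $C^*$.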
Theorem: Let $\varepsilon>0$. Then $C^*$ has relative distance at least
$$(1-R-\varepsilon)H_q^{-1}\left(\tfrac{1}{2}-\varepsilon\right).$$
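For a feel of the numbers, the bound can be evaluated numerically in the binary case. The sketch below assumes $q=2$, so that $H_q$ is the binary entropy function, and inverts it on $[0,\tfrac{1}{2}]$ by bisection; the function names are illustrative.

```python
import math

def binary_entropy(p: float) -> float:
    """Binary entropy H(p) in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def binary_entropy_inverse(y: float, tol: float = 1e-12) -> float:
    """Return p in [0, 1/2] with H(p) = y; H is increasing on that interval."""
    lo, hi = 0.0, 0.5
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if binary_entropy(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def justesen_distance_bound(R: float, eps: float) -> float:
    """Lower bound (1 - R - eps) * H^{-1}(1/2 - eps) on the relative distance (q = 2)."""
    return (1 - R - eps) * binary_entropy_inverse(0.5 - eps)

h_inv_half = binary_entropy_inverse(0.5)      # close to 0.11
bound = justesen_distance_bound(0.5, 0.01)    # outer rate R = 1/2, eps = 0.01
```

As $\varepsilon\to 0$ the factor $H^{-1}\left(\tfrac{1}{2}-\varepsilon\right)$ approaches roughly $0.11$, so the bound behaves like $0.11(1-R-\varepsilon)$.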
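Proof: In order to prove a lower bound for the distance of the code $C^*$, we bound the Hamming distance between an arbitrary pair of distinct codewords. Let $\Delta(c^1,c^2)$ denote the Hamming distance between two codewords $c^1$ and $c^2$. For any given $m_1\ne m_2\in\left(\mathbb{F}_{q^k}\right)^K$, we want a lower bound for $\Delta(C^*(m_1),C^*(m_2))$.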
Notice that if $C_{out}(m)=(c_1,\ldots,c_N)$, then $C^*(m)=\left(C_{in}^1(c_1),\ldots,C_{in}^N(c_N)\right)$. So to lower-bound $\Delta(C^*(m_1),C^*(m_2))$, we need to take into account the distances of $C_{in}^1,\ldots,C_{in}^N$.
Suppose
\begin{align} C_{out}(m_1)&=\left(c_1^1,\ldots,c_N^1\right)\\ C_{out}(m_2)&=\left(c_1^2,\ldots,c_N^2\right) \end{align}
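Recall that $\left\{C_{in}^1,\ldots,C_{in}^N\right\}$ is a Wozencraft ensemble. By the Wozencraft ensemble theorem, there are at least $(1-\varepsilon)N$ linear codes $C_{in}^i$ that have distance at least $H_q^{-1}\left(\tfrac{1}{2}-\varepsilon\right)\cdot 2k$.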
So if for some $1\leqslant i\leqslant N$ we have $c_i^1\ne c_i^2$ and the code $C_{in}^i$ has distance $\geqslant H_q^{-1}\left(\tfrac{1}{2}-\varepsilon\right)\cdot 2k$, then
$$\Delta\left(C_{in}^i\left(c_i^1\right),C_{in}^i\left(c_i^2\right)\right)\geqslant H_q^{-1}\left(\tfrac{1}{2}-\varepsilon\right)\cdot 2k.$$
Further, if we have $T$ indices $1\leqslant i\leqslant N$ such that $c_i^1\ne c_i^2$ and the code $C_{in}^i$ has distance $\geqslant H_q^{-1}\left(\tfrac{1}{2}-\varepsilon\right)\cdot 2k$, then
$$\Delta\left(C^*(m_1),C^*(m_2)\right)\geqslant H_q^{-1}\left(\tfrac{1}{2}-\varepsilon\right)\cdot 2k\cdot T.$$
So now the final task is to find a lower bound for $T$. Define
$$S=\left\{i : 1\leqslant i\leqslant N,\ c_i^1\ne c_i^2\right\}.$$
Then $T$ is the number of linear codes $C_{in}^i$, $i\in S$, having distance at least $H_q^{-1}\left(\tfrac{1}{2}-\varepsilon\right)\cdot 2k$.
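Now we want to estimate $|S|$. Since $S$ is exactly the set of positions in which the two outer codewords differ, and the outer code has relative distance $1-R$,
$$|S|=\Delta(C_{out}(m_1),C_{out}(m_2))\geqslant(1-R)N.$$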
Due to the Wozencraft Ensemble Theorem, there are at most $\varepsilon N$ linear codes having distance less than $H_q^{-1}\left(\tfrac{1}{2}-\varepsilon\right)\cdot 2k$, so
$$T\geqslant|S|-\varepsilon N\geqslant(1-R)N-\varepsilon N=(1-R-\varepsilon)N.$$
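The counting step is a plain set difference, which the following sketch illustrates on made-up data. The two outer codewords and the set of "bad" positions (those whose inner code has small distance) are hypothetical values chosen only to exercise the inequality $T\geqslant|S|-\varepsilon N$.

```python
def count_good_differing_positions(cw1, cw2, bad_positions):
    """Return (|S|, T): S is the set of positions where cw1 and cw2 differ,
    T counts the positions of S whose inner code is not in bad_positions."""
    differing = {i for i, (a, b) in enumerate(zip(cw1, cw2)) if a != b}
    good_differing = differing - set(bad_positions)
    return len(differing), len(good_differing)

# Hypothetical outer codewords of length N = 8, and a hypothetical set of
# positions whose inner codes fall below the distance threshold.
cw1 = [3, 1, 4, 1, 5, 9, 2, 6]
cw2 = [3, 7, 4, 0, 5, 8, 2, 5]
bad = {1, 6}

size_S, T = count_good_differing_positions(cw1, cw2, bad)
```

Removing the bad positions from $S$ can delete at most as many elements as there are bad positions, which is all the inequality uses.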
Finally, we have
$$\Delta(C^*(m_1),C^*(m_2))\geqslant H_q^{-1}\left(\tfrac{1}{2}-\varepsilon\right)\cdot 2k\cdot T\geqslant H_q^{-1}\left(\tfrac{1}{2}-\varepsilon\right)\cdot 2k\cdot(1-R-\varepsilon)\cdot N.$$
This is true for arbitrary $m_1\ne m_2$. Since $C^*$ has block length $2kN$, it has relative distance at least
$$(1-R-\varepsilon)H_q^{-1}\left(\tfrac{1}{2}-\varepsilon\right),$$
which completes the proof.
We now consider whether the code is "strongly explicit". Loosely speaking, for a linear code, the "explicit" property is related to the complexity of constructing its generator matrix G.
A strongly explicit code is one whose generator matrix can be computed in logarithmic space, without using a brute-force algorithm to verify that the code has the required distance.
For codes that are not linear, we can instead consider the complexity of the encoding algorithm.
Both the Wozencraft ensemble and Reed–Solomon codes are strongly explicit. Therefore, we have the following result:
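Corollary: The concatenated code $C^*$ is an asymptotically good code (that is, it has rate $R>0$ and relative distance $\delta>0$) and has a strongly explicit construction.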
The following slightly different code is referred to as the Justesen code in MacWilliams/Sloane. It is the particular case of the Justesen code considered above for a very particular Wozencraft ensemble:
Let R be a Reed–Solomon code of length $N=2^m-1$, rank K and minimum weight $N-K+1$.
The symbols of R are elements of $F=GF(2^m)$ and the codewords are obtained by taking every polynomial ƒ over F of degree less than K and listing the values of ƒ on the non-zero elements of F in some predetermined order.
Let α be a primitive element of F. For a codeword $a=(a_1,\ldots,a_N)$ from R, let b be the vector of length 2N over F given by
$$b=\left(a_1,a_1,a_2,\alpha^1a_2,\ldots,a_N,\alpha^{N-1}a_N\right)$$
and let c be the vector of length $2mN$ obtained from b by expressing each element of F as a binary vector of length m. The Justesen code is the linear code containing all such c.
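The construction above can be followed step by step in a short sketch. It makes several assumptions that the text leaves open: $m=3$, the field polynomial $x^3+x+1$ with $\alpha=x$ as primitive element, the evaluation order $\alpha^0,\alpha^1,\ldots,\alpha^{N-1}$, and a most-significant-bit-first binary expansion. The function names are illustrative.

```python
M = 3                  # field GF(2^3)
MODULUS = 0b1011       # x^3 + x + 1 (assumed irreducible polynomial)
N = (1 << M) - 1       # Reed-Solomon length 2^m - 1 = 7
ALPHA = 0b010          # the element x, primitive for this modulus

def gf_mul(a: int, b: int) -> int:
    """Multiply two elements of GF(2^M), each given as an M-bit integer."""
    product = 0
    while b:
        if b & 1:
            product ^= a
        b >>= 1
        a <<= 1
        if a & (1 << M):
            a ^= MODULUS
    return product

def gf_pow(a: int, e: int) -> int:
    result = 1
    for _ in range(e):
        result = gf_mul(result, a)
    return result

def rs_encode(coeffs):
    """Evaluate f(X) = sum coeffs[j] * X^j at alpha^0, ..., alpha^(N-1)."""
    codeword = []
    for i in range(N):
        point = gf_pow(ALPHA, i)
        value = 0
        for c in reversed(coeffs):          # Horner's rule
            value = gf_mul(value, point) ^ c
        codeword.append(value)
    return codeword

def justesen_encode(coeffs):
    """Map K field coefficients to the binary vector c of length 2 * M * N."""
    a = rs_encode(coeffs)
    b = []
    for i, a_i in enumerate(a):             # pair (a_i, alpha^i * a_i), i from 0
        b.extend([a_i, gf_mul(gf_pow(ALPHA, i), a_i)])
    return [(sym >> j) & 1 for sym in b for j in reversed(range(M))]

codeword = justesen_encode([1])             # the constant polynomial f = 1
```

Because every step is linear over GF(2), the XOR of two encoded messages equals the encoding of their coefficient-wise XOR.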
The parameters of this code are length $2mN$, dimension $mK$ and minimum distance at least
$$\sum_{i=1}^{\ell}i\binom{2m}{i},$$
where $\ell$ is the greatest integer satisfying
$$\sum_{i=1}^{\ell}\binom{2m}{i}\leq N-K+1.$$
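The bound is easy to evaluate directly. The sketch below is a straightforward transcription of the two sums, with `justesen_min_distance_bound` as an illustrative name; when no $\ell\geq 1$ satisfies the constraint it returns 0, the value of the empty sum.

```python
from math import comb

def justesen_min_distance_bound(m: int, K: int) -> int:
    """Sum of i * C(2m, i) for i = 1..l, where l is the largest integer with
    C(2m, 1) + ... + C(2m, l) <= N - K + 1 and N = 2^m - 1."""
    N = 2**m - 1
    budget = N - K + 1
    total = 0    # running sum of C(2m, i)
    bound = 0    # running sum of i * C(2m, i)
    for i in range(1, 2 * m + 1):
        total += comb(2 * m, i)
        if total > budget:
            break
        bound += i * comb(2 * m, i)
    return bound

small = justesen_min_distance_bound(3, 2)     # N = 7,   N - K + 1 = 6
large = justesen_min_distance_bound(8, 100)   # N = 255, N - K + 1 = 156
```

For $m=3$, $K=2$ only $\ell=1$ fits, giving a bound of $6$ for a code of length $42$; for $m=8$, $K=100$ the sum stops at $\ell=2$ and the bound is $16+2\cdot120=256$ for a code of length $4080$.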