In coding theory, folded Reed–Solomon codes are like Reed–Solomon codes, obtained by bundling together $m$ consecutive symbols of a Reed–Solomon codeword into single symbols over a larger alphabet.
Folded Reed–Solomon codes are also a special case of Parvaresh–Vardy codes.
Using optimal parameters one can decode with a rate of $R$, and achieve a decoding radius of $1-R$.
The term "folded Reed–Solomon codes" was coined in a paper by V.Y. Krachkovsky, together with an algorithm for decoding Reed–Solomon codes with many random "phased burst" errors.[1] The list-decoding algorithm for folded RS codes corrects errors beyond the $1-\sqrt{R}$ bound for a code of rate $R$.
One of the ongoing challenges in coding theory is to construct error-correcting codes that achieve an optimal trade-off between rate and error-correction radius. Though this may not be achievable in practice, quasi-optimal trade-offs can be achieved theoretically.
Prior to the development of folded Reed–Solomon codes, the best error-correction radius achieved was $1-\sqrt{R}$, for a code of rate $R$. An improvement upon this $1-\sqrt{R}$ bound was achieved by Parvaresh and Vardy for low rates $R<\tfrac{1}{16}$; for $R\to0$, their codes can be list decoded up to a fraction roughly $1-O(R\log(1/R))$ of errors.
Folded Reed–Solomon codes improve on these previous constructions, and can be list decoded in polynomial time for a fraction $(1-R-\varepsilon)$ of errors, for any constant $\varepsilon>0$.
$f(X)\mapsto\begin{bmatrix}f(1)\\f(\gamma)\\\vdots\\f(\gamma^{m-1})\end{bmatrix},\begin{bmatrix}f(\gamma^{m})\\f(\gamma^{m+1})\\\vdots\\f(\gamma^{2m-1})\end{bmatrix},\ldots,\begin{bmatrix}f(\gamma^{n-m})\\f(\gamma^{n-m+1})\\\vdots\\f(\gamma^{n-1})\end{bmatrix}$
Consider a Reed–Solomon $[n=q-1,k]_q$ code of block length $n$ and dimension $k$, and let $m\ge1$ be an integer parameter (the folding parameter) that divides $n$. The encoding map of the Reed–Solomon code is

$f\mapsto\left\langle f(1),f\left(\gamma\right),f\left(\gamma^{2}\right),\ldots,f\left(\gamma^{n-1}\right)\right\rangle$

where $\gamma\in\mathbb{F}_q$ is a primitive element of $\mathbb{F}_q=\left\{0,1,\gamma,\gamma^{2},\ldots,\gamma^{n-1}\right\}$.

The $m$-folded version of this Reed–Solomon code $C$, denoted $\mathrm{FRS}_{\mathbb{F},\gamma,m,k}$, is a code of block length $N=n/m$ over the alphabet $\mathbb{F}_q^{m}$. In other words, $\mathrm{FRS}_{\mathbb{F},\gamma,m,k}$ is obtained from the $[q-1,k]$ Reed–Solomon code by grouping together consecutive $m$-tuples of codeword symbols, as in the map above.
The above definition is made clearer by an example with $m=3$. The message is denoted by $f(X)$, which when encoded using Reed–Solomon encoding is the evaluation of $f$ at $x_0,x_1,x_2,\ldots,x_{n-1}$, where $x_i=\gamma^{i}$. Bundling is then performed in groups of 3 elements, giving a codeword of length $n/3$ over the alphabet $\mathbb{F}_q^{3}$.
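The folding operation above can be sketched in a few lines of Python. This is a minimal illustration; the field size $q=13$, primitive element $\gamma=2$, and the message polynomial are toy values chosen here for the example, not taken from the article.

```python
# Illustrative sketch of folding a Reed-Solomon codeword
# (f(1), f(gamma), ..., f(gamma^(n-1))) over F_q into columns of m symbols.
# Toy parameters: q = 13, gamma = 2 (a primitive element mod 13), m = 3.
q, gamma, m = 13, 2, 3
n = q - 1                                        # block length of the RS code

def encode_rs(f):
    """Evaluate the message polynomial f (coefficient list) at gamma^i mod q."""
    return [sum(c * pow(gamma, i * j, q) for j, c in enumerate(f)) % q
            for i in range(n)]

def fold(codeword, m):
    """Group m consecutive symbols into one symbol over the bigger alphabet."""
    return [tuple(codeword[i:i + m]) for i in range(0, len(codeword), m)]

f = [5, 1, 7]                                    # f(X) = 5 + X + 7X^2
folded = fold(encode_rs(f), m)

assert len(folded) == n // m                     # N = n/m folded columns
print(folded[0])                                 # first column: (f(1), f(gamma), f(gamma^2))
```

The first printed column is $(f(1),f(\gamma),f(\gamma^{2}))$, matching the first bracket of the folding map in the definition.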
Something to be observed here is that the folding operation does not change the rate $R$ of the original Reed–Solomon code. To see this, consider a linear $[n,k,d]_q$ code of block length $n$, dimension $k$ and distance $d$. The $m$-folded version is a $\left[\tfrac{n}{m},\tfrac{k}{m},\geqslant\tfrac{d}{m}\right]_{q^{m}}$ code, so the rate $R=\tfrac{k}{n}$ is preserved. Likewise, the relative distance $\delta$ still obeys the Singleton-type bound $R\leqslant1-\delta+o(1)$: a code of rate $R$ has relative distance $\delta\leqslant1-R$, and folding leaves both parameters essentially unchanged.
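This rate-preservation argument is simple arithmetic and can be checked directly; the $[n,k,d]$ values below are hypothetical parameters chosen only for illustration.

```python
# Quick numeric check (illustrative parameters, not from the article):
# folding an [n, k, d]_q code with folding parameter m preserves the rate k/n.
n, k, d, m = 12, 6, 7, 3              # hypothetical code parameters; m divides n

folded_len = n / m                    # block length N = n/m over the bigger alphabet
folded_dim = k / m                    # "dimension" k/m over F_{q^m}

assert folded_dim / folded_len == k / n   # rate unchanged by folding
print(folded_dim / folded_len)
```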
Folded Reed–Solomon codes are basically the same as Reed–Solomon codes, just viewed over a larger alphabet. To show how this might help, consider a folded Reed–Solomon code with $m=3$, and suppose we want to correct a third of the errors, i.e. $\rho=1/3$. A decoder for the unfolded Reed–Solomon code must then handle, among others, an error pattern that corrupts every third symbol of the Reed–Solomon encoding. But after folding, this error pattern corrupts every single symbol over $\mathbb{F}_q^{3}$, i.e. a fraction of errors far larger than $\rho$, so the folded decoder does not need to correct it. The folded code therefore has fewer error patterns of any given fraction $\rho$ to deal with, which is what makes a larger decoding radius possible.
We can relate folded Reed–Solomon codes to Parvaresh–Vardy codes, which encode a polynomial $f$ of degree $k-1$ by the evaluations of $s\geqslant2$ correlated polynomials $f_0=f,f_1,\ldots,f_{s-1}$, where $f_i(X)=f_{i-1}(X)^{d}\bmod E(X)$ for an appropriate power $d$ and an irreducible polynomial $E(X)$. With the choices $E(X)=X^{q-1}-\gamma$ and $d=q$, one can show that every polynomial $f$ of degree at most $k-1$ satisfies $f(\gamma X)=f(X)^{q}\bmod E(X)$; that is, $f(\gamma X)$ is just the polynomial $f(X)$ with its variable scaled by the primitive element $\gamma$ of $\mathbb{F}_q$. Thus a folded Reed–Solomon code with folding parameter $m$ is a Parvaresh–Vardy code of order $s=m$ for the set of evaluation points $\left\{1,\gamma^{m},\gamma^{2m},\ldots,\gamma^{n-m}\right\}$.
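The identity $f(\gamma X)=f(X)^{q}\bmod E(X)$ can be verified numerically. Below is a minimal sketch over $\mathbb{F}_5$ with $\gamma=2$; the field size and the message polynomial are toy values chosen for illustration, not from the article.

```python
# Sanity check (toy parameters) of the identity
#   f(gamma*X) = f(X)^q  mod  E(X),   E(X) = X^(q-1) - gamma,
# over F_q with q = 5 and primitive element gamma = 2.
q, gamma = 5, 2

def polymul(a, b):
    """Multiply coefficient lists (lowest degree first) over F_q."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % q
    return out

def reduce_mod_E(a):
    """Reduce mod E(X) = X^(q-1) - gamma, i.e. use X^(q-1) = gamma."""
    a = list(a)
    while len(a) > q - 1:
        c = a.pop()                            # leading coefficient
        a[len(a) - (q - 1)] = (a[len(a) - (q - 1)] + c * gamma) % q
    return a + [0] * (q - 1 - len(a))          # pad to fixed length q-1

f = [3, 1, 4]                                  # f(X) = 3 + X + 4X^2, arbitrary message

# Left side: f(X)^q reduced mod E(X).
power = [1]
for _ in range(q):
    power = polymul(power, f)
lhs = reduce_mod_E(power)

# Right side: f(gamma*X), i.e. coefficient a_i becomes a_i * gamma^i.
rhs = reduce_mod_E([ai * pow(gamma, i, q) % q for i, ai in enumerate(f)])

assert lhs == rhs
print(lhs)
```

The check works because of the Frobenius identity $f(X)^q=f(X^q)$ over $\mathbb{F}_q$ together with $X^{q}\equiv\gamma X\bmod E(X)$.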
If we compare the folded RS code to a PV code of order 2 for the set of evaluation points

$\left\{1,\gamma,\ldots,\gamma^{m-2},\ \gamma^{m},\gamma^{m+1},\ldots,\gamma^{2m-2},\ \ldots,\ \gamma^{n-m},\gamma^{n-m+1},\ldots,\gamma^{n-2}\right\}$

we can see that in the PV encoding of $f$, for every $0\leq i\leq n/m-1$ and every $0<j<m-1$, the value $f(\gamma^{mi+j})$ appears exactly twice (once as $f(\gamma^{mi+j})$ and once as $f_1(\gamma^{-1}\cdot\gamma^{mi+j})$), unlike in the folded RS encoding, in which it appears only once. Thus, the PV and folded RS codes carry the same information, but the rate of the folded RS code is larger by a factor of $2(m-1)/m$.
For appropriate parameter choices, such codes of rate $R$ can be list decoded up to a fraction $1-R^{s/(s+1)}$ of errors, for any $s\geq1$.
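To see how the radius $1-R^{s/(s+1)}$ improves on the $1-\sqrt{R}$ bound (the case $s=1$) and approaches the optimal $1-R$ as $s$ grows, one can simply evaluate the expression; the rate $R=0.25$ below is an arbitrary sample value.

```python
# Illustrative comparison of list-decoding radii as a function of s:
# 1 - R^(s/(s+1)) equals 1 - sqrt(R) at s = 1 and tends to 1 - R as s grows.
def radius(R, s):
    return 1 - R ** (s / (s + 1))

R = 0.25                      # sample rate; the optimal radius would be 1 - R = 0.75
for s in (1, 2, 5, 100):
    print(s, round(radius(R, s), 3))
```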
A list decoding algorithm that runs in quadratic time to decode FRS codes up to radius $1-R-\varepsilon$ proceeds in two linear-algebraic steps: first interpolate a nonzero polynomial

$Q(X,Y_1,Y_2,\ldots,Y_s)=A_0(X)+A_1(X)Y_1+A_2(X)Y_2+\cdots+A_s(X)Y_s,$

after which all the polynomials $f\in\mathbb{F}_q[X]$ of degree at most $k-1$ satisfying the resulting algebraic condition are found; they lie in an affine subspace containing at most $q^{s-1}$ candidates.
Guruswami presents an $n^{\Omega(1/\varepsilon^{2})}$-time list decoding algorithm that decodes FRS codes up to radius $1-R-\varepsilon$ with a list of size $n^{O(1/\varepsilon^{2})}$ that always contains the message polynomial $f(X)$.
It is a Welch–Berlekamp-style interpolation (it can be viewed as a higher-dimensional generalization of the Welch–Berlekamp algorithm). Suppose we received a codeword $y$, folded into columns of $m$ symbols:

$\left(\begin{bmatrix}y_0\\y_1\\y_2\\\vdots\\y_{m-1}\end{bmatrix},\begin{bmatrix}y_m\\y_{m+1}\\y_{m+2}\\\vdots\\y_{2m-1}\end{bmatrix},\ldots,\begin{bmatrix}y_{n-m}\\y_{n-m+1}\\y_{n-m+2}\\\vdots\\y_{n-1}\end{bmatrix}\right)$
We interpolate the nonzero polynomial

$Q(X,Y_1,\ldots,Y_s)=A_0(X)+A_1(X)Y_1+\cdots+A_s(X)Y_s,\qquad\begin{cases}\deg(A_i)\leqslant D&1\leqslant i\leqslant s\\\deg(A_0)\leqslant D+k-1&\end{cases}$

by using a carefully chosen degree parameter

$D=\left\lfloor\frac{N(m-s+1)-k+1}{s+1}\right\rfloor$
So the interpolation requirements will be

$Q\left(\gamma^{im+j},y_{im+j},y_{im+j+1},\ldots,y_{im+j+s-1}\right)=0,\quad\text{for }i=0,1,\ldots,\tfrac{n}{m}-1,\ j=0,1,\ldots,m-s.$
Then the number of monomials in $Q(X,Y_1,\ldots,Y_s)$ is

$(D+1)s+D+k=(D+1)(s+1)+k-1>N(m-s+1).$

Because the number of monomials in $Q(X,Y_1,\ldots,Y_s)$ exceeds the number $N(m-s+1)$ of interpolation conditions, the homogeneous linear system defining the coefficients of $Q$ always has a nonzero solution.
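The counting argument above is easy to verify for concrete parameters. The values of $N$, $m$, $s$, $k$ below are made-up sample parameters, used only to illustrate that the chosen $D$ guarantees more monomials than conditions.

```python
# Illustrative check (made-up parameters) that the degree choice
#   D = floor((N(m-s+1) - k + 1) / (s+1))
# gives Q more monomials than interpolation conditions, so a nonzero Q exists.
N, m, s, k = 16, 4, 2, 10           # N = n/m columns; requires 1 <= s <= m

D = (N * (m - s + 1) - k + 1) // (s + 1)

monomials = (D + 1) * s + (D + k)   # coefficients of A_1..A_s plus A_0
conditions = N * (m - s + 1)        # one condition per (i, j) pair

assert monomials == (D + 1) * (s + 1) + k - 1
assert monomials > conditions
print(D, monomials, conditions)
```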
Lemma 1. A nonzero polynomial $0\neq Q\in\mathbb{F}_q[X,Y_1,\ldots,Y_s]$ satisfying the above interpolation conditions can be found by solving a homogeneous linear system over $\mathbb{F}_q$ with at most $Nm$ constraints, and such a $Q$ can be computed in $O(Nm\log^{2}(Nm)\log\log(Nm))$ operations over $\mathbb{F}_q$.
This lemma shows us that the interpolation step can be done in near-linear time.
So far we have everything we need to construct the multivariate polynomial $Q(X,Y_1,\ldots,Y_s)$. The remaining task is to show how $Q$ constrains the candidate message polynomials $f(X)$.
Lemma 2. If a candidate message polynomial $f(X)\in\mathbb{F}[X]$ of degree at most $k-1$ agrees with the received word $y$ in at least $t$ columns with

$t>\frac{D+k-1}{m-s+1},$

then

$Q(X,f(X),f(\gamma X),\ldots,f(\gamma^{s-1}X))=0.$
Here "agree" means that an entire column of $m$ symbols in the folded encoding of $f$ matches the corresponding column of $y$.
This lemma shows that any such interpolated polynomial $Q(X,Y_1,\ldots,Y_s)$ yields an algebraic condition that must be satisfied by every message polynomial $f(X)$ with sufficient agreement.
Combining Lemma 2 with the choice of the parameter $D$, we have

$t(m-s+1)>\frac{N(m-s+1)+s(k-1)}{s+1}$

Further we can get the decoding bound (using $k=nR=NmR$)

$t\geqslant\frac{N}{s+1}+\frac{s}{s+1}\cdot\frac{k}{m-s+1}=N\left(\frac{1}{s+1}+\frac{s}{s+1}\cdot\frac{mR}{m-s+1}\right)$

We notice that the fractional agreement is

$\frac{1}{s+1}+\frac{s}{s+1}\cdot\frac{mR}{m-s+1}.$
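The behavior of this agreement fraction is easy to explore numerically. The sketch below uses sample values of $s$, $m$ and a sample rate $R=0.5$ (not from the article) to show that for large $s$ with $m\gg s$ the required agreement approaches $R$, i.e. the tolerated error fraction approaches $1-R$; the case $s=m=1$ recovers the unique-decoding radius $(1-R)/2$.

```python
# Illustrative evaluation (sample parameters) of the fractional agreement
#   1/(s+1) + s/(s+1) * mR/(m-s+1)
# needed for decoding; the tolerated error fraction is 1 minus this value.
def agreement_fraction(s, m, R):
    return 1 / (s + 1) + (s / (s + 1)) * (m * R / (m - s + 1))

R = 0.5
for s, m in [(1, 1), (3, 9), (9, 81), (31, 961)]:
    err = 1 - agreement_fraction(s, m, R)      # tolerated error fraction
    print(s, m, round(err, 3))                 # tends toward 1 - R = 0.5
```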
During this step, our task is to find all polynomials $f\in\mathbb{F}_q[X]$ of degree at most $k-1$ satisfying

$A_0(X)+A_1(X)f(X)+A_2(X)f(\gamma X)+\cdots+A_s(X)f(\gamma^{s-1}X)=0$

Since the above equation forms a linear system of equations over $\mathbb{F}_q$ in the coefficients $f_0,f_1,\ldots,f_{k-1}$ of

$f(X)=f_0+f_1X+\cdots+f_{k-1}X^{k-1},$

the solutions to the above equation form an affine subspace of $\mathbb{F}_q^{k}$.
It is natural to ask how large the dimension of this solution space is, and whether it admits an upper bound. Having an upper bound is very important in constructing an efficient list decoding algorithm, because one can then simply output all the codewords in the subspace for any given decoding problem. The dimension indeed has an upper bound, as the lemma below argues.
Lemma 3. If the order of $\gamma$ is at least $k$, the solutions to the above equation form an affine subspace of $\mathbb{F}_q^{k}$ of dimension at most $s-1$.

This lemma gives the upper bound on the dimension of the solution space.
Finally, based on the above analysis, we have the following theorem.

Theorem 1. For the folded Reed–Solomon code $\mathrm{FRS}_q^{(m)}[n,k]$ of block length $N=\tfrac{n}{m}$ and rate $R=\tfrac{k}{n}$, the following holds for all integers $s$, $1\leqslant s\leqslant m$. Given a received word $y\in\left(\mathbb{F}_q^{m}\right)^{N}$, in $O((Nm\log q)^{2})$ time, one can find a basis for a subspace of dimension at most $s-1$ that contains all message polynomials $f\in\mathbb{F}_q[X]$ of degree less than $k$ whose folded Reed–Solomon encoding differs from $y$ in at most a fraction

$\frac{s}{s+1}\left(1-\frac{mR}{m-s+1}\right)$

of the $N$ codeword positions.
When $s=m=1$, the above theorem recovers the unique decoding bound $(1-R)/2$. By choosing $s$ and $m$ appropriately (roughly $s\approx1/\varepsilon$ and $m\approx s/\varepsilon$), the decoding radius approaches $1-R-\varepsilon$, with a list size of at most $n^{O(1/\varepsilon)}$. Theorem 1 tells us exactly how large the error radius is.
Now we finally get the solution subspace. However, there is still one problem standing: the list size in the worst case is $n^{\Omega(1/\varepsilon)}$, since enumerating the affine subspace of dimension at most $s-1$ can take $q^{s-1}$ time. Things get better if we change the code by carefully choosing only a subset of all possible degree-$(k-1)$ polynomials as messages.
By converting the problem of decoding a folded Reed–Solomon code into two linear systems, one used for the interpolation step and another to find the candidate solution subspace, the complexity of the decoding problem is successfully reduced to quadratic. In the worst case, however, the bound on the output list size remains large.
It was mentioned in Step 2 that if one carefully chooses only a subset of all possible degree-$(k-1)$ polynomials as messages, the list size can be reduced. To achieve this goal, the idea is to restrict the coefficient vector $(f_0,f_1,\ldots,f_{k-1})$ to a special subset $\nu\subseteq\mathbb{F}_q^{k}$ satisfying the two conditions below.
Condition 1. The set $\nu$ must be large enough: $|\nu|\geqslant q^{(1-\varepsilon)k}$. This is to make sure that the rate will be reduced by a factor of at most $(1-\varepsilon)$.
Condition 2. The set $\nu$ must have small intersection with every affine subspace $S$ of dimension $s$: for all such subspaces $S\subset\mathbb{F}_q^{k}$, $|S\cap\nu|\leqslant L$.
With such a choice, the worst-case bound on the list size improves from $n^{\Omega(1/\varepsilon)}$ to a constant $O(1/\varepsilon^{2})$.
During this step, as it has to check each element of the solution subspace obtained from Step 2, the algorithm takes up to $q^{s-1}$ time, which is exponential in $s$.
Dvir and Lovett improved the result, building on the work of Guruswami, reducing the list size to a constant. Only the idea used to prune the solution subspace is presented here; for the details of the pruning process, please refer to the papers by Guruswami and by Dvir and Lovett listed in the references.
If Step 3 is not counted, this algorithm runs in quadratic time. A summary of the algorithm is given below.
Steps | Linear-algebraic list decoding of folded RS codes
---|---
Runtime | $O((Nm\log q)^{2})$
Error radius | $1-R-\varepsilon$
List size | $O(1/\varepsilon^{2})$ (after pruning)