Locally recoverable code explained

Locally recoverable codes are a family of error correction codes that were first introduced by D. S. Papailiopoulos and A. G. Dimakis and have been widely studied in information theory because of their applications to distributed and cloud storage systems.

An $[n,k,d,r]_q$ LRC is an $[n,k,d]_q$ linear code such that there is a function $f_i$ that takes as input $i$ and a set of $r$ other coordinates of a codeword $c=(c_1,\ldots,c_n)\in C$ different from $c_i$, and outputs $c_i$.

Definition

Let $C$ be an $[n,k,d]_q$ linear code. For $i\in\{1,\ldots,n\}$, let us denote by $r_i$ the minimum number of other coordinates we have to look at to recover an erasure in coordinate $i$. The number $r_i$ is said to be the locality of the $i$-th coordinate of the code. The locality of the code is defined as

$$r=\max\{r_i \mid i\in\{1,\ldots,n\}\}.$$

An $[n,k,d,r]_q$ locally recoverable code (LRC) is an $[n,k,d]_q$ linear code $C\subseteq\mathbb{F}_q^n$ with locality $r$.

Let $C$ be an $[n,k,d]_q$-locally recoverable code. Then an erased component can be recovered linearly, i.e. for every $i\in\{1,\ldots,n\}$, the space of linear equations of the code contains elements of the form $x_i=f(x_{i_1},\ldots,x_{i_r})$, where $i_j\neq i$.
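The linear-recovery statement suggests a brute-force way to compute localities for a small code: every equation of the code is a dual codeword, and a dual codeword that is nonzero at position $i$ expresses $x_i$ through the other positions in its support. The sketch below assumes a binary code given by a parity-check matrix `H` (a list of rows); the function name `localities` and the example matrices are illustrative choices, not part of the original text.

```python
from itertools import product

def localities(H):
    """Locality r_i of each coordinate of the binary code with parity-check matrix H.

    A coordinate i can be repaired from the other positions in the support of
    any dual codeword that is nonzero at i, so r_i is the smallest such support
    size minus one. Returns None for a coordinate that no parity check involves.
    """
    m, n = len(H), len(H[0])
    best = [None] * n
    for coeffs in product([0, 1], repeat=m):
        if not any(coeffs):
            continue  # skip the zero combination
        # A dual codeword is an F_2-linear combination of the rows of H.
        word = [sum(c * row[j] for c, row in zip(coeffs, H)) % 2 for j in range(n)]
        others = sum(word) - 1  # symbols read to repair one position of the support
        for i in range(n):
            if word[i] and (best[i] is None or others < best[i]):
                best[i] = others
    return best

# Parity-check matrix of the [7,4] binary Hamming code.
HAMMING_H = [
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]
# Parity-check matrix of the length-3 repetition code.
REPETITION_H = [
    [1, 1, 0],
    [0, 1, 1],
]

print(max(localities(HAMMING_H)))     # 3
print(max(localities(REPETITION_H)))  # 1
```

The enumeration visits all $2^m$ combinations of the rows of `H`, so it is only meant for toy examples.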

Optimal locally recoverable codes

Theorem Let $n=(r+1)s$ and let $C$ be an $[n,k,d]_q$-locally recoverable code having $s$ disjoint locality sets of size $r+1$. Then

$$d\leq n-k-\left\lceil\frac{k}{r}\right\rceil+2.$$

An $[n,k,d,r]_q$-LRC $C$ is said to be optimal if the minimum distance of $C$ satisfies

$$d=n-k-\left\lceil\frac{k}{r}\right\rceil+2.$$
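As a quick illustration of the bound and the optimality condition, the following sketch evaluates $n-k-\lceil k/r\rceil+2$ with integer arithmetic. The function names `lrc_distance_bound` and `is_optimal` are hypothetical helpers introduced here for illustration.

```python
def lrc_distance_bound(n, k, r):
    """Upper bound n - k - ceil(k / r) + 2 on the minimum distance."""
    ceil_k_over_r = (k + r - 1) // r  # integer ceiling, avoids floating point
    return n - k - ceil_k_over_r + 2

def is_optimal(n, k, d, r):
    """True when the distance d meets the bound with equality."""
    return d == lrc_distance_bound(n, k, r)

print(lrc_distance_bound(15, 8, 4))  # 7
print(is_optimal(15, 8, 7, 4))       # True
print(lrc_distance_bound(15, 8, 8))  # 8
```

With $r=k$ the expression reduces to $n-k+1$, the classical Singleton bound, which the last call shows for $n=15$, $k=8$.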

Tamo–Barg codes

Let $f\in\mathbb{F}_q[x]$ be a polynomial and let $\ell$ be a positive integer. Then $f$ is said to be $(r,\ell)$-good if

• $f$ has degree $r+1$,

• there exist distinct subsets $A_1,\ldots,A_\ell$ of $\mathbb{F}_q$ such that

– for any $i\in\{1,\ldots,\ell\}$, $f(A_i)=\{t_i\}$ for some $t_i\in\mathbb{F}_q$, i.e., $f$ is constant on $A_i$,

– $\#A_i=r+1$,

– $A_i\cap A_j=\varnothing$ for any $i\neq j$.

We say that $\{A_1,\ldots,A_\ell\}$ is a splitting covering for $f$.
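The definition can be checked mechanically for small prime fields by grouping field elements according to the value of $f$. The sketch below assumes $q=p$ is prime (so that arithmetic is plain modular arithmetic) and that the polynomial is given as a coefficient list with `coeffs[i]` multiplying $x^i$ and a nonzero leading coefficient. The helper names are illustrative.

```python
from collections import defaultdict

def level_sets(coeffs, p):
    """Group the elements of F_p by the value the polynomial takes on them."""
    groups = defaultdict(list)
    for x in range(p):
        value = sum(c * pow(x, i, p) for i, c in enumerate(coeffs)) % p
        groups[value].append(x)
    return dict(groups)

def splitting_covering(coeffs, p):
    """All level sets of size deg(f) = r + 1; distinct level sets are disjoint."""
    degree = len(coeffs) - 1
    return [pts for pts in level_sets(coeffs, p).values() if len(pts) == degree]

def is_good(coeffs, p, r, ell):
    """Check whether the polynomial is (r, ell)-good over F_p."""
    degree = len(coeffs) - 1
    return degree == r + 1 and len(splitting_covering(coeffs, p)) >= ell

cube = [0, 0, 0, 1]  # f(x) = x^3 over F_7
print(splitting_covering(cube, 7))  # [[1, 2, 4], [3, 5, 6]]
print(is_good(cube, 7, 2, 2))       # True
```

Here $x^3$ is constant on the two cosets of the subgroup $\{1,2,4\}$ of $\mathbb{F}_7^*$, so it is $(2,2)$-good over $\mathbb{F}_7$.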

Tamo–Barg construction

The Tamo–Barg construction utilizes good polynomials.

• Suppose that an $(r,\ell)$-good polynomial $f(x)$ over $\mathbb{F}_q$ is given with splitting covering $\{A_i : i\in\{1,\ldots,\ell\}\}$.

• Let $s\leq\ell-1$ be a positive integer.

• Consider the following $\mathbb{F}_q$-vector space of polynomials: $V=\left\{\sum_{i=0}^{s}g_i(x)f(x)^i : \deg(g_i(x))\leq\deg(f(x))-2\right\}$.

• Let $T=\bigcup_{i=1}^{\ell}A_i$.

• The code $\{\operatorname{ev}_T(g):g\in V\}$ is an $((r+1)\ell,(s+1)r,d,r)$-optimal locally recoverable code, where $\operatorname{ev}_T$ denotes evaluation of $g$ at all points in the set $T$.
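The steps of the construction can be sketched for a prime field as follows. The basis $x^j f(x)^i$ with $0\leq j\leq r-1$ and $0\leq i\leq s$ spans $V$, and evaluating each basis polynomial on $T$ gives one row of a generator matrix. The function names, the coefficient-list representation (`f_coeffs[i]` multiplies $x^i$), and the restriction to prime $p$ are assumptions made for this illustration.

```python
def poly_eval(coeffs, x, p):
    """Evaluate the polynomial with coefficient list coeffs at x modulo p."""
    return sum(c * pow(x, i, p) for i, c in enumerate(coeffs)) % p

def tamo_barg_generator(f_coeffs, sets, s, p):
    """Evaluation points T and generator rows for the basis x^j * f(x)^i of V."""
    r = len(f_coeffs) - 2                  # deg f = r + 1
    points = [a for A in sets for a in A]  # T, listed set by set
    rows = []
    for i in range(s + 1):                 # power of f
        for j in range(r):                 # deg g_i <= r - 1
            rows.append([pow(x, j, p) * pow(poly_eval(f_coeffs, x, p), i, p) % p
                         for x in points])
    return points, rows

def encode(message, rows, p):
    """Codeword = message (length (s+1)r) times the generator matrix, modulo p."""
    n = len(rows[0])
    return [sum(m * row[c] for m, row in zip(message, rows)) % p for c in range(n)]

# Toy instance: f(x) = x^3 over F_7, constant on {1,2,4} and on {3,5,6}.
points, rows = tamo_barg_generator([0, 0, 0, 1], [[1, 2, 4], [3, 5, 6]], s=1, p=7)
print(len(rows), len(points))        # 4 6
print(encode([0, 1, 0, 1], rows, 7))  # [2, 4, 1, 0, 0, 0]
```

In this toy instance $r=2$, $\ell=2$ and $s=1$, so the code has length 6 and dimension 4. The second call encodes the polynomial $x+x\,f(x)=x+x^4$.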

Parameters of Tamo–Barg codes

• Length. Since the sets $A_i$ are disjoint for $i\in\{1,\ldots,\ell\}$, the length of the code is $|T|=(r+1)\ell$.

• Dimension. The dimension of the code is $(s+1)r$, for $s\leq\ell-1$, as each $g_i$ has degree at most $\deg(f(x))-2$, covering a vector space of dimension $\deg(f(x))-1=r$, and by the construction of $V$, there are $s+1$ distinct $g_i$.

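• Distance. We have $V\subseteq\mathbb{F}_q[x]_{\leq k}$, where $k=r+1-2+s(r+1)$ here denotes the largest possible degree rather than the dimension, so the code is contained in the Reed–Solomon code of polynomials of degree at most $k$ evaluated on $T$. A nonzero polynomial of degree at most $k$ has at most $k$ roots, so the minimum distance is at least $(r+1)\ell-((r+1)-2+s(r+1))$. This value coincides with the upper bound $n-(s+1)r-(s+1)+2$ for a locally recoverable code of dimension $(s+1)r$ and locality $r$, so the minimum distance equals $(r+1)\ell-((r+1)-2+s(r+1))$.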

• Locality. After the erasure of a single component, the evaluation at $a_j\in A_j$, where $|A_j|=r+1$, is unknown, but the evaluations at all other $a\in A_j$ are known, so at most $r$ evaluations are needed to uniquely determine the erased component, which gives the locality $r$. To see this, note that $g$ restricted to $A_j$ can be described by a polynomial $h$ of degree at most $\deg(f(x))-2=r+1-2=r-1$, thanks to the form of the elements of $V$ (i.e., because $f$ is constant on $A_j$ and the $g_i$'s have degree at most $\deg(f(x))-2$). On the other hand, $|A_j\setminus\{a_j\}|=r$, and $r$ evaluations uniquely determine a polynomial of degree at most $r-1$. Therefore $h$ can be constructed and evaluated at $a_j$ to recover $g(a_j)$.
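The locality argument amounts to Lagrange interpolation on one set $A_j$. The sketch below assumes a prime field and uses, purely as an illustration, the polynomial $g(x)=1+x+x^2+x^3+x^5+x^6+x^7+x^8$ over $\mathbb{F}_{41}$ together with the set $\{2,20,32,33,36\}$, on which $x^5$ is constant. The function names are hypothetical, and `pow(den, -1, p)` requires Python 3.8 or later.

```python
def recover_erased(known, erased_point, p):
    """Recover the symbol at erased_point from the other symbols of its set.

    known maps each surviving point of the set to its codeword symbol; the
    unique polynomial of degree < len(known) through these points is
    evaluated at erased_point (Lagrange interpolation modulo the prime p).
    """
    total = 0
    for xi, yi in known.items():
        num, den = 1, 1
        for xj in known:
            if xj != xi:
                num = num * (erased_point - xj) % p
                den = den * (xi - xj) % p
        total = (total + yi * num * pow(den, -1, p)) % p
    return total

P = 41
EXPONENTS = [0, 1, 2, 3, 5, 6, 7, 8]

def g(x):
    """Illustrative codeword polynomial: all coefficients equal to 1."""
    return sum(pow(x, e, P) for e in EXPONENTS) % P

A = [2, 20, 32, 33, 36]   # x^5 is constant (equal to 32) on this set
erased = 32
known = {a: g(a) for a in A if a != erased}
print(recover_erased(known, erased, P), g(erased))  # 0 0
```

Only the $r=4$ surviving symbols of the set are read, and none of the other ten codeword symbols.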

Example of Tamo–Barg construction

We will use $x^5\in\mathbb{F}_{41}[x]$ to construct a $[15,8,7,4]$-LRC. Notice that the degree of this polynomial is 5, and it is constant on $A_i$ for $i\in\{1,\ldots,8\}$, where $A_1=\{1,10,16,18,37\}$, $A_2=2A_1$, $A_3=3A_1$, $A_4=4A_1$, $A_5=5A_1$, $A_6=6A_1$, $A_7=11A_1$, and $A_8=15A_1$: $A_1^5=\{1\}$, $A_2^5=\{32\}$, $A_3^5=\{38\}$, $A_4^5=\{40\}$, $A_5^5=\{9\}$, $A_6^5=\{27\}$, $A_7^5=\{3\}$, $A_8^5=\{14\}$. Hence, $x^5$ is a $(4,8)$-good polynomial over $\mathbb{F}_{41}$ by the definition. Now, we will use this polynomial to construct a code of dimension $k=8$ and length $n=15$ over $\mathbb{F}_{41}$. The locality of this code is 4, which will allow us to recover a single server failure by looking at the information contained in at most 4 other servers.
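The sets $A_i$ are the cosets of the subgroup of fifth roots of unity in $\mathbb{F}_{41}^*$, which can be confirmed with a few lines of modular arithmetic. The variable names below are illustrative.

```python
P = 41

# A_1: the fifth roots of unity in F_41.
A1 = sorted(x for x in range(1, P) if pow(x, 5, P) == 1)

# Coset representatives used in the text for A_1, ..., A_8.
multipliers = [1, 2, 3, 4, 5, 6, 11, 15]
cosets = [sorted(m * a % P for a in A1) for m in multipliers]

# Image of each coset under x -> x^5; each image is a single value.
images = [{pow(x, 5, P) for x in A} for A in cosets]

print(A1)                           # [1, 10, 16, 18, 37]
print([min(img) for img in images])  # [1, 32, 38, 40, 9, 27, 3, 14]
```

The eight cosets are pairwise disjoint and together cover all 40 nonzero elements of $\mathbb{F}_{41}$.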

Next, let us define the encoding polynomial: $f_a(x)=\sum_{i=0}^{r-1}f_i(x)x^i$, where $f_i(x)=\sum_{j=0}^{\frac{k}{r}-1}a_{i,j}g(x)^j$ and $g(x)=x^5$ is the good polynomial. So,

$$f_a(x)=a_{0,0}+a_{0,1}x^5+a_{1,0}x+a_{1,1}x^6+a_{2,0}x^2+a_{2,1}x^7+a_{3,0}x^3+a_{3,1}x^8.$$

The data to encode is the coefficient vector $a=(a_{0,0},a_{0,1},a_{1,0},a_{1,1},a_{2,0},a_{2,1},a_{3,0},a_{3,1})$. A message vector $m$ of length 8 is encoded to a codeword $c$ of length 15 by multiplying $m$ by the generator matrix, whose rows are the evaluations of the monomials $1,x^5,x,x^6,x^2,x^7,x^3,x^8$ at the points of $A_1$, $A_2$ and $A_3$:

$$G=\begin{pmatrix}
1&1&1&1&1&1&1&1&1&1&1&1&1&1&1\\
1&1&1&1&1&32&32&32&32&32&38&38&38&38&38\\
1&10&16&18&37&2&20&32&33&36&3&7&13&29&30\\
1&10&16&18&37&23&25&40&31&4&32&20&2&36&33\\
1&18&10&37&16&4&31&40&23&25&9&8&5&21&39\\
1&18&10&37&16&5&8&9&39&21&14&17&26&19&6\\
1&16&37&10&18&8&5&9&21&39&27&15&24&35&22\\
1&16&37&10&18&10&37&1&16&18&1&37&10&18&16
\end{pmatrix}.$$

For example, the information vector $m=(1,1,1,1,1,1,1,1)$ gives the codeword

$$c=mG=(8,8,5,9,21,3,36,0,32,12,2,20,37,33,21).$$
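The generator matrix and the sample codeword can be reproduced directly from the monomials $1,x^5,x,x^6,x^2,x^7,x^3,x^8$ and the 15 evaluation points. The sketch below uses plain Python integers modulo 41. The names `POINTS`, `EXPONENTS` and `encode` are introduced for illustration, and the point order is read off from the third row of $G$.

```python
P = 41
# Evaluation points in column order: A_1, then A_2, then A_3.
POINTS = [1, 10, 16, 18, 37, 2, 20, 32, 33, 36, 3, 7, 13, 29, 30]
# Exponents of the monomials attached to a_{0,0}, a_{0,1}, a_{1,0}, ..., a_{3,1}.
EXPONENTS = [0, 5, 1, 6, 2, 7, 3, 8]

# Row e of G is the monomial x^e evaluated at every point.
G = [[pow(x, e, P) for x in POINTS] for e in EXPONENTS]

def encode(m):
    """Multiply the length-8 message m by G over F_41."""
    return [sum(mi * row[j] for mi, row in zip(m, G)) % P
            for j in range(len(POINTS))]

c = encode([1] * 8)
print(c)
# [8, 8, 5, 9, 21, 3, 36, 0, 32, 12, 2, 20, 37, 33, 21]
```

The eighth coordinate is the evaluation at $x=32$, an element of multiplicative order 4 in $\mathbb{F}_{41}$, so $1+x+x^2+x^3$ vanishes there and the all-ones message encodes to 0 in that position.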

Observe that we constructed an optimal LRC; therefore, since the Singleton-type bound for LRCs is met with equality, the distance of this code is

$$d=n-k-\left\lceil\frac{k}{r}\right\rceil+2=15-8-2+2=7.$$

Thus, we can recover any 6 erasures from our codeword by looking at no more than 8 other components.

Locally recoverable codes with availability

A code $C$ has all-symbol locality $r$ and availability $t$ if every code symbol can be recovered from $t$ disjoint repair sets of other symbols, each set of size at most $r$ symbols. Such codes are called $(r,t)_a$-LRC.

Theorem The minimum distance of an $[n,k,d]_q$-LRC having locality $r$ and availability $t$ satisfies the upper bound

$$d\leq n-\sum_{i=0}^{t}\left\lfloor\frac{k-1}{r^i}\right\rfloor.$$

If the code is systematic and locality and availability apply only to its information symbols, then the code has information locality $r$ and availability $t$, and is called $(r,t)_i$-LRC.

Theorem The minimum distance $d$ of an $[n,k,d]_q$ linear $(r,t)_i$-LRC satisfies the upper bound

$$d\leq n-k-\left\lceil\frac{t(k-1)+1}{t(r-1)+1}\right\rceil+2.$$
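The two availability bounds can be evaluated numerically. The sketch below assumes the forms in which they are usually stated in the literature, $d\leq n-\sum_{i=0}^{t}\lfloor (k-1)/r^i\rfloor$ for all-symbol locality and $d\leq n-k-\lceil (t(k-1)+1)/(t(r-1)+1)\rceil+2$ for information locality. The function names are hypothetical.

```python
def all_symbol_availability_bound(n, k, r, t):
    """n - sum_{i=0}^{t} floor((k - 1) / r^i), assumed form for (r, t)_a-LRCs."""
    return n - sum((k - 1) // r**i for i in range(t + 1))

def information_availability_bound(n, k, r, t):
    """n - k - ceil((t(k-1)+1) / (t(r-1)+1)) + 2, assumed form for (r, t)_i-LRCs."""
    num = t * (k - 1) + 1
    den = t * (r - 1) + 1
    return n - k - (num + den - 1) // den + 2  # integer ceiling of num / den

# With a single repair set (t = 1) both expressions give the same value.
print(all_symbol_availability_bound(15, 8, 4, 1))   # 7
print(information_availability_bound(15, 8, 4, 1))  # 7
```

For $t=1$ both expressions reduce to $n-k-\lceil k/r\rceil+2$, which serves as a basic consistency check on the formulas.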

This article is licensed under the GNU Free Documentation License. It uses material from the Wikipedia article "Locally recoverable code".