Expander code explained

Expander codes
Block Length:	n
Message Length:	n-m
Rate:	1-m/n
Distance:	$2(1-\varepsilon)\gamma n$
Alphabet Size:	2
Notation:	$[n,n-m,2(1-\varepsilon)\gamma n]_2$-code

In coding theory, expander codes form a class of error-correcting codes that are constructed from bipartite expander graphs. Along with Justesen codes, expander codes are of particular interest since they have a constant positive rate, a constant positive relative distance, and a constant alphabet size. In fact, the alphabet contains only two elements, so expander codes belong to the class of binary codes. Furthermore, expander codes can be both encoded and decoded in time proportional to the block length of the code.

Expander codes

In coding theory, an expander code is a $[n,n-m]_2$ linear block code whose parity check matrix is the adjacency matrix of a bipartite expander graph. These codes have good relative distance $2(1-\varepsilon)\gamma$, where $\varepsilon$ and $\gamma$ are properties of the expander graph as defined later, rate $1-\tfrac{m}{n}$, and decodability (algorithms of running time $O(n)$ exist).

Definition

Let $B$ be a $(c,d)$-biregular graph between a set of $n$ nodes $\{v_1,\ldots,v_n\}$, called variables, and a set of $cn/d$ nodes $\{C_1,\ldots,C_{cn/d}\}$, called constraints.

Let $b(i,j)$ be a function designed so that, for each constraint $C_i$, the variables neighboring $C_i$ are $v_{b(i,1)},\ldots,v_{b(i,d)}$.

Let $\mathcal{S}$ be an error-correcting code of block length $d$. The expander code $\mathcal{C}(B,\mathcal{S})$ is the code of block length $n$ whose codewords are the words $(x_1,\ldots,x_n)$ such that, for $1\leq i\leq cn/d$, $(x_{b(i,1)},\ldots,x_{b(i,d)})$ is a codeword of $\mathcal{S}$.^[1]

It has been shown that nontrivial lossless expander graphs exist. Moreover, we can explicitly construct them.^[2]
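The definition can be turned into a direct membership test. The following is a minimal sketch, assuming a hypothetical toy $(2,3)$-biregular graph with $n=6$ variables and $cn/d=4$ constraints, and taking the inner code $\mathcal{S}$ to be the even-weight words of length 3. The graph, the name `is_expander_codeword`, and the choice of inner code are illustrative assumptions; the graph is far too small to be a useful expander.

```python
from itertools import product

def is_expander_codeword(x, constraints, inner_code):
    """Return True if every constraint's local view of x lies in inner_code.

    x           -- tuple of n bits
    constraints -- constraints[i] lists the 0-based indices b(i,1), ..., b(i,d)
    inner_code  -- set of length-d tuples (the code S in the definition)
    """
    return all(tuple(x[j] for j in nbrs) in inner_code for nbrs in constraints)

# Hypothetical toy graph: every variable appears in c = 2 constraints,
# every constraint has d = 3 variables, so there are cn/d = 4 constraints.
constraints = [(0, 1, 2), (3, 4, 5), (0, 3, 4), (1, 2, 5)]

# Assumed inner code S: all length-3 words of even weight (one parity check).
even_parity = {w for w in product((0, 1), repeat=3) if sum(w) % 2 == 0}

# Enumerate all 2^6 words and keep those that satisfy every constraint.
codewords = [x for x in product((0, 1), repeat=6)
             if is_expander_codeword(x, constraints, even_parity)]
```

With the even-weight inner code, each constraint is a single parity check, which matches the parity-check-matrix view of expander codes used in the rest of the article.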

Rate

The rate of $C$ is its dimension divided by its block length. In this case, the parity check matrix has size $m\times n$, so its rank is at most $m$ and $C$ has dimension at least $n-m$. Hence $C$ has rate at least $(n-m)/n=1-m/n$.
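The bound is an inequality because the parity checks may be linearly dependent. As a small illustration, the sketch below computes the rank over GF(2) of a hypothetical $4\times 6$ parity check matrix (rows stored as integer bitmasks, an encoding chosen here for brevity) and compares the exact rate with $1-m/n$. The helper name `gf2_rank` and the matrix are assumptions made for this example.

```python
from fractions import Fraction

def gf2_rank(rows):
    """Rank over GF(2) of a matrix whose rows are given as int bitmasks."""
    pivots = {}  # leading bit position -> stored row with that leading bit
    for row in rows:
        while row:
            top = row.bit_length() - 1
            if top not in pivots:
                pivots[top] = row   # new independent row
                break
            row ^= pivots[top]      # cancel the leading bit and keep reducing
    return len(pivots)

# Hypothetical parity check matrix with m = 4 rows and n = 6 columns.
H = [0b111000, 0b000111, 0b100110, 0b011001]
n, m = 6, len(H)

dimension = n - gf2_rank(H)       # dimension of the null space of H
rate = Fraction(dimension, n)     # exact rate of the code
lower_bound = 1 - Fraction(m, n)  # the bound 1 - m/n from the text
```

Here the four rows sum to zero, so the rank is 3 rather than 4 and the exact rate is strictly larger than the bound.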

Distance

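Suppose $\varepsilon<\tfrac{1}{2}$. Then the distance of a $(n,m,d,\gamma,1-\varepsilon)$ expander code $C$ is at least $2(1-\varepsilon)\gamma n$. Here the underlying graph has $n$ left vertices $L$ (the variables), each of degree $d$, and $m$ right vertices $R$ (the constraints), and every $S\subset L$ with $|S|\leq\gamma n$ has at least $d(1-\varepsilon)|S|$ neighbors.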

Proof

Note that we can consider every word $c\in\{0,1\}^n$ as a subset of vertices $S\subset L$, by saying that vertex $v_i\in S$ if and only if the $i$th index of the word is a 1. Then $c$ is a codeword iff every vertex $v\in R$ is adjacent to an even number of vertices in $S$. (In order to be a codeword, $cP^{\mathsf{T}}=0$, where $P$ is the $m\times n$ parity check matrix. Each vertex in $R$ corresponds to a row of $P$. Matrix multiplication over $\mathrm{GF}(2)=\{0,1\}$ then gives the desired result.) So, if a vertex $v\in R$ is adjacent to a single vertex in $S$, we know immediately that $c$ is not a codeword. Let $N(S)$ denote the neighbors in $R$ of $S$, and $U(S)$ denote those neighbors of $S$ which are unique, i.e., adjacent to a single vertex of $S$.

Lemma 1

For every $S\subset L$ of size $|S|\leq\gamma n$, $d|S|\geq|N(S)|\geq|U(S)|\geq d(1-2\varepsilon)|S|$.

Proof

Trivially, $|N(S)|\geq|U(S)|$, since $v\in U(S)$ implies $v\in N(S)$. The bound $|N(S)|\leq d|S|$ follows since the degree of every vertex in $S$ is $d$. By the expansion property of the graph, there must be a set of $d(1-\varepsilon)|S|$ edges which go to distinct vertices. The remaining $d\varepsilon|S|$ edges make at most $d\varepsilon|S|$ neighbors not unique, so $|U(S)|\geq d(1-\varepsilon)|S|-d\varepsilon|S|=d(1-2\varepsilon)|S|$.
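The counting in this proof can be checked on a concrete set. The sketch below computes $N(S)$ and $U(S)$ for a hypothetical left-regular graph with $d=3$ (the adjacency lists and the name `neighbors_and_unique` are assumptions for illustration) and verifies the chain $d|S|\geq|N(S)|\geq|U(S)|\geq 2|N(S)|-d|S|$, which is the same edge-counting argument with $|N(S)|$ in place of $d(1-\varepsilon)|S|$.

```python
from collections import Counter

def neighbors_and_unique(S, adj):
    """Return (N(S), U(S)) where adj[i] lists the right neighbors of left vertex i."""
    hits = Counter(r for i in S for r in adj[i])   # edges from S into each right vertex
    N = set(hits)                                  # right vertices hit at least once
    U = {r for r, k in hits.items() if k == 1}     # right vertices hit exactly once
    return N, U

# Hypothetical left-regular graph: 4 left vertices of degree d = 3, 8 right vertices.
adj = [[0, 1, 2], [2, 3, 4], [4, 5, 6], [6, 7, 0]]
d = 3

S = {0, 1}
N, U = neighbors_and_unique(S, adj)
```

For this $S$, five of the six edges reach distinct right vertices, and the single "wasted" edge spoils exactly one neighbor, so the lower bound on $|U(S)|$ is met with equality.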

Corollary

Every sufficiently small nonempty $S$ has a unique neighbor. This follows since $\varepsilon<\tfrac{1}{2}$, which makes the lower bound $d(1-2\varepsilon)|S|$ of Lemma 1 positive.

Lemma 2

Every nonempty subset $T\subset L$ with $|T|<2(1-\varepsilon)\gamma n$ has a unique neighbor.

Proof

Lemma 1 proves the case $|T|\leq\gamma n$, so suppose $2(1-\varepsilon)\gamma n>|T|>\gamma n$. Let $S\subset T$ such that $|S|=\gamma n$. By Lemma 1, we know that $|U(S)|\geq d(1-2\varepsilon)|S|$. Then a vertex $v\in U(S)$ is in $U(T)$ iff $v\notin N(T\setminus S)$, and we know that $|T\setminus S|<2(1-\varepsilon)\gamma n-\gamma n=(1-2\varepsilon)\gamma n$, so by the first part of Lemma 1, we know $|N(T\setminus S)|\leq d|T\setminus S|<d(1-2\varepsilon)\gamma n$. Since $\varepsilon<\tfrac{1}{2}$, $|U(T)|\geq|U(S)\setminus N(T\setminus S)|\geq|U(S)|-|N(T\setminus S)|>0$, because $|U(S)|\geq d(1-2\varepsilon)\gamma n>|N(T\setminus S)|$, and hence $U(T)$ is not empty.

Corollary

Note that if a $T\subset L$ has at least 1 unique neighbor, i.e. $|U(T)|>0$, then the word $c$ corresponding to $T$ cannot be a codeword, as it will not multiply to the all zeros vector by the parity check matrix. By the previous argument, every nonzero $c\in C$ satisfies $\mathrm{wt}(c)\geq 2(1-\varepsilon)\gamma n$. Since $C$ is linear, its distance equals the minimum weight of a nonzero codeword, and we conclude that $C$ has distance at least $2(1-\varepsilon)\gamma n$.
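The link between unique neighbors and codewords can be sanity-checked by brute force on a tiny example. The following sketch assumes a hypothetical graph with 6 variables of degree 2 and 4 constraints (not a good expander: variables 1 and 2 share both of their neighbors), enumerates all words, and confirms that no word whose support has a unique neighbor is a codeword. All names here are illustrative.

```python
from itertools import product
from collections import Counter

# adj[i] lists the constraints adjacent to variable i (hypothetical toy graph).
adj = [[0, 2], [0, 3], [0, 3], [1, 2], [1, 2], [1, 3]]

def degree_counts(x, adj):
    """For each constraint touched by x, count its neighbors that are set in x."""
    return Counter(r for i, bit in enumerate(x) if bit for r in adj[i])

def is_codeword(x, adj):
    """A word is a codeword iff every constraint sees an even number of ones."""
    return all(k % 2 == 0 for k in degree_counts(x, adj).values())

def has_unique_neighbor(x, adj):
    """True if some constraint is adjacent to exactly one set variable of x."""
    return any(k == 1 for k in degree_counts(x, adj).values())

words = list(product((0, 1), repeat=len(adj)))
codewords = [x for x in words if is_codeword(x, adj)]

# For a linear code the distance is the minimum weight of a nonzero codeword.
min_distance = min(sum(x) for x in codewords if any(x))
```

Because variables 1 and 2 have identical neighborhoods, the set $\{v_1,v_2\}$ has no unique neighbor and the word with ones in exactly those two positions is a codeword, which is why the distance of this toy code is only 2.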

Encoding

The encoding time for an expander code is upper bounded by that of a general linear code: $O(n^2)$ by matrix multiplication. A result due to Spielman shows that encoding is possible in $O(n)$ time.^[3]

Decoding

Decoding of expander codes is possible in $O(n)$ time when $\varepsilon<\tfrac{1}{4}$ using the following algorithm.

Let $v_i$ be the vertex of $L$ that corresponds to the $i$th index in the codewords of $C$. Let $y\in\{0,1\}^n$ be a received word, and $V(y)=\{v_i\mid\text{the }i\text{th position of }y\text{ is a }1\}$. Let $e(i)$ be $|\{v\in R\mid v_i\in N(v)\text{ and }|N(v)\cap V(y)|\text{ is even}\}|$, and $o(i)$ be $|\{v\in R\mid v_i\in N(v)\text{ and }|N(v)\cap V(y)|\text{ is odd}\}|$. Then consider the greedy algorithm, in which $o(i)$ and $e(i)$ are always computed with respect to the current word $y'$:

----
Input: received word $y$.

initialize y' to y
while there is a v in R adjacent to an odd number of vertices in V(y')
    if there is an i such that o(i) > e(i)
        flip entry i in y'
    else
        fail

Output: fail, or modified codeword $y'$.
----
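The greedy algorithm can be sketched directly in code. The version below is a straightforward, non-optimized rendering: it rescans all variables on every iteration, so it is not linear time. The function name `flip_decode` and the test graph are assumptions for illustration. The graph is the point-line incidence structure of the Fano plane (7 variables of degree $d=3$, 7 constraints), chosen only because any two variables share exactly one constraint, so after a single error the corrupted variable is the only one with $o(i)>e(i)$. It is not claimed to satisfy the expansion hypothesis for any particular $\gamma$ and $\varepsilon$.

```python
def flip_decode(y, adj, m):
    """Greedy flip decoding sketch.

    y   -- received word as a list of n bits
    adj -- adj[i] lists the constraints adjacent to variable i
    m   -- number of constraints
    Returns the decoded word, or None if no variable has o(i) > e(i).
    """
    y = list(y)
    # parity[r] == 1 iff constraint r sees an odd number of ones (unsatisfied).
    parity = [0] * m
    for i, bit in enumerate(y):
        if bit:
            for r in adj[i]:
                parity[r] ^= 1
    while any(parity):
        flip = None
        for i, nbrs in enumerate(adj):
            o = sum(parity[r] for r in nbrs)   # o(i); e(i) = len(nbrs) - o
            if 2 * o > len(nbrs):              # equivalent to o(i) > e(i)
                flip = i
                break
        if flip is None:
            return None                        # the "fail" branch
        y[flip] ^= 1
        for r in adj[flip]:
            parity[r] ^= 1
    return y

# Hypothetical toy graph: variables are the 7 lines of the Fano plane,
# constraints are its 7 points.
fano = [[0, 1, 2], [0, 3, 4], [0, 5, 6], [1, 3, 5],
        [1, 4, 6], [2, 3, 6], [2, 4, 5]]
codeword = [0, 0, 0, 1, 1, 1, 1]   # every point lies on an even number of chosen lines

decoded = []
for pos in range(7):
    received = codeword.copy()
    received[pos] ^= 1             # introduce a single bit error
    decoded.append(flip_decode(received, fano, 7))
```

Each flip strictly reduces the number of unsatisfied constraints, so the loop runs at most `m` times regardless of the input.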

Proof

We show first the correctness of the algorithm, and then examine its running time.

Correctness

We must show that the algorithm terminates with the correct codeword when the received word differs from the original codeword in fewer than $\gamma(1-2\varepsilon)n$ positions, which is less than half the code's distance. Let the set of corrupt variables be $S$, with $s=|S|$, and call a vertex in $R$ unsatisfied if it is adjacent to an odd number of vertices in $V(y')$. The following lemma will prove useful.

Lemma 3

If $0<s<\gamma n$, then there is a $v_i$ with $o(i)>e(i)$.

Proof

By Lemma 1, we know that $|U(S)|\geq d(1-2\varepsilon)s$. So an average vertex of $S$ has at least $d(1-2\varepsilon)>d/2$ unique neighbors (recall unique neighbors are unsatisfied and hence contribute to $o(i)$), since $\varepsilon<\tfrac{1}{4}$. Because $o(i)+e(i)=d$, there is thus a vertex $v_i$ with $o(i)>d/2>e(i)$.

So, if we have not yet reached the correct codeword and $s<\gamma n$, then there will always be some vertex to flip. Next, we show that the number of errors can never increase beyond $\gamma n$.

Lemma 4

If we start with $s<\gamma(1-2\varepsilon)n$, then we never reach $s=\gamma n$ at any point in the algorithm.

Proof

When we flip a vertex $v_i$, $o(i)$ and $e(i)$ are interchanged, and since we had $o(i)>e(i)$, this means the number of unsatisfied vertices on the right decreases by at least one after each flip. Since $s<\gamma(1-2\varepsilon)n$, the initial number of unsatisfied vertices is less than $d\gamma(1-2\varepsilon)n$, by the graph's $d$-regularity. Each flip changes $s$ by exactly one, so $s$ cannot pass $\gamma n$ without equaling it. If we reached a string with $\gamma n$ errors, then by Lemma 1, there would be at least $d\gamma(1-2\varepsilon)n$ unique neighbors, which means there would be at least $d\gamma(1-2\varepsilon)n$ unsatisfied vertices, a contradiction.

Lemmas 3 and 4 show us that if we start with $s<\gamma(1-2\varepsilon)n$ (less than half the distance of $C$), then we will always find a vertex $v_i$ to flip. Each flip reduces the number of unsatisfied vertices in $R$ by at least 1, and hence the algorithm terminates in at most $m$ steps, and it terminates at some codeword, by Lemma 3. (Were it not at a codeword, there would be some vertex to flip.) Lemma 4 shows us that we can never be farther than $\gamma n$ away from the correct codeword. Since the code has distance at least $2(1-\varepsilon)\gamma n>\gamma n$ (since $\varepsilon<\tfrac{1}{2}$), the codeword it terminates on must be the correct codeword: any other codeword differs from the correct one in more than $\gamma n$ positions, so the algorithm could not have traveled far enough to reach it.

Complexity

We now show that the algorithm can achieve linear time decoding. Let $\tfrac{n}{m}$ be constant, and $r$ be the maximum degree of any vertex in $R$. Note that $r$ is also constant for known constructions.

Pre-processing: It takes $O(mr)$ time to compute whether each vertex in $R$ has an odd or even number of neighbors in $V(y)$.

Pre-processing 2: We take $O(dn)=O(dmr)$ time to compute a list of vertices $v_i$ in $L$ which have $o(i)>e(i)$.

Each Iteration: We simply remove the first list element. To update the list of odd / even vertices in $R$, we need only update $O(d)$ entries, inserting / removing as necessary. We then update $O(dr)$ entries in the list of vertices in $L$ with more odd than even neighbors, inserting / removing as necessary. Thus each iteration takes $O(dr)$ time.

As argued above, the total number of iterations is at most $m$. This gives a total runtime of $O(mdr)=O(n)$ time, where $d$ and $r$ are constants.
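The bookkeeping described above can be sketched as follows. This is one possible arrangement, not a reference implementation: it keeps a parity bit per constraint, the count $o(i)$ per variable, and a Python `set` of currently flippable variables (a set is used instead of a list so that insertion and removal are constant time on average). The function name `flip_decode_linear` and the Fano-plane test graph are assumptions for illustration.

```python
def flip_decode_linear(y, adj, m):
    """Flip decoding with incremental updates: O(d*r) work per flip."""
    n = len(y)
    y = list(y)
    # radj[r] lists the variables adjacent to constraint r (at most r_max each).
    radj = [[] for _ in range(m)]
    for i, nbrs in enumerate(adj):
        for r in nbrs:
            radj[r].append(i)
    # Pre-processing: parity[r] == 1 iff constraint r is unsatisfied.
    parity = [0] * m
    for i in range(n):
        if y[i]:
            for r in adj[i]:
                parity[r] ^= 1
    # Pre-processing 2: odd[i] is o(i); flippable holds every i with o(i) > e(i).
    odd = [sum(parity[r] for r in adj[i]) for i in range(n)]
    flippable = {i for i in range(n) if 2 * odd[i] > len(adj[i])}
    unsatisfied = sum(parity)
    while unsatisfied:
        if not flippable:
            return None                    # no variable with o(i) > e(i): fail
        i = flippable.pop()
        y[i] ^= 1
        for r in adj[i]:                   # O(d) constraints change parity
            delta = -1 if parity[r] else 1
            parity[r] ^= 1
            unsatisfied += delta
            for j in radj[r]:              # O(r) variables see that change
                odd[j] += delta
                if 2 * odd[j] > len(adj[j]):
                    flippable.add(j)
                else:
                    flippable.discard(j)
    return y

# Hypothetical toy graph: lines (variables) and points (constraints) of the Fano plane.
fano = [[0, 1, 2], [0, 3, 4], [0, 5, 6], [1, 3, 5],
        [1, 4, 6], [2, 3, 6], [2, 4, 5]]
codeword = [0, 0, 0, 1, 1, 1, 1]

results = []
for pos in range(7):
    received = codeword.copy()
    received[pos] ^= 1                     # single bit error at position pos
    results.append(flip_decode_linear(received, fano, 7))
```

The two pre-processing passes touch each edge a constant number of times, and each of the at most `m` iterations does work proportional to $dr$, matching the $O(mdr)$ count in the text.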

Notes

This article is based on Dr. Venkatesan Guruswami's course notes.^[4]

References

Sipser, M.; Spielman, D. A. (1996). "Expander codes". IEEE Transactions on Information Theory. 42 (6): 1710–1722. doi:10.1109/18.556667.
Capalbo, M.; Reingold, O.; Vadhan, S.; Wigderson, A. (2002). "Randomness conductors and constant-degree lossless expanders". STOC '02: Proceedings of the thirty-fourth annual ACM symposium on Theory of computing. ACM. pp. 659–668. doi:10.1145/509907.510003. ISBN 978-1-58113-495-7. S2CID 1918841. http://dl.acm.org/citation.cfm?id=510003
Spielman, D. (1996). "Linear-time encodable and decodable error-correcting codes". IEEE Transactions on Information Theory. 42 (6): 1723–31. doi:10.1109/18.556668. CiteSeerX 10.1.1.47.2736.
Guruswami, V. (15 November 2006). "Lecture 13: Expander Codes". CSE 533: Error-Correcting. University of Washington.
Guruswami, V. (March 2010). "Notes 8: Expander Codes and their decoding". Introduction to Coding Theory. Carnegie Mellon University.
Guruswami, V. (September 2004). "Guest column: error-correcting codes and expander graphs". ACM SIGACT News. 35 (3): 25–41. doi:10.1145/1027914.1027924. S2CID 17550280. (subscription required)
