Zemor's decoding algorithm

In coding theory, Zemor's algorithm, designed and developed by Gilles Zemor,^[1] is a recursive low-complexity approach to code construction. It is an improvement over the algorithm of Sipser and Spielman.

Zemor considered a typical class of Sipser–Spielman construction of expander codes, where the underlying graph is bipartite graph. Sipser and Spielman introduced a constructive family of asymptotically good linear-error codes together with a simple parallel algorithm that will always remove a constant fraction of errors. The article is based on Dr. Venkatesan Guruswami's course notes ^[2]

Code construction[edit]

Zemor's algorithm is based on a type of expander graphs called Tanner graph. The construction of code was first proposed by Tanner.^[3] The codes are based on double cover $d$ , regular expander $G$ , which is a bipartite graph. $G$ = $\left(V,E\right)$ , where $V$ is the set of vertices and $E$ is the set of edges and $V$ = $A$ $\cup$ $B$ and $A$ $\cap$ $B$ = $\emptyset$ , where $A$ and $B$ denotes sets of vertices. Let $n$ be the number of vertices in each group, i.e, $|A|=|B|=n$ . The edge set $E$ be of size $N$ = $nd$ and every edge in $E$ has one endpoint in both $A$ and $B$ . $E(v)$ denotes the set of edges containing $v$ .

Assume an ordering on $V$ , therefore ordering will be done on every edges of $E(v)$ for every $v\in V$ . Let finite field $\mathbb {F} =GF(2)$ , and for a word $x=(x_{e}),e\in E$ in $\mathbb {F} ^{N}$ , let the subword of the word will be indexed by $E(v)$ . Let that word be denoted by $(x)_{v}$ . The subset of vertices $A$ and $B$ induces every word $x\in \mathbb {F} ^{N}$ a partition into $n$ non-overlapping sub-words $\left(x\right)_{v}\in \mathbb {F} ^{d}$ , where $v$ ranges over the elements of $A$ . For constructing a code $C$ , consider a linear subcode $C_{o}$ , which is a $[d,r_{o}d,\delta ]$ code, where $q$ , the size of the alphabet is $2$ . For any vertex $v\in V$ , let $v(1),v(2),\ldots ,v(d)$ be some ordering of the $d$ vertices of $E$ adjacent to $v$ . In this code, each bit $x_{e}$ is linked with an edge $e$ of $E$ .

We can define the code $C$ to be the set of binary vectors $x=\left(x_{1},x_{2},\ldots ,x_{N}\right)$ of $\{0,1\}^{N}$ such that, for every vertex $v$ of $V$ , $\left(x_{v(1)},x_{v(2)},\ldots ,x_{v(d)}\right)$ is a code word of $C_{o}$ . In this case, we can consider a special case when every edge of $E$ is adjacent to exactly $2$ vertices of $V$ . It means that $V$ and $E$ make up, respectively, the vertex set and edge set of $d$ regular graph $G$ .

Let us call the code $C$ constructed in this way as $\left(G,C_{o}\right)$ code. For a given graph $G$ and a given code $C_{o}$ , there are several $\left(G,C_{o}\right)$ codes as there are different ways of ordering edges incident to a given vertex $v$ , i.e., $v(1),v(2),\ldots ,v(d)$ . In fact our code $C$ consist of all codewords such that $x_{v}\in C_{o}$ for all $v\in A,B$ . The code $C$ is linear $[N,K,D]$ in $\mathbb {F}$ as it is generated from a subcode $C_{o}$ , which is linear. The code $C$ is defined as $C=\{c\in \mathbb {F} ^{N}:(c)_{v}\in C_{o}\}$ for every $v\in V$ .

In this figure, $(x)_{v}=\left(x_{e1},x_{e2},x_{e3},x_{e4}\right)\in C_{o}$ . It shows the graph $G$ and code $C$ .

In matrix $G$ , let $\lambda$ is equal to the second largest eigenvalue of adjacency matrix of $G$ . Here the largest eigenvalue is $d$ . Two important claims are made:

Claim 1[edit]

$\left({\dfrac {K}{N}}\right)\geq 2r_{o}-1$
. Let $R$ be the rate of a linear code constructed from a bipartite graph whose digit nodes have degree $m$ and whose subcode nodes have degree $n$ . If a single linear code with parameters $\left(n,k\right)$ and rate $r=\left({\dfrac {k}{n}}\right)$ is associated with each of the subcode nodes, then $k\geq 1-\left(1-r\right)m$ .

Proof[edit]

Let $R$ be the rate of the linear code, which is equal to $K/N$ Let there are $S$ subcode nodes in the graph. If the degree of the subcode is $n$ , then the code must have $\left({\dfrac {n}{m}}\right)S$ digits, as each digit node is connected to $m$ of the $\left(n\right)S$ edges in the graph. Each subcode node contributes $(n-k)$ equations to parity check matrix for a total of $\left(n-k\right)S$ . These equations may not be linearly independent. Therefore, $\left({\dfrac {K}{N}}\right)\geq \left({\dfrac {({\dfrac {n}{m}})S-(n-k)S}{({\dfrac {n}{m}})S}}\right)$
$\geq 1-m\left({\dfrac {n-k}{n}}\right)$
$\geq 1-m\left(1-r\right)$ , Since the value of $m$ , i.e., the digit node of this bipartite graph is $2$ and here $r=r_{o}$ , we can write as:
$\left({\dfrac {K}{N}}\right)\geq 2r_{o}-1$

Claim 2[edit]

D\geq N\left({\dfrac {(\delta -({\dfrac {\lambda }{d}}))}{(1-({\dfrac {\lambda }{d}})}})\right)^{2}

=N\left(\delta ^{2}-O\left({\dfrac {\lambda }{d}}\right)\right)

\rightarrow (1)

If $S$ is linear code of rate $r$ , block code length $d$ , and minimum relative distance $\delta$ , and if $B$ is the edge vertex incidence graph of a $d$ – regular graph with second largest eigenvalue $\lambda$ , then the code $C(B,S)$ has rate at least $2r_{o}-1$ and minimum relative distance at least $\left(\left({\dfrac {\delta -\left({\dfrac {\lambda }{d}}\right)}{1-\left({\dfrac {\lambda }{d}}\right)}}\right)\right)^{2}$ .

Proof[edit]

Let $B$ be derived from the $d$ regular graph $G$ . So, the number of variables of $C(B,S)$ is $\left({\dfrac {dn}{2}}\right)$ and the number of constraints is $n$ . According to Alon - Chung,^[4] if $X$ is a subset of vertices of $G$ of size $\gamma n$ , then the number of edges contained in the subgraph is induced by $X$ in $G$ is at most $\left({\dfrac {dn}{2}}\right)\left(\gamma ^{2}+({\dfrac {\lambda }{d}})\gamma \left(1-\gamma \right)\right)$ .

As a result, any set of $\left({\dfrac {dn}{2}}\right)\left(\gamma ^{2}+\left({\dfrac {\lambda }{d}}\right)\gamma \left(1-\gamma \right)\right)$ variables will be having at least $\gamma n$ constraints as neighbours. So the average number of variables per constraint is : $\left({\dfrac {({\dfrac {2nd}{2}})\left(\gamma ^{2}+({\dfrac {\lambda }{d}})\gamma \left(1-\gamma \right)\right)}{\gamma n}}\right)$ $=d\left(\gamma +({\dfrac {\lambda }{d}})\left(1-\gamma \right)\right)$ $\rightarrow (2)$

So if $d\left(\gamma +({\dfrac {\lambda }{d}})\left(1-\gamma \right)\right)<\gamma d$ , then a word of relative weight $\left(\gamma ^{2}+({\dfrac {\lambda }{d}})\gamma \left(1-\gamma \right)\right)$ , cannot be a codeword of $C(B,S)$ . The inequality $(2)$ is satisfied for $\gamma <\left({\dfrac {1-({\dfrac {\lambda }{d}})}{\delta -({\dfrac {\lambda }{d}})}}\right)$ . Therefore, $C(B,S)$ cannot have a non zero codeword of relative weight $\left({\dfrac {\delta -({\dfrac {\lambda }{d}})}{1-({\dfrac {\lambda }{d}})}}\right)^{2}$ or less.

In matrix $G$ , we can assume that $\lambda /d$ is bounded away from $1$ . For those values of $d$ in which $d-1$ is odd prime, there are explicit constructions of sequences of $d$ - regular bipartite graphs with arbitrarily large number of vertices such that each graph $G$ in the sequence is a Ramanujan graph. It is called Ramanujan graph as it satisfies the inequality $\lambda (G)\leq 2{\sqrt {d-1}}$ . Certain expansion properties are visible in graph $G$ as the separation between the eigenvalues $d$ and $\lambda$ . If the graph $G$ is Ramanujan graph, then that expression $(1)$ will become $0$ eventually as $d$ becomes large.

Zemor's algorithm[edit]

The iterative decoding algorithm written below alternates between the vertices $A$ and $B$ in $G$ and corrects the codeword of $C_{o}$ in $A$ and then it switches to correct the codeword $C_{o}$ in $B$ . Here edges associated with a vertex on one side of a graph are not incident to other vertex on that side. In fact, it doesn't matter in which order, the set of nodes $A$ and $B$ are processed. The vertex processing can also be done in parallel.

The decoder $\mathbb {D} :\mathbb {F} ^{d}\rightarrow C_{o}$ stands for a decoder for $C_{o}$ that recovers correctly with any codewords with less than $\left({\dfrac {d}{2}}\right)$ errors.

Decoder algorithm[edit]

Received word : $w=(w_{e}),e\in E$
$z\leftarrow w$ For $t\leftarrow 1$ to $m$ do // $m$ is the number of iterations { if ( $t$ is odd) // Here the algorithm will alternate between its two vertex sets. $X\leftarrow A$ else $X\leftarrow B$ Iteration $t$ : For every $v\in X$ , let $(z)_{v}\leftarrow \mathbb {D} ((z)_{v})$ // Decoding $z_{v}$ to its nearest codeword. } Output: $z$

Explanation of the algorithm[edit]

Since $G$ is bipartite, the set $A$ of vertices induces the partition of the edge set $E$ = $\cup _{v\in A}E_{v}$ . The set $B$ induces another partition, $E$ = $\cup _{v\in B}E_{v}$ .

Let $w\in \{0,1\}^{N}$ be the received vector, and recall that $N=dn$ . The first iteration of the algorithm consists of applying the complete decoding for the code induced by $E_{v}$ for every $v\in A$ . This means that for replacing, for every $v\in A$ , the vector $\left(w_{v(1)},w_{v(2)},\ldots ,w_{v(d)}\right)$ by one of the closest codewords of $C_{o}$ . Since the subsets of edges $E_{v}$ are disjoint for $v\in A$ , the decoding of these $n$ subvectors of $w$ may be done in parallel.

The iteration will yield a new vector $z$ . The next iteration consists of applying the preceding procedure to $z$ but with $A$ replaced by $B$ . In other words, it consists of decoding all the subvectors induced by the vertices of $B$ . The coming iterations repeat those two steps alternately applying parallel decoding to the subvectors induced by the vertices of $A$ and to the subvectors induced by the vertices of $B$ .
Note: [If $d=n$ and $G$ is the complete bipartite graph, then $C$ is a product code of $C_{o}$ with itself and the above algorithm reduces to the natural hard iterative decoding of product codes].

Here, the number of iterations, $m$ is $\left({\dfrac {(\log {n})}{\log(2-\alpha )}}\right)$ . In general, the above algorithm can correct a code word whose Hamming weight is no more than $({\dfrac {1}{2}}).\alpha N\delta \left(({\dfrac {\delta }{2}})-({\dfrac {\lambda }{d}})\right)=\left(({\dfrac {1}{4}}).\alpha N(\delta ^{2}-O({\dfrac {\lambda }{d}})\right)$ for values of $\alpha <1$ . Here, the decoding algorithm is implemented as a circuit of size $O(N\log {N})$ and depth $O(\log {N})$ that returns the codeword given that error vector has weight less than $\alpha N\delta ^{2}(1-\epsilon )/4$ .

Theorem[edit]

If $G$ is a Ramanujan graph of sufficiently high degree, for any $\alpha <1$ , the decoding algorithm can correct $({\dfrac {\alpha \delta _{o}^{2}}{4}})(1-\in )N$ errors, in $O(\log {n})$ rounds ( where the big- $O$ notation hides a dependence on $\alpha$ ). This can be implemented in linear time on a single processor; on $n$ processors each round can be implemented in constant time.

Proof[edit]

Since the decoding algorithm is insensitive to the value of the edges and by linearity, we can assume that the transmitted codeword is the all zeros - vector. Let the received codeword be $w$ . The set of edges which has an incorrect value while decoding is considered. Here by incorrect value, we mean $1$ in any of the bits. Let $w=w^{0}$ be the initial value of the codeword, $w^{1},w^{2},\ldots ,w^{t}$ be the values after first, second . . . $t$ stages of decoding. Here, $X^{i}={e\in E|x_{e}^{i}=1}$ , and $S^{i}={v\in V^{i}|E_{v}\cap X^{i+1}!=\emptyset }$ . Here $S^{i}$ corresponds to those set of vertices that was not able to successfully decode their codeword in the $i^{th}$ round. From the above algorithm $S^{1}<S^{0}$ as number of unsuccessful vertices will be corrected in every iteration. We can prove that $S^{0}>S^{1}>S^{2}>\cdots$ is a decreasing sequence. In fact, $|S_{i+1}|<=({\dfrac {1}{2-\alpha }})|S_{i}|$ . As we are assuming, $\alpha <1$ , the above equation is in a geometric decreasing sequence. So, when $|S_{i}|<n$ , more than $log_{2-\alpha }n$ rounds are necessary. Furthermore, $\sum |S_{i}|=n\sum ({\dfrac {1}{(2-\alpha )^{i}}})=O(n)$ , and if we implement the $i^{th}$ round in $O(|S_{i}|)$ time, then the total sequential running time will be linear.

Drawbacks of Zemor's algorithm[edit]

It is lengthy process as the number of iterations $m$ in decoder algorithm takes is $[(\log {n})/(\log(2-\alpha ))]$
Zemor's decoding algorithm finds it difficult to decode erasures. A detailed way of how we can improve the algorithm is

given in.^[5]

References[edit]

^ "Gilles Zémor". www.math.u-bordeaux.fr. Retrieved 9 April 2023.
^ http://www.cs.washington.edu/education/courses/cse590vg/03wi/scribes/1-27.ps ^{[dead link]}
^ "Lecture notes" (PDF). washington.edu. Retrieved 9 April 2023.
^ N. Alon; F.R.K. Chung (December 1988). "Explicit construction of linear sized tolerant networks". Discrete Mathematics. 72 (1–3). CiteSeerX 10.1.1.300.7495. doi:10.1016/0012-365X(88)90189-6.
^ "Archived copy". Archived from the original on September 14, 2004. Retrieved May 1, 2012.{{cite web}}: CS1 maint: archived copy as title (link)

[vg1-1] "Gilles Zémor". www.math.u-bordeaux.fr. Retrieved 9 April 2023.

[vg2-2] ttp://www.cs.washington.edu/education/courses/cse590vg/03wi/scribes/1-27.ps ^{[dead link]}

[vg3-3] "Lecture notes" (PDF). washington.edu. Retrieved 9 April 2023.

[vg4-4] N. Alon; F.R.K. Chung (December 1988). "Explicit construction of linear sized tolerant networks". Discrete Mathematics. 72 (1–3). CiteSeerX 10.1.1.300.7495. doi:10.1016/0012-365X(88)90189-6.

[vg5-5] "Archived copy". Archived from the original on September 14, 2004. Retrieved May 1, 2012.{{cite web}}: CS1 maint: archived copy as title (link)

[1]

[2]

[3]

[4]

[5]

Code construction[edit]

Claim 1[edit]

Proof[edit]

Claim 2[edit]

Proof[edit]

Zemor's algorithm[edit]

Decoder algorithm[edit]

Explanation of the algorithm[edit]

Theorem[edit]

Proof[edit]

Drawbacks of Zemor's algorithm[edit]

See also[edit]

References[edit]