A VLSI Single Chip (255,223) Reed-Solomon Encoder With Interleaver

I. S. Hsu, L. J. Deutsch, and T. K. Truong
Communications Systems Research Section
I. S. Reed
Electrical Engineering Department
University of Southern California

This article presents a description of a single-chip implementation of a Reed-Solomon encoder with interleaving capability. The code used was adapted by the CCSDS (Consultative Committee on Space Data Systems). It forms the outer code of the NASA standard concatenated coding system which includes a convolutional inner code of rate 1/2 and constraint length 7. The architecture, leading to this single VLSI chip design, makes use of a bit-serial finite field multiplication algorithm due to E. R. Berlekamp.

I. Introduction

A concatenated coding system consisting of a convolutional inner code and a Reed-Solomon outer code has been adopted as a guideline for downlink telemetry for future space missions by CCSDS (Consultative Committee for Space Data Systems) [1]. The participants in this committee include the European Space Agency (ESA) and NASA as well as space agencies from many other nations.

The convolutional inner code is the same (7, 1/2) code used by NASA's Voyager project. The outer Reed-Solomon code is a (255, 223) block code on 8-bit symbols and it is capable of correcting up to 16 symbol errors. The performance of such a scheme is investigated in [2]. It is shown that this concatenated channel provides a coding gain of almost 2 dB over the convolutional-only channel at a decoded bit error rate of 10^-5.

One of the benefits of concatenated coding, and one of the main motivations for its acceptance as a standard system, is that it provides for nearly error free communication links at fairly low signal power levels. This means that source data compression techniques [3] can be used to help increase channel throughput without a substantial change in overall error rate. Data compression algorithms, while promising to remove substantial information redundancy are very sensitive to transmission errors. An end-to-end study of a system using concatenated coding with data compression can be found in [4].

The interleaving buffers are required because the inner decoder errors tend to occur in bursts, which occasionally are as long as several constraint lengths. The outer decoder remains undisturbed by burst errors which occur within a given 8-bit symbol (which corresponds to about one con-
II. Review of Interleaving Algorithm

The basic block diagram of a concatenated coding system is shown in Fig. 1. The inner encoder-decoder is a convolutional encoder, while the receiver utilizes Viterbi (maximum likelihood) decoder. The outer code is a high rate block code.

It is demonstrated [2] that this concatenated channel provides more than 2 dB of coding gain over the convolutional-only channel. However, the performance of the recommended Reed-Solomon coding scheme in the concatenated coding system can only be achieved when the bursts of errors appearing at the Viterbi decoder output are dispersed in such a manner that the RS symbols at the outer decoder input are randomized sufficiently. This randomization can be achieved by interleaving [15].

In the (255, 223) RS code, each code word contains 255 symbols, and each symbol contains 8 bits. These symbols are grouped into frames. The frames, together, form a block or format. Figure 2 shows an interleaving scheme for the RS outer coding scheme. The numbers used in this figure are for illustration purposes only.

To achieve interleaving, the information and parity symbols are transmitted column-by-column from left-to-right and each column from top-to-bottom. As a result, a bursty error event of the Viterbi decoder will tend to affect only one symbol of each code word provided the number of rows in each interleaving array is sufficiently high. This number is called the interleaving depth.

The standard RS interleaving depths used for deep-space probes are less than or equal to five [1]. As a consequence, a chip was designed which is capable of being programmed. The interleaving depth can be changed from one to five at the user's convenience.

III. A Single VLSI Chip of a (255,223) RS Encoder With Interleaver

Berlekamp's bit-serial multiplication algorithm for a (255, 223) RS-encoder over GF (2⁸) is presented in [7] and [8] (also see M. Perlman and J. J. Lee "Reed-Solomon Encoders: Conventional Versus Berlekamp's Architecture," JPL Internal Document, No. 3610-81-119ISMP, dated July, 1981). The block diagram of the (255, 223) RS encoder with programmable interleaver is exhibited here in Fig. 3.

In Fig. 3, one observes that the circuit is divided into five units: the Product unit, Remainder and Interleaver unit, Quotient unit, I/O unit, and Control Unit. The use of each unit is explained in detail in [8]. However, the Remainder and Interleaver unit is different from that in [8]. This is due to the fact that the interleaver is included in this design, while in [8], no such capability was available.

The function of the Remainder and Interleaver unit is discussed in detail in what follows. Figure 4 shows the block diagram of the Remainder and Interleaver unit. The Remainder and Interleaver unit is used to store the coefficients of the remainder during the division process. This unit also performs the interleaving operation.
In Fig. 4, $S_{i,j}$'s for $0 \leq i \leq 30, 1 \leq j \leq 5$ are 8-bit shift registers. The addition used in the circuit is a modulo 2 addition or Exclusive-OR operation.

The “Turn” and “No-turn” signals are used to program the interleaving depth and the “No-turn” signals are the logical complement of the corresponding “Turn” signals. For example, if “Turn 2” equals 1 and all other “Turn” signals equal 0, then obviously signal “No-turn 2” will equal 0 and all other “No-turn” signals will equal 1. These will turn transistors $T_2$, $T_3$, $T_6$, $T_4$, $T_9$ and transistors in the corresponding positions in Fig. 2 on. Also, transistors $T_1$, $T_{10}$, $T_{11}$, $T_{12}$ and their corresponding transistors in Fig. 2 will be turned off. Therefore, data can be transferred from registers $S_{i,1}$'s to $S_{i,2}$'s for $0 \leq i \leq 30$.

The outputs of registers $S_{i,2}$'s, for $0 \leq i \leq 30$, are sent to the inputs of modulo 2 adders by bus lines $L_{i}$'s, for $1 \leq i \leq 31$. Since transistor $T_4$ and transistors in its corresponding positions in Fig. 4 are off, outputs of $S_{i,2}$'s can not be sent to $S_{i,3}$'s, for $0 \leq i \leq 30$, which means only a depth of 2 interleaving operation is allowed in this manner. The depths of 3, 4 or 5 of interleaving can be carried out similarly.

An overall block diagram of the chip is shown in Fig. 5. In Fig. 5, $V_{DD}$ and GND are power pins. The signals $\phi_1$, and $\phi_2$ are the two phases of a system clock. The information symbols are fed into the chip serially through the data-in pin, DIN. Similarly, the encoded code word is transmitted out of the chip sequentially from the data-out pin, DOUT.

The control signal, SL, is set to 1 (logic 1) when the information symbols are loaded into the chip. After this, SL is set to 0. The control signal “START” resets a 3-bit word counter in the chip before the encoding process begins. The “Turn” signals are used to program the interleaving depth.

The entire chip was simulated on a general purpose computer using ESIM (a logic level simulation program) [9] and SPICE (a transistor level circuit simulation program) [10]. The layout of this chip was accomplished using the program CAESAR [11]. LYRA [12] was used to check the resulting layout against a set of geometric rules supplied by the fabricator. Timing simulation was accomplished using CRYSTAL [13].

This circuit comprises about 10,000 transistors. It was fabricated using the MOSIS service [13]. The technology used is 4 $\mu$m NMOS. The layout of this encoder is shown in Fig. 6. The chip was successfully tested and the maximum clock frequency is about 1.5 MHz. The operating speed of this chip is limited mainly by the long clock lines associated with the two-phase clocks used in the dynamic shift registers and there are more than 1250 dynamic registers used on this chip. The chip area is about 4800 $\times$ 3220 $\mu$m$^2$. The power consumption at the maximum clock frequency is about 300 mW.
References


Fig. 1. A concatenated coding system

<table>
<thead>
<tr>
<th>INFORMATION SYMBOLS</th>
<th>PARITY SYMBOLS</th>
</tr>
</thead>
<tbody>
<tr>
<td>CODE WORD NO. 1</td>
<td>1, 2, ..., 223</td>
</tr>
<tr>
<td>CODE WORD NO. 2</td>
<td>224, 225, ..., 446</td>
</tr>
<tr>
<td></td>
<td></td>
</tr>
<tr>
<td>CODE WORD NO. 5</td>
<td>893, 894, ..., 1115</td>
</tr>
</tbody>
</table>

Fig. 2. An interleaving scheme for the RS outer coding scheme

Fig. 3. Block diagram of the (255,223) RS encoder
Fig. 4. The block diagram of the remainder and interleaver unit.
Fig. 5. The symbolic diagram of a (255,223) RS encoder chip

Fig. 6. Layout of the (255,223) RS encoder with interleaver chip