Error Coding for Engineers 9781461355892, 9781461515098, 1461355893, 1461515092

Error Coding for Engineers provides a useful tool for practicing engineers, students, and researchers, focusing on the ap


English Pages 246 [248] Year 2012;2001


ERROR CODING FOR ENGINEERS

THE KLUWER INTERNATIONAL SERIES IN ENGINEERING AND COMPUTER SCIENCE


A. Houghton Synectic Systems, Ltd., United Kingdom

SPRINGER SCIENCE+BUSINESS MEDIA, LLC

Library of Congress Cataloging-in-Publication Data Houghton, A. Error coding for engineers / A. Houghton. p. cm. - (The Kluwer international series in engineering and computer science; SECS 641)

Includes bibliographical references and index. ISBN 978-1-4613-5589-2 ISBN 978-1-4615-1509-8 (eBook) DOI 10.1007/978-1-4615-1509-8 1. Signal processing. 2. Error-correcting codes (Information theory) I. Title. II. Series. TK5102.9 .H69 2001 005.7'2--dc21 2001038560

Copyright © 2001 by Springer Science+Business Media New York. Originally published by Kluwer Academic Publishers in 2001. Softcover reprint of the hardcover 1st edition 2001. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher. Printed on acid-free paper.

Table of Contents

Preface

1. Introduction
   1.1 Messages Need Space
   1.2 The Hamming Bound
   1.3 The Gilbert Bound
   1.4 Where Do Errors Come From?
   1.5 A Brief History of Error Coding
   1.6 Summary

2. A Little Maths
   2.1 Polynomials and Finite Fields
   2.2 Manipulating Field Elements
   2.3 Summary

3. Error Detection
   3.1 The Horizontal and Vertical Parity Check
   3.2 The Cyclic Redundancy Check
   3.3 Longhand Calculation of the CRC
   3.4 Performance
   3.5 Hardware Implementation
   3.6 Table-Based Calculation of the CRC
   3.7 Discussion

4. Error Correction by Parity
   4.1 Correcting a Single Bit
   4.2 Extending the Message Size

5. Error Correction Using the CRC
   5.1 A Hardware Error Locator
   5.2 Multiple Bit Errors
   5.3 Expurgated Codes
   5.4 The Perfect Golay Code
   5.5 Fire Codes

6. Reed-Muller Codes
   6.1 Constructing a Generator Matrix For RM Codes
   6.2 Encoding with the Hadamard Matrix
   6.3 Discussion

7. Reed-Solomon Codes
   7.1 Introduction to the Time Domain
   7.2 Calculation of the Check Symbols for One Error
   7.3 Correcting Two Symbols
   7.4 Error Correction in the Frequency Domain
   7.5 Recursive Extension
   7.6 The Berlekamp-Massey Algorithm
   7.7 The Forney Algorithm
   7.8 Mixed-Domain Error Coding
   7.9 Higher Dimensional Data Structures
   7.10 Discussion

8. Augmenting Error Coding
   8.1 Erasure
   8.2 Punctured Codes
   8.3 Interleaving
   8.4 Error Forecasting
   8.5 Discussion

9. Convolutional Coding
   9.1 Error Correction with Convolutional Codes
   9.2 Finding the Correct Path
   9.3 Decoding with Soft Decisions
   9.4 Performance of Convolutional Codes
   9.5 Concatenated Codes
   9.6 Iterative Decoding
   9.7 Turbo Coding
   9.8 Discussion

10. Hardware
   10.1 Reducing Elements
   10.2 Multiplication
   10.3 Division
   10.4 Logs and Exponentials
   10.5 Reciprocal
   10.6 Discussion

11. Bit Error Rates
   11.1 The Gaussian Normal Function
   11.2 Estimating the Bit Error Rate
   11.3 Applications
   11.4 Discussion

12. Exercises
   12.1 Parity Exercises
   12.2 CRC Exercises
   12.3 Finite Field Algebra
   12.4 Convolutional Codes
   12.5 Other Codes
   12.6 Solutions to Parity Exercises
   12.7 Solutions to CRC Exercises
   12.8 Solutions to Finite Field Algebra
   12.9 Solutions to Convolutional Coding
   12.10 Solutions to Other Codes
   12.11 Closing Remarks
   12.12 Bibliography

Appendix A Primitive Polynomials
Appendix B The Golay Code
Appendix C Solving for Two Errors
Appendix D Solving some Key Equations
Appendix F Software Library

Index

Preface

Error coding is the art of encoding messages so that errors can be detected and, if appropriate, corrected after transmission or storage. A full appreciation of error coding requires a good background in whole number maths, symbols and abstract ideas. However, the mechanics of error coding are often quite easy to grasp with the assistance of a few good examples. This book is about the mechanics. In other words, if you're interested in implementing error coding schemes but have no idea what a finite field is, then this book may help you. The material covered starts with simple coding schemes which are often rather inefficient and, as certain concepts are established, progresses to more sophisticated techniques.

Error coding is a bit like the suspension in a car. Mostly it's unseen, but the thing would be virtually useless without it. In truth, error coding underpins nearly all modern digital systems and without it, they simply couldn't work. Probably like some car suspensions, the elegance of error coding schemes is often amazing. While it's a bit early to talk about things like efficiency, two schemes of similar efficiency might yield vastly different performance and, in this way, error coding is rather 'holistic'; it can't be treated properly outside of the context for which it is required. Well, more of this later.

For now, if you're interested in whole number maths (which is actually really good fun), have a problem which requires error coding, or are an undergraduate studying DSP or communications, but you don't have a formal background in maths, then this could be the book for you. Principally, you will require a working knowledge of binary; the rest can be picked up along the way.

Chapter 1

INTRODUCTION

The chapter deals with an overview of error coding, not especially in terms of what's been achieved over the years or what is the current state of the subject, but in pragmatic terms of what's underneath it that makes it work. The maths which surrounds error coding serves two key purposes: one, it shows us what can theoretically be achieved by error coding and what we might reasonably have to do to get there and, two, it provides mechanisms for implementing practical coding schemes. Both of these aspects are, without doubt, daunting for anyone whose principal background is not maths yet, for the engineer who needs to implement error coding, many of the practical aspects of error coding are tractable.

1.1 Messages Need Space

Space is at the heart of all error coding so it seems like a good place to start. If I transmit an 8-bit number to you by some means, outside of any context, you can have absolutely no idea whether or not what you receive is what I sent. At best, if you were very clever and knew something about the quality of the channel which I used to send the byte (an 8-bit number) to you, you could work out how likely it is that the number is correct. In fact, quite a lot of low-speed communication works on the basis that the channel is good enough that most of the time there are no errors. The reason that you cannot be certain that what you have received is what I sent is simply that there is no space between possible messages. An 8-bit number can have any value between and including 0 and 255. I could have sent any one of these 256 numbers so whatever number you receive might have been what I sent.

You've probably heard of the parity bit in RS232 communications. If not, it works like this. When you transmit a byte, you count up the number of 1s in it (its weight) and add a ninth bit to it such that the total number of 1s is even (or odd; it doesn't matter which so long as both ends of the channel agree). So we're now transmitting 8 bits of data or information using 9 bits of bandwidth. The recipient receives a 9-bit number which, of course, can have 512 values (or $2^n$ where n is the number of bits). However, the recipient also knows that only 256 of the 512 values will ever be transmitted because there are only eight data bits. If the recipient gets a message which is one of the 256 messages that will never be transmitted, then they know that an error has occurred. In this particular case, the message would have suffered a parity error.

In essence, what the parity bit does is to introduce some space between messages. If a single bit changes in the original 8-bit message, it will look like another 8-bit message. If a single bit changes in the 9-bit, parity-encoded message, it will look like an invalid or non-existent message. To change the parity-encoded message into another valid message will require at least 2 bits to be flipped. Hopefully this is really obvious and you're wondering why I mentioned it. However space, in one form or another, is how all error coding works. The key to error coding is finding efficient ways of creating this space in messages and, in the event of errors, working out how to correct them.

Space between messages is measured in terms of Hamming distance (d). The Hamming distance between two messages is a count of the minimum number of bits that must be changed to convert one message into the other. Error codes have a special measure called $d_{min}$. $d_{min}$ is the minimum number of bits that must be changed to make one valid message look like another valid message. This, ultimately, determines the ability of the code to detect or correct errors. Every bit-error that occurs increases the Hamming distance between the original and corrupted messages by one. If t errors are to be detected then

$$d_{min} > t. \tag{1.1}$$
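The parity scheme and the distance measure described above can be tried out directly. The following is a minimal Python sketch, not taken from the book: the function names and the example byte are illustrative choices, and even parity is assumed.

```python
def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two integers differ."""
    return bin(a ^ b).count("1")


def add_even_parity(byte: int) -> int:
    """Append a ninth bit so that the total number of 1s is even."""
    parity = bin(byte).count("1") % 2
    return (byte << 1) | parity


def parity_ok(word: int) -> bool:
    """A 9-bit word is a valid message when its weight is even."""
    return bin(word).count("1") % 2 == 0


# All 256 valid 9-bit messages, and the smallest distance between any two.
valid = [add_even_parity(b) for b in range(256)]
d_min = min(
    hamming_distance(x, y)
    for i, x in enumerate(valid)
    for y in valid[i + 1:]
)

sent = add_even_parity(0b10110010)   # example byte (arbitrary choice)
one_flip = sent ^ 0b000010000        # a single-bit error
two_flips = sent ^ 0b000010010       # two bit errors

print("d_min =", d_min)
print("one error detected:", not parity_ok(one_flip))
print("two errors detected:", not parity_ok(two_flips))
```

With two bits flipped the word lands on another valid message, so the error goes unnoticed, which is what a minimum distance of two implies.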

In the case of the simple 8-bit message with no parity, $d_{min} = 1$, so t = 0. However, with the addition of a parity bit, $d_{min}$ increases to two, so t = 1, i.e. we can reliably detect a single-bit error. Should we want to be able to correct messages, then (1.1) is modified to (1.2), below.

$$d_{min} > 2t. \tag{1.2}$$

Clearly parity, as it has been presented above, cannot correct errors. So where does (1.2) come from? Error correction works by finding the nearest valid message (in terms of d) from the corrupted message. Using the simple case of parity checking, $d_{min} = 2$. A single-bit error will change a valid message into an invalid message which has a d of 1 from it. However, because $d_{min}$ is only two, there will be other messages which are also only one bit away from the invalid message. In fact, any one of the nine bits could be changed in the invalid message to create a valid message. The problem is that there are several messages equally close to the corrupted message and we have no means to choose between them. If $d_{min}$ was three, however (i.e. > 2t), the original message would, as before, be one bit away from the invalid message, but no other valid message could be nearer than two bits away. In this case, there is a nearest candidate. A good way to visualize this is to draw $2^n$ circles on a piece of paper (where there are n message bits) and fill in $2^k$ of them (where k bits are data or information) as evenly spread over the page as possible. The circles represent all possible messages, while those that are filled in are the subset of valid messages. Figure 1.1 is my attempt at this.

[Figure: $2^n$ message points (circles), of which $2^k$ valid messages are shown as filled circles.]

Figure 1.1 Valid data points within the total message space.


Starting at a valid message, introduce an error by moving away to the nearest circle. Each time you add an error bit, move further away from the starting message. Eventually, the correct (starting) message will no longer be the nearest filled-in circle. When this happens, the capacity of the code has been exceeded and an attempt at correction will make matters worse. This illustration is actually the projection of an n-dimensional problem onto a 2-dimensional space (your paper) and so is rather approximate. It nonetheless describes the principle quite well.
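The nearest-valid-message idea from this section can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than anything from the book: the toy code (the two 5-bit messages 00000 and 11111, so $d_{min} = 5$ and t = 2) and the function names are my own choices.

```python
def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two integers differ."""
    return bin(a ^ b).count("1")


def nearest_valid(received: int, valid_messages) -> list:
    """Return every valid message at the minimum distance from `received`."""
    best = min(hamming_distance(received, v) for v in valid_messages)
    return [v for v in valid_messages if hamming_distance(received, v) == best]


# Toy code (an assumption for illustration): d_min = 5, so t = 2.
toy_code = [0b00000, 0b11111]

# Start from 00000 and move one bit further away each time.
after_one = nearest_valid(0b00100, toy_code)     # still decodes to 00000
after_two = nearest_valid(0b00110, toy_code)     # still decodes to 00000
after_three = nearest_valid(0b01110, toy_code)   # now decodes to 11111

# 9-bit even parity (d_min = 2): a single error leaves many equally
# close candidates, so correction is impossible.
parity_code = [w for w in range(512) if bin(w).count("1") % 2 == 0]
corrupted = 0b101100100 ^ 0b000010000
candidates = nearest_valid(corrupted, parity_code)

print(after_one, after_two, after_three, len(candidates))
```

Once the third error is added the decoder confidently picks the wrong message, and in the parity case every one of the nine single-bit changes gives an equally plausible candidate.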

1.2 The Hamming Bound

The use of n and k, above, leads to the idea of (n, k) or (n, k, t) codes, another way in which error codes are often defined. n is the total number of message bits, so there are $2^n$ possible messages for the code, while k defines how many of those bits are data or information bits. So out of the $2^n$ messages, $2^k$ will be valid. The job of the error code is to spread the $2^k$ valid messages evenly across the $2^n$ message space, maximizing $d_{min}$. A question you might reasonably ask is, is there a relationship between t, n and k? Another way of putting this might be to wonder how many bits of redundancy (n - k) must be added to a message to create a t-error correcting code. There is a pragmatic and a mathematical way of considering this question. It's useful to start at the mathematical end since, if you can acquire a broad understanding of what's involved in error coding right at the start, it all makes more sense later on.

Here goes then: n, k and t are a little intertwined and it's easiest to choose three values and see if they can be made to satisfy certain constraints. For an example, consider the possibility of a (7, 4, 1) code. Could such a thing exist? There are 16 valid messages ($2^4$) in a pool of 128 possible messages ($2^7$) and we want to correct 1-bit errors. Each valid message must be surrounded by a sea of invalid messages for which d ≤ t. So in this example, each of the 16 messages must have seven surrounding messages of d = 1 that are unique to them to permit correction. Each valid message requires 1 (itself) + 7 (d = 1 surrounding) messages. We need, therefore, 8 messages per valid message and there are 16 valid messages giving 8 × 16 = 128 messages in total. In this case there are exactly 128 available so, in principle at least, this code could exist. This limit is called the Hamming bound and codes (of which there are a few) which satisfy exactly this bound are called perfect codes. For most codes $2^n$ is somewhat greater than the Hamming bound. In general terms, an n-bit message will have

$$\frac{n!}{d!\,(n-d)!} \tag{1.3}$$

messages that are d bits distance from it. This is simply the number of different ways that you can change d bits in n. So, for a t-error correcting code, the Hamming bound is found from

$$2^k \sum_{d=0}^{t} \frac{n!}{d!\,(n-d)!} \le 2^n$$

which simplifies to

$$\sum_{d=0}^{t} \frac{n!}{d!\,(n-d)!} \le 2^{n-k}. \tag{1.4}$$

If you can visualize n-dimensional space, then each message represents a point in this space. Around each point, there is a surface of other points with radius 1 bit. Around that surface there is another of radius 2 bits and so forth up to n bits. (1.3) tells us how many points exist in each of these surfaces. Some codes which satisfy exactly the Hamming bound include (3, 1, 1), (5, 1, 2), (7, 4, 1), (7, 1, 3), (9, 1, 4), (11, 1, 5), (13, 1, 6), (15, 11, 1), (15, 1, 7), (17, 1, 8), (23, 12, 3), (90, 78, 2). Most of these contain only one data bit so the two valid messages ($2^1$) might be all zeros and all ones. Codes with the form ($2^m - 1$, $2^m - m - 1$, 1) such as (15, 11, 1) are known as Hamming codes and we'll see how these are constructed later. The code (23, 12, 3) is called the Golay code and, according to Golay, there is no solution to the code (90, 78, 2).

In some ways perfect codes are the most efficient codes since there are absolutely no wasted points in the $2^n$ message space. However, this has its drawbacks. With perfect codes all received messages (whether valid or in error) will decode to a solution. There are no invalid messages more than t bits away from a valid message. This means that if more than t bit-errors occur, there is no way of telling. The Golay code, which will correct up to three bit-errors, is usually extended to a (24, 12, 3) code by adding an overall parity bit. While it can still only correct three errors it can now detect the presence of four.
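The counting argument behind the Hamming bound is easy to check numerically. The sketch below is an illustration only: the function names are my own, and it tests parameter triples against the inequality, which says nothing about whether a code with those parameters can actually be constructed.

```python
from math import comb


def sphere_size(n: int, t: int) -> int:
    """Number of n-bit messages within distance t of a given message."""
    return sum(comb(n, d) for d in range(t + 1))


def satisfies_hamming_bound(n: int, k: int, t: int) -> bool:
    """True when the redundancy n - k leaves room for 2^k disjoint spheres."""
    return sphere_size(n, t) <= 2 ** (n - k)


def meets_bound_exactly(n: int, k: int, t: int) -> bool:
    """True when the spheres fill the message space with nothing left over."""
    return sphere_size(n, t) == 2 ** (n - k)


# The parameter triples listed in the text.
listed = [
    (3, 1, 1), (5, 1, 2), (7, 4, 1), (7, 1, 3), (9, 1, 4), (11, 1, 5),
    (13, 1, 6), (15, 11, 1), (15, 1, 7), (17, 1, 8), (23, 12, 3), (90, 78, 2),
]

for n, k, t in listed:
    print((n, k, t), sphere_size(n, t), 2 ** (n - k), meets_bound_exactly(n, k, t))

# The extended Golay parameters satisfy the bound with room to spare.
print((24, 12, 3), satisfies_hamming_bound(24, 12, 3), meets_bound_exactly(24, 12, 3))
```

For (7, 4, 1) the sphere size is 1 + 7 = 8, matching the 8 messages per valid message worked out in the text.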


1.3 The Gilbert Bound

The Hamming bound is one extreme end of the (n, k, t) relationship, packing the most capability into the least redundancy. At the other end is something called the Gilbert bound. This bound gives the smallest size of the message space ($2^n$) that absolutely guarantees that there will be a t-error correcting code for k data bits. It's trivial to construct working codes at this end of the (n, k, t) relationship although that doesn't mean they're any good. The Gilbert bound works on the following argument. If we want a t-error correcting code then we choose at random from the $2^n$ message space a valid code. The message and all messages within 2t bits distance of it are deleted. The next message is chosen from the remaining pool and all messages within 2t of it are deleted, and so forth. Each valid message thus requires

$$\sum_{d=0}^{2t} \frac{n!}{d!\,(n-d)!}$$

messages out of the total pool. If there are k data bits then a total of

$$2^k \times \sum_{d=0}^{2t} \frac{n!}{d!\,(n-d)!}$$

messages are needed so the Gilbert bound is generated from (1.5)

$$\sum_{d=0}^{2t} \frac{n!}{d!\,(n-d)!} \le 2^{n-k}. \tag{1.5}$$
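The select-and-delete argument above translates almost directly into code. The following Python sketch is illustrative only: the function names, the fixed random seed and the choice of k = 4, t = 1 are assumptions made for the example, and the exhaustive search over $2^n$ messages is practical only for small n.

```python
import random
from math import comb


def gilbert_guarantees(n: int, k: int, t: int) -> bool:
    """True when (1.5) holds, so a t-error correcting code must exist."""
    return sum(comb(n, d) for d in range(2 * t + 1)) <= 2 ** (n - k)


def smallest_guaranteed_n(k: int, t: int) -> int:
    """Smallest message length n for which (1.5) is satisfied."""
    n = k
    while not gilbert_guarantees(n, k, t):
        n += 1
    return n


def greedy_code(n: int, t: int, seed: int = 0) -> list:
    """Pick a message at random, delete everything within 2t of it, repeat."""
    rng = random.Random(seed)
    pool = set(range(2 ** n))
    code = []
    while pool:
        word = rng.choice(sorted(pool))
        code.append(word)
        # Keep only messages more than 2t bits away from the chosen word.
        pool = {w for w in pool if bin(w ^ word).count("1") > 2 * t}
    return code


n = smallest_guaranteed_n(k=4, t=1)
code = greedy_code(n, t=1)
d_min = min(
    bin(a ^ b).count("1") for i, a in enumerate(code) for b in code[i + 1:]
)
print("n =", n, "valid messages found:", len(code), "d_min =", d_min)
```

For four data bits and single-error correction the bound asks for n = 10, compared with the n = 7 that the Hamming bound allows, which illustrates how far apart the two ends of the (n, k, t) relationship can be.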

Most error codes sit somewhere between these bounds which means that often, even though the capacity to correct errors may have been exceeded, the code can still detect unrecoverable errors. In this instance the received error message will be more than t from any valid code so no correction will be attempted.

So what about the pragmatic approach? During decoding the extra bits added to a message for error coding give rise to a syndrome, typically of the same size as the redundancy. The syndrome is simply a number which is usually 0 if no errors have occurred. If t errors are to be corrected in a message of n bits then the syndrome must be able to identify uniquely the t bit-error locations. So in n bits, there will be

$$\sum_{d=0}^{t} \frac{n!}{d!\,(n-d)!}$$

possible error combinations (including the no-error case). The syndrome must have sufficient size to be able to identify all of these error combinations. If the syndrome is the same size as the redundancy (n - k), then to satisfy this requirement gives

$$\sum_{d=0}^{t} \frac{n!}{d!\,(n-d)!} \le 2^{n-k}$$

which is, of course, the Hamming bound. Another way of viewing this is to consider the special case of messages where $n = 2^i - 1$ (i is some integer). At least i bits will be needed to specify the location of one bit in the message. For example, if the message is 127 bits then a 7-bit number is required to point to any bit location in the message (plus the all-zero/no error result). So for t errors, at least t × i check bits will be needed. We can compare this result with the (15, 11, 1) code. To find one error in 15 bits needs 1 × 4 = 4 check bits which there are. If you try this approach with the Golay code, you find that 15 check bits should be needed as opposed to the 11 that there actually are. See if you can work out why this is and repeat the calculation for the (90, 78, 2) code.

A few more basic measures can be derived from (n, k, t) codes. Redundancy is the name given to the extra bits (r) added to a message to create space in it. The amount of redundancy added gives rise to a measure called the coding rate R which is the ratio of the number of data bits k, divided by the total bits k + r (= n). In percent, R gives the code efficiency.
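As a small worked illustration of the measures just defined, the sketch below computes redundancy, coding rate and efficiency for a few of the codes mentioned so far. The function names are my own and the list of codes is just a convenient selection.

```python
def redundancy(n: int, k: int) -> int:
    """Extra bits r added to the k data bits."""
    return n - k


def coding_rate(n: int, k: int) -> float:
    """Coding rate R = k / (k + r) = k / n."""
    return k / n


def efficiency_percent(n: int, k: int) -> float:
    """Code efficiency: the coding rate expressed in percent."""
    return 100.0 * coding_rate(n, k)


# (n, k) for byte-plus-parity, two Hamming codes and the Golay code.
for n, k in [(9, 8), (7, 4), (15, 11), (23, 12)]:
    print(f"({n}, {k}): r = {redundancy(n, k)}, "
          f"R = {coding_rate(n, k):.3f}, "
          f"efficiency = {efficiency_percent(n, k):.1f}%")
```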

1.4 Where Do Errors Come From?

Errors arise from a number of sources and the principal source which is anticipated should, at least in part, determine the code that is used. There are two main types of error which are caused by very different physical processes. First there are random errors. Any communications medium (apart perhaps from superconductors which, as yet, are not ubiquitous) is subject to noise processes. These may be internal thermal and partition noise sources as electrons jostle around each other, or they may be the cumulative actions of external electromagnetic interference being picked up en route. This kind of noise gives rise to the background hiss in your hi-fi system or the fuzzy haze on the TV screen in areas with poor reception. Noise like this typically has what is called a Gaussian PDF (probability density function). So travelling down the communications cable is the data (1s and 0s) plus noise. When the data (plus noise) reaches its destination, a receiver circuit has to decide whether the data is 1 or 0 (crudely speaking). The noise content adds a little uncertainty into this process and leads ultimately to the addition of random errors.

The great thing about random errors is that they can be precisely modelled. You can predict exactly how likely an error is, how many will occur each hour and so forth. Unfortunately, you can never predict which bit it'll be. Taking a very simple example, suppose that a data 1 is represented by +1 volt and a data 0 by -1 volt on a piece of wire. The receiver will decide on the current state of the signal by comparing it with 0 volts. Figure 1.2 illustrates the example.

Figure 1.2 Gaussian PDF.

The curves about the ±1 volt markers are the Gaussian function (also called the normal function) which has the form in (1.6), below.

$$\frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\bar{x})^2}{2\sigma^2}} \tag{1.6}$$

$\sigma$ is the standard deviation of the function, while $\bar{x}$ is the mean. The curve describes, in terms of probability, what the received signal will be. The total area under the curve is 1 so the received signal must have some value. In this example, we could calculate how likely a 0 would be of being misinterpreted as a 1 by measuring the area under the solid curve that falls to the right of the 0 volt decision threshold. Suppose it was 0.001, or 1/1000. This means that one in one thousand 0s will be read as a 1. The same would be true for 1s being misread as 0s. Since every bit sent is either a 0 or a 1, one in one thousand bits received will be in error overall, giving a bit error rate (or BER) of 0.001. The BER can be reduced by increasing the distance between the signals. Because of the shape of the PDF curve, doubling the 0/1 voltages to ±2 volts would give a dramatic error reduction.

Channels are often characterised graphically by showing BER on a logarithmic axis against a quantity called $E_b/N_0$. This is a normalized measure of the amount of energy used to signal each bit. You can think of it like the separation of the 0s and 1s in Figure 1.2, while the noise processes which impinge themselves on the signal control the width (or $\sigma$) of the Gaussian PDF. The greater $E_b/N_0$, the smaller the BER. $E_b/N_0$ can be thought of in a very similar way to SNR or signal to noise ratio. Figure 1.3 shows an example of bit error rate against signal strength.

[Figure: BER on a logarithmic vertical axis against $E_b/N_0$ (dB) on the horizontal axis, with regions labelled "Coding loss" and "Coding gain".]

Figure 1.3 BER against $E_b/N_0$.
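The tail-area calculation described for Figure 1.2 can be reproduced numerically. This is a minimal sketch under stated assumptions: the noise is taken to be exactly Gaussian, 0s and 1s are assumed equally likely, and the function names and the bisection search used to find a $\sigma$ that gives the 0.001 figure are my own additions.

```python
from math import erfc, sqrt


def gaussian_tail(x: float) -> float:
    """Area under a unit Gaussian to the right of x standard deviations."""
    return 0.5 * erfc(x / sqrt(2))


def bit_error_rate(amplitude: float, sigma: float) -> float:
    """Chance that noise pushes a level of -amplitude (or +amplitude)
    across the 0 volt threshold."""
    return gaussian_tail(amplitude / sigma)


def sigma_for_ber(amplitude: float, target: float) -> float:
    """Find the noise standard deviation giving the target error rate."""
    lo, hi = 1e-3, 10.0          # search range for sigma, in volts
    for _ in range(100):         # bisection: BER grows with sigma
        mid = (lo + hi) / 2
        if bit_error_rate(amplitude, mid) > target:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2


sigma = sigma_for_ber(1.0, 0.001)          # noise level for BER = 0.001 at +/-1 V
ber_1v = bit_error_rate(1.0, sigma)
ber_2v = bit_error_rate(2.0, sigma)        # same noise, signal levels doubled

print(f"sigma = {sigma:.4f} V")
print(f"BER at +/-1 V: {ber_1v:.3e}")
print(f"BER at +/-2 V: {ber_2v:.3e}")
```

Running this shows how sharply the tail of the Gaussian falls away: with the noise unchanged, doubling the signal levels reduces the error rate by several orders of magnitude.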

The solid line represents an uncoded signal whereas the dotted line is the same channel, but augmented by error coding. This gives rise to an enormously important idea called coding gain. At very low powers (and consequently high error rates), the addition of error coding bits to the message burdens it such that, in Figure 1.3, the coded channel needs a higher $E_b/N_0$ to give the same performance as the uncoded signal with an $E_b/N_0$ of 0. The reason is that to get the same number of data bits through the channel when coded, more total bits must be sent and so more errors occur. Because the error rates are high, the coding is inadequate to offset them (t is exceeded), so we're actually worse off. However, at signal powers of 2, the coded and uncoded channels have similar performance while above 2, the coded channel has a better performance. The early part of the graph represents coding loss, while the latter, coding gain.

So why is this important? If a particular system requires a BER of no greater than one error in $10^5$, then an uncoded channel in this example will need an $E_b/N_0$ of 5. However, the coded channel can provide this performance with an $E_b/N_0$ of only 4, so the code is said to give a coding gain of 1 dB ($E_b/N_0$). Put this in terms of a TV relay satellite and it starts to make good economics. Error coding can reduce the required transmitter power or increase its range dramatically, which has a major impact on the weight and lifetime of the satellite. Approximately, when error coding is added to a signal, the two distributions in Figure 1.2 get closer together since, for the same transmitter power, more bits are being sent. Normally, this would increase the BER. However, the effect of adding error coding reduces the widths ($\sigma$) of the distributions such that the BER actually decreases.

Before leaving random errors, there is another aspect to them which has been usefully exploited in some modern coding schemes. When a random error occurs, the nature of the Gaussian PDF means that the signal is increasingly likely to be near to the decision boundary (0 volts in this example). Rather than make a straightforward 1/0 decision, each bit can be described by its probability of being a 1 or a 0 (a soft decision). Figure 1.4 shows the typical soft output.

Figure 1.4 Probability distribution with soft boundaries.


A random error that turns a 0 into a 1 is statistically likely to have a low probability of being a 1. In simple terms, the signal undergoes an analogue to digital conversion rather than slicing so that instead of outputting a 0 or 1, the decoder might output a value between 0 and 7. A good 0 would be 0, while a good 1 would be 7. A weak 0 would be 3, while a weak 1 would be 4. Overall (although not exclusively), random errors will give rise to weak 1s and 0s. If the decoder detects an error, the most likely suspects will be the weaker bits. In essence, what this process does is to increase the resolution of d from whole bits to fractions of bits. This is called soft decision decoding and, although not directly relevant to all codes, it helps error coding to approach its theoretical limits.

The second main type of error is called a burst error. A burst error is characterized by b - 2 bits which may or may not be in error, sandwiched in between two bits that definitely are in error, and preceded by at least b error-free bits. Errors of this kind typically come from electromagnetic transients caused by heavy electrical machinery switching. Deep scratches on the surface of a CD or other media defects might also cause burst errors. These are unpredictable in every way and will normally be handled very differently from random errors. An everyday example of the source of burst errors could be the clicking picked up on the radio when a badly suppressed car engine is running nearby.

Returning to the example of a damaged or imperfect CD, many successive bits may be lost in one go. An error code capable of handling this kind of data loss would be excessively large and complex so other processes may be used in conjunction with error coding, including interleaving. If you're one of those people who read the technical specs on your hi-tech appliances in order to compare notes with friends, you'll doubtless have noticed that your CD player boasts "cross-interleaved Reed-Solomon coding" (they all do!). Cross-interleaving is a relatively simple process which decorrelates or spreads errors out. A media defect can give rise to large gouts of lost data so it is necessary to divide the error in order to conquer it. In approximate terms, data is read into a memory row by row and each row is error coded. It is then written onto the media on a column-wise basis. Upon reading, the data is arranged back into the memory column-wise, ready to be decoded row-wise. In the event of a large error, even if a whole column is lost, there will still be only one error in each row, which can be easily corrected. Devising efficient spreading algorithms is quite an art in itself and we'll visit this later. Question: Will interleaving help when combatting random errors?
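The row-in, column-out arrangement described above can be sketched as a simple block interleaver. This is an illustration only, not the cross-interleaving used in CD players: the array size, the function names and the use of `None` to mark lost symbols are assumptions made for the example, and the per-row error coding itself is left out.

```python
def interleave(data: list, rows: int, cols: int) -> list:
    """Fill a rows x cols memory row by row, then read it out column by column."""
    if len(data) != rows * cols:
        raise ValueError("data must exactly fill the memory")
    return [data[r * cols + c] for c in range(cols) for r in range(rows)]


def deinterleave(data: list, rows: int, cols: int) -> list:
    """Write the stream back column by column, then read it out row by row."""
    if len(data) != rows * cols:
        raise ValueError("data must exactly fill the memory")
    return [data[c * rows + r] for r in range(rows) for c in range(cols)]


ROWS, COLS = 4, 6
data = list(range(ROWS * COLS))          # stand-in for four coded rows

on_media = interleave(data, ROWS, COLS)

# A burst wipes out one whole column's worth of consecutive symbols.
damaged = list(on_media)
for i in range(8, 8 + ROWS):
    damaged[i] = None

recovered = deinterleave(damaged, ROWS, COLS)
errors_per_row = [
    recovered[r * COLS:(r + 1) * COLS].count(None) for r in range(ROWS)
]
print(errors_per_row)
```

A burst of four consecutive symbols on the media ends up as a single lost symbol in each of the four rows, which is the situation a single-error correcting row code can deal with.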


1.5 A Brief History of Error Coding

Modern error coding, while based on maths that was discovered (in some cases) centuries ago, began in the late forties with most of the major innovations occurring over the fifties, sixties and seventies. In 1948 Claude Shannon proved that, if information was transmitted at less than the channel capacity, then it was possible to use the excess bandwidth to transmit error correcting codes to increase arbitrarily the integrity of the message. The race was now on to find out how to do it.

During the early fifties Hamming codes appeared, representing the first practical error correction codes for single-bit errors. Golay codes also appeared at this time, allowing the correction of up to three bits. By the mid-fifties Reed-Muller codes had appeared and, by the late fifties, Reed-Solomon codes were born. These last codes were symbol, rather than bit, based and underpin the majority of modern block coding. With block codes, encoding and decoding operate on fixed sized blocks of data. Of necessity, the compilation of data into blocks introduces constraints on the message size, and latency into the transmission process, precluding their use in certain applications.

It is important to remember that, while the codes and their algebraic solutions now existed, digital processing was not what it is today. Since Reed-Solomon codes had been shown to be optimal, the search was shifted somewhat towards faster and simpler ways of implementing the coding and decoding processes to facilitate practical systems. To this end, during the mid-sixties, techniques like the Forney and Berlekamp-Massey algorithms appeared. These reduced the processing burdens of decoding to fractions of what had been previously required, heralding modern coding practice as it is today.

Almost in parallel with this sequence of events and discoveries was the introduction in the mid-fifties, by Elias, of convolutional codes as a means of introducing redundancy into a message. These codes differed from the others in that data did not need to be formatted into specific quantities prior to encoding or decoding. Throughout the sixties various methods were created for decoding convolutional codes and this continued until 1967 when Viterbi produced his famous algorithm for decoding convolutional codes which was subsequently shown to be a maximum-likelihood decoding algorithm (i.e., the best you could get).

The eighties contributed in two key ways to error coding, one of which was the application of convolutional coders to modulation, creating today's reliable, high-speed modems. Here the error coding and modulation functions ceased to be separate processes but were integrated into a single system. The mathematical space generated in messages by error codes was translated into modulation space. The second was the application of so-called algebraic geometry to Reed-Solomon coding. This step created a class of codes called Quasi-Maximum Distance Separable codes, based on elliptic curves. Today, much attention is being given to Turbo codes. These are a class of iteratively decoded convolutional codes which appear to be approaching the theoretical limits of what error coding can achieve.

Clearly error coding as we know it started life in the fifties but, curiously, the first example of both the need for and the application of error coding could go back as far as 1400 BC. In 1890, the Russian mathematician Dr Ivan Panin found a rigorous coding scheme that permeates the length and breadth of the bible. The bible's authors must have felt the information was important enough to warrant protection of some sort and, presumably with this in mind, built a very complex heptadic (based on sevens) structure into it. Since both the Old Testament (in Hebrew) and the New Testament (in Greek) share characters which are both letters and numbers, each word also has a numerical value. Apparently, only a simple test is required to see if a book complies with the code and, where two manuscripts are found to differ, it is easy to see which is the more accurate (according to the code). Coding of this nature is asymmetric. While it is simple to decode, the encoding process is unknown to date and hasn't been reproduced on any significant scale. Panin, however, used the code to create his own version of parts of the bible, correcting any deviations from the heptadic structure.

1.6 Summary

While the introduction is in fairly broad brush-strokes, the ideas covered are central to error coding. Space, in one form or another, is what makes error coding possible. The Hamming and Gilbert bounds link the amount of space required to the message size and error correcting capability. The maths that I included at this point needs a little thought. If equations like (1.3) are new to you, work through a few simple examples with small numbers that are manageable. Take, for example, four red snooker balls and three blue. See how many different ways you can arrange the blues between the reds then compare your results with

$$\frac{7!}{3!\,4!} = 35.$$


You've actually calculated the number of different ways that you can change three bits in a total of seven. Take any 7-bit message and there are 35 other messages that are 3 bits distance from it. At this point, you might even be able to construct an error code using the Gilbert bound coupled with a random code selection and deletion process. Error types and sources have been considered, leading into processes which can augment error coding like soft-decision decoding and interleaving. When all these ideas are brought together, some spectacular results emerge. Messages that come back from deep space probes can be buried deeply in noise and yet, with error coding, the data are still recoverable. Digital TV places tight constraints on acceptable BER and yet, with error coding, it's possible without enormous transmitter power. CDs and especially DAT (Digital Audio Tape) players simply wouldn't work without error coding. Systems like modems combine error coding with signal modulation to provide communications almost at the limit of what is theoretically possible. So what we need now are some practical schemes for implementing error coding and decoding.
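The count of 35 can be checked by brute force. The following is a minimal sketch in Python; the function name `neighbours_at_distance` and the sample message are illustrative choices, not taken from the text. It flips every possible set of three bits in a 7-bit word and counts the distinct results.

```python
from itertools import combinations
from math import comb


def neighbours_at_distance(word: int, n_bits: int, distance: int) -> list[int]:
    """Return every n_bits-wide word that differs from `word` in exactly `distance` bits."""
    result = []
    for positions in combinations(range(n_bits), distance):
        flipped = word
        for p in positions:
            flipped ^= 1 << p  # invert one chosen bit
        result.append(flipped)
    return result


message = 0b1011001  # any 7-bit message will do (arbitrary example value)
neighbours = neighbours_at_distance(message, 7, 3)
print(len(neighbours), comb(7, 3))
```

Both printed numbers should agree, since choosing which three of seven bits to flip is the same counting problem as placing three blue balls among seven positions.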

Chapter 2

A LITTLE MATHS

This chapter introduces some of the maths that is required in order to perform certain block coding schemes. Treatment is from a practical, rather than theoretical, point of view, always with hardware implementation in mind. No attempt is made to prove the ideas considered here since there are plenty of excellent texts on the subject and (mathematicians: look away now) generally in engineering, "if it works twice, it's a law". No one who has ever used maths for anything practical can have failed to notice how amazing it is that diverse approaches to problems converge upon common answers. Logic, of course, predicts this and this is also why mathematicians go to great lengths to prove that "it wasn't a fluke and it'll work more than twice". Whole number maths, which underpins much error coding, is one of those amazing subjects which opens up a whole vista of possibilities for engineers. Imagine being able to perform all sorts of complex operations using only a finite set of numbers, and never having to worry about fractions! Hopefully this chapter will be fun, rather than onerous and, if you've never heard of finite fields, you're in for a pleasant surprise.

A. Houghton, Error Coding for Engineers © Kluwer Academic Publishers 2001


2.1 Polynomials and Finite Fields

In error coding, certain ideas, terms and mathematical operations appear quite frequently. Since this is so, these ideas are introduced early on. At first, they'll seem quite abstract but, once a few coding examples are worked through, it'll all make good sense. Inevitably some ideas will get expanded later on but a few basic rules now provide a good foundation to work from. Generally, operations are executed modulo-2. This means that many functions can be realised by exclusive-ORing, making implementation in hardware trivial. As a result there is no need to be concerned with negative numbers since 1 + 1 = 0, so 1 = -1. Also, much of the maths needed operates over finite fields. Finite fields are closed sets of numbers constructed using things called primitive polynomials or generator polynomials (GP). Any mathematical operation performed over a finite field will result in a number which is in the field. There is a certain amount of terminology and notation which surrounds this type of whole-number maths and it's useful to acquire at least a superficial recognition of it. For reasons of generality, numbers are often expressed in the form

for example. Since we will only be dealing with binary, x is simply 2. The above example could be written

However, while these representations may be interchangeable, the form used must reflect the operation that you are performing. For example, to multiply two such polynomials (as they are called), say 10101 and 110010, it would be wrong to assume that the answer was 21 × 50 (= 1050). This operation would have to be done modulo-2 as

$$(x^4 + x^2 + 1)(x^5 + x^4 + x) = x^9 + x^8 + x^7 + x^6 + x^4 + x^3 + x = 1111011010 \text{ (or 986)}$$

(don't forget, 1 is $x^0$). Usually, the result would have been

$$x^9 + x^8 + x^7 + x^6 + 2x^5 + x^4 + x^3 + x$$

but where there is an even number of similar terms the result is zero (i.e. 1 + 1 = 0) so the $x^5$ term goes since $x^5 + x^5 = (1 + 1)x^5$. It may be that the finite field we're using doesn't contain elements as high as $x^9$ in which case the answer is modified further, using the generator polynomial to 'wrap' the higher order terms back into the field. We'll see how this operates shortly.

A finite field is constructed using a generator polynomial (GP) which is primitive. This means that it has no factors (or is irreducible). (Strictly, being irreducible is necessary but not sufficient for a polynomial to be primitive, although the examples below are both.) Consider the following:

$$(x^2 + x + 1)(x + 1)$$

expands to

$$x^3 + x^2 + x^2 + x + x + 1$$

which simplifies to $x^3 + 1$. So $x^3 + 1$ is not primitive because it has at least two factors, $x^2 + x + 1$ and $x + 1$. So a primitive polynomial has no factors, and some examples include:

$$x^2 + x + 1 = 0, \quad x^3 + x + 1 = 0, \quad x^3 + x^2 + 1 = 0, \quad x^5 + x^4 + x^3 + x + 1 = 0 \ldots$$

The set of numbers which the GP describes contains elements one bit less in length than the GP itself. Technically speaking, the degree of the elements (their highest power of x) is one less than that of the GP. So the field that $x^4 + x + 1$ forms, for example, contains elements which include bits up to $x^3$


but no higher. You may already be familiar with finite fields but in a different guise. Pseudo random binary sequences are sometimes generated in the same way as finite fields. PRBSs are sequences of numbers which repeat in the long term but, over the short term, appear random. Finite fields are often referred to in the form GF($2^n$). The G stands for Galois who originated much of the field theory, while the F stands for field. The 2 means that the field is described over binary numbers while n is the degree of the GP. Such a field has, therefore, $2^n$ elements (or numbers) in it. Taking a small field, GF($2^3$), we'll use the polynomial $x^3 + x + 1 = 0$. To calculate the field we start off with an element called $\alpha$ which is the primitive root. It has this title because all elements of the field (except 0) are described uniquely by a power of $\alpha$. In this case $\alpha$ is 2, 010 or x. For any finite field GF($2^n$), $\alpha^{2^n - 1} = \alpha^0 = 1$. What's special about these fields is that the elements obey the normal laws of maths insofar as they can be added, multiplied or divided to yield another member of the field. Table 2.1 shows how the field is constructed.

Table 2.1 GF($2^3$)

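Power        Calculation                  Result                                       Numeric value
$\alpha^1$   $x$                          $x$                                          010 (2)
$\alpha^2$   $x \cdot x$                  $x^2$                                        100 (4)
$\alpha^3$   $x \cdot x \cdot x$          $x^3$

There is no term $x^3$ but, from the primitive polynomial $x^3 + x + 1 = 0$, we can see that $x^3 = x + 1$, remembering also that 1 = -1. Substituting 'folds' $x^3$ back into the bottom 3 bits.

Power        Calculation                  Result                                       Numeric value
$\alpha^3$   $x^3$                        $x + 1$                                      011 (3)
$\alpha^4$   $\alpha \cdot \alpha^3$      $x(x + 1) = x^2 + x$                         110 (6)
$\alpha^5$   $\alpha \cdot \alpha^4$      $x(x^2 + x) = x^3 + x^2 = (x + 1) + x^2$     111 (7)
$\alpha^6$   $\alpha^2 \cdot \alpha^4$    $x^2(x^2 + x) = x(x + 1) + (x + 1) = x^2 + 1$  101 (5)
$\alpha^7$   $\alpha \cdot \alpha^6$      $x(x^2 + 1) = x^3 + x = (x + 1) + x = 1$     001 (1)
$\alpha^8$   $\alpha \cdot \alpha^7$      $\alpha \cdot 1 = \alpha$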

Notice that $\alpha^7 = \alpha^0$ (= 1) and the sequence starts to repeat (i.e. $\alpha^8 = \alpha^1$). A reducible or non-primitive polynomial does not produce this maximal sequence. For example, Table 2.2 repeats the above calculations using $x^3 + x^2 + x + 1 = 0$ which has three equal factors, (x + 1).

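Table 2.2 Generating a sequence using a non-primitive polynomial

Power        Calculation                  Result                                       Numeric value
$\alpha^1$   $x$                          $x$                                          010 (2)
$\alpha^2$   $x \cdot x$                  $x^2$                                        100 (4)
$\alpha^3$   $x \cdot x \cdot x$          $x^3$

This time, $x^3 = x^2 + x + 1$ so

Power        Calculation                  Result                                       Numeric value
$\alpha^3$   $x^3$                        $x^2 + x + 1$                                111 (7)
$\alpha^4$   $\alpha \cdot \alpha^3$      $x(x^2 + x + 1) = x^3 + x^2 + x = 1$         001 (1)
$\alpha^5$   $\alpha \cdot \alpha^4$      $x(1) = x = \alpha^1$                        010 (2)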

and the cycle repeats. This time, we have not generated all non-zero elements with three bits. In fact, there are three cycles, corresponding to the three factors of $x^3 + x^2 + x + 1 = 0$. Exactly which one is generated depends on the starting conditions. The above cycle contains 2, 4, 7, 1. Starting with a value not represented here, say 3, then

$\alpha^?$                      $x + 1$                                                011 (3)
$\alpha \cdot \alpha^?$         $x(x + 1) = x^2 + x$                                   110 (6)
$\alpha \cdot \alpha^{?+1}$     $x(x^2 + x) = x^3 + x^2 = x + 1$                       011 (3)

generating a smaller cycle containing 3 and 6. Starting with the one remaining unused number, 5, then

$\alpha^?$                      $x^2 + 1$                                              101 (5)
$\alpha \cdot \alpha^?$         $x(x^2 + 1) = x^3 + x = x^2 + 1$                       101 (5)

Appendix 1 lists some primitive polynomials for you to experiment with.
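The modulo-2 multiplication and the sequences of Tables 2.1 and 2.2 can be reproduced with a few lines of Python. This is only a sketch: the helper names `gf2_poly_mul` and `cycle_from` are hypothetical, and polynomials are assumed to be held as integers whose bit i is the coefficient of $x^i$.

```python
def gf2_poly_mul(a: int, b: int) -> int:
    """Multiply two binary polynomials, adding partial products modulo-2 (XOR)."""
    product = 0
    while b:
        if b & 1:
            product ^= a  # add this shifted copy of a without carries
        a <<= 1
        b >>= 1
    return product


def cycle_from(start: int, gp: int) -> list[int]:
    """Multiply `start` by x repeatedly, folding overflow back with `gp`, until `start` recurs."""
    degree = gp.bit_length() - 1
    values, value = [], start
    while True:
        values.append(value)
        value <<= 1                # multiply by x
        if value >> degree:        # a term of the GP's degree has appeared
            value ^= gp            # substitute it using the GP
        if value == start:
            return values


print(gf2_poly_mul(0b10101, 0b110010))   # 986, not 21 * 50
print(cycle_from(0b010, 0b1011))         # x^3 + x + 1: all seven non-zero elements
print(cycle_from(0b010, 0b1111))         # x^3 + x^2 + x + 1: a short cycle
print(cycle_from(0b011, 0b1111))
print(cycle_from(0b101, 0b1111))
```

With the primitive polynomial the cycle visits every non-zero 3-bit value once. With the non-primitive one, the three starting values fall into three separate cycles, as described above.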

2.2 Manipulating Field Elements

In order to do useful things with finite fields it is necessary to understand how to use the elements. This, perhaps, is where whole number maths starts to depart from the traditional concept of mathematical operations although, surprisingly, it's not difficult with a little practice. Using the field in Table 2.1 consider addition (which is also subtraction). To add two elements, simply bit-wise XOR; for example

$$\alpha^2 + \alpha^5 = \alpha^3$$

or in binary

  100
  111  ⊕
  011

Multiplication and division can be performed in two ways. First, using powers.

Multiplying in this way is simply a matter of adding powers modulo-7. However, the same result may be achieved using the bit patterns of the field elements. Taking the second example, $\alpha^4 \alpha^5 = \alpha^9 = \alpha^2$,

$$(x^2 + x)(x^2 + x + 1) = x^4 + x^3 + x^2 + x^3 + x^2 + x$$

first, eliminate common terms

$$= x^4 + x$$

now use the GP to bring $x^4$ back into the field

$$= x \cdot x^3 + x = x(x + 1) + x = x^2$$

so we've achieved the same result. For division, consider $\alpha / \alpha^5$ or

$$\frac{x}{x^2 + x + 1}.$$

To evaluate this we need to examine long division over finite fields. This is not altogether dissimilar to normal long division and is best attempted bitwise. Rearranging the problem gives 010 / 111. Step 1 involves aligning the most significant set bit of the denominator with the most significant set bit in the numerator (assuming both are nonzero). So the sum above becomes

               | x^2  x^1  x^0 | x^-1
  (result)     |               |  1
  (align)      |  0    1    0  |
               |  0    1    1  |  1    ⊕
  (remainder)  |  0    0    1  |  1

and a 1 is placed in the result, coinciding with the least significant bit position of the denominator. A remainder is calculated by exclusive-ORing the denominator with the numerator. When the remainder is zero, the division is complete. The two vertical lines represent the valid range of the result which, over this field, includes $x^2$, $x^1$ and $x^0$. Now you can see that the first calculated bit of the result falls in the position $x^{-1}$, out of range. To prevent this from happening, we can use the fact that $1 = x + x^3$, from the generator polynomial. In Table 2.1, you may recall that the GP was used to change $x^3$ into x + 1 or $\alpha^3$. For numerical purposes, we could think of $x^3$ as $\alpha^3$ and $x^4$ as $\alpha^4$ and so on for any $x^i$. For example, we saw, in the previous multiplication, how $x^4$ was brought back into the field using the substitution x(x + 1). This can also be done for negative i. In the division above, there is a result and remainder term in the $x^{-1}$ column. In just the same way that $x^4$ can be thought of as $\alpha^4$ even though it is not strictly in the field, so $x^{-1}$ can be thought of as $\alpha^{-1}$, or $\alpha^6 = x^2 + 1$. Rearranging the GP, we have

$$1 = x^3 + x$$

so

$$x^{-1} = x^2 + 1.$$

The GP can, therefore, be used to move bits both to the right and the left in order to bring them back into the field. Rearranging, the calculation above becomes

[Worked layout: the division above redrawn with the $x^{-1}$ terms brought back into the field using $x^{-1} = x^2 + 1$.]

However, continuing the calculation in this way does not lead to a solution since, in this case, it is not possible to get a zero remainder. Instead, the division can be performed as follows:

[Worked layout: iterative long division of 010 by 111 over GF($2^3$), with successive (result), (becomes) and (remainder) rows.]

To summarise, when dividing would result in a bit in the result to the right of $x^0$, the left-most 1 in the current numerator is replaced by substitution using the GP. As a result of this iterative solution, the result (the top three rows) looks a bit confusing. However, the bits are summed vertically modulo-2 to give 011 or $\alpha^3$, the correct result.
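Multiplication and division by adding or subtracting powers can be sketched with a pair of look-up tables. The code below is an illustration only, assuming elements of GF($2^3$) are stored as 3-bit integers and the field is built from $x^3 + x + 1$ as in Table 2.1; the names `antilog`, `log`, `gf_mul` and `gf_div` are not from the text. It does not reproduce the bitwise long division, only its result.

```python
GP = 0b1011          # x^3 + x + 1
ORDER = 7            # number of non-zero elements, 2^3 - 1

# antilog[i] holds the bit pattern of alpha^i; log maps a bit pattern back to i.
antilog = []
value = 1
for _ in range(ORDER):
    antilog.append(value)
    value <<= 1                  # multiply by alpha (x)
    if value & 0b1000:
        value ^= GP              # fold x^3 back into the field
log = {pattern: power for power, pattern in enumerate(antilog)}


def gf_mul(a: int, b: int) -> int:
    """Multiply two field elements by adding their powers modulo-7."""
    if a == 0 or b == 0:
        return 0
    return antilog[(log[a] + log[b]) % ORDER]


def gf_div(a: int, b: int) -> int:
    """Divide a by b by subtracting powers modulo-7."""
    if b == 0:
        raise ZeroDivisionError("division by the zero element")
    if a == 0:
        return 0
    return antilog[(log[a] - log[b]) % ORDER]


print(format(0b100 ^ 0b111, "03b"))           # addition is XOR
print(format(gf_mul(0b110, 0b111), "03b"))    # alpha^4 * alpha^5
print(format(gf_div(0b010, 0b111), "03b"))    # alpha / alpha^5
```

The last line should agree with the result of the long division, 011.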


Logs and anti-logs of elements are important and useful functions. Generally they are most easily accomplished using look-up-tables. However, over large fields or in low-cost microcontroller-based solutions, this may not be practical. If look-up-tables cannot be used then some slower algorithmic solutions must be found. The simplest way to calculate the anti-log or exponential of a value, say i, is to multiply 1 ($\alpha^0$) by $\alpha$, i times. First, of course, i must be brought into the range 0 to $2^n - 2$ for the field GF($2^n$), either by using the modulo function or adding/subtracting $2^n - 1$ as appropriate. Logs can be found in a similar way by counting how many times the operand must be multiplied by $\alpha^{-1}$ until the result is 1 ($\alpha^0$). Over a large field this will be quite slow so ways of speeding up the operation are required. Considering first anti-logs, it is possible to construct a circuit or algorithm which calculates the square of an element. Essentially this is a multiplier with its inputs connected together. The circuit is fed back into itself via a register and preloaded with $\alpha$. Clocking it n times produces the sequence

$$\alpha,\ \alpha^2,\ \alpha^4,\ \alpha^8 \ldots$$

and so on up to $\alpha^{2^{n-1}}$. A second multiplier-accumulator (MAC) is required which is initialised to 1. The inputs to this MAC are itself and the square generator. The value i to be anti-logged is arranged in a shift (right) register which is clocked synchronously with the square generator. If the output from the shift register is 1, the MAC is latched, otherwise its contents remain unchanged. After n clocks (maximum), the MAC will contain the bit pattern of $\alpha^i$. This gives an exponential speed-up of the original algorithm and is illustrated in Figure 2.1, below.

Figure 2.1 An anti-logging circuit.
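The square-and-multiply idea behind Figure 2.1 can be sketched in software. This is an illustrative model rather than a description of the circuit itself: the function names `gf_mul`, `antilog` and `log_by_counting` are assumptions, and field elements are held as integers with bit i standing for $x^i$. The slow log routine follows the counting method described above.

```python
def gf_mul(a: int, b: int, gp: int) -> int:
    """Shift-and-add multiply of two field elements, reducing with the generator polynomial."""
    degree = gp.bit_length() - 1
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> degree:
            a ^= gp
    return result


def antilog(i: int, gp: int) -> int:
    """Return alpha^i using at most n squarings, one per bit of i."""
    degree = gp.bit_length() - 1
    i %= (1 << degree) - 1         # bring i into the range 0 .. 2^n - 2
    square = 0b10                  # square generator, preloaded with alpha
    mac = 1                        # multiplier-accumulator, initialised to 1
    for _ in range(degree):
        if i & 1:                  # bit shifted out of the exponent register
            mac = gf_mul(mac, square, gp)
        square = gf_mul(square, square, gp)
        i >>= 1
    return mac


def log_by_counting(value: int, gp: int) -> int:
    """Count multiplications by alpha^-1 needed to bring a non-zero value to 1."""
    if value == 0:
        raise ValueError("zero has no log")
    degree = gp.bit_length() - 1
    alpha_inverse = antilog((1 << degree) - 2, gp)
    count = 0
    while value != 1:
        value = gf_mul(value, alpha_inverse, gp)
        count += 1
    return count


print(format(antilog(5, 0b1011), "03b"))   # alpha^5 over x^3 + x + 1
print(log_by_counting(0b101, 0b1011))      # log of 101
```

The loop in `antilog` runs n times regardless of i, which is the speed-up over multiplying by $\alpha$ up to $2^n - 2$ times.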


A speeded up logging algorithm works by noting that the first n elements of a field GF($2^n$) have a weight of only 1. For example, the field GF($2^3$) has $\alpha^0$ = 001, $\alpha^1$ = 010 and $\alpha^2$ = 100. Any value $\alpha^i$ can be changed into the form ...

Figure 4.2 Encoding and Decoding a Hamming (7,4) Message.

In the encoder, the appropriate bits are summed to produce the three parity bits, and this process is repeated at the decoder. The received parity bits are XORed with the recalculated versions to give the syndrome. The syndrome feeds into a three-to-eight line decoder (with true outputs). A zero syndrome results in the 'No Error' output ($Y_0$) going high. A non-zero syndrome results in one of $Y_1$ to $Y_7$ going high which then inverts the appropriate received bit via the final XOR gates at the right. There is little point in correcting the parity bits since these are not used again unless the message is to be passed on.

4.2 Extending the Message Size

Extension of Hamming codes to longer messages becomes increasingly difficult to visualize. The three-dimensional view of Figure 4.1 has to include 'magic tunnels' in order to provide greater dimensionality to the systems of equations that result. Implementation, however, is as trivial as for the (7, 4) code above. Extending the message so that there are four parity bits allows the syndrome to have 16 values, enough for 15 bits plus a no-error code of 0. So the complete message is 15 bits, or a (15, 11) code. To construct the parity check matrix the data and parity bits are ordered as follows:

$$d_F \; d_E \; d_D \; d_C \; d_B \; d_A \; d_9 \; p_8 \; d_7 \; d_6 \; d_5 \; p_4 \; d_3 \; p_2 \; p_1$$

The bit numbering starts with 1 at the right, increasing to the left. The parity bits are placed in positions corresponding to increasing powers of 2, i.e. $2^0$, $2^1$, $2^2$, $2^3$, while data bits fill the remaining spaces.

Table 4.3 Generating the parity check equations for a (15, 11) code

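      dF  dE  dD  dC  dB  dA  d9  p8  d7  d6  d5  p4  d3  p2  p1
p1     1   .   1   .   1   .   1   .   1   .   1   .   1   .   1
p2     1   1   .   .   1   1   .   .   1   1   .   .   1   1   .
p4     1   1   1   1   .   .   .   .   1   1   1   1   .   .   .
p8     1   1   1   1   1   1   1   1   .   .   .   .   .   .   .

(A 1 marks a non-shaded element, i.e. a bit that takes part in that row's parity check.)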

Table 4.3 shows the four parity bits in the left-hand column, and the 15 message bits along the top row. The parity bits must be set or cleared so that all non-shaded elements in each row add up to zero (for even parity). From the table it can be seen that the $p_1$ row contains only the parity bit $p_1$. Likewise, the other rows contain only their own respective parity bits. This means that the outcome of any parity bit will not affect the calculation of any other parity bits. From Table 4.3 the following parity generator equations can be generated:

$$\begin{aligned}
p_1 &= d_3 + d_5 + d_7 + d_9 + d_B + d_D + d_F \\
p_2 &= d_3 + d_6 + d_7 + d_A + d_B + d_E + d_F \\
p_4 &= d_5 + d_6 + d_7 + d_C + d_D + d_E + d_F \\
p_8 &= d_9 + d_A + d_B + d_C + d_D + d_E + d_F
\end{aligned}$$

Each parity bit is associated with one of the four message index bits. For example, $p_1$ is associated with all message bits whose position index within the message is odd, or has the least significant bit set, i.e. bits 1 ($p_1$), 3 ($d_3$), 5 ($d_5$) etc. Table 4.4 shows the calculation of the parity bits for the 11-bit data word 11011010100. In each row the parity bit is either set or cleared to give the row even parity (a binary summation of 0). With the test data, this gives rise to the parity code 1001. The way that the data and parity are ordered during transmission is a matter of preference. In a bus-oriented system it will probably be more convenient to keep the data together, and keep the parity apart. In a serial communication system it makes little difference whether the parity bits are interspersed with the data, or kept separate. Keeping the parity separate in this example, the 15-bit word 110110101001001 would be transmitted. Table 4.5 shows the situation at the receiver when an error occurs in $d_B$.

Table 4.4 Calculating the parity bits for a (15, 11) Hamming code

        dF  dE  dD  dC  dB  dA  d9  p8  d7  d6  d5  p4  d3  p2  p1
Data     1   1   0   1   1   0   1       0   1   0       0
p1       1       0       1       1       0       0       0       1
p2       1   1           1   0           0   1           0   0
p4       1   1   0   1                   0   1   0   0
p8       1   1   0   1   1   0   1   1

Table 4.5 Locating a single-bit error in a (15, 11) Hamming code

           dF  dE  dD  dC  dB  dA  d9  p8  d7  d6  d5  p4  d3  p2  p1   Parity error
Received    1   1   0   1   0   0   1   1   0   1   0   0   0   0   1
p1          1       0       0       1       0       0       0       1        1
p2          1   1           0   0           0   1           0   0            1
p4          1   1   0   1                   0   1   0   0                    0
p8          1   1   0   1   0   0   1   1                                    1

The right-hand column of Table 4.5 indicates a parity error. If the recalculated parity does not match the transmitted parity, then a 1 is placed in the appropriate box. In this case the syndrome due to the error is 1011. 1011 is, of course, $11_{10}$ or $B_{16}$ and points to bit $d_B$ in the message which must be inverted. Like the (7, 4) code, this does not show if there is more than a single bit in error. We can arrange for the Hamming distance to be increased by one from three to four, however, by adding an overall parity bit to the message. When a single-bit error occurs the overall parity of the message will be wrong, and the inner (correcting) parity bits will point to the error. In the case of a double-bit error the weight of the message will still be consistent with the extra parity bit (hence no overall parity error), but the inner parity bits will be indicating that an error has occurred. In this case the error is not correctable.
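The (15, 11) encoding and single-bit correction described above can be sketched in a few lines of Python. The function names `encode` and `syndrome`, and the choice of a dictionary keyed by bit position, are assumptions made for illustration; the bit ordering (parity at positions 1, 2, 4 and 8) and the test data follow the text.

```python
PARITY_POSITIONS = (1, 2, 4, 8)


def encode(data_bits: str) -> dict[int, int]:
    """Place 11 data bits (leftmost first, positions 15 down to 3) and add even parity."""
    data_positions = [p for p in range(15, 0, -1) if p not in PARITY_POSITIONS]
    word = {p: int(b) for p, b in zip(data_positions, data_bits)}
    for parity in PARITY_POSITIONS:
        # Each parity bit covers every position whose index has that bit set.
        word[parity] = sum(v for p, v in word.items() if p & parity) % 2
    return word


def syndrome(word: dict[int, int]) -> int:
    """Recalculate each parity check; the failing checks spell out the error position."""
    result = 0
    for parity in PARITY_POSITIONS:
        if sum(v for p, v in word.items() if p & parity) % 2:
            result |= parity
    return result


word = encode("11011010100")
parity_code = "".join(str(word[p]) for p in (8, 4, 2, 1))
print(parity_code)

received = dict(word)
received[0xB] ^= 1                 # corrupt d_B
error_position = syndrome(received)
print(format(error_position, "04b"))
received[error_position] ^= 1      # invert the indicated bit
print(received == word)
```

A zero syndrome means no single-bit error was seen. As noted above, a double-bit error would still produce a non-zero syndrome here and would be miscorrected unless an overall parity bit is added.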

Chapter 5

ERROR CORRECTION USING THE CRC

Hamming codes work well for small messages where parallel implementation of the computation is possible, but for longer messages the CRC provides a more economic solution. A single-bit error can be corrected provided the CRC is long enough to describe every bit location within the protected part of the message, and that the CRC is based on a primitive generator polynomial. If the CRC is n bits, then the total protected message length (i.e. including the CRC itself) must be less than $2^n$ bits. A CRC encoded message, $d_0$ to $d_{m-1}$ ($m = 2^n - 1$), satisfies the equation

$$\sum_{i=0}^{m-1} d_i \alpha^i = 0.$$

If a single bit is corrupted during transmission or storage, then the syndrome will no longer be zero. Suppose $d_2$ is corrupted. The syndrome is calculated as

$$S = d_{m-1}\alpha^{m-1} + d_{m-2}\alpha^{m-2} + \cdots + d_3\alpha^3 + (d_2 + 1)\alpha^2 + d_1\alpha^1 + d_0\alpha^0$$

which is rearranged to

$$S = d_{m-1}\alpha^{m-1} + d_{m-2}\alpha^{m-2} + \cdots + d_3\alpha^3 + d_2\alpha^2 + \alpha^2 + d_1\alpha^1 + d_0\alpha^0$$

but we know that

$$d_{m-1}\alpha^{m-1} + d_{m-2}\alpha^{m-2} + \cdots + d_3\alpha^3 + d_2\alpha^2 + d_1\alpha^1 + d_0\alpha^0 = 0$$

so the syndrome (or remainder) is $\alpha^2$. The order of the syndrome provides the position of the error bit so we need to find its log. We can test this by looking at an earlier example where a CRC was used for error detection. Table 5.1 shows the field created by the polynomial $x^4 + x^3 + 1$, used as a GP in Chapter 3 to create the message 101101100010. Table 5.2 shows a longhand reworking of the calculation of the remainder at the receiver, except that a single-bit error has been introduced at bit position 6. The remainder is 15 which, from Table 5.1, coincides with the field element $\alpha^6$. Repeating the operation in Table 5.3, but this time with an error in bit position 9, we have a remainder of 5 which, from Table 5.1, is consistent with $\alpha^9$. So from this we see that a single-bit error at position $d_k$ in the message gives rise to a remainder of $\alpha^k$ at the receiver.

Table 5.1 Generating the finite field for $x^4 + x^3 + 1 = 0$

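Power           Evaluation                  Result                    Decimal
$\alpha^0$      $x^0$                       $1$                       1
$\alpha^1$      $x$                         $x$                       2
$\alpha^2$      $x \cdot x$                 $x^2$                     4
$\alpha^3$      $x \cdot x^2$               $x^3$                     8
$\alpha^4$      $x \cdot x^3$               $x^3 + 1$                 9
$\alpha^5$      $x(x^3 + 1)$                $x^3 + x + 1$             11
$\alpha^6$      $x(x^3 + x + 1)$            $x^3 + x^2 + x + 1$       15
$\alpha^7$      $x(x^3 + x^2 + x + 1)$      $x^2 + x + 1$             7
$\alpha^8$      $x(x^2 + x + 1)$            $x^3 + x^2 + x$           14
$\alpha^9$      $x(x^3 + x^2 + x)$          $x^2 + 1$                 5
$\alpha^{10}$   $x(x^2 + 1)$                $x^3 + x$                 10
$\alpha^{11}$   $x(x^3 + x)$                $x^3 + x^2 + 1$           13
$\alpha^{12}$   $x(x^3 + x^2 + 1)$          $x + 1$                   3
$\alpha^{13}$   $x(x + 1)$                  $x^2 + x$                 6
$\alpha^{14}$   $x(x^2 + x)$                $x^3 + x^2$               12
$\alpha^{15}$   $x(x^3 + x^2)$              $1$                       1
$\alpha^{16}$   $x(1)$                      $x$                       2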

Table 5.2 Locating an error at bit 6 using the CRC

GP: 11001        Quotient: 11010101

  101100100010     received message (the last four bits are the CRC)
  11001
   11110
   11001
     11110
     11001
       11100
       11001
         10110
         11001
          1111     Remainder = 1111

Table 5.3 Locating an error at bit 9 using the CRC

GP: 11001        Quotient: 11101111

  100101100010     received message (the last four bits are the CRC)
  11001
   10111
   11001
    11101
    11001
      10000
      11001
       10010
       11001
        10111
        11001
         11100
         11001
          0101     Remainder = 0101
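The two divisions above can be repeated in software. The sketch below assumes the message is held as an integer with bit 0 at the right; the names `crc_remainder` and `log` are illustrative and not from the text. It divides by the GP modulo-2 and then looks the remainder up in a small log table built from the same polynomial.

```python
GP = 0b11001   # x^4 + x^3 + 1


def crc_remainder(message: int, gp: int = GP) -> int:
    """Modulo-2 long division: XOR the GP under the leading 1 until fewer than 5 bits remain."""
    degree = gp.bit_length() - 1
    while message.bit_length() > degree:
        message ^= gp << (message.bit_length() - gp.bit_length())
    return message


# Log table: remainder bit pattern -> power of alpha (and hence error position).
log = {}
value = 1
for k in range(15):
    log[value] = k
    value <<= 1
    if value & 0b10000:
        value ^= GP

sent = 0b101101100010
print(crc_remainder(sent))                 # a valid message leaves no remainder

for k in (6, 9):
    received = sent ^ (1 << k)             # single-bit error at position k
    remainder = crc_remainder(received)
    print(k, format(remainder, "04b"), log[remainder])
```

For each single-bit error the log of the remainder should equal the position that was corrupted.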


5.1 A Hardware Error Locator

Chapter 2 covers the idea of calculating logs which are required to find the order of an element, hence the index of the error bit. If time permits, and space for a look-up-table does not, it is possible to use almost the same circuit that calculates the CRC, to find the error. Figure 5.1 shows a small reworking of the basic CRC calculator, necessary to provide error-location.

Figure 5.1 Extending the CRC generator/checker to correction.

The input to the right-hand register is now controlled by a 2-to-1 multiplexer. During calculate, while the data bits within the message are arriving, the input is sourced as normal from the feedback path. As the CRC bits arrive at the end of the message the switch is set to REM Out, or remainder out, and the right-hand register now takes its input from the remainder output. This has the effect of placing the remainder in the registers and as soon as they are full the switch is returned to calculate mode. The circuit has the effect of multiplying its contents by $\alpha$. In the example of Table 5.2 the remainder was 1111 corresponding to $\alpha^6$, and a bit error in position 6. If this remainder is replaced in the circuit, then after one clock the registers will contain $\alpha^7$. By counting the number of clock cycles that it takes for a non-zero remainder to return to the value 1 (or $\alpha^0$), the location of the error is found. For $\alpha^6$, nine clocks are required since we are counting forwards. This number therefore represents $2^n - k - 1$ for GF($2^n$). So in this case, k = 15 - clocks. Consider the second example above where the remainder is 0101 ($\alpha^9$). First, the registers are pre-loaded with the remainder 0101, as shown in Table 5.4. (Note they appear in reverse order in the table.) Successive clocks are then applied to the circuit until the register contents are 0001. At this point, we have reached the error index, shown in the 15 - Clocks column.

Table 5.4 Locating a single-bit error using hardware

A = FB'   B = A'   C = B'   D = C' ⊕ FB'   FB = D   DCBA = 1?   15 - Clocks
   1         0        1          0            0         x           15
   0         1        0          1            1         x           14
   1         0        1          1            1         x           13
   1         1        0          0            0         x           12
   0         1        1          0            0         x           11
   0         0        1          1            1         x           10
   1         0        0          0            0         ✓            9
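The counting procedure can be modelled in software as a shift register that multiplies by $\alpha$ on each clock. The following is a sketch only; the function name `locate_error` is hypothetical and the register is modelled as a single integer rather than four separate stages.

```python
def locate_error(remainder: int, gp: int = 0b11001) -> int:
    """Clock the remainder forwards until it reads 0001, then return 15 - clocks."""
    if remainder == 0:
        raise ValueError("a zero remainder means no error to locate")
    degree = gp.bit_length() - 1
    order = (1 << degree) - 1      # 15 for GF(2^4)
    state = remainder
    clocks = 0
    while state != 1:
        state <<= 1                # one clock multiplies the contents by alpha
        if state >> degree:
            state ^= gp            # feedback from the top stage
        clocks += 1
    return (order - clocks) % order


print(locate_error(0b0101))   # remainder from Table 5.3
print(locate_error(0b1111))   # remainder from Table 5.2
```

The final modulo handles a remainder of 0001, which needs no clocks and corresponds to an error in bit 0.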

It is possible to configure a circuit to multiply by $\alpha^{-1}$. This means that we can count down, ending up at the correct count without having to subtract. Where there is space on an integrated circuit for this extra feature and the message length is substantially shorter than $2^n$ bits, this represents a faster and attractive possibility. Over the field GF($2^4$), $\alpha^{-1}$ is actually $\alpha^{14}$. See if you can work out how to modify the circuit in Figure 5.1 to do this.

5.2 Multiple Bit Errors

Tables 5.2 and 5.3 show two examples of single-bit errors, but what would happen if they occurred together to form a double error? Table 5.5 shows a rework of the division with both errors present. This time the remainder is 1010 or $\alpha^{10}$ (from Table 5.1). Now the remainders for the two errors singly were 1111 and 0101, and adding them gives 1010. In other words, summing the remainders for the individual bit errors produces the remainder for the combined errors. The syndrome will, in fact, be the sum of the elements corresponding to all errors. Following this, is it possible to know that more than a single error has occurred? Where the message length is $2^n - 1$ bits, $d_{min}$ is 3 which means that we can only correct a 1-bit error. Further, in this case, the code is perfect which means that all syndromes decode to a solution. While the CRC is excellent for error detection, on its own this makes it unreliable as a means of error correction because, in the event of an error, we can have no idea how many bits were involved. However, messages will typically be truncated. For example, a 16-bit CRC is often used in communications, but the messages it protects are seldom 65535 bits. If the 4-bit CRC of the previous example had been used to create an 8-bit coded message, then a remainder of $\alpha^{10}$ would be a little suspicious since this indicates an error in a non-existent bit. Unfortunately, simply truncating a message like this won't guarantee that the remainders for two or more error bits will produce a syndrome that is distinguishable from a single-bit error. It is possible to increase $d_{min}$ to 4, however, by adding an overall parity bit or by using a process called expurgation, which is the trading of 1 data bit for one extra parity or check bit. This allows detection (although not correction) of even bit errors.

Table 5.5 Finding the remainder for two errors

GP: 11001        Quotient: 11101000

  100100100010     received message (errors at bits 9 and 6)
  11001
   10110
   11001
    11111
    11001
      11000
      11001
          1010     Remainder = 1010

5.3 Expurgated Codes

Expurgated codes were mentioned in the context of error detection using the cyclic redundancy check. In the previous example, it was shown that simply shortening the message size does not guarantee that two-bit errors will be distinguishable from a single-bit error even though the probability of this increases. Using a (7, 3) code, rather than a (15, 11) code, with $x^4 + x^3 + 1 = 0$, errors at bit positions 0 and 3 look like a single error at bit position 4. You can verify this by adding $\alpha^0 + \alpha^3$. Multiplying the generator polynomial by (x + 1), however, gives $x^5 + x^3 + x + 1 = 0$. This still generates a sequence of 15 non-zero elements, but produces a CRC of five bits rather than four. Table 5.6 below shows the 15 element sequence.


By trading one of the possible 11 data bits (k) of a (15, 11) code for the extra CRC bit (expurgation) the message length stays the same but the minimum Hamming distance between codes is increased by 1. The resulting (15, 10) code thus has a $d_{min}$ of 4 rather than 3. This means that, although we cannot correct two-bit errors (for which $d_{min}$ must be 5 or more), if two errors occur their remainder will never be mistaken for that of a single-bit error. You can check this result by examining Table 5.6. The sum of any two of the listed powers of $\alpha$ will never be another member of the sequence. In other words, the remainder produced by a two-bit error will result in a value not represented in the table. The underlying maths involved in expurgation is a little beyond the scope of this book, but it is a useful tool and costs little to implement.

Table 5.6 Cyclic codes for $(x^4 + x^3 + 1)(x + 1)$

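Power           Function                        Binary value   Decimal value
$\alpha^0$      $x^0$                           00001          1
$\alpha^1$      $x$                             00010          2
$\alpha^2$      $x^2$                           00100          4
$\alpha^3$      $x^3$                           01000          8
$\alpha^4$      $x^4$                           10000          16
$\alpha^5$      $x^3 + x + 1$                   01011          11
$\alpha^6$      $x^4 + x^2 + x$                 10110          22
$\alpha^7$      $x^2 + x + 1$                   00111          7
$\alpha^8$      $x^3 + x^2 + x$                 01110          14
$\alpha^9$      $x^4 + x^3 + x^2$               11100          28
$\alpha^{10}$   $x^4 + x + 1$                   10011          19
$\alpha^{11}$   $x^3 + x^2 + 1$                 01101          13
$\alpha^{12}$   $x^4 + x^3 + x$                 11010          26
$\alpha^{13}$   $x^4 + x^3 + x^2 + x + 1$       11111          31
$\alpha^{14}$   $x^4 + x^2 + 1$                 10101          21
$\alpha^{15}$   $1$                             00001          1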

In brief, taking any of the previous powers of $\alpha$ from Table 5.1 (i.e. those generated by the polynomial $x^4 + x^3 + 1$) and multiplying them by x + 1, creates values all of even weight, and none of which are present in Table 5.6, the new sequence. The corollary to this is that all the new powers of $\alpha$ have an odd weight (the other half of the 32 possible patterns). A little careful thought or a few trial examples reveals that, if two numbers of odd weight are added (modulo-2), the result always has an even weight. Extending this idea a little, it should be apparent that any error with an even weight will be detected since it will result in the addition of an even number of odd-weighted values from Table 5.6. Expurgation can only be applied where $d_{min}$ is initially odd, so we can't keep increasing $d_{min}$ by multiplying with x + 1.
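These weight arguments are easy to test by machine. The sketch below regenerates both sequences and checks the claims made above; the helper name `powers_of_alpha` and the variable names are illustrative assumptions.

```python
from itertools import combinations


def powers_of_alpha(gp: int) -> list[int]:
    """List alpha^0, alpha^1, ... until the sequence returns to 1."""
    degree = gp.bit_length() - 1
    values, value = [], 1
    while True:
        values.append(value)
        value <<= 1
        if value >> degree:
            value ^= gp
        if value == 1:
            return values


def weight(pattern: int) -> int:
    """Number of set bits in a pattern."""
    return bin(pattern).count("1")


plain = powers_of_alpha(0b11001)         # x^4 + x^3 + 1 (Table 5.1)
expurgated = powers_of_alpha(0b101011)   # x^5 + x^3 + x + 1 (Table 5.6)

# With the original GP, errors at bits 0 and 3 imitate a single error at bit 4.
print(plain.index(plain[0] ^ plain[3]))

# Every element of the new sequence has odd weight ...
print(all(weight(v) % 2 == 1 for v in expurgated))

# ... so the sum of any two is of even weight and cannot be in the sequence.
print(any((a ^ b) in expurgated for a, b in combinations(expurgated, 2)))

# Multiplying the old elements by (x + 1) gives only even-weight values.
print(all(weight(v ^ (v << 1)) % 2 == 0 for v in plain))
```

The third line printing False is the property that lets a two-bit error be told apart from any single-bit error.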

5.4 The Perfect Golay Code

Like Hamming codes and the CRC, the Golay (23, 12, 3) code is an example of a perfect code. The binary Golay code (there is also a ternary one) uses either of the following polynomials

$$x^{11} + x^{10} + x^6 + x^5 + x^4 + x^2 + 1$$

or

$$x^{11} + x^9 + x^7 + x^6 + x^5 + x + 1$$

which are mirror images of each other. Since the polynomials are 12 bits long, the remainder will be 11-bit, forming the added redundancy to the 12-bit message. Recalling the CRC, you will remember that an error pattern equal to the GP produces a zero remainder. In fact, in this event the corrupted message now looks exactly like another valid message. The polynomials above contain seven set bits. So to move from one valid message to another means that at least seven bits must be changed, i.e. the distance between any valid codeword is 7, giving t = 3. This is only true here because of the limited size of the original data (12 bits). Longer data words (over 12 bits) lose this property because the GP overlaps itself during the calculation of the remainder. Consider the following 12-bit message

011010000101

Using the first of the two GPs, the 11 redundant bits are calculated in just the same way that a CRC would be generated: the message, followed by eleven zeros, is divided (modulo-2) by the GP, 110001110101, and the remainder is kept.

01101000010100000000000 ÷ 110001110101 leaves remainder 10001111010
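The division above can be sketched in a few lines. This is an illustrative sketch only, assuming the first generator polynomial and integers as bit containers; `poly_mod` and `golay_encode` are hypothetical helper names.

```python
GOLAY_GP = 0b110001110101  # x^11 + x^10 + x^6 + x^5 + x^4 + x^2 + 1

def poly_mod(value, gen):
    """Remainder of value / gen, both GF(2) polynomials stored as integers."""
    gen_len = gen.bit_length()
    while value.bit_length() >= gen_len:
        value ^= gen << (value.bit_length() - gen_len)
    return value

def golay_encode(data12):
    """Append the 11-bit remainder to a 12-bit message, CRC style."""
    shifted = data12 << 11            # make room for the redundancy
    return shifted | poly_mod(shifted, GOLAY_GP)

codeword = golay_encode(0b011010000101)
print(format(codeword, "023b"))
print(poly_mod(codeword, GOLAY_GP))   # a valid codeword divides exactly
```

Dividing the finished codeword by the GP again gives zero, which is the condition the decoder relies on.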

The completed message is, therefore,

01101000010110001111010

Decoding will not be as straightforward as for single-bit error correction. Shortened codes are sometimes decoded by means of syndrome traps. The syndrome is cycled around the CRC checker until the syndrome (or parts of it) forms a recognisable pattern. The pattern can be decoded and, by noting how many clocks were required to cycle the syndrome into this state, the correct error positions can be found. A slightly more elegant, algebraic solution can also be used in some instances. The maths involved, while a little complex, forms quite a good example of how finite field algebra is used. You may wish to skip it for now and return at a later time, however.

First, we need to consider how the redundancy and, later, the syndrome are formed. If the bits in the Golay encoded message are denoted $d_{0-22}$, with $d_{0-10}$ being redundancy, initially set to zero, and $d_{11-22}$ data, then

$$r = \sum_{i=11}^{22} d_i \alpha^i$$

where powers of $\alpha$ are generated from the encoder polynomial. In this case, the generator polynomial is 12 bits long so the result of this computation will be the 11-bit redundancy, making the message $p = d + r$. At the receiver, the syndrome is found from

$$S = \sum_{i=0}^{22} p'_i \alpha^i = \sum_{i=0}^{10} r'_i \alpha^i + \sum_{i=11}^{22} d'_i \alpha^i$$

where $p' = d' + r'$ is the received message. In the event of no errors, S will be zero since we have already arranged for

$$\sum_{i=0}^{10} r_i \alpha^i = \sum_{i=11}^{22} d_i \alpha^i$$

because

$$\sum_{i=0}^{22} p_i \alpha^i = 0$$

If errors have occurred, say at bit positions i, j and k, then S will be nonzero and can be found from $S = \alpha^i + \alpha^j + \alpha^k$. This can be verified by considering Table 5.7, which lists the powers of $\alpha$.

Table 5.7 Powers of α using $x^{11} + x^{10} + x^6 + x^5 + x^4 + x^2 + x^0 = 0$

α^n    Bits    α^n     Bits
0      0       α^11    x^10 + x^6 + x^5 + x^4 + x^2 + x^0
α^0    1       α^12    x^10 + x^7 + x^4 + x^3 + x^2 + x^1 + x^0
α^1    x       α^13    x^10 + x^8 + x^6 + x^3 + x^1 + x^0
α^2    x^2     α^14    x^10 + x^9 + x^7 + x^6 + x^5 + x^1 + x^0
α^3    x^3     α^15    x^8 + x^7 + x^5 + x^4 + x^1 + x^0
α^4    x^4     α^16    x^9 + x^8 + x^6 + x^5 + x^2 + x^1
α^5    x^5     α^17    x^10 + x^9 + x^7 + x^6 + x^3 + x^2
α^6    x^6     α^18    x^8 + x^7 + x^6 + x^5 + x^3 + x^2 + x^0
α^7    x^7     α^19    x^9 + x^8 + x^7 + x^6 + x^4 + x^3 + x^1
α^8    x^8     α^20    x^10 + x^9 + x^8 + x^7 + x^5 + x^4 + x^2
α^9    x^9     α^21    x^9 + x^8 + x^4 + x^3 + x^2 + x^0
α^10   x^10    α^22    x^10 + x^9 + x^5 + x^4 + x^3 + x^1
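The entries of Table 5.7 follow from repeatedly multiplying by x and reducing with the generator polynomial. A small sketch of that process, assuming integer bit patterns and the illustrative helper names `alpha_powers` and `as_poly`:

```python
GOLAY_GP = 0b110001110101  # x^11 + x^10 + x^6 + x^5 + x^4 + x^2 + 1

def alpha_powers(gen, count):
    """Return [alpha^0, alpha^1, ...] as bit patterns, reducing with gen."""
    degree = gen.bit_length() - 1
    table, value = [], 1
    for _ in range(count):
        table.append(value)
        value <<= 1              # multiply by x
        if value >> degree:      # overflowed into x^degree: subtract gen
            value ^= gen
    return table

def as_poly(value):
    """Render a bit pattern in the x^n notation used by Table 5.7."""
    terms = [f"x^{i}" for i in range(value.bit_length() - 1, -1, -1)
             if (value >> i) & 1]
    return " + ".join(terms) or "0"

ALPHA = alpha_powers(GOLAY_GP, 24)
for n in range(11, 23):
    print(n, as_poly(ALPHA[n]))
```

The sequence repeats after 23 steps (`ALPHA[23]` is 1 again), which matches the 23-bit length of the code.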

The message calculated previously was

01101000010110001111010

Adding three errors at bit locations 3, 5 and 16, the message becomes

01101010010110001010010

and the syndrome is found by dividing the received message by the GP, as before:

01101010010110001010010 ÷ 110001110101 leaves remainder S = 01101001110
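As a cross-check of the syndrome, the sketch below computes it twice: once by dividing the received word, and once by adding the powers of $\alpha$ at the three error positions. It assumes the first Golay polynomial and the same illustrative `poly_mod` helper as before.

```python
GOLAY_GP = 0b110001110101  # x^11 + x^10 + x^6 + x^5 + x^4 + x^2 + 1

def poly_mod(value, gen):
    """Remainder of value / gen, both GF(2) polynomials stored as integers."""
    gen_len = gen.bit_length()
    while value.bit_length() >= gen_len:
        value ^= gen << (value.bit_length() - gen_len)
    return value

received = int("01101010010110001010010", 2)

# Route 1: long division of the whole received message.
s_division = poly_mod(received, GOLAY_GP)

# Route 2: sum (XOR) of alpha^3, alpha^5 and alpha^16 from Table 5.7.
s_powers = 0
for position in (3, 5, 16):
    s_powers ^= poly_mod(1 << position, GOLAY_GP)

print(format(s_division, "011b"), format(s_powers, "011b"))
```

Both routes agree because the valid codeword contributes nothing to the remainder; only the error bits do.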

Using Table 5.7 gives

$$S = \alpha^3 + \alpha^5 + \alpha^{16} = x^3 + x^5 + (x^9 + x^8 + x^6 + x^5 + x^2 + x)$$

so

$$S = x^9 + x^8 + x^6 + x^3 + x^2 + x$$

which is the correct answer. For reasons which lie in the domain of something called a cyclotomic coset (for which the curious reader should look in the bibliography) it is possible to calculate two further syndromes, in this example, giving three in total. If they are denoted $S_1$, $S_3$ and $S_9$, then

$$S_1 = \sum_{i=0}^{22} p'_i \alpha^{i}, \qquad S_3 = \sum_{i=0}^{22} p'_i \alpha^{3i}, \qquad S_9 = \sum_{i=0}^{22} p'_i \alpha^{9i}$$

$S_1$ is the same as has just been calculated, but $S_3$ and $S_9$ are computed using different powers of $\alpha$. In practice (for a decoder) this means either calculating the appropriate shift register feedback to give multiplications by $\alpha^3$ and $\alpha^9$ instead of $\alpha$ in the CRC checker circuit (see later), or padding the data with zeros. For example, to find $S_3$, the same circuit as used for $S_1$ will do, but the data bits must have two zeros inserted between them. The message thus becomes

0001001000001000001000000001000001001000000000001000001000000001000

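with the original bits in every third position. This generates a syndrome $S_3 = \alpha^{3i} + \alpha^{3j} + \alpha^{3k}$. The third syndrome, $S_9$, is calculated similarly, by padding the bits with eight zeros to give $S_9 = \alpha^{9i} + \alpha^{9j} + \alpha^{9k}$. In the worst case, there are now three equations and three unknowns, permitting algebraic evaluation of the errors.

$$S_1 = \alpha^{i} + \alpha^{j} + \alpha^{k} \qquad (5.1)$$

$$S_3 = \alpha^{3i} + \alpha^{3j} + \alpha^{3k} \qquad (5.2)$$

$$S_9 = \alpha^{9i} + \alpha^{9j} + \alpha^{9k} \qquad (5.3)$$

Rearranging (5.1) gives

$$\alpha^{i} = S_1 + \alpha^{j} + \alpha^{k}$$

Substituting into (5.2) and (5.3) leaves

$$S_3 = S_1^3 + S_1(\alpha^{2j} + \alpha^{2k}) + \alpha^{j}(S_1^2 + \alpha^{2k}) + \alpha^{k}(S_1^2 + \alpha^{2j})$$

$$S_9 = S_1^9 + S_1(\alpha^{8j} + \alpha^{8k}) + \alpha^{j}(S_1^8 + \alpha^{8k}) + \alpha^{k}(S_1^8 + \alpha^{8j})$$

Rearranging

$$\alpha^{2j}(S_1 + \alpha^{k}) + \alpha^{j}(S_1^2 + \alpha^{2k}) + S_1^3 + S_3 + S_1\alpha^{2k} + S_1^2\alpha^{k} = 0$$

$$\alpha^{8j}(S_1 + \alpha^{k}) + \alpha^{j}(S_1^8 + \alpha^{8k}) + S_1^9 + S_9 + S_1\alpha^{8k} + S_1^8\alpha^{k} = 0$$

and making the substitution $P = S_1 + \alpha^{k}$, $Q = S_1^3 + S_3 + S_1\alpha^{k}P$ and $R = S_1^9 + S_9 + S_1\alpha^{8k} + S_1^8\alpha^{k}$ leaves

$$\alpha^{2j}P + \alpha^{j}P^2 + Q = 0 \qquad (5.4)$$

$$\alpha^{8j}P + \alpha^{j}P^8 + R = 0 \qquad (5.5)$$

$(5.4)^4$ gives

$$\alpha^{8j}P^4 + \alpha^{4j}P^8 + Q^4 = 0$$

Substituting $(5.4)^4$ and (5.4) into (5.5) to remove $\alpha^{8j}$ and $\alpha^{j}$ leaves

$$\alpha^{4j}P^5 + \alpha^{2j}P^7 + \frac{Q^4}{P^3} + QP^6 + R = 0 \qquad (5.6)$$

Let (5.4) have two solutions a and b such that

$$(\alpha^{j} + \alpha^{a})(\alpha^{j} + \alpha^{b}) = \alpha^{2j} + \alpha^{j}(\alpha^{a} + \alpha^{b}) + \alpha^{a+b} = 0$$

from which $\alpha^{a+b} = Q/P$ and $\alpha^{a} + \alpha^{b} = P$. Repeating with (5.6) gives

$$\alpha^{2(a+b)} = \frac{Q^2}{P^2} = \frac{Q^4}{P^8} + QP + \frac{R}{P^5}$$

Multiplying throughout by $P^8$ gives

$$Q^4 + Q^2P^6 + QP^9 + RP^3 = 0 \qquad (5.7)$$

This expression is a polynomial in $\alpha^{k}$ and by cycling k through 0 to 22 we can see where the condition is satisfied. Expanding (5.7) gives

$$\alpha^{9k}(S_1^3 + S_3) + \alpha^{8k}(S_1^4 + S_1S_3) + \alpha^{6k}(S_1^6 + S_3^2) + \alpha^{4k}(S_1^8 + S_1^2S_3^2) + \alpha^{3k}(S_1^9 + S_9) + \alpha^{2k}(S_1S_9 + S_1^4S_3^2) + \alpha^{k}(S_1^8S_3 + S_1^2S_9) + S_1^9S_3 + S_1^3S_9 + S_1^6S_3^2 + S_3^4 = 0 \qquad (5.8)$$

Using the previous example with errors at bits 3, 5 and 16 gives the following syndromes

$$S_1 = x^9 + x^8 + x^6 + x^3 + x^2 + x^1$$

$$S_3 = x^9 + x^8 + x^7 + x^5 + x^4 + x^2 + x^1 + x^0$$

$$S_9 = x^{10} + x^9 + x^6 + x^5 + x^3 + x^1$$

Substituting into (5.8) gives

$$\alpha^{9k}(x^{10} + x^8 + x^6 + x^5 + x^4) + \alpha^{8k}(x^{10} + x^9 + x^7 + x^5 + x^3 + x^1) + \alpha^{6k}(x^{10} + x^8 + x^6 + x^3 + x^2 + x^0) + \alpha^{4k}(x^{10} + x^7 + x^6 + x^5 + x^4 + x^3 + x^2 + x^1) + \alpha^{3k}(x^{10} + x^9 + x^8 + x^7 + x^5 + x^4 + x^1 + x^0) + \alpha^{2k}(x^{10} + x^9 + x^5 + x^4 + x^2) + \alpha^{k}(x^9 + x^8 + x^7 + x^6 + x^4 + x^2 + x^1 + x^0) + (x^{10} + x^8 + x^6 + x^3) = 0.$$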

Table 5.8 shows the results of cycling k through 0 to 22 in the above polynomial, which is satisfied at k = 3, 5 and 16, the error locations.

Table 5.8 Searching for the error locations

k    Result   = 0?     k    Result   = 0?
0    1393     ✗        12   596      ✗
1    1896     ✗        13   816      ✗
2    486      ✗        14   1463     ✗
3    0        ✓        15   998      ✗
4    543      ✗        16   0        ✓
5    0        ✓        17   768      ✗
6    381      ✗        18   834      ✗
7    372      ✗        19   1989     ✗
8    2012     ✗        20   758      ✗
9    1383     ✗        21   379      ✗
10   704      ✗        22   1944     ✗
11   125      ✗
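The search summarised in Table 5.8 can be sketched as follows. This is not the Appendix B program; it is a minimal illustration that assumes arithmetic in the field defined by the first Golay polynomial, with field elements held as 11-bit integers. The function names (`gf_mul`, `gf_pow`, `alpha`, `syndromes`, `locator`) are hypothetical.

```python
GOLAY_GP = 0b110001110101  # x^11 + x^10 + x^6 + x^5 + x^4 + x^2 + 1

def gf_mul(a, b):
    """Multiply two 11-bit field elements, reducing with the Golay polynomial."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0x800:          # degree reached 11: subtract the polynomial
            a ^= GOLAY_GP
    return result

def gf_pow(a, n):
    result = 1
    for _ in range(n):
        result = gf_mul(result, a)
    return result

def alpha(n):
    """alpha^n, where alpha is the element x; the powers repeat every 23."""
    return gf_pow(2, n % 23)

def syndromes(word):
    """S1, S3 and S9 of a 23-bit received word (bit 0 is the last bit sent)."""
    s1 = s3 = s9 = 0
    for i in range(23):
        if (word >> i) & 1:
            s1 ^= alpha(i)
            s3 ^= alpha(3 * i)
            s9 ^= alpha(9 * i)
    return s1, s3, s9

def locator(k, s1, s3, s9):
    """Left-hand side of (5.8) evaluated at alpha^k."""
    m, p = gf_mul, gf_pow
    coeffs = {
        9: p(s1, 3) ^ s3,
        8: p(s1, 4) ^ m(s1, s3),
        6: p(s1, 6) ^ p(s3, 2),
        4: p(s1, 8) ^ m(p(s1, 2), p(s3, 2)),
        3: p(s1, 9) ^ s9,
        2: m(s1, s9) ^ m(p(s1, 4), p(s3, 2)),
        1: m(p(s1, 8), s3) ^ m(p(s1, 2), s9),
        0: m(p(s1, 9), s3) ^ m(p(s1, 3), s9) ^ m(p(s1, 6), p(s3, 2)) ^ p(s3, 4),
    }
    total = 0
    for power, coeff in coeffs.items():
        total ^= m(coeff, p(alpha(k), power))
    return total

received = int("01101010010110001010010", 2)
s1, s3, s9 = syndromes(received)
roots = [k for k in range(23) if locator(k, s1, s3, s9) == 0]
print(roots)
```

Flipping the bits at the reported positions restores the transmitted codeword. A practical decoder would replace the repeated `gf_pow` calls with log/antilog tables, but the brute-force form keeps the link to (5.8) visible.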

What if there are fewer than three bit-errors? The polynomial gives the correct answer for two- and three-bit errors, but returns all zeros for a single-bit error. Decoding for a single-bit error is obviously trivial since $S_1$ reveals directly where the error is. In this case, the remainder in the CRC checker circuit for $S_1$ can be clocked until the result is 1, the clock count identifying the error location. Appendix B lists an example program which performs error location over the Golay (23, 12) code. Because the code is perfect, it is not possible to know that its capacity has been exceeded. It is usual, therefore, to extend the code to a (24, 12) code by adding an overall parity bit. In this case, if the solution to the Golay code indicates an odd number of errors but the overall parity is correct, then the message is unrecoverable (or we have to resort to probability-based decoding), and vice versa.


5.5 Fire Codes

Fire codes, like the Golay code, have been included in this chapter because they represent an extension of the operation of the CRC. The Golay code introduces the idea of multiple-bit error correction while Fire codes introduce techniques for speeding up decoding. In communications and storage systems, much emphasis is placed on high speed and low latency. If an error coding scheme is used, decoding time may be critical. This has led to an interesting class of codes which attempts to speed up the error correcting process. One example is the Fire code, designed to correct a single burst error of length up to $l$ in a message of variable size. The generator polynomial is based upon two factors as

$$g(x) = (x^{2l-1} + 1)\,p(x)$$

and p(x) is a primitive polynomial whose most significant power of x (its degree) is greater than or equal to $l$. p(x) must not be a factor of $(x^{2l-1} + 1)$. For example, $l = 3$ and $p(x) = x^3 + x + 1$ gives $g(x) = x^8 + x^6 + x^5 + x^3 + x^1 + 1$. This produces eight bits of redundancy in the message which can be calculated serially in a similar manner to the CRC. Decoding, however, is effected in a slightly different way. Recalling the CRC, the shift register circuit divides the message by the generator polynomial, replacing the long-hand divisions that were used initially. In this case, because the generator has at least two factors, rather than constructing a single divider circuit based upon g(x), the division can be split into two parts. The encoded message will be divisible by both p(x) and $(x^{2l-1} + 1)$ and two syndromes can be created from these two independent divisions. Figure 5.2 shows the decoder circuit for this example.

Figure 5.2 Fire code syndrome generator.

The upper half of the circuit divides the message by $x^5 + 1$, while the lower half divides it by $x^3 + x + 1$. The two syndromes are denoted $S_0$ and $S_1$ in this example. It may, or it may not, be obvious from the construction of the upper divider that it will contain the bit pattern of the burst error ($S_0$) after the message has been fully entered. If it isn't, hopefully an example will make this clearer. Continuing with $g(x) = x^8 + x^6 + x^5 + x^3 + x^1 + 1$, consider the message 110010111001. To encode it, we need to add eight zeros, and then divide by g(x):

11001011100100000000 ÷ 101101011 leaves remainder 01011101

so the encoded message is 11001011100101011101. With no errors, we should expect both $S_0$ and $S_1$ to be zero. Try this long-hand for yourself. Suppose now that the error $x^5 + x^3$ occurs such that the received message is 11001011100101110101. Now, $S_0$ = 01001 since dividing the message into groups of five bits gives

11001 ⊕ 01110 ⊕ 01011 ⊕ 10101 = 01001.

$S_0$ represents the error pattern coincident within the five-bit framework. We know the relative positions of the errors but not their location. Recalling the CRC, $S_1$ provides the sum of the errors in terms of powers of $\alpha$. In this case $S_1$ is 100, or $\alpha^2$ from $\alpha^5 + \alpha^3$. Using $S_0$ we are able to generate a set of possible errors which satisfy the maximum burst error length constraint as follows:

$$S_1 = \alpha^{5} + \alpha^{3} = \alpha^{2}$$

$$S_1 = \alpha^{10} + \alpha^{8} = \alpha^{3} + \alpha^{1} = \alpha^{0}$$

$$S_1 = \alpha^{15} + \alpha^{13} = \alpha^{1} + \alpha^{6} = \alpha^{5}$$

$$S_1 = \alpha^{20} + \alpha^{18} = \alpha^{6} + \alpha^{4} = \alpha^{3}$$

Clearly, only one of these possible error patterns satisfies $S_1$. The problem can be expressed algebraically but $S_0$ must be considered first. The two set bits in $S_0$ are more than $l$ (3) apart. In order that the burst error length constraint is not violated, we must assume that the error pattern is 1 01000, rather than 01001. From this, (5.9) follows

$$S_1 = \alpha^{5i+3} + \alpha^{5i+5} \qquad (5.9)$$

where i is the index of the 5-bit grouping into which the first error bit falls, 0 being the index of the right-most group.

i    α^(5i+3) + α^(5i+5)    True?
0    α^2                    ✓
1    α^0                    ✗
2    α^5                    ✗
3    α^3                    ✗

Replacing $S_1$ with $\alpha^2$ gives

$$\alpha^{2} = \alpha^{5i}(\alpha^{3} + \alpha^{5}) = \alpha^{5i}\alpha^{2}$$

so i = 0. Where large bursts are to be corrected and, therefore, the degree of p(x) is also large, a cyclic error trapping solution based on g(x) could be very slow. Tackling the problem in this way leads to a speed-up approaching $2l - 1$, after the syndromes have been evaluated. Notice that $S_0$ reveals the error's magnitude while $S_1$ reveals its position. The message length can be up to the lowest common multiple of $2l - 1$ and the period of, or number of non-zero elements in, p(x), in this case, 5 × 7 = 35 bits.
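The whole Fire code example above can be replayed in a short sketch. It assumes $l = 3$, $p(x) = x^3 + x + 1$ and integers as bit containers; the helper names are illustrative, and the burst offsets (3 and 5) are taken directly from the worked example rather than derived generally from $S_0$.

```python
P_X = 0b1011        # p(x) = x^3 + x + 1
BURST = 0b100001    # x^(2l-1) + 1 = x^5 + 1 for l = 3
G_X = 0b101101011   # g(x) = (x^5 + 1)(x^3 + x + 1)

def poly_mod(value, gen):
    """Remainder of value / gen, both GF(2) polynomials stored as integers."""
    gen_len = gen.bit_length()
    while value.bit_length() >= gen_len:
        value ^= gen << (value.bit_length() - gen_len)
    return value

def fire_encode(data):
    shifted = data << 8                       # append eight zeros
    return shifted | poly_mod(shifted, G_X)

def alpha(n):
    """alpha^n in the field generated by p(x)."""
    return poly_mod(1 << n, P_X)

codeword = fire_encode(0b110010111001)
received = codeword ^ 0b101000                # inject the error x^5 + x^3

s0 = poly_mod(received, BURST)                # burst pattern, folded into 5 bits
s1 = poly_mod(received, P_X)                  # sum of error positions as powers of alpha

# S0 = 01001 is read as the burst 1 01000, i.e. errors at 5i+3 and 5i+5.
group = next(i for i in range(4) if alpha(5 * i + 3) ^ alpha(5 * i + 5) == s1)

print(format(codeword, "020b"), format(s0, "05b"), format(s1, "03b"), group)
```

Only four candidate groups need testing here, rather than clocking a syndrome through every bit position, which is the speed-up the text describes.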

Chapter 6 REED-MULLER CODES

Reed-Muller (RM) codes are a class of low-rate codes that have been largely replaced by Reed-Solomon codes. However, they have certain attributes which may well see their re-emergence in modern applications. In particular, coding and decoding is very fast. With Hamming codes (and the CRC used for correction), the message size was n bits where $n = 2^r - 1$ and r was the number of redundant bits. Reed-Muller codes are based on maximal length codes which are, in some ways, the opposite of Hamming codes. Taking a typical Hamming code, say (15, 11), with four redundancy bits, the corresponding maximal length code would be a (15, 4) code so now, $n = 2^k - 1$ and k is the number of information bits. This gives a very low coding rate of about 0.27.

6.1 Constructing a Generator Matrix for RM Codes

These codes are referred to as the dual codes of Hamming codes and share some of their encoding and decoding properties. The Hamming code generator matrix can, for example, be used as the parity check matrix for these codes. To construct a maximal sequence code, a generator g(x) is created from

A. Houghton, Error Coding for Engineers © Kluwer Academic Publishers 2001

$$g(x) = \frac{x^n + 1}{p(x)}$$

where p(x) is a primitive polynomial whose most significant bit is $2^k$. Technically, the polynomial is of degree k. Continuing with the (15, 4) code and using $p(x) = x^4 + x + 1$, g(x) is found as follows:

1000000000000001 ÷ 10011 gives quotient 100110101111 with no remainder

giving g(x) = 100110101111. For these codes, $d_{min}$ is $2^{k-1}$, the number of 1s in g(x), and the codewords are created by placing g(x) in all the n positions that it can occupy in the n-bit message (wrapping around the end of the message). The all-zero message is also a codeword and, not surprisingly, all non-zero messages are $d_{min}$ from the all-zero codeword. Table 6.1 lists the 16 codewords for this example. Clearly, a very simple decoding strategy can be used to decode these, compatible too with soft decisions. In digital signal processing terms, the generator g(x) can be correlated with the incoming message. Reducing this to XNORing and finding the weight of the result, the correlation will be seven in all locations except the correct one, where it will be fifteen. Errors simply reduce the differential between the correlation peak and the surrounding floor.
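The construction just described (divide $x^{15} + 1$ by p(x), then place g(x) in every cyclic position) can be sketched as below. The helper names `poly_div` and `rotate_right` are illustrative assumptions, and the message numbering (0 for all zeros, then g(x) rotated right by 0 to 14 places) is an assumed convention for the sketch.

```python
def poly_div(dividend, divisor):
    """Quotient and remainder of a GF(2) polynomial division."""
    quotient = 0
    div_len = divisor.bit_length()
    while dividend.bit_length() >= div_len:
        shift = dividend.bit_length() - div_len
        quotient |= 1 << shift
        dividend ^= divisor << shift
    return quotient, dividend

N = 15
g, remainder = poly_div((1 << N) | 1, 0b10011)   # (x^15 + 1) / (x^4 + x + 1)

def rotate_right(word, shift, n=N):
    """Cyclic right shift of an n-bit word."""
    shift %= n
    return ((word >> shift) | (word << (n - shift))) & ((1 << n) - 1)

base = g << 3                                    # g(x) in the left-most position
codewords = [0] + [rotate_right(base, s) for s in range(N)]

for index, word in enumerate(codewords):
    print(index, format(word, "015b"))
```

Every non-zero codeword produced this way has weight 8, in line with $d_{min} = 2^{k-1}$ for k = 4.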

Table 6.1 Maximal length (15, 4) code

Message   Pattern
0         0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1         1 0 0 1 1 0 1 0 1 1 1 1 0 0 0
2         0 1 0 0 1 1 0 1 0 1 1 1 1 0 0
3         0 0 1 0 0 1 1 0 1 0 1 1 1 1 0
4         0 0 0 1 0 0 1 1 0 1 0 1 1 1 1
5         1 0 0 0 1 0 0 1 1 0 1 0 1 1 1
6         1 1 0 0 0 1 0 0 1 1 0 1 0 1 1
7         1 1 1 0 0 0 1 0 0 1 1 0 1 0 1
8         1 1 1 1 0 0 0 1 0 0 1 1 0 1 0
9         0 1 1 1 1 0 0 0 1 0 0 1 1 0 1
10        1 0 1 1 1 1 0 0 0 1 0 0 1 1 0
11        0 1 0 1 1 1 1 0 0 0 1 0 0 1 1
12        1 0 1 0 1 1 1 1 0 0 0 1 0 0 1
13        1 1 0 1 0 1 1 1 1 0 0 0 1 0 0
14        0 1 1 0 1 0 1 1 1 1 0 0 0 1 0
15        0 0 1 1 0 1 0 1 1 1 1 0 0 0 1

Take, for example, message 9. If it is corrupted with errors to 011100011001101, the decoding attempt results in the following correlation

Index         0   1   2   3   4   5   6   7   8   9   10  11  12  13  14  15
Correlation   7   5   7   7   9   7   7   7   9   13  5   5   7   9   7   9
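The correlation above amounts to counting agreeing bits (XNOR, then weight) against each of the 16 codewords. A minimal sketch, assuming the same codeword ordering as Table 6.1 and illustrative helper names:

```python
N = 15
BASE = 0b100110101111000       # g(x) in the left-most position

def rotate_right(word, shift, n=N):
    """Cyclic right shift of an n-bit word."""
    shift %= n
    return ((word >> shift) | (word << (n - shift))) & ((1 << n) - 1)

CODEWORDS = [0] + [rotate_right(BASE, s) for s in range(N)]

def correlate(received):
    """Number of agreeing bit positions against every codeword."""
    mask = (1 << N) - 1
    return [bin(~(received ^ c) & mask).count("1") for c in CODEWORDS]

scores = correlate(0b011100011001101)     # message 9 with two bit errors
best = max(range(16), key=scores.__getitem__)
print(scores, best)
```

The peak stands clear of the floor by at least four, so the two errors do not disturb the decision. For soft decisions the XNOR count would be replaced by a sum of signed sample values.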

All correlations are either unchanged, or have changed by two, corresponding to the two errors. Extending this to include soft codes is trivial. By adding an extra zero to each codeword the (15, 4) code, above, becomes a (16, 4) code. This does not change $d_{min}$, but it does give the codes a very useful property called orthogonality. If the bits were thought of as bipolar, being ±1 rather than 0 or 1, then repeating the above correlation on these orthogonal codes results (in the absence of errors) in zeros for all but the correct code. This is rather the same as multiplying a sine and cosine wave together. Over a complete cycle the sum is zero because they are orthogonal. This property is important in modulation schemes. Reed-Muller codes, also known as bi-orthogonal codes, make a further modification to this scheme. For ease of computation, consider a smaller (7, 3) maximal length code, using $x^3 + x + 1 = 0$. This gives rise to the generator g(x) = 10111. Like Hamming codes, these codes are linear, so it is possible to define a subset of codes from which all others can be formed as in G, below

    [ 1 0 1 1 1 0 0 ]
G = [ 0 1 0 1 1 1 0 ]
    [ 0 0 1 0 1 1 1 ]

This is g(x) in the left-most position (top row), shifted right by one bit (middle) and two bits (bottom). Combining these will produce eight unique codewords. To convert G into an orthogonal code generator, an all-zeros column is added giving

[ 0 1 0 1 1 1 0 0 ]
[ 0 0 1 0 1 1 1 0 ]
[ 0 0 0 1 0 1 1 1 ]

Finally, to create a bi-orthogonal code an all-ones row is added. This allows an extra bit to be encoded into the message and does not reduce $d_{min}$. So G becomes

       [ 1 1 1 1 1 1 1 1 ]
G_RM = [ 0 1 0 1 1 1 0 0 ]
       [ 0 0 1 0 1 1 1 0 ]
       [ 0 0 0 1 0 1 1 1 ]

So the code was transformed through (7, 3) → (8, 3) → (8, 4). Using G_RM, the message 0101 would encode to

[ 0 1 0 1 ] × G_RM
  = 0.1 + 1.0 + 0.0 + 1.0,  0.1 + 1.1 + 0.0 + 1.0,  0.1 + 1.0 + 0.1 + 1.0,  0.1 + 1.1 + 0.0 + 1.1,
    0.1 + 1.1 + 0.1 + 1.0,  0.1 + 1.1 + 0.1 + 1.1,  0.1 + 1.0 + 0.1 + 1.1,  0.1 + 1.0 + 0.0 + 1.1
  = 0 1 0 0 1 0 1 1
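The matrix product for the message 0101 can be reproduced directly. The sketch below assumes the $G_{RM}$ rows given above and uses an illustrative `encode` helper; it also enumerates all sixteen codewords to confirm the minimum distance.

```python
from itertools import product

G_RM = [
    [1, 1, 1, 1, 1, 1, 1, 1],
    [0, 1, 0, 1, 1, 1, 0, 0],
    [0, 0, 1, 0, 1, 1, 1, 0],
    [0, 0, 0, 1, 0, 1, 1, 1],
]

def encode(message_bits, generator):
    """Modulo-2 product of a message row vector and a generator matrix."""
    return [sum(m * g for m, g in zip(message_bits, column)) % 2
            for column in zip(*generator)]

codeword = encode([0, 1, 0, 1], G_RM)
print(codeword)

# Minimum distance of a linear code = smallest non-zero codeword weight.
weights = [sum(encode(list(bits), G_RM)) for bits in product([0, 1], repeat=4)]
d_min = min(w for w in weights if w)
print(d_min)
```

Adding the all-ones row doubles the number of codewords without lowering the minimum weight, which is why the extra data bit comes for free.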

This is actually a first-order Reed-Muller code. The codes can be extended into higher orders. Higher order codes have a higher coding rate, but at the expense of a decreasing $d_{min}$. To increase the order, more rows are added into $G_{RM}$, derived from existing rows. Examination of $G_{RM}$ reveals that all columns are different and express a value between 8 and 15 inclusive, with the most significant bit on the top row. The ordering of the columns determines only the order in which coded bits appear in the final codeword. Provided the encoder and decoder are agreed on this ordering, $G_{RM}$ can be changed to the following binary sequence, 8, 9, A, B, C, D, E, F. Encoding is now possible using a simple binary counter and a few gates as in Figure 6.1. Here, the three registers are configured to produce a binary sequence, starting at zero for each new 4-bit data word.

        [ 1 1 1 1 1 1 1 1 ]
G'_RM = [ 0 0 0 0 1 1 1 1 ]
        [ 0 0 1 1 0 0 1 1 ]
        [ 0 1 0 1 0 1 0 1 ]

Figure 6.1 First-order Reed-Muller encoder.

Decoding is generally performed by the Green machine, a technique developed for use with the Mariner class space missions. In mathematical terms, this means multiplying the received codeword with something called a Hadamard matrix in order to perform the Hadamard transform. Recalling orthogonal codes, upon which Reed-Muller codes are built, the correlation between any valid codeword and any other is zero, while the autocorrelation of a codeword is $2^k$ (for k data bits). The extension that Reed-Muller codes make to the orthogonal code is to add an extra data bit which, if set, inverts the codeword. This doubles the number of codewords, but the new codewords are simply inverted copies of the existing orthogonal codes. Rather than double the number of correlations performed, decoding may be performed on the same basis as for orthogonal codes. For the (8, 4) code a correlation peak of 8 means the extra bit is zero, while a peak of -8 means that the extra bit is 1. A decoder matrix for $G'_{RM}$ above can be constructed by generating the eight codes with the most significant data bit being zero. All zeros are changed to minus ones and the eight codes are arranged into eight columns. This gives the matrix H. In H, -1 is simply denoted - in order to keep its appearance simple.

    [ - - - - - - - - ]
    [ - 1 - 1 - 1 - 1 ]
    [ - - 1 1 - - 1 1 ]
H = [ - 1 1 - - 1 1 - ]
    [ - - - - 1 1 1 1 ]
    [ - 1 - 1 1 - 1 - ]
    [ - - 1 1 1 1 - - ]
    [ - 1 1 - 1 - - 1 ]

Consider the Reed-Muller code for 1011, using $G'_{RM}$ as the generator. This gives the codeword 10011001.

[ 1 - - 1 1 - - 1 ] × H = [ 0 0 0 -8 0 0 0 0 ]
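The decoder matrix and the correlation it performs can be sketched as follows. This is an illustration under the assumption that H is built exactly as described (the eight codes with the top data bit zero, in bipolar form, used as columns); `encode` and `green_machine` are hypothetical names, and the threshold of 3.5 for the 0 to 7 soft range follows the worked example in this section.

```python
G_PRIME = [
    [1, 1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 1, 1, 1, 1],
    [0, 0, 1, 1, 0, 0, 1, 1],
    [0, 1, 0, 1, 0, 1, 0, 1],
]

def encode(bits):
    """Modulo-2 product of a 4-bit message and G'_RM."""
    return [sum(m * g for m, g in zip(bits, col)) % 2 for col in zip(*G_PRIME)]

# Column c of H is the bipolar code for data 0abc, where abc is c in binary.
CODES = [[2 * b - 1 for b in encode([0, (c >> 2) & 1, (c >> 1) & 1, c & 1])]
         for c in range(8)]

def green_machine(bipolar):
    """Correlate with every column of H (ordinary arithmetic, not modulo-2)."""
    corr = [sum(v * h for v, h in zip(bipolar, code)) for code in CODES]
    column = max(range(8), key=lambda c: abs(corr[c]))
    top_bit = 1 if corr[column] < 0 else 0     # negative peak: inverted code
    return corr, (top_bit << 3) | column

clean = [2 * b - 1 for b in encode([1, 0, 1, 1])]
print(green_machine(clean))

soft_in = [6, 5, 4, 7, 7, 2, 0, 3]             # received levels, range 0 to 7
soft = [v - 3.5 for v in soft_in]
hard = [1 if v > 3.5 else -1 for v in soft_in]
print(green_machine(soft))
print(green_machine(hard))
```

With the soft values the peak still falls in column four, whereas slicing to hard decisions first moves the peak to the wrong column.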

In order to decode this, it is converted to its bipolar form 1 -1 -1 1 1 -1 -1 1 and multiplied by H which gives 0, 0, 0, -8, 0, 0, 0, 0, so the solution is column four, or 011; the first column represents 000. Since the result is negative, the most significant bit is set, i.e. 1011. Unlike encoding, summation of the matrix multiply is not performed modulo-2. The reason for this is that decoding uses a correlation process. Decoding can be enhanced by using soft decisions where data are input to the decoder typically in the range 0 to 7, with zero being a good 0 and seven being a good 1. Using the same codeword, and introducing three errors, suppose we receive

6 5 4 7 7 2 0 3

(in the range 0 to 7). In bipolar form (subtracting 3½) this becomes

2.5, 1.5, 0.5, 3.5, 3.5, -1.5, -3.5, -0.5

The result of multiplying the matrix and soft codeword gives -6, 0, -6, -12, -10, 4, 6, 4. Again, column four (011) has the greatest magnitude and its sign indicates that the most significant bit is 1, giving 1011. Using hard decisions, the result would have been -2, -2, -2, -2, -6, 2, 2, 2, giving a decoder failure with output 1100.

To extend a first-order Reed-Muller code to a second order requires ANDing all combinations of row pairs, excluding the all-ones row. This increases the coding rate at the expense of $d_{min}$. The generator matrix for a first-order (8, 4) Reed-Muller code is shown in Table 6.2, below, and extended into a second-order (8, 7) code.

Table 6.2 Extending a Reed-Muller code to its second order

Row

3     1 1 1 1 1 1 1 1
2     0 0 0 0 1 1 1 1
1     0 0 1 1 0 0 1 1
0     0 1 0 1 0 1 0 1
0.1   0 0 0 1 0 0 0 1
0.2   0 0 0 0 0 1 0 1
1.2   0 0 0 0 0 0 1 1
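The extra rows of Table 6.2 are simply bitwise ANDs of row pairs. A short sketch, assuming rows are held as 8-bit integers and using the row labels from the table; `min_distance` is an illustrative brute-force helper, practical only for small codes like this one.

```python
from itertools import combinations, product

FIRST_ORDER = {
    "3": 0b11111111,
    "2": 0b00001111,
    "1": 0b00110011,
    "0": 0b01010101,
}

# Second order: AND every pair of rows, excluding the all-ones row "3".
rows = dict(FIRST_ORDER)
for a, b in combinations(["0", "1", "2"], 2):
    rows[f"{a}.{b}"] = FIRST_ORDER[a] & FIRST_ORDER[b]

def min_distance(generator_rows):
    """Smallest non-zero codeword weight, and the number of codewords."""
    words = set()
    for picks in product([0, 1], repeat=len(generator_rows)):
        word = 0
        for pick, row in zip(picks, generator_rows):
            if pick:
                word ^= row
        words.add(word)
    return min(bin(w).count("1") for w in words if w), len(words)

for label, row in rows.items():
    print(label, format(row, "08b"))
print(min_distance(list(FIRST_ORDER.values())))
print(min_distance(list(rows.values())))
```

The seven rows generate all 128 even-weight words of length 8, so the second-order code is a plain parity check.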

Notice that this is simply a parity check, with $d_{min}$ now only 2. Each increase in order reduces $d_{min}$ by a factor of 2. A (16, 5) code, with $d_{min}$ 8, extended in this way would become a (16, 11) code, with $d_{min}$ 4. Even so, this has a higher code rate than the (8, 4) code for equal $d_{min}$. Third-order codes are created by combining groups of three rows as well as the pairs of the second order, and so forth. Again, this is at the expense of $d_{min}$. Reed-Muller codes are often specified as $\Re(r, m)$, where r sets the order of the code and $2^m$ is the number of message bits. The second-order system, above, is therefore an $\Re(2, 3)$ code while a first-order (16, 5) code is defined as $\Re(1, 4)$. Decoding of Reed-Muller codes may be performed by majority logic circuits which calculate and combine various parity checks made on the incoming message. However, these get more complicated as the order increases. The nature of the codes means that analogue decoding is possible, operating directly upon the incoming signal. This is like soft-decision decoding, operating at the maximum SNR of the signal, and permits some quite interesting signal processing.

6.2 Encoding With the Hadamard Matrix

It is possible to increase the coding rate of orthogonal codes by exploiting dynamic range, where the modulation or storage scheme has an analogue capability. The Hadamard matrix, above, comprises eight 8-bit patterns which are bi-orthogonal. Because the correlation between any two different codes is always zero, they can be combined and subsequently separated. For example, taking the code for 001 from the second row (-1, 1, -1, 1, -1, 1, -1, 1) and adding to it the code for 010 from the third row (-1, -1, 1, 1, -1, -1, 1, 1) gives -2, 0, 0, 2, -2, 0, 0, 2. Multiplying this combined codeword by the Hadamard matrix gives

0, 8, 8, 0, 0, 0, 0, 0.

The result is 8 for the two codes summed together and zero elsewhere. Naturally, all eight codes could be combined and then separated in this way. If we have an 8-bit codeword, then each bit can be associated with one of the eight codes in the Hadamard matrix. Consider the message 01001101 (or -1, 1, -1, -1, 1, 1, -1, 1). Multiplying this by the encoding matrix (non-modulo-2) gives

[ -1 1 -1 -1 1 1 -1 1 ] × H = [ 0 4 -4 0 4 0 0 4 ]

Decoding is simply a repeat of this process.

[ 0 4 -4 0 4 0 0 4 ] × H = [ -8 +8 -8 -8 +8 +8 -8 +8 ]

The magnitudes of the results reveal the width of the operation while their sign gives the original value. Suppose that the analogue values in the coded signal get modified by noise to 1, 4, -3, 0, 3, 0, 1, 5. Repeating the decoding process gives

[ 1 4 -3 0 3 0 1 5 ] × H = [ -11 +7 -5 -7 +7 +5 -11 +7 ]
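The encode, decode and noisy-decode products above can be replayed with a few lines. The sketch assumes H is the decoder matrix of Section 6.1; writing its entries as "+1 where the AND of the row and column indices has odd parity" is my own shorthand for that matrix, and `transform` is an illustrative name.

```python
# H[r][c] is +1 when (r AND c) has an odd number of set bits, otherwise -1.
H = [[1 if bin(r & c).count("1") % 2 else -1 for c in range(8)]
     for r in range(8)]

def transform(values):
    """Row vector times H, using ordinary (non-modulo-2) arithmetic."""
    return [sum(v * H[r][c] for r, v in enumerate(values)) for c in range(8)]

message = [-1, 1, -1, -1, 1, 1, -1, 1]     # 01001101 in bipolar form
coded = transform(message)
decoded = transform(coded)                 # 8 times the original message

noisy = [1, 4, -3, 0, 3, 0, 1, 5]          # coded values disturbed by noise
recovered_bits = [1 if v > 0 else 0 for v in transform(noisy)]

print(coded)
print(decoded)
print(transform(noisy), recovered_bits)
```

Applying the same matrix twice returns the message scaled by eight, so a single routine serves as both encoder and decoder.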

While the magnitudes are modified, the signs still reveal the original binary data. A coding scheme like this might well be appropriate in storage devices. Although memory devices like hard disks, ROMs and RAMs appear digital to the outside world, internally, many operate in an analogue fashion. To implement error coding digitally requires some of the potential storage area be sacrificed for the error codes. Coding gain still means that we can win overall, but a scheme like this can exploit the underlying analogue nature of the medium to acquire the space for error coding, rather than using physical area. The net result is more storage per unit area of medium. Coding and decoding are extremely fast and simple too and could be embedded directly in the analogue read/write circuitry. This process could be taken a step further, by encoding two bits for each bit previously. Instead of limiting the data to ±1, let each of the previous bits be any one of -1.5, -0.5, +0.5 or +1.5, where -1.5 is 11, -0.5 is 10, +0.5 is 00 and +1.5 is 01. So a data word 10 11 00 01 10 00 11 11 becomes

-0.5, -1.5, 0.5, 1.5, -0.5, 0.5, -1.5, -1.5

Multiplying by the matrix as before gives the result

3, 1, 1, -1, -3, -1, 7, -3

and again, to decode gives

-4, -12, 4, 12, -4, 4, -12, -12

or 8× the original message. The trade-off is dynamic range. We could keep increasing the number of bits per message, but the burden on dynamic range increases too. This coding effectively spreads each bit across the entire codeword in much the same way that the Fourier transform spreads a single value in one domain across all values in the transformed domain. By doing this each message bit is no longer entirely dependent on the integrity of a single bit or value in the storage medium or communications process. Any effect in a coded value, upon decoding, gets spread over the entire message or thinned out. You can think of this in rather the way that a lens acts on an image. If you place a lens so as to focus an image onto a surface, and obscure part of the transformed (out of focus) image by, say, drawing a spot on the lens surface, the result on the image is negligible. Figure 6.3 illustrates this effect.

Figure 6.3 Bit spreading using an orthogonal code (input message, encoder, coded message, decoder, output message).

Input data bits are spread over the coded message by the encoder, and brought back into focus by the decoder. However, an error (the open circle) in one of the coded bits is spread out at the same time as data are brought back into focus, minimising its effect on the output bits. Channel capacity is a function of dynamic range and bandwidth. In all earlier examples (at least prior to modulation) redundancy has been added to a coded message by adding extra bits, using more bandwidth (or physical storage area). This coding, however, pushes the redundancy into the dynamic range. There are as many coded symbols as input symbols. However, the coded symbols are no longer simply ±1, but occupy a range of values. These analogue values can, of course, be expressed as binary values too. The 8-bit example produces outputs in the range ±8, ±6, ±4, ±2, 0 which require 3.17 bits to code, giving a coding rate of 0.315.
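The figures of 3.17 bits and 0.315 can be reproduced by enumerating every 8-bit message. The sketch below assumes the same H as Section 6.1, written with the parity shorthand used as an assumption here, and counts the distinct output levels.

```python
import math
from itertools import product

# H[r][c] is +1 when (r AND c) has an odd number of set bits, otherwise -1.
H = [[1 if bin(r & c).count("1") % 2 else -1 for c in range(8)]
     for r in range(8)]

levels = set()
for message in product([-1, 1], repeat=8):
    levels.update(sum(v * H[r][c] for r, v in enumerate(message))
                  for c in range(8))

bits_per_symbol = math.log2(len(levels))   # information needed per coded value
rate = 1 / bits_per_symbol                 # one input bit per coded symbol

print(sorted(levels))
print(round(bits_per_symbol, 2), round(rate, 3))
```

Nine levels are needed to carry what was one bit per symbol, so the redundancy has moved from extra symbols into the dynamic range.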

6.3 Discussion

For error correction, so far we've considered Hamming codes, the CRC, the Golay code, maximal length codes, Reed-Muller codes, orthogonal codes and Fire codes. Error correction is about constructing the message space in such a way that it is possible, by some means, to work back to the nearest valid message in the event of errors. Each of the codes illustrated achieves this in a particular way. With Hamming codes, the idea of parity is extended into multi-dimensional data structures where simple matrix calculations are used to generate and check the parity. In terms of its operation, extension of the CRC to error correction is very similar to Hamming codes. However, the technique used to detect and find errors is based in finite field algebra, lending itself to serial calculation. Subsequent codes extend these basic ideas in order to deal with multiple bit errors. The Golay code is a special example of the CRC, operating over a truncated message. The Fire code deals with multiple bit errors over large messages, but imposes a constraint on the maximum length of the error. It demonstrates how the process of error correction can be speeded up by separating errors into magnitude (the error bit pattern) and position. Reed-Muller codes illustrate how fast coding and decoding is possible. Their bi-orthogonal structure lends itself to the physical world where encoding and decoding can take place in very fast analogue circuitry. It may well be that this kind of coding becomes an enabling technology in very high density memory devices, where both coding redundancy and multi-bit capability are packed into the analogue quantities of the memory cells. These codes also show how space can be measured in fractions of bits, introducing the idea of soft-decision decoding.

Chapter 7 REED-SOLOMON CODES

So far, we've looked at bit-oriented error correcting schemes. Reed-Solomon (RS) codes, however, are symbol-based. In other words, bits are combined into symbols upon which the coding is performed. RS codes are a special example of a more general class of block codes called BCH codes after Bose, Chaudhuri and Hocquenghem, forefathers of the theory. Reed-Solomon error correction can be understood and implemented in a variety of ways, the principal ones being 'time domain' and 'frequency domain'. Time domain coding is easy to grasp but rather limited in application, while frequency domain coding is perhaps a little harder to grasp but its application is much more amenable to generalized solutions. We can of course combine both of these to yield a third solution in some cases, but more of this later.

7.1 Introduction to the Time Domain

The use of the term 'time domain' comes from an analogy with signal processing. In general signals are expressed and seen in the time domain, for example, a sine wave on an oscilloscope. Another representation of the signal, however, is in the frequency domain where a sine wave would appear as a single point in two-dimensional space describing frequency and amplitude. Signals usually exist in the time domain and have to undergo transformation in order to be seen in the frequency domain. It is at this point that the analogy ends. With time domain error coding the encoded message contains the protected data in its original format. If we were to eavesdrop onto a digital network, time domain ASCII messages would be quite visible in their original format plus error codes and framework bits. In other words, the data is transmitted as is, with additional protection coding appended. In the frequency domain (discussed later) the data undergoes transformation before being launched onto the communication or storage medium. This is not usually for the sake of encryption, but convenience of coding. In this case, the eavesdropper would see nothing recognizable without first converting the message back into its original domain.

7.2 Calculating Check Symbols for One Error

This kind of coding works over symbols rather than bits, which eases the maths but makes it less amenable to soft decision decoding. Codes might be described as one-symbol or two-symbol correcting. Symbols are the same size (in bits) as the finite field elements used to perform the coding. For example, a message encoded over GF($2^3$) would comprise a number (seven in fact) of 3-bit symbols. If you recall the CRC, a remainder was added to the message such that

$$\sum_{i=0}^{m-1} d_i\alpha^i = 0$$

was satisfied, or the message was exactly divisible by a generator polynomial. In this case, $d_i$ represents the m bits in the message. However, this can be modified to

$$\sum_{i=0}^{m-1} p_i\alpha^i = 0 \qquad (7.1)$$

where $p_i$ are symbols, and the message is m symbols long. Clearly, at least one of the $p_i$ must be a check symbol that can be set to make this possible. Using GF($2^3$) ($x^3 + x + 1 = 0$) consider a message 3, 4, 6, 0, 1, 1, $p_0$, where $p_0$ is a check symbol. We can convert this message to finite field elements as $\alpha^3, \alpha^2, \alpha^4, 0, \alpha^0, \alpha^0, p_0$ and we want to satisfy

$$\alpha^3\alpha^6 + \alpha^2\alpha^5 + \alpha^4\alpha^4 + 0\cdot\alpha^3 + \alpha^0\alpha^2 + \alpha^0\alpha^1 + p_0\alpha^0 = 0$$

or

$$\alpha^2 + \alpha^0 + \alpha^1 + \alpha^2 + \alpha^1 + p_0 = 0$$

so

$$p_0 = \alpha^0.$$

Evaluating (7.1) during decoding will give 0 in the absence of errors in a similar way to the CRC. Suppose that $p_2$ is corrupted by the error e during transmission such that the received message is

$$\alpha^3, \alpha^2, \alpha^4, 0, (e + \alpha^0), \alpha^0, \alpha^0$$

the syndrome from evaluating (7.1) will be

$$S = \alpha^3\alpha^6 + \alpha^2\alpha^5 + \alpha^4\alpha^4 + 0\cdot\alpha^3 + (e + \alpha^0)\alpha^2 + \alpha^0\alpha^1 + \alpha^0\alpha^0$$

or

$$S = (\alpha^2 + \alpha^0 + \alpha^1 + \alpha^2 + \alpha^1 + \alpha^0) + e\alpha^2$$

now the terms in brackets sum to 0, so

$$S = e\alpha^2 \qquad (7.2)$$

Like the CRC, the error has been detected but this time, it can't be corrected. Here, symbol-based coding departs from bit-based coding. With bit-based coding, a bit error and its position are one and the same thing. The magnitude of an error (i.e. its pattern) is always 1, so the only unknown is position. In this example, a single-symbol error can have up to seven magnitudes (any non-zero combination of three bits); to correct an error two unknowns must be solved, magnitude and position. This can be achieved at the expense of one further symbol, say $p_1$, by imposing a second constraint. Let the message satisfy

$$\sum_{i=0}^{m-1} p_i\alpha^{i \times j} = 0$$

for j = 0 and 1. j = 0 gives a simple vertical checksum, the summation of all the symbols (since $\alpha^0$ is 1), while j = 1 gives (7.1) as before. So how can we impose these two constraints? Using the previous example, now with $p_1$ as a second check symbol, the message 3, 4, 6, 0, 1, $p_1$, $p_0$ must satisfy

$$j = 0:\quad \alpha^3 + \alpha^2 + \alpha^4 + 0 + \alpha^0 + p_1 + p_0 = 0$$

$$j = 1:\quad \alpha^3\alpha^6 + \alpha^2\alpha^5 + \alpha^4\alpha^4 + 0\cdot\alpha^3 + \alpha^0\alpha^2 + p_1\alpha^1 + p_0\alpha^0 = 0$$

which simplify to

$$p_0 + p_1 = 0 \quad (\text{i.e. } p_0 = p_1)$$

and

$$p_0 + \alpha p_1 = \alpha^3.$$

Eliminating $p_0$

$$p_1(\alpha^0 + \alpha^1) = p_1\alpha^3 = \alpha^3$$

so

$$p_1 = \alpha^0.$$

Since $p_0 = p_1$ in this example, the message is

$$\alpha^3, \alpha^2, \alpha^4, 0, \alpha^0, \alpha^0, \alpha^0$$

At the receiver, two syndromes are calculated from

$$S_j = \sum_{i=0}^{m-1} p_i\alpha^{i \times j}$$

Adding the error 101 to $p_3$ gives the message

$$\alpha^3, \alpha^2, \alpha^4, \alpha^6, \alpha^0, \alpha^0, \alpha^0$$

and syndromes

$$S_0 = \alpha^6, \qquad S_1 = \alpha^6\alpha^3 = \alpha^2.$$

Since these are non-zero, an error has obviously occurred. Note that $S_0$ is the error magnitude. We know from (7.2) that $S_1 = e\alpha^k$ where e is the error magnitude and k is the error index so

$$\alpha^k = \frac{S_1}{S_0}$$

or

$$\alpha^k = \frac{\alpha^2}{\alpha^6} = \alpha^{-4} = \alpha^3$$

and k = 3. Correction is performed by adding $S_0$ to the received symbol $p_k$. In order for k to represent a unique symbol, the message size must be no greater than $2^n - 1$ symbols in the range $p_0$ to $p_{2^n-2}$ over GF($2^n$). This means that a message over GF($2^n$) will convey a maximum of $n(2^n - 1)$ bits. Messages can be truncated, much as the Golay code is a special case of a truncated CRC. In this case, however, truncation does not guarantee that $d_{min}$ will increase. RS codes are not perfect codes, but trying to find their position between the Hamming and Gilbert bounds is not simple. The reason is that, while up to three bits could be corrected in this example, they must be contained within a single symbol, i.e. their position within the message is not arbitrary.
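The single-symbol scheme above can be tried out with a few lines of Python. This is only a sketch under stated assumptions: the helper names (`mul`, `div`, `syndrome`, `encode`, `correct`) are invented for illustration, the field is GF($2^3$) with $x^3 + x + 1 = 0$ as in the text, and messages are listed $p_6 \ldots p_0$ as in the worked example.

```python
# GF(2^3) built on x^3 + x + 1 = 0; EXP[i] holds alpha^i as a 3-bit integer.
EXP = [1, 2, 4, 3, 6, 7, 5]
LOG = {value: power for power, value in enumerate(EXP)}


def mul(a, b):
    """Multiply two field elements by adding their logs modulo 7."""
    if a == 0 or b == 0:
        return 0
    return EXP[(LOG[a] + LOG[b]) % 7]


def div(a, b):
    """Divide a by a non-zero b by subtracting logs modulo 7."""
    return 0 if a == 0 else EXP[(LOG[a] - LOG[b]) % 7]


def syndrome(message, j):
    """S_j = sum of p_i * alpha^(i*j); the message is listed p6 ... p0."""
    total = 0
    for i, symbol in enumerate(reversed(message)):
        total ^= mul(symbol, EXP[(i * j) % 7])
    return total


def encode(data):
    """Append check symbols p1, p0 to five data symbols so S_0 = S_1 = 0."""
    padded = list(data) + [0, 0]
    a, b = syndrome(padded, 0), syndrome(padded, 1)
    p1 = div(a ^ b, EXP[3])  # p1 * (1 + alpha) = a + b, and 1 + alpha = alpha^3
    p0 = a ^ p1              # from p0 + p1 = a
    return list(data) + [p1, p0]


def correct(received):
    """Correct at most one symbol: S_0 is the magnitude, S_1/S_0 the position."""
    s0, s1 = syndrome(received, 0), syndrome(received, 1)
    if s0 == 0 and s1 == 0:
        return list(received)
    if s0 == 0 or s1 == 0:
        raise ValueError("more than one symbol is in error")
    k = (LOG[s1] - LOG[s0]) % 7          # alpha^k = S_1 / S_0
    fixed = list(received)
    fixed[len(fixed) - 1 - k] ^= s0      # add S_0 to p_k
    return fixed


codeword = encode([3, 4, 6, 0, 1])
received = list(codeword)
received[3] ^= 0b101                     # add the error 101 to p3
print(codeword, received, correct(received))
```

The table-driven multiply is a convenience for such a small field; a hardware design would more likely use shift-and-XOR logic.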

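7.3 Correcting Two Symbols

If we were to extend this example to correct two symbol-errors, we'd require four check symbols in order to evaluate four unknowns, two error magnitudes and two error locations. In this case, the message would have to satisfy

$$\sum_{i=0}^{m-1} p_i\alpha^{i \times j} = 0$$

for j = 0 to 3. Continuing with this example, the message becomes

$$\alpha^3, \alpha^2, \alpha^4, p_3, p_2, p_1, p_0$$

from which we have

$$j = 0:\quad \alpha^3 + \alpha^2 + \alpha^4 + p_3 + p_2 + p_1 + p_0 = 0$$

$$j = 1:\quad \alpha^3\alpha^6 + \alpha^2\alpha^5 + \alpha^4\alpha^4 + p_3\alpha^3 + p_2\alpha^2 + p_1\alpha^1 + p_0 = 0$$

$$j = 2:\quad \alpha^3\alpha^5 + \alpha^2\alpha^3 + \alpha^4\alpha^1 + p_3\alpha^6 + p_2\alpha^4 + p_1\alpha^2 + p_0 = 0$$

$$j = 3:\quad \alpha^3\alpha^4 + \alpha^2\alpha^1 + \alpha^4\alpha^5 + p_3\alpha^2 + p_2\alpha^6 + p_1\alpha^3 + p_0 = 0$$

or

$$p_0 + p_1 + p_2 + p_3 = \alpha^0$$

$$p_0 + p_1\alpha^1 + p_2\alpha^2 + p_3\alpha^3 = \alpha^5$$

$$p_0 + p_1\alpha^2 + p_2\alpha^4 + p_3\alpha^6 = \alpha^1$$

$$p_0 + p_1\alpha^3 + p_2\alpha^6 + p_3\alpha^2 = \alpha^4$$

Substituting to eliminate $p_0$ gives

$$\alpha^0 + p_1 + p_2 + p_3 + p_1\alpha^1 + p_2\alpha^2 + p_3\alpha^3 = \alpha^5$$

$$\alpha^0 + p_1 + p_2 + p_3 + p_1\alpha^2 + p_2\alpha^4 + p_3\alpha^6 = \alpha^1$$

$$\alpha^0 + p_1 + p_2 + p_3 + p_1\alpha^3 + p_2\alpha^6 + p_3\alpha^2 = \alpha^4$$

or

$$p_1\alpha^3 + p_2\alpha^6 + p_3\alpha^1 = \alpha^4$$

$$p_1\alpha^6 + p_2\alpha^5 + p_3\alpha^2 = \alpha^3$$

$$p_1\alpha^1 + p_2\alpha^2 + p_3\alpha^6 = \alpha^5$$

Repeating to eliminate $p_1$ gives

$$p_2\alpha^3 + p_3\alpha^1 = \alpha^1$$

$$p_2\alpha^1 = \alpha^3$$

so $p_2 = \alpha^2$ and $p_3 = \alpha^5$, and substituting back gives

$$p_0 = \alpha^2 \quad \text{and} \quad p_1 = \alpha^4$$

so the message is

$$\alpha^3, \alpha^2, \alpha^4, \alpha^5, \alpha^2, \alpha^4, \alpha^2$$

or

3, 4, 6, 7, 4, 6, 4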

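Introducing the error

$$e = 0, e_b, 0, e_a, 0, 0, 0$$

where $e_a = 2$ and $e_b = 6$, and a = 3 and b = 5, i.e.

0, 6, 0, 2, 0, 0, 0

so that the received message is

3, 2, 6, 5, 4, 6, 4

and the syndromes are

$$S_0 = \alpha^2, \quad S_1 = \alpha^1, \quad S_2 = 0, \quad S_3 = \alpha^2.$$

The syndromes are calculated from

$$S_0 = e_a + e_b$$

$$S_1 = e_a\alpha^{a} + e_b\alpha^{b}$$

$$S_2 = e_a\alpha^{2a} + e_b\alpha^{2b}$$

$$S_3 = e_a\alpha^{3a} + e_b\alpha^{3b}$$

Solving for $\alpha^a$ (see Appendix C) gives

$$\alpha^{4a}(S_0^2S_2 + S_0S_1^2) + \alpha^{3a}(S_0^2S_3 + S_0S_1S_2) + \alpha^{2a}(S_1^2S_2 + S_0S_1S_3) + \alpha^{a}(S_1S_2^2 + S_0S_2S_3) + (S_2^3 + S_1S_2S_3) = 0 \qquad (7.3)$$

Using the error vector 0, 6, 0, 2, 0, 0, 0, the syndromes will be

$$S_0 = \alpha^2, \quad S_1 = \alpha, \quad S_2 = 0, \quad S_3 = \alpha^2,$$

so (7.3) becomes

$$\alpha^{4a}\alpha^4 + \alpha^{3a}\alpha^6 + \alpha^{2a}\alpha^5 = 0 \qquad (7.4)$$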

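Evaluating for a = 0 to 6 (all the symbol positions) gives Table 7.1.

Table 7.1 Evaluating the Error Positions

| a | (7.4) | = 0? |
|---|-------|------|
| 0 | $\alpha^2$ | ✗ |
| 1 | $\alpha^5$ | ✗ |
| 2 | $\alpha^2$ | ✗ |
| 3 | 0 | ✓ |
| 4 | $\alpha^4$ | ✗ |
| 5 | 0 | ✓ |
| 6 | $\alpha^0$ | ✗ |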
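The search summarised in Table 7.1 is easy to reproduce. The sketch below, with invented helper names (`locator_coefficients`, `evaluate`), builds the five bracketed coefficients of (7.3) from the four syndromes, evaluates the polynomial at every symbol position, and then recovers the two magnitudes from $S_0$ and $S_1$. It assumes exactly two roots are found, which holds for this example but not in general.

```python
# GF(2^3) built on x^3 + x + 1 = 0; EXP[i] holds alpha^i as a 3-bit integer.
EXP = [1, 2, 4, 3, 6, 7, 5]
LOG = {value: power for power, value in enumerate(EXP)}


def mul(a, b):
    if a == 0 or b == 0:
        return 0
    return EXP[(LOG[a] + LOG[b]) % 7]


def div(a, b):
    return 0 if a == 0 else EXP[(LOG[a] - LOG[b]) % 7]


def sq(v):
    return mul(v, v)


def locator_coefficients(s0, s1, s2, s3):
    """Coefficients of alpha^(4a), alpha^(3a), ..., alpha^(0a) in (7.3)."""
    return [
        mul(sq(s0), s2) ^ mul(s0, sq(s1)),
        mul(sq(s0), s3) ^ mul(mul(s0, s1), s2),
        mul(sq(s1), s2) ^ mul(mul(s0, s1), s3),
        mul(s1, sq(s2)) ^ mul(mul(s0, s2), s3),
        mul(sq(s2), s2) ^ mul(mul(s1, s2), s3),
    ]


def evaluate(coefficients, a):
    """Horner evaluation of the polynomial at alpha^a."""
    value = 0
    for c in coefficients:
        value = mul(value, EXP[a % 7]) ^ c
    return value


syndromes = (4, 2, 0, 4)                 # S0 .. S3 = alpha^2, alpha, 0, alpha^2
coefficients = locator_coefficients(*syndromes)
table = [evaluate(coefficients, a) for a in range(7)]
roots = [a for a, value in enumerate(table) if value == 0]

# With both positions known, S0 and S1 give the two magnitudes.
a, b = roots
s0, s1 = syndromes[:2]
e_a = div(s1 ^ mul(s0, EXP[b]), EXP[a] ^ EXP[b])
e_b = s0 ^ e_a
print(table, roots, e_a, e_b)
```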


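Clearly (7.4) is satisfied at the error locations. The distinction between a and b is notional so the one equation gives both solutions. Substituting these values back into $S_1$ gives

$$e_a\alpha^3 + e_b\alpha^5 = \alpha$$

and (from $S_0$)

$$e_a + e_b = \alpha^2$$

and eliminating $e_b$ gives

$$e_a\alpha^3 + (e_a + \alpha^2)\alpha^5 = \alpha, \quad \text{i.e.} \quad e_a(\alpha^3 + \alpha^5) = e_a\alpha^2 = \alpha + \alpha^0 = \alpha^3$$

so

$$e_a = \alpha$$

and

$$e_b = \alpha^4.$$

Returning to (7.4), with two errors, we expect two solutions. Removing the common factor $\alpha^{2a}$ in fact leaves a quadratic so the form of (7.4) could be generalised to

$$(\alpha^i + \alpha^a)(\alpha^i + \alpha^b) = 0$$

where there are two roots, i = a and i = b. Expanding this form gives

$$\alpha^{2i} + \alpha^{i}(\alpha^a + \alpha^b) + \alpha^{a+b} = 0 \qquad (7.5)$$

Removing the factor $\alpha^{2a+4}$ from (7.4) leaves

$$\alpha^{2a} + \alpha^{a+2} + \alpha = 0$$

This quadratic can be compared with (7.5) above, giving

$$\alpha^a + \alpha^b = \alpha^2 \quad \text{and} \quad \alpha^{a+b} = \alpha.$$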

This looks easy to solve, and would negate the need to evaluate all possible solutions as above since a and b could be found directly. However, performing a simple substitution for, say, $\alpha^b$ gives

$$\alpha^a(\alpha^a + \alpha^2) = \alpha$$

or

$$\alpha^{2a} + \alpha^{a+2} + \alpha = 0$$

which is, of course, where we started. The two simultaneous equations lie in different domains. $\alpha^a + \alpha^b = \alpha^2$ is a function of symbols or bit patterns, while $\alpha^{a+b} = \alpha$ is a function of powers. Both domains are linked by the generator polynomial (GP), which provides a solution, albeit rather tedious. If

$$\alpha^a = a_2x^2 + a_1x + a_0$$

where $a_j$ represent the three bits which comprise $\alpha^a$ and, similarly,

$$\alpha^b = b_2x^2 + b_1x + b_0$$

then satisfying $\alpha^a + \alpha^b = \alpha^2$ means that

$$(a_2 + b_2)x^2 + (a_1 + b_1)x + (a_0 + b_0) = x^2$$

since $\alpha^2 = x^2$, and satisfying $\alpha^{a+b} = \alpha$ means that

$$a_2b_2x^4 + (a_2b_1 + a_1b_2)x^3 + (a_2b_0 + a_1b_1 + a_0b_2)x^2 + (a_1b_0 + a_0b_1)x + a_0b_0 = x$$

From the GP, $x^4 = x^2 + x$ and $x^3 = x + 1$ so the higher order terms can be redistributed among the lower order terms giving

$$x^2(a_2b_0 + a_1b_1 + a_0b_2 + a_2b_2) + x(a_1b_0 + a_0b_1 + a_2b_2 + a_2b_1 + a_1b_2) + a_0b_0 + a_2b_1 + a_1b_2 = \alpha$$

so

$$a_2b_0 + a_1b_1 + a_0b_2 + a_2b_2 = 0$$

$$a_1b_0 + a_0b_1 + a_2b_2 + a_2b_1 + a_1b_2 = 1$$

$$a_0b_0 + a_2b_1 + a_1b_2 = 0$$

since for $\alpha$ the $x^2$ and $x^0$ terms are 0, while the x term is 1. The relationships

$$a_2 + b_2 = 1, \quad a_1 + b_1 = 0, \quad a_0 + b_0 = 0$$

tell us that

$$b_2 = \bar{a}_2, \quad b_1 = a_1, \quad b_0 = a_0$$

allowing substitutions into the second set of equations so

$$a_2a_0 + a_1a_1 + a_0\bar{a}_2 + a_2\bar{a}_2 = 0$$

$$a_1a_0 + a_0a_1 + a_2\bar{a}_2 + a_2a_1 + a_1\bar{a}_2 = 1$$

$$a_0a_0 + a_2a_1 + a_1\bar{a}_2 = 0$$

which minimise to

$$a_1 + a_0 = 0 \quad \text{and} \quad a_1 = 1$$

so

$$a_1 = 1, \quad a_0 = 1$$

or ?11. In actual fact, ? is both 1 and 0 giving 011 and 111, or $\alpha^3$ and $\alpha^5$. Where results are indeterminate, substitute in a value for one of the unknowns and evaluate any remaining unknowns on this basis. The alternative to trying to solve the quadratic is to use it to minimise the search. Table 7.1 shows the solutions to all seven values of a in this example. Since evaluating a = 0 does not yield a correct answer, $\alpha^a + \alpha^b = \alpha^2$ tells us that a = 6 won't either. Similarly $\alpha^{a+b} = \alpha$ tells us that neither will a = 1. In this way it is not necessary to test all values of i. What would happen if there was only a single symbol in error? Removing $e_5$ gives syndromes

$$S_0 = \alpha, \quad S_1 = \alpha^4, \quad S_2 = \alpha^0, \quad S_3 = \alpha^3.$$

Substituting into (7.3) yields all zero terms. Since this is so, we can use $S_0$ and $S_1$ to correct the error as in the single error case where $e = S_0$ and $\alpha^k = S_1/S_0$, so $e = \alpha$ and $\alpha^k = \alpha^4/\alpha = \alpha^3$ and k = 3. Because RS codes are not perfect, if more than two errors occur, there is a good chance that (7.3) will not have any roots and will not decode to a solution. This happens because the corrupted message vector falls into a kind of no-man's-land, greater than half $d_{min}$ from any valid message, but not particularly nearer to any valid message than any other valid message. If this happens, unlike perfect codes, we know the error cannot be corrected.

7.4 Error Correction in the Frequency Domain

The time-domain solution for correcting a single symbol error is vastly different from the solution for correcting two errors. Generating the check symbols is different too. In fact, the only real similarity between the one- and two-symbol correcting cases is the calculation of the syndromes. For fixed problems this may be acceptable, but time-domain error correction cannot easily provide a scalable solution. This quest leads us on to frequency domain error correction. The idea of time and frequency domains in finite fields is mostly an analogy. True, messages travelling over communications media are in the time domain, but those same messages could just as easily be stored on a hard disk which, presumably, would be in the spatial domain. The concept of a time and frequency domain arises from what the Fourier transform actually does to signals. Potentially, a single bit changing in the input vector can change all bits in the Fourier-transformed output vector. The same idea is also illustrated by the earlier example of bi-orthogonal codes, where a codeword is generated by multiplying the message by the Hadamard matrix. In this way each message bit is spread across the entire codeword and does not rely on any specific part of the codeword for its reconstruction. Just as with the Fourier transform, a correlation process then brings the dispersed bits back into focus upon decoding, recreating the original message. The Fourier transform of a one-dimensional data set over finite fields is performed by (7.6), below

$$F_i = \sum_{j=0}^{m-1} d_j\alpha^{i \times j} \qquad (7.6)$$

while the inverse transform is performed by (7.7)

$$d_i = \sum_{j=0}^{m-1} F_j\alpha^{-i \times j} \qquad (7.7)$$

over GF($2^n$) where $m = 2^n - 1$. The Fourier transform, you may have noticed, is identical to the process used to calculate the syndromes in the previous time-domain RS examples. Borrowing the two-symbol coded message from the earlier time-domain example we have

3, 4, 6, 7, 4, 6, 4

Applying the Fourier transform to this gives

or

Fo =a 3a O + a 2 a O + a 4 a O + a 5a O + a 2 a O + a 4 a O + a 2a O =0 F 1 =a 3a 6 + a 2a 5 + a 4 a 4 + a 5 a 3 + a 2a 2 + a 4 a l + a 2 a O =0 F2 = a 3 a 5 + a 2 a 3 + a 4 a l + a 5 a 6 + a 2 a 4 + a 4 a 2 + a 2 a O = 0 F 3 = a 3a 4 + a 2a l + a 4 a 5 + a 5 a 2 + a 2 a 6 + a 4 a 3 + a 2 a O = 0 F4 = a 3a 3 + a 2 a 6 + a 4 a 2 + a 5a 5 + a 2 a 1 + a 4 a 4 + a 2a O = a 2 F5 = a 3a 2 + a 2 a 4 + a 4 a 6 + a 5a l + a 2 a 3 + a 4 a 5 + a 2 a O = a 3 F6 = a 3a 1 + a 2 a 2 + a 4 a 3 + a 5a 4 + a 2 a 5 + a 4 a 6 + a 2a O = a 3

90

7.4 Error Correction in the Frequency Domain

°° °

a 3, a 3, a,2 , ,0,

Now we'd expect the four zeros because these constraints were imposed on the message when it was coded. However, the interesting corollary to this is that, if the message is created in the frequency domain with four zeros, applying the inverse Fourier transform will create a time-domain codeword with the necessary properties for two-symbol correction. In this case, the message starts life in the frequency domain as

3, 4, 6, 0, 0, 0, 0

where the data are as before. Instead of calculating check symbols, zeros are appended to the message. Applying the inverse Fourier transform gives a suitably coded time domain message 7, 2, 6, 2, 1, 1, 1. At the receiver, the coded message is Fourier transformed to restore the original data and zeros. The coding/decoding process now scales easily. For a t-symbol error correcting code, $F_{2t-1}$ to $F_0$ are replaced by zeros prior to inverse Fourier transforming. As we've seen, if errors occur, non-zero syndromes appear on decoding. Taking the coded message, above, and corrupting it with the error used before, 0, 6, 0, 2, 0, 0, 0, gives

7, 4, 6, 0, 1, 1, 1

and the Fourier transform is 1, 2, 0, 4, 0, 2, 4. The original data, 3, 4, 6, are lost from view and the zeros have become 4, 0, 2, 4, the syndromes calculated previously. It is informative to examine the Fourier transform of the error, a luxury not normally available. This is

2, 6, 6, 4, 0, 2, 4

The last four symbols (4, 0, 2, 4) are singled out because they are identical to those of the transformed, corrupted message (and the syndromes in the time-domain example). Any zeros in the uncoded message provide a window onto the transformed error vector. The size of this window determines how complex an error can be evaluated. From the time domain example, we've seen already how this information can be processed to find the error(s). Appendix C shows how the two-symbol solution is derived and it is considerably more complex than the single-symbol case. Progressing to three symbols gets yet more complex, so there is a need to bring some generality into determining the errors.
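The transform pair (7.6) and (7.7) can be checked against the numbers in this section with a short sketch. The function name `transform` and its `sign` parameter are assumptions made for illustration; vectors are listed from index 6 down to index 0, as in the text. No scaling factor appears in the inverse because m = 7 is odd, so 1/m is 1 in a field of characteristic 2.

```python
# GF(2^3) built on x^3 + x + 1 = 0; EXP[i] holds alpha^i as a 3-bit integer.
EXP = [1, 2, 4, 3, 6, 7, 5]
LOG = {value: power for power, value in enumerate(EXP)}


def mul(a, b):
    if a == 0 or b == 0:
        return 0
    return EXP[(LOG[a] + LOG[b]) % 7]


def transform(vector, sign=1):
    """Finite-field Fourier transform of a vector listed index 6 ... 0.

    sign=+1 is the forward transform (7.6); sign=-1 is the inverse (7.7).
    """
    low_first = vector[::-1]             # low_first[j] is element j
    out = []
    for i in range(7):
        acc = 0
        for j, symbol in enumerate(low_first):
            acc ^= mul(symbol, EXP[(sign * i * j) % 7])
        out.append(acc)
    return out[::-1]                     # back to index 6 ... 0


time_coded = [3, 4, 6, 7, 4, 6, 4]       # time-domain codeword of section 7.3
print(transform(time_coded))             # its spectrum ends in four zeros

codeword = transform([3, 4, 6, 0, 0, 0, 0], sign=-1)
error = [0, 6, 0, 2, 0, 0, 0]
received = [c ^ e for c, e in zip(codeword, error)]
print(codeword, received)
print(transform(received), transform(error))
```

Comparing the last two printed vectors shows the window described above: the four low-order positions of the received spectrum are exactly those of the error spectrum.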

7.5 Recursive Extension

Rather than try to solve the errors directly from the syndromes it is possible to use a process called recursive extension which, using the 2t known error spectra, calculates the remainder of the error spectrum, normally obscured by data. Using the earlier example, this would be 2, 6, 6. Once the complete error spectrum is known, an inverse Fourier transform generates the time-domain errors. These can be added to the received message which undergoes a further Fourier transform to give the corrected message. From the error e = 0, 6, 0, 2, 0, 0, 0, it is possible to create an error locator polynomial, $l = l_6, 0, l_4, 0, l_2, l_1, l_0$ which, when multiplied by the error, will always give zero regardless of the values of $l_j$. One example might be

$$l_j = (\alpha^j + \alpha^3)(\alpha^j + \alpha^5) = \alpha^{2j} + \alpha^{j+2} + \alpha$$

which produces the sequence l = 7, 0, 5, 0, 2, 5, 7. Fourier transforming this gives L = 4, 1, 0, 0, 0, 0, 2. Notice first, the four zeros. The number of zeros in L reflects roots in e, the error vector. For each additional error in e (hence extra root in l), there will be one less zero in L. Now, if you're familiar with signal processing, you'll doubtless be aware that multiplication in the time domain is the same as convolution in the frequency domain, or multiplication in the frequency domain is the same as convolution in the time domain. E = 2, 6, 6, 4, 0, 2, 4 is the frequency domain form of the error, while L = 4, 1, 0, 0, 0, 0, 2 is the frequency domain form of the arbitrary locator with zeros at the error positions. If the above is true, then convolving the two vectors will produce zero. Convolving at an arbitrary offset (remembering to reverse the ordering of one of the vectors) gives, for example at k = 6,

$$L_0E_6 + L_5E_1 + L_6E_0 = \alpha\cdot\alpha + \alpha^0\cdot\alpha + \alpha^2\cdot\alpha^2 = \alpha^2 + \alpha + \alpha^4 = 0$$


You can try any offset between the vectors (wrapping around the ends) and the result will still be zero. The convolution is described by

$$\sum_{j=0}^{t} L_jE_{k-j} = 0$$

for t errors and any k and can be performed by a recursive extension circuit (REC) like Figure 7.1, below.

Figure 7.1 A Recursive Extension Circuit.
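The claim that the locator and error spectra convolve to zero at every offset can be checked numerically. The following sketch assumes the index ordering used in the text (vectors listed from index 6 down to 0) and an invented helper `transform` for the forward transform (7.6); the convolution wraps around the ends of the vectors.

```python
# GF(2^3) built on x^3 + x + 1 = 0; EXP[i] holds alpha^i as a 3-bit integer.
EXP = [1, 2, 4, 3, 6, 7, 5]
LOG = {value: power for power, value in enumerate(EXP)}


def mul(a, b):
    if a == 0 or b == 0:
        return 0
    return EXP[(LOG[a] + LOG[b]) % 7]


def transform(vector):
    """Forward transform (7.6) of a vector listed index 6 ... 0."""
    low_first = vector[::-1]
    out = []
    for i in range(7):
        acc = 0
        for j, symbol in enumerate(low_first):
            acc ^= mul(symbol, EXP[(i * j) % 7])
        out.append(acc)
    return out[::-1]


locator_time = [7, 0, 5, 0, 2, 5, 7]     # l, zero at the error positions 3 and 5
error_time = [0, 6, 0, 2, 0, 0, 0]       # e

L = transform(locator_time)
E = transform(error_time)

# Cyclic convolution sum_j L_j * E_(k-j), evaluated for every offset k.
L_low, E_low = L[::-1], E[::-1]
convolution = []
for k in range(7):
    acc = 0
    for j in range(7):
        acc ^= mul(L_low[j], E_low[(k - j) % 7])
    convolution.append(acc)

print(L, E, convolution)
```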

The shift registers store any t + 1 sequential elements from E, while the multipliers have weights derived from L. Clearly, since any non-zero element in the locator l may have an arbitrary value, there is a great deal of flexibility in the choice of values for L. What we know is that the result from the circuit must be 0. Given this flexibility, it is possible to arrange for $L_0$ to be 1. This means that the value at the summation point Σ must be equal to the value in the dotted register since, with $L_0 = 1$, they must sum to zero. Because of this the circuit can be closed, feeding Σ into the first solid register. For the two symbol error this would give the circuit in Figure 7.2 below.

Figure 7.2 An REC for Two Errors.


In principle, this circuit can generate the frequency-domain form of a vector with five roots. Suppose we preload it with the first two of the four known error spectra, 4, 2. The input to $Q_1$ will be

$$L_1\alpha^1 + L_2\alpha^2$$

and this must be 0, in order to correctly generate the next known value. Shifting the circuit leaves $Q_2 = 2$ and $Q_1 = 0$ and the input to $Q_1$ is

$$L_1\cdot 0 + L_2\alpha^1$$

which must be $\alpha^2$ to generate the last known value. Solving these key equations, as they are called, gives

$$L_2 = \alpha^1 \quad \text{and} \quad L_1 = \alpha^2.$$

Having found L, for fun and instruction l can be constructed using the polynomial

$$1 + L_1\alpha^{-j} + L_2\alpha^{-2j} = l_j$$

which is derived from the inverse Fourier transform. Cycling j through 0 to m - 1 gives the sequence l = 1, 0, 7, 0, 6, 6, 7. The two zeros at j = 3 and j = 5 reveal the error positions and, knowing these, we could solve the errors using the syndromes $S_0$ and $S_1$ as follows

$$S_0 = e_a + e_b = \alpha^2$$

$$S_1 = \alpha^a e_a + \alpha^b e_b = \alpha^1.$$

Substituting for a and b leaves

$$\alpha^2 = e_3 + e_5$$

$$\alpha^1 = \alpha^3e_3 + \alpha^5e_5$$

eliminating $e_3$

$$\alpha^1 = \alpha^3(\alpha^2 + e_5) + \alpha^5e_5$$

so

$$e_5 = \alpha^4 \quad \text{and} \quad e_3 = \alpha^1.$$

For generality, however, errors are found by completing the error spectrum using the REC. Setting $L_1$ ($\alpha^2$) and $L_2$ ($\alpha^1$) and preloading the circuit with two sequential components from the known error spectra, this circuit will proceed to construct the entire spectrum of the error if clocked repeatedly. With the last two known spectra in the registers, the input to $Q_1$ will be

$$\alpha^2\alpha^2 + \alpha^1\cdot 0 = \alpha^4$$

clocking again gives

$$\alpha^2\alpha^4 + \alpha^1\alpha^2 = \alpha^4$$

and again

$$\alpha^2\alpha^4 + \alpha^1\alpha^4 = \alpha^1$$

and again

$$\alpha^2\alpha^1 + \alpha^1\alpha^4 = \alpha^2$$

back to the start of the known spectra. Once the complete error spectrum is known the time-domain errors can be found using the inverse Fourier transform. In this way the message can be corrected. Repeatedly clocking the REC will cyclically output the error spectrum and this leads to a useful property. If after seven clocks the contents of the REC are not the two original error spectra (i.e. the cycle length is greater than seven), then there are more than two symbol-errors present. So we have a convenient way of detecting uncorrectable errors. In this example, there are

$$49 \times \frac{7!}{5!\,2!} = 1029$$

possible two-symbol errors that could occur. The four syndromes from which the REC is generated can, of course, have $8^4 - 1 = 4095$ non-zero values. Very superficially, if an uncorrectable error leads to a random result in the four syndromes then about one in four (1029 in 4095) of these cases will pass undetected. If there are fewer errors than registers in the REC, then L cannot be resolved, rather like two simultaneous equations cannot be resolved if they are not linearly independent. In this case, the size of the REC must be reduced. The three-symbol error, 0, 1, 1, 0, 0, 5, 0, has the spectrum 0, 5, 2, 1, 3, 0, 5 of which in this example we would know 1, 3, 0, 5. Solving for $L_1$ and $L_2$, as before gives

$$L_1\cdot 0 + L_2\alpha^6 = \alpha^3$$

$$L_1\alpha^3 + L_2\cdot 0 = \alpha^0$$

so $L_2 = \alpha^4$ and $L_1 = \alpha^4$. Putting in these values and clocking gives the sequence

$$\alpha^3\alpha^4 + \alpha^0\alpha^4 = \alpha^5$$

$$\alpha^0\alpha^4 + \alpha^5\alpha^4 = \alpha^1$$

$$\alpha^5\alpha^4 + \alpha^1\alpha^4 = \alpha^3$$

$$\alpha^1\alpha^4 + \alpha^3\alpha^4 = \alpha^4$$

$$\alpha^3\alpha^4 + \alpha^4\alpha^4 = \alpha^3$$

$$\alpha^4\alpha^4 + \alpha^3\alpha^4 = \alpha^3$$

$$\alpha^3\alpha^4 + \alpha^3\alpha^4 = 0$$

$$\alpha^3\alpha^4 + 0\cdot\alpha^4 = \alpha^0$$

and so forth. Lining these up gives the sequence

[..., 1, 0, 3, 3, 6][3, 2, 7, 1, 3, 0, 5]

The four rightmost numbers (1, 3, 0, 5) were used to calculate the locator L, while the three to their left (3, 2, 7) are the completed spectrum from the REC. Notice first, that the completed spectrum is different from the actual error spectrum (0, 5, 2, 1, 3, 0, 5). Working at the limit of the code's correcting capability, the completed numbers are not visible, being obscured by data spectra, so we won't know immediately that recursive extension has been unsuccessful (i.e. that there were more than two errors). However, finding the inverse Fourier transform of the extended error spectrum will produce a time-domain message with more than two errors. In this case, the time domain error message would be 4, 0, 3, 1, 4, 6, 1, with six errors. We can save ourselves the expense of the inverse Fourier transform by examining the remainder of the extended spectrum. Strictly, only three unknown spectra were required but by clocking the REC a few extra times, the spectrum should repeat itself. Clearly it is not doing so, outputting 6, 3, 3, ... instead of 5, 0, 3, ...

To recap frequency domain error correction, a message starts life in the frequency domain as m symbols ($F_0$ to $F_{m-1}$), each n bits over GF($2^n$) where $m = 2^n - 1$.

• For a t-symbol error correction, $F_0$ to $F_{2t-1}$ are replaced by zeros.
• The message is inverse Fourier transformed over GF($2^n$) to give the time-domain coded message.
• At the receiver, the message is forward Fourier transformed over GF($2^n$) which, in the absence of any errors, recreates the original message.
• If the syndromes ($F'_0$ to $F'_{2t-1}$) are 0, the message is deemed error free.
• If the syndromes are not zero, they are used to generate a set of key equations which are solved to find the error locator polynomial L.
• L is used to construct a recursive extension circuit which completes the error spectrum. If the error capacity is not exceeded, the REC will cycle over m clocks.
• The completed error spectrum is inverse Fourier transformed to create the time-domain errors.
• The time domain errors are added to the received message F' to correct it.
• The corrected time-domain message is Fourier transformed to generate the original message and zeros.
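The steps recapped above can be strung together for the two-symbol code over GF($2^3$). What follows is a sketch rather than a general decoder: the names `transform` and `decode` are invented, the key equations are solved directly for t = 2 only, and the REC is reduced to one register when the two-register equations are not independent.

```python
# GF(2^3) built on x^3 + x + 1 = 0; EXP[i] holds alpha^i as a 3-bit integer.
EXP = [1, 2, 4, 3, 6, 7, 5]
LOG = {value: power for power, value in enumerate(EXP)}


def mul(a, b):
    if a == 0 or b == 0:
        return 0
    return EXP[(LOG[a] + LOG[b]) % 7]


def div(a, b):
    return 0 if a == 0 else EXP[(LOG[a] - LOG[b]) % 7]


def transform(vector, sign=1):
    """Forward (sign=+1) or inverse (sign=-1) transform, index 6 ... 0."""
    low_first = vector[::-1]
    out = []
    for i in range(7):
        acc = 0
        for j, symbol in enumerate(low_first):
            acc ^= mul(symbol, EXP[(sign * i * j) % 7])
        out.append(acc)
    return out[::-1]


def decode(received):
    """Return the three data symbols, correcting up to two symbol errors."""
    spectrum = transform(received)
    e = spectrum[::-1][:4]               # known error spectra E0 .. E3
    if not any(e):
        return spectrum[:3]
    # Key equations: E2 = L1*E1 + L2*E0 and E3 = L1*E2 + L2*E1.
    det = mul(e[1], e[1]) ^ mul(e[0], e[2])
    if det:
        l1 = div(mul(e[2], e[1]) ^ mul(e[0], e[3]), det)
        l2 = div(mul(e[1], e[3]) ^ mul(e[2], e[2]), det)
    elif e[0]:
        l1, l2 = div(e[1], e[0]), 0      # single error: one-register REC
    else:
        raise ValueError("uncorrectable error pattern")
    # Recursive extension, clocked past one full cycle of seven.
    for k in range(4, 11):
        e.append(mul(l1, e[k - 1]) ^ mul(l2, e[k - 2]))
    if e[7:11] != e[0:4]:
        raise ValueError("REC does not cycle: more than two errors")
    errors = transform(e[6::-1], sign=-1)
    corrected = [r ^ x for r, x in zip(received, errors)]
    return transform(corrected)[:3]


print(decode([7, 4, 6, 0, 1, 1, 1]))     # codeword 7,2,6,2,1,1,1 plus 0,6,0,2,0,0,0
```

Applied to the three-symbol error of this section (received message 7, 3, 7, 2, 1, 4, 1), the cycle check fails and the sketch raises an exception instead of returning wrong data.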

In this two-symbol example, solution of the key equations is quite tractable, especially when compared with the analytical solution of the previous section. However, as t increases, the solution gets more complex. The following are the key equations for a three-error correction

$$S_0L_3 + S_1L_2 + S_2L_1 = S_3$$

$$S_1L_3 + S_2L_2 + S_3L_1 = S_4$$

$$S_2L_3 + S_3L_2 + S_4L_1 = S_5$$

which can be rearranged into a matrix form as follows

$$\begin{bmatrix} S_0 & S_1 & S_2 \\ S_1 & S_2 & S_3 \\ S_2 & S_3 & S_4 \end{bmatrix} \begin{bmatrix} L_3 \\ L_2 \\ L_1 \end{bmatrix} = \begin{bmatrix} S_3 \\ S_4 \\ S_5 \end{bmatrix}$$

or

$$S.L = R.$$

To evaluate the matrix L, S must be inverted so

$$L = S^{-1}.R \quad \text{and} \quad S.S^{-1} = I$$

If there are fewer than three errors present, then the last identity, $S.S^{-1} = I$, will not hold true. In fact, $S^{-1}$ will contain all zeros. To test this identity only one row or column of $S^{-1}$ need be evaluated which represents a saving over evaluating the entire inversion. The solution to this particular example is

$$S^{-1} = \frac{1}{\Delta}\begin{bmatrix} S_2S_4 + S_3^2 & S_1S_4 + S_2S_3 & S_1S_3 + S_2^2 \\ S_1S_4 + S_2S_3 & S_0S_4 + S_2^2 & S_0S_3 + S_1S_2 \\ S_1S_3 + S_2^2 & S_0S_3 + S_1S_2 & S_0S_2 + S_1^2 \end{bmatrix}$$

where

$$\Delta = S_0S_2S_4 + S_0S_3^2 + S_1^2S_4 + S_2^3$$

Matrix solutions can be generalised but for larger cases are none-the-less computationally quite expensive (see Appendix D). There are, however, two algorithms which simplify evaluation of the key equations.
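As an illustration of the matrix route, the sketch below solves the three-error key equations using the cofactors of S (in a field of characteristic 2 the usual alternating signs disappear). The function name is invented, and the input is an assumption chosen for convenience: the first six spectra $E_0$ to $E_5$ of the three-symbol error 0, 1, 1, 0, 0, 5, 0 from the previous section, as they would be visible in a three-symbol correcting code.

```python
# GF(2^3) built on x^3 + x + 1 = 0; EXP[i] holds alpha^i as a 3-bit integer.
EXP = [1, 2, 4, 3, 6, 7, 5]
LOG = {value: power for power, value in enumerate(EXP)}


def mul(a, b):
    if a == 0 or b == 0:
        return 0
    return EXP[(LOG[a] + LOG[b]) % 7]


def div(a, b):
    return 0 if a == 0 else EXP[(LOG[a] - LOG[b]) % 7]


def solve_three_error_key_equations(s):
    """Return (L1, L2, L3) from S0 .. S5 using the cofactors of S."""
    s0, s1, s2, s3, s4, s5 = s
    c00 = mul(s2, s4) ^ mul(s3, s3)
    c01 = mul(s1, s4) ^ mul(s2, s3)
    c02 = mul(s1, s3) ^ mul(s2, s2)
    c11 = mul(s0, s4) ^ mul(s2, s2)
    c12 = mul(s0, s3) ^ mul(s1, s2)
    c22 = mul(s0, s2) ^ mul(s1, s1)
    det = mul(s0, c00) ^ mul(s1, c01) ^ mul(s2, c02)
    if det == 0:
        raise ValueError("S is singular: fewer than three errors")
    l3 = div(mul(c00, s3) ^ mul(c01, s4) ^ mul(c02, s5), det)
    l2 = div(mul(c01, s3) ^ mul(c11, s4) ^ mul(c12, s5), det)
    l1 = div(mul(c02, s3) ^ mul(c12, s4) ^ mul(c22, s5), det)
    return l1, l2, l3


known = [5, 0, 3, 1, 2, 5]               # E0 .. E5 of the three-symbol error
l1, l2, l3 = solve_three_error_key_equations(known)

# One clock of a three-register REC should produce the remaining spectrum E6.
e6 = mul(l1, known[5]) ^ mul(l2, known[4]) ^ mul(l3, known[3])
print(l1, l2, l3, e6)
```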

7.6 The Berlekamp-Massey Algorithm

The Berlekamp-Massey algorithm provides an alternative solution to the key equations, based upon trial and error, using the error to feed back and modify the trial. Roughly, the algorithm works in the following way. As shown, if the syndromes (i.e. the known error spectra) are convolved with the frequency-domain error locator polynomial L, then the result will be zero. The convolution need only extend to the order of the error spectrum (i.e. one register in the REC per error in the original message). The algorithm starts by assuming that a single-symbol error has occurred and performs the convolution. The result should be zero and, if not, an error is generated. Depending on the phase of the algorithm, this error might modify L, or it might both modify L and increase the order of the problem (i.e. add another register to the REC). In other words, the feedback error is used to try to modify L to fit the problem. If it cannot, then the solution must require a higher order. Using the previous example, we have four known out of seven error spectra for a double-symbol error, ?, ?, ?, 4, 0, 2, 4. Previously, the REC was constructed by solving a couple of simultaneous equations. This time, we'll use the Berlekamp-Massey (BM) algorithm. The algorithm requires an iteration counter k, a trial $L_{(k)}$, the frequency domain locator at iteration k, an error $\Delta_{(k)}$ which is the difference between convolution at k and the syndrome input, and a modifier polynomial T, used to create $L_{(k+1)}$, based on $\Delta_{(k)}$ and $L_{(k)}$. Also included is the order of the problem, j, and the syndromes $S_k$. Table 7.2 lists the sequencing of events, starting with an overview at the top and an example at the bottom. $L_{(k)}$ is the locator which, at any point, will output the previous syndromes, while $\Delta_{(k)}$ is the difference between the locator output and the current syndrome. If the order of the locator is reached before all the known syndromes have been input, then Δ will be 0 thereafter. L is described in the polynomial form $l_x = 1 + \alpha^{-x}L_1 + \alpha^{-2x}L_2$, as seen earlier. From this, you can work out $L_1$ and $L_2$ in order to find L*S at each iteration. If we consider $\Delta_{(4)}$ from the table, it is $\alpha^2$ (the current input) + $\alpha^6$ (the result of $\alpha^6\cdot 0 + \alpha^5\cdot\alpha$, or $L_{(3)}$ convolved with the last two syndromes). From $L_{(3)}$, $L_1 = \alpha^6$ and $L_2 = \alpha^5$. T may be recalculated (if 2j < k as in step 5, iteration 3), but it is always multiplied by $\alpha^{-x}$ each cycle. At step 5, iteration 3, T is calculated from $L_{(2)}/\Delta_{(3)}$ or

$$\frac{1 + \alpha^{6-x}}{\alpha^0} = \alpha^0 + \alpha^{6-x}$$

At the next step (6), it is multiplied by $\alpha^{-x}$. The next locator L is found from the previous one plus $T \times \Delta_{(k)}$. So

$$L_{(4)} = L_{(3)} + T \times \Delta_{(4)} = 1 + \alpha^{6-x} + \alpha^{5-2x} + \alpha^0(\alpha^{-x} + \alpha^{6-2x}) = 1 + \alpha^{2-x} + \alpha^{1-2x}$$

for example. In Table 7.2, t has its usual meaning, the maximum number of correctable symbols. Each of the expressions in the $L_{(k)}$ column generates increasing amounts of the known syndromes until, at the end, the final form will generate all the known syndromes. At this point the roots should be checked to ensure that the correct number exist (equal to the number of errors) and hence, that the capacity of the code is not exceeded.

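The final locator satisfies

$$l_x = 1 + \alpha^{2-x} + \alpha^{1-2x} = 0$$

for x equal to the positions of the errors. From this, $L_1 = \alpha^2$ and $L_2 = \alpha^1$, the same answer we got when solving for L analytically. Normally, the completed spectrum would undergo an inverse Fourier transform yielding the added time-domain errors. However, the time domain pattern will consist largely of zeros with, at most, t non-zero elements at the error locations x, where x satisfies $l_x = 0$. Rather than complete the whole time domain pattern, only the error patterns (or magnitudes) at x need be computed.

Table 7.2 The Berlekamp-Massey algorithm

Overview:

| Step | Action | Jump? (to step) |
|------|--------|-----------------|
| [init] | $k = 0$, $j = 0$, $L_{(0)} = 1$, $T = \alpha^{-x}$ | |
| 1 | $k = k + 1$; $\Delta_{(k)} = S_{k-1} + S{*}L$ | |
| 2 | | $\Delta = 0$? 6 |
| 3 | $L_{(k)} = L_{(k-1)} + \Delta_{(k)} \times T$ | |
| 4 | | $2j \geq k$? 6 |
| 5 | $j = k - j$; $T = L_{(k-1)}/\Delta_{(k)}$ | |
| 6 | $T = T \times \alpha^{-x}$ | |
| 7 | | $k < 2t$? 1 |

Example:

| Step | k | $S_{k-1}$ | j | $\Delta_{(k)}$ | $L_{(k)}$ | T |
|------|---|-----------|---|----------------|-----------|---|
| [init] | 0 | | 0 | | 1 | $\alpha^{-x}$ |
| 1 | 1 | $\alpha^2$ | | $\alpha^2 + 0 = \alpha^2$ | | |
| 3 | 1 | | | | $1 + \alpha^{2-x}$ | |
| 5 | 1 | | 1 | | | $1/\alpha^2 = \alpha^5$ |
| 6 | 1 | | | | | $\alpha^{5-x}$ |
| 1 | 2 | $\alpha^1$ | | $\alpha^1 + \alpha^4 = \alpha^2$ | | |
| 3 | 2 | | | | $1 + \alpha^{6-x}$ | |
| 6 | 2 | | | | | $\alpha^{5-2x}$ |
| 1 | 3 | 0 | | $0 + 1 = \alpha^0$ | | |
| 3 | 3 | | | | $1 + \alpha^{6-x} + \alpha^{5-2x}$ | |
| 5 | 3 | | 2 | | | $1 + \alpha^{6-x}$ |
| 6 | 3 | | | | | $\alpha^{-x} + \alpha^{6-2x}$ |
| 1 | 4 | $\alpha^2$ | | $\alpha^2 + \alpha^6 = \alpha^0$ | | |
| 3 | 4 | | | | $1 + \alpha^{2-x} + \alpha^{1-2x}$ | |
| 6 | 4 | | | | | |
| End | | | | | Check roots | |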

The BM algorithm saves us the matrix calculation for the solution to the REC based on the syndromes and it also removes the trial and error associated with finding out how many errors have occurred. To check if the error capacity of the code is exceeded, we can either cycle $l_x$ through all the values of x corresponding to the symbol positions in the message and check the number of zeros (or roots) is consistent with our attempted solution, or cycle the REC round and see if it ends up at the same values with which it started. The BM algorithm is, in fact, based on Euclid's algorithm which is used in whole number mathematics to find the greatest common divisor between two numbers.
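The step sequence of Table 7.2 translates almost line for line into code. The sketch below is one possible rendering, not a definitive implementation: polynomials in $\alpha^{-x}$ are held as coefficient lists (index i is the coefficient of $\alpha^{-ix}$), and the names `berlekamp_massey` and `roots` are invented for illustration.

```python
# GF(2^3) built on x^3 + x + 1 = 0; EXP[i] holds alpha^i as a 3-bit integer.
EXP = [1, 2, 4, 3, 6, 7, 5]
LOG = {value: power for power, value in enumerate(EXP)}


def mul(a, b):
    if a == 0 or b == 0:
        return 0
    return EXP[(LOG[a] + LOG[b]) % 7]


def div(a, b):
    return 0 if a == 0 else EXP[(LOG[a] - LOG[b]) % 7]


def berlekamp_massey(syndromes):
    """Return [1, L1, L2, ...] from the known syndromes S0 .. S(2t-1)."""
    locator = [1]                        # L(k)
    modifier = [0, 1]                    # T, initially alpha^-x
    order = 0                            # j
    for k in range(1, len(syndromes) + 1):
        # Step 1: current syndrome plus the locator's prediction of it.
        delta = syndromes[k - 1]
        for i in range(1, len(locator)):
            if k - 1 - i >= 0:
                delta ^= mul(locator[i], syndromes[k - 1 - i])
        if delta:                        # step 2 skips to step 6 when delta is 0
            size = max(len(locator), len(modifier))
            old = locator + [0] * (size - len(locator))
            mod = modifier + [0] * (size - len(modifier))
            locator = [c ^ mul(delta, m) for c, m in zip(old, mod)]  # step 3
            if 2 * order < k:            # step 4 fails, so do step 5
                order = k - order
                modifier = [div(c, delta) for c in old]
        modifier = [0] + modifier        # step 6: multiply T by alpha^-x
    while len(locator) > 1 and locator[-1] == 0:
        locator.pop()
    return locator


def roots(locator):
    """Positions x at which l_x = sum of L_i * alpha^(-i*x) is zero."""
    found = []
    for x in range(7):
        value = 0
        for i, c in enumerate(locator):
            value ^= mul(c, EXP[(-i * x) % 7])
        if value == 0:
            found.append(x)
    return found


locator = berlekamp_massey([4, 2, 0, 4])  # S0 .. S3 of the two-symbol error
print(locator, roots(locator))
```

Checking that the number of roots equals the degree of the returned locator is the "check roots" step at the end of the table.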

7.7 The Forney Algorithm

Rather than perform an entire inverse Fourier transform to find the time domain errors, we need only find values coincident with the roots in the error locator L. It turns out, however, that even this is unnecessary. The Forney algorithm allows direct calculation of the error patterns and has the following form:

$$e_x = \frac{\alpha^{x}\,\Omega(x)}{\Lambda'(x)}$$

Table 7.3 shows the conditions of the original added error, with symbol position x, followed by the error e(x) added to the time-domain message, the frequency-domain error spectrum $E_x$ (of which $E_0$ to $E_3$ are known in a double-symbol correcting message) and the frequency-domain locator, L(x), which is generated by the Berlekamp-Massey algorithm and yields the error positions at its roots.

Table 7.3 Preparing for the Forney algorithm

| x | e(x) | Spectrum $E_x$ | $L(x) = 1 + \alpha^{2-x} + \alpha^{1-2x}$ | $\Omega(x) = \alpha^2 + \alpha^{2-x}$ |
|---|------|----------------|------------------------------------------|--------------------------------------|
| 0 | 0 | 4 ($\alpha^2$) | 7 ($\alpha^5$) | 0 |
| 1 | 0 | 2 ($\alpha^1$) | 6 ($\alpha^4$) | $\alpha^4$ |
| 2 | 0 | 0 | 6 ($\alpha^4$) | $\alpha^6$ |
| 3 | 2 ($\alpha^1$) | 4 ($\alpha^2$) | 0 (root) | $\alpha^0$ |
| 4 | 0 | 6 ($\alpha^4$) | 7 ($\alpha^5$) | $\alpha^3$ |
| 5 | 6 ($\alpha^4$) | 6 ($\alpha^4$) | 0 (root) | $\alpha^1$ |
| 6 | 0 | 2 ($\alpha^1$) | 1 ($\alpha^0$) | $\alpha^5$ |

$\Omega(x)$ is called the error magnitude (or pattern) polynomial and it is generated by multiplying the error locator polynomial, L(x), by the known syndromes S(x) ($E_0$ to $E_3$ in Table 7.3). The position of the syndromes is indicated by multiplying each by decreasing powers of $\alpha^x$, i.e.

$$S(x) = \alpha^2 + \alpha^{1-x} + 0 + \alpha^{2-3x}$$

so

$$\Omega(x) = L(x)S(x) = (1 + \alpha^{2-x} + \alpha^{1-2x})(\alpha^2 + \alpha^{1-x} + \alpha^{2-3x}) = \alpha^2 + \alpha^{2-x}$$

ignoring powers of $\alpha^{-x}$ greater than 2 (the order of the problem for two errors). While Table 7.3 shows all values of $\Omega(x)$, only those at the roots of L are needed (corresponding to the positions where errors occur). To complete the algorithm, we need the value of $\Lambda'(x)$. This is the formal derivative of $\Lambda(x)$, or L(x) in this case and is, surprisingly, generated in a very similar way to that of a normal polynomial in x, although $\alpha^{-x}$ is treated as the variable.

If $L(x) = \alpha^0 + \alpha^{2-x} + \alpha^{1-2x}$ then

$$L'(x) = 0\cdot\alpha^0 + 1\cdot\alpha^2 + 2\cdot\alpha^{1-x}$$

Simply imagine that $\alpha^{2-x}$ was $\alpha^2x$ and $\alpha^{1-2x}$ was $\alpha x^2$, etc. All even terms go, and all odd terms become 1 leaving

$$L'(x) = \alpha^2$$

which is constant. To complete the algorithm, we must multiply the error pattern values by $\alpha^x/L'(x)$. So, at x = 3 we get an error pattern given by

$$e_3 = \alpha^3\cdot\alpha^0/\alpha^2 = \alpha^1 \text{ or } 2$$

and at x = 5

$$e_5 = \alpha^5\cdot\alpha^1/\alpha^2 = \alpha^4 \text{ or } 6$$

which are, of course, the added errors.
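The calculation above can be sketched in a few lines. The locator and syndromes are those of the running example; the helper names are invented, polynomials in $\alpha^{-x}$ are held as coefficient lists, and $\Omega$ is truncated to the number of known syndromes, which for this example leaves the same two non-zero terms as in the text.

```python
# GF(2^3) built on x^3 + x + 1 = 0; EXP[i] holds alpha^i as a 3-bit integer.
EXP = [1, 2, 4, 3, 6, 7, 5]
LOG = {value: power for power, value in enumerate(EXP)}


def mul(a, b):
    if a == 0 or b == 0:
        return 0
    return EXP[(LOG[a] + LOG[b]) % 7]


def div(a, b):
    return 0 if a == 0 else EXP[(LOG[a] - LOG[b]) % 7]


def evaluate(coefficients, x):
    """Value of sum of c_i * alpha^(-i*x) at symbol position x."""
    value = 0
    for i, c in enumerate(coefficients):
        value ^= mul(c, EXP[(-i * x) % 7])
    return value


locator = [1, 4, 2]                      # 1 + alpha^(2-x) + alpha^(1-2x)
syndromes = [4, 2, 0, 4]                 # S0 .. S3

# Omega = L * S, keeping only as many terms as there are known syndromes.
omega = []
for d in range(len(syndromes)):
    acc = 0
    for i, c in enumerate(locator):
        if d - i >= 0:
            acc ^= mul(c, syndromes[d - i])
    omega.append(acc)

# Formal derivative: even-power terms vanish, odd-power terms keep their
# coefficient, and every power drops by one.
derivative = [c if i % 2 == 1 else 0 for i, c in enumerate(locator)][1:]

positions = [x for x in range(7) if evaluate(locator, x) == 0]
errors = {
    x: div(mul(EXP[x], evaluate(omega, x)), evaluate(derivative, x))
    for x in positions
}
print(omega, derivative, errors)
```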

7.8 Mixed-Domain Error Coding While frequency domain Reed-Solomon coding lends itself to large, scalable error correction schemes, it also suffers from some draw-backs. The first of these is latency (wh ich does not always matter) owing to the nature of the forward and inverse Fourier transforms (and, of course subsequent processing if errors occur). To implement these transforms, an entire block of data must be compiled first, which defines the minimum coding and decoding time. For this reason, in some applications with tricky, real-time,

102

7.8 Mixed-Domain Error Coding

low latency constraints, this kind of coding may prove unworkable. A second problem is what happens when the error capacity of the code is exceeded. When the message is converted by inverse Fourier transform into the timedomain coded form, its contents are effectively concealed. The coded timedomain message bears no obvious resemblance to the original uncoded message. Only after applying the forward Fourier transform is the original message revealed. Unfortunately, if the message is not correctable, application of the forward Fourier transform produces only rubbish, a mix of the frequency spectrum of the errors and the message. Rather than errors being confined to bytes here and there, they are spread out over, and consequently obscure, the entire message. In contrast, time-domain coding transmits the message largely as is, augmented by check symbols. Even if correction is not be possible, much of the message may still be intact. While some file types won't tolerate significant errors, (binary executables for example,) many have a context within themselves which means that they are still useful. Examples might be text messages, images and so forth. At the limit, some correction schemes have tried to exploit this in-built context to assist correction. In particular this has been applied to images. What we might like is the convenience of scalable frequency-domain co ding with the potential fault tolerance and lower-Iatency of time-domain coding. As it happens, there is a way that these two techniques can be merged into a mixed approach which yields low latency and data visibility. Starting with a message at the transmitter (and we'll consider a two-symbol correcting code) the message is arranged as

$$d_6,\ d_5,\ d_4,\ 0,\ 0,\ 0,\ 0$$

and four syndromes (i = 0 to 3) are found from

$$S_i = \sum_{j=0}^{6} d_j \alpha^{ij}.$$

These are calculated in exactly the way that they would be at the receiver for a time-domain message. The only difference is that we know they'll be non-zero because the message hasn't been coded yet. They can actually be calculated as the message is being output, keeping latency to a minimum. Once $d_4$ has been output, all syndromes are known. Because there are zeros where there should be check symbols, the errors in the message are, simply, $p_0$ to $p_3$. So it follows that

$$\begin{aligned}
S_0 &= \alpha^0 p_3 + \alpha^0 p_2 + \alpha^0 p_1 + \alpha^0 p_0\\
S_1 &= \alpha^3 p_3 + \alpha^2 p_2 + \alpha^1 p_1 + \alpha^0 p_0\\
S_2 &= \alpha^6 p_3 + \alpha^4 p_2 + \alpha^2 p_1 + \alpha^0 p_0\\
S_3 &= \alpha^2 p_3 + \alpha^6 p_2 + \alpha^3 p_1 + \alpha^0 p_0.
\end{aligned}$$

There are several ways that $p_0$ to $p_3$ can be determined. These equations can be arranged into a matrix and solved by inversion, dedicated (but non-scalable) solutions could be constructed, we could use recursive extension or, since only magnitudes are required as we know position, Forney's algorithm might be employed. Which solution is chosen depends on permissible latency, hardware complexity and so forth. Using the message

6, 5, 4, 0, 0, 0, 0 gives $S_3 = 2$, $S_2 = 3$, $S_1 = 0$ and $S_0 = 7$.
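The matrix route mentioned above can be sketched in a few lines of Python. This is only an illustration: it assumes GF(2^3) is generated by x^3 + x + 1 (a choice that reproduces the syndromes quoted here), and the helper names `mul`, `inv`, `syndromes` and `solve` are invented for the example.

```python
# GF(2^3) antilog table, assuming the primitive polynomial x^3 + x + 1.
EXP = [1, 2, 4, 3, 6, 7, 5]              # alpha^0 .. alpha^6
LOG = {v: i for i, v in enumerate(EXP)}

def mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[(LOG[a] + LOG[b]) % 7]

def inv(a):
    return EXP[(-LOG[a]) % 7]

def syndromes(d, count):
    """S_i = sum_j d_j * alpha^(i*j); d[j] is the symbol at position j."""
    out = []
    for i in range(count):
        s = 0
        for j, dj in enumerate(d):
            s ^= mul(dj, EXP[(i * j) % 7])
        out.append(s)
    return out

def solve(A, b):
    """Gauss-Jordan elimination over GF(8); A must be square and invertible."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for c in range(n):
        p = next(r for r in range(c, n) if M[r][c])   # find a pivot row
        M[c], M[p] = M[p], M[c]
        scale = inv(M[c][c])
        M[c] = [mul(scale, v) for v in M[c]]
        for r in range(n):
            if r != c and M[r][c]:
                f = M[r][c]
                M[r] = [v ^ mul(f, w) for v, w in zip(M[r], M[c])]
    return [M[r][n] for r in range(n)]

# The message 6, 5, 4, 0, 0, 0, 0 is written d6..d0, so reverse it into index order.
d = [0, 0, 0, 0, 4, 5, 6]
S = syndromes(d, 4)                                   # [S0, S1, S2, S3]
A = [[EXP[(i * j) % 7] for j in range(4)] for i in range(4)]  # column j multiplies p_j
p = solve(A, S)                                       # [p0, p1, p2, p3]
coded = p + d[4:]                                     # check symbols replace the zeros
print(S, p, coded[::-1])
```

Matrix inversion is used here purely because it is the shortest to write down; for larger codes the recursive extension or Forney routes discussed in the text scale better.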

Forney's algorithm probably represents the fastest and most economic way to tackle this. Unfortunately, to evaluate this algorithm, we require $S_0$ to $S_4$ since $\Omega(x)$ requires powers up to $\alpha^{-4x}$ for four errors. The first thing that must be done, therefore, is to calculate $S_4$ using recursive extension. This does not involve solving the key equations since the positions of the errors (check symbols in this case) are known to be at x = 0 to 3, so the locator L will be

$$L(x) = (\alpha^x + \alpha^0)(\alpha^x + \alpha^1)(\alpha^x + \alpha^2)(\alpha^x + \alpha^3)$$

giving

$$L(x) = \alpha^{4x} + \alpha^{2}\alpha^{3x} + \alpha^{5}\alpha^{2x} + \alpha^{5}\alpha^{x} + \alpha^{6},$$

where L(x) = 0 for x = 0 to 3. Previously (by multiplying with $\alpha^{-4x}$) this has been in the form

$$L(x) = 1 + L_1\alpha^{-x} + L_2\alpha^{-2x} + L_3\alpha^{-3x} + L_4\alpha^{-4x}$$

and rearranging by multiplying L(x) by $\alpha^{-4x}$ gives $L_1 = \alpha^2$, $L_2 = \alpha^5$, $L_3 = \alpha^5$ and $L_4 = \alpha^6$. These values depend only on the order of the error code (t), not on the syndromes. In this respect, they represent a minimal amount of information that must be pre-calculated for a scalable solution. Using these and the known syndromes, the spectrum of the check symbols can be recursively extended giving

$$\begin{aligned}
S_4 &= L_1 S_3 + L_2 S_2 + L_3 S_1 + L_4 S_0\\
S_5 &= L_1 S_4 + L_2 S_3 + L_3 S_2 + L_4 S_1
\end{aligned}$$

and so forth. The completed spectrum ($S_6$ down to $S_0$) is

$$\alpha^5,\ \alpha^4,\ \alpha^5,\ \alpha^1,\ \alpha^3,\ 0,\ \alpha^5$$

although for Forney's algorithm we need only the one extra value, $S_4 = \alpha^5$ (the third entry above). If the complete spectrum is known, the check symbols can be found by inverse Fourier transforming it, giving 0, 0, 0, $\alpha^2$, $\alpha^3$, 0, 0 so the transmitted message is 6, 5, 4, 4, 3, 0, 0.

More probably though, over a large message it would be uneconomic and slow to complete the recursive extension and inverse Fourier transform. It was included here by way of illustration. To recap, Forney's algorithm is given by

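where

$$\Omega(x) = S(x)L(x) = (S_0 + S_1\alpha^{-x} + S_2\alpha^{-2x} + S_3\alpha^{-3x} + S_4\alpha^{-4x})(1 + L_1\alpha^{-x} + L_2\alpha^{-2x} + L_3\alpha^{-3x} + L_4\alpha^{-4x}),$$

keeping terms to $\alpha^{-4x}$. From L(x),

$$\Lambda'(x) = \alpha^2 + \alpha^{5-2x}$$

so the check symbols are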


At the receiver, 2t syndromes are calculated as normal. If they are zero, then the message is correct and no further processing is required. If not, again, there is a choice of solutions for the code. Adding the error e = 0, 0, 5, 0, 2, 0, 0 gives the received message

6, 5, 1, 4, 1, 0, 0.

The syndromes are $S_3 = \alpha^5$, $S_2 = \alpha^4$, $S_1 = 0$, $S_0 = \alpha^5$. Unlike calculation of the four check symbols, this time the error positions are unknown. However, the solution essentially follows that of frequency-domain coding. The BM algorithm is used to construct an REC which completes the error spectrum (of which $S_0$ to $S_3$ form the first four elements in this example). Either an inverse Fourier transform or the Forney algorithm is then used to calculate the actual errors, which are added directly to the received message. This gives $L_1 = \alpha^1$ and $L_2 = \alpha^6$ and roots at x = 2 and x = 4. So now we need $\Omega(x)$ which is

ignoring values which will exceed $\alpha^{-2x}$ (such as $S_3$)


Table 7.4 Calculating L(x) for Mixed Domain Coding

Steps: (1) k = k + 1; (2) d = S(k-1) + sum of S*L, if d = 0 jump to step 6; (3) L(k) = L(k-1) + d*T; (4) if 2j >= k jump to step 6; (5) j = k - j, T = L(k-1)/d; (6) T = T x α^(-x); (7) if k < 2t jump to step 1.

k     S(k-1)   d                  L(k)                          j    T
Init  -        -                  1                             0    α^(-x)
1     α^5      α^5 + 0 = α^5      1 + α^(5-x)                   1    1/α^5 = α^2, so α^(2-x)
2     0        0 + α^3 = α^3      1                             1    α^(2-2x)
3     α^4      α^4 + 0 = α^4      1 + α^(6-2x)                  2    1/α^4 = α^3, so α^(3-x)
4     α^5      α^5 + 0 = α^5      1 + α^(1-x) + α^(6-2x)        End  Check roots

From L(x), $\Lambda'(x) = \alpha$, so the values at x = 2 and 4 are $\alpha^1$ (2) and $\alpha^6$ (5) respectively, which is the error that was added.

Correction is no more than adding these to the received time-domain symbols at positions 2 and 4.
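The receiver side of this example can be sketched as follows. The Berlekamp-Massey loop follows the steps of Table 7.4; the field polynomial x^3 + x + 1 and every function name are assumptions made for this illustration. For brevity the two magnitudes are found by solving $S_0 = e_1 + e_2$ and $S_1 = \alpha^{x_1}e_1 + \alpha^{x_2}e_2$ directly, which is a substitute for Forney's algorithm rather than an implementation of it.

```python
# GF(2^3) antilog table, assuming the primitive polynomial x^3 + x + 1.
EXP = [1, 2, 4, 3, 6, 7, 5]
LOG = {v: i for i, v in enumerate(EXP)}

def mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[(LOG[a] + LOG[b]) % 7]

def inv(a):
    return EXP[(-LOG[a]) % 7]

def syndromes(r, count):
    """S_i = sum_j r_j * alpha^(i*j), with r[j] the symbol at position j."""
    out = []
    for i in range(count):
        s = 0
        for j, rj in enumerate(r):
            s ^= mul(rj, EXP[(i * j) % 7])
        out.append(s)
    return out

def padd(p, q):
    """Add two polynomials held as coefficient lists, lowest power first."""
    n = max(len(p), len(q))
    return [x ^ y for x, y in zip(p + [0] * (n - len(p)), q + [0] * (n - len(q)))]

def berlekamp_massey(S):
    """Return [1, L1, L2, ...] using the step order of Table 7.4."""
    L, T, j = [1], [0, 1], 0                 # T starts as alpha^-x
    for k in range(1, len(S) + 1):
        d = S[k - 1]
        for m in range(1, len(L)):
            if k - 1 - m >= 0:
                d ^= mul(L[m], S[k - 1 - m])
        if d:
            new_L = padd(L, [mul(d, t) for t in T])
            if 2 * j < k:
                j = k - j
                T = [mul(inv(d), c) for c in L]   # T = L(k-1) / d
            L = new_L
        T = [0] + T                          # T = T x alpha^-x
    return L

def roots(L):
    """Positions x where 1 + L1 alpha^-x + L2 alpha^-2x + ... is zero."""
    found = []
    for x in range(7):
        v = 0
        for m, c in enumerate(L):
            v ^= mul(c, EXP[(-m * x) % 7])
        if v == 0:
            found.append(x)
    return found

received = [0, 0, 1, 4, 1, 5, 6]             # 6, 5, 1, 4, 1, 0, 0 in index order
S = syndromes(received, 4)
L = berlekamp_massey(S)
x1, x2 = roots(L)
e2 = mul(S[1] ^ mul(EXP[x1], S[0]), inv(EXP[x1] ^ EXP[x2]))
e1 = S[0] ^ e2
corrected = received[:]
corrected[x1] ^= e1
corrected[x2] ^= e2
print(L, (x1, x2), (e1, e2), corrected[::-1])
```

The sketch assumes exactly two errors are present; a fuller decoder would check the number of roots against the degree of L before correcting.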


Coding and error correction of mixed-domain messages is essentially identical to the time-domain examples shown previously. However, it is possible to both simplify coding/decoding and add scalability, by borrowing from frequency-domain techniques. The net effect is to provide a scheme which requires considerably less processing (and hence latency) than frequency-domain coding but more scalability than time-domain coding. Also data visibility is maintained in the event of uncorrectable errors.

7.9 Higher Dimensional Data Structures

You will have noticed that the maximum message size of an RS coded message is set by the field over which the coding is performed, i.e., for any GF($2^n$), the message can't exceed $n(2^n - 1)$ bits. A message which exceeds this limit must either be packaged into two blocks, or a larger field employed and the message truncated. The use of small fields is attractive in low-cost micro-controller solutions since look-up tables can be used to speed up processing without sacrificing large amounts of memory. However, splitting a message up into small blocks in order to use a small field is not always desirable. Ideally, it would be better to have one block of one hundred bytes, say, coded to withstand ten errors rather than ten blocks of ten bytes, each coded to withstand a single error. The reason for this is that the distribution of ten errors in the former message is irrelevant, whereas in the latter case it is crucial. Unless the ten errors are distributed with no more than one error per 10-byte block, the latter code will be unrecoverable.

To assist in the trade between message and field size, it is possible to use both time- and frequency-domain techniques over data structures of more than one dimension. For example, the one-dimensional messages used in most of the previous examples were limited to 21 bits (i.e. seven 3-bit symbols). By arranging the data in a two-dimensional way, this is increased to 147 bits or forty-nine 3-bit symbols. Using three dimensions this increases to 1029 bits and so forth. The operating principles are virtually the same, so any of the previous techniques can be applied with a little care. Consider first a very simple two-dimensional time-domain example over GF($2^3$). The message is now $d_{i,j}$ ($0 \le i, j \le 6$). In order to correct a single error in this structure three constraints are imposed, as follows

$$\sum_{i=0,\,j=0}^{6} d_{i,j} = 0 \tag{7.8}$$

$$\sum_{i=0,\,j=0}^{6} d_{i,j}\,\alpha^{i} = 0 \tag{7.9}$$

$$\sum_{i=0,\,j=0}^{6} d_{i,j}\,\alpha^{j} = 0 \tag{7.10}$$

To do this, three symbols must be sacrificed for check symbols, and their positions within the structure are not arbitrary. To locate and correct an error three syndromes are required at the receiver, $S_0$ to $S_2$. With care, it can be arranged that, in the event of a single error, say $e_{i,j}$, at position (i, j), $S_0$ will be equal to e, $S_1$ will be equal to $e\alpha^i$ and $S_2$ will be equal to $e\alpha^j$. To locate the error, we simply calculate $\alpha^i = S_1/S_0$ and $\alpha^j = S_2/S_0$. The simplest place to put the check symbols is at $d_{0,0}$, $d_{0,1}$ and $d_{1,0}$. For each extra dimension, a further check symbol is placed in position 1 of that dimension, with all other indexes at 0. Consider a three-dimensional example over GF($2^4$) using

$$\sum_{i=0,\,j=0,\,k=0}^{14} d_{i,j,k} = 0 \quad (S_0)$$

$$\sum_{i=0,\,j=0,\,k=0}^{14} d_{i,j,k}\,\alpha^{i} = 0 \quad (S_1)$$

$$\sum_{i=0,\,j=0,\,k=0}^{14} d_{i,j,k}\,\alpha^{j} = 0 \quad (S_2)$$

$$\sum_{i=0,\,j=0,\,k=0}^{14} d_{i,j,k}\,\alpha^{k} = 0 \quad (S_3)$$

we must arrange for the check symbols to reside in different axes. The first check symbol, $p_{1,0,0}$, is offset one position into the i axis, replacing data symbol $d_{1,0,0}$. The second check symbol, $p_{0,1,0}$, is offset into the j axis by one, replacing $d_{0,1,0}$, and similarly, the third symbol, $p_{0,0,1}$, is offset into the k axis by one. A last check symbol, $p_{0,0,0}$, will intersect all axes in the same place, replacing $d_{0,0,0}$. Over such a large field, algebraic generation of the check symbols would be impractical. Instead, we can use the same techniques as before, treating the check symbols as errors in known positions. Setting the check symbols to zero and calculating the syndromes gives

$$\begin{aligned}
S_0 &= p_{0,0,0} + p_{1,0,0} + p_{0,1,0} + p_{0,0,1}\\
S_1 &= p_{0,0,0} + \alpha^1 p_{1,0,0} + p_{0,1,0} + p_{0,0,1}\\
S_2 &= p_{0,0,0} + p_{1,0,0} + \alpha^1 p_{0,1,0} + p_{0,0,1}\\
S_3 &= p_{0,0,0} + p_{1,0,0} + p_{0,1,0} + \alpha^1 p_{0,0,1}
\end{aligned}$$

Solving, using $x^4 + x + 1 = 0$, gives

$$\begin{aligned}
p_{1,0,0} &= (S_0 + S_1)/(\alpha + 1) = (S_0 + S_1)\alpha^{11}\\
p_{0,1,0} &= (S_0 + S_2)/(\alpha + 1) = (S_0 + S_2)\alpha^{11}\\
p_{0,0,1} &= (S_0 + S_3)/(\alpha + 1) = (S_0 + S_3)\alpha^{11}\\
p_{0,0,0} &= S_0 + p_{1,0,0} + p_{0,1,0} + p_{0,0,1}
\end{aligned}$$

At the receiver, if any recalculated syndrome is non-zero, then the three-dimensional error position, i, j, k, is found from

$$\alpha^i = S_1/S_0, \qquad \alpha^j = S_2/S_0, \qquad \alpha^k = S_3/S_0$$

and the error pattern, $e_{i,j,k}$, from $S_0$. To correct the error we simply do the operation $d_{i,j,k} = d_{i,j,k} + S_0$. Clearly, as t is increased, the complexity of the coding increases exactly as it does with one-dimensional data structures. To make the problem tractable, we need to investigate frequency-domain techniques over these multi-dimensional structures. The Fourier transforms for two- and three-dimensional data sets are given below.
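Before moving on to the transforms, the single-error three-dimensional scheme just described can be sketched in Python. The cube is held sparsely as a dictionary, the data values and the error position are arbitrary illustrations, and the function names are invented for this example; only the check-symbol formulas and the field polynomial $x^4 + x + 1$ come from the text.

```python
# GF(2^4) antilog table for x^4 + x + 1 = 0.
EXP = [1, 2, 4, 8, 3, 6, 12, 11, 5, 10, 7, 14, 15, 13, 9]
LOG = {v: i for i, v in enumerate(EXP)}
CHECKS = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]

def mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[(LOG[a] + LOG[b]) % 15]

def syndromes(cube):
    """cube maps (i, j, k) -> symbol; positions not present are zero."""
    s = [0, 0, 0, 0]
    for (i, j, k), v in cube.items():
        s[0] ^= v
        s[1] ^= mul(v, EXP[i])
        s[2] ^= mul(v, EXP[j])
        s[3] ^= mul(v, EXP[k])
    return s

def encode(data):
    """Fill the four check positions so that S0..S3 all become zero."""
    cube = dict(data)
    for pos in CHECKS:
        cube[pos] = 0
    s0, s1, s2, s3 = syndromes(cube)
    a11 = EXP[11]                            # 1 / (alpha + 1)
    cube[(1, 0, 0)] = mul(s0 ^ s1, a11)
    cube[(0, 1, 0)] = mul(s0 ^ s2, a11)
    cube[(0, 0, 1)] = mul(s0 ^ s3, a11)
    cube[(0, 0, 0)] = s0 ^ cube[(1, 0, 0)] ^ cube[(0, 1, 0)] ^ cube[(0, 0, 1)]
    return cube

def correct(cube):
    """Locate and fix at most one symbol error; returns its position or None."""
    s0, s1, s2, s3 = syndromes(cube)
    if s0 == 0:
        return None
    pos = tuple((LOG[s] - LOG[s0]) % 15 for s in (s1, s2, s3))
    cube[pos] = cube.get(pos, 0) ^ s0
    return pos

# Arbitrary illustrative data (any positions other than the four check positions).
data = {(3, 7, 2): 9, (14, 0, 5): 1, (1, 1, 1): 12, (0, 0, 2): 7}
coded = encode(data)
received = dict(coded)
received[(5, 11, 8)] = received.get((5, 11, 8), 0) ^ 6   # inject one error
found = correct(received)
print(found)
```

A dictionary keeps the example short; a real implementation over a full 15 x 15 x 15 block would more likely accumulate the four syndromes as symbols stream past.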

$$F_{i,j} = \sum_{x=0,\,y=0}^{m-1} d_{x,y}\,\alpha^{(i\cdot x)+(j\cdot y)} \quad\text{and}\quad d_{i,j} = \sum_{x=0,\,y=0}^{m-1} F_{x,y}\,\alpha^{-(i\cdot x)-(j\cdot y)}$$

$$F_{i,j,k} = \sum_{x=0,\,y=0,\,z=0}^{m-1} d_{x,y,z}\,\alpha^{(i\cdot x)+(j\cdot y)+(k\cdot z)} \quad\text{and}\quad d_{i,j,k} = \sum_{x=0,\,y=0,\,z=0}^{m-1} F_{x,y,z}\,\alpha^{-(i\cdot x)-(j\cdot y)-(k\cdot z)}$$

where $m = 2^n - 1$, over the field GF($2^n$). Using the manageable field GF($2^3$), Table 7.5 shows a 49-symbol message arranged into a two-dimensional matrix, with three zeros (bold) inserted into positions (0, 0), (0, 1) and (1, 0). These are sufficient to give a single-symbol error correction capability to the message. Here, the origin is at the top-left corner.

Table 7.5 Two-dimensional message over GF($2^3$)

2

5 2

3 0 3 0 6

0 2 7 2 6 2 5

6 5 2 1 2 3 1

6 2 3 1 6 2 6

7 3 7 6 0 1

1

4 0 4 0 6 5 6

Table 7.6 shows the message in the time domain after inverse Fourier transforming. The boxed element at (4, 5) will be corrupted upon arrival at the receiver by adding 6 to it. Applying the Fourier transform to the corrupted message gives the results shown in Table 7.7. The Fourier transform of the error itself is given in Table 7.8.

Table 7.6 Time domain form of the message

4 7 7 0 2 2 7

0 0 1 3 4 2 5

1 4 4 0 3 7 3

5 5 0 7 7 6 6 4 1 3 3 [I] 6 0

7 6 6 5 1 6 2

0 0 7 6 2 2 3

From the two tables, it is clear that wherever there is a zero in the original message, the corrupted message (Table 7.7) now shares a common value with the error spectrum. Because three zeros were deliberately placed in the top-left corner of the message, we are guaranteed to know at least three components of the error spectrum from the received and corrupted message.
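The error spectrum of this example is easy to reproduce numerically. The sketch below applies the two-dimensional transform to a lone error of 6 at (4, 5) and then rebuilds the whole spectrum from just the three corner values, which is the recursive extension idea developed in the following paragraphs. It assumes GF(2^3) is generated by x^3 + x + 1; the names are illustrative only.

```python
# GF(2^3) antilog table, assuming the primitive polynomial x^3 + x + 1.
EXP = [1, 2, 4, 3, 6, 7, 5]
LOG = {v: i for i, v in enumerate(EXP)}
M = 7

def mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[(LOG[a] + LOG[b]) % M]

def inv(a):
    return EXP[(-LOG[a]) % M]

def transform(grid, sign=1):
    """2-D transform of grid[(x, y)] -> symbol; sign=-1 gives the inverse."""
    out = {}
    for i in range(M):
        for j in range(M):
            acc = 0
            for (x, y), v in grid.items():
                acc ^= mul(v, EXP[(sign * (i * x + j * y)) % M])
            out[(i, j)] = acc
    return out

error = {(4, 5): 6}                          # the single time-domain error
E = transform(error)                         # its 49-element spectrum
top_row = [E[(i, 0)] for i in range(M)]
first_col = [E[(0, j)] for j in range(M)]
second_row = [E[(i, 1)] for i in range(M)]

# Feedback multipliers from the three known corner values.
Li = mul(E[(1, 0)], inv(E[(0, 0)]))          # horizontal multiplier
Lj = mul(E[(0, 1)], inv(E[(0, 0)]))          # vertical multiplier

# Rebuild the first column, then run along every row.
rebuilt = {(0, 0): E[(0, 0)]}
for j in range(1, M):
    rebuilt[(0, j)] = mul(rebuilt[(0, j - 1)], Lj)
for j in range(M):
    for i in range(1, M):
        rebuilt[(i, j)] = mul(rebuilt[(i - 1, j)], Li)

back = {pos: v for pos, v in transform(rebuilt, -1).items() if v}
print(top_row, first_col, back)
```

No scaling factor appears in the inverse because m = 7 is odd, so dividing by m is the same as dividing by 1 in a field of characteristic 2.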

Table 7.7 Corrupted frequency domain message

7

6 1 7 7

5 4 2 7 5

7 1 5 7 0

2 4 5 2 0

6

6

4

7

3 4 7 0 1 1 4

4 1 2 0 4 0

6

5 7 7 2 3 3 2

Table 7.8 Frequency domain form of the error

6 2 7 4 5 3 1
4 5 3 1 6 2 7
1 6 2 7 4 5 3
7 4 5 3 1 6 2
3 1 6 2 7 4 5
2 7 4 5 3 1 6
5 3 1 6 2 7 4

Using an REC for one error, the top row of the error spectrum can be completed using the two known spectra at i = 0 and 1. For a single error, the key equations give $L_{i1} = E_{1,0}/E_{0,0}$ so $L_{i1} = \alpha^1/\alpha^4$ or $\alpha^4$. Using this feedback multiplier and preloading the REC with $\alpha^4$, the sequence $\alpha^{4+4i}$ is generated, where $0 \le i \le 6$, which is 6, 2, 7, 4, 5, 3, 1. We also know two error spectra in the first column, 6 and 4. Using these in exactly the same way as above, $L_{j1} = \alpha^2/\alpha^4$ or $\alpha^5$, giving the sequence $\alpha^{4+5j}$ where $0 \le j \le 6$, or 6, 4, 1, 7, 3, 2, 5. In this simple case, $L(i) = 1 + \alpha^{4-i}$ and $L(j) = 1 + \alpha^{5-j}$ and this directly betrays the error location (cycle i and j and note where the zeros occur). $E_{0,0}$ is, of course, the error magnitude or pattern. Nonetheless, it's instructive to complete this process. Having constructed the first row and column of the error spectrum using the three known values in the top-left corner of the message, at least one spectral component is now known for each row and column. Since L(i) and L(j) are also known, only one value is required per row, or column, to start the REC. Consider the second row which starts with $E_{0,1} = \alpha^2$. Using $L_{i1} = \alpha^4$, the REC gives the sequence

$$\begin{aligned}
E_{1,1} &= \alpha^2 \cdot \alpha^4 = \alpha^6, & E_{2,1} &= \alpha^6 \cdot \alpha^4 = \alpha^3,\\
E_{3,1} &= \alpha^3 \cdot \alpha^4 = \alpha^0, & E_{4,1} &= \alpha^0 \cdot \alpha^4 = \alpha^4,\\
E_{5,1} &= \alpha^4 \cdot \alpha^4 = \alpha^1, & E_{6,1} &= \alpha^1 \cdot \alpha^4 = \alpha^5,\\
E_{0,1} &= \alpha^5 \cdot \alpha^4 = \alpha^2
\end{aligned}$$

or

4, 5, 3, 1, 6, 2, 7

which agrees with the actual error spectrum. We could complete the spectrum by columns in exactly the same way, preloading with the top element of each column. Taking the second column as an example, $L_{j1} = \alpha^5$ and using the known spectral component 2 ($\alpha$), we have the sequence

2, 5, 6, 4, 1, 7, 3.

Once the spectrum has been completed, an inverse Fourier transform yields the time-domain error which can be added to the received message, correcting it.

Extending the two-dimensional transform to more errors requires a little consideration of the problem. The REC for two errors has two multipliers, $L_1$ and $L_2$. To find these in either direction (i or j), we need to know four spectral components in a row. Once $L_1$ and $L_2$ are known, two spectra are needed to reconstruct any other row. To this end the double-correcting two-dimensional message is set up as shown in Table 7.9, while Table 7.10 shows the time-domain (coded) message.

Table 7.9 Creating a double-correcting message

2 3 1

6 2 3 1 6 2 6

7 4 3 0 7 4 6 0 0 6 1 5 1 6


Table 7.10 Time domain (encoded) message before corruption

7 2 7 1

5 0 3

4 4 1 5 6 0 7 [1]3 7 4 6 2 7 0 3 5 1 3 3 1

3 4 3 0

0 3 4 7 0

4

1

1

[I] 5

2 1

4 2 3 2 7

In Table 7.9 the four added zeros (bold) on the first row and column allow initial calculation of the two multipliers in either direction since, in the event of errors, we will know four adjacent spectral components in both horizontal and vertical directions. At position (1, 1) another zero means that two spectral components are known in both the second row and column. We can, therefore, complete both a second row and column using multipliers calculated from the first row and column. In this way, the complete spectrum can be constructed. Table 7.10 shows the time domain message after inverse Fourier transforming, and Table 7.11 shows the decoded (frequency domain) message after the addition of two errors, 6 at (4, 5) and 5 at (2, 2). The error spectrum is also shown in Table 7.12 for comparison. From the key equations

$$L_1 = \frac{E_0E_3 + E_1E_2}{E_1^2 + E_0E_2}, \qquad L_2 = \frac{E_1E_3 + E_2^2}{E_1^2 + E_0E_2}$$

and working across the top row of the received message,

$$L_{i1} = \frac{\alpha^3\alpha^3 + 0}{0 + \alpha^3\alpha^2} = \alpha, \qquad L_{i2} = \frac{0 + \alpha^2\alpha^2}{0 + \alpha^3\alpha^2} = \alpha^6.$$

Preloading the REC with 3 and 0 generates the sequence

3, 0, 4, 3, 4, 7, 7

the top row of the error spectrum. Doing the same but vertically, $L_{j1} = \alpha^3$ and $L_{j2} = \alpha^0$. This time we generate the first column after preloading with 3 and 6:

3, 6, 2, 0, 2, 6, 3
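The key equations and the two-multiplier REC can be checked with a short Python sketch. As elsewhere, the field polynomial x^3 + x + 1 is an assumption that happens to reproduce the numbers in the text, and `key_equations` and `extend` are names made up for the illustration.

```python
# GF(2^3) antilog table, assuming the primitive polynomial x^3 + x + 1.
EXP = [1, 2, 4, 3, 6, 7, 5]
LOG = {v: i for i, v in enumerate(EXP)}

def mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[(LOG[a] + LOG[b]) % 7]

def inv(a):
    return EXP[(-LOG[a]) % 7]

def key_equations(E0, E1, E2, E3):
    """Two-error multipliers L1, L2 from four consecutive spectral values."""
    den = inv(mul(E1, E1) ^ mul(E0, E2))
    L1 = mul(mul(E0, E3) ^ mul(E1, E2), den)
    L2 = mul(mul(E1, E3) ^ mul(E2, E2), den)
    return L1, L2

def extend(seed, taps, n=7):
    """Recursive extension: next = taps[0]*seq[-1] + taps[1]*seq[-2] + ..."""
    seq = list(seed)
    while len(seq) < n:
        nxt = 0
        for m, t in enumerate(taps, start=1):
            nxt ^= mul(t, seq[-m])
        seq.append(nxt)
    return seq

row_taps = key_equations(3, 0, 4, 3)         # four known values on the top row
col_taps = key_equations(3, 6, 2, 0)         # four known values in the first column
top_row = extend([3, 0], row_taps)
first_col = extend([3, 6], col_taps)
second_row = extend([6, 6], row_taps)        # two known values start any other row
print(row_taps, col_taps, top_row, first_col, second_row)
```

The same `extend` helper works for any number of taps, so a stronger horizontal code only changes the length of the `taps` tuple and of the seed.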


Table 7.11 Reconstructed message with errors

1 4 5 4 4

0 3 1

2 0 1 5 3 2 3

0 7 7 2 7 7 7

3 2 5 1 4 2 6

4 2 2 4 5 0 5

7 4 0 4 7 6 6

7 2 1 1 2 7 0

Table 7.12 Error spectrum

3 0 4 3 4 7 7
6 6 4 0 2 4 2
2 1 3 3 2 0 1
0 5 1 5 4 4 1
2 5 0 7 5 7 2
6 1 1 7 0 6 7
3 6 3 5 5 6 0

As before, we can decide whether to complete rows or columns. Having completed the first row and column, there is at least one known spectral element per row and column. Because $E_{1,1}$ is also a known component (unlike the single error case), the second column and row may also be completed. Remember, $L_1$ and $L_2$ are known now so only two values are needed to start the REC. Continuing with the next row, the first two components are 6, 6 so the complete row is 6, 6, 4, 0, 2, 4, 2.

There are now two known spectral components in every column, so all the columns can be completed, completing the error spectrum. An inverse transform returns these into the time domain, shown in Table 7.13, while Table 7.14 shows one possible ordering of the completion of the error spectrum.

Table 7.13 Time domain error pattern

All 49 symbols are 0 except for 5 at (2, 2) and 6 at (4, 5).

Table 7.14 Reconstructing the error spectrum

Order of completion: (1) row 0, (2) column 0, (3) row 1, (4) column 1, (5) to (9) columns 2 to 6.

The BM and Forney algorithms can be used to find L and the error magnitudes, but these need to be applied as two one-dimensional problems. The zeros in L(i) and L(j) locate the error positions, but with two solutions. In this example we would find errors at i = 2 and 4, and j = 2 and 5. However, only by solving Forney's algorithm in both i and j can we know that the errors are $\alpha^6$ at (2, 2) and $\alpha^4$ at (4, 5) rather than $\alpha^6$ at (2, 5) and $\alpha^4$ at (4, 2).

The maximum dimension that a data structure can have depends on the size of the field. Once the dimensions exceed m, the transform repeats itself. Because each error now consists of more than two unknowns (since position is no longer a one-dimensional quantity) the efficiency of these coding schemes will always be lower than the one-dimensional case. The potential gain is in terms of the simplicity of the processing. The smaller the symbol bit-widths, the smaller the number of elements, and the simpler the elements are to manipulate in hardware. Using this technique would allow messages of up to about 2.4 Mbits to be encoded over GF($2^3$). The smallest field, GF($2^2$), also starts to become practical at 54 bits. GF($2^2$) is a particularly interesting field because the symbol patterns (or magnitudes) are (apart from 0) one greater than their power, so $\alpha^0 = 1$, $\alpha^1 = 2$, $\alpha^2 = 3$. All computational circuits are, therefore, very simple to implement, with logging and anti-logging becoming trivial.

An interesting facet of two-dimensional coding is that the code strengths in either dimension need not be the same. Because messages are generally transmitted serially, a burst error is liable to disrupt a group of bits in close relative proximity. If the two-dimensional data structure is transmitted a row at a time, this might mean that significant parts of a row are lost, culminating in several corrupted columns. However, the burst error may be confined to only one or two rows. In this case, a single or double correcting code may be sufficient vertically, whereas a stronger code will be required horizontally. Table 7.15 illustrates a possible arrangement to cope with horizontal burst errors. This has the same coding rate as the previous double-symbol correcting code, but is arranged to allow location of three column errors, provided they are contained within a single row.

Table 7.15 Arranging Strong Horizontal Codes

0 0 6 1 2 5 2

0 0 3 0 3 0 6

0 0 0/5 7 2 2 1 6 2 2 3 5 1

0 2 3 1 6 2 6

0/4 3 0 7 4 6 0 0 6 1 5 1 6

In practice, we'd probably want a double-correcting code vertically because a burst error could easily cross two rows. However, the small size of GF($2^3$) is a bit limiting for this kind of fine-tuning; GF($2^4$) is far more practical. Continuing this example, a burst error of the form in Table 7.16, below, would be correctable, even though it contains three errors.

Table 7.16 A 4-bit burst error with three errors

0 0 0 0 0 0 0

0 0 0 0 0 0 0

0 0

15

0 0 0 0

0 0 0 0 0 0 0

0 0 4 0 0 6 0

0 0 1

0 0 0 0

0 0 10 0 0 0 0


Adding this error to the time-domain form of the message of Table 7.10 gives rise to the following message, after decoding, in Table 7.17. The bold symbols are the known error spectra.

Table 7.17 Message with 4-bit burst error, after decoding

0 0 6 1 2 5 2

0 0 3

3

4 713 6 7 6 3 1 7 4 7

2 1 4

o 11 3 7 6

2 7 7 7 1

°° °° ° ° ° 3

6

2 4 3

1 1

Solving the top line with six known spectra gives rise to the locator

$$L(i) = 1 + \alpha^{6-x} + \alpha^{4-3x}.$$

Using this, the top row is completed giving 0, 0, 3, 4, 2, 0, 5. The second row may now be completed, using the same locator, giving 0, 0, 7, 6, 3, 0, 2. The code was constructed to correct one vertical error, which requires two elements. From the top two rows it can be seen that

$$L(j) = 1 + \alpha^{2-x}.$$

Taking any value in the reconstructed top row, and multiplying it by $\alpha^2$, gives the value in the same column of the second reconstructed row. If the number of vertical errors (rows containing errors) had exceeded one, this would not be true. This allows all columns to be reconstructed, giving the completed spectrum in Table 7.18, below. In the table, subscripts are used to indicate the order in which the spectrum has been reconstructed. From the completed spectrum, the errors are found and correction performed.
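As a quick check on the two locators, the following sketch regenerates both reconstructed rows with a three-tap recursive extension and confirms the vertical multiplier of $\alpha^2$. It assumes x^3 + x + 1 as the field polynomial, and the tap list [α^6, 0, α^4] is simply the coefficients of L(i) read off from the expression above.

```python
# GF(2^3) antilog table, assuming the primitive polynomial x^3 + x + 1.
EXP = [1, 2, 4, 3, 6, 7, 5]
LOG = {v: i for i, v in enumerate(EXP)}

def mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[(LOG[a] + LOG[b]) % 7]

def extend(seed, taps, n=7):
    """Recursive extension: next = taps[0]*seq[-1] + taps[1]*seq[-2] + ..."""
    seq = list(seed)
    while len(seq) < n:
        nxt = 0
        for m, t in enumerate(taps, start=1):
            nxt ^= mul(t, seq[-m])
        seq.append(nxt)
    return seq

horizontal = [EXP[6], 0, EXP[4]]             # L1 = alpha^6, L2 = 0, L3 = alpha^4
top_row = extend([0, 0, 3], horizontal)
second_row = extend([0, 0, 7], horizontal)

# One vertical error means every column is a geometric sequence in alpha^2.
third_row = [mul(v, EXP[2]) for v in second_row]
print(top_row, second_row, third_row)
```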

Table 7.18 Reconstructed error spectrum

0    0    3    4    2    0    5₁
0    0    7    6₂   3₂   0₂   2₂
0₃   0₄   1₅   5₆   7₇   0₈   3₉
0₃   0₄   4₅   2₆   1₇   0₈   7₉
0₃   0₄   6₅   3₆   4₇   0₈   1₉
0₃   0₄   5₅   7₆   6₇   0₈   4₉
0₃   0₄   2₅   1₆   5₇   0₈   6₉

7.10 Discussion

This chapter has covered some of the most important aspects of Reed-Solomon coding. Hopefully you will have seen how the maths works and one or two ways that constraints can be built into coded messages so that errors can be detected and corrected. Any old coding scheme, for example random code selection and deletion, such as used to form the Gilbert Bound, can be made to work, but efficient and fast coding is very important. Using a code which does nothing more than satisfy $d_{min}$ to protect, say, a four-byte message (about two bytes of information) from 3-bit errors would, on average, require about two and a half thousand trials to correct. Naturally, as messages increase in size this gets worse (at a little over the square of the message size). Compare this with the elegance and scalability of Reed-Solomon coding, especially when augmented by the BM and Forney algorithms. Various techniques, including time- and frequency-domain coding, have been considered along with their particular advantages, as have multi-dimensional data structures. Some of these ideas are food for thought and don't have general application but, for cost-sensitive micro-controller based solutions, they may just make coding practical where it would not, otherwise, have been.

Chapter 8

AUGMENTING ERROR CODES

I started the book suggesting that error coding is somewhat holistic. An error coding strategy must be considered in conjunction with its proposed environment if it is to be effective. Not only this, but there are some relatively simple steps that can be included in the coding which have a truly profound impact on the end performance that is achieved, often at relatively little cost. In this chapter some of these strategies will be examined, along with the kinds of environments where they are likely to be effective.

8.1 Erasure

Erasure is a technique which improves the efficiency of error coding. In fact, it has already been used in earlier examples to calculate the check symbols. Normally an error (over symbol-based codes) will comprise two unknowns, symbol position and error magnitude. In order to solve for an error, two quantities must be resolved which, in turn, means that two message constraints are required per error. This is reflected in all the earlier examples by having 2t check symbols. Occasionally, however, the error position is known. This occurred previously when a correction technique was used to evaluate the check symbols during encoding, because the position of the check symbols is known a priori.

A. Houghton, Error Coding for Engineers © Kluwer Academic Publishers 2001


Position may be known during decoding and later we'll see how codes can be combined into inner and outer forms where each code is tailored to its environment. This sometimes means that solving one of the codes will reveal the error locations prior to the solution of the second code. In the previous section on multi-dimensional coding structures, varying strengths of code were suggested in different dimensions, depending on the order of transmission. It may be that using a low-order vertical solution to determine which rows contain errors in a burst-error situation would facilitate erasure of an entire row to enhance subsequent decoding. A much simpler case occurs when the receiver recognises a drop in the carrier strength and declares that the corresponding signal is likely to be in error. This is called declaring an erasure. If position is known, it remains only to evaluate magnitude. Since there are 2t check equations, 2t erasures can be solved, doubling the correcting capability of the code. This was how four check symbols were calculated by error correction, using a double-symbol correcting code, in section 7.3. So a code with 2t check symbols can correct e erasures and r errors where $e + 2r \le 2t$.

Erasure decoding works in the following way. Where a symbol is known, or considered, to be in error, it is replaced by 0 (erased). Consider a single-symbol, time-domain correcting code over GF($2^3$). This would be arranged to satisfy

$$\sum_{i=0}^{6} d_i\,\alpha^{ij} = 0$$

for j = 0 and 1. Suppose, at the receiver, errors are known to be at i = 3 and i = 4. $d_3$ and $d_4$ are replaced by 0 and the syndromes are calculated. Since $d_3$ and $d_4$ have been erased, the error magnitudes are simply $d_3$ and $d_4$ so the syndromes will be

$$S_0 = d_3 + d_4, \qquad S_1 = \alpha^3 d_3 + \alpha^4 d_4,$$

eliminating $d_4$ gives

$$\alpha^4 S_0 + S_1 = (\alpha^4 + \alpha^3)\,d_3 = \alpha^6 d_3,$$

leaving

$$d_3 = \alpha\,(\alpha^4 S_0 + S_1), \qquad d_4 = S_0 + d_3.$$
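A minimal sketch of two-erasure recovery for this single-symbol correcting code follows. It assumes GF(2^3) is generated by x^3 + x + 1, reuses the codeword 6, 5, 4, 4, 3, 0, 0 from Section 7.8 (and a cyclic shift of it) as test data, and uses the invented helper name `fill_two_erasures`; none of this is prescribed by the text.

```python
# GF(2^3) antilog table, assuming the primitive polynomial x^3 + x + 1.
EXP = [1, 2, 4, 3, 6, 7, 5]
LOG = {v: i for i, v in enumerate(EXP)}

def mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[(LOG[a] + LOG[b]) % 7]

def inv(a):
    return EXP[(-LOG[a]) % 7]

def syndromes(d, count):
    """S_j = sum_i d_i * alpha^(i*j); d[i] is the symbol at position i."""
    out = []
    for j in range(count):
        s = 0
        for i, di in enumerate(d):
            s ^= mul(di, EXP[(i * j) % 7])
        out.append(s)
    return out

def fill_two_erasures(r, p, q):
    """Recover the symbols at known positions p and q from S0 and S1 alone."""
    r = r[:]
    r[p] = r[q] = 0                          # erase: whatever arrived is ignored
    s0, s1 = syndromes(r, 2)
    # alpha^q * S0 + S1 = (alpha^p + alpha^q) * d_p
    r[p] = mul(s1 ^ mul(EXP[q], s0), inv(EXP[p] ^ EXP[q]))
    r[q] = s0 ^ r[p]
    return r

codeword = [0, 0, 3, 4, 4, 5, 6]             # 6, 5, 4, 4, 3, 0, 0 in index order
shifted = [6, 0, 0, 3, 4, 4, 5]              # a cyclic shift is also a codeword
damaged = [6, 0, 0, 7, 1, 4, 5]              # positions 3 and 4 are unreliable
repaired = fill_two_erasures(damaged, 3, 4)
print(repaired)
```

With p = 3 and q = 4 the general expression in `fill_two_erasures` reduces to the closed form derived above, since $1/(\alpha^3 + \alpha^4) = \alpha$.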


In the multi-dimensional data structures of the previous section, each error carries one unknown per dimension. This makes the gains, when using erasure, that much more significant. For example, in the single correcting case over two dimensions, there are three redundant symbols. If the positions of any errors are known then three erasures can be corrected. In the double correcting example over two dimensions, eight erasures could be corrected, and so on.

8.2 Punctured Codes

It was demonstrated in the previous section that by using a process called erasure, a single-symbol correcting code can be used to correct two errors if their position within the message is known. In systems where channel noise is subject to large variations (typical in mobile communications systems), erasure decoding can be used to good effect in a number of ways. First, the receiver can monitor the received signal level or power and, in the event of fades, declare a symbol as erased. The symbol may have been received correctly and it may not. If the fading threshold is set appropriately then there is a good chance that the symbol would have been in error. By declaring the symbol erased, the receiver tells the decoder where the error is. The decoder does not have to locate the error, simply setting the symbol to zero. In this way potentially up to twice the number of errors can be corrected.

Punctured codes exploit this property to improve the throughput on such variable quality channels. If a code has to be arranged so as to cope with the worst possible channel conditions, then for much of the time far more bandwidth is being used for error control than is necessary. This scenario can be partly managed by an automatic repeat request protocol where the receiver requests retransmission of a packet or message in the event of too many errors. The redundancy can now be set for more typical channel conditions, making better use of the available bandwidth for data. In the event of a retransmit request, simply retransmitting the same message again, however, is likely to achieve a similar failed result, especially if the channel fade persists. If a message is constructed such that over half of it is redundancy and it is then divided into two equal parts, it is possible to reconstruct the original message from either of the halves. The missing half of the message forms known erasures. If around one half of the message is transmitted, the majority of its capacity is used to replace the missing half by erasure decoding. Any remaining capacity can be used to correct real channel errors.

In the event of transmission failure and a retransmit request, the other half of the message is transmitted. At this point, the second half of the message will be decoded independently of the first, again having minimal capacity to correct real errors. So far we are little better off than if the original message had been retransmitted and, if the channel fade persists, decoding failure results again. If the second half does not decode then the two halves of the message are brought together to create a complete message with substantial correcting capability, hopefully sufficient to overcome the fade. If failure results again, the first half of the message will be retransmitted and so forth.

For example, consider a 15-symbol message over GF($2^4$). More than half the message must be redundancy. If nine symbols are redundant, then each half of the message will be able to correct a single-symbol error without the need for transmission of the other half of the message. Transmitting eight out of the fifteen symbols means that there are seven errors which are corrected by erasure (the non-transmitted half of the message), leaving two unused redundancy symbols to correct any unknown errors. Let the six data symbols be 8, 3, A, F, 2, D so that the initial message (in the frequency domain) is

D, 2, F, A, 3, 8, 0, 0, 0, 0, 0, 0, 0, 0, 0

converting into the time domain (using $x^4 + x + 1 = 0$) gives

C, 9, 5, 8, 7, F, F, F, E, 1, 6, 6, 3, D, 1

of which F, E, 1, 6, 6, 3, D, 1 forms one of the punctured codes (or transmitted messages). At the receiver the message has the form

X, X, X, X, X, X, X, F, E, 1, 6, 6, 3, D, 1

in the event of no errors, where X represents a known error location. Replacing Xs with 0 (an erasure) and Fourier transforming gives

B, A, 8, 9, A, D, 2, 8, 3, C, 5, 9, A, A, F

where the last nine symbols (bold) are known puncture spectra. Using the erasure positions, the error locator polynomial can be constructed from

Given the general form


Solving for L gives

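$$L_1 = \alpha^{E} + \alpha^{D} + \alpha^{C} + \alpha^{B} + \alpha^{A} + \alpha^{9} + \alpha^{8}$$

$$L_2 = \alpha^{8}(\alpha^{9} + \alpha^{A} + \alpha^{B} + \alpha^{C} + \alpha^{D} + \alpha^{E}) + \alpha^{9}(\alpha^{A} + \alpha^{B} + \alpha^{C} + \alpha^{D} + \alpha^{E}) + \dots$$

$$L_3 = \alpha^{8+9}(\alpha^{A} + \alpha^{B} + \alpha^{C} + \alpha^{D} + \alpha^{E}) + \alpha^{8+A}(\alpha^{B} + \alpha^{C} + \alpha^{D} + \alpha^{E}) + \dots + \alpha^{8+D+E} + \alpha^{9+A}(\alpha^{B} + \alpha^{C} + \alpha^{D} + \alpha^{E}) + \dots + \alpha^{9+D+E} + \dots + \alpha^{C+D+E}$$

etc.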

While these equations might look a little tricky, for a given amount of puncturing they are fixed and need only be evaluated once, or sourced from tables. Completing gives

which, in the usual form (multiplying by $\alpha^{-7x}$) gives

with $L_1$ to $L_7$ being $\alpha^{D}$, $\alpha^{C}$, $\alpha^{3}$, $\alpha^{E}$, $\alpha^{0}$, $\alpha^{8}$ and $\alpha^{2}$. Using these values, a recursive extension circuit can be constructed to calculate the erasures and complete the untransmitted half of the message. Figure 8.1 shows the REC preloaded with the known frequency-domain errors and calculated taps $L_1$ to $L_7$.
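The whole puncture-filling procedure can be sketched in Python as below. It builds the locator from the seven untransmitted positions (8 to 14), recursively extends the nine known spectral values and inverse transforms the result to recover the missing half. The indexing convention (symbols listed from index 14 down to 0) and all function names are assumptions made for this illustration.

```python
# GF(2^4) antilog table for x^4 + x + 1 = 0 (the polynomial quoted in the text).
EXP = [1, 2, 4, 8, 3, 6, 12, 11, 5, 10, 7, 14, 15, 13, 9]
LOG = {v: i for i, v in enumerate(EXP)}
N = 15

def mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[(LOG[a] + LOG[b]) % N]

def transform(v, sign=1):
    """out[k] = sum_i v[i] * alpha^(sign*i*k); sign=-1 gives the inverse."""
    out = []
    for k in range(N):
        acc = 0
        for i in range(N):
            acc ^= mul(v[i], EXP[(sign * i * k) % N])
        out.append(acc)
    return out

def locator(positions):
    """Coefficients [1, L1, L2, ...] of the product of (1 + alpha^p * z)."""
    L = [1]
    for p in positions:
        shifted = [0] + [mul(c, EXP[p]) for c in L]
        L = [a ^ b for a, b in zip(L + [0], shifted)]
    return L

# D,2,F,A,3,8,0,...,0 is read as F[14]=D down to F[9]=8, with F[0..8] = 0.
F = [0] * 9 + [0x8, 0x3, 0xA, 0xF, 0x2, 0xD]
d = transform(F, -1)                         # full time-domain message d[0..14]
punctured = list(range(8, N))                # untransmitted positions 8..14
r = [0 if i in punctured else d[i] for i in range(N)]

R = transform(r)                             # R[0..8] is pure puncture spectrum
L = locator(punctured)                       # taps L[1]..L[7]
E = R[:9]
for k in range(9, N):                        # recursive extension of the spectrum
    nxt = 0
    for m in range(1, 8):
        nxt ^= mul(L[m], E[k - m])
    E.append(nxt)

e = transform(E, -1)                         # the untransmitted symbols
recovered = [x ^ y for x, y in zip(r, e)]
print([LOG[c] for c in L[1:]], recovered[::-1])
```

Because the taps depend only on which positions are punctured, `locator(punctured)` would normally be evaluated once and stored, as the text suggests.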

124

8.2 Punctured Codes

Figure 8.1 Recursive extensionfor a punctured code.

Clocking the circuit gives the sequence 6,8,7,3,9,5,2,8,3, C, 5, 9, A, A, F and applying the inverse Fourier transform to the recursively extended sequence gives the error vector which is, of course, the untransmitted half of the message (the punctures). C,9,5,8,7,F,F,0,0,0,0,0,0,0,0. Adding this to the original message by replacing the Xs, gives the complete message with the punctures removed. Normal decoding ensues. Suppose now, that an error does occur such that the received message is

X, X, X, X, X, X, X, F, E, 4, 6, 6, 3, D, 1. Converting to the frequency domain (setting X to 0) gives 3, 7, D, 1, 7, 8, A, 5, 6, 4, 8, C, 2, 7, A where the bold symbols represent known error/puncture spectra. Recursively extending this based on punctures only gives A, F, 9, 0, 4, 6, 8, F, 6, 4, 8, C, 2, 7, A. Of necessity, the first seven symbols are the same since the REC was constructed around these. However, we would have expected the next two symbols (underlined) also to be the same if punctures were the only errors. We now know that transmission errors must also be present. The error can be resolved in a number of ways. Using the BM algorithm, the error locator could be fully evaluated or we could multiply the current locator, which is based on the puncture positions, by (x + α^i) and try values of i (0 to 7) until the locator correctly completes the known spectra, underlined above. In either case, once the locator is known, Forney's algorithm can be used to find the errors and punctures. In the event that more than one error is detected, the other half of the message is transmitted and an attempt made at decoding. If this fails, the two halves are combined.

Punctured codes find a perfect niche in mobile communications and other situations where the channel conditions are variable. They not only increase the average coding rate, but also enhance the automatic repeat request process. Punctured codes also appear in the digital video broadcasting via satellite (DVB-S) standard. This standard specifies an inner convolutional (see later) code which has a typical coding rate of 1/2, giving out two bits for every one input. By discarding bits at regular intervals (puncturing the code), the coding rate can be increased to 2/3, 3/4, 5/6 or 7/8. While this puncturing reduces d_min, and so the effectiveness of the code, it allows the broadcaster to trade bandwidth per channel against program area coverage, given a transponder power and receiver aerial size.
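The effect of discarding bits on the coding rate can be illustrated with a short sketch. The keep-mask `KEEP_3_4` below is a hypothetical pattern chosen only to show the arithmetic; it is not claimed to be the pattern used by DVB-S, and the function names are likewise assumptions.

```python
from fractions import Fraction

# Hypothetical keep-mask over three input bits: 1 = transmit, 0 = discard.
# Each entry covers the two coded bits produced for one input bit.
KEEP_3_4 = [(1, 1), (0, 1), (1, 0)]


def puncture(pairs, pattern):
    """Drop coded bits from a rate-1/2 stream wherever the pattern holds a 0."""
    sent = []
    for i, pair in enumerate(pairs):
        mask = pattern[i % len(pattern)]
        sent.extend(bit for bit, keep in zip(pair, mask) if keep)
    return sent


def punctured_rate(pattern):
    """Input bits per transmitted bit over one period of the pattern."""
    kept = sum(sum(mask) for mask in pattern)
    return Fraction(len(pattern), kept)


coded = [(0, 0), (1, 1), (1, 0), (0, 0), (0, 0), (0, 1)]   # six coded bit pairs
print(puncture(coded, KEEP_3_4))       # eight bits survive from twelve
print(punctured_rate(KEEP_3_4))        # 3/4
print(punctured_rate([(1, 1)]))        # 1/2, the unpunctured code
```

At the receiver the discarded positions are known, so they can be treated as erasures in the same way as the Xs in the example above.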

8.3 Interleaving

Random errors are, at least statistically, easy to predict given certain channel conditions. As such, it is possible to construct coding schemes which will cope adequately most of the time. Burst errors, however, can be so gross that to construct a coding scheme capable of recovering data would be prohibitively expensive and inefficient (both computationally and in terms of bandwidth usage). For example, a small defect on the surface of a CD might compromise hundreds of successive bytes, yet may happen rarely. Puncturing the code is one way of dealing with gross errors, but it is not directly applicable to storage media because there is no repeat protocol possible. Instead, a process called interleaving can be used which works by dividing and conquering. There are several types of interleaving but all work on the same basic principle; differences are measured in efficiency, latency and memory. Interleavers essentially reorder data such that a single large error gets spread or divided up into lots of small errors.

The simplest form of interleaver is called the block interleaver. Data are formed into a block, row by row. When the block is complete, the data are read into the channel or storage medium column by column. To make the process effective, the error control codes are also reorganised. Suppose that each row of the data block is a self-contained group of error coded data capable of recovering, say, two errors. Losing any more than two symbols per row renders the data irreparably damaged but, by writing the data to the channel in columns, large consecutive losses will take only small chunks out of each encoded data group. This is illustrated in Table 8.1. The data are read into the block horizontally and, in this example, each row can correct one error. The data are written into the channel vertically and the shaded blocks represent eleven consecutively corrupted symbols. On reconstruction of the block at the receiver or on recovery from the storage medium, each row has at most one error and so can be corrected. In fact we could lose up to 16 consecutive symbols and still recover from the situation if no further symbols were lost. The amount of redundancy present is the same as if 16 errors were to be corrected in the whole block, regardless of position, but the major difference is that the processing needed to achieve this extends only to a single error-correcting code.

Table 8.1 Generating a block-interleaved message

Data                                                 Check
d0E d0D d0C d0B d0A d09 d08 d07 d06 d05 d04 d03 d02  p01 p00
d1E d1D d1C d1B d1A d19 d18 d17 d16 d15 d14 d13 d12  p11 p10
d2E d2D d2C d2B d2A d29 d28 d27 d26 d25 d24 d23 d22  p21 p20
d3E d3D d3C d3B d3A d39 d38 d37 d36 d35 d34 d33 d32  p31 p30
d4E d4D d4C d4B d4A d49 d48 d47 d46 d45 d44 d43 d42  p41 p40
d5E d5D d5C d5B d5A d59 d58 d57 d56 d55 d54 d53 d52  p51 p50
d6E d6D d6C d6B d6A d69 d68 d67 d66 d65 d64 d63 d62  p61 p60
d7E d7D d7C d7B d7A d79 d78 d77 d76 d75 d74 d73 d72  p71 p70
d8E d8D d8C d8B d8A d89 d88 d87 d86 d85 d84 d83 d82  p81 p80
d9E d9D d9C d9B d9A d99 d98 d97 d96 d95 d94 d93 d92  p91 p90
dAE dAD dAC dAB dAA dA9 dA8 dA7 dA6 dA5 dA4 dA3 dA2  pA1 pA0
dBE dBD dBC dBB dBA dB9 dB8 dB7 dB6 dB5 dB4 dB3 dB2  pB1 pB0
dCE dCD dCC dCB dCA dC9 dC8 dC7 dC6 dC5 dC4 dC3 dC2  pC1 pC0
dDE dDD dDC dDB dDA dD9 dD8 dD7 dD6 dD5 dD4 dD3 dD2  pD1 pD0
dEE dED dEC dEB dEA dE9 dE8 dE7 dE6 dE5 dE4 dE3 dE2  pE1 pE0
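The spreading effect of the block can be sketched in a few lines. This is a minimal illustration, assuming a block of 15 rows by 15 columns (rows 0 to E as listed in Table 8.1) that is filled row by row and sent column by column; the function names and the burst position are hypothetical.

```python
ROWS, COLS = 15, 15   # assumed block size: rows 0..E, 15 symbols per codeword


def interleave(block):
    """Read a row-filled block out column by column."""
    return [block[r][c] for c in range(COLS) for r in range(ROWS)]


def deinterleave(stream):
    """Rebuild the rows from a column-ordered stream."""
    block = [[None] * COLS for _ in range(ROWS)]
    for i, sym in enumerate(stream):
        block[i % ROWS][i // ROWS] = sym
    return block


def errors_per_row(burst_start, burst_len):
    """Corrupt a run of channel symbols and count the damage in each row."""
    block = [[(r, c) for c in range(COLS)] for r in range(ROWS)]
    stream = interleave(block)
    for i in range(burst_start, burst_start + burst_len):
        stream[i] = "?"                       # mark the symbol as corrupted
    return [row.count("?") for row in deinterleave(stream)]


# Eleven consecutive channel symbols lost, starting at an arbitrary position.
print(errors_per_row(20, 11))
```

Each row sees at most one corrupted symbol, so a single-error-correcting code per row is enough to repair the burst.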

It is possible to augment interleaving by adding a set of inner codes to the block. For example, the bottom two rows of encoded data could be replaced by Reed-Solomon check symbols. The codes initially embedded into the rows are referred to as outer codes. Now there is a vertical and horizontal correction capability. The inner codes take care of random errors which could, otherwise, compromise subsequent decoding of gross burst errors by the outer codes. If d85 (with the double border) was also corrupted, for example, then without the inner code this row would not be correctable. Interleaving is, therefore, about spreading or decorrelating gross errors so that they appear to the error codes as lots of small errors. Where inner and outer codes are present an important synergy arises from interleaving. Because a failure of the storage medium or a loss of carrier leads to a group of lost symbols, while the inner code will fail to decode, it may, nonetheless, locate the errors. Considering erasure, if the error position is known, more errors can be corrected. As a result of the intersection of the error codes over a two-dimensional structure, the two codes working together can yield a result which is more powerful than the sum of the two codes operating apart. A similar idea was seen earlier, where data were encoded over a two-dimensional Fourier transform. By redistributing the error coding to tune it to the channel characteristics, giving more power in one dimension, three errors were corrected by a code normally only able to correct two errors.

Block interleaving represents probably the simplest approach to error spreading. It suffers from a number of problems including latency and memory usage and, further, it constrains the coding to block codes. Latency arises because the rectangular structure of the interleaver requires it to be almost full before data can start to be output. This means that a channel with a block interleaver will only be able to operate for 50% of the time. Half the time will be spent outputting into the channel while the other half will be spent filling the buffer. While the latency of a block interleaver can't be reduced significantly, double buffering (using two interleavers) will allow continuous data flow, albeit at the expense of more memory.

To get around these problems, the cross-interleaver was devised. It achieves much better error spreading per unit of memory and permits simultaneous input and output. It is probably better known too, owing to its use within the compact disc system. The principle and aim of cross-interleaving is exactly the same as for block-interleaving, but it is a little more complex to implement. Figure 8.2 shows a simple arrangement.
Figure 8.2 Cross-interleaver. (Labels: symbols, demultiplex, symbol delays, multiplex, cross-interleaved symbols out.)


In this circuit, rather than fill the whole rows, the input multiplexer squirts symbols successively into each row, starting at the top and working downwards. Consider the contents of the symbol registers after several symbols have been input, starting with d0, as shown in Table 8.2.

Table 8.2 Output sequence for the cross-interleaver

d40 d35 d30 d25 d20 d15 d10 d5 d0
d41 d36 d31 d26 d21 d16 d11 d6 d1
d42 d37 d32 d27 d22 d17 d12 d7 d2
d43 d38 d33 d28 d23 d18 d13 d8 d3
d44 d39 d34 d29 d24 d19 d14 d9 d4
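The behaviour of the circuit can be reproduced with a small simulation. The sketch below assumes five rows with row r holding r symbol delays, as suggested by Table 8.2, and models the de-interleaver as the same structure with the delays reversed; the names `cross_interleave`, `cross_deinterleave` and the fill marker `x` are illustrative assumptions.

```python
from collections import deque

M = 5  # rows in Table 8.2; row r is assumed to hold r symbol delays


def cross_interleave(symbols, rows=M, fill="x"):
    """Route symbols round-robin through rows of increasing delay."""
    delays = [deque([fill] * r) for r in range(rows)]
    out = []
    for i, sym in enumerate(symbols):
        line = delays[i % rows]
        line.append(sym)                 # newest symbol enters the row
        out.append(line.popleft())       # oldest symbol leaves it
    return out


def cross_deinterleave(symbols, rows=M, fill="x"):
    """Same structure with the delays reversed, longest delay on row 0."""
    delays = [deque([fill] * (rows - 1 - r)) for r in range(rows)]
    out = []
    for i, sym in enumerate(symbols):
        line = delays[i % rows]
        line.append(sym)
        out.append(line.popleft())
    return out


data = [f"d{i}" for i in range(60)]
sent = cross_interleave(data)
print(sent[30:35])      # symbols output as d30 arrives
print(sent[:10])        # start-up, where x marks an invalid symbol
restored = cross_deinterleave(sent)
print(restored[20:25])  # original order returns after a fixed delay
```

With these assumptions every symbol experiences the same total delay of rows × (rows - 1) symbol periods through the interleaver and de-interleaver together, which is why the restored stream is simply a delayed copy of the input.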

The bordered data indicate those currently within the symbol delays. As d30 arrives at the input, it is immediately output, followed by d26, d22, d18 and d14 etc. The de-interleaver is an identical circuit except that the direction of the multiplexer is reversed, injecting symbols into the bottom row and working upwards. The spreading efficiency of an interleaver can be measured in a number of ways but one, which combines hardware complexity with correcting capability and largest recoverable burst error size, is as follows. The largest correctable burst error for a t symbol correcting code is divided by the number of symbol delays required to implement the interleaver. For a square block-interleaver of n symbols per axis, a burst error must not be allowed to cover more than t columns if it is to be correctable. This gives the efficiency ratio nt/n^2 or t/n. For the cross-interleaver the calculation is a little more tricky. If the number of rows is defined as m, then adjacent symbols in a burst error will be typically spread apart by m - 1 symbols. In the sequence above, if d18 and d22 were corrupted, we have one error per four symbols. If we therefore construct a code which can correct t symbols in m - 1, then the largest tolerable burst error will be (m - 1)t symbols. The number of symbol delays required is m(m - 1)/2, giving an efficiency of about 2t/m. This is twice the performance of the block interleaver. Because the cross-interleaver starts off empty, unless some kind of preloading of the symbol registers takes place, the encoded data stream contains invalid symbols near the start and end. In the example above, if the data start at d0 then the output sequence will be d0, x, x, x, x, d5, d1, x, x, x, d10, d6, d2, x, x, d15, d11, d7, d3, x, d20, ... (x marking an invalid symbol)


and somewhat the converse happens at the end of a message as valid symbols within the interleaver are clocked out. With a block code, at the expense of a little extra complexity, the interleaver could be preloaded with symbols from the end of the block while, with continuous streams of data (convolutional codes), this is not possible. Even so, if interleaving is necessary in a continuous data stream, the cross-interleaver is still the preferred approach. This kind of interleaver is also called a convolutional interleaver owing to the continuous flow of data that is possible through it.

While the goal of all interleavers is much the same, it is already clear from the evolution of the block-interleaver to the cross-interleaver that the implementation of the interleaver is significant. The cross-interleaver requires less memory (cheaper/lower latency) and is also suitable for data streams from convolutional coders (discussed next). Another factor which is significant is synchronisation. For decoding, the data must be arranged in the de-interleaver in exactly the way that they were arranged in the interleaver. This requires either the overall framework of the message to be known (not necessarily possible with convolutional coding), or the addition of special synchronisation characters, or monitoring of and feedback from the error decoder. The last of these approaches is most elegant since it does not entail the need for extra overheads in the message and the attendant bit-stuffing needed to make data transparent to the synchronisation process. Simply, if the data are misaligned between the source and receiver, gross errors will appear at the output. By varying the alignment of data in the de-interleaver until successful decoding occurs, synchronisation is achieved. The problem with a block-interleaver is that with m rows and columns, there are m^2 possible ways of aligning the symbols. This reduces to about m for a cross-interleaver.
However, the helical-interleaver has been proposed as a way of easing this alignment problem and works by arranging data into a kind of helix as in Figure 8.3.

Figure 8.3 Helical-interleaver.


In this example, data are written vertically into the helix in encoded blocks of 5 symbols. The data are read out horizontally creating a sequence similar to a cross-interleaver. On reconstruction of the interleaved symbols, alignment is only required to the order of the size of each encoded block. There is no requirement that each block is in the same column that it started life in. The width of the interleaver determines the maximum burst error that it will tolerate. In terms of performance, this is almost identical to the cross-interleaver.

Another interleaver which has become important is called a random interleaver. Essentially, data are arranged into a matrix and read out in a pseudo-random fashion. There are many proposed algorithms for performing the random interleave but, surprisingly, the results appear to be remarkably similar. You might suspect that the random interleaver suffers from the problems associated with the block interleaver. However, the importance of this interleaver in modern turbo-coding, which will be discussed later, means that the costs associated with using it are more than offset by performance gains.

8.4 Error Forecasting

Where inner and outer codes exist, careful decoding of block-interleaved codes can enhance the correction capabilities of the error control. A similar and very successful technique called error forecasting gives comparable results, but also works on data which have only outer codes. In this instance, it is not possible to cross-correlate syndromes on the two-dimensional error codes. Figure 8.4 shows a block of data with single-symbol (outer) codes on the horizontal axis. Errors are marked by shading.

Figure 8.4 Error forecasting. (Rows 1 to 7 of a block with columns d6, d5, d4, d3, d2, p1 and p0.)


In Figure 8.4, two random errors and a 5-symbol burst error are seen. Interleaving has spread the burst over five rows such that it leaves only one error in each row. However, the random errors have increased this to two errors in two of the rows. Error forecasting works as follows. If a code block is found to be uncorrectable, the previous block is examined. If it, too, was uncorrectable, a decoder failure results. If not, and if errors occurred in the previous block, the decoder examines which symbols were in error. This process will start in row 2 where decoder failure results. Looking back to row 1, an error is seen in d2(1). Error forecasting assumes that an error may persist for a short while and, on this basis, symbol d2(2) is erased. This much extra knowledge allows a feasible hunt for more errors, hopefully identifying d5(2). The process repeats again at row 4.

8.5 Discussion

In this section, three techniques have been examined which enhance error coding. Puncturing codes allows better use of available bandwidth and finds application right across the board from highly variable, deep-fade channel conditions, to largely stable satellite communications. Interleaving provides a simple, yet very effective adjunct to coding for managing gross burst error conditions. Interleaving does not necessarily mean that a reduction in redundancy is achieved, but it does give a significant reduction in the complexity of encoding and decoding. Interleaving is not effective against excessive random errors, however, but can lead to greater correction capability than initially suggested by the power of the codes. For a given amount of system memory, cross-interleaving provides about twice the spreading performance of block-interleaving, but is a little more complex to implement. Where interleaving is used, higher level error forecasting algorithms can help to correct errors even though the capacity of the code may have been apparently exceeded. Adding inner codes to a block also assists error forecasting as well as dealing with isolated random errors which, otherwise, might compromise decoding. Later, under concatenated codes, we'll see that it is quite advantageous to use different coding strategies for the inner and outer codes, tailoring each to its particular environment.

Chapter 9 CONVOLUTIONAL CODING

A. Houghton, Error Coding for Engineers © Kluwer Academic Publishers 2001

There are two main complementary error control strategies and so far we have only looked at one of them, block codes. As their name suggests, with block codes the data are compiled into blocks prior to encoding and transmission and must be subsequently recompiled into blocks for decoding and error correction. The second strategy is called convolutional or trellis coding. Here data may be error encoded on a continuous basis without the need for compilation into blocks. Convolutional encoding circuitry is trivial, although this does not mean that coder design is without pitfalls. Decoding and subsequent error correction, however, is a little more complex and not quite so mechanistic as it is with block codes. There are several strategies which may be employed to perform decoding and each one is a balance of compromises such as complexity and decoding latency. One of the key features of this kind of error coding is its total compatibility with soft decisions and an ability to modulate raw data directly into a usable channel code. Since decoding is so well matched to soft decisions, this type of error control is often used in conjunction with Gaussian channels. Clearly, there is little or no information present in soft decisions in the presence of burst errors. Convolutional codes, therefore, are often used as the inner codes of a concatenated system, mopping up the random errors in the channel. When the capacity of the inner code is exceeded, the decoder produces a short error burst which is removed by the outer block code. This combination of codes now forms the basis of the deep space communications standard used, for example, on the Galileo and Giotto missions.

Figure 9.1 represents a typical convolutional encoder. The coding rate, or ratio of input bits to output bits, is 1/2 for this example since, for every bit input, two bits will be output. The circuit also has a constraint length of four. The constraint length is a measure of how many bits affect the current output, or for how many clocks a given bit will persist in the output. By either puncturing the code, or inputting more than a single bit at a time, coding rates of greater than 1/2 are possible, but first we'll examine the operation of this circuit.

Figure 9.1 A convolutional encoder. (Inputs: d_in and clock.)

The circuit has eight states owing to its three registers and the current state is determined by the previous three bits input. The circuit has a finite impulse response which means that any bit input has a finite effect upon the output. Some circuits are configured with feedback which means that a bit can affect the output stream for ever afterwards. Such a circuit has an infinite impulse response. This kind of coding can be visualized in a number of ways, one of which is by generating a trellis diagram which reflects its operation. For this circuit the trellis appears as shown in Figure 9.2.

Figure 9.2 Trellis diagram for a convolutional encoder. (States 0 to 7 are listed down the left-hand side; the input data shown along the top line are 0101101110000.)


The encoder has three registers and therefore can reflect eight states. These are labelled downwards in the trellis from 0 to 7. The number of register bits determines the depth of the trellis. On the horizontal axis are input/output bits or time. By means of the XOR gates the three register bits are converted into two output bits q0 and q1, sometimes defined in the polynomial form q0 = 1 + x^2 + x^3 and q1 = 1 + x + x^2. In a typical system, q0 and q1 might be multiplexed together to form a single bit-stream at twice the input bit rate or superimposed onto a modulation scheme. Each time a new bit is input to the three-bit register the oldest bit falls out the end and the state of the registers potentially changes. This is plotted on the trellis with narrow solid lines representing an input 1 and dotted lines representing an input 0. From any one of the eight states we can move through the trellis to one of two new valid states depending on the input. Starting from 0, the bold line represents the path carved through the trellis by the input data given on the top line. Below the input data is the output sequence with two bits for every one input, and this is what is transmitted.
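The encoder can be written out directly from the two polynomials. The sketch below is an illustration under stated assumptions: x1 is taken to be the most recently stored bit, each output pair is written q1 then q0, and the 14-bit message is the input sequence of Figure 9.2 with a final 1 inferred from the last transmitted pair.

```python
def encode(bits):
    """Rate-1/2 encoder with q1 = 1 + x + x^2 and q0 = 1 + x^2 + x^3."""
    x1 = x2 = x3 = 0                  # x1 holds the most recent input bit
    pairs = []
    for d in bits:
        q1 = d ^ x1 ^ x2
        q0 = d ^ x2 ^ x3
        pairs.append(f"{q1}{q0}")     # pairs are written q1 then q0
        x1, x2, x3 = d, x1, x2        # shift the register
    return pairs


message = [0, 1, 0, 1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 1]
print(",".join(encode(message)))
# 00,11,10,00,00,01,01,00,10,00,10,01,00,11
```

The printed pairs agree with the transmitted sequence used in the error correction example of the following sections.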

9.1 Error Correction with Convolutional Codes

The circuit has imposed a constraint on the way in which the transmitted sequence pairs can be generated and changed. At the receiver the decoder plots the progress of the data through the trellis, comparing each bit pair to what is expected based on the current position in the trellis. Where errors are found the decoder attempts a kind of 'best fit', matching the received bit-stream to the closest valid path through the trellis. This closest path is deemed to be the corrected data. This is where life gets a little tricky and there exist several strategies for performing this 'best fit'. Clearly one could compare the received sequence to every possible path through the trellis and evaluate a cost for each path, finally choosing the path with the minimum Hamming distance from that received. This rather defeats the object of convolutional coders, however, since we must wait until the entire message is received before any result can be output. Not only this, but the number of possible paths would quickly become unmanageable.

The Viterbi algorithm, named after its inventor, makes a simplification to the decoding process using the following idea. Considering the trellis in Figure 9.3, there are two routes (A and B) to the node that is circled, and these are shown in bold. Routes A and B will have a cost (Hamming distance) which is based upon the difference between them and the received sequence from the transmitter. Two routes thus meet with two associated costs. All routes emanating from the circled node will inherit the costs of A and B and will incur the same new costs from this point on. C is one such route and if the combined cost of route AC is smaller than the combined cost of route BC, then there is no point in continuing with B since it can never catch up with A. What this means is that we only need to keep a running track of 2^n routes where there are n registers.

Figure 9.3 Optimizing using the Viterbi algorithm.

9.2 Finding the Correct Path

From any state in the encoder, given d_in, it is possible to determine what the outputs, q0 and q1, will be. These are shown in Table 9.1, below.

Table 9.1 Generating q0 and q1

State  X3  X2  X1   d_in = 0   d_in = 1   q1   q0
                    q1  q0     q1  q0
0      0   0   0    0   0      1   1      d    d
1      0   0   1    1   0      0   1      !d   d
2      0   1   0    1   1      0   0      !d   !d
3      0   1   1    0   1      1   0      d    !d
4      1   0   0    0   1      1   0      d    !d
5      1   0   1    1   1      0   0      !d   !d
6      1   1   0    1   0      0   1      !d   d
7      1   1   1    0   0      1   1      d    d


The two right-hand columns show the outputs q1q0 based on the current state and input d_in. From these the output shown in Figure 9.2 is derived. Corrupting the data stream given previously from

00,11,10,00,00,01,01,00,10,00,10,01,00,11

to

00,11,10,00,10,01,01,10,10,00,10,01,00,11

error correction may be attempted by constructing a cost table in the following way. Give the table sufficient rows for all encoder states plus 1. The circuit of Figure 9.1 requires nine rows and Table 9.2 is constructed from it. Along the top, fill in the input bit-stream.

Table 9.2 Finding the correct path through the trellis

(Columns: received bit pairs 00, 11, 10, 00, 10, 01, 01, 10, 10, 00, 10, 01, 00, 11; rows: trellis states 0 to 7; each entry gives a cumulative path cost with the cost of the latest choice as a superscript.)

Starting from level 0 in the trellis, the incoming bit pair 00 is compared with what could be expected at this point. From trellis state 0 we could expect 00 (d = 0) or 11 (d = 1). If d was 0, the output would be 00 and we would stay at the same level in the trellis. If d was 1, the incoming bits would have been 11 and the trellis state would change to 1. The received symbol was 00 so the cost of the first choice, d = 0, (light grey) is nothing but the cost of the second choice, d = 1, (dark grey) is 2 bits. In the table, the cost of each choice is shown as a superscript while the cumulative cost of a path is in a normal font. As you fill in the table, imagine that the trellis diagram is underneath. There are now four paths to consider, emanating from the previous two, and the cost of each must be counted and entered in the table. In the next column there are eight paths while in the next column, paths meet back together. This is reflected in that this and subsequent columns contain two values. At this point the Viterbi algorithm operates, deciding which paths to discard, keeping those with the lowest cost. Where the two paths have the same cost it is not possible to know which (if any) is the correct one. The selected paths are shown bold in each column.

It is tempting to look at the table and try to pick out the corrected path as we go along, but this is not possible for the following reasons. To start with, where the first error occurs two paths have the same cost, and second, if both bits of a pair were inverted then the error would result in a cost of 0 in an incorrect path and a cost of 2 in the correct one, not showing up until later. In other words, it is not possible to be sure which is the correct track until some time after an error (or errors) has occurred. When a route displays a constant cost over a number of bits it is possible to start working backwards to find out where it came from. In this example one route exhibits a cost of two bit errors for the last four columns and, in the absence of more errors, this would have continued. Picking up this route and working backwards allows selection of the least-cost path even though some columns share the same minimum cost between two routes. This is the case in both columns where the bit errors occur, but it does not stop us finding the correct path.
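The table-filling procedure can be sketched as a short program. This is a minimal hard-decision illustration, assuming the encoder of Figure 9.1 with state number X3X2X1 and a known start in state 0; it keeps whole survivor paths per state rather than tracing back through a table, which is a simplification for clarity and not the layout used in Table 9.2.

```python
def step(state, d):
    """Return (next_state, (q1, q0)) for the encoder of Figure 9.1."""
    x1, x2, x3 = state & 1, (state >> 1) & 1, (state >> 2) & 1
    return ((state << 1) | d) & 7, (d ^ x1 ^ x2, d ^ x2 ^ x3)


def viterbi(pairs):
    """Hard-decision Viterbi decode, starting from trellis state 0."""
    inf = float("inf")
    cost = [0] + [inf] * 7            # cumulative cost of the survivor per state
    paths = [[] for _ in range(8)]    # input bits of the survivor per state
    for r1, r0 in pairs:
        new_cost = [inf] * 8
        new_paths = [None] * 8
        for s in range(8):
            if cost[s] == inf:
                continue              # state not yet reachable
            for d in (0, 1):
                nxt, (q1, q0) = step(s, d)
                c = cost[s] + (q1 != r1) + (q0 != r0)   # Hamming distance
                if c < new_cost[nxt]:                    # keep the cheaper route
                    new_cost[nxt] = c
                    new_paths[nxt] = paths[s] + [d]
        cost, paths = new_cost, new_paths
    best = min(range(8), key=lambda s: cost[s])
    return paths[best], cost[best]


received = "00,11,10,00,10,01,01,10,10,00,10,01,00,11"
pairs = [(int(p[0]), int(p[1])) for p in received.split(",")]
bits, errors = viterbi(pairs)
print("".join(map(str, bits)), errors)
```

For the corrupted stream above, the surviving path carries a total cost of two bits and its input bits match the original message, in line with the route of constant cost two described in the text.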

9.3 Decoding with Soft Decisions

The bit costs in the table above have been calculated in whole numbers of bits but it is a simple extension of this to include the probabilities generated by soft decision decoders. The reason for this is that the decoding of convolutional codes operates at a very physical level. The idea of the distance between codewords is measured and calculated in raw codeword bits. Using soft decisions simply means that distance can be measured in fractions of a bit. Almost certainly this will eliminate the chance of paths sharing the same costs and, in fact, does enhance the correction capability of the code. Theoretically, soft decision decoding can provide an improvement of some 3 dB in a communications system, but in practice the result is more like 2 dB. This discrepancy may stem in part from the fact that a linear quantization is usually used to generate the bit probabilities. The ideal quantizations should be slightly non-linear (0, 1, 2, 3, 4, 5, 6 and 8.67). The hard thresholded data of the previous example was

00,11,10,00,10,01,01,10,10,00,10,01,00,11.

If soft decisions are available, it might appear as

02, 45, 71, 21, 42, 07, 16, 42, 70 .....

and Table 9.3 shows how evaluation of the correct path is modified. In this case the received symbols will be compared with 0 and 7.

Table 9.3 Finding the correct path using soft decisions

(Columns: received soft symbol pairs 02, 45, 71, 21, 42, 07, 16, 42, 70; rows: trellis states 0 to 7; each entry gives a cumulative path cost with the cost of the latest choice as a superscript.)
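The soft-decision costs can be sketched by changing only the branch metric of a Viterbi decoder. The following is an illustration under stated assumptions: each received bit is a value from 0 to 7, an ideal 0 is level 0 and an ideal 1 is level 7, the cost is the sum of absolute differences, and the decoder starts in state 0. The function names are hypothetical.

```python
def step(state, d):
    """Return (next_state, (q1, q0)) for the encoder of Figure 9.1."""
    x1, x2, x3 = state & 1, (state >> 1) & 1, (state >> 2) & 1
    return ((state << 1) | d) & 7, (d ^ x1 ^ x2, d ^ x2 ^ x3)


def branch_cost(received, expected):
    """Distance between soft values (0..7) and the ideal levels 0 or 7."""
    return sum(abs(r - 7 * q) for r, q in zip(received, expected))


def soft_viterbi(symbols):
    """Viterbi decode with soft branch costs, starting from state 0."""
    inf = float("inf")
    cost = [0] + [inf] * 7
    paths = [[] for _ in range(8)]
    for sym in symbols:
        new_cost = [inf] * 8
        new_paths = [None] * 8
        for s in range(8):
            if cost[s] == inf:
                continue
            for d in (0, 1):
                nxt, expected = step(s, d)
                c = cost[s] + branch_cost(sym, expected)
                if c < new_cost[nxt]:
                    new_cost[nxt] = c
                    new_paths[nxt] = paths[s] + [d]
        cost, paths = new_cost, new_paths
    best = min(range(8), key=lambda s: cost[s])
    return paths[best], cost[best]


soft = "02,45,71,21,42,07,16,42,70"
symbols = [(int(p[0]), int(p[1])) for p in soft.split(",")]
print(branch_cost((0, 2), (1, 1)))    # cost of reading 02 as a transmitted 11
print(soft_viterbi(symbols))
```

The two values of 4 that were thresholded into bit errors in the hard-decision example now contribute only modest costs, so the correct path separates from its competitors without any tied columns.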

The shaded entry indicates the path created by treating the first bit input to the encoder as a 1. The two bits received were 02 but, for a 1, they should have been 77. The cost of this choice was (7-0)+(7-2) = 12 and the trellis state changes from 0 to 1. The second bit of the message might be 0 or 1, giving two paths emanating from the shaded cell. If the encoder is in state 1 then its output for an input d will be !d, d. So for d = 0, this is 10 and for d = 1, this is 01. At the receiver, we would expect, therefore, 70 or 07. The received symbol was 45, so the cost for d = 0 is (7-4)+(5-0) = 8 and the encoder moves to state 2 while for d = 1 the cost is (4-0)+(7-5) = 6 and the encoder moves to state 3. And so the table is constructed. With soft decisions there are far fewer paths of equal cost and the gap between the correct path and others widens more quickly.

Because convolutional codes can originate from continuous streams of data, a decoder might have to lock onto the code with no knowledge of the current state of the encoder or (depending on the channel code) no knowledge of the bit synchronisation. Table 9.4 shows two examples of decoding: on the left side bit-pairs are correctly aligned while, on the right, they are misaligned. In this case, all path costs are reset to zero and the table is built up as before. Notice that in the first column, four paths have a cost of zero, while in the second this becomes two and, in the third column, only one path remains with zero cost. In the absence of errors, the number of symbols required to establish synchronisation will be of the order of the constraint length of the decoder.

Table 9.4 Synchronising to the convolutional code

(Rows: trellis states 0 to 7; the left half covers correctly aligned bit pairs and the right half the same stream shifted left by one bit; each entry gives a cumulative path cost with the cost of the latest choice as a superscript.)

The case where bit synchronisation is not known is shown on the right of the table. All symbols have been shifted left by one bit. Inevitably, the paths start with zeros but gradually the minimum path cost increases so that no path remains with zero cost. Typically the data modulation scheme will be arranged such that the bit pairings are implicit but, if this is not the case, provision may be required for checking both situations, continuing with the least cost solution after the first few symbols have arrived.
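The soft-decision branch costs quoted earlier in this section can be reproduced with a few lines of Python. This is only a sketch: the function name and the assumption that a transmitted 0 arrives ideally as level 0 and a transmitted 1 as level 7 are taken from the worked figures, not from any stated implementation.

```python
def soft_branch_cost(received, expected_bits, full_scale=7):
    """Distance between the received soft levels and the ideal levels
    (0 or full_scale) for the bit pair the encoder would have sent."""
    return sum(abs(r - full_scale * bit) for r, bit in zip(received, expected_bits))

# First symbol: levels 0 and 2 received, encoder output for a 1 would be 11.
first = soft_branch_cost((0, 2), (1, 1))
# Second symbol from state 1: levels 1 and 1 received; output is 10 (d = 0) or 01 (d = 1).
second_d0 = soft_branch_cost((1, 1), (1, 0))
second_d1 = soft_branch_cost((1, 1), (0, 1))
print(first, second_d0, second_d1)   # 12 7 7
```

A Viterbi decoder would add each branch cost to the running cost of the path it extends and keep the cheaper path into each state.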

9.4 Performance of Convolutional Codes

The idea of measuring the Hamming distance between codewords is a little tricky since codewords may have arbitrary length. However, it is possible to truncate messages at, say, the constraint length of the encoder and measure the distance between all possible messages. The actual measure that is used should be a function of the decoder strategy since some measures will be more appropriate than others. If the decoder has only a small view of the incoming message, then its performance will be largely influenced by $d_{min}$ over a short term. A more suitable measure for decoders that have access to an entire message will be $d_{free}$, which is the minimum distance calculated for the entire message.

Convolutional coders can suffer from so-called catastrophic codes where a small number of errors in the message can cause unlimited errors after decoding. An example of where this might occur is for an encoder in which, after an initial difference, two different but repetitive input sequences give rise to a similar output sequence. Errors which confuse the initial difference between the two sequences may cause the decoder to switch from the correct sequence to the incorrect one. This will typically happen if the encoder produces an all-zero output for a non-zero input sequence. An output based on an even number of input bits will produce zero for an all-one input. Only the initial impulse, as the 1s propagate through the encoder, will distinguish the message from the all-zero message. If the initial impulse is misinterpreted through errors, then all subsequent outputs from the decoder may be in error.

Because of their compatibility with soft decisions, convolutional codes are most closely associated with random errors. However, they do have a burst error correcting capability provided the burst is preceded by an error-free period. In the limit, the ratio between this guard space (g) and the burst error size (b) is given by the Gallager bound

$$\frac{g}{b} \ge \frac{1+R}{1-R}$$

where R is the coding rate. This result will be influenced by the constraint length of the encoder.
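The catastrophic behaviour described in this section can be seen with a small experiment. The sketch below assumes a feed-forward encoder whose two outputs are each formed from an even number of input bits; the generators $1 + D$ and $1 + D^2$ are hypothetical choices made for illustration and do not come from the text.

```python
def conv_encode(bits, generators):
    """Feed-forward convolutional encoder. Each generator is a tuple of taps
    applied to (current bit, previous bit, the bit before that, ...)."""
    memory = max(len(g) for g in generators) - 1
    window = [0] * (memory + 1)
    out = []
    for b in bits:
        window = [b] + window[:-1]
        out.append(tuple(sum(t & w for t, w in zip(g, window)) % 2
                         for g in generators))
    return out

EVEN_TAPS = ((1, 1, 0), (1, 0, 1))    # assumed generators: 1 + D and 1 + D^2

all_ones = conv_encode([1] * 8, EVEN_TAPS)
all_zeros = conv_encode([0] * 8, EVEN_TAPS)
print(all_ones)   # [(1, 1), (0, 1), (0, 0), (0, 0), (0, 0), (0, 0), (0, 0), (0, 0)]
```

Only the first two output symbols separate the all-one message from the all-zero message, however long the messages are, so a few channel errors at the start can leave the decoder on the wrong sequence indefinitely.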

9.5 Concatenated Codes

Where two or more codes are combined together, they are referred to as a concatenated code. Typically two codes, an outer and an inner, will be used. Deep space communications channels use a concatenated code which has a convolutional inner code with a coding rate of 1/2, and a Reed-Solomon outer code. These two codes make an ideal combination since the convolutional inner code interfaces naturally to the physical world by allowing the effective use of soft decision decoding. The Reed-Solomon outer code corrects burst errors created when the inner code fails. Because Reed-Solomon codes are symbol-based, burst errors of many bits become constrained to only a few symbols, making the RS code equally effective. For deep space communications an outer (255,223) RS code is specified.


Concatenated codes can be further enhanced by placing an interleaver between the inner and outer codes. This helps to spread the errors across the outer code as discussed earlier. The inner convolutional code can be further modified to provide a soft output which is effectively a probability that each output symbol is correct. The Soft Output Viterbi Algorithm (SOVA) is one example of a convolutional decoding strategy with soft outputs. While RS codes require hard decision inputs, the additional probability information can guide the erasure process in the event of an error overload in the outer code. If decoding fails, the symbol with the lowest probability of being correct will be declared an erasure and decoding is re-attempted. If decoding fails again, the process is repeated, choosing the next most likely error candidate.
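The retry loop described above can be sketched as follows. Everything here is illustrative: `try_decode` stands for whatever outer decoder is in use, and the toy decoder, the symbol values and the reliabilities are invented so that the example is self-contained. A real Reed-Solomon decoder would take its place.

```python
def decode_with_erasures(symbols, reliabilities, try_decode, max_erasures):
    """Retry the outer decoder, declaring the least reliable symbols as
    erasures one at a time until decoding succeeds or the budget runs out.

    try_decode(symbols, erasures) is a caller-supplied (hypothetical) function
    returning the decoded message, or None when decoding fails.
    """
    # Symbol positions ordered from least to most reliable.
    candidates = sorted(range(len(symbols)), key=lambda i: reliabilities[i])
    erasures = []
    result = try_decode(symbols, erasures)
    while result is None and len(erasures) < max_erasures:
        erasures.append(candidates[len(erasures)])
        result = try_decode(symbols, erasures)
    return result, erasures

# Toy stand-in for the outer decoder: it "succeeds" only when at most one
# wrong symbol is left unerased.
SENT = [5, 1, 7, 2, 0, 3]

def toy_decoder(symbols, erasures):
    unerased_errors = sum(1 for i, s in enumerate(symbols)
                          if s != SENT[i] and i not in erasures)
    return list(SENT) if unerased_errors <= 1 else None

received = [5, 6, 7, 4, 0, 1]             # positions 1, 3 and 5 are wrong
soft = [0.9, 0.2, 0.8, 0.3, 0.95, 0.6]    # assumed per-symbol reliabilities
message, erased = decode_with_erasures(received, soft, toy_decoder, 4)
print(message, erased)                    # [5, 1, 7, 2, 0, 3] [1, 3]
```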

9.6 Iterative Decoding

Typically the outer code, rather than being one giant codeword, will comprise a number of individual codewords. These can have varying strengths and may be strong or weak codes depending on the distribution of the available redundancy amongst them. This principle was seen over a two-dimensional RS code where the horizontal coding was made stronger than the vertical coding since, in that example, bursts were more likely to occur horizontally. The presence of strong and weak codes and the distribution of burst errors amongst them means that some of the outer codes are likely to decode correctly even if not all do. If the outer code is interleaved, these correct (or corrected) codewords will map to disparate parts of the inner code in accordance with the interleaving strategy. Through the interleaving process, some of these known good symbols will end up interspersed amongst parts of the inner code where decoding was not successful. If sufficient good bits occur together, the state of the inner Viterbi decoder can be pinned in various places on the basis of the known good codes. If this happens, a second pass of the inner decoder is likely to achieve greater success since some of the ambiguities are removed.

The performance gains achieved by using iterative decoding are a rather complex function of the fundamental channel characteristic $E_b/N_0$ and the balance of strong and weak codes used. Roughly, however, a coding gain of between 0.2 and 0.6 dB extra can be expected by using iterative techniques (state pinning) in conjunction with error forecasting, for a relatively small extra processing cost.


9.7 Turbo Decoding

Possibly the "holy grail" of error coding is the attainment of the Shannon limit. Shannon showed that, for any given channel, there is a limit on the amount of data that can be passed through it - a function of its bandwidth and signal to noise ratio. Shannon went on to say that it is possible to trade some of this capacity for error control codes which can provide arbitrarily high data integrity at the receiver output. So there exists a balance between the capacity used for error control codes and the integrity of the remaining data. From this arises a theoretical limit on the integrity that can be attained with respect to the added error control - the maximum coding gain. Much of the error coding effort of the last half of the twentieth century was spent finding ever more complex ways to eke out another 0.1 dB of coding gain here and there towards this limit. However, in 1993 Turbo Codes were first presented and these created a quantum leap in error coding performance, bringing it to within a fraction of a dB of the Shannon limit.

Unfairly it has been said that there are "lies, damned lies and statistics". In fact, statistics often provide the only means of making the best possible decisions in life. It is their poor interpretation or deliberate misinterpretation that leads to such a poor reputation. One of the keys to turbo coding, or decoding to be more exact, is the incorporation of channel statistics into the decoding process. Soft decision decoding has already been considered, albeit briefly, and this simple concept is taken much further in turbo decoding. Concatenated codes considered so far have been based on the idea of serial concatenation. An example of serial concatenation is shown in Figure 9.4.


Figure 9.4 Serial Concatenated Encoder.

A message m undergoes encoding by an outer code, probably a mixture of strong and weak codes. This is followed by an interleaver and a second (inner) coding process to produce a channel code c. There may, of course, be more than two codes concatenated together. Turbo codes, however, use parallel concatenation. Figure 9.5 illustrates how parallel concatenation might be performed. In this example, the coding rate is 1/3 as three streams are combined together using a multiplexer.


Figure 9.5 Parallel Concatenated Encoder.

The interleaver is usually based on a random shuffle while the encoders C1 and C2 will both be convolutional. Clearly both encoders feed the physical channel and must therefore accommodate soft decision decoding strategies. Because of the interleaving process, decoding of a particular bit or symbol will take place at a different point in time. Additive channel noise will be different in each case, allowing several estimates to be made for each bit. Decoding is performed iteratively, feeding the statistics of each step in the decoding process from one decoder to the other. Careful control of the feedback between decoders allows each decoder to converge to the final solution. Where necessary, higher coding rates can be achieved by puncturing the codes prior to transmission.
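A minimal sketch of the parallel concatenated encoder may help to fix the idea. The constituent encoder used here (a recursive systematic code with feedback $1 + D + D^2$ and feed-forward $1 + D^2$), the puncturing pattern and the fixed random seed are all assumptions made for illustration; the text does not specify them, and trellis termination is ignored.

```python
import random

def rsc_parity(bits):
    """Parity stream of a small recursive systematic convolutional encoder.
    Assumed taps: feedback 1 + D + D^2, feed-forward 1 + D^2."""
    s1 = s2 = 0
    parity = []
    for b in bits:
        fb = b ^ s1 ^ s2           # recursive (feedback) bit
        parity.append(fb ^ s2)     # feed-forward taps 1 and D^2
        s1, s2 = fb, s1
    return parity

def turbo_encode(message, permutation):
    """Multiplex the message with two parity streams: overall rate 1/3."""
    p1 = rsc_parity(message)
    p2 = rsc_parity([message[i] for i in permutation])
    out = []
    for m, a, b in zip(message, p1, p2):
        out.extend((m, a, b))
    return out

def puncture(code):
    """Send the parity bits alternately, raising the rate from 1/3 to 1/2."""
    out = []
    for k in range(0, len(code), 3):
        m, a, b = code[k:k + 3]
        out.extend((m, a if (k // 3) % 2 == 0 else b))
    return out

rng = random.Random(1)                  # fixed seed so the shuffle is repeatable
message = [1, 0, 1, 1, 0, 0, 1, 0]
permutation = list(range(len(message)))
rng.shuffle(permutation)                # random-shuffle interleaver
code = turbo_encode(message, permutation)
punctured = puncture(code)
print(len(message), len(code), len(punctured))   # 8 24 16
```

The message bits appear unchanged in every third position of `code`, which is what allows each decoder to combine its own parity stream with the shared systematic stream.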

9.8 Discussion

Convolutional codes provide an ideal interface to the "real world", allowing the addition of soft decisions into the decoding equation. This provides between 2 and 3 dB of coding gain, which is very important in communications channels. By concatenating convolutional and Reed-Solomon block codes via an interleaver and following this with iterative decoding, a true synergy is produced between the codes through state pinning. This combination of inner convolutional codes and outer Reed-Solomon codes formed the basis of the communications link used by the Voyager expeditions to Uranus and Neptune. Deep-space communication suffers mainly from random errors, from which the convolutional code will recover well, but by augmenting the codes with an outer Reed-Solomon code, huge coding gains are achieved. The Galileo mission to Jupiter almost failed due to the refusal of a high gain antenna to open. This meant all communication had to take place using a low gain antenna, resulting in a drastic reduction in the possible data rate. This event prompted a large effort to increase the coding gains of the codes used.


The 1990s saw the extension of convolutional codes into parallel concatenated arrangements known as turbo codes, which are the closest yet to performing at the Shannon limit. Turbo decoding requires a working knowledge of the channel characteristics and, where these are unknown or variable, the complexity of decoding can increase somewhat. Even so, this represents one of the most exciting possibilities of modern error control coding.

Chapter 10 HARDWARE

Anyone involved in the design and development of low-level systems will understand that the boundary between software and hardware implementation is fuzzy. Whether a component is realised as a software routine or a hardware circuit is a function of speed, cost and available resources. In this chapter, circuits for performing various operations over finite fields are considered. Even in hardware, there are levels of implementation. A circuit might be generic, capable of supporting arbitrary field size or primitive polynomial, or it might be specific to a particular field. Any solution will be a function of available resources, speed of execution and the application.

10.1 Reducing Elements

Often an operation on a field element, or elements, will result in a value which is outside of the field range. For example, if two elements $x + 1$ and $x^2 + x + 1$ over GF($2^3$) are multiplied modulo-2, the result is $x^3 + 1$. The term $x^3$ is outside of the field, whose highest degree is $x^2$. The primitive polynomial $x^3 + x + 1 = 0$ is used to fold $x^3$ back into the field: substituting $x^3 = x + 1$ gives $(x + 1) + 1$, leaving $x$. This operation can be accomplished in either a parallel or a serial way. The serial solution is none other than the circuit used to calculate a cyclic redundancy check, shown in Figure 10.1.

A. Houghton, Error Coding for Engineers © Kluwer Academic Publishers 2001


Figure 10.1 Restoring a vector to the Field.

The vector v is clocked into the circuit until the least significant bit has been entered. At this point, the three registers contain the reduced vector. Table 10.1 illustrates this operation using the previous example of $x^3 + 1$.

Table 10.1 Reducing $x^3 + 1$ using $x^3 + x + 1 = 0$

v_i    q2 = q1'    q1 = q2' ⊕ q0'    q0 = q2' ⊕ v_i
 -        0              0                 0
 1        0              0                 1
 0        0              1                 0
 0        1              0                 0
 1        0              1                 0

The primitive polynomial is input in parallel to the AND gates. In a dedicated solution, zeros in the polynomial allow removal of the associated AND and XOR gate. Figure 10.2 shows a parallel solution to this problem.

Figure 10.2 Parallel Element Reducer.
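The serial reduction of Table 10.1 can be modelled in software. This is a sketch rather than a description of the circuit itself: the function name, the argument order and the use of an integer to hold the three registers are all choices made here for illustration.

```python
def reduce_serial(v, width, poly=0b1011, n=3):
    """Clock `v` (width bits, most significant first) through an n-bit
    feedback register wired for the primitive polynomial `poly`.
    Returns the final register contents and the state after each clock."""
    mask = (1 << n) - 1
    q = 0
    states = []
    for i in reversed(range(width)):
        feedback = (q >> (n - 1)) & 1          # bit leaving the top register
        q = ((q << 1) | ((v >> i) & 1)) & mask
        if feedback:
            q ^= poly & mask                   # fold x^n back in as x + 1
        states.append(q)
    return q, states

result, states = reduce_serial(0b1001, 4)      # x^3 + 1
print(result, states)                          # 2 [1, 2, 4, 2]
```

The successive states 001, 010, 100, 010 are the register contents after each of the four clocks, and the final value 010 is the element $x$.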

Over very long vectors it's possible to combine serial and parallel components to yield a fast circuit that does not involve large amounts of logic. Clearly the choice of polynomial will have a significant impact on dedicated solutions like the one in Figure 10.2. By and large, the fewer the ones in the generator polynomial, the simpler the attendant circuits.

It is not always necessary to bring a vector back into the field after each operation. From a practical point of view $x$ is no more or less a valid description of an element than $x^3 + 1$. Considering an operation like a Fourier transform, results are compiled from a sum of products. The term $F_1$ of a Fourier transform, for example, has the form

$$F_1 = \sum_{j=0}^{6} d_j \alpha^j$$

over GF($2^3$). Each of the products $d_j \alpha^j$ potentially expands the field elements to nearly twice their size, with terms up to $x^4$. Since either the parallel or the serial implementations in Figures 10.1 and 10.2 will involve a time overhead, it is more appropriate to perform the summation over five bits and reduce the final result. Consider $F_1$ for the vector $d_{6-0}$ = 3, 3, 4, 0, 2, 1, 7 in Table 10.2.

Table 10.2 Calculation of $F_1$ over GF($2^3$)

 j    d_j    α^j    Intermediate product d_j·α^j
                    x^4 x^3 x^2 x^1 x^0
 0    111    001     0   0   1   1   1
 1    001    010     0   0   0   1   0
 2    010    100     0   1   0   0   0
 3    000    011     0   0   0   0   0
 4    100    110     1   1   0   0   0
 5    011    111     0   1   0   0   1
 6    011    101     0   1   1   1   1
 Σ                   1   0   0   1   1

The result, $x^4 + x + 1$, can be brought back into the field to give $x^2 + 1$. By summing the intermediate products over five bits and reducing only once, a faster overall performance is achieved.
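The deferred reduction can be checked numerically. The sketch below assumes the field GF($2^3$) generated by $x^3 + x + 1 = 0$ as used above; the helper names are hypothetical.

```python
ALPHA = [1, 2, 4, 3, 6, 7, 5]          # α^0..α^6 over GF(2^3), x^3 + x + 1 = 0

def clmul(a, b):
    """Carry-less (modulo-2) product of two bit vectors, not reduced."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result

def reduce_element(v, poly=0b1011, n=3):
    """Fold every bit at or above x^n back into the field."""
    for i in range(v.bit_length() - 1, n - 1, -1):
        if v & (1 << i):
            v ^= poly << (i - n)
    return v

d = [7, 1, 2, 0, 4, 3, 3]              # d_0..d_6, i.e. d_6..d_0 = 3, 3, 4, 0, 2, 1, 7

wide_sum = 0
for j in range(7):
    wide_sum ^= clmul(d[j], ALPHA[j])  # five-bit intermediate product
f1 = reduce_element(wide_sum)          # one reduction at the very end

f1_each = 0
for j in range(7):
    f1_each ^= reduce_element(clmul(d[j], ALPHA[j]))   # reduce every product

print(bin(wide_sum), bin(f1), bin(f1_each))   # 0b10011 0b101 0b101
```

Both routes give $x^2 + 1$ because reduction is linear over modulo-2 addition, which is what makes it safe to postpone.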

10.2 Multiplication

If the process of bringing elements back into the field (section 10.1) is considered separately from multiplication then, apart from element size, multiplication becomes a completely polynomial-independent operation. Figure 10.3 is an example of a multiplier for any 4-bit symbols, regardless of primitive polynomial. In the figure $Z = X \times Y$.

Figure 10.3 Parallel Multiplier Over GF($2^4$).

In Figure 10.3 horizontal and vertical lines are ANDed while diagonal lines are summed modulo-2 at each intersection. The output $z_2$, for example, is found from

$$z_2 = x_0 y_2 \oplus x_1 y_1 \oplus x_2 y_0$$

The primitive polynomial describing the field only becomes important when it is necessary to present the symbol as a field element, wrapping the higher order bits ($z_6$ to $z_4$) back into the lower order bits ($z_3$ to $z_0$). Over larger fields the parallel implementation of Figure 10.3 may become a little unwieldy, in which case a serial form might be desirable. Figure 10.4 illustrates a multiplier for GF($2^3$) using $x^3 + x + 1 = 0$.

Figure 10.4 Serial Multiplier Over GF($2^3$).
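The polynomial-independent parallel multiplier can be modelled as a grid of AND gates whose outputs are XORed along each diagonal. The following is a sketch of that gate structure for 4-bit symbols; the function name and the list-of-bits output are illustrative choices.

```python
def parallel_multiply(x, y, n=4):
    """Return the 2n-1 output bits of Z = X x Y as a list, z[0] first.
    No primitive polynomial is involved at this stage."""
    xb = [(x >> i) & 1 for i in range(n)]
    yb = [(y >> i) & 1 for i in range(n)]
    z = [0] * (2 * n - 1)
    for i in range(n):
        for j in range(n):
            z[i + j] ^= xb[i] & yb[j]   # AND at the crossing, XOR along the diagonal
    return z

# (x^3 + x + 1)(x^2 + x) = x^5 + x^4 + x^3 + x
z = parallel_multiply(0b1011, 0b0110)
print(z)   # [0, 1, 0, 1, 1, 1, 0]
```

Bits `z[4]` to `z[6]` are the ones that a later reduction step would wrap back into `z[0]` to `z[3]` for a particular field.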

A shift register is preloaded with X, outputting decreasing powers of x each time it is clocked. Y is input to the circuit in a parallel fashion, gated by $X_i$. The primitive polynomial P is input in parallel, similarly to Y. After three clocks, the result is contained in the three central registers. Like the parallel multiplier, this can be arranged to produce a 5-bit output by adding two further registers after $q_2$ and omitting the polynomial feedback path. This reduces further the complexity of the circuit, but necessitates a later processing step to bring $x^4$ and $x^3$ back into the field. This simplified multiplier is shown in Figure 10.5.

Figure 10.5 Simplified Serial Multiplier Over GF($2^3$).
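Both serial arrangements can be sketched in software: one with the polynomial feedback path, giving a reduced 3-bit result, and one without it, giving the 5-bit result that must be reduced later. The function names are hypothetical, and the register is modelled as a plain integer.

```python
def serial_multiply(x, y, poly=0b1011, n=3):
    """Multiply x by y over GF(2^n), taking the bits of x most significant first."""
    mask = (1 << n) - 1
    q = 0
    for i in reversed(range(n)):
        carry = (q >> (n - 1)) & 1
        q = (q << 1) & mask            # registers shift: multiply by x
        if carry:
            q ^= poly & mask           # polynomial feedback path
        if (x >> i) & 1:
            q ^= y                     # Y gated by the current bit of X
    return q

def serial_multiply_wide(x, y, n=3):
    """Same circuit without the feedback path: 2n-1 bit, unreduced result."""
    q = 0
    for i in reversed(range(n)):
        q <<= 1
        if (x >> i) & 1:
            q ^= y
    return q

print(serial_multiply(0b011, 0b111))        # 2, i.e. x
print(serial_multiply_wide(0b011, 0b111))   # 9, i.e. x^3 + 1
```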

So far, multipliers have been based on symbols expressed as bit patterns or magnitudes, rather than powers. This is the most convenient way of expressing elements since 0 cannot be expressed in terms of a power, $\alpha^i$. Nonetheless, it is useful to have a multiplier which can multiply a symbol magnitude by a symbol power. In other words, the two inputs to the circuit are a power, k, and a magnitude X, while the output is $X\alpha^k$. Serial implementation of such a device is trivial, but very slow over large fields. All that is required is a counter and a slight modification to the circuit of Figure 10.1, shown below in Figure 10.6.

Figure 10.6 Serial $\alpha^k$ multiplier.

If the circuit is preloaded with X, each time it is clocked the value in the registers is multiplied by $\alpha$. If a second register is preloaded with k and configured as a down counter, then the circuit of Figure 10.6 is clocked as k is decremented to 0. When k reaches 0, the multiplier registers will hold $\alpha^k$ times their starting value. For negative k, the circuit can be configured to multiply by $\alpha^{-1}$. Since $x^3 + x + 1 = 0$, $x^{-1} = 1 + x^2$. This can be realised by reversing the direction of the shift registers (swapping the ds and qs around) such that the feedback comes from $q_0$. $q_0$ now generates $x^{-1}$ when the circuit is clocked.

A much faster implementation is possible by breaking k down into its binary form. However, in this instance the multiplier becomes specific to a particular polynomial. Using GF($2^n$), n multiplier configurations are necessary, $\alpha^1, \alpha^2, \alpha^4, \ldots, \alpha^{2^{n-1}}$. Each one of these is associated with its corresponding bit in k. If the appropriate bit in k is set, the multiplication is performed. Using the tractable GF($2^3$), one possible circuit is shown in Figure 10.7.

Figure 10.7 Fast $\alpha^k$ Multiplier Circuit.
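The serial and fast $\alpha^k$ multipliers can be compared with a short model over GF($2^3$). This is a sketch under stated assumptions: the function names are invented, and the fixed multipliers $\alpha^1, \alpha^2, \alpha^4$ are computed on the fly by repeated squaring, whereas a circuit would have them preloaded.

```python
def times_alpha(v, poly=0b1011, n=3):
    """One clock of the serial circuit: multiply the register contents by α."""
    v <<= 1
    if v >> n:
        v ^= poly
    return v

def times_alpha_inverse(v, poly=0b1011):
    """One clock with the shift direction reversed: multiply by α^-1."""
    if v & 1:
        v ^= poly
    return v >> 1

def times_alpha_k_serial(x, k):
    """Clock the register k times (k >= 0), as the down counter would."""
    for _ in range(k):
        x = times_alpha(x)
    return x

def gf_multiply(a, b, poly=0b1011, n=3):
    """General magnitude multiplier used by the fast version."""
    result = 0
    for i in reversed(range(n)):
        result = times_alpha(result, poly, n)
        if (b >> i) & 1:
            result ^= a
    return result

def times_alpha_k_fast(x, k, n=3):
    """Use the binary form of k: multiply by α^(2^i) whenever bit i of k is set."""
    power = 0b010                          # α^1
    for i in range(n):
        if (k >> i) & 1:
            x = gf_multiply(x, power)
        power = gf_multiply(power, power)  # α^2, α^4, ... for the next bit
    return x

print(times_alpha_k_serial(0b011, 5), times_alpha_k_fast(0b011, 5))   # 2 2
print(times_alpha_inverse(0b001))                                      # 5, i.e. x^2 + 1
```

The serial version needs k clocks, while the fast version always finishes in n steps regardless of k.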

The magnitude X is loaded into an n-bit register whose output feeds a magnitude-based multiplier such as the one shown in Figure 10.3. The other input to the multiplier comes from an n by n bit shift register, preloaded with the powers $\alpha^{2^i}$. An n-bit (right) shift register holds k. When the circuit is clocked, if $k_i$ is 1, then the multiplier result is clocked back into the X register, replacing the original value. If not, the register remains unchanged. After n clocks, the X register will hold $X\alpha^k$. Because the multiplier is only ever multiplying by n fixed values over GF($2^n$), it can be optimised.

Often a particular solution requires a fixed multiplier where one of the inputs is constant. Obviously it's a trivial matter to convert the circuit of Figure 10.3 to a fixed multiplier. Using an example from GF($2^4$), consider a circuit to multiply an input symbol by $\alpha^B$ ($x^3 + x^2 + x$, where B is the exponent 11 written in hexadecimal) using $x^4 + x + 1 = 0$. If the input symbol is d, or $d_3x^3 + d_2x^2 + d_1x^1 + d_0x^0$, $d_i$ = 0 or 1, then multiplying this by $\alpha^B$ gives

$$d_3x^6 + d_3x^5 + d_3x^4 + d_2x^5 + d_2x^4 + d_2x^3 + d_1x^4 + d_1x^3 + d_1x^2 + d_0x^3 + d_0x^2 + d_0x$$

and gathering terms in $x^i$ gives

$$x^6 d_3 + x^5(d_2 \oplus d_3) + x^4(d_1 \oplus d_2 \oplus d_3) + x^3(d_0 \oplus d_1 \oplus d_2) + x^2(d_0 \oplus d_1) + x\,d_0$$

Using the primitive polynomial to bring the upper terms back into the field, $x^6$ terms join $x^3$ and $x^2$, $x^5$ terms join $x^2$ and $x$, while $x^4$ terms join $x$ and $x^0$, so

$$x^3(d_0 \oplus d_1 \oplus d_2 \oplus d_3) + x^2(d_0 \oplus d_1 \oplus d_3 \oplus d_2 \oplus d_3) + x(d_0 \oplus d_2 \oplus d_3 \oplus d_1 \oplus d_2 \oplus d_3) + x^0(d_1 \oplus d_2 \oplus d_3)$$

Minimising terms leaves

$$x^3(d_0 \oplus d_1 \oplus d_2 \oplus d_3) + x^2(d_0 \oplus d_1 \oplus d_2) + x(d_0 \oplus d_1) + x^0(d_1 \oplus d_2 \oplus d_3)$$

The circuit in Figure 10.8 follows on from this.

Figure 10.8 Multiplying by a Constant.

We can test this circuit using a couple of simple examples. Consider $\alpha^8 \times \alpha^B = \alpha^4$. The magnitude of $\alpha^8$ is 0101 over this field. Applying this to the d inputs (remembering that an XOR gate will output 1 for any odd combination of 1s at the input) gives 0011, the magnitude of $\alpha^4$. Now consider $\alpha^C \times \alpha^B = \alpha^8$. The magnitude of $\alpha^C$ is 1111 so the circuit output is 0101, the magnitude of $\alpha^8$.
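The two checks above extend naturally to all sixteen input symbols. The sketch below writes the constant multiplier as four XOR expressions, obtained by expanding $d \times (x^3 + x^2 + x)$ and folding $x^4$, $x^5$ and $x^6$ back with $x^4 = x + 1$, and compares it with a general shift-and-add multiplier. The function names are hypothetical.

```python
def times_alpha_11(d):
    """Multiply a GF(2^4) symbol by α^11 (x^3 + x^2 + x) using XOR gates only."""
    d0, d1, d2, d3 = ((d >> i) & 1 for i in range(4))
    z3 = d0 ^ d1 ^ d2 ^ d3
    z2 = d0 ^ d1 ^ d2
    z1 = d0 ^ d1
    z0 = d1 ^ d2 ^ d3
    return (z3 << 3) | (z2 << 2) | (z1 << 1) | z0

def gf16_multiply(a, b, poly=0b10011):
    """Reference shift-and-add multiplier over GF(2^4), x^4 + x + 1 = 0."""
    result = 0
    for i in reversed(range(4)):
        result <<= 1
        if result & 0b10000:
            result ^= poly
        if (b >> i) & 1:
            result ^= a
    return result

ALPHA_11 = 0b1110
mismatches = [d for d in range(16)
              if times_alpha_11(d) != gf16_multiply(d, ALPHA_11)]
print(mismatches)                               # []
print(format(times_alpha_11(0b0101), "04b"))    # 0011
print(format(times_alpha_11(0b1111), "04b"))    # 0101
```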

10.3 Division

Division is a little more tricky than multiplication. Certain types of divider are moderately tractable, for example, a serial divider with one of the inputs expressed as a power. Dividing by constants is also trivial; in this instance, the constant can be inverted and the problem becomes a multiplication like the circuit in Figure 10.8. A general purpose divider, with magnitudes for inputs, can be implemented in a parallel configuration but, as field size increases, this solution quickly becomes unmanageable. Normally, therefore, implementation will take the form of an iterative solution.

Over small fields a parallel divider can be realised using a basic sum-of-products circuit. Consider the field GF($2^3$). Table 10.3 lists all the possible (non-zero) solutions for divisions over GF($2^3$) with the divisor on the horizontal axis. For convenience input symbols are denoted ABC and DEF, while the output is PQR, so

$$PQR = \frac{ABC}{DEF}$$

Table 10.3 Constructing a Parallel Divider

                          Divisor DEF
Dividend       α^0    α^1    α^2    α^3    α^4    α^5    α^6
ABC            001    010    100    011    110    111    101

α^0  001       001    101    111    110    011    100    010
α^1  010       010    001    101    111    110    011    100
α^2  100       100    010    001    101    111    110    011
α^3  011       011    100    010    001    101    111    110
α^4  110       110    011    100    010    001    101    111
α^5  111       111    110    011    100    010    001    101
α^6  101       101    111    110    011    100    010    001

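We need to construct an equation for each of the output bits P, Q and R. Consider R first. In Table 10.3, R is the least significant bit of each result. R can be generated by forming a sum-of-products based on each 1 so, on a column by column basis,

$$\begin{aligned}
R ={}& \bar{D}\bar{E}F(\bar{A}\bar{B}C + \bar{A}BC + A\bar{B}C + ABC) + \bar{D}E\bar{F}(\bar{A}\bar{B}C + \bar{A}B\bar{C} + A\bar{B}C + AB\bar{C}) \\
&+ D\bar{E}\bar{F}(\bar{A}\bar{B}C + \bar{A}B\bar{C} + A\bar{B}\bar{C} + ABC) + \bar{D}EF(\bar{A}B\bar{C} + \bar{A}BC + A\bar{B}\bar{C} + A\bar{B}C) \\
&+ DE\bar{F}(\bar{A}\bar{B}C + \bar{A}BC + A\bar{B}\bar{C} + AB\bar{C}) + DEF(\bar{A}B\bar{C} + \bar{A}BC + AB\bar{C} + ABC) \\
&+ D\bar{E}F(A\bar{B}\bar{C} + A\bar{B}C + AB\bar{C} + ABC)
\end{aligned}$$

which simplifies to

$$\begin{aligned}
R ={}& \bar{D}\bar{E}F\,C + \bar{D}E\bar{F}(\bar{B}C + B\bar{C}) + D\bar{E}\bar{F}(\bar{A}\bar{B}C + \bar{A}B\bar{C} + A\bar{B}\bar{C} + ABC) \\
&+ \bar{D}EF(\bar{A}B + A\bar{B}) + DE\bar{F}(\bar{A}C + A\bar{C}) + DEF\,B + D\bar{E}F\,A
\end{aligned}$$

similarly

$$\begin{aligned}
Q ={}& \bar{D}\bar{E}F\,B + \bar{D}E\bar{F}\,A + D\bar{E}\bar{F}\,C + \bar{D}EF(\bar{C}B + C\bar{B}) \\
&+ DE\bar{F}(\bar{A}\bar{B}C + \bar{A}B\bar{C} + A\bar{B}\bar{C} + ABC) + DEF(\bar{A}B + A\bar{B}) + D\bar{E}F(\bar{A}C + A\bar{C})
\end{aligned}$$

and

$$\begin{aligned}
P ={}& \bar{D}\bar{E}F\,A + \bar{D}E\bar{F}\,C + D\bar{E}\bar{F}(\bar{B}C + B\bar{C}) + \bar{D}EF(\bar{A}\bar{B}C + \bar{A}B\bar{C} + A\bar{B}\bar{C} + ABC) \\
&+ DE\bar{F}(\bar{A}B + A\bar{B}) + DEF(\bar{A}C + A\bar{C}) + D\bar{E}F\,B
\end{aligned}$$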

These boil down to seven common terms in D, E and F and seven common terms in A, B and C, as follows in Figure 10.9.

Figure 10.9 Parallel Divider over GF($2^3$).
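The structure of the parallel divider, in which a divisor selects one of seven XOR combinations of A, B and C for each output bit, can be checked in software. The sketch below derives those combinations from the field arithmetic instead of copying them from the equations; the table layout and all names are assumptions made for illustration.

```python
ALPHA = [1, 2, 4, 3, 6, 7, 5]              # α^0..α^6 over GF(2^3)
LOG = {v: i for i, v in enumerate(ALPHA)}

def divide(dividend, divisor):
    """Reference result from exponents: α^i / α^j = α^(i - j)."""
    return ALPHA[(LOG[dividend] - LOG[divisor]) % 7]

def parity(bits, mask):
    """XOR together the bits of `bits` selected by `mask`."""
    return bin(bits & mask).count("1") & 1

# For each divisor DEF, the masks over ABC whose parities give P, Q and R.
# They are found by dividing the three basis symbols 100, 010 and 001.
TERMS = {}
for divisor in ALPHA:
    images = {bit: divide(bit, divisor) for bit in (4, 2, 1)}
    TERMS[divisor] = tuple(
        sum(bit for bit in (4, 2, 1) if images[bit] & out) for out in (4, 2, 1)
    )

def parallel_divide(abc, def_):
    """Select the XOR terms for this divisor and apply them to the dividend."""
    p_mask, q_mask, r_mask = TERMS[def_]
    return ((parity(abc, p_mask) << 2)
            | (parity(abc, q_mask) << 1)
            | parity(abc, r_mask))

print(TERMS[0b010])   # (1, 4, 3): P = C, Q = A, R = B xor C when dividing by α^1
print(format(parallel_divide(0b001, 0b010), "03b"))   # 101
```

The masks printed for the divisor 010 agree with the second terms of the simplified equations for P, Q and R.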

The circuit works by calculating all possible results and, on this scale, it could be constructed in a programmable logic array reasonably easily. Using typical logic synthesis and fitter tools for CPLD devices, this circuit requires 3 macrocells and 29 PLA terms. However, over GF($2^8$), which is used extensively for byte oriented systems, there are 65025 solutions and eight bits to derive from them, so the solution does not scale well.

For a scalable solution, we need to consider an iterative approach based on symbol traps. Symbol traps essentially look for particular bit patterns during processing. An easy trap to arrange is to look for when the weight of the symbol is one (i.e. only a single bit is set). Owing to the way that the fields are generated, this happens for the symbols $\alpha^0$ to $\alpha^{n-1}$ over GF($2^n$). Over GF($2^3$), for example, these would be $\alpha^0$ (001), $\alpha^1$ (010) and $\alpha^2$ (100). Taking any non-zero value and repeatedly dividing it by $\alpha^n$ guarantees that at some point the result will have a weight of one. If it takes a divides to reach this state, the dividend must have been in the range $\alpha^{a \times n + b}$ where $0 \le b < n$.

      = 0 Then
        LI(w^)[i-j] := LI(w^)[i-j] XOR AxB(LI(L^)[n-i], LI(S^)[j]);
  {now, using location of roots in R, find magnitudes}
  For i := 0 to n-1 do
  Begin
    k := LI(R^)[i];

    e := 0;
    For j := 0 to n do
      e := e XOR AxB(LI(w^)[j], ExpA(j*k));
    e := AxB(e, ExpA(-k));
    d := 0;
    For j := 0 to n-1 do
      d := d XOR AxB(ExpA(k*j), LI(di^)[n-j-1]);
    e := AoB(e, d);
    LI(M^)[i] := e;
  End;
  FreeMem (di, 2*(n+1));
  FreeMem (w, 2*(n+1));
End;

{Performs recursive extension}
Procedure RecExt (S, L : Pointer; Order, Vecs : Integer);
Var i, j, r : Integer;
Begin
  For i := 0 to Vecs-1 do
  Begin
    r := 0;
    For j := 0 to Order-1 do
      r := r XOR AxB(LI(L^)[Order-j], LI(S^)[j+i]);
    LI(S^)[i+Order] := r;
  End;
End;


{Expands polynomial expression}
Procedure ResolvePoly (L, t : Pointer; Order, Vecs : Integer);
Var i, j, r : Integer;
Begin
  For i := 0 to Vecs-1 do
  Begin
    r := LI(L^)[Order];
    For j := 0 to Order-1 do
      r := r XOR AxB(ExpA((Order-j)*i), LI(L^)[j]);
    LI(t^)[i] := r;
  End;
End;

{Calculates syndrome j}
Function GetSyndrome (d : Pointer; j : Integer) : Integer;
Var i, s : Integer;
Begin
  s := 0;
  For i := 0 to Msk-1 do
    s := s XOR AxB(LI(d^)[i], ExpA(i*j));
  GetSyndrome := s;
End;

{Generate a power series to give a polynomial with known roots}
Procedure GenPower (i, j, Roots, Sum : Integer; Var Res : Integer);
Var k : Integer;
Begin
  If i = 0 Then Res := 0;
  If Roots = j Then
    Res := Res XOR ExpA(Sum)
  Else
    For k := i to j do
      GenPower (k+1, j+1, Roots, Sum+k, Res);
End;

Procedure CreatePoly (L : Pointer; Roots : Integer);
Var i : Integer;
Begin
  For i := 1 to Roots do
    GenPower (0, Roots-i, Roots, 0, LI(L^)[i]);
  LI(L^)[0] := 1;
End;

Procedure GenPower1 (R : Pointer; i, j, Roots, Sum : Integer; Var Res : Integer);
Var k : Integer;
Begin
  If i = 0 Then Res := 0;
  If Roots = j Then
    Res := Res XOR ExpA(Sum)
  Else
    For k := i to j do
      GenPower1 (R, k+1, j+1, Roots, Sum+LI(R^)[k], Res);
End;

Procedure CreatePoly1 (L, R : Pointer; Roots : Integer);
Var i : Integer;
Begin
  For i := 1 to Roots do
    GenPower1 (R, 0, Roots-i, Roots, 0, LI(L^)[i]);
  LI(L^)[0] := 1;
End;

APPENDIX E. SOFTWARE LIBRARY

{Sundry display utilities.}

{Display elements in decimal format}
Procedure DisplayDec (D : BP; s, x, y : Byte);
Var i, j : Byte;
Begin
  TextColor (14);
  For i := 0 to 15 do
  Begin
    GotoXY (x, y+i);
    For j := 0 to 15 do
    Begin
      If (i AND j 15) AND (j+i SHL 4