English Pages 527 [560] Year 2020
Mathematical Theory and Applications of Error Correcting Codes
Edited by: Stefano Spezia
ARCLER Press
www.arclerpress.com
Mathematical Theory and Applications of Error Correcting Codes Stefano Spezia
Arcler Press 224 Shoreacres Road Burlington, ON L7L 2H2 Canada www.arclerpress.com Email: [email protected]
e-book Edition 2021
ISBN: 978-1-77407-969-0 (e-book)

This book contains information obtained from highly regarded resources. Reprinted material sources are indicated. Copyright for individual articles remains with the authors as indicated and published under Creative Commons License. A wide variety of references are listed. Reasonable efforts have been made to publish reliable data, and views articulated in the chapters are those of the individual contributors, and not necessarily those of the editors or publishers. Editors or publishers are not responsible for the accuracy of the information in the published chapters or consequences of their use. The publisher assumes no responsibility for any damage or grievance to the persons or property arising out of the use of any materials, instructions, methods or thoughts in the book. The editors and the publisher have attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright holders if permission has not been obtained. If any copyright holder has not been acknowledged, please write to us so we may rectify.

Notice: Registered trademarks of products or corporate names are used only for explanation and identification without intent of infringement.

© 2021 Arcler Press
ISBN: 978-1-77407-766-5 (Hardcover)

Arcler Press publishes a wide variety of books and eBooks. For more information about Arcler Press and its products, visit our website at www.arclerpress.com
DECLARATION

Some content or chapters in this book are open-access, copyright-free published research works, which are published under a Creative Commons License and are indicated with the citation. We are thankful to the publishers and authors of the content and chapters, as without them this book wouldn’t have been possible.
ABOUT THE EDITOR
Stefano Spezia has held a Ph.D. in Applied Physics from the University of Palermo since April 2012. His major research experience is in noise-induced effects in nonlinear systems, especially in the modeling of complex biological systems and the simulation of semiconductor spintronic devices. He is an associate member of the Italian Physical Society and the European Physical Society.
TABLE OF CONTENTS

List of Contributors
List of Abbreviations
Preface

Section 1 Introduction to Error-Correcting Codes

Chapter 1 Error-correcting Codes and Neural Networks
    Abstract
    Introduction and Summary
    Codes and Good Codes
    Neural Encodings of Stimulus Spaces
    Acknowledgements
    References

Section 2 Hard and Soft Decision Decoding

Chapter 2 Sum of the Magnitude for Hard Decision Decoding Algorithm Based on Loop Update Detection
    Abstract
    Introduction
    Basic Definitions
    Algorithm Description
    Complexity Analysis
    Simulation Results and Statistical Analysis
    Conclusions
    Acknowledgments
    Conflicts of Interest
    References

Chapter 3 Soft-Decision Low-Complexity Chase Decoders for the RS (255,239) Code
    Abstract
    Introduction
    RS Decoders
    Low-Complexity Chase Decoder
    Decoder Architecture
    Implementation Results
    Conclusions
    References

Chapter 4 Low-energy Error Correction of NAND Flash Memory through Soft-decision Decoding
    Abstract
    Introduction
    Energy Consumption of Multi-Bit Data Read in NAND Flash Memory
    Soft-Decision Error Correcting Performance in NAND Flash Memory
    Hardware Performance of (68254, 65536) LDPC Decoder
    Low-Energy Error Correction Scheme for NAND Flash Memory
    Concluding Remarks
    Acknowledgements
    References

Chapter 5 Performance of Soft Viterbi Decoder enhanced with Non-Transmittable Codewords for Storage Media
    Abstract
    Public Interest Statement
    Introduction
    Binary Convolutional Encoding and Decoding
    Enhanced Soft Viterbi Algorithm
    Developed Model
    Result and Discussion
    Conclusion
    Funding
    References
Section 3 Linear Codes: Cyclic and Constacyclic Codes

Chapter 6 The Structure of One Weight Linear and Cyclic Codes Over
    Abstract
    Introduction
    Preliminaries
    The Structure of One Weight -Linear Codes
    One Weight -Cyclic Codes
    Examples of One Weight -Cyclic Codes
    Conclusion
    Acknowledgement
    References

Chapter 7 (1 + u)-Constacyclic Codes over Z4 + uZ4
    Abstract
    Background
    (1 + u)-Constacyclic Codes over Z4 + uZ4
    Gray Images of (1 + u)-Constacyclic Codes over R
    Conclusion
    Acknowledgements
    Competing Interests
    References

Section 4 Introduction to Error-Correcting Codes

Chapter 8 Projection Decoding of Some Binary Optimal Linear Codes of Lengths 36 and 40
    Abstract
    Introduction
    Projection of Binary Linear Codes
    Construction of Binary Optimal [36, 19, 8] and [40, 22, 8] Codes
    Projection Decoding
    Examples
    Conclusions
    Funding
    Conflicts of Interest
    References

Chapter 9 Reed-Solomon Turbo Product Codes for Optical Communications: From Code Optimization to Decoder Design
    Abstract
    Introduction
    Reed-Solomon Product Codes
    Turbo Decoding of RS Product Codes
    RS Product Code Design for Optical Communications
    Full-Parallel Turbo Decoding Architecture Dedicated to Product Codes
    Complexity and Throughput Analysis of the Full-Parallel Reed-Solomon Turbo Decoders
    Implementation of an RS Turbo Decoder for Ultra High Throughput Communication
    Conclusion
    Acknowledgments
    References

Chapter 10 Enhancing BER Performance Limit of BCH and RS Codes Using Multipath Diversity
    Abstract
    Introduction
    Related Work
    Forward Error Correction
    Multipath Propagation
    Methodology
    Column Weight Multipath Combiner (CWMC)
    Results
    Conclusions
    Acknowledgements
    Conflicts of Interest
    Abbreviations
    References
Section 5 Quasi Cyclic Codes

Chapter 11 Quasi-Cyclic Codes via Unfolded Cyclic Codes and Their Reversibility
    Abstract
    Section I.
    Introduction
    Section II.
    Preliminaries
    Section III.
    Generator Polynomial Matrix of Φθ(C)
    Section IV.
    Reversibility of Unfolded Cyclic Codes
    Section V.
    Numerical Examples
    Section VI.
    Conclusion
    References

Chapter 12 Skew Cyclic and Quasi-Cyclic Codes of Arbitrary Length over Galois Rings
    Abstract
    Preliminaries
    Skew Cyclic Codes
    Skew Quasi-Cyclic Codes
    Examples
    References

Section 6 Low Density Parity Check Codes

Chapter 13 On the Use of Ordered Statistics Decoders for Low-Density Parity-Check Codes in Space Telecommand Links
    Abstract
    Introduction
    LDPC Codes for Space Telecommand Links
    Decoding Algorithms
    Error Rate Versus Complexity Tradeoff Evaluation
    Impact of Limited Memory
    Conclusions
    Endnote
    Appendix
    Acknowledgements
    Competing Interests
    References

Chapter 14 Optimization of LDPC Codes over the Underwater Acoustic Channel
    Abstract
    Introduction
    Underwater Acoustic Channel Model
    Iterative DFE-LDPC Structure and Its EXIT Charts
    Optimization of LDPC Codes
    Conclusions
    Conflict of Interests
    Acknowledgments
    References

Chapter 15 Augmented Decoders for LDPC Codes
    Abstract
    Introduction
    Brief Overview LDPC Codes
    Method
    Numerical Results
    Conclusions
    Abbreviations
    References

Chapter 16 Adaptive Rate-Compatible Non-Binary LDPC Coding Scheme for the B5G Mobile System
    Abstract
    Introduction
    The Construction of RC-NB-LDPC Codes
    Channel Clustering Based on the K-Means++ Algorithm
    Design and Performance Simulation of the Adaptive RC-NB-LDPC Coding Scheme
    Conclusions
    Conflicts of Interest
    References

Chapter 17 Improved Symbol Value Selection for Symbol Flipping-based Non-binary LDPC Decoding
    Abstract
    Introduction
    System Model
    Non-Binary LDPC Decoding
    Proposed Symbol Value Selection Algorithm
    Computational Complexity
    Simulation Results and Discussion
    Conclusions
    Acknowledgments
    References

Chapter 18 DNA Barcoding through Quaternary LDPC Codes
    Abstract
    Introduction
    Results
    Discussion
    Methods
    Acknowledgments
    Funding Statement
    References

Chapter 19 High Speed and Adaptable Error Correction for Megabit/s Rate Quantum Key Distribution
    Abstract
    Introduction
    Results
    Discussion
    Methods
    Acknowledgements
    References

Section 7 Introduction to Error-Correcting Codes

Chapter 20 Duality of Quantum and Classical Error Correction Codes: Design Principles and Examples
    Abstract
    Section I.
    Section II.
    Quantum Decoherence
    Section III.
    Quantum Coding Theory
    Section IV.
    Section V.
    Stabilizer Formalism
    Section VI.
    Quantum-to-Classical Isomorphism
    Section VII.
    Taxonomy of Stabilizer Codes
    Section VIII.
    Design Examples
    Section IX.
    Conclusion & Design Guidelines
    References

Chapter 21 Dynamic Concatenation of Quantum Error Correction in Integrated Quantum Computing Architecture
    Abstract
    Introduction
    Results
    Discussion
    Acknowledgements
    References

Index
LIST OF CONTRIBUTORS
Yuri I. Manin
Max–Planck–Institut für Mathematik, Bonn, Germany

Jiahui Meng
College of Information & Communication Engineering, Harbin Engineering University, Harbin 150001, China

Danfeng Zhao
College of Information & Communication Engineering, Harbin Engineering University, Harbin 150001, China

Hai Tian
College of Information & Communication Engineering, Harbin Engineering University, Harbin 150001, China

Liang Zhang
College of Information & Communication Engineering, Harbin Engineering University, Harbin 150001, China

Vicente Torres
Instituto de Telecomunicaciones y Aplicaciones Multimedia, Universitat Politècnica de València, 46022 Valencia, Spain

Javier Valls
Instituto de Telecomunicaciones y Aplicaciones Multimedia, Universitat Politècnica de València, 46022 Valencia, Spain

Maria Jose Canet
Instituto de Telecomunicaciones y Aplicaciones Multimedia, Universitat Politècnica de València, 46022 Valencia, Spain

Francisco García-Herrero
ARIES Research Center, Universidad Antonio de Nebrija, 28040 Madrid, Spain
Jonghong Kim
Department of Electrical Engineering and Computer Science, Seoul National University, Gwanak-gu, Seoul 151-744 Korea

Wonyong Sung
Department of Electrical Engineering and Computer Science, Seoul National University, Gwanak-gu, Seoul 151-744 Korea

Kilavo Hassan
School of Computational and Communication Science and Engineering, Nelson Mandela African Institution of Science and Technology, Arusha, Dodoma, Tanzania.

Kisangiri Michael
School of Computational and Communication Science and Engineering, Nelson Mandela African Institution of Science and Technology, Arusha, Dodoma, Tanzania.

Salehe I. Mrutu
College of Informatics and Virtual Education, The University of Dodoma, Dodoma, Tanzania.

Ismail Aydoğdu
Department of Mathematics, Faculty of Arts and Sciences, Yıldız Technical University, İstanbul, Turkey

Haifeng Yu
Department of Mathematics and Physics, Hefei University, Hefei, China.

Yu Wang
Department of Mathematics and Physics, Hefei University, Hefei, China.

Minjia Shi
School of Mathematical Sciences, Anhui University, Hefei, China.

Lucky Galvez
Department of Mathematics, Sogang University, Seoul 04107, Korea
Institute of Mathematics, University of the Philippines Diliman, Quezon City 1101, Philippines

Jon-Lark Kim
Department of Mathematics, Sogang University, Seoul 04107, Korea

Raphael Le Bidan
Institut TELECOM, TELECOM Bretagne, CNRS Lab-STICC, Technopole Brest-Iroise, CS 83818, 29238 Brest Cedex 3, France
Camille Leroux
Institut TELECOM, TELECOM Bretagne, CNRS Lab-STICC, Technopole Brest-Iroise, CS 83818, 29238 Brest Cedex 3, France

Christophe Jego
Institut TELECOM, TELECOM Bretagne, CNRS Lab-STICC, Technopole Brest-Iroise, CS 83818, 29238 Brest Cedex 3, France

Patrick Adde
Institut TELECOM, TELECOM Bretagne, CNRS Lab-STICC, Technopole Brest-Iroise, CS 83818, 29238 Brest Cedex 3, France

Ramesh Pyndiah
Institut TELECOM, TELECOM Bretagne, CNRS Lab-STICC, Technopole Brest-Iroise, CS 83818, 29238 Brest Cedex 3, France

Alyaa Al-Barrak
Department of Computing & Immersive Technologies, University of Northampton, Northampton NN2 6JD, UK
Department of Computer Science, College of Science, University of Baghdad, Baghdad 10071, Iraq

Ali Al-Sherbaz
Department of Computing & Immersive Technologies, University of Northampton, Northampton NN2 6JD, UK

Triantafyllos Kanakis
Department of Computing & Immersive Technologies, University of Northampton, Northampton NN2 6JD, UK

Robin Crockett
Department of Environmental and Geographical Sciences, University of Northampton, Northampton NN2 6JD, UK

Ramy Taki Eldin
Faculty of Engineering, Ain Shams University, Cairo 11517, Egypt
Toyota Technological Institute, Nagoya 468-8511, Japan

Hajime Matsui
Toyota Technological Institute, Nagoya 468-8511, Japan
Mingzhong Wu
Department of Mathematics, China West Normal University, Nanchong, Sichuan 637002, P.R. China

Marco Baldi
Dipartimento di Ingegneria dell’Informazione, Università Politecnica delle Marche, Ancona, Italy.
Consorzio Nazionale Interuniversitario per le Telecomunicazioni (CNIT), Parma, Italy.

Nicola Maturo
Dipartimento di Ingegneria dell’Informazione, Università Politecnica delle Marche, Ancona, Italy.
Consorzio Nazionale Interuniversitario per le Telecomunicazioni (CNIT), Parma, Italy.

Enrico Paolini
Department of Electrical, Electronic, and Information Engineering “G. Marconi,” University of Bologna, Cesena, Italy

Franco Chiaraluce
Dipartimento di Ingegneria dell’Informazione, Università Politecnica delle Marche, Ancona, Italy.
Consorzio Nazionale Interuniversitario per le Telecomunicazioni (CNIT), Parma, Italy.

Shengxing Liu
Department of Applied Marine Physics and Engineering, Xiamen University, Xiamen 361102, China
Key Laboratory of Underwater Acoustic Communication and Marine Information Technology, Ministry of Education, Xiamen University, Xiamen 361102, China

Aijun Song
Department of Electrical and Computer Engineering, University of Alabama, Tuscaloosa, AL 35487, USA

Alex R. Rigby
College of Sciences and Engineering, University of Tasmania, Hobart, Australia.

JC Olivier
College of Sciences and Engineering, University of Tasmania, Hobart, Australia.

Hermanus C. Myburgh
Department of Electrical Engineering, University of Pretoria, Pretoria, South Africa.
xx
Chengshan Xiao
Department of Electrical and Computer Engineering, Lehigh University, Bethlehem, USA

Brian P. Salmon
College of Sciences and Engineering, University of Tasmania, Hobart, Australia

Dan-feng Zhao
College of Information & Communication Engineering, Harbin Engineering University, Harbin 150001, China

Hai Tian
College of Information & Communication Engineering, Harbin Engineering University, Harbin 150001, China

Rui Xue
College of Information & Communication Engineering, Harbin Engineering University, Harbin 150001, China

Nuwan Balasuriya
Department of Electronic & Telecommunication Engineering, University of Moratuwa, Katubedda, Moratuwa, Sri Lanka

Chandika B. Wavegedara
Department of Electronic & Telecommunication Engineering, University of Moratuwa, Katubedda, Moratuwa, Sri Lanka

Elizabeth Tapia
CIFASIS-Conicet Institute, Rosario, Argentina
Fac. de Cs. Exactas e Ingeniería, Universidad Nac. de Rosario, Rosario, Argentina
University Medicine Greifswald, Germany

Flavio Spetale
CIFASIS-Conicet Institute, Rosario, Argentina
Fac. de Cs. Exactas e Ingeniería, Universidad Nac. de Rosario, Rosario, Argentina
University Medicine Greifswald, Germany

Flavia Krsticevic
CIFASIS-Conicet Institute, Rosario, Argentina

Laura Angelone
CIFASIS-Conicet Institute, Rosario, Argentina
Fac. de Cs. Exactas e Ingeniería, Universidad Nac. de Rosario, Rosario, Argentina
University Medicine Greifswald, Germany

Pilar Bulacio
CIFASIS-Conicet Institute, Rosario, Argentina
Fac. de Cs. Exactas e Ingeniería, Universidad Nac. de Rosario, Rosario, Argentina
University Medicine Greifswald, Germany

A. R. Dixon
Corporate Research and Development Centre, Toshiba Corporation, Komukai Toshiba-cho, Saiwai-ku, Kawasaki-shi 212-8582, Japan

H. Sato
Corporate Research and Development Centre, Toshiba Corporation, Komukai Toshiba-cho, Saiwai-ku, Kawasaki-shi 212-8582, Japan

Zunaira Babar
School of Electronics and Computer Science, University of Southampton, SO17 1BJ, United Kingdom

Daryus Chandra
School of Electronics and Computer Science, University of Southampton, SO17 1BJ, United Kingdom

Hung Viet Nguyen
School of Electronics and Computer Science, University of Southampton, SO17 1BJ, United Kingdom

Panagiotis Botsinis
School of Electronics and Computer Science, University of Southampton, SO17 1BJ, United Kingdom

Dimitrios Alanis
School of Electronics and Computer Science, University of Southampton, SO17 1BJ, United Kingdom

Soon Xin Ng
School of Electronics and Computer Science, University of Southampton, SO17 1BJ, United Kingdom

Lajos Hanzo
School of Electronics and Computer Science, University of Southampton, SO17 1BJ, United Kingdom

Ilkwon Sohn
School of Electrical Engineering, Korea University, Seoul, Korea
Advanced KREONET Center, Korea Institute of Science and Technology Information, Daejeon, Korea

Jeongho Bang
School of Computational Sciences, Korea Institute for Advanced Study, Seoul, 02455, Korea

Jun Heo
School of Electrical Engineering, Korea University, Seoul, Korea
LIST OF ABBREVIATIONS
APP
A posteriori probability
ACM
Adaptive coding modulation
AWGN
Additive White Gaussian Noise
ASD
Algebraic soft-decision
ADC
Analog-to-digital converter
ANS
Approximate node-wise scheduling
ARQ
Automatic-Repeat-reQuest
APSCWSF
Average probability and stopping criterion weighted symbol flipping
BP
Belief propagation
BPSK
Binary phase shift keying
BER
Bit error rate
BF
Bit-flipping
BICM
Bit-Interleaved Coded Modulation
BGMD
Bit-level generalized minimum distance
BCH
Bose–Chaudhuri–Hocquenghem
CRSS
Calderbank-Rains-Shor-Sloane
CSS
Calderbank-Shor-Steane
CSI
Channel state information
CN
Check node
CND
Check nodes decoder
CNPs
Check-Node Processors
CSEE
Chien Search and Error Evaluation
CER
Code word error rate
CCSDS
Consultative Committee for Space Data Systems
CV
Continuous Variable
CCMC
Continuous-input Continuous-output Memoryless Channel
CNOT
Controlled-NOT
CC
Convolutional code
CCF
Coupling coefficient factor
CRC
Cyclic Redundancy Check
DO
Data output
DFEs
Decision feedback equalizers
DVB
Digital video broadcast
DV
Discrete Variable
DC
Dynamic concatenation
ePiBMA
Enhanced parallel inversionless Berlekamp–Massey algorithm
EA
Entanglement-Assisted
EG
Euclidean geometry
EXIT
Extrinsic-Information-Transfer
FFD
Factorization-Free decoder
FTQC
Fault-tolerant quantum computation
FPGA
Field Programmable Gate Array
FG-LDPC
Finite geometry LDPC
FEC
Forward error correction
FER
Frame error rate
FPQTD
Fully-Parallel Quantum Turbo Decoder
FPTD
Fully-Parallel Turbo Decoder
G-LDPC
Generalized LDPC
GV
Gilbert-Varshamov
GPU
Graphics processing unit
GSE
Ground state estimation
HDA
Hard decision algorithm
HDA
Hard decision decoding algorithm
HV
Hard Viterbi
HD
Hard-decision
HDD
Hard-decision decoding
HDD
HD decoder
IMWSF
Improved MWSF
IBD
Interpolation-Based decoder
ISI
Inter-symbol interference
IRCC
IRregular Convolutional Code
IAs
Iterative algorithms
KES
Key Equation Solver
KV
Kötter–Vardy
L-MBBP
Leaking MBBP
LLR
Likelihood ratio
LLR
Log-likelihood ratio
LUTs
Look up tables
LDPC
Low density parity check node
LCC
Low-Complexity Chase
LDPC
Low-density parity check codes
LDPC
Low-density parity-check
MAP
Maximum a posteriori
ML
Maximum Likelihood
MLSE
Maximum Likelihood Sequence Estimation
MTER
Maximum tolerable error rate
MRRW
McEliece-Rodemich-Rumsey-Welch
MMSE
Minimum mean-squared error
MS
Min-sum
eMBB
Enhanced mobile broadband
MWSF
Modified weighted symbol flipping
MRB
Most reliable basis
MLC
Multi-level cell
MMB
Multi-media broadcast
MBBP
Multiple-bases belief-propagation
MIMO
Multiple-input and multiple-output
MVPSF
Multiple-vote parallel symbol flipping algorithm
MVSF
Multiple-vote symbol flipping
MVA
Multiple-vote symbol flipping algorithm
MA
Multiplicity assignment
MAS
Multiplicity Assignment stage
NRF
National Research Foundation of Korea
NGS
Next-Generation Sequencing
NB-LDPC
Non-binary LDPC
NTCs
Non-Transmittable Codewords
OOK
On-off keying
ONUs
Optical network units
OSD
Ordered statistics decoding
OFDM
Orthogonal frequency division multiplexing
PCPE
Parallel Chien Polynomial Evaluation
PONs
Passive optical networks
PE
Program-and-erase
PEG
Progressive edge growth
QPSK
Quadrature phase-shift keying
QBER
Quantum Bit Error Ratio
QC
Quantum computation
QCC
Quantum Convolutional Codes
QEC
Quantum error correction
QECCs
Quantum error correction codes
QFT
Quantum Fourier transform
QIRCC
Quantum IRregular Convolutional Codes
QKD
Quantum Key Distribution
QSDC
Quantum Secure Direct Communication
QSCs
Quantum Stabilizer Codes
QTC
Quantum Turbo Codes
QC
Quasi-cyclic
QSC
Quaternary Symmetric Channel
RAM
Random access memory
RC-LDPC
Rate-compatible LDPC
RC-NBLDPC
Rate-compatible, non-binary, low-density parity check
RBER
Raw BER
BCH
Bose, Ray-Chaudhuri, and Hocquenghem
RSC
Recursive Systematic Convolutional
RRNS
Redundant Residue Number System
RS
Reed Solomon
SO
Sensing operation
SRVs
Sensing reference voltages
SPIHT
Set partitioning in hierarchical trees
SRAA
Shift register adder accumulator
PREFACE
A reliable communication system must transmit signals with vanishingly small error rates while it is exposed to a certain level of noise, reflection, diffraction, shadowing, and fading. One of the most widely used techniques for providing reliable communication is the use of error-correcting codes wherever data transmission takes place, e.g., in data storage, satellite communication, smartphones, high-definition TV, and neural networks. Section 1 of the book Mathematical Theory and Applications of Error-Correcting Codes introduces the mathematics of error-correcting codes, in particular of good ones, and closes with mathematical models of neurological data. Section 2 treats hard- and soft-decision decoding algorithms. In detail, it presents a hard-decision decoding algorithm based on loop update detection, a new architecture for soft-decision Reed–Solomon (RS) low-complexity Chase decoding, the optimum output precision of NAND Flash memory for low-energy soft-decision-based error correction, and, finally, the performance of a soft Viterbi decoder enhanced with non-transmittable codewords for storage media. Section 3 focuses on two particular classes of linear codes: one-weight linear and cyclic codes, and constacyclic codes.
Section 4 first describes how to decode some binary optimal linear codes by projecting them onto an additive code over GF(4). It then investigates the use of RS codes for both optical and wireless communications, comparing their efficiency with and without multipath propagation. Section 5 considers the class of quasi-cyclic codes resulting from the unfolding of cyclic codes and studies their reversibility. Moreover, it shows that skew cyclic codes of arbitrary length are equivalent to either cyclic or quasi-cyclic codes over Galois rings. Section 6 broadly reviews low-density parity-check (LDPC) codes, their decoding, and their applications. In particular, it treats the decoding of LDPC codes in space telecommand links, their optimization over the underwater acoustic channel, augmented decoders, and non-binary LDPC codes for the Beyond 5G mobile system. It closes with the systematic design of DNA barcodes able to accurately sustain Next-Generation Sequencing technologies, and with the use of LDPC codes for Mbit/s-rate quantum key distribution. Finally, Section 7 deals with the duality of classical and quantum error correction codes, and with the dynamic concatenation of the latter in an integrated quantum computing architecture.
SECTION 1
INTRODUCTION TO ERROR-CORRECTING CODES
CHAPTER 1
Error-correcting Codes and Neural Networks
Yuri I. Manin
Max–Planck–Institut für Mathematik, Bonn, Germany
ABSTRACT
Encoding, transmission and decoding of information are ubiquitous in biology and human history: from DNA transcription to spoken/written languages and languages of sciences. During the last decades, the study of neural networks in the brain performing their multiple tasks has provided more and more detailed pictures of (fragments of) this activity. Mathematical models of this multifaceted process led to some fascinating problems about
Citation: Manin, Y.I. “Error-correcting codes and neural networks”. Sel. Math. New Ser. 24, 521–530 (2018). https://doi.org/10.1007/s00029-016-0284-4
Copyright: © 2016, Springer Nature. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
Mathematical Theory and Applications of Error Correcting Codes
“good codes” in mathematics, engineering, and now biology as well. The notion of “good” or “optimal” codes depends on technological progress and on the criteria defining optimality of codes of various types: error-correcting ones, cryptographic ones, noise-resistant ones, etc. In this note, I discuss recent suggestions that the activity of some neural networks in the brain, in particular those responsible for space navigation, can be well approximated by the assumption that these networks produce and use good error-correcting codes. I give mathematical arguments supporting the conjecture that the search for optimal codes is built into neural activity and is observable.
INTRODUCTION AND SUMMARY
Recently it became technically possible to record simultaneously the spiking activity of large neural populations (cf. Refs. 1, 2 in [21]). The data supplied by these studies show “signatures of criticality”. This means that the respective neural populations are functioning near the point of a phase transition if one chooses appropriate statistical models of their behavior. In some papers the relevant data for the retina were interpreted as arguments for the optimality of encoding visual information (cf. [22]), whereas in other works such as [21] it was argued that criticality might be a general property of many models of collective neural behavior, unrelated to encoding optimality.

In this note I test the philosophy relating criticality with optimality using the results of recent works suggesting models of encoding of stimulus space that utilise the basic notions of the theory of error-correcting codes.

The recent collaboration of neurobiologists and mathematicians led, in particular, to the consideration of binary codes used by the brain for encoding and storing a stimulus domain, such as a rodent’s territory, through the combinatorics of its covering by local neighbourhoods: see [1,2,24]. These binary codes as they are described in [1,2,24] (cf. also a brief survey for mathematicians [14]) are not good error-correcting codes themselves. However, these codes as combinatorial objects are produced by other neural networks during the stage when a rodent, say, studies a new environment. During this formation period, other neural networks play an essential role, and I suggest that the auxiliary codes used in this process are error-correcting ones, changing in the process of seeking optimality. In laboratory experiments, the dynamics of these auxiliary codes is described in terms of criticality, i.e. working near phase transition boundaries.
Error-correcting codes and neural networks
I approach the problem of relating criticality to optimality of such neural activities using a new statistical model of error-correcting codes explained in [15]. In this model the thermodynamic energy is measured by the complexity of the information to be encoded, this complexity in turn being interpreted as the length of a maximally compressed form of this information, Kolmogorov style. As far as I know, the respective Boltzmann partition function based upon L. Levin’s probability distributions ([9,10]) has never appeared before [15] in the theory of codes. In this setting, the close relationship between criticality and optimality becomes an established (although highly non-obvious) mathematical fact, and I argue that the results of [1,2,24] allow one to transpose it into the domain of neurobiology.

In the main body of the paper below, Sect. 1 is dedicated principally to the mathematics of error-correcting codes, in particular of good ones, whereas Sect. 2 focusses on mathematical models of neurological data and related problems from our perspective.

I am very grateful to many people whose kind attention and expert opinions helped crystallize the views expressed in this article, especially S. Dehaene, and recently W. Levelt, who in particular directed me to the recent survey [18]. I am also happy to dedicate this paper to the inspired researcher Sasha Beilinson, who is endowed with almost uncanny empathy for living inhabitants of this planet!
CODES AND GOOD CODES
Codes
In this article a code C means a set of words of finite length in a finite alphabet A, i.e. a subset $C \subset \bigcup_{n \ge 1} A^n$. Here we will be considering only finite codes or even codes consisting of words of a fixed length.

Informally, codes are used for the representation (“encoding”) of certain information as a sequence of code words. If a code is imagined as a dictionary, then information is encoded by texts, and all texts that are admissible in some sense (for example, “grammatically correct”) constitute a language. The main motivation for encoding information is the communicative function of a language: a text can be transmitted from a source to a receiver/target via a channel.
An informal notion of a “good” or even “optimal” code strongly depends on the circumstances in which the encoding/transmission/decoding takes place. For example, cryptography provides codes that help avoid unauthorized access to and/or falsification of information, whereas the construction of error-correcting codes allows the receiver to reconstruct with high probability the initial text sent by the source through a noisy channel.

Of course, it is unsurprising that, after the groundbreaking work by Shannon–Weaver, the quality of error-correcting codes is estimated in probabilistic terms. One starts by postulating certain statistical properties of the transmitted information and the noisy channel, and produces a class of codes maximizing the probability of getting at the receiver end a signal close to the one that was sent. In the best case, at the receiver end one should be able to correct all errors using an appropriate decoding program.

In this paper, I will be interested in a large class of error-correcting codes whose quality from the start is appreciated in purely combinatorial terms, without appealing to probabilistic notions at all. The fact that good codes lie near/on the so-called asymptotic bound, whose existence was also established by combinatorial means (see [23] and references therein), was only recently interpreted in probabilistic terms as criticality of such codes: cf. [12,15,16]. The class of probability distributions that popped up in this research was introduced in [9]. It involves a version of Kolmogorov complexity, has a series of quite non-obvious properties (e.g. a fractal symmetry), and was subsequently used in [13] in order to explain Zipf’s law and its universality. In the remaining part of this section, I will briefly explain these results.
Combinatorics of error-correcting codes
From now on, we consider finite codes consisting of words of a fixed length: $C \subset A^n$, $n \ge 1$. The cardinality of the alphabet A is denoted $q \ge 2$. If A, C are not endowed with any additional structure (in the sense of Bourbaki), we refer to C as an unstructured code. If such an additional structure is chosen, e.g. A is a linear space over a finite field $\mathbb{F}_q$, and C is a linear subspace of $A^n$, we may refer to such a code as a linear one, or generally use the name of this structure.

Consider, for example, neural codes involved in place field recognition according to [2,24]. Here q = 2, and in [24] these codes are identified with subsets of $\mathbb{F}_2^n$, but they are not assumed to be linear subspaces.
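Besides alphabet cardinality q = q(C) and word length n = n(C), the two most important combinatorial characteristics of a code C are the (logarithmic) cardinality

$$k(C) := \log_q \operatorname{card}(C)$$

and the minimal Hamming distance between different code words:

$$d(C) := \min\{\operatorname{dist}(a, b) \mid a, b \in C,\ a \ne b\}, \qquad \operatorname{dist}((a_i), (b_i)) := \operatorname{card}\{i \in \{1, \dots, n\} \mid a_i \ne b_i\}.$$

In the degenerate case card C = 1 we put d(C) = 0. We will call the numbers q, k = k(C), n = n(C), d = d(C) code parameters and refer to C as an $[n, k, d]_q$-code.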
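As an illustration of these definitions, the following minimal Python sketch computes n, k and d for a small unstructured code given as a list of equal-length strings. The function names and the two example codes are hypothetical and chosen only for illustration; the degenerate case card C = 1 is handled as in the text.

```python
import math
from itertools import combinations


def hamming_distance(a, b):
    """Number of positions in which two equal-length words differ."""
    if len(a) != len(b):
        raise ValueError("words must have the same length")
    return sum(x != y for x, y in zip(a, b))


def code_parameters(code, q):
    """Return (n, k, d) for a fixed-length code over an alphabet of size q."""
    words = sorted(set(code))
    n = len(words[0])
    k = math.log(len(words), q)  # logarithmic cardinality k(C)
    if len(words) == 1:
        d = 0  # degenerate case card C = 1
    else:
        d = min(hamming_distance(a, b) for a, b in combinations(words, 2))
    return n, k, d


# Hypothetical examples: binary repetition code and even-weight code of length 3.
repetition = ["000", "111"]
even_weight = ["000", "011", "101", "110"]
print(code_parameters(repetition, q=2))
print(code_parameters(even_weight, q=2))
```

The pairwise search costs on the order of card(C)² comparisons, which is fine for toy codes but not for large ones.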
These parameters play the following role in the crude estimates of code quality. The arity q(C) and the number of pairwise different code words $q^{k(C)}$ essentially determine how many “elementary units” of source information can be encoded by one-word messages. Codes with very large q (such as Chinese characters in comparison with alphabetic systems) can encode thousands of semantic units by one-letter words, whereas alphabetic codes with q ≈ 30 in principle allow one to encode $30^4$ units by words of length 4.

The minimal distance between two code words gives an upper estimate of the number of letters in a code word misrepresented by a noisy channel that can still be safely corrected at the receiving end. Thus, if the channel is not very noisy, a small d(C) might suffice; otherwise larger values of d(C) are needed. When d(C) is large, this slows down the information transmission, because the source is not allowed to use all $q^n$ words for encoding, but still must transmit a sequence of n-letter words.

Finally, when solving engineering problems, it might be preferable to use structured codes, e.g. linear ones, in order to minimise encoding and decoding efforts at the source/target ends. But in this paper, I will not deal with this. In the remaining part of the paper, in particular in discussions of optimality, the alphabet cardinality q is fixed.
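The role of the minimal distance in error correction can be made concrete with a small sketch. It assumes nearest-codeword decoding, which is one standard decoding rule and is not prescribed by the text, together with the standard fact that a code with minimal distance d corrects any pattern of at most ⌊(d − 1)/2⌋ letter errors under this rule. The function names and the example code are hypothetical.

```python
def hamming_distance(a, b):
    """Number of positions in which two equal-length words differ."""
    return sum(x != y for x, y in zip(a, b))


def nearest_codeword(received, code):
    """Decode by choosing a code word closest to the received word."""
    return min(code, key=lambda word: hamming_distance(received, word))


def guaranteed_correctable(d):
    """Errors always corrected by nearest-codeword decoding: floor((d - 1) / 2)."""
    return (d - 1) // 2 if d > 0 else 0


# Hypothetical example: binary repetition code of length 5, with d = 5.
code = ["00000", "11111"]
received = "01001"  # "00000" with two letters misrepresented by the channel
print(nearest_codeword(received, code))
print(guaranteed_correctable(5))
```

With three or more errors the received word may be closer to the wrong code word, which is why a noisier channel calls for a larger d(C).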
From combinatorics to statistics: I. The set of code points and asymptotic bound
Consider an $[n, k, d]_q$-code C as above and introduce two rational numbers: the (relative) transmission rate

$$R(C) := \frac{[k(C)]}{n(C)} \tag{1.1}$$

and the relative minimal distance

$$\delta(C) := \frac{d(C)}{n(C)}. \tag{1.2}$$

In (1.1) we used the integer part [k(C)] rather than k(C) itself in order to obtain a rational number, as was suggested in [12]. The larger n is, the closer is (1.1) to the version of R(C) used in [15,16] and earlier works.

According to our discussion above, an error-correcting code is good if in a sense it maximises simultaneously the transmission rate and the minimal distance. A considerable bulk of research in this domain is dedicated either to the construction/engineering of (families of) “good” error-correcting codes or to the proofs that “too good” codes do not exist. Since the choice of the transmission rate in a given situation is dictated by the statistics of noise in a noisy channel, we may imagine this task as maximisation of δ(C) for each fixed R(C).

In order to treat this problem as mathematicians (rather than engineers) do, we introduce the notion of a code point

$$\mathrm{cp}(C) := (R(C), \delta(C)) \in [0, 1]^2 \cap \mathbb{Q}^2$$

and denote by $V_q$ the set of code points of all codes of given arity q. Let $P_q$ denote the set of all such codes.
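A code point can be computed exactly with rational arithmetic. The sketch below assumes the usual reading of (1.1) and (1.2), namely R(C) = [k(C)]/n(C) and δ(C) = d(C)/n(C), and computes the integer part [k(C)] with integer arithmetic to avoid floating-point logarithms. The function names and example codes are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def integer_log(card, q):
    """Largest integer k with q**k <= card, i.e. the integer part of log_q(card)."""
    k = 0
    while q ** (k + 1) <= card:
        k += 1
    return k


def code_point(code, q):
    """Return the pair (R(C), delta(C)) as exact rational numbers."""
    words = sorted(set(code))
    n = len(words[0])
    if len(words) > 1:
        d = min(
            sum(x != y for x, y in zip(a, b))
            for a, b in combinations(words, 2)
        )
    else:
        d = 0  # degenerate case card C = 1
    return Fraction(integer_log(len(words), q), n), Fraction(d, n)


# Hypothetical examples over the binary alphabet.
print(code_point(["000", "011", "101", "110"], q=2))  # even-weight code
print(code_point(["000", "111"], q=2))                # repetition code
```

Because the map forgets everything about a code except two ratios, many different codes share the same code point, which is what the notion of multiplicity below measures.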
Let $U_q$ be the closed set of limit points of $V_q$. We will call elements of $V_q \cap U_q$ limit code points. The remaining subset of isolated code points is defined as $V_q \setminus (V_q \cap U_q)$.

More than thirty years ago it was proved that $U_q$ consists of all points in $[0, 1]^2$ lying below the graph of a certain continuous decreasing function $\alpha_q$:

$$U_q = \{(R, \delta) \mid R \le \alpha_q(\delta),\ 0 \le \delta \le 1\}. \tag{1.3}$$

Moreover, $\alpha_q(0) = 1$, $\alpha_q(\delta) = 0$ for $1 - q^{-1} \le \delta \le 1$, and the graph of $\alpha_q$ is tangent to the R-axis at (1, 0) and to the δ-axis at $(0, 1 - q^{-1})$. This curve is called the asymptotic bound. For a modern treatment and a vast amount of known estimates for asymptotic bounds, cf. [23].

Thus, an error-correcting code can be considered a good one if its point either lies in $U_q$ and is close to the asymptotic bound, or is isolated, that is, lies above the asymptotic bound.
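No closed formula for the asymptotic bound is given here, but two classical estimates of the kind referred to above are easy to evaluate. The sketch below is an illustration that is not taken from the article: it computes the asymptotic Gilbert–Varshamov lower bound, 1 − H_q(δ) with H_q the q-ary entropy function, and the asymptotic Plotkin upper bound, 1 − δq/(q − 1), both valid for 0 ≤ δ ≤ 1 − 1/q. The function names are hypothetical.

```python
import math


def q_ary_entropy(delta, q):
    """q-ary entropy function H_q(delta) for 0 <= delta < 1."""
    if delta == 0:
        return 0.0
    return (
        delta * math.log(q - 1, q)
        - delta * math.log(delta, q)
        - (1 - delta) * math.log(1 - delta, q)
    )


def gilbert_varshamov_lower(delta, q):
    """Asymptotic Gilbert-Varshamov lower estimate for the asymptotic bound."""
    if delta >= 1 - 1 / q:
        return 0.0
    return 1 - q_ary_entropy(delta, q)


def plotkin_upper(delta, q):
    """Asymptotic Plotkin upper estimate for the asymptotic bound."""
    return max(0.0, 1 - delta * q / (q - 1))


# The true asymptotic bound lies between the two estimates.
for delta in (0.0, 0.1, 0.25, 0.4, 0.5):
    low = gilbert_varshamov_lower(delta, 2)
    high = plotkin_upper(delta, 2)
    print(f"delta={delta:.2f}  GV={low:.4f}  Plotkin={high:.4f}")
```

Both estimates agree with the boundary behaviour stated above: they equal 1 at δ = 0 and vanish for δ ≥ 1 − 1/q.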
The main result of [12] was the following description of limit and isolated code points in terms of the computable map $\mathrm{cp} : P_q \to V_q$ rather than the topology of the unit square. We will say that a code point $x \in V_q$ has infinite (resp. finite) multiplicity if $\mathrm{cp}^{-1}(x) \subset P_q$ is infinite (resp. finite).
Theorem [12]
(a) Code points of infinite multiplicity are limit points. Therefore isolated code points have finite multiplicity.
(b) Conversely, any point $(R_0, \delta_0)$ with rational coordinates satisfying the inequality $0 < R_0 < \alpha_q(\delta_0)$ (resp. $0 < R_0 < \alpha_q^{\mathrm{lin}}(\delta_0)$) is a code point (resp. linear code point) of infinite multiplicity.

The existence of isolated codes is established, but we are pretty far from understanding the whole set of them. According to our criteria, an isolated code having the transmission rate matching the channel noise level would be really the best of all good codes. But it is totally unclear how such codes can be engineered. To the contrary, trials and corrections might lead us close to a crossing of an appropriate asymptotic bound, and we will later stick to this subclass of “optimal codes”.
From combinatorics to statistics: II. Crossing asymptotic bound as a phase transition
The main result of [15] (see also [16]) consisted in the suggestion to use on the set $P_q$ the probability distribution in which the energy level of a code is a version of its Kolmogorov complexity. This distribution was introduced and studied by L. Levin. Furthermore, we proved that the respective Boltzmann partition function using this distribution produces the phase transition curve exactly matching the asymptotic bound $\alpha_q$.
In order to make more precise the analogy with phase transition in physical systems, it is convenient to introduce the function inverse to $\alpha_q$, which we will denote $\beta_q$. This means that the equation of the asymptotic bound is now written in the form $\delta = \beta_q(R)$, and the domain below $\alpha_q$ is defined by the inequality $\delta < \beta_q(R)$.

[...] $k > k_{\max}$ ($k_{\max}$ is the maximum number of iterations set by the user), the decoding is declared to have failed and stops. Completing a symbol flip requires determining two parameters: (1) the position of the flip; and (2) the value or amplitude of the flip.
• Step 4: Determining the position of the flip symbol
We calculate the measured value for each variable node at the k-th iteration. The core of the WSF algorithm involves flipping the symbols that do not satisfy the check equation. To calculate $E_n^{(k)}$, we first need to calculate the reliability of variable node n for each of $\{\alpha_1, \cdots, \alpha_{q-1}\}$.
(5)
In Equation (5), β (β ≥ 0) is the weighting factor. When β = 0, the MSMWSF algorithm reduces to the SMWSF algorithm. For the given non-binary LDPC codes, the optimal value of the weighting factor can be obtained by Monte Carlo simulation under the specified number of iterations [30]. After this, we update the check equation to get the adjacency value of the hard decision:
(6)
Here, m is the CN index. If all check equations are satisfied, a valid code word is obtained and the search is stopped. Otherwise, the flip function is calculated and the loop continues.
In order to estimate the reliability measure of a symbol, the measure of each symbol must be calculated:
(7)
We select the location where the symbol is to be flipped, $n^{(k)}$, with $0 \le n^{(k)} < N$. The summation in Equation (7) is a weighted measure of how likely each location is to hold a symbol that should be flipped. The sum of the weights of 1 can enhance the measure of the symbol position and provide a more accurate judgment standard. In this way, the symbol position corresponding to the maximum value of the measure is the position of the flip symbol, as in Equation (8):

$$n^{(k)} = \arg\max_{0 \le n < N} E_n^{(k)} \tag{8}$$
• Step 5: Determining the value or magnitude of the flip
For the chosen symbol $n^{(k)}$, flip the bit corresponding to the minimum value of $|r_{ni}|$, where $0 \le i < b$, and update $y^{(k-1)}$.
• Step 6: Decoding
According to the results of the flip, a new hard-decision decoding sequence is obtained. Assign the value of $y^{(k-1)}$ to $y^{(k)}$, $y^{(k)} \leftarrow y^{(k-1)}$, and return to the second step.
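Steps 4 to 6 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the flip-function values E are taken as given (Equations (5) to (7) are not reproduced here), each symbol is a hypothetical integer in the range 0 to 2^b − 1 whose bit i (counted from the most significant bit) corresponds to the received value r[n·b + i], and all names are invented for the example.

```python
def select_flip_position(E):
    """Step 4: index of the symbol with the largest flip-function value."""
    return max(range(len(E)), key=lambda n: E[n])


def flip_least_reliable_bit(symbols, r, n, b):
    """Step 5: flip the bit of symbol n whose received value has the smallest magnitude.

    Assumption: bit i of a symbol (most significant bit first) corresponds to r[n*b + i].
    """
    start = n * b
    i = min(range(b), key=lambda j: abs(r[start + j]))
    updated = list(symbols)              # Step 6: new hard-decision sequence
    updated[n] = symbols[n] ^ (1 << (b - 1 - i))
    return updated


# Hypothetical data: N = 3 symbols of b = 3 bits each.
symbols = [3, 0, 5]
r = [0.9, -0.8, 1.1, 0.2, -0.05, 0.7, -1.0, 0.6, -0.9]
E = [0.1, 0.9, 0.3]

n_flip = select_flip_position(E)
print(n_flip, flip_least_reliable_bit(symbols, r, n_flip, b=3))
```

Returning a new list rather than modifying the input keeps the previous sequence available, which matters once earlier outputs have to be compared with the current one.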
Loop Update Detection Algorithm
Symbol flipping can fall into an infinite loop, in which the symbol obtained after the current flip is still an error symbol and repeated flipping never corrects it. Therefore, this paper proposes a loop update detection algorithm to further improve the decoding performance and speed up the convergence. Firstly, the output symbol sequence matrix and the infinite loop detection matrix are defined, which involves the sequential traversal of each symbol. After this, sorting them from smallest to largest, we determine the location of the flip symbol. From the second smallest symbol, we re-flip the same error symbol, which is converted to its bit values. The wrong bits are sorted from smallest to largest. Following this order of index values, the bit is flipped and then converted into a symbol value to replace the previous decoding symbol. The symbol position corresponding to the maximum value is placed in the exclusion sequence. Essentially, after the
Sum of the Magnitude for Hard Decision Decoding Algorithm ...
symbol position corresponding to the maximum value is excluded, the maximum value error symbol is searched again. This process is repeated in each decoding iteration. The specific decoding process is as follows.
• Step 1: Initialization
The excluded sequence A is initialized to an empty set; it stores the positions of symbols that do not satisfy the flipping function. The bit flipping identifier F is defined and initialized to 1; it is used as a counter. Its maximum value is b, the number of bits per symbol, and the value of F determines the number of bits to be flipped in the flip symbol.
• Step 2: Determining the magnitude of symbol flipping
The bit positions to be flipped in the symbol at position $n^{(k)}$ are determined by the binary sequence $r = [r_0, r_1, \cdots, r_{Nb-1}]$ received from the AWGN channel. The symbol at position $n^{(k)}$ can be converted into b bits, which correspond to $[r_{nb}, r_{nb+1}, \cdots, r_{(n+1)b-1}]$ in r. We sort $|r_i|$ ($nb \le i \le (n+1)b - 1$) from largest to smallest, with a smaller $|r_i|$ indicating lower reliability of the corresponding bit. Therefore, the F bit positions with the smallest absolute values in $[r_{nb}, r_{nb+1}, \cdots, r_{(n+1)b-1}]$ are selected according to the bit flipping identifier F, and the F bits in the corresponding positions are inverted to obtain a new symbol sequence $y^{(k)} = [y_0, y_1, \cdots, y_{N-1}]$.
• Step 3: Detecting whether there is an infinite loop
The loop update detection (LUD) algorithm is as follows. First, the output symbol sequence matrix $Y_{(k+1) \times N}$ is defined, as shown in Equation (9).
(9)
Following this, an infinite loop detection matrix $E_{k \times N}$ is defined, as shown in Equation (10).
(10)
As long as all the elements of one row in the infinite loop detection matrix are 0, an infinite loop is detected. Otherwise, no infinite loop is detected.
(1) If an infinite loop is detected and F has not reached the maximum b, we increase the value of F by 1. After this, we return to Step 2 to re-determine the positions of the specific bits to be flipped in the symbol, before flipping the F bits in the corresponding positions.
(2) If an infinite loop is detected but F has reached the maximum b, the currently selected flip symbol position is stored in the exclusion symbol sequence A. F is set to 1 and the flip symbol position is searched for again.
(3) If no infinite loop is detected, the exclusion symbol sequence A is reset to the empty set, the bit flipping identifier F is reset to 1 and the check equation is recalculated.

Combining the loop update detection algorithm with the weighted symbol flipping algorithm based on sum of magnitude, a modified sum of the magnitude for the weighted symbol flipping decoding algorithm based on loop update detection (LUDMSMWSF) is proposed. On the basis of the MSMWSF algorithm, the algorithm repeatedly flips the bits of the symbol judged to be in error, trying them in order from the most to the least probable error, until it finds the correct code word. If an infinite loop is still detected after all flips of the current symbol have been tried, the current symbol is not an error symbol. In that case, the algorithm adds the current symbol to the excluded symbol set to ensure that the same position is not selected again in the next search. The specific decoding algorithm flow chart is shown in Figure 1.
Figure 1. Flow chart of LUDMSMWSF algorithm.
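The loop check and the three cases above can be sketched as follows. Equations (9) and (10) are not reproduced in the text, so the sketch assumes that the rows of the loop detection matrix are the element-wise differences between the newest output sequence and each earlier output sequence; an all-zero row then means the decoder has returned to a sequence it already produced. Symbols are hypothetical integers and all names are invented for the example.

```python
def loop_detection_matrix(history, current):
    """Assumed form of E: one row per earlier sequence, holding current minus earlier."""
    return [[c - p for c, p in zip(current, previous)] for previous in history]


def loop_detected(history, current):
    """An infinite loop is detected when some row of E is all zeros."""
    return any(
        all(e == 0 for e in row)
        for row in loop_detection_matrix(history, current)
    )


def update_state(looped, F, b, excluded, position):
    """Apply cases (1) to (3); returns (new F, new excluded set, next action)."""
    if looped and F < b:
        return F + 1, set(excluded), "reflip"            # case (1)
    if looped:
        return 1, set(excluded) | {position}, "reselect"  # case (2)
    return 1, set(), "recompute_checks"                   # case (3)


# Hypothetical run: the new sequence repeats the first recorded one.
history = [[1, 2, 3], [1, 0, 3]]
current = [1, 2, 3]
looped = loop_detected(history, current)
print(looped, update_state(looped, F=1, b=3, excluded=set(), position=1))
```

Comparing against the full history costs on the order of k·N operations per check; a practical decoder might hash earlier sequences instead, which is a design choice outside the scope of the text.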
COMPLEXITY ANALYSIS
In this section, the complexity of the decoding algorithms is analyzed under the following conditions: (1) the small number of binary calculations and multiplication operations in the algorithm is ignored, and a comparison operation is assumed to be equivalent to an addition operation; and (2) regular non-binary LDPC codes are taken as an example to compare the average number of real addition operations in each algorithm.

In the past, the computational complexity of the initial stage before the first iteration was often neglected when analyzing computational complexity [31]. However, the simulation analysis of the decoding algorithms in the next section shows that, at high signal-to-noise ratio (SNR), frames converge after few decoding iterations. The initialization phase therefore accounts for a high proportion of the decoding complexity, and neglecting it leads to a significant error. Therefore, the computational complexity involved in the initialization stage before the first iteration is taken into account.

At the same time, the principle described in the second section shows that all WSF algorithms perform symbol flipping in the final stage of each iteration, i.e. they are serial decoding algorithms. After the symbol flipping is completed, the check equation is updated and the flip function is updated. Each iteration involves three steps: (1) calculating the adjoining vector; (2) updating the flip function; and (3) searching for the flipped bit. Taking the standard WSF algorithm as an example, if the variable node n is flipped during the previous iteration, the adjoining vectors of the $d_v$ check nodes connected to it need to be updated. After the update of the vector is completed, the flip functions of the $d_c$ variable nodes connected to each such check node m are calculated. Therefore, for both steps, the total number of operations required is $d_c \cdot d_v$.
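The count $d_c \cdot d_v$ can be checked on a small example. The sketch below assumes a parity-check matrix given as a list of rows in which any nonzero entry marks an edge between a check node and a variable node; the matrix shown is a hypothetical regular example with column weight 2 and row weight 3, and the function name is invented for the illustration.

```python
def updates_after_flip(H, n):
    """Count flip-function updates triggered by flipping variable node n.

    Each check node connected to n (d_v of them in a regular code) has its
    adjoining vector updated, and each of those touches the d_c variable
    nodes connected to it.
    """
    checks = [m for m, row in enumerate(H) if row[n] != 0]
    return sum(sum(1 for h in H[m] if h != 0) for m in checks)


# Hypothetical regular parity-check matrix: column weight d_v = 2, row weight d_c = 3.
# Nonzero entries stand for arbitrary nonzero field elements of a non-binary code.
H = [
    [1, 2, 3, 0, 0, 0],
    [0, 0, 0, 1, 2, 3],
    [2, 0, 0, 3, 1, 0],
    [0, 3, 1, 0, 0, 2],
]

print([updates_after_flip(H, n) for n in range(6)])
```

For a regular code the count is the same for every variable node, which is why a single per-iteration figure can be used in the analysis.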
In Step (3), it is assumed that the node position of the variable corresponding to the maximum flip function is found in n variable nodes, while further comparison operations are performed for N − 1 times. Therefore, the calculation amount of an iterative process in a standard WSF algorithm is N − 1 + dc ⋅ dv. The specific update process is as follows. First, we calculate and update it according to the updated formula of flip function, which is . The average number of iterations of the five decoding algorithms is AI1–AI6, while dc represents line weight and dv represents column weight. Table 1 gives the calculation methods of the total number of real operations of each decoding algorithm. From Table 1, it can be concluded that the complexity
30
Mathematical Theory and Applications of Error Correcting Codes
of the LUDWSF, LUDSMWSF and LUDMSMWSF algorithms is lower than the WSF, SMWSF and MSMWSF algorithms. Taking one algorithm as an example, the computation amount of WSF algorithm and LUDWSF algorithm is the same for each iteration with the difference of the average iteration times. However, the average iterations of LUDWSF algorithm are much less than that of WSF algorithm. From Table 1, the LUDMSMWSF algorithm only requires real additions, complexity is Mq(2dc − 1) + Nqdv + (N − 1) + (N − 1 + dcdv)(AI5 − 1). However, complexity of the Fast Fourier Transform-based belief propagation decoding algorithm (FFT-BP) [32,33] includes real additions, multiplications and divisions, the complexities of which are AI6[2Ndvqlog2q + 2Ndv(q − 1) + M(dc − 1)], AI6[Ndvq(dc + 2dv − 1) + Mdc] and AI6[Ndv(q + 2)], respectively. Real multiplications and divisions are more consumable units than real additions. Although the proposed WSF algorithm with flipping pattern requires more iterations for decoding than FFT-BP, it needs less real additions than FFT-BP and requires no multiplying the iterations with computational requirements in each iteration, the total computational requirement of WSF algorithm is still lower than FFT-BP. Therefore, the computational requirement of WSF algorithm is much lower than FFT-BP. Table 1. The average total number of real operations in each algorithm.
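The operation counts quoted above can be evaluated directly. The following is a minimal sketch, assuming the expressions are read exactly as printed in the text; the function names are hypothetical, and the iteration counts passed in are placeholders rather than measured values from the paper.

```python
def ludmsmwsf_additions(N, M, q, dv, dc, avg_iters):
    """Real additions for LUDMSMWSF, using the expression quoted in the text."""
    # Terms that do not scale with the number of iterations.
    fixed = M * q * (2 * dc - 1) + N * q * dv + (N - 1)
    # Per-iteration cost: N - 1 comparisons plus dc * dv local updates.
    per_iter = (N - 1) + dc * dv
    return fixed + per_iter * (avg_iters - 1)


def fftbp_operations(N, M, q, dv, dc, avg_iters):
    """(additions, multiplications, divisions) for FFT-BP, as quoted in the text."""
    log2q = q.bit_length() - 1
    assert 1 << log2q == q, "q is assumed to be a power of two"
    adds = avg_iters * (2 * N * dv * q * log2q + 2 * N * dv * (q - 1) + M * (dc - 1))
    mults = avg_iters * (N * dv * q * (dc + 2 * dv - 1) + M * dc)
    divs = avg_iters * (N * dv * (q + 2))
    return adds, mults, divs


if __name__ == "__main__":
    # Illustrative parameters only: N=384, M=192, q=16, dv=3, dc=6 are assumed,
    # and the iteration counts (10 and 1) are placeholders.
    print(ludmsmwsf_additions(384, 192, 16, 3, 6, 10))
    print(fftbp_operations(384, 192, 16, 3, 6, 1))
```

Because the symbol-flipping count contains no multiplications or divisions, the comparison with FFT-BP reduces to the addition totals plus the extra multiplication and division terms that only FFT-BP incurs.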
SIMULATION RESULTS AND STATISTICAL ANALYSIS
The simulation parameters used in this section are as follows: (384,192) LDPC codes with a code rate of 0.5 and a column weight of three. The parity-check matrix is generated by the progressive edge growth (PEG) algorithm [34,35]. A 4-ary code (Code 1) and a 16-ary code (Code 2) are simulated, each with a maximum of 100 iterations. Under AWGN channel conditions and using BPSK modulation, at least 1000 error bits are collected at each SNR. The link-level simulation block diagram is shown in Figure 2.
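The channel part of this setup (BPSK over AWGN at rate 0.5, collecting errors until a threshold is reached) can be sketched as follows. This is only an uncoded hard-decision link with no LDPC decoder in the loop; the function names, the frame length and the noise convention $\sigma^2 = 1/(2R \cdot E_b/N_0)$ for unit-energy BPSK are assumptions made for illustration, not details stated in the paper.

```python
import math
import random


def noise_sigma(ebn0_db, code_rate):
    """Noise standard deviation for unit-energy BPSK (assumed convention)."""
    ebn0 = 10 ** (ebn0_db / 10)
    return math.sqrt(1.0 / (2.0 * code_rate * ebn0))


def count_bit_errors(bits, sigma, rng):
    """Send bits as +1/-1, add Gaussian noise, and count hard-decision errors."""
    errors = 0
    for b in bits:
        tx = 1.0 - 2.0 * b                 # bit 0 -> +1, bit 1 -> -1
        rx = tx + rng.gauss(0.0, sigma)
        hard = 0 if rx >= 0 else 1
        errors += hard != b
    return errors


def ber_until(min_errors, frame_bits, ebn0_db, code_rate, seed=0, max_frames=10_000):
    """Simulate frames until at least min_errors bit errors have been collected."""
    rng = random.Random(seed)
    sigma = noise_sigma(ebn0_db, code_rate)
    errors = total = frames = 0
    while errors < min_errors and frames < max_frames:
        bits = [rng.getrandbits(1) for _ in range(frame_bits)]
        errors += count_bit_errors(bits, sigma, rng)
        total += frame_bits
        frames += 1
    return errors / total


if __name__ == "__main__":
    print(ber_until(1000, 384, 3.5, 0.5))
```

In a full link-level simulation the hard decision would be replaced by the decoder under test, and the same stopping rule would be applied to the decoded bits.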
Figure 2. The link level simulation block diagram.
Weighted Factor Test
For non-binary LDPC codes with given column weights, the decoding performance at the same signal-to-noise ratio differs with the weighting factor. When a constant, optimal weighting factor is chosen, the performance loss is negligible and the implementation complexity is decreased [36,37]. The optimal value of the weighting factor generally depends on the non-binary LDPC code and its specific code structure. Figure 3 and Figure 4 show the bit error rate (BER) performance of Code 1 and Code 2 with different weighting factors when using the MSMWSF algorithm under different SNR conditions. Based on the definition of the optimal value of the weighting factor, it can be seen from Figure 3 that the influence of the weighting factor on the BER is not obvious at lower SNR, and that the optimal value varies little as the SNR increases. As shown in Figure 3 and Figure 4, the optimal weighting factor of the MSMWSF algorithm in this paper is 1.8 for Code 1 and 1 for Code 2.
Figure 3. Code 1 weighted factor test.
Figure 4. Code 2 weighted factor test.
Comparison of Algorithm Performance and Average Iteration Numbers
The performance of the five decoding algorithms under the optimal parameters is shown in Figure 5 for Code 1 and in Figure 6 for Code 2. As the SNR increases, the coding gain of the MSMWSF algorithm over the other existing algorithms gradually increases. At low SNR, the performance of the MSMWSF algorithm is almost the same as that of the other algorithms. According to the simulation results in references [34,35,36], the performance of the WSF algorithm degrades as the field order (the number of bits per symbol) increases. From Figure 5 and Figure 6, we can see that the performance of Code 1 is better than that of Code 2, which supports the effectiveness of the algorithm proposed in this paper.
Figure 5. Comparison of five algorithms under Code 1.
Figure 6. Comparison of five algorithms under Code 2.
Figure 7 and Figure 8 show the performance of the SMWSF and MSMWSF algorithms for Code 1 and Code 2, respectively, with the optimal weighting factors, after introducing the loop update detection algorithm. As seen from Figure 7 and Figure 8, the LUDMSMWSF algorithm improves the decoding performance compared to the MSMWSF algorithm and accelerates convergence. Compared with the WSF algorithm, a coding gain of 2.2 dB is obtained for Code 1 and a coding gain of 2.35 dB for Code 2. Under the same decoding algorithm, a coding gain of about 0.8–1.1 dB is obtained after introducing the proposed loop update detection algorithm. At the same time, the LUDMSMWSF algorithm is about 0.85–1.05 dB away from FFT-BP decoding at a BER of $10^{-5}$, with a tremendous reduction in computational requirement. In view of the results in Figure 7 and Figure 8, we argue that the proposed symbol flipping algorithm offers a trade-off between performance and computational cost.
Figure 7. Comparison of improved algorithms under Code 1.
Figure 8. Comparison of improved algorithms under Code 2.
Figure 9 and Figure 10 give the average number of iterations and the average number of addition operations for Code 1 under the five decoding algorithms. As shown in Figure 9 and Figure 10, the two algorithms proposed in this paper require significantly fewer iterations and addition operations on average than the traditional algorithms. Because the proposed algorithm has a more efficient and accurate symbol flipping function, it improves the performance and reduces the complexity to a certain extent. We see that the MSMWSF algorithm achieves fast convergence and low complexity.
Figure 9. The average number of iterations of five algorithms under Code 1.
Figure 10. The average number of addition operations of five algorithms under Code 1.
We set the SNR to 3.5 dB and simulated 10,000 frames for Code 1. The numbers of decoding failure frames for the five algorithms are listed in Table 2. Table 2 shows that the WSF algorithm fails on nearly 99% of the frames, whereas the two decoding algorithms proposed in this paper, SMWSF and MSMWSF, fail on about 62% and 42% of the frames, fewer than the traditional algorithms. The proposed algorithm also does not increase the implementation complexity; it reduces it to a certain extent.

Table 2. Five algorithms for decoding failure frames.

SF Algorithm        Decoding Failure Frames
WSF algorithm       9915
MWSF algorithm      9840
IMWSF algorithm     6544
SMWSF algorithm     6200
MSMWSF algorithm    4184
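The percentages quoted in the text follow directly from the counts in Table 2. A minimal sketch of that arithmetic, with variable names chosen only for illustration:

```python
FRAMES = 10_000  # frames simulated at 3.5 dB for Code 1 (from the text)

# Decoding failure counts copied from Table 2.
failures = {
    "WSF": 9915,
    "MWSF": 9840,
    "IMWSF": 6544,
    "SMWSF": 6200,
    "MSMWSF": 4184,
}

# Failure rate in percent for each algorithm.
failure_rate = {name: 100 * count / FRAMES for name, count in failures.items()}

if __name__ == "__main__":
    for name, rate in failure_rate.items():
        print(f"{name:7s} {rate:6.2f} %")
```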
The total computation required for decoding this LDPC code with three different algorithms at 4.5 dB is shown in Table 3. From Table 3, the FFT-BP algorithm requires about six times as many real additions as the MSMWSF algorithm. Therefore, the computational requirement of the MSMWSF algorithm is much lower than that of FFT-BP, and it needs no real multiplications or divisions. The MSMWSF algorithm has the lowest complexity of the three and avoids the hardware resources and software overhead of multiplication and division operations.
Table 3. Under the condition of Code 2, Eb/N0 = 4.5 dB, decoding complexity of three algorithms.

SF Algorithm        Addition Operations    Multiplication Operations    Division Operations
WSF algorithm       205644                 0                            0
MSMWSF algorithm    92710                  0                            0
FFT-BP algorithm    567723                 744870                       66267
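As a quick check of the "about six times" statement, the ratios of addition counts can be computed from Table 3. This is a small sketch; the dictionary layout and function name are illustrative only.

```python
# (additions, multiplications, divisions) copied from Table 3.
ops = {
    "WSF": (205644, 0, 0),
    "MSMWSF": (92710, 0, 0),
    "FFT-BP": (567723, 744870, 66267),
}


def addition_ratio(numerator, denominator):
    """Ratio of real additions between two algorithms in Table 3."""
    return ops[numerator][0] / ops[denominator][0]


if __name__ == "__main__":
    print(round(addition_ratio("FFT-BP", "MSMWSF"), 2))
    print(round(addition_ratio("WSF", "MSMWSF"), 2))
```

The ratio covers additions only; FFT-BP additionally needs the multiplications and divisions listed in the table, which the symbol-flipping algorithms avoid entirely.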
CONCLUSIONS
This paper proposes a sum-of-the-magnitude hard decision decoding algorithm based on loop update detection. The algorithm combines the magnitudes of the sum information of the variable nodes adjacent to the check node and uses them as the reliability information. At the same time, the reliability information of the variable node itself is taken into account, so that a more effective flip function is obtained, which improves both the symbol flipping efficiency and the decoding performance of the algorithm. The loop update detection algorithm is introduced to improve the accuracy of symbol flipping and to further accelerate the convergence of decoding. Simulation results show that, compared with the WSF algorithm, the proposed LUDMSMWSF algorithm gains about 1.3 dB and 1.8 dB respectively, and the decoding complexity is greatly reduced. Therefore, the algorithm proposed in this paper is well suited to the 5G mobile communication system and meets the requirements on the decoding algorithm. It is a good candidate decoding algorithm for high-speed communication devices.
ACKNOWLEDGMENTS
This research work was supported by the ZTE Cooperation Forum Project (Grant No. KY10800160020), the National Natural Science Foundation of China (Grant No. 61371099), the Fundamental Research Funds for the Central Universities of China (Grant No. HEUCF150814/150812/150810), and the International Exchange Program of Harbin Engineering University for Innovation-Oriented Talents Cultivation.

CONFLICTS OF INTEREST
The authors declare no conflict of interest.
REFERENCES
1. Naga B., Li J.Y., Durga P.M. Network densification: The dominant theme for wireless evolution into 5G. IEEE Commun. Mag. 2014;52:82–89.
2. Guo L., Ning Z.L., Song Q.Y. Joint encoding and grouping multiple node pairs for physical-layer network coding with low-complexity algorithm. IEEE Trans. Veh. Technol. 2017;66:9275–9286. doi: 10.1109/TVT.2017.2696709.
3. Wang L.S., Wang Y.M., Ding Z.Z. Cell selection game for densely-deployed sensors and mobile devices in 5G networks integrating heterogeneous cells and internet of things. Sensors. 2015;15:24230–24256. doi: 10.3390/s150924230.
4. Fortuna C., Bekan A., Javornik T. Software interfaces for control, optimization and update of 5G type communication networks. Comput. Netw. 2017;129:373–383. doi: 10.1016/j.comnet.2017.06.015.
5. Gao X., Ove E., Fredrik R. Massive MIMO performance evaluation based on measured propagation data. IEEE Trans. Wirel. Commun. 2015;14:3899–3911. doi: 10.1109/TWC.2015.2414413.
6. Cheng L., Zhu H., Li G. LDPC encoder design and FPGA implementation in deep space communication; Proceedings of the International Conference on Logistics, Engineering, Management and Computer Science; Shenyang, China. 24–26 May 2014; pp. 343–346.
7. Wang C.L., Chen X.H., Li Z.W. A simplified Min-Sum decoding algorithm for non-binary LDPC codes. IEEE Trans. Commun. 2015;61:24–32. doi: 10.1109/TCOMM.2012.101712.110709.
8. Marc P.C.F., Miodrag J.M., Hideki I. Reduced complexity iterative decoding of low-density parity check node based on belief propagation. IEEE Trans. Commun. 1999;47:673–680.
9. Aslam C.A., Guan Y.L., Cai K. Edge-based dynamic scheduling for belief-propagation decoding of LDPC and RS codes. IEEE Trans. Commun. 2017;65:525–535. doi: 10.1109/TCOMM.2016.2637913.
10. Namrata P.B., Brijesh V. Design of hard and soft decision decoding algorithms of LDPC. Int. J. Comput. Appl. 2014;90:10–15.
11. Evdal A., Najeeb U.H., Michael L. Challenges and some new directions in channel coding. J. Commun. Netw. 2015;17:328–338.
12. Balasuriya N., Wavegedara C.B. Improved symbol value selection for symbol flipping-based non-binary LDPC decoding. EURASIP J. Wirel. Commun. Netw. 2017;2017:105. doi: 10.1186/s13638-017-0885-4.
13. Chen M., Hao Y.X., Qiu M.K. Mobility-Aware caching and computation offloading in 5G ultra-dense cellular network. Sensors. 2016;16:974. doi: 10.3390/s16070974.
14. Swapnil M. High-Throughput FPGA QC-LDPC Decoder Architecture for 5G Wireless. The State University of New Jersey; Rutgers, NJ, USA: 2015.
15. Kou Y., Lin S., Fossorier M. Low-density parity-check codes based on finite geometries: A rediscovery and new results. IEEE Trans. Inf. Theory. 2001;47:2711–2736. doi: 10.1109/18.959255.
16. Zhang J., Fossorier M. A modified weighted bit-flipping decoding of low-density parity-check codes. IEEE Commun. Lett. 2004;8:165–167. doi: 10.1109/LCOMM.2004.825737.
17. Jiang M., Zhao C.M., Shi Z. An improvement on the modified weighted bit flipping decoding algorithm for LDPC codes. IEEE Commun. Lett. 2005;9:814–816. doi: 10.1109/LCOMM.2005.1506712.
18. Guo R., Liu C., Wang M. Weighted symbol-flipping decoding for non-binary LDPC codes based on average probability and stopping criterion. J. Commun. 2016;37:43–52.
19. Zhang G.Y., Zhou L., Su W.W. Average magnitude based weighted bit-flipping decoding algorithm for LDPC codes. J. Electron. Inf. Technol. 2014;35:2572–2578. doi: 10.3724/SP.J.1146.2012.01728.
20. Garcia-Herrero F., Li E.B., Declercq D. Multiple-Vote Symbol-Flipping Decoder for Nonbinary LDPC Codes. IEEE Trans. Very Large Scale Integr. Syst. 2014;11:2256–2267. doi: 10.1109/TVLSI.2013.2292900.
21. Garcia-Herrero F., Declercq D., Valls J. Non-binary LDPC decoder based on symbol flipping with multiple votes. IEEE Commun. Lett. 2014;18:749–752. doi: 10.1109/LCOMM.2014.030914.132867.
22. Nhan N.Q., Ngatched T.M.N., Dobre O.A. Multiple-votes parallel symbol-flipping decoding algorithm for non-binary LDPC codes. IEEE Commun. Lett. 2015;19:905–908. doi: 10.1109/LCOMM.2015.2418260.
23. Ueng Y.L., Wang C.Y., Li M.R. An efficient combined bit-flipping and stochastic LDPC decoder using improved probability traces. IEEE Trans. Signal Process. 2017;65:5368–5380. doi: 10.1109/TSP.2017.2725221.
24. Le K., Ghaffari F., Declercq D. Efficient hardware implementation of probabilistic gradient descent bit-flipping. IEEE Trans. Circuits Syst. I Regul. Pap. 2017;64:906–917. doi: 10.1109/TCSI.2016.2633581.
25. Lu M. Research on Construction and Decoding Algorithm of Nonbinary LDPC Codes. Beijing Jiaotong University; Beijing, China: 2013.
26. Mackay D.J., Wilson S.T., Davey M.C. Comparison of constructions of irregular Gallager codes. IEEE Trans. Commun. 1999;47:1449–1454. doi: 10.1109/26.795809.
27. Garcia-Herrero F., Canet M.J., Valls J. Nonbinary LDPC Decoder Based on Simplified Enhanced Generalized Bit-Flipping Algorithm. IEEE Trans. Very Large Scale Integr. Syst. 2014;22:1455–1459. doi: 10.1109/TVLSI.2013.2276067.
28. Thi H.P. Two-Extra-Column Trellis Min-max Decoder Architecture for Nonbinary LDPC Codes. IEEE Trans. Very Large Scale Integr. Syst. 2017;25:1781–1791. doi: 10.1109/TVLSI.2017.2647985.
29. Guo F., Hanzo L. Reliability ratio based weighted bit-flipping decoding for low-density parity-check codes. Electron. Lett. 2004;40:1356–1358. doi: 10.1049/el:20046400.
30. Zhang J.Y. Simplified symbol flipping algorithms for nonbinary low-density parity-check codes. IEEE Trans. Commun. 2017;65:4128–4137. doi: 10.1109/TCOMM.2017.2719027.
31. Liu B., Tao W., Dou G.Q. Weighted symbol-flipping decoding for nonbinary LDPC codes based on a new stopping criterion. J. Electron. Inf. Technol. 2011;33:309–314. doi: 10.3724/SP.J.1146.2010.00257.
32. Sulek W. Non-binary LDPC decoders design for maximizing throughput of an FPGA implementation. Circuits Syst. Signal Process. 2016;35:4060–4080. doi: 10.1007/s00034-015-0235-x.
33. Kang J.Y., Huang Q., Zhang L. Quasi-cyclic LDPC codes: An algebraic construction. IEEE Trans. Commun. 2010;58:1383–1396. doi: 10.1109/TCOMM.2010.05.090211.
34. Jiang X.Q., Hai H., Wang H.M. Constructing large girth QC protograph LDPC codes based on PSD-PEG algorithm. IEEE Access. 2017;5:13489–13500. doi: 10.1109/ACCESS.2017.2688701.
35. Bai Z.L., Wang X.Y., Yang S.S. High-efficiency Gaussian key reconciliation in continuous variable quantum key distribution. Sci. China Phys. Mech. Astron. 2016;59:614201. doi: 10.1007/s11433-015-5702-7.
36. Chang T.C.Y., Su Y.T. Dynamic weighted bit-flipping decoding algorithm for LDPC codes. IEEE Trans. Commun. 2015;63:3950–3963. doi: 10.1109/TCOMM.2015.2469780.
37. Bazzi L., Audah H. Impact of redundant checks on the LP decoding thresholds of LDPC codes. IEEE Trans. Inf. Theory. 2015;61:2240–2255. doi: 10.1109/TIT.2015.2417522.
CHAPTER
3
Soft-Decision Low-Complexity Chase Decoders for the RS (255,239) Code
Vicente Torres 1, Javier Valls 1, Maria Jose Canet 1 and Francisco García-Herrero 2
1 Instituto de Telecomunicaciones y Aplicaciones Multimedia, Universitat Politècnica de València, 46022 Valencia, Spain
2 ARIES Research Center, Universidad Antonio de Nebrija, 28040 Madrid, Spain
ABSTRACT
In this work, we present a new architecture for soft-decision Reed–Solomon (RS) Low-Complexity Chase (LCC) decoding. The proposed architecture is scalable and can be used for a high number of test vectors. We propose a novel Multiplicity Assignment stage that sorts and stores only the location of the errors inside the symbols and the powers of α that identify the positions of the symbols in the frame. Novel schematics for the Syndrome Update and Symbol Modification blocks that are adapted to the proposed sorting stage are also presented. We also propose novel solutions for the problems that arise when a high number of test vectors is processed. We implemented three decoders: an η=4 LCC decoder and two decoders that only decode 31 and 60 test vectors of true η=5 and η=6 LCC decoders, respectively. For example, our η=4 decoder requires 29% fewer look-up tables in Virtex-V Field Programmable Gate Array (FPGA) devices than the best soft-decision RS decoder published to date, while achieving a 0.07 dB coding gain over that decoder.

Keywords: FEC; Low-Complexity Chase; Reed–Solomon; Soft-Decision Decoding

Citation: Torres, V.; Valls, J.; Canet, M.J.; García-Herrero, F. “Soft-Decision Low-Complexity Chase Decoders for the RS(255,239) Code”. Electronics 2019, 8, 10. https://doi.org/10.3390/electronics8010010
Copyright: © 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
INTRODUCTION
Reed–Solomon (RS) error-correction codes are widely used in communication and storage systems due to their capacity to correct both burst errors and random errors. These codes are being incorporated in recent 100 Gbps Ethernet standards over a four-lane backplane channel, as well as over a four-lane copper cable [1,2], and for optical fiber cables. Generally, the main decoding methods for RS codes are divided into hard-decision decoding (HDD) and algebraic soft-decision (ASD) decoding. The hard-decision RS decoder architecture commonly consists of three main computation blocks: syndrome computation, key equation solver, and error location and evaluation. In contrast, ASD algorithms require three main steps: multiplicity assignment (MA), interpolation and factorization. ASD algorithms can achieve significant coding gain at the cost of a small increase in complexity when compared with HDD.

Low-Complexity Chase (LCC) [3,4] achieves the same error correction performance with lower complexity than other algebraic soft-decision decoding algorithms [5,6,7] for Reed–Solomon codes [8], such as bit-level generalized minimum distance (BGMD) decoding [9] or Kötter–Vardy (KV) [5]. The main benefit of LCC is the use of just one level of multiplicity, which means that only the relationship between the hard-decision (HD) reliability value of the received symbols and the second best decision is required to exploit the soft information from the channel. This fact has a great impact on the number of iterations and on the global complexity of the interpolation and factorization steps [10,11,12,13,14,15,16] compared to KV and BGMD [17]. Another benefit derived from having just one level of multiplicity is that the interpolation and factorization stages can be replaced by Berlekamp–Massey decoders [18], which results in a considerable reduction in the total number of operations [19].

Recently, Peng et al. [20] showed that the computation of the symbol reliability values can be performed using bit-level magnitudes. They also presented implementation results for a soft-decision decoder that includes the Multiplicity Assignment stage (MAS). On the other hand, Lin et al. [21] proposed a decoder that reduces the complexity by relaxing the criteria for the selection of the best test vector. The resulting decoder requires less area than decoders with worse performance.

The main contributions of the present paper are as follows. We present a novel MAS based on the one proposed in [20], which sorts and stores less data than that proposal. We also propose novel implementation schematics for the Syndrome Update (SUS) and Symbol Modification (SMS) stages that are adapted to the proposed MAS. We also propose a scalable architecture for the computation of a high number of test vectors, which therefore reaches a high coding gain. We detail two architectures that use two or four Key Equation Solver (KES) blocks and follow a Gray code sequence to process the test vectors, so the complexity of the decoder is not increased. Specifically, we present three decoders for soft-decision RS decoding. The first is an η=4 LCC decoder. The other two, which we call Q5 and Q6, do not decode the complete set of $2^\eta$ test vectors of η=5 and η=6 LCC decoders, but only a subset of them. The proposed decoders give a solution for the problems created by using a high number of test vectors, since, in that case, the design of the decoder is not as simple as parallelizing resources.
We present implementation results for FPGA and CMOS ASIC devices that confirm that our proposals have lower area while they achieve higher coding gain than state-of-the-art decoders. The organization of this paper is as follows. In Section 2 and Section 3 we summarize the background concepts about RS and LCC decoding, respectively. In Section 4 we detail the architecture for the proposed decoders. The implementation results and comparisons are given in Section 5. Finally, in Section 6 we present the conclusions.
RS DECODERS
In an RS(N,K) code over $GF(2^m)$, where $N=2^m-1$, $2t$ redundant symbols are added to the K-symbol message to obtain the N-symbol codeword $C(x)$. After the codeword is transmitted over a noisy channel, the decoder receives $R(x)=C(x)+E(x)$, where $E(x)$ is the noise polynomial. The RS decoding process begins with the Syndrome Computation (SC) block. This block computes the $2t$ syndromes $S_i$ that are the coefficients of the syndrome polynomial $S(x)$. This is achieved by evaluating the received polynomial at the $2t$ roots of the generator polynomial, specifically $S_i=R(\alpha^{i+1})$ for $i\in\{0,1,\ldots,2t-1\}$, where α is the primitive element of $GF(2^m)$. The KES block obtains the error-locator polynomial $\Lambda(x)$ and the error magnitude polynomial $\Omega(x)$ by solving the key equation $\Lambda(x)\cdot S(x)=\Omega(x) \bmod x^{N-K}$. The third block is the Chien Search and Error Evaluation (CSEE). The Chien search finds the error locations by evaluating $\Lambda(x)$ at all the possible positions (i.e., $\Lambda(\alpha^{-n})$ for $n\in\{0,1,\ldots,N-1\}$), and an error evaluation method (e.g., Forney's formula) is used to calculate the error magnitude (e.g., $E_n=\Omega(\alpha^{-n})/\Lambda'(\alpha^{-n})$) when the Chien search finds an error location, which is whenever $\Lambda(\alpha^{-n})=0$. If the total number of errors in $R(x)$ does not exceed the error correcting capability $t$, all the errors in $R(x)$ are corrected by subtracting the error magnitudes from the received symbols.
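The syndrome definition $S_i=R(\alpha^{i+1})$ can be illustrated with a small software model. This is a minimal sketch, not the hardware SC block: it assumes $GF(2^8)$ is built from the primitive polynomial $x^8+x^4+x^3+x^2+1$ (0x11D), which the text does not specify, and it treats `received[n]` as the coefficient of $x^n$. All names are hypothetical.

```python
PRIM_POLY = 0x11D  # assumed primitive polynomial for GF(2^8); not given in the text


def _build_tables():
    """Exponential and logarithm tables for GF(2^8) with alpha = 2."""
    exp = [0] * 510
    log = [0] * 256
    x = 1
    for i in range(255):
        exp[i] = x
        exp[i + 255] = x      # duplicate so that exponent sums need no modulo
        log[x] = i
        x <<= 1
        if x & 0x100:
            x ^= PRIM_POLY
    return exp, log


EXP, LOG = _build_tables()


def gf_mul(a, b):
    """Multiply two field elements using the log/antilog tables."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def syndromes(received, two_t=16):
    """Return [R(alpha^1), ..., R(alpha^(2t))], with received[n] the x^n coefficient."""
    result = []
    for i in range(two_t):
        root = EXP[i + 1]
        acc = 0
        for coeff in reversed(received):   # Horner evaluation from the top degree
            acc = gf_mul(acc, root) ^ coeff
        result.append(acc)
    return result


if __name__ == "__main__":
    frame = [0] * 255          # the all-zero word is a valid codeword
    frame[10] = 0x53           # inject a single symbol error
    print(syndromes(frame))
```

For a single error of value e at position j on top of a valid codeword, each syndrome equals $e\cdot\alpha^{j(i+1)}$, which is the property the later syndrome-update stage relies on.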
LOW-COMPLEXITY CHASE DECODER
We assume that the codeword C is modulated in binary phase-shift keying (BPSK) and transmitted over an additive white Gaussian noise (AWGN) channel. The LCC algorithm uses the reliability of the received symbols in order to generate a set of test vectors that will be decoded with a HD decoder (HDD). The reliability of a symbol is derived from the a posteriori probabilities $p(C|R)$; by applying Bayes' law, the likelihood function $p(R|C)$ can be used instead. The reliability of the received symbol $r_i$ is defined as $\Gamma_i=\log[p(r_i \mid y_i^{HD})/p(r_i \mid y_i^{2HD})]$, where $y_i^{HD}$ is the symbol with the highest probability of being the transmitted symbol for the i-th received symbol and $y_i^{2HD}$ is the symbol with the second highest probability. The closer $\Gamma_i$ is to zero, the less reliable $r_i$ is, since the probabilities that $y_i^{HD}$ or $y_i^{2HD}$ is the transmitted symbol are more similar. Once $\Gamma_i$ is computed for all the received symbols, the η symbols with the smallest values of $\Gamma_i$ are selected, where η is a positive integer. The LCC decoding process creates $2^\eta$ different test vectors: all the possible combinations of choosing or not choosing $y_i^{2HD}$ instead of $y_i^{HD}$ for those η symbols. As proposed in [20], $y_i^{2HD}$ is obtained by flipping the least reliable bit of $y_i^{HD}$.
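The construction of the $2^\eta$ test vectors can be sketched in software as follows. This is a simplified model under stated assumptions: the symbol reliability is approximated by the magnitude of the least reliable bit (in the spirit of the bit-level approach attributed to [20]), bit index b is taken to have weight $2^b$, and the function name and input layout are hypothetical.

```python
from itertools import product


def lcc_test_vectors(hd_symbols, bit_llrs, eta):
    """Build all 2**eta test vectors from hard decisions and per-bit soft values.

    hd_symbols[i] is the hard-decision symbol; bit_llrs[i][b] is the soft value
    of bit b of symbol i (assumed layout: bit b has weight 2**b).
    """
    weakest = []
    for i, llrs in enumerate(bit_llrs):
        # Least reliable bit of this symbol: smallest absolute soft value.
        pos = min(range(len(llrs)), key=lambda b: abs(llrs[b]))
        weakest.append((abs(llrs[pos]), i, pos))

    # Keep the eta least reliable symbols of the frame.
    chosen = sorted(weakest)[:eta]

    vectors = []
    for mask in product((0, 1), repeat=eta):
        tv = list(hd_symbols)
        for flip, (_, i, pos) in zip(mask, chosen):
            if flip:
                tv[i] ^= 1 << pos   # second-best symbol: flip the weakest bit
        vectors.append(tv)
    return vectors


if __name__ == "__main__":
    hd = [0b101, 0b010, 0b111]
    llrs = [[0.9, -0.8, 0.7], [0.1, -2.0, 3.0], [-1.0, -0.2, 5.0]]
    for tv in lcc_test_vectors(hd, llrs, eta=2):
        print(tv)
```

The first vector returned is always the hard-decision frame itself, and every other vector differs from it only in the selected η positions.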
The Frame Error Rate performance of the RS(255,239) LCC decoder is shown in Figure 1 for η={1,2,3,4,5,6}.
Figure 1. Frame Error Rate (FER) versus Eb/No for RS(255,239) decoders.
DECODER ARCHITECTURE
In this section we present the architecture for three soft-decision RS(255,239) decoders. The first decoder we present is an η=4 LCC. The second one, which we refer to as Q5, is a quasi-η=5 LCC: it uses all the test vectors of a true η=5 LCC but one. The third one, which we refer to as Q6, is a quasi-η=6 LCC: it uses all the test vectors of a true η=6 LCC but four. They are based on a systolic KES, the enhanced parallel inversionless Berlekamp–Massey algorithm (ePiBMA) [22], that requires 2t=16 cycles for the computation of each frame, with a low critical path: one adder (T+), one multiplexer (Tx) and one multiplier (T∗). Moreover, the selected KES requires fewer resources than other popular options. If the computation times of the three main pipeline stages are equalized, one KES can be used to compute 16 test vectors, for example for an η=4 LCC decoder. For the Q5/Q6 decoders we propose the use of 2/4 KES working in parallel, which increases the decoding capability to 32/64 test vectors.

Figure 2 shows the block diagram for the proposed Q5 decoder. The decoder is based on the three classical blocks of a HDD: SC, KES and CSEE. Furthermore, more functional blocks are required to manage the additional test vectors. First, the test vectors have to be created and their relevant characteristics stored, so that the rest of the blocks can process those test vectors. A tree of comparators and multiplexers finds the least reliable bit of each symbol. The Sorting Array block, as described below, selects the η least reliable symbols of the received frame, which are sorted and stored for later use. The SC block computes the syndromes for the HD test vector, and that information is used to create the syndromes for the additional test vectors. Each KES is fed with the syndromes of a new test vector every 16 cycles. The 16 Parallel Chien Polynomial Evaluation (PCPE) blocks are used to anticipate whether those test vectors will be successfully decoded in a full CSEE block. After all those computations, the Vector Selection stage (VSS) feeds the CSEE block with the best test vector available. The test vector selection criterion is as follows: the first test vector for which the number of errors found equals the order of the error-locator polynomial is the one to be decoded; otherwise, the HD test vector is selected.
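The selection criterion stated above can be written as a few lines of code. This is a behavioural sketch only, with a hypothetical function name and input format; it does not model the pipeline timing of the VSS.

```python
def select_test_vector(results):
    """Pick the test vector whose KES output should feed the CSEE block.

    results is a list of (index, errors_found, locator_degree) tuples in
    processing order, where index 0 denotes the hard-decision (HD) vector.
    """
    for index, errors_found, locator_degree in results:
        # A vector is accepted when the number of error locations found
        # matches the order of its error-locator polynomial.
        if errors_found == locator_degree:
            return index
    return 0  # no vector qualified: fall back to the HD test vector


if __name__ == "__main__":
    print(select_test_vector([(0, 3, 5), (1, 4, 4), (3, 2, 2)]))
```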
Figure 2. Block diagram for the Q5 decoder.
In the case of η=4, one KES is enough to process all the test vectors and, therefore, the block diagram is the same as in Figure 2 but without KES2, SUS2 and PCPE2. In the case of the Q6 decoder, two more copies of each of those three blocks are required. Figure 3 shows the decoding chronogram for the Q5 decoder. As can be seen, while a KES computes a specific test vector, the corresponding SUS calculates the syndromes for the next one. At the same time, the corresponding PCPE processes the previous test vector. The decoding of a new frame can start every 256 cycles. In this decoder, KES2 must wait 16 cycles until the syndromes for its first test vector (i.e., #31) are available. KES1, on the other hand, works with test vector #0 (HD) during those cycles, since its syndromes are available. If KES2 were to compute 16 test vectors, the latency of the decoder would increase, since the decisions in VSS would be delayed. Moreover, the complexity of the decoder would also increase, because the control logic would have to consider decisions for two consecutive frames at the same time. Therefore, Q5 computes 31 out of the 32 possible test vectors of an η=5 LCC (see Figure 4a).

The test vectors that are evaluated by each KES follow a Gray code sequence. This allows the syndromes for a test vector to be easily created from the syndromes of the previous one [19]. The total number of required operations is reduced, since only one symbol changes from one test vector to the next one. It should be noted that the first test vector evaluated by each KES and the HD frame differ in just one symbol. Note that SUS2 follows the Gray sequence in reverse order, starting with test vector #31. In Q6, for the same reasons explained above, only 16+15+15+15=61 test vectors could be decoded. Nevertheless, in order to start the computation in all four KES with a test vector that differs from the HD frame in only one symbol, we compute test vector #31 simultaneously in two KES, as shown in Figure 4b. Therefore, Q6 computes 60 out of the 64 possible test vectors in η=6. For the η=4 LCC, the full 4-bit Gray sequence is decoded. As can be observed in Figure 1, the coding gain of decoders Q5 and Q6 is close to that of true η=5 and η=6 decoders, respectively.
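Two properties used in this argument can be checked with a short sketch: consecutive Gray-code test vectors differ in exactly one symbol, and the number of test vectors that fit in one frame period follows from the cycle counts given in the text. The sketch assumes that test vector #k denotes the k-th word of the standard reflected Gray code and that each bit of the word selects one of the η least reliable symbols; the function names are hypothetical, and which vectors Q5 and Q6 omit is not modelled.

```python
FRAME_CYCLES = 256  # a new frame can start every 256 cycles (from the text)
KES_CYCLES = 16     # cycles a KES spends on one test vector (2t = 16)


def gray_sequence(eta):
    """Standard reflected Gray code of length 2**eta (assumed ordering)."""
    return [i ^ (i >> 1) for i in range(1 << eta)]


def symbols_changed(a, b):
    """Number of symbols that differ between two test vectors given as bit masks."""
    return bin(a ^ b).count("1")


def vector_budget(num_kes, duplicated=0):
    """Test vectors decodable per frame: the first KES uses every slot, each
    additional KES loses one slot while it waits for its first syndromes."""
    slots = FRAME_CYCLES // KES_CYCLES
    return slots + (num_kes - 1) * (slots - 1) - duplicated


if __name__ == "__main__":
    seq = gray_sequence(5)
    print(seq[0], seq[31], symbols_changed(seq[0], seq[31]))
    print(vector_budget(1), vector_budget(2), vector_budget(4, duplicated=1))
```

With this ordering, the last word of the sequence is also one symbol away from the HD frame, which is consistent with a second KES starting from #31 and walking the sequence in reverse.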
Figure 3. Decoding chronogram for the Q5 decoder.
Figure 4. Test vectors used by the Q5 and Q6 decoders. The arrows show the processing order followed by each KES. ■ = 1, □ = 0.
In the following subsections, we describe the blocks that are different from the ones in other LCC decoders.
Multiplicity Assignment Block
The Minimum Finder block receives the soft magnitudes of the m=8 bits of a symbol and sorts them according to their absolute value [20]. For each symbol of the received frame this block outputs the hard decision value, the absolute value of the least reliable bit of the symbol and its position in the symbol (a 3-bit value). The goal of the Sorting Array block is to provide all the information required to create the additional test vectors. The information we need to create the test vectors is the position of each one of these η symbols in the frame and the location of the least reliable bit inside each of those symbols. In [20] both $y_i^{HD}$ and $y_i^{2HD}$ are sorted and stored for the η least reliable symbols. In our proposal, instead of sorting/storing 2η 8-bit values, we only sort/store η 3-bit values, which are the positions of the least reliable bits in the symbols. It is unnecessary to store $y_i^{HD}$ and $y_i^{2HD}$, since $y_i^{HD}$ is already stored in the buffer and $y_i^{2HD}$ can be obtained from $y_i^{HD}$ if the position of its least reliable bit, $pos_i$, is known, because $y_i^{HD}+y_i^{2HD}=2^{pos_i}$. Assuming that the reliability values are stored with g bits, a total of (g+22)⋅η bit registers are required in our proposal, whereas in [20] (g+48)⋅η bit registers are required. Moreover, instead of sorting/storing the positions of the symbols in the frames, for convenience reasons that are explained below, we sort/store the corresponding powers of α created by the Root Generator block.

Figure 5a shows the architecture of the Sorting Array block. The first row uses the output from the Minimum Finder block to sort the symbols according to their reliability. The other two rows of the schematic apply the decisions adopted in the first block of their column and, therefore, store the position of the least reliable bits inside their symbol and the location of the symbols inside the frame. Figure 5b,d show the implementation schematic of the basic blocks in Figure 5a. The pseudocode of the Sorting Array block is shown in Algorithm 1.
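The behaviour described above can be sketched in software. This is not the authors' Algorithm 1 and not a model of the schematic in Figure 5: it is a minimal, sequential sketch, assuming symbols arrive one at a time as (reliability, bit position, root) triples, that ties keep arrival order, and that field addition is a bitwise XOR. The function names and example values are hypothetical.

```python
def second_decision(y_hd, pos):
    """Recover the second-best symbol from the hard decision and the bit position.

    In GF(2^m) addition is XOR, so y_hd + y_2hd = 2**pos means flipping bit pos.
    """
    return y_hd ^ (1 << pos)


def sorting_array(stream, eta):
    """Keep the eta least reliable entries of the stream, sorted by reliability.

    Each entry is (reliability, bit_pos, root): the magnitude of the least
    reliable bit, its 3-bit position, and the power of alpha for the symbol.
    """
    cells = []
    for reliability, bit_pos, root in stream:
        # Find the first stored entry that is strictly more reliable.
        k = 0
        while k < len(cells) and cells[k][0] <= reliability:
            k += 1
        if k < eta:
            cells.insert(k, (reliability, bit_pos, root))  # shift the rest right
            del cells[eta:]                                 # drop the overflow
    return cells


if __name__ == "__main__":
    stream = [(5, 1, 0x02), (2, 7, 0x04), (9, 0, 0x08), (1, 3, 0x10), (2, 4, 0x20)]
    print(sorting_array(stream, eta=3))
    print(hex(second_decision(0xB0, 4)))
```

Only the 3-bit position and the root travel with each reliability value, which mirrors the storage saving argued in the text: the symbol values themselves never enter the sorter.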
Figure 5. Schematics of the Sorting Array block. Note that the first column differs from the rest, since the “Din” and “shift in” inputs do not exist.
Algorithm 1. Pseudocode for the Sorting Array block
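The behaviour described for the Sorting Array block can be illustrated in software. The following is only a minimal sketch, assuming that the block keeps the η least reliable symbols in ascending order of reliability and shifts the remaining entries when a new one is inserted. The names `Entry`, `sorting_array_insert` and `least_reliable` are hypothetical, and the tie-breaking rule is an assumption rather than something stated in the text.

```python
from typing import List, Tuple

# One entry per symbol: (reliability, bit_position, root), where
#   reliability  = absolute soft value of the least reliable bit of the symbol,
#   bit_position = 3-bit index of that bit inside the 8-bit symbol,
#   root         = power of alpha that tags the place of the symbol in the frame.
Entry = Tuple[int, int, int]


def sorting_array_insert(cells: List[Entry], new: Entry, eta: int) -> List[Entry]:
    """Insert `new` into a list kept in ascending reliability, keeping at most `eta` entries."""
    idx = 0
    # Assumption: on a tie, the entry that arrived first stays ahead.
    while idx < len(cells) and cells[idx][0] <= new[0]:
        idx += 1
    # Entries from idx onwards shift by one place; the last one drops out if the array is full.
    return (cells[:idx] + [new] + cells[idx:])[:eta]


def least_reliable(frame: List[Entry], eta: int) -> List[Entry]:
    """Feed a whole frame, one symbol at a time, and return the eta least reliable entries."""
    cells: List[Entry] = []
    for entry in frame:
        cells = sorting_array_insert(cells, entry, eta)
    return cells
```

Only the 3-bit position and the root travel with the reliability value, which mirrors the storage reduction argued for in the text.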
Syndrome Update Block The value to be added to the previous i-th syndrome, $S_i^{prev}$, to obtain the new i-th syndrome, $S_i^{new}$, is:
(1) where j is the position of the symbol in the frame and $pos_j$ is the position of its least reliable bit. In this work, we propose a novel architecture for SUS that takes advantage of Equation (1) and of the fact that powers of α, rather than the positions themselves, are stored to indicate the positions of the least reliable symbols in the frame. Figure 6a shows the schematic of this block. The root multiplexer outputs $\alpha^{-j}$ (selected from $root_1$–$root_\eta$). The pos multiplexer outputs $pos_j$ (selected from $pos_1$–$pos_\eta$). Both values change every 16 clock cycles. The shift block scales by $2^{pos_j}$ and the Reduction Modulo block performs the modular reduction with respect to the primitive polynomial of the Galois field. One syndrome is updated each clock cycle. For the first 16 clock cycles, the HD syndromes are used to compute the new syndromes. After that, the syndromes are computed from the syndromes of the previous test vector. The pseudocode of the Syndrome Update block is shown in Algorithm 2.
Figure 6. Schematics for different blocks of the proposed decoders. Algorithm 2. Pseudocode for the Syndrome Update block
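The shift-and-reduce idea behind the Syndrome Update block can be sketched in software. The code below is an illustration under stated assumptions, not the authors' Algorithm 2: the primitive polynomial `0x11D` is assumed (the text does not name it), `root` stands for whichever power of α is stored for the symbol, and the increment of the i-th syndrome is assumed to be $2^{pos}$ times the i-th power of that root. All function names are hypothetical.

```python
PRIM_POLY = 0x11D  # x^8 + x^4 + x^3 + x^2 + 1: an ASSUMED primitive polynomial of GF(2^8)


def gf_mul(a: int, b: int) -> int:
    """Multiply two GF(2^8) elements (shift-and-add with modular reduction)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= PRIM_POLY
        b >>= 1
    return result


def shift_and_reduce(value: int, pos: int) -> int:
    """Scale `value` by 2**pos: shift left one bit at a time, reducing modulo PRIM_POLY."""
    for _ in range(pos):
        value <<= 1
        if value & 0x100:
            value ^= PRIM_POLY
    return value


def compute_syndromes(symbols, roots, count=16):
    """Reference computation: S_i = sum over symbols of y * root**i, for i = 1..count."""
    syndromes = [0] * count
    for y, root in zip(symbols, roots):
        power = 1
        for i in range(count):
            power = gf_mul(power, root)        # root**(i+1)
            syndromes[i] ^= gf_mul(y, power)
    return syndromes


def update_syndromes(syndromes, root, pos):
    """Incremental update after flipping bit `pos` of the symbol tagged by `root`."""
    updated = []
    power = 1
    for s in syndromes:
        power = gf_mul(power, root)            # next power of the stored root
        updated.append(s ^ shift_and_reduce(power, pos))
    return updated
```

Under these assumptions the incremental update touches one syndrome per step and needs only the stored root and the 3-bit position, which is the property the proposed architecture exploits.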
Vector Selection Block This block selects the test vector whose KES output feeds the CSEE block. The decision depends on whether the number of errors found by the PCPE block matches the order of the error-locator polynomial. Since the latency of the PCPE block is 21 (which is greater than the latency of the KES), VSS requires that the KES output from the previous test vector is still available. Therefore, for each KES in the decoder, two sets of registers are required to store the current and the previous test vectors. On the other hand, since the decision to feed the CSEE block with HD may be delayed beyond the
moment a new frame is being received (see Figure 3), the KES output for HD also requires two sets of registers to save the current and the previous frames. VSS also outputs the identification number of the selected test vector. The schematic for the VSS of the Q5 decoder is shown in Figure 6b. For the η=4 LCC the schematic is similar, but there is neither a KES2 nor a second polynomial evaluation block. In the case of the Q6 decoder, more registers should be added for the storage of the KES3 and KES4 outputs, just as shown in Figure 6b for KES2.
Symbol Modification Block In an HDD, the corrected frame is created from the received frame and the error information. In an LCC decoder, however, the error information is related not to the received frame but to the selected test vector. Therefore, in order to create the corrected frame, it is first necessary to create the test vectors from the received frame (the HD symbols, which are stored in the buffer). The architecture we propose for this block is shown in Figure 6c. The multiplexers select the symbols that have to be changed according to the Gray code. When the output of the Root Generator matches one of the outputs of a multiplexer, the pattern required to change that symbol is obtained from the position of the bit to be changed inside the symbol. The pattern obtained in the Symbol Modification block is added to the HD symbol (from the buffer) and to the error magnitude (from CSEE) to correct the corresponding symbol (see Figure 2). The pseudocode of the Symbol Modification block is shown in Algorithm 3. Algorithm 3. Pseudocode for the Symbol Modification block
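The correction step described above can be sketched as follows. This is a hedged illustration rather than the authors' Algorithm 3: it assumes that the selected test vector is identified by a bit mask in which bit s marks the s-th least reliable symbol as flipped, and that each of those symbols is tagged by the power of α stored for it. The names `gray_code`, `flip_patterns` and `correct_symbol` are hypothetical.

```python
from typing import Dict, List


def gray_code(k: int) -> int:
    """k-th word of the binary-reflected Gray code (consecutive words differ in one bit)."""
    return k ^ (k >> 1)


def flip_patterns(vector_mask: int, roots: List[int], positions: List[int]) -> Dict[int, int]:
    """Map each stored root to the bit pattern that flips the least reliable bit of its symbol.

    Assumption: bit s of `vector_mask` set means that the s-th least reliable
    symbol is flipped in the selected test vector.
    """
    return {
        roots[s]: 1 << positions[s]
        for s in range(len(roots))
        if (vector_mask >> s) & 1
    }


def correct_symbol(hd_symbol: int, root: int, patterns: Dict[int, int], error_magnitude: int) -> int:
    """Corrected symbol = HD symbol + flip pattern + error magnitude (GF(2^8) addition is XOR)."""
    return hd_symbol ^ patterns.get(root, 0) ^ error_magnitude
```

Walking the test vectors in Gray-code order means that consecutive vectors differ in a single symbol, so only one flip pattern changes from one test vector to the next.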
IMPLEMENTATION RESULTS The proposed architectures for the η=4 LCC, Q5 and Q6 decoders were implemented on an eight-metal layer 90 nm CMOS standard-cell technology with Cadence software and also in a Xilinx Virtex-V and Virtex-7 ultrascale
FPGA devices with ISE and Vivado software, respectively. The chip layout of the proposed η=4 LCC decoder in ASIC is shown in Figure 7.
Figure 7. Chip layout of the proposed η=4 LCC decoder.
In Figure 8, we compare the gate count (#XORs) and coding gain of the proposed decoders with the results from state-of-the-art soft-decision RS(255,239) decoders, specifically η={3,4,5} LCC based on HDD (ZHA) [19], Factorization-Free decoder (FFD) [23] and Interpolation-Based decoder (IBD) [24]. As can be seen, our decoders achieve a better ratio of coding gain to area than the other decoders.
Figure 8. Coding gain at FER = $10^{-6}$ versus implementation cost (#XORs) for ASIC devices. Note: data from the decoders labeled as ZHA [19], FFD [23] and IBD [24] are estimations and do not include the multiplicity assignment stage.
Table 1 shows the detailed gate count in ASIC (given in number of XORs) of the different blocks of the three proposed decoders.
Table 1. Gate count (#XORs).
| Block | Q6 | Q5 | η=4 |
|---|---|---|---|
| Root Generator | 55 | 54 | 55 |
| Minimum Finder | 597 | 596 | 596 |
| Sorting Array | 619 | 506 | 407 |
| Syndrome Update | 3373 | 1681 | 830 |
| Vector Selection | 3911 | 2387 | 1626 |
| Symbol Modification | 304 | 269 | 219 |
| Syndrome Computer | 1538 | 1538 | 1538 |
| KES | 15,717 | 7906 | 3937 |
| Polynomial Evaluation | 9054 | 4493 | 2242 |
| CSEE | 1664 | 1665 | 1665 |
| Others | 2480 | 1852 | 1528 |
| Decoder (without BUFFER) | 39,312 | 22,947 | 14,643 |
| BUFFER | 12,289 | 12,281 | 12,282 |
| Total gate count (#XOR) | 51,601 | 35,228 | 26,925 |
Table 2 and Table 3 compare, for the same RS code, our proposals and state-of-the-art published decoders, for ASIC and FPGA devices, respectively. On the one hand, Table 2 compares our decoders with [21], which is, to the best of our knowledge, the only decoder that provides complete implementation results in ASIC, and also with [20]. On the other hand, Table 3 compares our decoders with [20], which is, to the best of our knowledge, the only decoder that provides complete implementation results in a Virtex-V FPGA device. As can be seen in Table 2, our η=4 LCC decoder requires 41% fewer gates in ASIC than [21] (26.9 versus 45.3 kXOR, buffer included), while improving the coding gain at FER = $10^{-6}$ by 0.07 dB. Furthermore, our Q5 decoder has a gate count similar to that of [21], but has a 0.14 dB advantage in coding gain at FER = $10^{-6}$. The comparison with [20] is less straightforward, because of the differences in the technology used and the lack of a gate count. Nevertheless, the reduction in area is clear when the same Virtex-V FPGA device is used. As shown in Table 3, in this case our η=4 LCC decoder reduces the LUT count of [20] by about 29%. Moreover, Ref. [20] has
lower coding gain, since this is a η=3 LCC decoder. Our decoder η=4 LCC has similar area to the η=3 LCC decoder in [19], but it should be noted that [19] does not include the multiplicity assignment stage.
Table 2. Implementation results of RS decoders in CMOS ASICs.
| RS(255,239) | Ours Q6 | Ours Q5 | Ours η=4 LCC | η=5 DCD [21] | η=3 LCC [20] |
|---|---|---|---|---|---|
| Process (nm) @ Supply Volt. (V) | [email protected] | [email protected] | [email protected] | [email protected] | 130@- |
| Chip area (mm²) / # Metal layers | 0.632/8 | 0.435/8 | 0.336/8 | 0.216/9 | 0.332/- |
| Gate ct. (kXOR) no/with buffer | 39.3/51.6 | 22.9/35.2 | 14.6/26.9 | 22.5/45.3 | -/- |
| Frequency (MHz) | 446 * | 450 * | 450 * | 320 † | 220 |
| Throughput (Gb/s) | 3.55 | 3.58 | 3.58 | 2.56 | 1.6 |
| Latency (clock cycles) | 256 × 2 + 34 | 256 × 2 + 34 | 256 × 2 + 34 | 259 × 3 | 275 × 2 § |
| Power consumpt. (mW@MHz) | 62.2@446 ‡ | 32.1@450 ‡ | 28.8@450 ‡ | 19.6@320 † | - |
| Coding gain (dB@FER) | 0.60@$10^{-6}$ | 0.52@$10^{-6}$ | 0.45@$10^{-6}$ | 0.38@$10^{-6}$ | 0.37@$10^{-6}$ |
| Critical path | T∗ + T+ + Tx | T∗ + T+ + Tx | T∗ + T+ + Tx | 2T∗ + 2T+ + Tx | T∗ + T+ + Tx |
* Post-layout result. † Measurement. ‡ Estimated. § Does not include latency from MAS, SMS and Chien-Forney stages.
Table 3. Implementation results of RS decoders in a Virtex-V XC5vlx50t-3 FPGA device.
In regard to latency and throughput results, as can be seen in Table 2 and Table 3, our decoders reach 450 MHz thanks to the short critical path, which is T∗ + T+ + Tx. The throughput of our η=4 LCC and Q5 decoders in ASICs is $255 \times 8 \times 450 \times 10^{6}/256 = 3.58$ Gb/s. Since the decoder from [21] has a longer critical path and a higher computational latency than our decoders (259 versus 256 cycles), the potential throughput that it can achieve is slightly lower than ours. On the other hand, since the decoders from [19,20] have the same critical path as ours but a slightly higher computational latency (275 versus 256 cycles), our throughput is expected to be slightly higher: 1.3 Gb/s versus 1.0 Gb/s [20] for the same FPGA device, as can be seen in Table 3. Additionally, the proposed decoders have a latency of 546 cycles, whereas the DCD [21] requires 777 cycles and the decoders from [19,20] require 550 clock cycles plus the pipeline. Table 2 also shows chip area and power consumption details for the proposed decoders. Our consumption data are obtained with the Static Power Analysis tool of the Encounter software from Cadence. It should be noted that the proposed decoders are implemented with different standard-cell libraries, number of metal layers and supply voltage compared to [21]. For comparison purposes, we also implemented an η=4 LCC decoder optimized for area and working at 320 MHz (the same clock frequency as the decoder in [21]). For that implementation, the chip area is 0.268 mm² and the estimated power consumption is 21.3 mW, which are similar to those of the decoder in [21]. More up-to-date FPGA implementation results are shown in Table 4. As can be seen, in this technology our decoders reach 2.5 Gb/s.
Table 4. Implementation results of RS decoders in a Virtex-7 FPGA device.
| RS(255,239) | Ours Q6 | Ours Q5 | Ours η=4 LCC |
|---|---|---|---|
| LUTs | 13,343 | 7372 | 4227 |
| Registers | 7223 | 4480 | 3083 |
| Frequency (MHz) | 312.5 | 312.5 | 312.5 |
| Throughput (Gb/s) | 2.5 | 2.5 | 2.5 |
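The throughput and latency figures quoted in this section follow from simple arithmetic on the frame size, the clock frequency and the number of cycles per frame. The short sketch below reproduces that arithmetic; the function names are hypothetical and the numbers are the ones given in the text and in Tables 2 and 4.

```python
def throughput_gbps(n_symbols: int, bits_per_symbol: int, f_mhz: float, cycles_per_frame: int) -> float:
    """Decoded bits per second, in Gb/s, when one frame is accepted every `cycles_per_frame` cycles."""
    return n_symbols * bits_per_symbol * f_mhz * 1e6 / cycles_per_frame / 1e9


def latency_cycles(cycles_per_stage: int, stages: int, extra: int = 0) -> int:
    """Total latency as a number of pipeline stages of equal length plus a fixed overhead."""
    return cycles_per_stage * stages + extra


# RS(255,239) over GF(2^8): 255 symbols of 8 bits, one frame every 256 cycles.
asic_450 = throughput_gbps(255, 8, 450, 256)      # eta=4 LCC and Q5 in ASIC
asic_446 = throughput_gbps(255, 8, 446, 256)      # Q6 in ASIC
virtex7 = throughput_gbps(255, 8, 312.5, 256)     # Virtex-7 implementations

ours = latency_cycles(256, 2, 34)                 # proposed decoders
dcd = latency_cycles(259, 3)                      # DCD decoder of [21]
lcc3 = latency_cycles(275, 2)                     # decoders of [19,20], pipeline not included
```

The 450 MHz case evaluates to just under 3.59 Gb/s, which the text quotes as 3.58 Gb/s.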
It should be noted that in the comparison with state-of-the-art decoders, the coding gain performance of other decoders [19,20,21] is that of an η=3 LCC decoder. The decoders we propose are the first ones that use 16 or more test vectors with their full-decoding capabilities. Zhang et al. [19] give estimation results for η=4 and η=5 using their architecture: 19,594 and 32,950 XOR gates, respectively. The hardware requirement of the decoder
is reduced to 14,643/19,594 = 75% and 22,947/32,950 = 70%, respectively. On the one hand, it should be noted that their estimation includes neither the Multiplicity Assignment nor the Symbol Modification stage. On the other hand, these authors estimate the cost of the η=5 LCC decoder assuming that the design of this decoder only requires the parallelization of specific resources (i.e., syndrome update, Key Equation Solver and polynomial evaluation), but, in the case of η=5 and η=6, the use of a Gray code sequence in the decoding process is not straightforward. We propose a solution for this issue in the present work. Moreover, when a decoder has to process 32 or 64 test vectors by using 2 or 4 processing units in parallel, respectively, some of these units would have to process their last test vector while the processing of the next frame has already started (see Figure 3). Processing those last test vectors would increase the latency of the decoder and would require a considerable number of registers to process data from two frames concurrently. In the present work we propose a solution for this problem that still profits from the use of a Gray code processing sequence.
CONCLUSIONS In this work, we present three soft-decision Reed–Solomon LCC decoders for η=4, quasi-η=5 and quasi-η=6 that are based on HD decoding. The coding gains of the proposed decoders over hard-decision decoding are 0.45, 0.52 and 0.60 dB at FER = $10^{-6}$, which are higher than those of previously published LCC decoders. The proposed architecture is easily scalable and is based on a simplification of the Multiplicity Assignment stage. We also present detailed implementation schematics for those computational blocks that differ from conventional implementations. We propose novel solutions for decoders that use a high number of test vectors, whose problems go beyond the simple parallelization of resources. We present implementation results in ASIC and FPGA devices for the three decoders. The results show, for example, that our η=4 decoder, which has a 0.07 dB higher coding gain than the best η=3 decoder published to date, requires 41% less area in ASIC and 29% fewer LUTs in FPGA than the η=3 decoders used for comparison. These results are achieved without degrading either the throughput or the latency of the decoder.
REFERENCES
1. Cideciyan, R.D.; Gustlin, M.; Li, M.P.; Wang, J.; Wang, Z. Next Generation Backplane and Copper Cable Challenges. IEEE Commun. Mag. 2013, 51, 130–136.
2. Perrone, G.; Valls, J.; Torres, V.; García-Herrero, F. Reed–Solomon Decoder Based on a Modified ePIBMA for Low-Latency 100 Gbps Communication Systems. Circuits Syst. Signal Process. 2018.
3. Bellorado, J. Low-Complexity Soft Decoding Algorithms for Reed–Solomon Codes. Ph.D. Thesis, Harvard University, Cambridge, MA, USA, 2006.
4. Chase, D. Class of Algorithms for Decoding Block Codes with Channel Measurement Information. IEEE Trans. Inf. Theory 1972, 18, 170–182.
5. Koetter, R.; Vardy, A. Algebraic Soft-Decision Decoding of Reed–Solomon Codes. IEEE Trans. Inf. Theory 2003, 49, 2809–2825.
6. Sudan, M. Decoding of Reed–Solomon Codes beyond the Error-Correction Bound. J. Complex. 1997, 13, 180–193.
7. Guruswami, V.; Sudan, M. Improved Decoding of Reed–Solomon and Algebraic-Geometry Codes. IEEE Trans. Inf. Theory 1999, 45, 1757–1767.
8. Blahut, R.E. Theory and Practice of Error Control Codes; Addison-Wesley: Reading, MA, USA, 1983.
9. Jiang, J.; Narayanan, K.R. Algebraic Soft-Decision Decoding of Reed–Solomon Codes Using Bit-Level Soft Information. IEEE Trans. Inf. Theory 2008, 54, 3907–3928.
10. Gross, W.J.; Kschischang, F.R.; Koetter, R.; Gulak, R.G. A VLSI Architecture for Interpolation in Soft-Decision List Decoding of Reed–Solomon Codes. In Proceedings of the IEEE Workshop on Signal Processing Systems, San Diego, CA, USA, 16–18 October 2002; pp. 39–44.
11. Zhu, J.; Zhang, X.; Wang, Z. Backward Interpolation Architecture for Algebraic Soft-Decision Reed–Solomon Decoding. IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 2009, 17, 1602–1615.
12. Zhu, J.; Zhang, X. Efficient VLSI Architecture for Soft-Decision Decoding of Reed–Solomon Codes. IEEE Trans. Circuits Syst. I Regul. Pap. 2008, 55, 3050–3062.
13. Wang, Z.; Ma, J. High-Speed Interpolation Architecture for Soft-Decision Decoding of Reed–Solomon Codes. IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 2006, 14, 937–950.
14. Zhang, X. Reduced Complexity Interpolation Architecture for Soft-Decision Reed–Solomon Decoding. IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 2006, 14, 1156–1161.
15. Zhang, X.; Parhi, K.K. Fast Factorization Architecture in Soft-Decision Reed–Solomon Decoding. IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 2005, 13, 413–426.
16. Zhang, X.; Zhu, J. Hardware Complexities of Algebraic Soft-Decision Reed–Solomon Decoders and Comparisons. In Proceedings of the 2010 Information Theory and Applications Workshop (ITA), San Diego, CA, USA, 31 January–5 February 2010; pp. 1–10.
17. Bellorado, J.; Kavcic, A. Low-Complexity Soft-Decoding Algorithms for Reed–Solomon Codes—Part I: An Algebraic Soft-In Hard-Out Chase Decoder. IEEE Trans. Inf. Theory 2010, 56, 945–959.
18. García-Herrero, F.; Valls, J.; Meher, P.K. High-Speed RS(255,239) Decoder Based on LCC Decoding. Circuits Syst. Signal Process. 2011, 30, 1643–1669.
19. Zhang, W.; Wang, H.; Pan, B. Reduced-Complexity LCC Reed–Solomon Decoder Based on Unified Syndrome Computation. IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 2013, 21, 974–978.
20. Peng, X.; Zhang, W.; Ji, W.; Liang, Z.; Liu, Y. Reduced-Complexity Multiplicity Assignment Algorithm and Architecture for Low-Complexity Chase Decoder of Reed–Solomon Codes. IEEE Commun. Lett. 2015, 19, 1865–1868.
21. Lin, Y.M.; Hsu, C.H.; Chang, H.C.; Lee, C.Y. A 2.56 Gb/s Soft RS(255,239) Decoder Chip for Optical Communication Systems. IEEE Trans. Circuits Syst. I Regul. Pap. 2014, 61, 2110–2118.
22. Wu, Y. New Scalable Decoder Architectures for Reed–Solomon Codes. IEEE Trans. Commun. 2015, 63, 2741–2761.
23. Zhu, J.; Zhang, X. Factorization-Free Low-complexity Chase Soft-Decision Decoding of Reed–Solomon Codes. In Proceedings of the 2009 IEEE International Symposium on Circuits and Systems, Taipei, Taiwan, 24–27 May 2009; pp. 2677–2680.
24. Garcia-Herrero, F.; Canet, M.J.; Valls, J.; Meher, P.K. High-Throughput Interpolator Architecture for Low-Complexity Chase Decoding of RS Codes. IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 2012, 20, 568–573.
CHAPTER 4
Low-energy Error Correction of NAND Flash Memory through Soft-decision Decoding
Jonghong Kim and Wonyong Sung Department of Electrical Engineering and Computer Science, Seoul National University, Gwanak-gu, Seoul 151-744 Korea
ABSTRACT The raw bit error rate of NAND Flash memory increases as the semiconductor geometry shrinks for high density, which makes it necessary to employ a very strong error correction circuit. Soft-decision-based error correction algorithms, such as those used with low-density parity-check (LDPC) codes, can enhance the error correction capability without increasing the number of parity bits. However, soft-decision error correction schemes need multiple-precision data, which increases the energy consumption in NAND Flash memory because of the additional sensing operations and the larger amount of data output. We examine the energy consumption of a NAND Flash memory system with an LDPC code-based soft-decision error correction algorithm. The energy consumed by the multiple-precision NAND Flash memory as well as by the LDPC decoder is considered. The output precision employed is 1.0, 1.4, 1.7, and 2.0 bits per data. In addition, we propose an LDPC decoder-assisted precision selection method that needs virtually no overhead. The experiment was conducted with 32-nm 128-Gbit 2-bit multi-level cell NAND Flash memory and a 65-nm LDPC decoding VLSI.
Citation: Kim, J., Sung, W. “Low-energy error correction of NAND Flash memory through soft-decision decoding”. EURASIP J. Adv. Signal Process. 2012, 195 (2012). https://doi.org/10.1186/1687-6180-2012-195 Copyright: © 2012 Kim and Sung; licensee Springer. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
INTRODUCTION NAND Flash memory is widely used in handheld devices and notebook PCs because of its high density and low power consumption. As the semiconductor geometry shrinks, the error performance of NAND Flash memory worsens, so its reliability must be increased by using memory signal processing and forward-error correction (FEC) methods. Among various FEC codes, Bose-Chaudhuri-Hocquenghem (BCH) and Reed-Solomon (RS) codes have been widely used for NAND Flash error correction [1–3]. However, because of the severe performance degradation of recent NAND Flash memory devices, more advanced FEC codes are needed. Low-density parity-check (LDPC) codes [4] show excellent error-correcting performance close to the Shannon limit when decoded with the belief-propagation (BP) algorithm [5] using soft-decision information. LDPC codes have been successfully applied to many communication systems such as DVB-S2 [6], IEEE 802.3an [7], and IEEE 802.16e [8]. However, despite the good characteristics of LDPC codes, their application to NAND Flash memory is not straightforward, because multiple-precision output data are needed to exploit the advantages of LDPC algorithms, which show high performance with soft-decision decoding. Moreover, multiple sensing operations and the delivery of multiple-precision data also increase the energy consumption of NAND Flash memory. In this article, we analyze the energy consumption of a NAND Flash memory error correction system that adopts LDPC soft-decision decoding. The energy consumption of both the NAND Flash memory and the LDPC decoder is considered. A VLSI circuit-based decoder for a rate-0.96 (68254, 65536) LDPC code is used for error performance and energy estimation. In particular, the effect on energy consumption of increasing the precision of NAND Flash memory is analyzed. The LDPC decoder tends to consume more energy when the precision of the NAND Flash memory output is very low, such as 1.0 bit per data; however, increasing the precision also demands more energy in NAND Flash memory for sensing and data transfer. As a result, the optimum precision is closely related to the signal quality of NAND Flash memory. We analyze this relation quantitatively, and also propose a method that can find the optimum precision using the iteration count of an LDPC decoder. The rest of this article is organized as follows. The “Energy consumption of multi-bit data read in NAND Flash memory” section explains the read operation of NAND Flash memory and its energy consumption. In the “Soft-decision error correcting performance in NAND Flash memory” section, the performance of LDPC decoding with multi-precision output data in NAND Flash memory is presented. The “Hardware performance of (68254, 65536) LDPC decoder” section describes the energy consumption of a rate-0.96 (68254, 65536) LDPC decoder in a 65-nm technology. In the “Low-energy error correction scheme for NAND Flash memory” section, we analyze the total energy consumption of NAND Flash memory with LDPC code-based soft-decision decoding and also propose an LDPC decoder-assisted precision selection method. Finally, the article ends with a conclusion section.
ENERGY CONSUMPTION OF MULTI-BIT DATA READ IN NAND FLASH MEMORY NAND Flash memory overview A NAND Flash memory device contains thousands of cell blocks that can independently be erased. Each cell block consists of rows and columns of cells. The cells in the same row are controlled by the same word-line, and can be read or programmed simultaneously. The number of columns determines the page size, and the typical page size of the current generation of NAND Flash memory is 64 kbits (8 kbytes) besides the parity data. Each Flash memory cell is a floating gate NMOS transistor, in which the gate stores charges to control the threshold voltage of the transistor. Because of the process variation, program inaccuracy, charge leakage, and noise, the threshold voltage of NAND Flash memory has a Gaussian-like distribution. Today’s NAND Flash memory adopts the multi-level cell (MLC) technology
that has more than one bit per memory cell to increase the density. The organization of a 128-Gbit NAND Flash memory device with 2-bit MLC technology is shown in Table 1 [9].
Table 1. The features of 34-nm 2-bit MLC NAND Flash memory [9]
| Capacity | MLC tech. | Device size | Block size | Page size |
|---|---|---|---|---|
| 128 Gbits | 2 bits/cell | 8,192 blocks | 256 pages | 8,192 + 448 bytes |
Voltage sensing scheme for multi-precision output In 2-bit MLC NAND Flash memory, each memory cell has one of four different threshold voltages that have Gaussian-like distributions, as illustrated in Figure 1. The left-most distribution is the erased state (symbol 11), while the remaining distributions correspond to three different programmed states (symbols 01, 00, and 10, respectively).
Figure 1. Voltage sensing schemes for soft-decision data output.
In conventional Flash memory with hard-decision data output, three sensing reference voltages (SRVs), namely $V_{r.1}$, $V_{r.2}$, and $V_{r.3}$, are needed to fully resolve the four threshold voltage distributions. Note that $V_{r.1}$ resolves the boundary between symbols 11 and 01, $V_{r.2}$ the boundary between symbols 01 and 00, and $V_{r.3}$ the boundary between symbols 00 and 10. Since a pair of LSB and MSB pages is mapped onto a word-line and the bits are Gray coded, $V_{r.1}$ and $V_{r.3}$ are required to read MSB pages, while only $V_{r.2}$ is needed for LSB pages. The LSB sensing operation (SO) with $V_{r.2}$ is referred to as $SO_1$, and the MSB sensing operation with $V_{r.1}$ and $V_{r.3}$ is denoted by $SO_2$. For soft-decision error correction, each page should be sensed with an increased number of reference voltages. In particular, it is necessary to increase
the resolution in the overlapping regions, where most bit errors occur, as shown in Figure 1. The simplest form of multi-bit sensing is to provide an erasure region at each symbol boundary. In this case, we need six SRVs and can distinguish seven different threshold-voltage regions. The lowest voltage region can be considered a strong 11 symbol, and the next lowest region is a value between 11 and 01. Figure 1 shows four different sensing schemes, including the conventional sensing for hard-decision data output. Increasing the number of sensing operations at each symbol boundary provides more accurate reliability information, but it also increases the latency and energy consumption in NAND Flash memory. Let the number of SRVs be $N_s$; the sensed threshold voltage then belongs to one of $N_s + 1$ regions, and $N_b = \log_2(N_s + 1)$ bits are needed to represent the threshold voltage. Hence, each bit of a page is represented by $N_b/2$ bits for 2-bit MLC NAND Flash memory. The memory sensing operations with 3, 6, 9, and 15 SRVs yield 1-, 1.4- ($= 0.5 \times \log_2 7$), 1.7- ($= 0.5 \times \log_2 10$), and 2- ($= 0.5 \times \log_2 16$) bit soft-decision data, respectively. For example, in the 2-bit soft-decision memory sensing scheme, there are $N_s = 15$ SRVs, and 4 bits are enough to represent the 16 threshold levels for both LSB and MSB data. Since conventional NAND Flash memory devices do not provide multi-precision data output, obtaining the soft-decision data from conventional memory requires multiple hard-decision sensing and data output operations. Note that conventional NAND Flash memory devices provide command sequences that can change the SRVs. Figure 2 illustrates the voltage sensing scheme for 1.7-bit soft-decision data output with conventional hard-decision Flash memory, where the $V_{r.i}$ are the SRVs for 1 ≤ i ≤ 9. With hard-decision sensing $SO_1$ using $V_{r.5}$ and $SO_2$ using $V_{r.4}$ and $V_{r.6}$ around the overlapping region R2, an LSB page is read with four levels, as shown in Figure 2a.
In this case, two data output operations are performed. Meanwhile, because an MSB page has two overlapping regions, R1 and R3, three $SO_2$ operations using the $V_{r.i}$ with i ∈ {1,2,3,7,8,9} are needed. In addition, one $SO_1$ using $V_{r.5}$ is performed to distinguish the region below $V_{r.1}$ from that above $V_{r.9}$, as illustrated in Figure 2b. As a result, reading an MSB page with eight levels requires one $SO_1$ and three $SO_2$ operations, which results in four times as many data output operations as in the conventional hard-decision mode. Finally, Table 2 summarizes the number of sensing operations for the 1-bit hard-decision and the 1.4-, 1.7-, and 2-bit soft-decision data output. Note that the sensing results are mapped to log-likelihood ratio (LLR) values by using a look-up table in the Flash memory controller.
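The relation between the number of SRVs and the soft-decision precision can be checked with a few lines of code. The sketch below simply evaluates $N_b = \log_2(N_s + 1)$ and $N_b/2$ for the four sensing schemes discussed above; the function name `soft_bits` is hypothetical.

```python
import math


def soft_bits(n_srv: int):
    """Return (N_b, bits per page bit) for a 2-bit MLC cell sensed with `n_srv` reference voltages."""
    n_b = math.log2(n_srv + 1)   # bits needed to label the n_srv + 1 threshold-voltage regions
    return n_b, n_b / 2          # each cell stores one LSB-page bit and one MSB-page bit


# Sensing schemes considered in the text: 3, 6, 9 and 15 SRVs.
precision = {n: round(soft_bits(n)[1], 1) for n in (3, 6, 9, 15)}
```

Evaluated for 3, 6, 9 and 15 SRVs, `precision` gives the 1.0-, 1.4-, 1.7- and 2.0-bit labels used throughout the chapter.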
Figure 2. Voltage sensing scheme of 1.7-bit soft-decision data output for (a) LSB and (b) MSB pages.
Table 2. The number of sensing and data output (DO) operations for hard- and soft-decision sensing with conventional NAND Flash memory
| Precision | LSB pages: SO1 | LSB pages: SO2 | LSB pages: DO | MSB pages: SO1 | MSB pages: SO2 | MSB pages: DO |
|---|---|---|---|---|---|---|
| 1.0-bit | 1 | 0 | 1 | 0 | 1 | 1 |
| 1.4-bit | 0 | 1 | 1 | 1 | 2 | 3 |
| 1.7-bit | 1 | 1 | 2 | 1 | 3 | 4 |
| 2.0-bit | 1 | 2 | 3 | 1 | 5 | 6 |
LSB and MSB concurrent access scheme for low-energy soft-decision data output As explained, the soft-decision scheme with conventional memory demands multiple hard-decision sensing and data transfer operations to increase the resolution in the overlapping region. Moreover, an additional LSB sensing operation is needed to access an MSB page, as shown in Figure 2. This scheme incurs a large number of data output operations when high-precision data are needed. In order to reduce the energy consumption of soft-decision data output, we consider a method that senses the LSB and MSB data simultaneously with multiple SRVs. In this scheme, $N_s$ memory sensing operations are performed for a row of transistors in the NAND Flash array, and all the sensing results are stored in the page register using $\lceil N_b \rceil$ bits, where $N_b = \log_2(N_s + 1)$. Assuming that up to 2-bit precision is used for each data bit, $N_b = 4$ bits are needed to represent
all soft-decision sensing results. Of course, this scheme needs more hardware, namely $4 \times N_{pagebits}$ data registers to store the soft-decision sensing results, as shown in Figure 3, whereas the conventional NAND Flash memory has only $N_{pagebits}$ data registers, where $N_{pagebits}$ is the number of bits in each page.
Figure 3. NAND Flash memory with internal multi-precision data composition.
When compared to soft-decision sensing using conventional NAND Flash memory, this concurrent access scheme greatly reduces the number of data transfer operations, to only $\lceil N_b \rceil$ bits per cell for both LSB and MSB data, because the data are composed within the memory device. Thus, this method reduces not only the data output latency, but also the energy consumption for off-chip data transfer. Therefore, we only consider the LSB and MSB concurrent access scheme in this article.
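The saving in data transfers can be made concrete by counting page-sized output operations per word-line for the two schemes. The sketch below is an illustration under one assumption: each data output operation of Table 2 transfers one page, so the concurrent scheme is counted as $\lceil N_b \rceil$ page-sized transfers. The names are hypothetical and the conventional counts are copied from Table 2.

```python
import math

# (LSB DO, MSB DO) for the conventional scheme, keyed by the number of SRVs (Table 2).
CONVENTIONAL_DO = {3: (1, 1), 6: (1, 3), 9: (2, 4), 15: (3, 6)}


def page_transfers(n_srv: int):
    """Return (conventional, concurrent) page-sized transfers needed to read one word-line."""
    conventional = sum(CONVENTIONAL_DO[n_srv])            # LSB page plus MSB page
    concurrent = math.ceil(math.log2(n_srv + 1))          # ceil(N_b) bits per cell
    return conventional, concurrent


comparison = {n: page_transfers(n) for n in (3, 6, 9, 15)}
```

With 15 SRVs the conventional scheme needs nine page transfers per word-line against four for the concurrent scheme, while the two schemes coincide for hard-decision reading.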
Energy consumption of read operations in NAND Flash memory The read operation of NAND Flash memory involves address decoding, NAND Flash array access, and data output. Conventional NAND Flash memory supports various types of read operations such as read page and read page cache, where the read page mode accesses only one page, while the read page cache mode reads the next sequential pages in a block consecutively. The timing diagram of the read page mode is illustrated in Figure 4, where $t_{clk}$, $t_R$, and $t_{rc}$ denote the clock period, the NAND Flash array access time per voltage sensing operation, and the read cycle time, respectively. The array access time, $t_R$, includes the threshold voltage sensing operation time as well as the data transfer time from the NAND Flash array to either the data or the cache register.
Figure 4. Timing diagram of read page mode.
In this section, we analyze the energy consumption of reading 2-bit MLC NAND Flash memory. We estimate the energy consumption based on the electrical specifications listed in the data book from Micron Technology [9]. We model the energy consumption of reading NAND Flash memory as the sum of the energy for array access ($E_{ac}$) and that for data output ($E_{do}$), where (1) (2) Note that we consider only the active energy and ignore the idle energy. $V_{cc}$ and $V_{ccq}$ are the core and the I/O supply voltages, while $I_{cc}$ and $I_{io}$ represent the core and the I/O supply currents, respectively. Finally, the data output time is represented by $t_{do}$, which is determined by the number of bytes to output and the period of the data output clock; as a result, $t_{do} = t_{rc} \times \lceil N_b \rceil \times N_{pagebits}/8$.
Since the read operation is performed simultaneously for both LSB and MSB data, the energy consumption is apportioned between the LSB and MSB pages as follows. Let $E_{LSB}$ and $E_{MSB}$ be the read energy for an LSB page and an MSB page, respectively. In 2-bit MLC, reading an MSB page uses twice as many SRVs as reading an LSB page; hence the energy consumption of the array access operations for an LSB page and an MSB page can be modeled as $E_{ac}/3$ and $2E_{ac}/3$, respectively. Because two pages of data are delivered simultaneously in the LSB and MSB concurrent access scheme, the data output energy of each page is modeled as $E_{do}/2$. Therefore, the energy consumption of each page can be represented as follows:
$E_{LSB} = E_{ac}/3 + E_{do}/2$ (3)
$E_{MSB} = 2E_{ac}/3 + E_{do}/2$ (4)
Table 3 shows the voltage, current, and timing parameters noted in the 34-nm 2-bit MLC NAND Flash data book from Micron Technology [9].

Table 3. The voltage, current, and timing parameters of 2-bit MLC NAND Flash memory

| Parameter | Asynchronous | Synchronous | Unit |
|---|---|---|---|
| tclk | 20, 25, 30, 35, 50, 100 | 10, 12, 15, 20, 30, 50 | ns |
| Vcc | 3.3 | 3.3 | V |
| Vccq | 1.8, 3.3 | 1.8, 3.3 | V |
| Icc | 25 | 25 | mA |
| Iio | 8 | 20 | mA |
| tad | 150–450 | 168–288 | ns |
| tR | 12.5 | 12.5 | μs/sensing |
| trc | tclk | 0.5 × tclk | ns |
Table 4 shows the estimated energy consumption and the latency of the read operation for different output precision cases. Since the data output operation takes a long time due to the limited number of I/O ports, the operating condition that needs the smallest trc in the synchronous mode shows the minimum energy consumption. In this simulation, NAND Flash memory that operates at 100 MHz and Vccq of 1.8 V in the synchronous mode consumes the minimum read energy. Since the energy consumption of the read page mode is nearly the same as that of the read page cache mode, we only consider the read page mode under the above operating condition (tclk = 10 ns, Vccq = 1.8 V, and synchronous mode).

Table 4. The energy consumption of a read operation for LSB and MSB pages

| Page type | Output precision | Eac (nJ/byte) | Edo (nJ/byte) | Er (nJ/byte) |
|---|---|---|---|---|
| LSB pages | 1.0-bit | 0.12 | 0.18 | 0.30 |
| LSB pages | 1.4-bit | 0.24 | 0.27 | 0.51 |
| LSB pages | 1.7-bit | 0.36 | 0.36 | 0.72 |
| LSB pages | 2.0-bit | 0.60 | 0.36 | 0.96 |
| MSB pages | 1.0-bit | 0.24 | 0.18 | 0.42 |
| MSB pages | 1.4-bit | 0.48 | 0.27 | 0.72 |
| MSB pages | 1.7-bit | 0.72 | 0.36 | 1.08 |
| MSB pages | 2.0-bit | 1.19 | 0.36 | 1.55 |
As summarized in Table 4, the 1.4-, 1.7-, and 2-bit data outputs of an LSB page consume 1.7, 2.4, and 3.2 times as much energy, respectively, as the 1-bit hard-decision data output. MSB pages consume approximately 1.5 times as much energy as LSB pages.
SOFT-DECISION ERROR CORRECTING PERFORMANCE IN NAND FLASH MEMORY
In this section, we employ the MLC NAND Flash memory channel modeled in [10, 11], where random telegraph noise [12], incremental step pulse programming [13], cell-to-cell interference [14], and non-uniform quantization [15] are considered. In particular, in order to support soft-decision LDPC decoding, we adopt the LLR computation method proposed in [16], in which the four threshold voltage distributions are assumed to be Gaussian and the partial cumulative distribution functions of the Gaussian distributions are used to compute quantized LLRs. Thus, the LLR computation method requires only the means and the variances of the distributions, which are obtained by channel estimation. Note that the LLR computation can be implemented with a look-up table.
For the error correction in NAND Flash memory, we employ a rate-0.96 (68254, 65536) shortened Euclidean geometry (EG) LDPC code whose message size matches the page size of the 128-Gbit 2-bit MLC NAND Flash memory. The EG-LDPC codes [17] are a class of finite-geometry codes and show very low error-floor performance [18] as well as fast convergence speed [17], which are important properties for NAND Flash error correction.
In this study, we estimate the error performances of the NAND Flash memory channel with LDPC and BCH decoders. We assume that the erased state (symbol 11) has a Gaussian distribution whose mean and standard deviation are 1.0 and 0.32 V, respectively, and that the target programming voltages for the symbols 01, 00, and 10 are 2.6, 3.2, and 3.8 V, respectively. In order to generate NAND Flash memory channels with different bit-error rates (BERs), we change the cell-to-cell coupling coefficient factor (CCF) [15, 16]. The CCF primarily affects the variances of the threshold voltage distributions. Increasing the CCF results in a higher raw BER (RBER) because the variance of the Flash memory signal becomes larger.
The error performances of the rate-0.96 (68254, 65536) EG-LDPC code and two BCH codes over the NAND Flash memory channel are plotted in Figure 5a for LSB pages, where the min-sum (MS) algorithm [19] is used for low-complexity LDPC decoding. The performance of BP-based LDPC decoding is also shown for comparison. The simulation of the LDPC code is performed in floating-point arithmetic. The x-axis represents the RBER, and the numbers in parentheses are the corresponding signal-to-noise ratio (SNR) values, which are computed assuming a 4-pulse amplitude modulation channel with additive white Gaussian noise. The BP algorithm with infinite-bit soft-decision information yields the best error correcting performance, and MS decoding with 1.7- and 2-bit soft-decision data output also performs fairly close to BP decoding. The (68256, 65536, 160) BCH code, which has the same code rate of 0.96, performs worse than the LDPC decoder with 1-bit (hard-decision) data. In order to bring the error performance of the BCH code close to that of the LDPC code with 2-bit MS decoding, the error-correcting capability t must be increased from t = 160 to t = 320, which corresponds to a code rate of 0.92 and requires more hardware resources. The comparison of soft-decision LDPC and hard-decision BCH codes clearly shows the advantage of soft-decision decoding.
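The LLR computation described at the start of this section can be sketched as follows for an LSB page. This is a minimal illustration, not the method of [16] in detail: it assumes equally likely symbols, assumes that the LSB is the second bit of each symbol (so 11 and 01 carry LSB = 1, while 00 and 10 carry LSB = 0), and uses a hypothetical standard deviation of 0.12 V for the three programmed states, since the text gives only the erased-state value. The function and constant names are also hypothetical.

```python
import math


def gaussian_cdf(x, mean, sigma):
    """Cumulative distribution function of a Gaussian distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sigma * math.sqrt(2.0))))


# (mean in V, sigma in V, LSB value) for symbols 11, 01, 00, 10.
# The means and the erased-state sigma come from the text; 0.12 V is hypothetical.
LEVELS = [
    (1.0, 0.32, 1),
    (2.6, 0.12, 1),
    (3.2, 0.12, 0),
    (3.8, 0.12, 0),
]


def lsb_llrs(reference_voltages):
    """Return one LLR, log(P(bit=0)/P(bit=1)), per sensing region.

    The sensing reference voltages split the voltage axis into regions; the
    probability mass of each Gaussian inside a region is a partial CDF.
    """
    edges = [-math.inf] + sorted(reference_voltages) + [math.inf]
    llrs = []
    for low, high in zip(edges, edges[1:]):
        mass = {0: 0.0, 1: 0.0}
        for mean, sigma, bit in LEVELS:
            mass[bit] += gaussian_cdf(high, mean, sigma) - gaussian_cdf(low, mean, sigma)
        # Guard against log(0) when a region holds no measurable mass.
        llrs.append(math.log(max(mass[0], 1e-300) / max(mass[1], 1e-300)))
    return llrs


if __name__ == "__main__":
    # Two reference voltages around the 01/00 boundary give three regions.
    print(lsb_llrs([2.8, 3.0]))
```

Because the result depends only on the region index, the returned list can serve directly as the look-up table mentioned in the text, recomputed whenever channel estimation updates the means and variances.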
Figure 5. Error-performance of a (68254, 65536) EG-LDPC code and two BCH codes for (a) LSB and (b) MSB pages.
Figure 5b shows the error performances of the LDPC code and two BCH codes for MSB pages. The overall performance of the LDPC code for MSB pages is slightly worse than that for LSB pages. In this case, a BCH code with an error-correcting capability of t = 300 is required to achieve performance comparable to that of the LDPC code with 2-bit soft-decision MS decoding. In Figure 5a, we can see that even hard-decision-based decoding works when the RBER is lower than 1.95 × 10^-3. However, when the RBER is between 1.95 × 10^-3 and 3.15 × 10^-3, hard-decision-based decoding does not work and only soft-decision decoding can remove most of the errors. When the RBER is greater than 3.62 × 10^-3, even 2-bit MS decoding cannot correct the data properly. From this observation, we can divide the RBER values into five regions as shown in Table 5. Although a NAND Flash memory system requires error-free decoding with a BER of less than 10^-15, here we set the target BER to 10^-7 because simulating the LDPC code down to the required BER takes too long. Note again that EG-LDPC codes show very low error-floor performance and have fast convergence speed. Finally, Table 5 summarizes the results for LSB and MSB pages. Here, we can see that 1.4-bit precision greatly enhances the error correcting performance compared with 1-bit hard-decision decoding. However, further increasing the precision brings diminishing returns. As a result, Region II is considerably wider than Region III or IV.
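The BCH parameters quoted in this section can be checked with a short calculation. The sketch below relies on the standard bound that a binary t-error-correcting BCH code over GF(2^m) needs at most m·t parity bits, and it assumes the shortened codes here use exactly m·t parity bits. That assumption reproduces the (68256, 65536, 160) code. The function name is hypothetical.

```python
def bch_params(k, t):
    """Return (m, n, rate) for a shortened binary BCH code with k message bits.

    ASSUMPTION: the code uses exactly m * t parity bits, where m is the
    smallest field degree whose natural length 2^m - 1 can hold the codeword.
    """
    m = 1
    while (1 << m) - 1 < k + m * t:
        m += 1
    n = k + m * t
    return m, n, k / n


if __name__ == "__main__":
    for t in (160, 300, 320):
        m, n, rate = bch_params(65536, t)
        print(f"t={t}: m={m}, n={n}, rate={rate:.3f}")
    # The (68254, 65536) LDPC code for comparison.
    print(f"LDPC rate={65536 / 68254:.3f}")
```

With t = 160 this gives n = 68256, and with t = 320 the rate drops to about 0.92, in line with the figures quoted above.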
Table 5. The operating regions according to memory output precision

| Region | RBER (×10^-3), LSB pages | RBER (×10^-3), MSB pages | Memory output precision needed for 10^-7 BER |
|---|---|---|---|
| Region I (R1) | ∼1.95 | ∼1.79 | 1-, 1.4-, 1.7-, and 2-bit |
| Region II (R2) | 1.95–3.15 | 1.79–2.90 | 1.4-, 1.7-, and 2-bit |
| Region III (R3) | 3.15–3.50 | 2.90–3.15 | 1.7- and 2-bit |
| Region IV (R4) | 3.50–3.62 | 3.15–3.33 | 2-bit |
| Region V (R5) | 3.62+ | 3.33+ | – |
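The region boundaries of Table 5 translate directly into a look-up that returns the smallest output precision able to reach the target BER at a given RBER. The sketch below is only an illustration: the function name is hypothetical, and treating an RBER that falls exactly on a boundary as belonging to the higher region is an assumption.

```python
from bisect import bisect_right

# Upper RBER bounds of Regions I-IV, taken from Table 5.
REGION_BOUNDS = {
    "LSB": [1.95e-3, 3.15e-3, 3.50e-3, 3.62e-3],
    "MSB": [1.79e-3, 2.90e-3, 3.15e-3, 3.33e-3],
}
# Smallest usable precision (in bits) for Regions I-V; None means that no
# listed precision reaches the target BER.
MIN_PRECISION = [1.0, 1.4, 1.7, 2.0, None]


def min_precision(page_type, rber):
    """Return the smallest memory output precision usable at this RBER."""
    region = bisect_right(REGION_BOUNDS[page_type], rber)
    return MIN_PRECISION[region]


if __name__ == "__main__":
    for rber in (1.0e-3, 2.0e-3, 3.3e-3, 3.55e-3, 4.0e-3):
        print(rber, min_precision("LSB", rber), min_precision("MSB", rber))
```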
HARDWARE PERFORMANCE OF (68254, 65536) LDPC DECODER
In order to assess the energy consumption of LDPC decoding, we have implemented the (68254, 65536) EG-LDPC decoder employing the normalized a posteriori probability (APP)-based algorithm and layered decoding, which lead to simplified functional units and halved decoding iterations, respectively. In addition, a conditional variable node update scheme is employed to improve the error performance and reduce circuit switching activities in the node processing units [20]. The decoding throughput is increased by employing a 5-stage pipelined, 8-way parallel architecture, while the chip size is much reduced by adopting memory optimization techniques [20]. Because the error performance of fixed-point LDPC decoding with the normalized APP-based algorithm is very close to that of floating-point decoding, the (68254, 65536) LDPC decoder yields almost the same performance as shown in Figure 5 for the NAND Flash memory channel.
The LDPC decoder was synthesized, placed, and routed in 0.13-μm CMOS technology using Synopsys tools; parasitic resistances and capacitances were then extracted to estimate the energy consumption accurately. Randomly generated information bits were encoded, and Gaussian noise was added to make test vectors. Then, the power consumption, iteration count, and decoding latency were estimated using Synopsys PrimeTime. From the simulation results, we obtained the average energy consumption as a first-order function of the iteration count. Finally, the energy consumption of the LDPC decoder was computed using the average iteration counts found by simulation for each memory output precision and RBER. In order to consider an implementation in a more recent process technology, the decoding energy of the LDPC decoder is scaled down to a 65-nm technology. The core supply voltages of the 130- and 65-nm nodes are 1.2 and 1.0 V, respectively. In addition, the maximum clock frequencies are assumed to be the same, 131 MHz, for both processes. Considering the process technologies and the supply voltages, the energy consumption is scaled down by a factor of 2.88 ($= [(65\,\text{nm}/130\,\text{nm}) \times (1.0\,\text{V}/1.2\,\text{V})^2]^{-1}$) for the 65-nm technology node according to [21].
The energy consumption of the (68254, 65536) LDPC VLSI in the 65-nm technology for hard-decision and soft-decision data is shown in Figure 6, where the inputs to the LDPC decoder are LLR values. The clock frequency was set to 131 MHz. Here, we set the maximum iteration limit to eight and the number of quantization bits in the LDPC decoder to seven, including two bits for the fractional part. Since the implemented LDPC decoder converges very quickly, the decoding energy consumption decreases rapidly at low RBER (high SNR). For the low RBER region below 10^-3, all decoders mostly need only one decoding iteration, resulting in the minimum energy consumption of 0.7 nJ/byte. For the region exceeding an RBER of 10^-3, decoding with multi-precision data consumes less energy than decoding with hard-decision data because of the decreased number of iterations. In addition, in the region below an RBER of 3 × 10^-3, all soft-decision decoding shows similar energy consumption.
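The scaling factor quoted above can be reproduced with a one-line calculation: energy is taken to scale linearly with feature size and quadratically with supply voltage, as in the expression in the text. The function name is hypothetical.

```python
def energy_scaling_factor(old_node_nm, new_node_nm, old_vdd, new_vdd):
    """Factor by which energy shrinks when moving from the old to the new node."""
    return 1.0 / ((new_node_nm / old_node_nm) * (new_vdd / old_vdd) ** 2)


if __name__ == "__main__":
    factor = energy_scaling_factor(130, 65, 1.2, 1.0)
    print(round(factor, 2))
```

Dividing an energy figure measured at 130 nm by this factor gives the corresponding 65-nm estimate.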
Figure 6. The energy consumption of the (68254, 65536) LDPC decoder (65nm VLSI) over NAND Flash memory channel for (a) LSB and (b) MSB pages.
In the high RBER region, where only 2-bit soft-decision decoding can be used, the average energy consumption of the LDPC decoder is 1.6 to 8.4 times that of a read operation in MLC NAND Flash memory. However, in the low RBER (high SNR) region, in which every precision can be used, the LDPC decoder consumes only 0.5 to 2.3 times the energy needed for the read operation in MLC NAND Flash memory. Therefore, the total energy consumption is dominated by the LDPC decoder in the high RBER region, but is influenced more by the read operation of NAND Flash memory in the low RBER region.
LOW-ENERGY ERROR CORRECTION SCHEME FOR NAND FLASH MEMORY
Optimum output precision for low-energy decoding
The total energy consumption of NAND Flash memory access can be obtained by adding the energy for memory access and the energy for error correction. We observe that high output precision increases the energy for memory access, while it can reduce the LDPC decoding energy. Figure 7 shows the total energy consumption of NAND Flash memory with the LDPC decoder for LSB and MSB pages, where NAND Flash memory operates at 100 MHz and Vccq of 1.8 V in the synchronous data output mode. The vertical dotted lines divide the operating regions according to Table 5.
Figure 7. The total energy consumption for (a) LSB and (b) MSB pages, where the LDPC decoder was scaled down to 65-nm technology node.
In Region I, where all hard- and soft-decision decoding schemes operate, 1-bit hard-decision decoding shows the smallest energy consumption when the RBER is very low, while 1.4-bit soft-decision decoding consumes less energy than hard-decision decoding as the RBER increases. In Region II, the 1.4-bit memory output precision results in the lowest energy consumption, while in Region III, the 1.7-bit precision leads to the lowest consumption. Finally, in Region IV, there is no choice other than 2-bit soft-decision decoding. In summary, for each operating region, decoding with the smallest allowed number of output bits consumes the least energy among the possible decoding schemes, especially for decoding MSB pages. Although 2-bit soft-decision decoding shows the best error correcting performance over all RBER regions, it consumes up to two times more energy than hard-decision decoding in the low RBER (high SNR) region because of the additional memory sensing operations. Therefore, depending on the channel condition, an appropriate memory output precision should be chosen to minimize the total energy consumption.
We also studied the trend of the total energy consumption when considering both program-and-erase (PE) cycling and data retention. The NAND Flash memory channel estimation proposed in [22] was used to decide the SRVs, and the smallest output precision was chosen among the possible decoding schemes. Figure 8 shows the total energy consumption for MSB pages. The number of PE cycles and the retention time vary from 1 to 5K cycles and from 1 to 9K hours, respectively. The coupling coefficients of the x and xy directions are set to 0.1034 and 0.006721, respectively, in order to consider 20-nm Flash memory technology [23, 24]. We can see that the total energy consumption is very strongly affected by PE cycling. When the number of PE cycles is less than or equal to 1K, the total energy consumption is at its lowest, around 1 nJ/byte, regardless of the retention time. However, when the number of PE cycles is larger than 1K, the total energy consumption also increases with the retention time.
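The selection rule discussed in this subsection, adding the read energy and the decoding energy for each usable precision and taking the minimum, can be sketched as follows. The read energies are the Er values of Table 4. The decoder energies passed in are the caller's responsibility, and the example values below are hypothetical except for the 0.7 nJ/byte single-iteration figure quoted earlier. The function name is also hypothetical.

```python
# Read energy Er in nJ/byte for each output precision (Table 4).
READ_ENERGY = {
    "LSB": {1.0: 0.30, 1.4: 0.51, 1.7: 0.72, 2.0: 0.96},
    "MSB": {1.0: 0.42, 1.4: 0.72, 1.7: 1.08, 2.0: 1.55},
}


def best_precision(page_type, decode_energy):
    """Pick the precision with the lowest total (read + decode) energy.

    decode_energy maps each precision that decodes successfully at the
    current RBER to the decoder energy in nJ/byte.
    """
    totals = {
        precision: READ_ENERGY[page_type][precision] + energy
        for precision, energy in decode_energy.items()
    }
    best = min(totals, key=totals.get)
    return best, totals[best]


if __name__ == "__main__":
    # Low RBER: every precision needs about one iteration (0.7 nJ/byte).
    print(best_precision("MSB", {1.0: 0.7, 1.4: 0.7, 1.7: 0.7, 2.0: 0.7}))
    # Higher RBER: hypothetical decoder energies where 1-bit needs many iterations.
    print(best_precision("MSB", {1.0: 2.5, 1.4: 0.9, 2.0: 0.8}))
```

At low RBER the decoder energy is the same for every precision, so the read energy decides and the 1-bit output wins. Once hard-decision decoding needs many iterations, the cheaper decoding at 1.4-bit outweighs its extra sensing cost.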
Figure 8. The total energy consumption for MSB pages with the number of PE cycles and retention time.
Iteration count-based precision selection
The presented experimental results show that optimum precision selection is very important for low-energy soft-decision-based decoding of NAND Flash memory. One straightforward idea is failure-based precision selection, in which the precision is increased when decoding fails. For example, decoding begins with 1-bit (hard-decision) precision, and if it fails, decoding is retried with a higher precision. Although this method is very simple and does not require storing the precision information, it can consume a large amount of energy when decoding fails because LDPC decoders then run for many iterations. The failure-based scheme also incurs additional delay for retrying the decoding with an updated precision.
Another approach is to estimate the signal quality of NAND Flash memory periodically with channel estimation algorithms [22]. By sensing the signal with multiple threshold voltages, we can estimate the mean and the variance of each symbol. This method, however, demands extra time and energy for signal quality estimation. Considering that the signal quality deteriorates as the number of PE cycles and the retention time increase, the overhead of periodic estimation can be quite high, especially for large-capacity solid-state drives.
We propose a precision selection method that utilizes the iteration count of the LDPC decoder. In this explanation, we use precisions of 1.0, 1.4, and 2.0 bits because the optimum operating range of the 1.7-bit precision is quite narrow. As shown in Figure 9, when the RBER is very low, such as less than 1.0 × 10^-3, the average iteration count is around one even with 1-bit precision decoding. Thus, employing the 1-bit precision is best for low-energy decoding in this region. However, as the RBER grows to approximately between 1.0 × 10^-3 and 1.79 × 10^-3, decoding with 1-bit precision for the Flash memory output demands an increased number of iterations. Thus, we need to increase the precision to 1.4-bit for low energy when the iteration count with 1-bit precision is repeatedly two or greater. The opposite path is also needed: if the iteration count is repeatedly only one with 1.4-bit precision, the precision should be lowered to 1-bit. A similar scenario occurs when the RBER is close to 3.0 × 10^-3. In this region, decoding with 1.4-bit precision demands an iteration count of three or more, which means that it is time to increase the precision to 2-bit. Likewise, when the iteration count with 2-bit decoding is repeatedly two or less, we need to decrease the precision to 1.4-bit. Since we increase the precision before decoding fails, we can avoid the energy loss and delay.
Figure 9. Average number of decoding iterations of the (68254, 65536) LDPC decoder for MSB pages.
The iteration count-based precision selection can also be applied to adapt the reference voltages. When the bit error pattern shows an asymmetric result, which means that the number of errors from 1 to 0 is significantly higher or lower than that from 0 to 1, we need to adjust the sensing reference voltages and the direction is easily determined by the error statistics. The channel estimation is performed only when the iteration count with 2-bit precision is repeatedly four or greater.
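The iteration count-based rules above can be sketched as a small state machine. The thresholds follow the text: step up from 1-bit at two or more iterations, step up from 1.4-bit at three or more, step down at one iteration (1.4-bit) or two or fewer (2-bit), and trigger channel estimation at four or more iterations with 2-bit precision. The text does not quantify "repeatedly", so the `repeat` parameter, the class name, and the run-counting logic are assumptions made for illustration.

```python
PRECISIONS = [1.0, 1.4, 2.0]
# Iteration count at or above which the current precision is considered too low.
# For 2.0-bit there is no higher precision, so the threshold triggers estimation.
RAISE_AT = {1.0: 2, 1.4: 3, 2.0: 4}
# Iteration count at or below which the current precision is considered too high.
LOWER_AT = {1.4: 1, 2.0: 2}


class PrecisionSelector:
    """Tracks decoder iteration counts and adapts the memory output precision."""

    def __init__(self, repeat=3):
        self.precision = 1.0
        self.repeat = repeat  # ASSUMPTION: consecutive reads that count as "repeatedly"
        self.high_run = 0
        self.low_run = 0

    def update(self, iterations):
        """Record one decode; return True when channel estimation should run."""
        p = self.precision
        self.high_run = self.high_run + 1 if iterations >= RAISE_AT[p] else 0
        is_low = p in LOWER_AT and iterations <= LOWER_AT[p]
        self.low_run = self.low_run + 1 if is_low else 0

        estimate = False
        if self.high_run >= self.repeat:
            if p == PRECISIONS[-1]:
                estimate = True
            else:
                self.precision = PRECISIONS[PRECISIONS.index(p) + 1]
            self.high_run = self.low_run = 0
        elif self.low_run >= self.repeat:
            self.precision = PRECISIONS[PRECISIONS.index(p) - 1]
            self.high_run = self.low_run = 0
        return estimate


if __name__ == "__main__":
    selector = PrecisionSelector(repeat=2)
    for count in (1, 2, 2, 3, 3, 4, 5):
        needs_estimation = selector.update(count)
        print(count, selector.precision, needs_estimation)
```

Only the current precision and two short counters are kept per block or page group, which is consistent with the low overhead the scheme aims for.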
CONCLUDING REMARKS
We studied the optimum output precision of NAND Flash memory for low-energy soft-decision-based error correction. The energy consumed by the NAND Flash memory as well as by the LDPC decoder is considered. This study shows that the optimum precision of Flash memory data for soft-decision LDPC decoding depends on the signal quality, which implies that knowing the SNR of NAND Flash memory is quite important for low-energy error correction. When the SNR is relatively high, conventional 1-bit (hard-decision) decoding leads to the lowest energy consumption because of the minimum sensing and output energy consumed by the NAND Flash memory; however, as the SNR decreases, the optimum number of bits for low energy needs to be increased. We find that a precision of 1.4-bit for each output, which corresponds to providing an erasure region at each signal boundary, leads to minimum-energy decoding over a broad range of signal quality. We also propose an adaptive, feedback-based precision selection scheme that needs virtually no overhead.
ACKNOWLEDGEMENTS
This work was supported in part by the Brain Korea 21 Project and the National Research Foundation of Korea (NRF) grants funded by the Korea government (MEST) (No. 2011-0027502 and 2012R1A2A2A06047297).
REFERENCES
1. Liu W, Rho J, Sung W: Low-power high-throughput BCH error correction VLSI design for multi-level cell NAND Flash memories. In Proceedings of the IEEE Workshop Signal Processing Systems (SiPS’2006). (Alberta, Canada; 2–4 October 2006). pp. 303–308
2. Micheloni R, Ravasio R, Marelli A, Alice E, Altieri V, Bovino A, Crippa L, Martino ED, D’Onofrio L, Gambardella A, Grillea E, Guerra G, Kim D, Missiroli C, Motta I, Prisco A, Ragone G, Romano M, Sangalli M, Sauro P, Scotti M, Won S: A 4-Gb 2b/cell NAND flash memory with embedded 5b BCH ECC for 36 MB/s system read throughput. In IEEE ISSCC Digest of Technical Papers. (San Francisco, CA; 6–9 February 2006). pp. 497–506
3. Chen B, Zhang X, Wang Z: Error correction for multi-level NAND flash memory using Reed-Solomon codes. In Proceedings of the IEEE Workshop Signal Processing Systems (SiPS’2008). (Hainan Island, China; 8–10 October 2008). pp. 94–99
4. Gallager RG: Low density parity check codes. IRE Trans. Inf. Theory 1962, IT-8(1):21-28.
5. MacKay DJC: Good error-correcting codes based on very sparse matrices. IEEE Trans. Inf. Theory 1999, 45(2):399-432. 10.1109/18.748992
6. Digital Video Broadcasting (DVB): Second Generation System for Broadcasting, Interactive Services, News Gathering and Other Broadband Satellite Applications. ETSI Standard, ETS 302 307 March 2005.
7. IEEE Standard for Information Technology-Telecommunications and Information Exchange between Systems-Local and Metropolitan Area Networks-Specific Requirements Part 3, Carrier Sense Multiple Access with Collision Detection (CSMA/CD) Access Method and Physical Layer Specifications. IEEE Standard IEEE 802.3an June 2006.
8. IEEE Standard for Local and Metropolitan Area Networks Part 16: Air Interface for Fixed and Mobile Broadband Wireless Access Systems. IEEE Standard IEEE 802.16e February 2006.
9. Micron Technology Inc., 2009); http://www.micron.com/products/nand-flash/mlc-nand. Accessed 5 December 2012
10. Dong G, Li S, Zhang T: Using data postcompensation and predistortion to tolerate cell-to-cell interference in MLC NAND flash memory. IEEE Trans. Circuits Syst. I: Regular Papers 2010, 57(10):2718-2728.
11. Dong G, Pan Y, Xie N, Varanasi C, Zhang T: Estimating information-theoretical NAND flash memory storage capacity and its implication to memory system design space exploration. IEEE Trans. VLSI Syst 2012, 20(9):1705-1714.
12. Monzio C, Ghidotti M, Lacaita A, Spinelli A, Visconti A: Random telegraph noise effect on the programmed threshold-voltage distribution of flash memories. IEEE Electron. Dev. Lett 2009, 30(9):984-986.
13. Suh BH, Um YH, Kim JK, Choi YJ, Koh YN, Lee SC, Kwon SS, Choi BS, Yum JS, Choi JH, Kim JR, Lim HK: A 3.3V 32Mb NAND flash memory with incremental step pulse programming scheme. IEEE J. Solid State Circuits 1995, 30(11):1149-1156. 10.1109/4.475701
14. Lee JD, Hur SH, Choi JD: Effects of floating-gate interference on NAND flash memory cell operation. IEEE Electron. Dev. Lett 2002, 23(5):264-266.
15. Dong G, Xie N, Zhang T: On the use of soft-decision error-correction codes in NAND flash memory. IEEE Trans. Circuits Syst. I: Regular Papers 2011, 58(2):429-439.
16. Kim J, Lee D, Sung W: Performance of rate 0.96 (68254, 65536) EG-LDPC code for NAND Flash memory error correction. In Proceedings of IEEE International Conference on Communications (ICC’2012), Workshop on Emerging Data Storage Technologies. (Ottawa, Canada; 10–15 June 2012).
17. Kou Y, Lin S, Fossorier MPC: Low-density parity-check codes based on finite geometries: a rediscovery and new results. IEEE Trans. Inf. Theory 2001, 47(7):2711-2736. 10.1109/18.959255
18. Huang Q, Diao Q, Lin S, Abdel-Ghaffar K: Cyclic and quasi-cyclic LDPC codes on constrained parity-check matrices and their trapping sets. IEEE Trans. Inf. Theory 2012, 58(5):2648-2671.
19. Chen J, Dholakia A, Eleftheriou E, Fossorier MPC, Hu XY: Reduced-complexity decoding of LDPC codes. IEEE Trans. Commun 2005, 53(8):1288-1299. 10.1109/TCOMM.2005.852852
20. Kim J, Sung W: A rate-0.96 LDPC decoding VLSI for soft-decision error correction of NAND flash memory. IEEE Trans. VLSI Syst 2012. (submitted, in review)
21. Borkar S: Design challenges of technology scaling. IEEE Micro 1999, 19(4):23-29. 10.1109/40.782564
22. Lee D, Sung W: Estimation of NAND flash memory threshold voltage distribution for optimum soft-decision error correction. IEEE Trans. Signal Process 2012. (accepted for publication)
23. Prall K: Scaling non-volatile memory below 30nm. In Proceedings of the 22nd IEEE Non-Volatile Semiconductor Memory Workshop (SiPS’2007). (Monterey, CA; 26–30 August 2007):5-10.
24. Poliakov P, Blomme P, Corbalan MM, Houdt JV, Dehaene W: Cross-cell interference variability aware model of fully planar NAND Flash memory including line edge roughness. Microelectron Reliab 2011, 51(5):919-924. 10.1016/j.microrel.2010.12.010
CHAPTER 5
Performance of Soft Viterbi Decoder enhanced with Non-Transmittable Codewords for Storage Media
Kilavo Hassan1, Kisangiri Michael1 and Salehe I. Mrutu2
1 School of Computational and Communication Science and Engineering, Nelson Mandela African Institution of Science and Technology, Arusha, Dodoma, Tanzania.
2 College of Informatics and Virtual Education, The University of Dodoma, Dodoma, Tanzania.
Citation: Hassan, K., Michael, K., & Mrutu, S. I. (2018). “Performance of Soft Viterbi Decoder enhanced with Non-Transmittable Codewords for storage media”. Cogent Engineering, 5(1), 1426538. https://doi.org/10.1080/23311916.2018.1426538
Copyright: © 2018 The Author(s). This open access article is distributed under a Creative Commons Attribution (CC-BY) 4.0 license.

ABSTRACT
The introduction of Non-Transmittable Codewords (NTCs) into the Viterbi Algorithm Decoder has emerged as one of the best ways of improving its performance. However, the performance has been tested only with the hard decision Viterbi Decoder in telecommunication systems, not with the soft decision Viterbi Decoder or in storage media. Most storage media use the Reed Solomon (RS) Algorithm Decoder. Yet field experience still shows the algorithm failing to correct burst errors when reading data from storage media, leading to data loss. This paper introduces Soft Viterbi Algorithm Decoding enhanced with Non-Transmittable Codewords for storage media. Matlab software was used to simulate the algorithm, and the performance was measured by comparing residual errors in a data length of one million bits. An Additive White Gaussian Noise model was applied to distort the stored data. The performance comparison was made against the Reed Solomon code, Normal Soft Viterbi, and Hard decision Viterbi enhanced with NTCs. The results showed that the Soft Viterbi Algorithm enhanced with NTCs performed remarkably better, by 88.98% against RS, 84.31% against Normal Soft Viterbi, and 67.26% against Hard Viterbi enhanced with NTCs.
Keywords: NTCs, SVAD-NTCs, soft and hard Viterbi, Reed Solomon, storage media
PUBLIC INTEREST STATEMENT
The demand for digital data storage media increases every day, and it is estimated that over 90% of all the digital data produced in the world is stored on hard disks. Errors sometimes occur in storage media, causing difficulties in retrieving data that lead to data loss. Error control in storage media relies on error correction algorithms to guarantee information retrieval. The Reed Solomon (RS) code is the dominant algorithm for error correction in storage media. However, recent studies show that challenges in retrieving data from storage media still exist. This research is among the efforts to design more powerful and effective error correction algorithms. In this study, the Soft Viterbi Algorithm Decoder enhanced with Non-Transmittable Codewords (SVAD-NTCs) is proposed. The experimental results for error correction in storage media show that SVAD-NTCs perform better than RS.
INTRODUCTION
Error correction for storage media is a major challenge because of the growing demand for digital data, most of which is kept on storage media. The demand for storage media devices is increasingly vast (Coughlin & Handy, 2008). The large file sizes required for high-resolution and multi-camera images are among the reasons for the increasing demand for storage devices (Coughlin, 2015). Ensuring the reliability and quality of data from storage media is one of the big challenges (Peters, Rabinowitz, & Jacobs, 2006). The demand for storage media increases every day, and it is estimated that over 90% of all information and data produced in the world is stored on hard disk drives (Pinheiro, Weber, & Barroso, 2007). Most people are neither aware of nor interested in improving Forward Error Correction codes for storage media; rather, they are interested in improving backup systems and data recovery software (Hassan, Michael, & Mrutu, 2015). To prevent errors from causing data corruption in storage media, data can be protected with error correction codes. The Viterbi decoder was introduced by Andrew J. Viterbi in 1967 (Mousa, Taman, & Mahmoud, 2015; Sargent, Bimbot, & Vincent, 2011). Since then, researchers have worked tirelessly to extend his work by finding better Viterbi decoders (Andruszkiewicz, Davis, & Lleo, 2014; Takano, Ishikawa, & Nakamura, 2015). The Viterbi algorithm allows an arbitrary number of the most likely sequences to be enumerated. It can be used to efficiently calculate the most likely path of the hidden state process (Cartea & Jaimungal, 2013; Titsias, Holmes, & Yau, 2016).
Channel coding techniques in storage media are used to make the stored data robust against impairments. Data in a storage medium that is corrupted by noise can be recovered by using channel coding techniques (Cover & Thomas, 2012). The encoding technique can be either systematic or non-systematic. Channel codes can be compared using different metrics, such as coding accuracy, coding efficiency, and coding gain. Coding accuracy means that a code is strong and can usually recover corrupted data. The accuracy is judged by how closely the recovered data match the original data, which is measured by the Bit Error Rate (BER) (Jiang, 2010). Coding efficiency means that the code has a relatively small number of encoder bits per data symbol; it is defined in terms of the code rate, R = K/N, where K is the number of input symbols to the encoder and N is the number of output symbols from the encoder. Decreasing the number of redundant bits decreases the number of errors per symbol that the code can detect or correct. Coding gain is the difference in signal-to-noise ratio (SNR) that the coded and uncoded systems require to reach the same BER level.
The Convolutional code and the Viterbi decoder are powerful forward error correction techniques (Katta, 2014). The Viterbi algorithm is one of the best methods for decoding Convolutional codes. It involves calculating the similarity, or distance, between the received signal at time t and all the trellis paths entering each state at time t(i) (Sood & Sah, 2014). Viterbi decoding is probably the most popular and widely adopted decoding algorithm for Convolutional codes (Becker & Traylor, 2012; Jiang, 2010). The decoding can be done using two different approaches: the hard decision approach and the soft decision approach (Sklar, 2001). The difference between the two is that the hard decision approach digitizes the received voltage signal by comparing it to a threshold before passing it to the decoder, while the soft decision approach uses a continuous function of the analog sample as the input to the decoder and does not digitize the incoming sample prior to decoding. In this paper, the soft decision approach was considered better than the hard decision approach. If this approach is adopted for storage media devices, the reliability of data on these devices will improve.
BINARY CONVOLUTIONAL ENCODING AND DECODING
A Convolutional code is described by the parameters (n, k, m). Typically, k and n are small integers with k < n, but the memory order m can vary, and a lower error probability can be achieved by increasing the memory (Marazin, Gautier, & Burel, 2011). Convolutional codes differ from block codes in that the encoder contains memory: the n encoder outputs at any time unit depend not only on the k inputs at that time but also on previous input blocks (Wicker, 1995). The encoding process used in this paper is locked Convolutional encoding, and the decoding process used is the enhanced Soft decision Viterbi Decoding. Figure 1 is a state diagram for the (2, 1, 2) binary Convolutional encoder, which takes one bit as input and produces two bits as output.
Figure 1. State diagram for (2, 1, 2) binary convolutional encoder.
Table 1 lists the inputs and outputs of the state diagram. From each state, an input of 0 or 1 produces an output of 00, 01, 10 or 11 and determines the next state.

Table 1. Input and output of the state diagram for the encoder

Current state    Input = 0                 Input = 1
                 Output    Next state      Output    Next state
00 (S0)          00        00 (S0)         11        10 (S2)
01 (S1)          11        00 (S0)         00        10 (S2)
10 (S2)          10        01 (S1)         01        11 (S3)
11 (S3)          01        01 (S1)         10        11 (S3)

Notes: S0 = 00 (state one), S1 = 01 (state two), S2 = 10 (state three), and S3 = 11 (state four).
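The entries of Table 1 can be reproduced from the two generators of the encoder. The sketch below assumes the generators 111 and 101 (binary) act on the register (input bit, most recent bit, older bit) and that a state is written as (most recent bit, older bit); the names `G`, `step` and `state_table` are illustrative only.

```python
# Sketch: derive the encoder state table from the generator taps.
# Assumption: generators 0b111 and 0b101 tap (input, s1, s2),
# where the state (s1, s2) holds the two previous input bits.

G = (0b111, 0b101)

def step(state, bit):
    """Return ((out1, out2), next_state) for one input bit."""
    s1, s2 = state
    reg = (bit << 2) | (s1 << 1) | s2
    # Each output bit is the modulo-two sum of the tapped register cells.
    out = tuple(bin(reg & g).count("1") % 2 for g in G)
    return out, (bit, s1)

def state_table():
    """Map (state, input bit) to (output pair, next state)."""
    table = {}
    for s1 in (0, 1):
        for s2 in (0, 1):
            for bit in (0, 1):
                table[((s1, s2), bit)] = step((s1, s2), bit)
    return table

for (state, bit), (out, nxt) in sorted(state_table().items()):
    print(state, bit, out, nxt)
```

With these assumptions, each printed line corresponds to one cell pair of Table 1, for example state (0, 0) with input 1 gives output (1, 1) and next state (1, 0).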
Figure 2 is the state diagram of the (2, 1, 2) binary convolutional decoder, which takes two bits as input and produces one bit as output. The decoder uses it to get back to the original codeword when the received one has been corrupted, by following the path with the minimum Euclidean distance.
Figure 2. State diagram for (2, 1, 2) binary convolutional decoder.
Table 2 lists the inputs and outputs of the decoder state diagram. From each state, a received pair 00, 01, 10 or 11 gives the output bit 0 or 1 and the next state, which guide the decoder to the correct codeword when the received one has been corrupted.

Table 2. Input and output of the state diagram for the decoder

Current state    Input = 00       Input = 01       Input = 10       Input = 11
                 Output  Next     Output  Next     Output  Next     Output  Next
S0               0       S0       -       -        -       -        1       S2
S1               1       S2       -       -        -       -        0       S0
S2               -       -        1       S3       0       S1       -       -
S3               -       -        0       S1       1       S3       -       -

Notes: S0 = 00 (state one), S1 = 01 (state two), S2 = 10 (state three), and S3 = 11 (state four). A dash marks a pair that the encoder of Table 1 does not produce from that state.
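A plain soft decision Viterbi decoder for this (2, 1, 2) code can be sketched as follows. It is not the NTC-enhanced decoder studied in this chapter. It assumes BPSK with bit 0 mapped to +1.0, an encoder that starts in state S0, and the squared Euclidean distance as branch metric; all function names are hypothetical.

```python
import math

G = (0b111, 0b101)  # assumed generator taps on (input, s1, s2)

def step(state, bit):
    """One encoder transition: ((out1, out2), next_state)."""
    s1, s2 = state
    reg = (bit << 2) | (s1 << 1) | s2
    out = tuple(bin(reg & g).count("1") % 2 for g in G)
    return out, (bit, s1)

def encode(bits):
    """Encode a bit list, starting from state (0, 0)."""
    state, out = (0, 0), []
    for b in bits:
        pair, state = step(state, b)
        out.extend(pair)
    return out

def bpsk(bits):
    """Assumed mapping: bit 0 -> +1.0, bit 1 -> -1.0."""
    return [1.0 - 2.0 * b for b in bits]

def viterbi_soft(samples):
    """Return the input bits of the path with minimum Euclidean distance."""
    states = [(0, 0), (0, 1), (1, 0), (1, 1)]
    metric = {s: (0.0 if s == (0, 0) else math.inf) for s in states}
    paths = {s: [] for s in states}
    for i in range(0, len(samples), 2):
        r = samples[i:i + 2]
        new_metric = {s: math.inf for s in states}
        new_paths = {s: [] for s in states}
        for s in states:
            if metric[s] == math.inf:
                continue  # state not reachable yet
            for bit in (0, 1):
                out, nxt = step(s, bit)
                branch = sum((x - y) ** 2 for x, y in zip(r, bpsk(out)))
                cand = metric[s] + branch
                if cand < new_metric[nxt]:  # keep the survivor path
                    new_metric[nxt] = cand
                    new_paths[nxt] = paths[s] + [bit]
        metric, paths = new_metric, new_paths
    best = min(states, key=lambda s: metric[s])
    return paths[best]

message = [1, 0, 1, 1, 0, 0]       # last two zeros flush the memory
noisy = bpsk(encode(message))
noisy[2] = 0.3                     # one sample pushed across the threshold
print(viterbi_soft(noisy) == message)
```

In this small example one channel sample crosses the decision threshold, and the decoder still returns the original message because every competing path lies farther from the received samples.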
ENHANCED SOFT VITERBI ALGORITHM
The Soft Viterbi Algorithm Decoder enhanced with Non-Transmittable Codewords (SVAD-NTCs) was used in this study. A locked convolutional encoder was used to encode the data, and the SVAD-NTCs was used to decode the stored data during the reading process. The general parameters of our design refer to the general diagram of the binary convolutional encoder shown in Figure 3. The encoder accepts k input bits at a time and reads out n bits, which gives a code rate of k/n. In our case k = 1 and n = 2, hence the code rate is 1/2: one input bit produces two output bits. The constraint length is given by m + 1, where m is the memory. In this design the constraint length is 3: two delay elements, which are the memory, plus the single input bit. These parameters dictate the complexity of the decoder. For example, increasing the constraint length increases the memory size and the error correcting capability (Mrutu, Sam, & Mvungi, 2014a). However, the decoding computation grows exponentially with the constraint length, so the decoder ends up with prohibitive delays when the constraint length is above 10, which is not preferred.
Figure 3. Convolutional encoder.
The data are provided at a rate of q bits per second and the channel symbols are output at a rate of 2q symbols per second. The input bit is stable during the encoder cycle, and the cycle starts when the input clock edge occurs. At that edge, the output of the left-hand flip-flop is clocked into the right-hand flip-flop, the previous input bit is clocked into the left-hand flip-flop, and a new input bit becomes available. This encoder encodes the input in the (7, 5) convolutional code, where the numbers 7 and 5 represent the code generator polynomials in octal. In binary they read 111 and 101, and they correspond to the shift register connections to the upper and lower modulo-two adders, respectively. This code has been determined to be the best code for rate 1/2 at this constraint length.
DEVELOPED MODEL
This study introduces a new forward error correction technique that can be adopted in storage media devices. The technique uses a locked convolutional encoder to write data to the storage media and the Soft Viterbi Algorithm Decoder enhanced with Non-Transmittable Codewords to read data from the storage media. The Viterbi Algorithm Decoder has special characteristics which enable it to use Non-Transmittable Codewords (NTCs) at the receiving machine (Mrutu, Sam, & Mvungi, 2014b). The technique uses either a higher or a lower locked encoder: in the higher encoder we add two bits which are minus one and minus one (−1−1), and in the lower encoder we add two bits which are one and one (11). This stabilizes the decoder and reduces the computational complexity. The technique can be used in the decoding process only if the locked convolutional encoder is used in the encoding process. A Reed Solomon codec was also developed for comparison, because most storage media devices use the Reed Solomon algorithm. Both algorithms used the same coding efficiency. The simulation parameters are given in Tables 3 and 4, and the simulation block diagram is shown in Figure 4.

Table 3. Reed Solomon code parameters used

Parameter                       Value
Bits per symbol m               5 bits
Codeword length N               N = 2^m − 1 = 31 symbols
Message length z                z = [11, 15, 19, 23, 27] symbols
Number of codewords             6,452 codewords
Noise                           AWGN
Modulation/Demodulation         BPSK
Number of encoded bits sent     6,452 × N × m = 1,000,060 bits
SNR                             [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10] dB
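The arithmetic behind the Reed Solomon parameters can be checked with a short sketch. The variable names are illustrative, and the correction capability t = (N − z) / 2 symbols is the standard property of Reed Solomon codes rather than a figure reported in the table.

```python
# Sketch: derive the Reed Solomon figures of Table 3 from m and z.

m = 5                                   # bits per symbol
N = 2 ** m - 1                          # codeword length in symbols
codewords = 6452                        # number of codewords sent
encoded_bits = codewords * N * m        # total encoded bits

message_lengths = (11, 15, 19, 23, 27)  # z values, in symbols
rates = {z: z / N for z in message_lengths}
# Standard RS property: up to (N - z) // 2 symbol errors are correctable.
correctable = {z: (N - z) // 2 for z in message_lengths}

print(N, encoded_bits)
for z in message_lengths:
    print(z, round(rates[z], 3), correctable[z])
```

The message length z = 15 gives a rate of 15/31, the closest of the listed values to the rate 1/2 of the convolutional code.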
Table 4. Parameters used for the enhanced soft Viterbi algorithm

Parameter                       Value
Data length                     10^6
Constraint length (K)           m + 1 = 3
Rate (r)                        k/n = 1/2
Encoder lock bits               2 zero bits (i.e. 00)
NTCs                            6
Modulation/Demodulation         BPSK
Noise                           AWGN
Quantization                    Soft/Hard decision
Path evaluation                 Euclidean distance
Locked convolutional encoder    (2, 1, 2)
Figure 4. Simulation block diagram.
Keeping track of the post-decoding errors was the key to observing the performance of the decoder. Table 5 shows the residual errors after adding NTCs at different signal-to-noise ratios. The test was conducted by adding NTCs one at a time and noting the improvement, up to 13 NTCs. Table 5 shows that the residual error decreases as more NTCs are added. From 6 to 13 NTCs, however, there is no significant change: the residual error stays almost the same. The largest useful number of NTCs is therefore 6. These results concur with the hard decision results of Mrutu, Sam, and Mvungi (2014c), who found that 6 is the largest number of NTCs that gives a significant improvement.
Table 5. The residual errors when using different NTCs

SNR      Addition of NTCs 1 to 13
         1       2       3       4       5       6       7       8       9       10      11      12      13
1        39,179  23,999  17,888  15,691  14,276  13,972  13,747  13,891  13,754  13,798  13,715  13,940  13,649
2        15,793  9,560   7,580   6,872   6,474   6,352   6,456   6,345   6,338   6,463   6,368   6,294   6,583
3        5,072   3,376   2,771   2,631   2,640   2,548   2,540   2,484   2,472   2,491   2,522   2,546   2,622
4        1,397   977     853     787     842     786     803     793     807     829     794     835     785
5        241     190     199     223     223     197     191     193     223     189     210     180     193
6        35      38      29      32      34      41      39      39      40      37      44      35      46
7        …       5       7       7       5       4       7       4       2       4       2       4       4
8        …       …       0       0       0       0       0       0       1       0       0       1       1
9        …       …       0       0       0       0       0       0       0       0       0       0       0
10       …       …       0       0       0       0       0       0       0       0       0       0       0
11       …       …       0       0       0       0       0       0       0       0       0       0       0
Total    61,719  38,146  29,327  26,243  24,494  23,900  23,783  23,749  23,637  23,811  23,655  23,835  23,883
Figure 5 shows the data above graphically. There are no significant changes above 6 NTCs. With 6 NTCs, the residual error approaches zero as the signal-to-noise ratio in dB increases. This indicates that even if the storage media are heavily affected by errors, there is a high probability of retrieving the stored information.
Figure 5. The behaviours of the NTCs in soft decision Viterbi decoding.
RESULT AND DISCUSSION
This section describes the improvement in error recovery for storage media in three comparisons: the normal soft and hard Viterbi algorithms (i.e. zero NTCs) against the enhanced soft Viterbi algorithm, the enhanced hard Viterbi algorithm against the enhanced soft Viterbi algorithm, and the enhanced soft Viterbi algorithm against Reed Solomon, which is commonly used in storage media. All algorithms use the same bit stream for encoding and decoding. Figure 6 shows the normal hard Viterbi, the normal soft Viterbi, the Hard Viterbi Algorithm Decoder enhanced with Non-Transmittable Codewords and the Soft Viterbi Algorithm Decoder enhanced with Non-Transmittable Codewords, both with 6 NTCs. Figure 7 shows Reed Solomon for different values of z; z = 1 was used in the comparison with the enhanced soft decision Viterbi decoding, because it gave the best performance compared to the others. At an SNR of 1 dB the signal is only slightly stronger than the noise. Even at this point, Figure 6 shows that out of one million bits the enhanced algorithm is able to recover up to 98% of the data. Overall, out of one million bits the algorithm is able to correct up to 97%. If the storage media are corrupted, there is therefore a high probability of recovering the data by using this algorithm. Figure 5 shows that when the SNR is 6 dB the algorithm corrects close to 100% of the errors. Hence the reliability of the storage media can be maintained.
Figure 6. Viterbi decoding hard, soft, enhanced hard and enhanced soft.
Figure 7. Reed Solomon algorithm for different message length.
Table 6 shows the residual errors out of one million bits sent for the normal Soft Viterbi (SV) algorithm and for the Soft Viterbi Algorithm Decoder enhanced with Non-Transmittable Codewords (SVAD-NTCs). The aim was to compare them and to see the percentage improvement in error correction for storage media when the NTCs technique is used. Table 6 therefore lists the data error recovery improvement from the normal Soft Viterbi to the enhanced Soft Viterbi and the corresponding percentage. The results show a data error recovery improvement of 128,449 bits, meaning that the enhanced Soft Viterbi algorithm with 6 NTCs reduces the residual error by 84.31% compared with the normal Soft Viterbi algorithm. This is a good indication for storage media devices.

Table 6. Percentage improvement of the SVAD-NTCs against the Normal Soft Viterbi (SV) algorithm

SNR      SV residual error    SVAD-NTCs residual      Data error recovery     Percentage (%) improvement
         after decoding       error after decoding    improvement             of SVAD-NTCs over SV
                                                      (SV vs SVAD-NTCs)
1        92,464               13,972                  78,492                  84.89
2        41,575               6,352                   35,223                  84.72
3        14,033               2,548                   11,485                  81.84
4        3,555                786                     2,769                   77.89
5        652                  197                     455                     69.79
6        66                   41                      25                      37.88
7        3                    4                       −1                      0
8        1                    0                       1                       0
9        0                    0                       0                       0
10       0                    0                       0                       0
11       0                    0                       0                       0
Total    152,349              23,900                  128,449                 84.31

Table 7 shows the residual errors out of one million bits sent for the normal Hard Viterbi (HV) algorithm and for the SVAD-NTCs. The aim was to compare them and to see the percentage improvement in error correction for storage media when the NTCs technique is used. Table 7 therefore lists the data error recovery improvement from the normal Hard Viterbi to the enhanced Soft Viterbi and the corresponding percentage. The results show a data error recovery improvement of 371,880 bits, meaning that the enhanced Soft Viterbi algorithm with 6 NTCs reduces the residual error by 94% compared with the normal Hard Viterbi algorithm. This is a good indication for storage media devices.

Table 7. Percentage improvement of the SVAD-NTCs against the Normal Hard Viterbi (HV) algorithm

SNR      HV residual error    SVAD-NTCs residual      Data error recovery     Percentage (%) improvement
         after decoding       error after decoding    improvement             of SVAD-NTCs over HV
                                                      (HV vs SVAD-NTCs)
1        181,498              13,972                  167,526                 92.3
2        115,921              6,352                   109,569                 94.5
3        60,996               2,548                   58,448                  95.8
4        26,186               786                     25,400                  97
5        8,524                197                     8,327                   97.7
6        2,096                41                      2,055                   98
7        469                  4                       465                     99.8
8        79                   0                       79                      100
9        10                   0                       10                      100
10       1                    0                       1                       100
11       0                    0                       0                       0
Total    395,780              23,900                  371,880                 94
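The improvement columns of these tables follow from two residual error counts. The helper below is a small sketch with a hypothetical name; it recomputes the totals of Tables 6 and 7 from the figures reported there.

```python
# Sketch: data error recovery improvement and its percentage,
# computed from the residual errors of a baseline and an enhanced decoder.

def improvement(baseline_errors, enhanced_errors):
    """Return (errors recovered, percentage improvement over baseline)."""
    recovered = baseline_errors - enhanced_errors
    if baseline_errors == 0:
        return recovered, 0.0           # nothing to improve on
    return recovered, round(100.0 * recovered / baseline_errors, 2)

print(improvement(152349, 23900))   # totals of Table 6 (SV vs SVAD-NTCs)
print(improvement(395780, 23900))   # totals of Table 7 (HV vs SVAD-NTCs)
print(improvement(92464, 13972))    # Table 6, SNR = 1
```

The second call returns 93.96, which Table 7 reports rounded to 94.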
Table 8 shows the residual errors out of one million bits sent for the Reed Solomon (RS) algorithm and for the SVAD-NTCs. The aim was to compare them and to see the percentage improvement in error correction for storage media when the Reed Solomon algorithm and the Soft Viterbi algorithm enhanced with the NTCs technique are used. Table 8 lists the data error recovery improvement from Reed Solomon to the enhanced Soft Viterbi and the corresponding percentage. The results show a data error recovery improvement of 192,978 bits, meaning that the enhanced Soft Viterbi algorithm with 6 NTCs reduces the residual error by 88.98% compared with the Reed Solomon algorithm. This is a substantial improvement for storage media devices.

Table 8. Percentage improvement of the SVAD-NTCs against the Reed Solomon (RS) algorithm

SNR      RS residual errors   SVAD-NTCs residual      Data error recovery     Percentage (%) improvement
         after decoding       errors after decoding   improvement             of SVAD-NTCs over RS
                                                      (RS vs SVAD-NTCs)
1        78,933               13,972                  64,961                  82.30
2        56,335               6,352                   49,983                  88.72
3        37,208               2,548                   34,660                  93.15
4        22,744               786                     21,958                  96.54
5        12,274               197                     12,077                  98.39
6        6,035                41                      5,994                   99.32
7        2,350                4                       2,346                   99.83
8        760                  0                       760                     100.00
9        212                  0                       212                     100.00
10       22                   0                       22                      100.00
11       5                    0                       5                       100.00
Total    216,878              23,900                  192,978                 88.98
Table 9 shows the residual errors out of one million bits sent for the Hard Viterbi Algorithm Decoder enhanced with Non-Transmittable Codewords (HVAD-NTCs) and for the SVAD-NTCs. The aim was to compare them and to see the percentage improvement in error correction for storage media between the HVAD-NTCs and the SVAD-NTCs. Table 9 lists the data error recovery improvement from the HVAD-NTCs to the SVAD-NTCs and the corresponding percentage. The results show a data error recovery improvement of 49,099 bits, meaning that the enhanced Soft Viterbi algorithm with 6 NTCs reduces the residual error by 67.26% compared with the Hard Viterbi algorithm enhanced with 6 NTCs. The enhanced Soft Viterbi Algorithm Decoder is therefore a clear improvement over the enhanced Hard Viterbi Algorithm Decoder and can be implemented to improve the reliability of storage media.

Table 9. Percentage improvement of the SVAD-NTCs against the HVAD-NTCs

SNR      SVAD-NTCs residual      HVAD-NTCs residual      Data error recovery     Percentage (%) improvement
         errors after decoding   errors after decoding   improvement (HVAD-NTCs  of SVAD-NTCs over
         with NTCs (6NTCs)       with NTCs (6NTCs)       vs. SVAD-NTCs)          HVAD-NTCs
1        13,972                  34,407                  20,435                  59.39
2        6,352                   20,417                  14,065                  68.89
3        2,548                   10,650                  8,102                   76.08
4        786                     4,824                   4,038                   83.71
5        197                     1,985                   1,788                   90.08
6        41                      571                     530                     92.82
7        4                       127                     123                     96.85
8        0                       16                      16                      100.00
9        0                       2                       2                       100.00
10       0                       0                       0                       NIL
11       0                       0                       0                       NIL
Total    23,900                  72,999                  49,099                  67.26
CONCLUSION
The Soft Viterbi Algorithm Decoder enhanced with Non-Transmittable Codewords showed a remarkable improvement in correcting errors in storage media. Out of one million encoded bits, the algorithm was able to correct up to 97%, which is close to 100% efficiency. At an SNR of 1 dB, where the signal is only slightly stronger than the noise, the algorithm is able to correct up to 98%, which means that there is a high probability of retrieving corrupted data from storage media. When the SNR is 6 dB and above, the algorithm corrects close to 100% of the errors. There is therefore a level of corruption up to which the algorithm recovers the data almost completely, which increases data reliability in storage media. Among all the algorithms compared, the enhanced Soft Viterbi algorithm proves to be the best. It reduces the post-decoding errors by 84.31% compared with the normal Soft Viterbi algorithm, by 88.98% compared with the Reed Solomon algorithm and by 67.26% compared with the enhanced Hard Viterbi algorithm. Further research on applying the Soft Viterbi algorithm enhanced with NTCs to storage media is encouraged, to improve data reliability.
FUNDING
This research is part of the first author's PhD work, which was supported by a grant from the Tanzania Government through the Nelson Mandela African Institution of Science and Technology.
REFERENCES
1. Andruszkiewicz, G., Davis, M. H., & Lleo, S. (2014). Estimating animal spirits: Conservative risk calculation. Quantitative Finance Letters, 2, 14–21. doi:10.1080/21649502.2014.946234
2. Becker, C. J., & Traylor, K. B. (Ed.). (2012). Error correcting Viterbi decoder. Google Patents.
3. Cartea, Á., & Jaimungal, S. (2013). Modelling asset prices for algorithmic and high-frequency trading. Applied Mathematical Finance, 20, 512–547. doi:10.1080/1350486X.2013.771515
4. Coughlin, T. M. (2015). 2014 Survey summary for storage in professional media and entertainment. SMPTE Motion Imaging Journal, 124, 19–24. doi:10.5594/j18634
5. Coughlin, T., & Handy, J. (2008). Digital storage in consumer electronic report.
6. Cover, T. M., & Thomas, J. A. (2012). Elements of information theory. Hoboken: John Wiley & Sons.
7. Hassan, K., Michael, K., & Mrutu, S. I. (2015). Forward error correction for storage media: An overview. International Journal of Computer Science and Information Security, 13, 32.
8. Jiang, Y. (2010). A practical guide to error-control coding using MATLAB. Norwood: Artech House.
9. Katta, K. (2014). Design of convolutional encoder and Viterbi decoder using MATLAB. International Journal for Research in Emerging Science and Technology, 1(7), 10–15.
10. Marazin, M., Gautier, R., & Burel, G. (2011). Blind recovery of k/n rate convolutional encoders in a noisy environment. EURASIP Journal on Wireless Communications and Networking, 2011, 1–9.
11. Mousa, A. S., Taman, A., & Mahmoud, F. (2015). Implementation of soft decision Viterbi decoder based on a digital signal processor. International Journal of Engineering Research and Technology, 4(6), 450–454.
12. Mrutu, S. I., Sam, A., & Mvungi, N. H. (2014a). Forward error correction convolutional codes for RTAs' networks: An overview. International Journal of Computer Network and Information Security, 6, 19. doi:10.5815/ijcnis
13. Mrutu, S. I., Sam, A., & Mvungi, N. H. (2014b). Trellis analysis of transmission burst errors in Viterbi decoding. International Journal of Computer Science and Information Security, 7(6), 19–27.
14. Mrutu, S. I., Sam, A., & Mvungi, N. H. (2014c). Assessment of non-transmittable codewords enhancement to Viterbi Algorithm Decoding. International Journal of Computer Science and Information Security, 12(9), 13–18.
15. Peters, E. C., Rabinowitz, S., & Jacobs, H. R. (Eds.). (2006). Computer system and process for transferring multiple high bandwidth streams of data between multiple storage units and multiple applications in a scalable and reliable manner. Google Patents.
16. Pinheiro, E., Weber, W.-D., & Barroso, L. A. (2007). Failure trends in a large disk drive population. FAST, 7, 17–23.
17. Sargent, G., Bimbot, F., & Vincent, E. (2011). A regularity-constrained Viterbi algorithm and its application to the structural segmentation of songs. In International Society for Music Information Retrieval Conference (ISMIR), inria-00616274.
18. Sklar, B. (2001). Digital communications. Upper Saddle River, NJ: Prentice Hall.
19. Sood, A., & Sah, N. (2014). Implementation of forward error correction technique using Convolutional Encoding with Viterbi Decoding. International Journal of Engineering and Technical Research, 2(5), 121–124.
20. Takano, W., Ishikawa, J., & Nakamura, Y. (2015). Using a human action database to recognize actions in monocular image sequences: Recovering human whole body configurations. Advanced Robotics, 29, 771–784. doi:10.1080/01691864.2014.996604
21. Titsias, M. K., Holmes, C. C., & Yau, C. (2016). Statistical inference in hidden Markov models using k-segment constraints. Journal of the American Statistical Association, 111, 200–215. doi:10.1080/01621459.2014.998762
22. Wicker, S. B. (1995). Error control systems for digital communication and storage (vol. 1). Englewood Cliffs: Prentice Hall.
SECTION 3
LINEAR CODES: CYCLIC AND CONSTACYCLIC CODES
CHAPTER
6
The Structure of One Weight Linear and Cyclic Codes Over $\mathbb{Z}_2^r \times (\mathbb{Z}_2 + u\mathbb{Z}_2)^s$
Ismail Aydoğdu
Department of Mathematics, Faculty of Arts and Sciences, Yıldız Technical University, İstanbul, Turkey
ABSTRACT
Inspired by the $\mathbb{Z}_2\mathbb{Z}_4$-additive codes, linear codes over $\mathbb{Z}_2^r \times (\mathbb{Z}_2 + u\mathbb{Z}_2)^s$ have been introduced by Aydogdu et al. more recently. Although these families of codes are similar to each other, linear codes over $\mathbb{Z}_2^r \times (\mathbb{Z}_2 + u\mathbb{Z}_2)^s$ have some advantages compared to $\mathbb{Z}_2\mathbb{Z}_4$-additive codes. A code is called constant weight (one weight) if all the nonzero codewords have the same weight. It is well known that constant weight or one weight codes have many important applications. In this paper, we study the structure of one weight $\mathbb{Z}_2\mathbb{Z}_2[u]$-linear and cyclic codes. We classify one weight $\mathbb{Z}_2\mathbb{Z}_2[u]$-cyclic codes and also give some illustrative examples.

Citation: AYDOGDU, Ismail. "The structure of one weight linear and cyclic codes over $\mathbb{Z}_2^r \times (\mathbb{Z}_2 + u\mathbb{Z}_2)^s$". An International Journal of Optimization and Control: Theories & Applications (IJOCTA), [S.l.], v. 8, n. 1, p. 92-101, dec. 2017. ISSN 2146-5703. http://dx.doi.org/10.11121/ijocta.01.2018.00512
Copyright: © This work is licensed under a Creative Commons Attribution 4.0 International License. The authors retain ownership of the copyright for their article, but they allow anyone to download, reuse, reprint, modify, distribute, and/or copy articles in IJOCTA, so long as the original authors and source are credited. To see the complete license contents, please visit http://creativecommons.org/licenses/by/4.0/
INTRODUCTION
In algebraic coding theory, the most important class of codes is the family of linear codes. A linear code of length n is a subspace C of a vector space $\mathbb{F}_q^n$, where $\mathbb{F}_q$ is a finite field of size q. When q = 2 we have linear codes over $\mathbb{F}_2$, which are called binary codes. Binary linear codes have a very special and important place among all the finite field codes because of their easy implementation and applications. Beginning with a remarkable paper by Hammons et al. [1], interest in codes over a variety of rings has increased. Such studies motivate researchers to work on different rings and even on other algebraic structures such as groups or modules. A $\mathbb{Z}_4$-submodule of $\mathbb{Z}_4^n$ is called a quaternary linear code. The structures of binary linear codes and quaternary linear codes have been studied in detail for the last two decades. The reader can see some of them in [2–4]. In 2010, Borges et al. introduced in [5] a new class of error correcting codes over $\mathbb{Z}_2^\alpha \times \mathbb{Z}_4^\beta$, called additive codes, that generalizes the class of binary linear codes and the class of quaternary linear codes. A $\mathbb{Z}_2\mathbb{Z}_4$-additive code C is defined to be a subgroup of $\mathbb{Z}_2^\alpha \times \mathbb{Z}_4^\beta$, where α + 2β = n. If β = 0 then $\mathbb{Z}_2\mathbb{Z}_4$-additive codes are just binary linear codes, and if α = 0 then $\mathbb{Z}_2\mathbb{Z}_4$-additive codes are the quaternary linear codes over $\mathbb{Z}_4$. $\mathbb{Z}_2\mathbb{Z}_4$-additive codes were generalized to $\mathbb{Z}_2\mathbb{Z}_{2^s}$-additive codes in 2013 by Aydogdu and Siap in [6], and recently this generalization has been extended to $\mathbb{Z}_{p^r}\mathbb{Z}_{p^s}$-additive codes, for a prime p, by the same authors in [7]. Later, in 2014, cyclic codes over $\mathbb{Z}_2^\alpha \times \mathbb{Z}_4^\beta$ were introduced in [8], and more recently, in [9], one weight codes over such a mixed alphabet have been studied. A code C is said to be a one weight code if all the nonzero codewords in C have the same Hamming weight, where the Hamming weight of a string is the number of symbols that are different from the zero symbol of the alphabet used.
In [10], Carlet determined one weight linear codes over $\mathbb{Z}_4$, and in [11], Wood studied linear one weight codes over $\mathbb{Z}_m$. Constant weight codes are very useful in a variety of applications such as data storage, fault-tolerant circuit design and computing, pattern generation for circuit testing, identification coding, and optical overlay networks [12]. Moreover, the reader can find other applications of constant weight codes: determining the zero error decision feedback capacity of discrete memoryless channels in [13], multiple access communications and spherical codes for modulation in [14, 15], DNA codes in [16, 17], and powerline communications and frequency hopping in [18].
Another important ring of four elements, other than the ring $\mathbb{Z}_4$, is the ring $\mathbb{Z}_2 + u\mathbb{Z}_2 = R = \{0, 1, u, 1 + u\}$ where $u^2 = 0$. For some of the work done in this direction we refer the reader to [19–21]. It has been shown that linear and cyclic codes over this ring have advantages compared to the ring $\mathbb{Z}_4$. For example, the finite field GF(2) is a subring of the ring R, so factorization over GF(2) is still valid over the ring R. Also, the Gray image of any linear code over R is always a binary linear code, which is not always the case for $\mathbb{Z}_4$. In this work, we are interested in studying one weight codes over $\mathbb{Z}_2^r \times R^s$. This family of codes consists of special subsets of $\mathbb{Z}_2^r \times R^s$ all of whose nonzero codewords have the same weight. Since the structure of one weight binary linear codes was well classified by Bonisoli [22], we obtain some results for $\mathbb{Z}_2\mathbb{Z}_2[u]$-linear codes that coincide with the results in [22], we classify cyclic codes over $\mathbb{Z}_2^r \times R^s$, and we give some examples of one weight linear and cyclic codes. Furthermore, we look at the Gray (binary) images of one weight cyclic codes over $\mathbb{Z}_2^r \times R^s$ and we determine their parameters.
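PRELIMINARIES
Let $R = \mathbb{Z}_2 + u\mathbb{Z}_2 = \{0, 1, u, 1 + u\}$ be the four element ring with $u^2 = 0$. It is easily seen that the ring $\mathbb{Z}_2$ is a subring of the ring R. Then let us define the set
$$\mathbb{Z}_2\mathbb{Z}_2[u] = \{(a, b) \mid a \in \mathbb{Z}_2 \text{ and } b \in R\}.$$
But we have a problem here, because the set $\mathbb{Z}_2\mathbb{Z}_2[u]$ is not well-defined with respect to the usual multiplication by u ∈ R. So we must define a new multiplication on $\mathbb{Z}_2\mathbb{Z}_2[u]$ to make this set an R-module. Now define the mapping
$$\eta : R \to \mathbb{Z}_2, \qquad \eta(p + uq) = p,$$
which means η(0) = 0, η(1) = 1, η(u) = 0 and η(1 + u) = 1. It can be easily shown that η is a ring homomorphism. Furthermore, for any element e ∈ R, we can also define a scalar multiplication on $\mathbb{Z}_2\mathbb{Z}_2[u]$ as follows:
$$e(a, b) = (\eta(e)a, eb).$$
This multiplication can be extended to $\mathbb{Z}_2^r \times R^s$ for e ∈ R and $v = (a_0, a_1, \ldots, a_{r-1}, b_0, b_1, \ldots, b_{s-1}) \in \mathbb{Z}_2^r \times R^s$ as
$$ev = (\eta(e)a_0, \eta(e)a_1, \ldots, \eta(e)a_{r-1}, eb_0, eb_1, \ldots, eb_{s-1}).$$
Lemma 1. $\mathbb{Z}_2^r \times R^s$ is an R-module under the multiplication defined above.
Definition 1. A non-empty subset C of $\mathbb{Z}_2^r \times R^s$ is called a $\mathbb{Z}_2\mathbb{Z}_2[u]$-linear code if it is an R-submodule of $\mathbb{Z}_2^r \times R^s$.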
Now, take any element a ∈ R; then there exist unique p, q ∈ $\mathbb{Z}_2$ such that a = p + uq. Also note that the ring R is isomorphic to $\mathbb{Z}_2^2$ as an additive group. Therefore, any $\mathbb{Z}_2\mathbb{Z}_2[u]$-linear code C is isomorphic to an abelian group of the form $\mathbb{Z}_2^{k_0 + k_2} \times \mathbb{Z}_2^{2k_1}$, where $k_0$, $k_2$ and $k_1$ are nonnegative integers. Now define the following sets.

where, if $\langle b \rangle = R^s$, then b is called free over $R^s$.
Therefore, denote the dimensions of $C_0$, $C_1$ and $C_s^F$ as $k_0$, $k_2$ and $k_1$ respectively. Under these parameters, we say that such a $\mathbb{Z}_2\mathbb{Z}_2[u]$-linear code C is of type $(r, s; k_0, k_1, k_2)$. $\mathbb{Z}_2\mathbb{Z}_2[u]$-linear codes can be considered as binary codes under a special Gray map. For $(x, y) \in \mathbb{Z}_2^r \times R^s$, where $(x, y) = (x_0, x_1, \ldots, x_{r-1}, y_0, y_1, \ldots, y_{s-1})$ and $y_i = p_i + uq_i$, the Gray map $\Phi : \mathbb{Z}_2^r \times R^s \to \mathbb{Z}_2^n$ is defined as follows,

(1)

where n = r + 2s.
The Hamming distance between two strings x and y of the same length over a finite alphabet Σ, denoted by d(x, y), is defined as the number of positions at which these two strings differ. The Hamming weight of a string x over an alphabet Σ is defined as the number of nonzero symbols in the string. More formally, the Hamming weight of a string is $wt(x) = |\{i : x_i \neq 0\}|$. Also note that wt(x − y) = d(x, y). The minimum distance of a linear code C, denoted by d(C), is defined by
$$d(C) = \min\{d(c_1, c_2) \mid c_1, c_2 \in C,\ c_1 \neq c_2\}.$$
The Lee distance for codes over R is the Lee weight of their differences, where the Lee weights of the elements of R are defined as $wt_L(0) = 0$, $wt_L(1) = 1$, $wt_L(u) = 2$ and $wt_L(1 + u) = 1$. The Gray map defined above is a distance preserving map which transforms the Lee distance in $\mathbb{Z}_2^r \times R^s$ to the Hamming distance in $\mathbb{Z}_2^n$. Furthermore, for any $\mathbb{Z}_2\mathbb{Z}_2[u]$-linear code C, we have that Φ(C) is a binary linear code as well. This property is not valid for the $\mathbb{Z}_2\mathbb{Z}_4$-additive codes. And also, we define
$$wt(v) = wt_H(v_1) + wt_L(v_2),$$
where $v = (v_1, v_2) \in \mathbb{Z}_2^r \times R^s$, $wt_H(v_1)$ is the Hamming weight of $v_1$ and $wt_L(v_2)$ is the Lee weight of $v_2$. If C is a $\mathbb{Z}_2\mathbb{Z}_2[u]$-linear code of type $(r, s; k_0, k_1, k_2)$ then the binary image Φ(C) is a binary linear code of length n = r + 2s and size $2^{k_0 + 2k_1 + k_2}$. It is also called a $\mathbb{Z}_2\mathbb{Z}_2[u]$-linear code. Now, let $v = (a_0, \ldots, a_{r-1}, b_0, \ldots, b_{s-1})$, $w = (d_0, \ldots, d_{r-1}, e_0, \ldots, e_{s-1}) \in \mathbb{Z}_2^r \times R^s$ be any two elements. Then we can define the inner product as
$$\langle v, w \rangle = u \sum_{i=0}^{r-1} a_i d_i + \sum_{j=0}^{s-1} b_j e_j \in R.$$
According to this inner product, the dual linear code $C^\perp$ of a $\mathbb{Z}_2\mathbb{Z}_2[u]$-linear code C is also defined in the usual way,
$$C^\perp = \{w \in \mathbb{Z}_2^r \times R^s \mid \langle v, w \rangle = 0 \text{ for all } v \in C\}.$$
Hence, if C is a $\mathbb{Z}_2\mathbb{Z}_2[u]$-linear code, then $C^\perp$ is also a $\mathbb{Z}_2\mathbb{Z}_2[u]$-linear code.
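The Lee weights and a Gray map of the kind described above can be sketched as follows. The coordinate ordering used here, which sends p + uq to the pair (q, p ⊕ q) and lists all q coordinates before all p ⊕ q coordinates, is an assumption: it is one ordering consistent with the stated Lee weights, and the published map may arrange the coordinates differently. Elements of R are represented as pairs (p, q).

```python
from itertools import product

# Elements of R = Z2 + uZ2 as pairs (p, q) standing for p + u*q.
R = [(0, 0), (1, 0), (0, 1), (1, 1)]
# Lee weights as stated in the text: 0 -> 0, 1 -> 1, u -> 2, 1+u -> 1.
LEE = {(0, 0): 0, (1, 0): 1, (0, 1): 2, (1, 1): 1}

def weight(x, y):
    """wt(v) = Hamming weight of the binary part + Lee weight of the R part."""
    return sum(x) + sum(LEE[e] for e in y)

def gray(x, y):
    """Assumed Gray map: (x, y) -> (x, q_0..q_{s-1}, p_0^q_0..p_{s-1}^q_{s-1})."""
    qs = [q for _, q in y]
    pq = [p ^ q for p, q in y]
    return list(x) + qs + pq

# The map sends the mixed weight to the Hamming weight on Z2^(r+2s).
r, s = 2, 2
for x in product((0, 1), repeat=r):
    for y in product(R, repeat=s):
        assert sum(gray(x, y)) == weight(x, y)

print(gray((1, 1), [(1, 1), (1, 1)]))
```

Because this map is additive, preserving weights is enough to preserve distances between any two vectors.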
The standard forms of the generator and parity-check matrices of a $\mathbb{Z}_2\mathbb{Z}_2[u]$-linear code C are given as follows.
Theorem 1. [23] Let C be a $\mathbb{Z}_2\mathbb{Z}_2[u]$-linear code of type $(r, s; k_0, k_1, k_2)$. Then the standard forms of the generator and the parity-check matrices of C are:

where $A$, $A_1$, $B_1$, $B_2$, $D$, $S$ and $T$ are matrices over $\mathbb{Z}_2$.
Therefore, we can conclude the following corollary.
Corollary 1. If C is a $\mathbb{Z}_2\mathbb{Z}_2[u]$-linear code of type $(r, s; k_0, k_1, k_2)$ then $C^\perp$ is of type $(r, s; r - k_0, s - k_1 - k_2, k_2)$.
The weight enumerator of any $\mathbb{Z}_2\mathbb{Z}_2[u]$-linear code C of type $(r, s; k_0, k_1, k_2)$ is defined as
$$W_C(x, y) = \sum_{c \in C} x^{n - wt(c)} y^{wt(c)},$$
where n = r + 2s. Moreover, the MacWilliams relations for codes over $\mathbb{Z}_2^r \times R^s$ can be given as follows.
Theorem 2. [23] Let C be a $\mathbb{Z}_2\mathbb{Z}_2[u]$-linear code. The relation between the weight enumerators of C and its dual is
$$W_{C^\perp}(x, y) = \frac{1}{|C|} W_C(x + y, x - y).$$
We have given some information about the general concept of codes over $\mathbb{Z}_2^r \times R^s$. To help the reader follow the paper, we give the following example.
Example 1. Let C be a linear code over $\mathbb{Z}_2^3 \times R^4$ with the following generator matrix.

We will find the standard form of the generator matrix of C. Then, using this standard form, we find the generator matrix of the dual code $C^\perp$ and determine the types of both C and its dual. Now, applying elementary row operations to the above generator matrix, we have the standard form as follows.
Since G is in the standard form we can write this matrix as

Hence, with the help of Theorem 1 the parity-check matrix of C is

Therefore,
• C is of type (3, 4; 2, 1, 0) and has $2^2 4^1 = 16$ codewords.
• $C^\perp$ is of type (3, 4; 1, 3, 0) and has $2^1 4^3 = 128$ codewords.
• $W_C(x, y) = x^{11} + 3x^8y^3 + x^7y^4 + 2x^6y^5 + 4x^5y^6 + x^4y^7 + 2x^3y^8 + 2x^2y^9$.
• The Gray image Φ(C) of C is a [11, 4, 3] binary linear code.
• Φ($C^\perp$) is a [11, 7, 2] binary linear code.
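The figures of Example 1 can be cross-checked with the MacWilliams relation of Theorem 2. The sketch below takes the weight distribution read off the enumerator $W_C(x, y)$ above and expands $W_C(x + y, x - y)/|C|$ coefficient by coefficient; the function names are illustrative.

```python
from fractions import Fraction
from math import comb

def krawtchouk(n, j, w):
    """Coefficient of x^(n-j) y^j in (x + y)^(n-w) (x - y)^w."""
    return sum((-1) ** i * comb(w, i) * comb(n - w, j - i) for i in range(j + 1))

def dual_distribution(A, n):
    """Weight distribution of the dual from W_C(x + y, x - y) / |C|."""
    size = sum(A.values())  # |C| is the sum of all coefficients
    return [Fraction(sum(a * krawtchouk(n, j, w) for w, a in A.items()), size)
            for j in range(n + 1)]

# Weight distribution of C taken from W_C(x, y) in Example 1 (n = 11).
A = {0: 1, 3: 3, 4: 1, 5: 2, 6: 4, 7: 1, 8: 2, 9: 2}
B = dual_distribution(A, 11)

print(sum(A.values()), sum(B))                 # sizes of C and of its dual
print(min(j for j in range(1, 12) if B[j]))    # minimum distance of the dual
```

The dual distribution sums to 128 and its smallest nonzero weight is 2, in agreement with the type of $C^\perp$ and with the [11, 7, 2] parameters of its Gray image.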
THE STRUCTURE OF ONE WEIGHT $\mathbb{Z}_2\mathbb{Z}_2[u]$-LINEAR CODES
In this part of the paper, we study the structure of one weight codes over $\mathbb{Z}_2^r \times R^s$. Since the binary (Gray) images of $\mathbb{Z}_2\mathbb{Z}_2[u]$-linear codes are always linear, our results about one weight $\mathbb{Z}_2\mathbb{Z}_2[u]$-linear codes will coincide with the results of the paper [22]. So, in this section we prepare for Section 4 and give some fundamental definitions and illustrative examples of one weight $\mathbb{Z}_2\mathbb{Z}_2[u]$-linear codes.
Definition 2. Let C be a $\mathbb{Z}_2\mathbb{Z}_2[u]$-linear code. C is called a one (constant) weight code if all of its nonzero codewords have the same weight. Furthermore, if this weight is m then C is called a code with weight m.
Definition 3. Let $c_1$, $c_2$, $e_1$, $e_2$ be any four distinct codewords of a $\mathbb{Z}_2\mathbb{Z}_2[u]$-linear code C. If the distance between $c_1$ and $e_1$ is equal to the distance between $c_2$ and $e_2$, that is, $d(c_1, e_1) = d(c_2, e_2)$, then C is said to be equidistant.
Theorem 3. [22] Let C be an [n, k] linear code over $\mathbb{F}_q$ with all nonzero codewords of the same weight. Assume that C is nonzero and no column of a generator matrix is identically zero. Then C is equivalent to the λ-fold replication of a simplex (i.e., dual of the Hamming) code.
Corollary 2. Let C be an equidistant $\mathbb{Z}_2\mathbb{Z}_2[u]$-linear code with distance m. Then C is a one weight code with weight m. Moreover, the binary image Φ(C) of C is also a one weight code with weight m.
Example 2. It is worth noting that the dual of a one weight code is not necessarily a one weight code. Let C be a $\mathbb{Z}_2\mathbb{Z}_2[u]$-linear code of type (2, 2; 0, 1, 0) with C = ⟨(1, 1|1 + u, 1 + u)⟩. Then C = {(0, 0|0, 0), (1, 1|1 + u, 1 + u), (1, 1|1, 1), (0, 0|u, u)} and C is a one weight code with weight m = 4. On the other hand, the dual code C⊥ is generated by ⟨(1, 0|u, 0), (0, 1|u, 0), (0, 0|1, 1)⟩ and is of type (2, 2; 2, 1, 0). But d(C⊥) = 2 and C⊥ is not a one weight code.
Remark 1. The dual code for length greater than 3 is never a one weight code.
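Example 2 is small enough to check by brute force. The sketch below is only an illustration, not code from the paper: it stores an element a + ub of Z2 + uZ2 as the pair (a, b), assumes the usual Lee weights 0, 1, 2, 1 for 0, 1, u, 1 + u, and lets a scalar act on the binary coordinates through its residue modulo u. The helper names (`span`, `scale`, `weight`) are hypothetical.

```python
from itertools import product

# An element a + u*b of R = Z2 + uZ2 (u^2 = 0) is stored as the pair (a, b).
R = [(0, 0), (1, 0), (0, 1), (1, 1)]
LEE = {(0, 0): 0, (1, 0): 1, (0, 1): 2, (1, 1): 1}   # Lee weights of 0, 1, u, 1+u

def r_mul(x, y):
    """(a + ub)(c + ud) = ac + u(ad + bc) in R."""
    return ((x[0] * y[0]) % 2, (x[0] * y[1] + x[1] * y[0]) % 2)

def scale(c, word, r):
    """Multiply a word (r binary entries | entries in R) by the scalar c in R.
    Only c mod u acts on the binary part, so u annihilates it (assumed action)."""
    return tuple((c[0] * a) % 2 for a in word[:r]) + \
           tuple(r_mul(c, b) for b in word[r:])

def add(v, w, r):
    """Coordinate-wise sum of two words."""
    return tuple((a + b) % 2 for a, b in zip(v[:r], w[:r])) + \
           tuple(((a[0] + b[0]) % 2, (a[1] + b[1]) % 2)
                 for a, b in zip(v[r:], w[r:]))

def span(generators, r):
    """All R-linear combinations of the given generators."""
    words = set()
    for coeffs in product(R, repeat=len(generators)):
        total = scale((0, 0), generators[0], r)       # the zero word
        for c, g in zip(coeffs, generators):
            total = add(total, scale(c, g, r), r)
        words.add(total)
    return words

def weight(word, r):
    """Hamming weight on the binary part plus Lee weight on the R part."""
    return sum(word[:r]) + sum(LEE[b] for b in word[r:])

ZERO, ONE, U, ONE_U = (0, 0), (1, 0), (0, 1), (1, 1)
code = span([(1, 1, ONE_U, ONE_U)], r=2)
dual = span([(1, 0, U, ZERO), (0, 1, U, ZERO), (0, 0, ONE, ONE)], r=2)

code_weights = sorted(weight(w, 2) for w in code if weight(w, 2) > 0)
dual_weights = sorted({weight(w, 2) for w in dual if weight(w, 2) > 0})
print(len(code), code_weights)   # size of C and the weights of its nonzero words
print(len(dual), dual_weights)   # size of the dual and its distinct nonzero weights
```

The three generators of the dual are taken from the example as given; the sketch does not re-derive them from an inner product.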
Example 3. Let C be a $\mathbb{Z}_2\mathbb{Z}_2[u]$-linear code with the standard form of the generator matrix G. Then C is of type (3, 2; 1, 1, 0) and is a one weight code with weight 4. Furthermore, Φ(C) is a binary linear code with parameters [7, 3, 4]. Here, note that the binary image of C is the binary simplex code of length 7, which is the dual of the [7, 4, 3] Hamming code.
Now, we give a result which provides a construction of one weight codes over $\mathbb{Z}_2^r \times R^s$.
Corollary 3. Let C be a one weight $\mathbb{Z}_2\mathbb{Z}_2[u]$-linear code of type (r, s; k0, k1, k2) and weight m. Then a one weight code of type (γr, γs; k0, k1, k2) with weight γm exists, where γ is a positive integer.
Definition 4. Let C be a $\mathbb{Z}_2\mathbb{Z}_2[u]$-linear code. Let Cr (respectively Cs) be the punctured code of C obtained by deleting the coordinates outside r (respectively s). If C = Cr × Cs then C is called separable.
Corollary 4. There do not exist separable one weight $\mathbb{Z}_2\mathbb{Z}_2[u]$-linear codes.
Proof. Since Φ(Cr × Cs) = Φ(Cr) × Φ(Cs), the proof is obvious.
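Corollary 5. Let C be a $\mathbb{Z}_2\mathbb{Z}_2[u]$-linear code of type (r, s; k0, k1, k2) with no all-zero columns in the generator matrix of C. Then the sum of the weights of all code words of C is equal to $(r + 2s)\,2^{k_0+2k_1+k_2-1}$.
Proof. From [22], the sum of the weights of a binary linear [n, k] code is $n2^{k-1}$. Since Φ(C) is a binary linear code of length r + 2s and dimension $k_0 + 2k_1 + k_2$, the sum of the weights of all code words of C is $(r + 2s)\,2^{k_0+2k_1+k_2-1}$.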
Corollary 6. Let C be a one weight $\mathbb{Z}_2\mathbb{Z}_2[u]$-linear code of type (r, s; k0, k1, k2) and weight m. If there are no zero columns in the generator matrix of C, then:
i) $m = \alpha 2^{(k_0+2k_1+k_2)-1}$, where α is a positive integer satisfying $r + 2s = \alpha(2^{k_0+2k_1+k_2} - 1)$. In addition, if m is an odd integer, then r is also odd.
ii) d(C⊥) ≥ 2. Also, d(C⊥) ≥ 3 if and only if α = 1.
iii) For α = 1, if |C| ≥ 4 then d(C⊥) = 3.
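The relations in Corollary 6 i) are easy to evaluate for a given type. The following is a minimal sketch with hypothetical function names; it writes k = k0 + 2k1 + k2 and n = r + 2s for the dimension and length of the binary Gray image, and uses the total weight $n2^{k-1}$ quoted from [22] in the proof of Corollary 5.

```python
def one_weight_parameters(r, s, k0, k1, k2):
    """Return (alpha, m) as predicted by Corollary 6 i), or None when
    r + 2s is not a positive multiple of 2**k - 1 with k = k0 + 2*k1 + k2."""
    k = k0 + 2 * k1 + k2
    if k == 0:
        return None                     # the zero code has no nonzero weight
    n = r + 2 * s                       # length of the binary Gray image
    alpha, remainder = divmod(n, 2 ** k - 1)
    if remainder or alpha == 0:
        return None
    return alpha, alpha * 2 ** (k - 1)

def total_weight(r, s, k0, k1, k2):
    """Sum of the weights of all code words: n * 2**(k - 1)."""
    k = k0 + 2 * k1 + k2
    return (r + 2 * s) * 2 ** (k - 1)

# Type (2, 2; 0, 1, 0) from Example 2 and type (3, 2; 1, 1, 0) from Example 3.
print(one_weight_parameters(2, 2, 0, 1, 0))
print(one_weight_parameters(3, 2, 1, 1, 0))
```

For Example 2 the three nonzero code words of weight 4 add up to 12, which is the value `total_weight(2, 2, 0, 1, 0)` returns.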
From the above corollary, if C is a one weight $\mathbb{Z}_2\mathbb{Z}_2[u]$-linear code of type (r, s; k0, k1, k2) and weight m, then there is a positive integer α such that $m = \alpha 2^{(k_0+2k_1+k_2)-1}$, so the minimum distance of a one weight $\mathbb{Z}_2\mathbb{Z}_2[u]$-linear code must be even whenever $k_0 + 2k_1 + k_2 \ge 2$. In the following, we characterize the structure of one weight $\mathbb{Z}_2\mathbb{Z}_2[u]$-linear codes.
Theorem 4. Let C be a one weight $\mathbb{Z}_2\mathbb{Z}_2[u]$-linear code over $\mathbb{Z}_2^r \times R^s$ with generator matrix G and weight m.
i) If v = (a|b) is any row of G, where $a = (a_0, \dots, a_{r-1}) \in \mathbb{Z}_2^r$ and $b = (b_0, \dots, b_{s-1}) \in R^s$, then the number of units (1 or 1 + u) in b is either zero or m/2.
ii) If v = (a|b) and w = (c|d) are two distinct rows of G, where b and d are free over $R^s$, then the coordinate positions where b has units (1 or 1 + u) are the same as the coordinate positions where d has units.
iii) If v = (a|b) and w = (c|d) are two distinct rows of G, where b and d are free over $R^s$, then $|\{j : b_j = d_j = 1 \text{ or } 1 + u\}| = |\{j : b_j = 1, d_j = 1 + u \text{ or } b_j = 1 + u, d_j = 1\}| = m/4$.
Proof. i) The weight of v = (a|b) is wt(v) = wtH(a) + wtL(b) = m. Since C is linear, uv = (0|ub) is also in C. If ub = 0, then b does not contain units. If ub ≠ 0, then wt(uv) = m = 0 + wtL(ub) and therefore $wt_L(ub) = 2|\{j : b_j = 1 \text{ or } 1 + u\}| = m$. Hence, the number of units in b is m/2.
ii) Multiplying v and w by u we have uv = (0|ub) and uw = (0|ud). If v and w have units in the same coordinate positions then we get uv + uw = 0. So, assume that they have some units in different coordinates. Since C is a one weight code with weight m, if uv + uw ≠ 0 then the number of coordinates where b and d have units in different places must be m/2. To obtain this, the number of coordinates where {bj = 1 = dj} and {bj = 1 + u = dj} has to be m/4, and in all other coordinates where {bj = 1 or 1 + u} we need {dj = 0 or u}, and also in all other coordinates where {bj = 0 or u} we need {dj = 1 or 1 + u}. Hence, consider the vector v + (1 + u)w. This vector has the same weight as v + w in the first r coordinates, but in the last s coordinates it has u's in the coordinates where {bj = 1 = dj} and {bj = 1 + u = dj}, so its weight is greater than m. This contradiction gives the result.
iii) Let x = v + w and y = v + (1 + u)w be two vectors in C. The binary parts of these two vectors are the same, and for the coordinates over $R^s$ we know from ii) that v and w have units in the same coordinate positions, and for all other coordinates in $R^s$ the values of x and y are the same. Therefore, the sum of the weights of the units in v must be the same in x and y. So, they also have the same number of coordinates with u. But this is only possible if $|\{j : b_j = d_j = 1 \text{ or } 1 + u\}| = |\{j : b_j = 1, d_j = 1 + u \text{ or } b_j = 1 + u, d_j = 1\}|$. We also know from i) that the number of units in v is m/2, so we have the result.
Theorem 5. Let C be a one weight code of type (r, s; k0, k1, k2). Then k1 ≤ 1 and C has a generator matrix in one of two standard forms, according to whether k1 = 0 or k1 = 1, where s, a, b1, b2 are vectors over $\mathbb{Z}_2$.
Proof. From Theorem 4 ii), we know that any two distinct free vectors have their units in the same coordinate positions. So, if we add the first free row of the generator matrix to the other rows, we have only one free row in the generator matrix. Hence, k1 ≤ 1, and considering this and using the standard form of the generator matrix for a $\mathbb{Z}_2\mathbb{Z}_2[u]$-linear code C given in Theorem 1, we have the result.
ONE WEIGHT $\mathbb{Z}_2\mathbb{Z}_2[u]$-CYCLIC CODES
In this section, we study the structure of one weight $\mathbb{Z}_2\mathbb{Z}_2[u]$-cyclic codes. We begin with some fundamental definitions and theorems about $\mathbb{Z}_2\mathbb{Z}_2[u]$-cyclic codes; this material is given with details in [24].
Definition 5. An R-submodule C of $\mathbb{Z}_2^r \times R^s$ is called a $\mathbb{Z}_2\mathbb{Z}_2[u]$-cyclic code if for any code word $v = (a_0, a_1, \dots, a_{r-1}, b_0, b_1, \dots, b_{s-1}) \in C$, its cyclic shift $T(v) = (a_{r-1}, a_0, \dots, a_{r-2}, b_{s-1}, b_0, \dots, b_{s-2})$ is also in C.
Any code word $c = (a_0, a_1, \dots, a_{r-1}, b_0, b_1, \dots, b_{s-1}) \in \mathbb{Z}_2^r \times R^s$ can be identified with a module element such that
$$c(x) = \left(a_0 + a_1x + \dots + a_{r-1}x^{r-1},\; b_0 + b_1x + \dots + b_{s-1}x^{s-1}\right) = (a(x), b(x))$$
in $R_{r,s} = \mathbb{Z}_2[x]/\langle x^r - 1\rangle \times R[x]/\langle x^s - 1\rangle$. This identification gives a one-to-one correspondence between elements in $\mathbb{Z}_2^r \times R^s$ and elements in $R_{r,s}$.
Theorem 6. [24] Let C be a $\mathbb{Z}_2\mathbb{Z}_2[u]$-cyclic code in $R_{r,s}$. Then we can identify C uniquely as C = ⟨(f(x), 0), (l(x), g(x) + ua(x))⟩, where $f(x) \mid (x^r - 1)$ (mod 2), $a(x) \mid g(x) \mid (x^s - 1)$ (mod 2), and l(x) is a binary polynomial satisfying deg(l(x)) < deg(f(x)).
Considering the theorem above, the type of C = ⟨(f(x), 0), (l(x), g(x) + ua(x))⟩ can be written in terms of the degrees of the polynomials f(x), a(x) and g(x). Let t1 = deg f(x), t2 = deg g(x) and t3 = deg a(x). Then C is of type $(r, s;\, r - t_1,\, s - t_2,\, t_2 - t_3)$ ([24]).
Corollary 7. If C is a one weight cyclic code generated by (l(x), g(x) + ua(x)) ∈ $R_{r,s}$ with weight m, then m = 2s.
Proof. We know from Theorem 5 that if C is a one weight $\mathbb{Z}_2\mathbb{Z}_2[u]$-linear code then k1, which counts the generators of the free part of the code, is less than or equal to 1. So, in the case where C is cyclic, it means that s − t2 ≤ 1, where t2 = deg g(x). Therefore we have deg g(x) = s − 1, and the polynomial g(x) + ua(x) generates the vector of length s with all unit entries. If we multiply the whole vector (of length r + s) by u, then we obtain a vector with all entries 0 in the first r coordinates and all entries u in the last s coordinates. So the weight of this vector is 2s. Hence the weight of C must be 2s.
Theorem 7. [24] Let C = ⟨(f(x), 0), (l(x), g(x) + ua(x))⟩ be a cyclic code in $R_{r,s}$ where f(x), l(x), g(x) and a(x) are as in Theorem 6 and $f(x)h_f(x) = x^r - 1$, $g(x)h_g(x) = x^s - 1$, g(x) = a(x)b(x). Let S1, S2 and S3 be the sets defined in [24]. Then S = S1 ∪ S2 ∪ S3 forms a minimal spanning set for C as an R-module.
Let C = ⟨(f(x), 0), (l(x), g(x) + ua(x))⟩ be a one weight cyclic code in $R_{r,s}$. Consider the code words (v, 0) ∈ ⟨(f(x), 0)⟩ and (w1, w2) ∈ ⟨(l(x), g(x) + ua(x))⟩. Since C is a one weight code, wt(v, 0) = wt(w1, w2). Further, since C is an R-submodule, u(w1, w2) = (0, uw2) ∈ C and wt(v, 0) = wt(0, uw2). Moreover, (v, uw2) ∈ C because of the linearity of C. But it is clear that wt(v, uw2) ≠ wt(v, 0) and wt(v, uw2) ≠ wt(0, uw2). Hence, (f(x), 0) cannot appear as a nonzero generator of a one weight code.
Now, let us suppose that C = ⟨(l(x), g(x) + ua(x))⟩ is a one weight cyclic code in $R_{r,s}$. We know from Corollary 7 that deg g(x) = s − 1, m = 2s and g(x) generates a vector of length s with all unit entries. Therefore, l(x) must also generate a vector over $\mathbb{Z}_2$ with weight s. Hence, to generate such a cyclic one weight code we have two different cases: r = s and r > s.
If r = s then, to generate a vector with weight s, the degree of l(x) must be s − 1. So, (l(x), g(x) + ua(x)) generates a code word whose binary part is the all-one vector of length s and whose remaining s entries are all units. Further, if we multiply (l(x), g(x) + ua(x)) by $h_g(x)$ we get $(h_g(x)l(x), uh_g(x)a(x))$, which generates code words of order 2. Since r = s and the degrees of the polynomials l(x) and g(x) are s − 1, we have $h_g(x) = x + 1$ and $h_g(x)l(x) = 0$. Hence, $uh_g(x)a(x)$ must generate a vector with weight 2s, i.e., $h_g(x)a(x)$ must generate a vector of length s with all unit entries. This means that
$$(x + 1)\,a(x) = \frac{x^s - 1}{x + 1}.$$
Hence we get $a(x) = \frac{x^s - 1}{(x + 1)^2}$. But, since we always assume that s is an odd integer, a(x) is not a factor of $x^s - 1$, and this contradicts the assumption $a(x) \mid (x^s - 1)$. So, we cannot allow $ua(x)h_g(x)$ to generate a nonzero vector, i.e., we must always choose a(x) = g(x) to obtain $ua(x)h_g(x) = 0$. So in the case where C is a one weight cyclic code generated by (l(x), g(x) + ua(x)) in $R_{r=s,s}$, the only possibility is that C is a $\mathbb{Z}_2\mathbb{Z}_2[u]$-cyclic code of type (s, s; 0, 1, 0) with weight m = 2s.
In the second case we have r > s. We know that C is a one weight cyclic code with weight m = 2s and that $g(x) = \frac{x^s - 1}{x + 1}$ generates a vector with exactly s nonzero entries, all of them units. Let v = (v1, v2) be the code word of C with v1 = l(x) and v2 = g(x) + ua(x), so that all s entries of v2 are units. Since C is an R-submodule we can multiply v by u; then uv = (0, uv2), where every entry of uv2 is equal to u. Let w = (w1, w2) be another codeword of C generated by $(h_g(x)l(x), ua(x)h_g(x))$. The entries of w2 are 0 or u; let p denote the number of entries of w2 equal to u. Since C is a one weight code of weight 2s, wt(w1) = 2s − 2p. Since w + uv must be a codeword in C, we have w + uv = (w1, w2 + uv2), where w2 + uv2 has exactly s − p entries equal to u. Therefore, wt(w + uv) = 2s − 2p + 2s − 2p = 4s − 4p, and since C is a one weight code with m = 2s, we get 4s − 4p = 2s, that is, s = 2p. But this contradicts our assumption that s is an odd integer. Consequently, for r > s and g(x) ≠ 0 there is no one weight $\mathbb{Z}_2\mathbb{Z}_2[u]$-cyclic code. In light of this discussion, we have proved the following theorem.
Theorem 8. Let C be a $\mathbb{Z}_2\mathbb{Z}_2[u]$-cyclic code in $R_{r=s,s}$ generated by (l(x), g(x) + ua(x)) with deg l(x) = deg a(x) = deg g(x) = s − 1. Then C is a one weight cyclic code of type (r, s; 0, 1, 0) with weight m = 2s. Furthermore, there does not exist any other one weight $\mathbb{Z}_2\mathbb{Z}_2[u]$-cyclic code with g(x) ≠ 0.
Example 4. Let C = ⟨(l(x), g(x) + ua(x))⟩ be a cyclic code in $R_{7,7}$ with $l(x) = g(x) = a(x) = 1 + x + x^2 + x^3 + x^4 + x^5 + x^6$. Hence, C is a one weight code with weight m = 14, generated by the single row
$$G = \left(1\;1\;1\;1\;1\;1\;1 \mid 1{+}u\;\;1{+}u\;\;1{+}u\;\;1{+}u\;\;1{+}u\;\;1{+}u\;\;1{+}u\right).$$
Furthermore, the generator matrix of the dual cyclic code C⊥ shows that C⊥ is not a one weight code. However, it is a $\mathbb{Z}_2\mathbb{Z}_2[u]$-cyclic code of type (7, 7; 7, 6, 0) and its image under the Gray map is a binary cyclic code with the parameters [21, 19, 2].
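The one weight property in Example 4 can be confirmed by generating the code explicitly. The sketch below is an illustration under stated assumptions rather than the author's construction: a + ub is stored as the pair (a, b), the cyclic shift acts separately on the binary block and on the block over R, and the hypothetical helper `cyclic_code` closes a single generator under addition, scalar multiplication and the shift. The same routine is also applied to the generator with l(x) = a(x) = 1 + x + x^2 + x^4 and g(x) = 0 in R_{7,7}.

```python
# a + u*b in R = Z2 + uZ2 is stored as the pair (a, b)
SCALARS = [(0, 0), (1, 0), (0, 1), (1, 1)]
LEE = {(0, 0): 0, (1, 0): 1, (0, 1): 2, (1, 1): 1}

def r_mul(x, y):
    return ((x[0] * y[0]) % 2, (x[0] * y[1] + x[1] * y[0]) % 2)

def scale(c, word, r):
    """Scalar multiplication; u annihilates the binary block (assumed action)."""
    return tuple((c[0] * a) % 2 for a in word[:r]) + \
           tuple(r_mul(c, b) for b in word[r:])

def add(v, w, r):
    return tuple((a + b) % 2 for a, b in zip(v[:r], w[:r])) + \
           tuple(((a[0] + b[0]) % 2, (a[1] + b[1]) % 2)
                 for a, b in zip(v[r:], w[r:]))

def shift(word, r):
    """Cyclic shift applied separately to the binary block and the R block."""
    a, b = word[:r], word[r:]
    return a[-1:] + a[:-1] + b[-1:] + b[:-1]

def cyclic_code(generator, r):
    """Smallest set containing the generator that is closed under addition,
    multiplication by elements of R and the cyclic shift."""
    code = {scale((0, 0), generator, r)}
    todo = [generator]
    while todo:
        w = todo.pop()
        if w in code:
            continue
        code.add(w)
        todo.append(shift(w, r))
        todo.extend(scale(c, w, r) for c in SCALARS)
        todo.extend(add(w, x, r) for x in code)
    return code

def weight(word, r):
    return sum(word[:r]) + sum(LEE[b] for b in word[r:])

def make_generator(l, g, a):
    """Word (l(x) | g(x) + u a(x)) from coefficient lists of equal length."""
    return tuple(l) + tuple(zip(g, a))

ones = [1] * 7
example4 = cyclic_code(make_generator(ones, ones, ones), 7)
l7 = [1, 1, 1, 0, 1, 0, 0]                       # 1 + x + x^2 + x^4
second = cyclic_code(make_generator(l7, [0] * 7, l7), 7)

weights4 = {weight(w, 7) for w in example4 if weight(w, 7)}
weights_second = {weight(w, 7) for w in second if weight(w, 7)}
print(len(example4), weights4)
print(len(second), weights_second)
```

The closure is computed naively, which is adequate for codes of this size; larger examples would call for polynomial arithmetic instead.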
EXAMPLES OF ONE WEIGHT $\mathbb{Z}_2\mathbb{Z}_2[u]$-CYCLIC CODES
In this part of the paper, we give some examples of one weight $\mathbb{Z}_2\mathbb{Z}_2[u]$-cyclic codes. Furthermore, we look at their binary images under the Gray map defined in (1). According to the results of [22], any binary linear one (constant) weight code with no zero column is equivalent to a λ-fold replication of a simplex code. Hence, the examples of one weight $\mathbb{Z}_2\mathbb{Z}_2[u]$-cyclic codes given in this section are all λ-fold replications of a simplex code $S_k$. Therefore, any such code has length $n = \lambda(2^k - 1)$, dimension k and weight (or minimum distance) $d = \lambda 2^{k-1}$. It is also well known that a binary simplex code is cyclic in the usual sense. If the minimum distance of a code C attains the maximum possible value for its length and dimension, then C is called optimal (distance optimal) or a good parameter code. For example, the binary image of the dual code in Example 4 has the parameters [21, 19, 2], which are optimal.
Let C be a $\mathbb{Z}_2\mathbb{Z}_2[u]$-linear code with minimum distance d = 2t + 1; then we say C is a t-error correcting code. Since the Gray map preserves distances, Φ(C) is also a t-error correcting code of length r + 2s over $\mathbb{Z}_2$. Since |Φ(C)| = |C|, we can write a sphere packing bound for a $\mathbb{Z}_2\mathbb{Z}_2[u]$-linear code C. With the help of the usual sphere packing bound over $\mathbb{Z}_2$,
$$|C| \sum_{i=0}^{t} \binom{n}{i} \le 2^{n},$$
we have
$$|C| \sum_{i=0}^{t} \binom{r + 2s}{i} \le 2^{r + 2s}.$$
If C attains the sphere packing bound above then it is called a perfect code. Let C be a $\mathbb{Z}_2\mathbb{Z}_2[u]$-linear code of type (3, 2; 2, 1, 0) given in standard form; here r + 2s = 7 and |C| = 16. It is easy to check that C attains the sphere packing bound with t = 1, since $16(1 + 7) = 2^7$, so C is a perfect code. Moreover, the dual code C⊥ of C is a one weight $\mathbb{Z}_2\mathbb{Z}_2[u]$-linear code with weight m = 4.
The Plotkin bound for a code C over $\mathbb{Z}_2^n$ with minimum distance d, where d is even and n < 2d, is given by
$$|C| \le 2\left\lfloor \frac{d}{2d - n} \right\rfloor.$$
If C ⊆ $\mathbb{Z}_2^n$ attains the Plotkin bound then C is also an equidistant code [25]. Since any one weight binary linear code is a λ-fold replication of a simplex code and has the parameters $[\lambda(2^k - 1), k, \lambda 2^{k-1}]$, the binary image of any one weight $\mathbb{Z}_2\mathbb{Z}_2[u]$-cyclic code always meets the Plotkin bound. Finally, we give the following examples of one weight $\mathbb{Z}_2\mathbb{Z}_2[u]$-cyclic codes and determine the parameters of their binary images. Further, we list some of them in Table 1.
Table 1. Some examples of one weight $\mathbb{Z}_2\mathbb{Z}_2[u]$-cyclic codes
Example 5. Let C be a $\mathbb{Z}_2\mathbb{Z}_2[u]$-cyclic code in $R_{15,15}$ generated by (l(x), g(x) + ua(x)). Then C is a one weight code with weight m = 24.
Furthermore, the binary image Φ(C) of C is a [45, 4, 24] code, which is a binary optimal code [26]. Also, it is important to note that Φ(C) is a 3-fold replication of the simplex code S4 of length 15.
Example 6. The $\mathbb{Z}_2\mathbb{Z}_2[u]$-cyclic code C = ⟨(l(x), g(x) + ua(x))⟩ in $R_{9,9}$ is a one weight code with m = 18, where $l(x) = g(x) = a(x) = 1 + x + x^2 + x^3 + x^4 + x^5 + x^6 + x^7 + x^8$. The Gray image of C is a 9-fold replication of the simplex code $S_2$ of length 3 with the optimal parameters [27, 2, 18].
Example 7. Let C = ⟨(l(x), g(x) + ua(x))⟩, with $l(x) = a(x) = 1 + x + x^2 + x^4$ and $g(x) = x^7 - 1$, be a cyclic code in $R_{7,7}$. Then C is a one weight code with m = 12 and Φ(C) is a 3-fold replication of the simplex code $S_3$ of length 7 with the parameters [21, 3, 12].
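The parameters quoted in Examples 5 to 7, and the two bounds used in this section, can be checked with a few lines of arithmetic. This is a minimal sketch with hypothetical function names; the Plotkin bound is taken in the common binary form $|C| \le 2\lfloor d/(2d - n)\rfloor$ for even d with 2d > n, which is an assumption about the form intended in the text.

```python
from math import comb

def simplex_replication(lam, k):
    """[n, k, d] of the lam-fold replication of the binary simplex code S_k."""
    return lam * (2 ** k - 1), k, lam * 2 ** (k - 1)

def meets_sphere_packing(n, size, d):
    """True if a binary code with the given length, number of code words and
    odd minimum distance d = 2t + 1 meets the sphere packing bound exactly."""
    t = (d - 1) // 2
    return size * sum(comb(n, i) for i in range(t + 1)) == 2 ** n

def plotkin_bound(n, d):
    """Binary Plotkin bound 2 * floor(d / (2d - n)) for even d with 2d > n."""
    if d % 2 or 2 * d <= n:
        raise ValueError("this form of the bound needs d even and 2d > n")
    return 2 * (d // (2 * d - n))

for lam, k in [(3, 4), (9, 2), (3, 3)]:          # Examples 5, 6 and 7
    n, dim, d = simplex_replication(lam, k)
    print((n, dim, d), 2 ** dim == plotkin_bound(n, d))

# Gray image of the perfect code of type (3, 2; 2, 1, 0): length 7, 16 words, d = 3
print(meets_sphere_packing(7, 16, 3))
```

Each replicated simplex code has $2d - n = \lambda$, so the bound evaluates to $2 \cdot 2^{k-1} = 2^k$, the size of the code.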
CONCLUSION
In this paper, we study the one weight linear and cyclic codes over $\mathbb{Z}_2^r \times (\mathbb{Z}_2 + u\mathbb{Z}_2)^s$, where $u^2 = 0$. We also classify one weight $\mathbb{Z}_2\mathbb{Z}_2[u]$-cyclic codes and present some illustrative examples. We further list some binary linear codes with their parameters which are derived from the Gray images of one weight $\mathbb{Z}_2\mathbb{Z}_2[u]$-cyclic codes.
ACKNOWLEDGEMENT
The author would like to thank the anonymous reviewers for their careful checking of the paper and the valuable comments and suggestions.
REFERENCES
1. Hammons, A. R., Kumar, V., Calderbank, A. R., Sloane, N. J. A. and Solé, P. (1994). The $\mathbb{Z}_4$-linearity of Kerdock, Preparata, Goethals, and related codes. IEEE Trans. Inf. Theory, 40, 301-319.
2. Calderbank, A. R. and Sloane, N. J. A. (1995). Modular and p-adic cyclic codes. Designs, Codes and Cryptography, 6, 21-35.
3. Greferath, M. and Schmidt, S. E. (1999). Gray isometries for finite chain rings. IEEE Trans. Info. Theory, 45(7), 2522-2524.
4. Honold, T. and Landjev, I. (1998). Linear codes over finite chain rings. In Optimal Codes and Related Topics, 116-126, Sozopol, Bulgaria.
5. Borges, J., Fernández-Córdoba, C., Pujol, J., Rifà, J. and Villanueva, M. (2010). Z2Z4-linear codes: Generator Matrices and Duality. Designs, Codes and Cryptography, 54(2), 167-179.
6. Aydogdu, I. and Siap, I. (2013). The structure of $\mathbb{Z}_2\mathbb{Z}_{2^s}$-additive codes: bounds on the minimum distance. Applied Mathematics and Information Sciences (AMIS), 7(6), 2271-2278.
7. Aydogdu, I. and Siap, I. (2015). On $\mathbb{Z}_{p^r}\mathbb{Z}_{p^s}$-additive codes. Linear and Multilinear Algebra, 63(10), 2089-2102.
8. Abualrub, T., Siap, I. and Aydin, N. (2014). Z2Z4-additive cyclic codes. IEEE Trans. Inf. Theory, 60(3), 1508-1514.
9. Dougherty, S. T., Liu, H. and Yu, L. (2016). One weight $\mathbb{Z}_2\mathbb{Z}_4$ additive codes. Applicable Algebra in Engineering, Communication and Computing, 27, 123-138.
10. Carlet, C. (2000). One-weight $\mathbb{Z}_4$-linear codes. In: Buchmann, J., Høholdt, T., Stichtenoth, H., Tapia Recillas, H. (eds.) Coding Theory, Cryptography and Related Areas, 57-72. Springer, Berlin.
11. Wood, J. A. (2002). The structure of linear codes of constant weight. Trans. Am. Math. Soc., 354, 1007-1026.
12. Skachek, V. and Schouhamer Immink, K. A. (2014). Constant weight codes: An approach based on Knuth’s balancing method. IEEE Journal on Selected Areas in Communications, 32(5), 909-918.
13. Telatar, I. E. and Gallager, R. G. (1990). Zero error decision feedback capacity of discrete memoryless channels. In BILCON-90: Proceedings of 1990 Bilkent International Conference on New Trends in Communication, Control and Signal Processing, E. Arikan, Ed. Elsevier, 228-233.
14. Dyachkov, A. G. (1984). Random constant composition codes for multiple access channels. Problems Control Inform. Theory/Problemy Upravlen. Teor. Inform., 13(6), 357-369.
15. Ericson, T. and Zinoviev, V. (1995). Spherical codes generated by binary partitions of symmetric point sets. IEEE Trans. Inform. Theory, 41(1), 107-129.
16. King, O. D. (2003). Bounds for DNA codes with constant GC-content. Electron. J. Combin., 10(1), Research Paper 33 (electronic).
17. Milenkovic, O. and Kashyap, N. (2006). On the design of codes for DNA computing. Ser. Lecture Notes in Computer Science, vol. 3969. Berlin: Springer-Verlag, 100-119.
18. Colbourn, C. J., Kløve, T. and Ling, A. C. H. (2004). Permutation arrays for powerline communication and mutually orthogonal Latin squares. IEEE Trans. Inform. Theory, 50(6), 1289-1291.
19. Abualrub, T. and Siap, I. (2007). Cyclic codes over the rings $\mathbb{Z}_2 + u\mathbb{Z}_2$ and $\mathbb{Z}_2 + u\mathbb{Z}_2 + u^2\mathbb{Z}_2$. Designs, Codes and Cryptography, 42(3), 273-287.
20. Al-Ashker, M. and Hamoudeh, M. (2011). Cyclic codes over $\mathbb{Z}_2 + u\mathbb{Z}_2 + u^2\mathbb{Z}_2 + \dots + u^{k-1}\mathbb{Z}_2$. Turk. J. Math., 35, 737-749.
21. Dinh, H. Q. (2010). Constacyclic codes of length $p^s$ over $\mathbb{F}_{p^m} + u\mathbb{F}_{p^m}$. Journal of Algebra, 324, 940-950.
22. Bonisoli, A. (1984). Every equidistant linear code is a sequence of dual Hamming codes. Ars Combin., 18, 181-186.
23. Aydogdu, I., Abualrub, T. and Siap, I. (2015). On $\mathbb{Z}_2\mathbb{Z}_2[u]$-additive codes. International Journal of Computer Mathematics, 92(9), 1806-1814.
24. Aydogdu, I., Abualrub, T. and Siap, I. (2017). $\mathbb{Z}_2\mathbb{Z}_2[u]$-cyclic and constacyclic codes. IEEE Trans. Inf. Theory, 63(8), 4883-4893.
25. Van Lint, J. H. (1992). Introduction to Coding Theory. Springer-Verlag, New York.
26. Grassl, M. Code tables: Bounds on the parameters of various types of codes. Online database. Available at http://www.codetables.de/
CHAPTER
7
(1 + u)-Constacyclic Codes over Z 4 + uZ 4
Haifeng Yu¹, Yu Wang¹ and Minjia Shi²
¹ Department of Mathematics and Physics, Hefei University, Hefei, China.
² School of Mathematical Sciences, Anhui University, Hefei, China.
ABSTRACT
Constacyclic codes are an important class of linear codes in coding theory. Many optimal linear codes are directly derived from constacyclic codes. In this paper, (1 + u)-constacyclic codes over $\mathbb{Z}_4 + u\mathbb{Z}_4$ of any length are studied. A new Gray map between $\mathbb{Z}_4 + u\mathbb{Z}_4$ and $\mathbb{Z}_4^4$ is defined. By means of this map, it is shown that the $\mathbb{Z}_4$ Gray image of a (1 + u)-constacyclic code of length n over $\mathbb{Z}_4 + u\mathbb{Z}_4$ is a cyclic code over $\mathbb{Z}_4$ of length 4n. Furthermore, by combining the classical Gray map between $\mathbb{Z}_4$ and $\mathbb{F}_2^2$, it is shown that the binary image of a (1 + u)-constacyclic code of length n over $\mathbb{Z}_4 + u\mathbb{Z}_4$ is a distance invariant binary quasi-cyclic code of index 4 and length 8n. Examples of good binary codes are constructed to illustrate the application of this class of codes.
Keywords: Cyclic code, Constacyclic code, Quasi-cyclic code, Gray map
Citation: Yu, H., Wang, Y. & Shi, M. “(1 + u)-Constacyclic codes over Z4 + uZ4”. SpringerPlus 5, 1325 (2016). https://doi.org/10.1186/s40064-016-2717-0
Copyright: © 2016 The Author(s). This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
BACKGROUND
Recently, several new classes of rings have been studied in connection with coding theory. Many optimal binary linear codes have been obtained from codes over these rings via some Gray map. In Yildiz and Karadenniz (2010a, b), the authors introduced the ring $\mathbb{F}_2 + u\mathbb{F}_2 + v\mathbb{F}_2 + uv\mathbb{F}_2$ and discussed linear and self-dual codes over $\mathbb{F}_2 + u\mathbb{F}_2 + v\mathbb{F}_2 + uv\mathbb{F}_2$. Later, the structures of cyclic codes and (1 + u)-constacyclic codes over $\mathbb{F}_2 + u\mathbb{F}_2 + v\mathbb{F}_2 + uv\mathbb{F}_2$ were studied and many optimal binary linear codes were constructed from such codes in Yildiz and Karadenniz (2011a, b). More generally, cyclic codes over the ring $R_k$ were investigated in Dougherty et al. (2012). Although the rings mentioned above are not finite chain rings, they have rich algebraic structures and produce binary codes with large automorphism groups and new binary self-dual codes. For this reason, linear codes over such non-chain rings have received increasing attention (see Dougherty et al. 2012; Kai et al. 2012; Shi 2014; Shi et al. 2012; Siap et al. 2012; Zhu and Wang 2011). More recently, linear codes over the non-chain ring $\mathbb{Z}_4 + u\mathbb{Z}_4$, where $u^2 = 0$, have been explored in Yildiz and Karadenniz (2014). The authors defined a linear Gray map from $\mathbb{Z}_4 + u\mathbb{Z}_4$ to $\mathbb{Z}_4^2$ and a non-linear Gray map from $\mathbb{Z}_4 + u\mathbb{Z}_4$ to $(\mathbb{F}_2 + u\mathbb{F}_2)^2$, and used them to successfully construct formally self-dual codes over $\mathbb{Z}_4$ and good non-linear codes over $\mathbb{F}_2 + u\mathbb{F}_2$. In Yildiz and Aydin (2014), the structure of cyclic codes over $\mathbb{Z}_4 + u\mathbb{Z}_4$ was determined and many new linear codes over $\mathbb{Z}_4$ were obtained from them. Motivated by the works in Yildiz and Aydin (2014) and Yildiz and Karadenniz (2014), we focus on constacyclic codes over $\mathbb{Z}_4 + u\mathbb{Z}_4$ and intend to construct good binary codes from such codes. The ring $\mathbb{Z}_4 + u\mathbb{Z}_4$ is a finite commutative ring with characteristic 4, where $u^2 = 0$.
The purpose of this paper is to investigate a class of constacyclic codes over this ring, that is, (1 + u)-constacyclic codes over $\mathbb{Z}_4 + u\mathbb{Z}_4$. Constacyclic codes over finite commutative rings were first introduced by Wolfmann (1999), where it was proved that the binary image of a linear negacyclic code over $\mathbb{Z}_4$ is a binary cyclic code (not necessarily linear). In Kai et al. (2012), the authors introduced a composite Gray map from $\mathbb{F}_2 + u\mathbb{F}_2 + v\mathbb{F}_2 + uv\mathbb{F}_2$ to $\mathbb{F}_2^4$ and proved that the image of a (1 + u)-constacyclic code of length n over $\mathbb{F}_2 + u\mathbb{F}_2 + v\mathbb{F}_2 + uv\mathbb{F}_2$ under the Gray map is a distance invariant binary quasi-cyclic code of index 2 and length 4n. It is known that the structure of $\mathbb{Z}_4 + u\mathbb{Z}_4$ is similar to that of $\mathbb{F}_2 + u\mathbb{F}_2 + v\mathbb{F}_2 + uv\mathbb{F}_2$. It is natural to ask if there exists a Gray map such that the Gray image of a linear code over $\mathbb{Z}_4 + u\mathbb{Z}_4$ has good structure. For this, we introduce a new Gray map from $\mathbb{Z}_4 + u\mathbb{Z}_4$ to $\mathbb{Z}_4^4$, and explore the images of (1 + u)-constacyclic codes over $\mathbb{Z}_4 + u\mathbb{Z}_4$ under this Gray map.
(1 + u)-CONSTACYCLIC CODES OVER Z4 + uZ4
Throughout this paper, let R denote the ring $\mathbb{Z}_4 + u\mathbb{Z}_4$ with $u^2 = 0$. Any element in R can be written as a + bu, where a, b ∊ $\mathbb{Z}_4$. The element a + bu is a unit in R if and only if a is a unit in $\mathbb{Z}_4$. The ring R is a local Frobenius ring, but not a finite chain ring. It has a total of 7 ideals, given by {0}, ⟨2u⟩, ⟨u⟩, ⟨2⟩, ⟨2 + u⟩, ⟨2, u⟩ and R.
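The basic facts about R stated here can be confirmed by enumeration. The sketch below is an illustration only: it stores a + ub as the pair (a, b) and collects every ideal generated by at most two elements, on the assumption (which holds for this ring) that two generators always suffice.

```python
from itertools import product

# a + u*b in R = Z4 + uZ4 (u^2 = 0) is stored as the pair (a, b)
R = [(a, b) for a in range(4) for b in range(4)]

def add(x, y):
    return ((x[0] + y[0]) % 4, (x[1] + y[1]) % 4)

def mul(x, y):
    """(a + ub)(c + ud) = ac + u(ad + bc) over Z4."""
    return ((x[0] * y[0]) % 4, (x[0] * y[1] + x[1] * y[0]) % 4)

def ideal(generators):
    """Ideal generated by the given elements: all sums r1*g1 + ... + rk*gk."""
    elements = set()
    for coeffs in product(R, repeat=len(generators)):
        total = (0, 0)
        for r, g in zip(coeffs, generators):
            total = add(total, mul(r, g))
        elements.add(total)
    return frozenset(elements)

units = [x for x in R if any(mul(x, y) == (1, 0) for y in R)]
ideals = {ideal((x, y)) for x in R for y in R}
sizes = sorted(len(i) for i in ideals)
non_units = frozenset(x for x in R if x not in units)

print(len(units))            # number of units of R
print(len(ideals), sizes)    # number of ideals and their sizes
print(non_units in ideals)   # the non-units form an ideal, so R is local
```

Three different ideals have four elements, so the ideals are not totally ordered by inclusion, which is the reason R is not a chain ring.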
A code over R of length n is a nonempty subset of $R^n$, and a code is linear over R of length n if it is an R-submodule of $R^n$. For some fixed unit λ ∊ R, the λ-constacyclic shift τ on $R^n$ is the shift $\tau(c_0, c_1, \dots, c_{n-1}) = (\lambda c_{n-1}, c_0, \dots, c_{n-2})$. A linear code C of length n over R is λ-constacyclic if the code is invariant under the λ-constacyclic shift τ. We identify the codeword $c = (c_0, c_1, \dots, c_{n-1})$ with its polynomial representation $c(x) = c_0 + c_1x + \dots + c_{n-1}x^{n-1}$. Then xc(x) corresponds to the λ-constacyclic shift of c(x) in the ring $R[x]/(x^n - \lambda)$. Thus, λ-constacyclic codes of length n over R can be identified with ideals in the ring $R[x]/(x^n - \lambda)$. From the above discussion, we have the following result.
Proposition 1. A subset C of $R^n$ is a linear cyclic code of length n if and only if C is an ideal of $A_n = R[x]/(x^n - 1)$. A subset C of $R^n$ is a linear (1 + u)-constacyclic code of length n over R if and only if C is an ideal of $B_n = R[x]/(x^n - 1 - u)$.
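The identification of the constacyclic shift with multiplication by x can be illustrated directly. The following is a minimal sketch with hypothetical helper names; elements a + ub are stored as pairs (a, b) and polynomials as coefficient tuples of length n.

```python
LAMBDA = (1, 1)   # the unit 1 + u, stored as (a, b) for a + u*b

def add(x, y):
    return ((x[0] + y[0]) % 4, (x[1] + y[1]) % 4)

def mul(x, y):
    return ((x[0] * y[0]) % 4, (x[0] * y[1] + x[1] * y[0]) % 4)

def tau(c, lam=LAMBDA):
    """Constacyclic shift (c0, ..., c_{n-1}) -> (lam*c_{n-1}, c0, ..., c_{n-2})."""
    return (mul(lam, c[-1]),) + tuple(c[:-1])

def poly_mul_mod(f, g, lam=LAMBDA):
    """Product of f and g in R[x]/(x^n - lam), as coefficient tuples of length n."""
    n = len(f)
    out = [(0, 0)] * n
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            term = mul(fi, gj)
            if i + j >= n:              # reduce with x^n = lam
                term = mul(term, lam)
            out[(i + j) % n] = add(out[(i + j) % n], term)
    return tuple(out)

c = ((1, 2), (3, 0), (2, 1))            # (1 + 2u) + 3x + (2 + u)x^2
x = ((0, 0), (1, 0), (0, 0))            # the polynomial x
shifted = tau(c)
product_with_x = poly_mul_mod(x, c)

# (1 + u)^4 = 1 in R, so applying the shift 4n times returns the word
word = c
for _ in range(4 * len(c)):
    word = tau(word)
print(shifted, product_with_x == shifted, word == c)
```

Since τ applied n times multiplies a word by 1 + u, and 1 + u has multiplicative order 4, the shift has order dividing 4n, which matches the length 4n of the cyclic image over Z4 described in the abstract.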
Now, we determine a set of generators of (1 + u)-constacyclic codes of arbitrary length over R. We begin by recalling a unique set of generators for cyclic codes over $\mathbb{Z}_4$.
Lemma 2. [cf. Abualrub and Siap (2006), Theorem 6] Let C be a cyclic code of length n over $\mathbb{Z}_4$. Then
1. If n is odd then C = ⟨g(x), 2a(x)⟩ = ⟨g(x) + 2a(x)⟩, where g(x), a(x) are binary polynomials with $a(x) \mid g(x) \mid (x^n - 1)$ mod 2.
2. If n is even then
2.1. If g(x) = a(x), then C = ⟨g(x) + 2p(x)⟩, where g(x), p(x) are binary polynomials with $g(x) \mid (x^n - 1)$ mod 2 and g(x) satisfies the further condition stated in the cited theorem;
2.2. otherwise, C = ⟨g(x) + 2p(x), 2a(x)⟩, where g(x), a(x) and p(x) are binary polynomials with $a(x) \mid g(x) \mid (x^n - 1)$ mod 2, a(x) satisfies the further condition stated in the cited theorem, and deg g(x) > deg a(x) > deg p(x).
For a linear code C of length n over R, we can define two linear codes of length n over $\mathbb{Z}_4$ as follows:
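1. The torsion code Tor(C) = {x ∊ $\mathbb{Z}_4^n$ | ux ∊ C},
2. The residue code Res(C) = {x ∊ $\mathbb{Z}_4^n$ | x + uy ∊ C for some y ∊ $\mathbb{Z}_4^n$}.
Consider the homomorphism φ: R → $\mathbb{Z}_4$ defined by φ(a + ub) = a. The map φ extends naturally to a ring homomorphism φ from $R[x]/(x^n - 1 - u)$ to $\mathbb{Z}_4[x]/(x^n - 1)$ defined by $\varphi(c_0 + c_1x + \dots + c_{n-1}x^{n-1}) = \varphi(c_0) + \varphi(c_1)x + \dots + \varphi(c_{n-1})x^{n-1}$. Restricting φ to a code C over R, we obtain a homomorphism φ: C → Res(C), φ(a + ub) = a, where a, b ∈ $\mathbb{Z}_4^n$.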
We can easily obtain that Ker φ ≅ Tor(C) and φ(C) = Res(C). By the first isomorphism theorem for finite groups, we have |C| = |Tor(C)||Res(C)|. It is obvious that the image of C under the map φ is a cyclic code of length n over $\mathbb{Z}_4$. Combining the above discussion with Lemma 2, we can obtain a set of generators for (1 + u)-constacyclic codes of length n over R.
Theorem 3. Let C be a (1 + u)-constacyclic code of length n over R. Then
1. If n is odd then C = ⟨g1(x) + 2a1(x) + ub(x), u(g2(x) + 2a2(x))⟩, where b(x) is a polynomial in $\mathbb{Z}_4[x]$ and gi(x), ai(x) are binary polynomials with $a_i(x) \mid g_i(x) \mid (x^n - 1)$ mod 2 for i = 1, 2.
2. If n is even then
2.1. If gi(x) = ai(x), then C = ⟨g1(x) + 2p1(x) + ud(x), u(g2(x) + 2p2(x))⟩, where d(x) is a polynomial in $\mathbb{Z}_4[x]$, gi(x), pi(x) are binary polynomials with $g_i(x) \mid (x^n - 1)$ mod 2 and gi(x) satisfies the corresponding condition of Lemma 2, for i = 1, 2;
2.2. otherwise, C = ⟨g1(x) + 2p1(x) + ue1(x), 2a1(x) + ue2(x), ug2(x) + 2up2(x), 2a2(x)⟩, where ei(x) is a polynomial in $\mathbb{Z}_4[x]$, and gi(x), ai(x), pi(x) are binary polynomials with $a_i(x) \mid g_i(x) \mid (x^n - 1)$ mod 2, ai(x) satisfies the corresponding condition of Lemma 2, and deg gi(x) > deg ai(x) > deg pi(x), for i = 1, 2.
Proof. We only give the proof of part (1); the proof of part (2) is similar. Assume that n is odd. Let C be a (1 + u)-constacyclic code of length n over R. Then the image of C under the map φ is Res(C), which is a cyclic code of length n over $\mathbb{Z}_4$. By Lemma 2, we have φ(C) = ⟨g1(x) + 2a1(x)⟩, where g1(x), a1(x) are binary polynomials with $a_1(x) \mid g_1(x) \mid (x^n - 1)$ mod 2. Thus, there exists b(x) ∊ $\mathbb{Z}_4[x]$ such that g1(x) + 2a1(x) + ub(x) ∊ C.
Furthermore, note that Ker φ is u times a cyclic code of length n over $\mathbb{Z}_4$, so Ker φ = u⟨g2(x) + 2a2(x)⟩, where g2(x), a2(x) are binary polynomials with $a_2(x) \mid g_2(x) \mid (x^n - 1)$ mod 2. Hence, ⟨g1(x) + 2a1(x) + ub(x), u(g2(x) + 2a2(x))⟩ ⊆ C.
On the other hand, for any f(x) = f1(x) + uf2(x) ∊ C, where fi(x) ∊ $\mathbb{Z}_4[x]$ for i = 1, 2, it is obvious that f1(x) ∊ φ(C). Hence, f1(x) = m(x)(g1(x) + 2a1(x)) for some m(x) ∊ $\mathbb{Z}_4[x]$, and f(x) − m(x)(g1(x) + 2a1(x) + ub(x)) = u(f2(x) − m(x)b(x)). Since u(f2(x) − m(x)b(x)) ∊ Ker φ, we have
f(x) ∈ ⟨g1(x) + 2a1(x) + ub(x), u(g2(x) + 2a2(x))⟩.
This shows that C ⊆ ⟨g1(x) + 2a1(x) + ub(x), u(g2(x) + 2a2(x))⟩.
Thus, C = ⟨g1(x) + 2a1(x) + ub(x), u(g2(x) + 2a2(x))⟩. □
GRAY IMAGES OF (1 + u)-CONSTACYCLIC CODES OVER R
A new Gray map
Recall that the Gray map ϕ1 from $\mathbb{Z}_4$ to $\mathbb{F}_2^2$ is defined as ϕ1(z) = (q, q + r) where z = r + 2q with r, q ∊ $\mathbb{F}_2$. The map ϕ1 can be extended to $\mathbb{Z}_4^n$ as follows:
$$\phi_1(z_0, z_1, \dots, z_{n-1}) = (q_0, \dots, q_{n-1},\; q_0 + r_0, \dots, q_{n-1} + r_{n-1}),$$
where $z_i = r_i + 2q_i$ with $r_i, q_i \in \mathbb{F}_2$ for 0 ≤ i ≤ n − 1. It is known that ϕ1 is a distance-preserving map from $\mathbb{Z}_4^n$ (Lee distance) to $\mathbb{F}_2^{2n}$ (Hamming distance).
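Now, we define a map ϕ2 from $R^n$ to $\mathbb{Z}_4^{4n}$. First note that each element c ∊ R can be expressed as c = a + ub, where a, b ∊ $\mathbb{Z}_4$. The map ϕ2 is defined as ϕ2(c) = (b + 3a, b + 2a, b + a, b).
Clearly, this map can also be extended to $R^n$ as follows:
$$\phi_2(c_0, \dots, c_{n-1}) = (b_0 + 3a_0, \dots, b_{n-1} + 3a_{n-1},\; b_0 + 2a_0, \dots, b_{n-1} + 2a_{n-1},\; b_0 + a_0, \dots, b_{n-1} + a_{n-1},\; b_0, \dots, b_{n-1}),$$
where $c_i = a_i + ub_i$ with $a_i, b_i \in \mathbb{Z}_4$ for 0 ≤ i ≤ n − 1.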
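Both Gray maps are short enough to write down as code. The sketch below is an illustration, not the authors' implementation: it assumes that each map is extended to words block by block (all first components, then all second components, and so on), stores a + ub as the pair (a, b), and checks by exhaustion that ϕ1 turns Lee distance into Hamming distance on words of length 2.

```python
from itertools import product

def lee_weight(z):
    """Lee weight on Z4: 0, 1, 2, 1 for 0, 1, 2, 3."""
    z %= 4
    return min(z, 4 - z)

def phi1(word):
    """Gray map Z4^n -> F2^(2n): z = r + 2q goes to (q, q + r), block by block."""
    q = [z // 2 for z in word]
    r = [z % 2 for z in word]
    return tuple(q) + tuple((qi + ri) % 2 for qi, ri in zip(q, r))

def phi2(word):
    """Map R^n -> Z4^(4n) sending a + ub to (b + 3a, b + 2a, b + a, b), block by block."""
    return tuple((b + k * a) % 4 for k in (3, 2, 1, 0) for (a, b) in word)

def lee_distance(x, y):
    return sum(lee_weight(xi - yi) for xi, yi in zip(x, y))

def hamming_distance(x, y):
    return sum(xi != yi for xi, yi in zip(x, y))

words = list(product(range(4), repeat=2))
phi1_is_isometry = all(
    lee_distance(x, y) == hamming_distance(phi1(x), phi1(y))
    for x in words for y in words
)
print(phi1((0, 1, 2, 3)))
print(phi2(((1, 0), (0, 1))))
print(phi1_is_isometry)
```

Composing the two functions as `phi1(phi2(word))` gives a binary word of length 8n, which is the length of the binary image discussed in the abstract.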
It is well known that the homogeneous weight has many applications for codes over finite rings and provides a good metric for the underlying ring in constructing superior codes. Next, we define a homogeneous weight on R. We first recall the definition of the homogeneous weight on a finite ring K.
Definition 4. [cf. Greferath and O’Sullivan (2004), Definition 1.1] A real-valued function w on the finite ring K is called a (left) homogeneous weight if w(0) = 0 and the following is true:
1. For all x, y ∊ K, Kx = Ky implies that w(x) = w(y) holds.
2. There exists a real number γ such that $\sum_{y \in Kx} w(y) = \gamma |Kx|$ for all x ∈ K\{0}.
A right homogeneous weight is defined accordingly. If a weight is both left homogeneous and right homogeneous, we simply call it a homogeneous weight.
For any element c = a + ub ∊ R, we assign the weight, denoted by $w_{hom}(c)$, as the Lee weight of (b + 3a, b + 2a, b + a, b), i.e., $w_{hom}(c) = w_L(b + 3a, b + 2a, b + a, b)$. By simple calculation, we can obtain the weight of any element x = a + ub ∊ R as follows:
$$w_{hom}(x) = \begin{cases} 0, & x = 0, \\ 8, & x = 2u, \\ 4, & \text{otherwise.} \end{cases}$$
It is easy to verify that the weight defined above meets the conditions of Definition 4, hence it is indeed a homogeneous weight on R. The homogeneous distance of a linear code C over R, denoted by $d_{hom}(C)$, is defined as the minimum homogeneous weight of nonzero codewords of C. It can be checked that the map ϕ2 is a distance-preserving map from $R^n$ (homogeneous distance) to $\mathbb{Z}_4^{4n}$ (Lee distance). Using the maps ϕ1 and ϕ2, we can define a composite map ϕ = ϕ1ϕ2. Thus, we have obtained three distance-preserving maps as follows:
$$\phi_1: \mathbb{Z}_4^{n} \to \mathbb{F}_2^{2n}, \qquad \phi_2: R^{n} \to \mathbb{Z}_4^{4n}, \qquad \phi: R^{n} \to \mathbb{F}_2^{8n}.$$
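The weight assigned to each element of R, and the two conditions of Definition 4, can be tabulated by direct computation. This is a minimal sketch with hypothetical names; a + ub is stored as the pair (a, b), and the weight is computed as the Lee weight of (b + 3a, b + 2a, b + a, b) exactly as defined in the text.

```python
# a + u*b in R = Z4 + uZ4 is stored as the pair (a, b)
R = [(a, b) for a in range(4) for b in range(4)]

def mul(x, y):
    return ((x[0] * y[0]) % 4, (x[0] * y[1] + x[1] * y[0]) % 4)

def lee_weight(z):
    z %= 4
    return min(z, 4 - z)

def hom_weight(c):
    """Lee weight of (b + 3a, b + 2a, b + a, b) for c = a + ub."""
    a, b = c
    return sum(lee_weight(b + k * a) for k in (3, 2, 1, 0))

table = {c: hom_weight(c) for c in R}

def principal_ideal(x):
    return frozenset(mul(r, x) for r in R)

# Condition 1: equal principal ideals give equal weights.
condition1 = all(
    table[x] == table[y]
    for x in R for y in R
    if principal_ideal(x) == principal_ideal(y)
)
# Condition 2: the average weight over Rx is the same constant for every x != 0.
averages = {
    sum(table[y] for y in principal_ideal(x)) / len(principal_ideal(x))
    for x in R if x != (0, 0)
}
print(table[(0, 2)], sorted(set(table.values())))
print(condition1, averages)
```

The only element of weight 8 is 2u, stored as `(0, 2)`; every other nonzero element has weight 4, and the common average in condition 2 comes out as γ = 4.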
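Gray images of (1 + u)-constacyclic codes
Lemma 5. Let ϕ2 be defined as above. Let τ be the (1 + u)-constacyclic shift on $R^n$ and σ be the cyclic shift on $\mathbb{Z}_4^{4n}$. Then ϕ2τ = σϕ2.
Proof. Let $c = (c_0, c_1, \dots, c_{n-1}) \in R^n$, where $c_i = a_i + ub_i$ with $a_i, b_i \in \mathbb{Z}_4$ for 0 ≤ i ≤ n − 1. From the definitions, we have
$$\phi_2(c) = (b_0 + 3a_0, \dots, b_{n-1} + 3a_{n-1},\; b_0 + 2a_0, \dots, b_{n-1} + 2a_{n-1},\; b_0 + a_0, \dots, b_{n-1} + a_{n-1},\; b_0, \dots, b_{n-1}).$$
Hence,
$$\sigma(\phi_2(c)) = (b_{n-1},\; b_0 + 3a_0, \dots, b_{n-1} + 3a_{n-1},\; b_0 + 2a_0, \dots, b_{n-1} + 2a_{n-1},\; b_0 + a_0, \dots, b_{n-1} + a_{n-1},\; b_0, \dots, b_{n-2}).$$
On the other hand, $\tau(c) = ((1 + u)c_{n-1}, c_0, \dots, c_{n-2})$ with $(1 + u)c_{n-1} = a_{n-1} + u(a_{n-1} + b_{n-1})$. Thus,
$$\phi_2(\tau(c)) = (b_{n-1}, b_0 + 3a_0, \dots, b_{n-2} + 3a_{n-2},\; b_{n-1} + 3a_{n-1}, b_0 + 2a_0, \dots, b_{n-2} + 2a_{n-2},\; b_{n-1} + 2a_{n-1}, b_0 + a_0, \dots, b_{n-2} + a_{n-2},\; b_{n-1} + a_{n-1}, b_0, \dots, b_{n-2}) = \sigma(\phi_2(c)).$$
The result follows.
Theorem 6. A linear code C of length n over R is a (1 + u)-constacyclic code if and only if ϕ2(C) is a cyclic code of length 4n over $\mathbb{Z}_4$.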
Proof. If C is a (1 + u)-constacyclic code, then using Lemma 5 we have σ(ϕ2(C)) = ϕ2(τ(C)) = ϕ2(C).
Hence, ϕ2(C) is a cyclic code of length 4n over $\mathbb{Z}_4$.
Conversely, if ϕ2(C) is a cyclic code of length 4n over $\mathbb{Z}_4$, then using Lemma 5 again we get ϕ2(τ(C)) = σ(ϕ2(C)) = ϕ2(C). Note that ϕ2 is injective, so τ(C) = C.
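The commutation relation of Lemma 5, on which Theorem 6 rests, can be tested exhaustively for short lengths. The sketch below is illustrative and assumes that ϕ2 is extended to words block by block (all entries b + 3a first, then b + 2a, then b + a, then b); with an interleaved extension the relation would not hold in this form.

```python
from itertools import product

def tau(c):
    """(1 + u)-constacyclic shift on R^n, with a + ub stored as (a, b);
    uses (1 + u)(a + ub) = a + u(a + b)."""
    a, b = c[-1]
    return ((a % 4, (a + b) % 4),) + tuple(c[:-1])

def sigma(x):
    """Cyclic shift on Z4^(4n)."""
    return x[-1:] + x[:-1]

def phi2(c):
    """Block-wise extension of a + ub -> (b + 3a, b + 2a, b + a, b)."""
    return tuple((b + k * a) % 4 for k in (3, 2, 1, 0) for (a, b) in c)

def lemma5_holds(n):
    """Check phi2(tau(c)) == sigma(phi2(c)) for every c in R^n."""
    ring = list(product(range(4), repeat=2))
    return all(phi2(tau(c)) == sigma(phi2(c)) for c in product(ring, repeat=n))

def phi2_is_injective(n):
    ring = list(product(range(4), repeat=2))
    images = {phi2(c) for c in product(ring, repeat=n)}
    return len(images) == 16 ** n

print(lemma5_holds(1), lemma5_holds(2), lemma5_holds(3))
print(phi2_is_injective(2))
```

Injectivity is the property used in the converse direction of the proof of Theorem 6; it is visible in the map itself, since the last block of ϕ2(c) gives every b and the difference of the last two blocks gives every a.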
Thus, we immediately have the following result.
Corollary 7. The image of a(1 + u)-constacyclic code of length n over R under the map ϕ2is a distance invariant cyclic code of length 4n over Z4.
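The identity ϕ2τ = σϕ2 of Lemma 5 can be sanity-checked numerically. The sketch below assumes u^2 = 0 and assumes that ϕ2 acts on a vector by concatenating the four blocks b + 3a, b + 2a, b + a, b (where a and b are the coordinate vectors of the Z4-parts and u-parts). This block ordering is an assumption chosen to match the weight formula whom(c) = wL(b + 3a, b + 2a, b + a, b); it is not quoted from the paper.

```python
import random

def times_one_plus_u(c):
    """(1 + u)(a + u*b) = a + u*(a + b), assuming u^2 = 0."""
    a, b = c
    return (a, (a + b) % 4)

def tau(word):
    """(1 + u)-constacyclic shift on R^n; entries are pairs (a, b) for a + u*b."""
    return [times_one_plus_u(word[-1])] + word[:-1]

def sigma(vec):
    """Cyclic shift: the last coordinate moves to the front."""
    return vec[-1:] + vec[:-1]

def phi2(word):
    """Assumed Gray map R^n -> Z4^{4n}: blocks (b + 3a | b + 2a | b + a | b)."""
    return [(b + k * a) % 4 for k in (3, 2, 1, 0) for (a, b) in word]

def lemma5_holds(n, trials=200, seed=0):
    """Compare phi2(tau(c)) with sigma(phi2(c)) on random words of length n."""
    rng = random.Random(seed)
    for _ in range(trials):
        word = [(rng.randrange(4), rng.randrange(4)) for _ in range(n)]
        if phi2(tau(word)) != sigma(phi2(word)):
            return False
    return True
```

A random test is of course not a proof, but it is a quick way to catch an indexing mistake in the block ordering.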
Let σ be the cyclic shift. For any positive integer s, let σs be the quasi-shift given by
σs(a(1)|a(2)| ⋯ |a(s)) = (σ(a(1))|σ(a(2))| ⋯ |σ(a(s))),
where a(1), a(2), …, a(s) ∊ F_2^{2n} and “|” denotes the usual vector concatenation. A binary quasi-cyclic code C of index s and length 2ns is a subset of (F_2^{2n})^s such that σs(C) = C.
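The quasi-shift σs simply applies the cyclic shift to each of the s blocks separately. A minimal sketch, with list-based vectors and function names chosen here for illustration:

```python
def cyclic_shift(block):
    """Move the last coordinate of a block to the front."""
    return block[-1:] + block[:-1]

def quasi_shift(vec, s):
    """Apply the cyclic shift to each of the s equal-length blocks of vec."""
    if len(vec) % s != 0:
        raise ValueError("the length of vec must be a multiple of s")
    size = len(vec) // s
    out = []
    for i in range(s):
        out.extend(cyclic_shift(vec[i * size:(i + 1) * size]))
    return out
```

For instance, with s = 2 the vector (1,0,0 | 0,1,1) is mapped to (0,1,0 | 1,0,1), and applying the quasi-shift as many times as the block length returns the original vector.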
Lemma 8. Let ϕ be defined as above and let τ be the (1 + u)-constacyclic shift on R^n. Then ϕτ = σ4ϕ.
Proof. Let r = (r0, r1, …, rn−1) ∊ R^n. Let ri = ai + 2bi + uci + 2udi where ai, bi, ci, di ∊ F2 for 0 ≤ i ≤ n − 1. Computing ϕ(τ(r)) on the one hand and σ4(ϕ(r)) on the other from the definitions, we find that the two vectors coincide. This completes the proof.
Theorem 9. A linear code C of length n over R is a (1 + u)-constacyclic code if and only if ϕ(C) is a binary quasi-cyclic code of index 4 and length 8n.
Proof. If C is (1 + u)-constacyclic, then using Lemma 8 we have σ4(ϕ(C)) = ϕ(τ(C)) = ϕ(C). Hence, ϕ(C) is a binary quasi-cyclic code of index 4 and length 8n. Conversely, if ϕ(C) is a binary quasi-cyclic code of index 4 and length 8n, then using Lemma 8 again we get ϕ(τ(C)) = σ4(ϕ(C)) = ϕ(C). Also, ϕ is injective, hence τ(C) = C. □
From Theorem 9 and the definition of the map ϕ, we immediately have the following result.
Corollary 10. The image of a (1 + u)-constacyclic code of length n over R under the map ϕ is a distance invariant binary quasi-cyclic code of index 4 and length 8n.
Now, we can construct some binary codes with good parameters based on the new Gray map.
Example 11. Consider (1 + u)-constacyclic codes over Z4 + uZ4 of length 3. In F2[x], x^3 − 1 = (x − 1)(x^2 + x + 1).
1. In Theorem 3, we take g1(x) = x − 1, a1(x) = 1, b(x) = 3x, and g2(x) = a2(x) = x^3 − 1. Then, we obtain the (1 + u)-constacyclic code C1 over R of length 3 with generator polynomial (1 + u)x + 1. That is, 〈(1 − u)x + 1〉 = 〈x + (1 + u)〉. It is easy to see that both Res(C1) and Tor(C1) have size 16. Moreover, dhom(C1) = 8. By Corollary 7, ϕ2(C1) is a Z4-linear code of length 12 with size 256 and Lee distance 8. By Theorem 9, ϕ(C1) is a binary quasi-cyclic code of index 4 and length 24. We find that ϕ(C1) is a non-linear binary code with parameters (24, 256, 8). The code ϕ(C1) attains the performance of the best-known binary linear code with the same parameters based on Grassl’s codetables (Grassl 2007).
2. In Theorem 3, we take g1(x) = x^3 + 1, a1(x) = 1, b(x) = 3, g2(x) = x + 1, and a2(x) = x + 1. Then, we obtain the code C2 = 〈3u(x + 1)〉 = 〈u(x + 1)〉. Obviously, Res(C2) = {0} and Tor(C2) has size 4. By Corollary 7, ϕ2(C2) is a Z4-linear code of length 12 with size 4 and Lee distance 16. By Theorem 9, ϕ(C2) is a binary quasi-cyclic code of index 4 and length 24. The code ϕ(C2) is a linear binary code with parameters [24, 2, 16], which is optimal based on Grassl’s codetables (Grassl 2007).
CONCLUSION
We study the structure of (1 + u)-constacyclic codes over Z4 + uZ4 of arbitrary length, and obtain examples of good binary codes from them. Our results show that a (1 + u)-constacyclic code of length n over Z4 + uZ4 is, under a certain map, equivalent to a cyclic code of length 4n over Z4. Furthermore, we discuss the relation between (1 + u)-constacyclic codes of length n over Z4 + uZ4 and their binary Gray images. It would be interesting to study other constacyclic codes over Z4 + uZ4 and use them to construct more good codes over Z4 or F2.
ACKNOWLEDGEMENTS
This work was supported by the Natural Science Fund of Education Department of Anhui province under Grant Nos. KJ2015A226, KJ2015B1105916, and Key Projects of Support Program for Outstanding Young Talents in Colleges and Universities under Grant No. gxyqZD2016270.
COMPETING INTERESTS
The authors declare that they have no competing interests.
REFERENCES
1. Abualrub T, Siap I (2006) Reversible quaternary cyclic codes. In: Proceedings of the 9th WSEAS international conference on applied mathematics, pp 441–446
2. Dougherty ST, Karadeniz S, Yildiz B. Cyclic codes over Rk. Des Codes Cryptogr. 2012;63:113–126. doi: 10.1007/s10623-011-9539-4.
3. Grassl M (2007) Bounds on the minimum distance of linear codes and quantum codes. http://www.codetables.de
4. Greferath M, O’Sullivan ME. On bounds for codes over Frobenius rings under homogeneous weights. Discrete Math. 2004;289:11–24. doi: 10.1016/j.disc.2004.10.002.
5. Kai XS, Zhu SX, Wang L. A family of constacyclic codes over F2 + uF2 + vF2 + uvF2. J Syst Sci Complex. 2012;25:1032–1040. doi: 10.1007/s11424-012-1001-9.
6. Shi MJ. Optimal p-ary codes from constacyclic codes over a non-chain ring R. Chin J Electron. 2014;23:773–777.
7. Shi MJ, Yang SL, Zhu SX. Good p-ary quasi-cyclic codes from cyclic codes over Fp + vFp. J Syst Sci Complex. 2012;25:375–384. doi: 10.1007/s11424-012-0076-7.
8. Siap I, Abualrub T, Yildiz B. One generator quasi-cyclic codes over F2 + uF2 + vF2 + uvF2. J Frankl Inst. 2012;349:284–292. doi: 10.1016/j.jfranklin.2011.10.020.
9. Wolfmann J. Negacyclic and cyclic codes over Z4. IEEE Trans Inf Theory. 1999;45:2527–2532. doi: 10.1109/18.796397.
10. Yildiz B, Aydin N. On cyclic codes over Z4 + uZ4 and their Z4-images. Int J Inf Coding Theory. 2014;2:226–237. doi: 10.1504/IJICOT.2014.066107.
11. Yildiz B, Karadeniz S. Linear codes over F2 + uF2 + vF2 + uvF2. Des Codes Cryptogr. 2010;54:61–81. doi: 10.1007/s10623-009-9309-8.
12. Yildiz B, Karadeniz S. Self-dual codes over F2 + uF2 + vF2 + uvF2. J Frankl Inst. 2010;347:1888–1894. doi: 10.1016/j.jfranklin.2010.10.007.
13. Yildiz B, Karadeniz S. Cyclic codes over F2 + uF2 + vF2 + uvF2. Des Codes Cryptogr. 2011;58:221–234. doi: 10.1007/s10623-010-9399-3.
14. Yildiz B, Karadeniz S. (1 + v)-constacyclic codes over F2 + uF2 + vF2 + uvF2. J Frankl Inst. 2011;348:2625–2632. doi: 10.1016/j.jfranklin.2011.08.005.
15. Yildiz B, Karadeniz S. Linear codes over Z4 + uZ4: MacWilliams identities, projections, and formally self-dual codes. Finite Fields Appl. 2014;27:24–40. doi: 10.1016/j.ffa.2013.12.007.
16. Zhu SX, Wang L. A class of constacyclic codes over Fp + vFp and its Gray image. Discrete Math. 2011;311:2677–2682. doi: 10.1016/j.disc.2011.08.015.
SECTION 4
INTRODUCTION TO ERROR-CORRECTING CODES
CHAPTER 8
Projection Decoding of Some Binary Optimal Linear Codes of Lengths 36 and 40
Lucky Galvez 1,2 and Jon-Lark Kim 1
1 Department of Mathematics, Sogang University, Seoul 04107, Korea
2 Institute of Mathematics, University of the Philippines Diliman, Quezon City 1101, Philippines
ABSTRACT
Practically good error-correcting codes should have good parameters and efficient decoding algorithms. Some algebraically defined good codes, such as cyclic codes, Reed–Solomon codes, and Reed–Muller codes, have nice decoding algorithms. However, many optimal linear codes do not have an efficient decoding algorithm except for the general syndrome decoding, which requires a lot of memory. Therefore, a natural question to ask is which optimal linear codes have an efficient decoding algorithm. We show that two binary optimal [36,19,8] linear codes and two binary optimal [40,22,8] codes have an efficient decoding algorithm. No efficient decoding algorithm was previously known for the binary optimal [36,19,8] and [40,22,8] codes. We project them onto the much shorter linear [9,5,4] and [10,6,4] codes over GF(4), respectively. This decoding algorithm, called projection decoding, can correct errors of weight up to 3. These [36,19,8] and [40,22,8] codes have more codewords than any optimal self-dual [36,18,8] and [40,20,8] codes, respectively, of the same length and minimum weight, implying that these codes are more practical.
Keywords: codes; optimal codes; self-dual codes; projection decoding
Citation: Galvez, L.; Kim, J.-L. “Projection Decoding of Some Binary Optimal Linear Codes of Lengths 36 and 40”. Mathematics 2020, 8, 15. https://doi.org/10.3390/math8010015
Copyright: © 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
INTRODUCTION
Coding theory, or the theory of error-correcting codes, requires a lot of mathematical concepts but has wide applications in data storage, satellite communication, smartphones, and high-definition TV. The well-known classes of (linear) codes include Reed–Solomon codes, Reed–Muller codes, turbo codes, LDPC codes, polar codes, network codes, quantum codes, and DNA codes. One of the most important reasons why these codes are useful is that they have fast and efficient decoding algorithms.
A linear [n,k] code C over GF(q) or F_q is a k-dimensional subspace of F_q^n. The dual of C is C⊥ = {x ∈ F_q^n | x⋅c = 0 for any c ∈ C}, where the dot product is either the usual inner product or a Hermitian inner product. A linear code C is called self-dual if C = C⊥. If q = 2, then C is called binary. A binary self-dual code is called doubly-even if all codewords have weight ≡ 0 (mod 4) and singly-even if some codewords have weight ≡ 2 (mod 4). If q = 4, let GF(4) = {0, 1, ω, ω̄}, where ω̄ = ω^2 = ω + 1. It is more natural to consider the Hermitian inner product ⟨,⟩ on GF(4)^n: for x = (x1, x2, …, xn) and y = (y1, y2, …, yn) in GF(4)^n, ⟨x,y⟩ = ∑_{i=1}^{n} x_i ȳ_i, where ȳ_i = y_i^2.
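The arithmetic of GF(4) and the Hermitian inner product are easy to experiment with. The sketch below encodes 0, 1, ω, ω̄ as the integers 0, 1, 2, 3 so that addition becomes bitwise XOR; this encoding and the function names are illustrative assumptions, and the inner product is taken in the form ∑ x_i ȳ_i with conjugation given by squaring.

```python
W, WBAR = 2, 3                 # integer labels for ω and ω̄ (a choice made for this sketch)
_EXP = [1, W, WBAR]            # ω^0, ω^1, ω^2
_LOG = {1: 0, W: 1, WBAR: 2}   # discrete logarithms to base ω

def add(a, b):
    """Addition in GF(4); with the labels above it is bitwise XOR."""
    return a ^ b

def mul(a, b):
    """Multiplication in GF(4) through the cyclic group generated by ω."""
    if a == 0 or b == 0:
        return 0
    return _EXP[(_LOG[a] + _LOG[b]) % 3]

def conj(a):
    """Conjugation a -> a^2, which swaps ω and ω̄ and fixes 0 and 1."""
    return mul(a, a)

def hermitian(x, y):
    """Hermitian inner product sum_i x_i * conj(y_i) of two GF(4) vectors."""
    total = 0
    for xi, yi in zip(x, y):
        total = add(total, mul(xi, conj(yi)))
    return total
```

For example, `hermitian([1, W, WBAR], [1, W, WBAR])` evaluates 1 + ωω̄ + ω̄ω = 1 + 1 + 1 = 1 in characteristic 2.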
Researchers have tried to find efficient decoding algorithms for some optimal linear codes. Some well-known methods to decode linear codes are permutation decoding [1,2,3] for codes with large groups of automorphisms, decoding based on error locator polynomials in cyclic or AG codes [3], the message passing algorithm for LDPC codes [4,5], and successive cancellation decoding for polar codes [6].
Projection decoding was used for self-dual codes. For example, Pless [7] showed an efficient decoding of the [24,12,8] extended binary self-dual Golay code by projecting it onto the [6,3,4] hexacode over GF(4). Later, Gaborit, Kim, and Pless [8,9] showed that a similar projection can be done for some singly-even and doubly-even self-dual binary [32,16,8] codes, including the binary Reed–Muller code. Recently, Kim and Lee [10] gave two algorithms for the projection decoding of a binary extremal self-dual code of length 40. The idea was to use the projection of the said code onto a Hermitian self-dual code over GF(4). One of the two algorithms, called syndrome decoding, uses the syndrome decoding of the shorter code over GF(4).
Most examples of projection decoding were extremal self-dual codes. Therefore it is natural to ask whether such a projection decoding can be done for a non-self-dual code which has larger dimension than a self-dual code with the same length and minimum distance. In this paper, we show that this is possible. The purpose of this paper is to show how to decode efficiently a binary optimal [36,19,8] linear code and a binary optimal [40,22,8] code by projecting them onto the much shorter linear [9,5,4] and [10,6,4] codes over GF(4), respectively. This decoding algorithm, which we will call projection decoding, can correct errors of weight up to 3. The decoding algorithm presented in this paper is a generalization of the syndrome projection decoding in [10], since it can be used to decode any linear code with a projection onto an additive or a linear code over GF(4) for errors of weight at most 3. This is the first decoding algorithm for these two optimal [36,19,8] and [40,22,8] codes, whose parameters are better than those of any optimal self-dual [36,18,8] and [40,20,8] codes.
This paper is organized as follows. In Section 2, we introduce a way to project a binary code of length 4n onto an additive code of length n over GF(4). We also mention some properties of this projection as given in [11,12]. In Section 3, we show explicitly how to construct [36,19,8] and [40,22,8] binary optimal codes with projections onto additive codes over GF(4). Using this projection, we give a handy decoding algorithm to decode the said binary optimal codes in Section 4. This decoding algorithm exploits the properties of codes with projection onto an additive code over GF(4) to locate the errors in the noisy codewords, assuming not more than 3 errors occurred. This algorithm has low complexity and can be carried out by hand. In Section 5, we conclude by showing working examples of how the decoding is carried out.
PROJECTION OF BINARY LINEAR CODES
Let v ∈ GF(2)^{4m}. We associate to v a rectangular 4 × m array of zeros and ones, which we also denote by v. If we label the rows of the array with the elements 0, 1, ω, ω̄ of GF(4), then the inner product of a column of the array with the row labels is an element of GF(4). Thus, we obtain a corresponding element of GF(4)^m, which we call the projection of v, denoted Proj(v). We show this by the following example. Let v = (100001000010000101101110010101111111) ∈ GF(2)^36. Then we write v column-wise as follows.
0 | 1 0 0 0 0 1 0 0 1
1 | 0 1 0 0 1 1 1 1 1
ω | 0 0 1 0 1 1 0 1 1
ω̄ | 0 0 0 1 0 0 1 1 1
Proj(v) = (0, 1, ω, ω̄, ω̄, ω̄, ω, 0, 0)
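The projection of the example vector can be recomputed mechanically. This is a minimal sketch, assuming the row labels 0, 1, ω, ω̄ from top to bottom and encoding them as the integers 0, 1, 2, 3 so that addition in GF(4) is bitwise XOR; the function names are illustrative.

```python
LABELS = (0, 1, 2, 3)  # row labels 0, 1, ω, ω̄ encoded as integers; addition is XOR

def columns(bits):
    """Split a 0/1 string of length 4m into m columns of 4 bits each."""
    return [bits[i:i + 4] for i in range(0, len(bits), 4)]

def proj_column(col):
    """Inner product of one column with the row labels."""
    value = 0
    for bit, label in zip(col, LABELS):
        if bit == "1":
            value ^= label
    return value

def proj(bits):
    """Projection of a binary vector of length 4m onto GF(4)^m."""
    return [proj_column(c) for c in columns(bits)]

v = "100001000010000101101110010101111111"
projection = proj(v)  # entries use the encoding 2 = ω, 3 = ω̄
```

With this encoding, the fifth column 0110 gives 1 + ω = ω̄, which is the integer 3.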
We define the parity of a column to be even (or odd) if an even (or odd) number of ones exists in the column, and the parity of the first row is defined similarly. We call C4 ⊂ GF(4)^n an additive code of length n over GF(4) if it is closed under addition. Therefore, a linear code over GF(4) is automatically additive, but the converse does not hold.
Definition 1. Let S be a subset of GF(2)^{4m} and C4 an additive code over GF(4) of length m. Then S is said to have a projection O onto C4 if for any v ∈ S,
(i) Proj(v) ∈ C4;
(ii) the columns of v are of the same parity, i.e., the columns are either all even or all odd;
(iii) the parity of the first row of v is the same as the parity of the columns.
The main advantage of this projection is that we can decode the long binary code by decoding its projection, thus generally decreasing the complexity of the decoding process. Several authors exhibited this fact [7,8,10]. Another similar projection is defined in [12], called projection E.
Definition 2. Using the same notation as in Definition 1, S is said to have a projection E onto C4 if conditions (i) and (ii) are satisfied together with the additional condition:
(iii) the parity of the first row of v is always even.
Let C4 be an additive code of length m over GF(4). Consider the map ϕ: GF(4) → GF(2)^4 such that ϕ(0)=0000, ϕ(1)=0011, ϕ(ω)=0101, and ϕ(ω̄)=0110. Define ϕ(C4) = {(ϕ(c1), ϕ(c2), …, ϕ(cm)) | (c1, c2, …, cm) ∈ C4}. Let d be the code consisting of all even sums of weight 4 vectors whose ones appear in the same column in the projection, together with one additional vector d1=(10001000…1000) if m is odd and d2=(10001000…0111) if m is even. In [12], the following constructions were given:
• Construction O: ρO(C4)=ϕ(C4)+d,
• Construction E: ρE(C4)=ϕ(C4)+d′, where d′ is defined like d but with the additional vector d1 if m is even and d2 if m is odd.
The following result was also given in [12].
Theorem 1. Let C4 be an additive (m, 2^r) code over GF(4). Then
1. ρO(C4) and ρE(C4) are binary linear [4m, m+r] codes with projection O and E, respectively, onto C4.
2. Any binary linear code with projection O or E onto C4 can be constructed in this way.
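The parity conditions (ii) and (iii) of Definitions 1 and 2 explain the choice of the additional vectors d1 and d2 in the two constructions. The sketch below checks those conditions for a binary vector given as a 0/1 string; condition (i), membership of the projection in C4, is deliberately left out because it needs the code C4 itself. The function names are illustrative.

```python
def columns(bits):
    """Split a 0/1 string of length 4m into m columns of 4 bits each."""
    return [bits[i:i + 4] for i in range(0, len(bits), 4)]

def satisfies_parity_conditions(bits, kind):
    """Check conditions (ii) and (iii) for projection 'O' or projection 'E'."""
    col_parities = {c.count("1") % 2 for c in columns(bits)}
    if len(col_parities) != 1:          # (ii) all columns share one parity
        return False
    col_parity = col_parities.pop()
    top_parity = bits[0::4].count("1") % 2   # parity of the first row
    if kind == "O":
        return top_parity == col_parity      # (iii) for projection O
    if kind == "E":
        return top_parity == 0               # (iii) for projection E
    raise ValueError("kind must be 'O' or 'E'")

def extra_vector(m, kind):
    """Additional vector used in construction O or E, depending on the parity of m."""
    d1 = "1000" * m
    d2 = "1000" * (m - 1) + "0111"
    if kind == "O":
        return d1 if m % 2 == 1 else d2
    return d1 if m % 2 == 0 else d2
```

For m = 9 the vector d1 has nine odd columns and an odd first row, so it fits projection O but not projection E, which is why construction E uses d2 when m is odd.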
CONSTRUCTION OF BINARY OPTIMAL [36, 19, 8] AND [40, 22, 8] CODES
In this section, we apply the constructions given in the previous section to construct binary optimal codes of lengths 36 and 40. We were able to obtain two inequivalent codes for each length. Let C_4^9 be a (9, 2^10) additive code over GF(4) with the following generator matrix
In fact, the odd-indexed rows of G(C_4^9) form a [9,5,4] linear code over GF(4), which can be found in MAGMA under the name BKLC(GF(4),9,5). The code C_4^9 has weight distribution A4=51, A5=135, A6=210, A7=318, A8=234, A9=75.
Let C_4^{10} be the (10, 2^12) additive code over GF(4) generated by the following matrix
The minimum distance of both of these codes is 4. The odd-indexed rows of G(C_4^{10}) form a [10,6,4] linear code over GF(4), which can be found in the computational algebra system MAGMA under the name BKLC(GF(4),10,6), where BKLC stands for best known linear code. The code C_4^{10} has weight distribution A4=87, A5=258, A6=555, A7=1020, A8=1200, A9=738, A10=237.
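A quick consistency check on the two weight distributions: together with the zero codeword, the weight counts must add up to the number of codewords, 2^10 and 2^12 respectively, and Theorem 1 then gives the binary dimensions m + r. A small sketch with the values quoted above (the variable names are illustrative):

```python
# Weight distributions quoted in the text (nonzero weights only)
A_len9 = {4: 51, 5: 135, 6: 210, 7: 318, 8: 234, 9: 75}
A_len10 = {4: 87, 5: 258, 6: 555, 7: 1020, 8: 1200, 9: 738, 10: 237}

def codeword_count(distribution):
    """Number of codewords: the zero word plus all nonzero words."""
    return 1 + sum(distribution.values())

# Binary dimension m + r from Theorem 1 for an additive (m, 2^r) code
dim36 = 9 + 10
dim40 = 10 + 12
```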
Denote by C_36^O and C_40^O the binary linear codes obtained from the additive codes C_4^9 and C_4^{10}, respectively, over GF(4) by construction O. That is, C_36^O = ρO(C_4^9) and C_40^O = ρO(C_4^{10}). Their generator matrices are given below. These two codes have projection O onto C_4^9 and C_4^{10}, respectively.
Similarly, we apply construction E to the codes C_4^9 and C_4^{10}. We obtain two codes C_36^E = ρE(C_4^9) and C_40^E = ρE(C_4^{10}) with projection E onto C_4^9 and C_4^{10}, respectively. Their generator matrices are as follows.
Proposition 1. The codes C_36^O and C_36^E are inequivalent binary optimal [36,19,8] linear codes.
Proof. Since C_4^9 is an additive (9, 2^10) code over GF(4), it follows from Theorem 1 that C_36^O = ρO(C_4^9) and C_36^E = ρE(C_4^9) are binary [36,19] linear codes. It is known that codes with parameters [36,19,8] are optimal [13], so it remains to show that the minimum distance is 8. Note that any codeword c ∈ C_36^O can be written as c = a + b + d, where a ∈ ϕ(C_4^9), b is an even sum of weight 4 vectors whose ones appear in the same column in the projection, and d is equal to either d1 = (1000…1000) or the zero vector. Since the minimum distance of C_4^9 is 4 and ϕ maps each nonzero element of GF(4) to a vector of weight 2, ϕ(C_4^9) has minimum distance 8, so wt(a) ≥ 8 for a ≠ 0. By the definition of b, the sum of two weight 4 column vectors is a codeword of weight 8. Thus, the minimum distance of C_36^O is at most 8. Now partition c into blocks of length 4 and call the ith block ci (which corresponds to the ith column in the projection). For each i = 1, …, 9, write ci = ai + bi + di. From the construction, it can be seen that ci = 0000 if and only if ai = bi = di = 0000. If d is the zero vector, i.e., di = 0000 for all i, then wt(ai + bi) ∈ {2, 4} for every nonzero block. Thus, wt(c) = ∑ wt(ai + bi) ≥ 8. Suppose di = 1000 for all i. If bi = 0000, then wt(ci) = wt(ai + di) ∈ {1, 3}. If bi = 1111, then wt(ai + bi + di) ∈ {1, 3}. In this case all nine blocks ci have odd weight and are therefore nonzero, so wt(c) ≥ 9. Hence the minimum distance of this code is 8. The case of C_36^E is proved similarly.
Finally, the codes C_36^O and C_36^E have different weight distributions, given in Table 1 and Table 2, which were computed by MAGMA. Therefore they are inequivalent. Furthermore, we checked by MAGMA that these two codes are not equivalent to the currently known optimal [36,19,8] code denoted by BKLC(GF(2),36,19) in the MAGMA database because BKLC(GF(2),36,19) has A8=1033. Note that the codes C_36^O and C_36^E have automorphism groups of order 6, while BKLC(GF(2),36,19) has a trivial automorphism group.
Table 1. Weight distribution of C_36^O.
Table 2. Weight distribution of C_36^E.
For length 40, we have a similar result, stated in Proposition 2. We display the weight distributions of C_40^O and C_40^E in Table 3 and Table 4. Furthermore, we checked by MAGMA that these two codes are not equivalent to the currently known optimal [40,22,8] code denoted by BKLC(GF(2),40,22) in the MAGMA database because BKLC(GF(2),40,22) has A8=1412.
Table 3. Weight distribution of C_40^O.
Table 4. Weight distribution of C_40^E.
Proposition 2. The codes C_40^O and C_40^E are inequivalent binary optimal [40,22,8] linear codes.
PROJECTION DECODING
Let v ∈ GF(2)^{4m} and consider its associated array, defined in Section 2. From this, we can partition the elements of GF(2)^4 with respect to their inner product with the row labels, as given in Table 5. From Definitions 1 and 2, we know that if v ∈ C, where C is a code with projection O onto C4, then the columns of v, as well as the first row, have the same parity. Before we give the decoding algorithm, we first take note of the following observations regarding the error positions in the array v.
Remark 1.
1. An error on the first row of v preserves Proj(v).
2. An error on a coordinate that is not in the first row definitely changes Proj(v).
3. Two errors in the same column definitely change Proj(v).
4. Three errors in the same column preserve Proj(v) if and only if the first entry is not changed.
We now present a simple decoding algorithm, which we will call the projection decoding, that can be used for any binary linear code with projection O or projection E onto some additive code C4 over GF(4). This decoding algorithm can correct errors of weight up to three. The idea is to use syndrome decoding in C4, which has shorter length, to locate errors in the binary codeword. The decoding is then completed by changing the columns in the array corresponding to the corrected entry in the projection, by using Table 5.
Table 5. Projection of GF(2)^4 onto GF(4).
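The partition of GF(2)^4 by projection value and parity, and the four observations of Remark 1, can both be reproduced by enumeration. The sketch below assumes the row labels 0, 1, ω, ω̄ (encoded as 0, 1, 2, 3, with XOR as addition); it is an illustration of how such a table can be generated, and the names used are not from the paper.

```python
from itertools import combinations, product

LABELS = (0, 1, 2, 3)  # row labels 0, 1, ω, ω̄ as integers; addition is XOR
ALL_COLUMNS = list(product((0, 1), repeat=4))

def proj_column(col):
    """Inner product of a 4-bit column with the row labels."""
    value = 0
    for bit, label in zip(col, LABELS):
        if bit:
            value ^= label
    return value

# (projection value, parity) -> columns of GF(2)^4 in that class
table = {}
for col in ALL_COLUMNS:
    table.setdefault((proj_column(col), sum(col) % 2), []).append(col)

def flip(col, positions):
    """Flip the bits of col at the given row positions (0 is the first row)."""
    return tuple(b ^ (i in positions) for i, b in enumerate(col))

def always_preserves(positions):
    """True if flipping these rows never changes the projection of any column."""
    return all(proj_column(flip(c, positions)) == proj_column(c) for c in ALL_COLUMNS)

def always_changes(positions):
    """True if flipping these rows changes the projection of every column."""
    return all(proj_column(flip(c, positions)) != proj_column(c) for c in ALL_COLUMNS)
```

Each of the eight classes contains exactly two columns, and the two columns of a class are complements of each other, so they differ in the first row. This is why the first-row parity is enough to choose between them during decoding.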
Let C be a binary linear code of length 4n with projection O or projection E onto C4, an additive code over GF(4). Let y be the received vector and assume that y is a noisy codeword of C with at most 3 errors. Let y also denote the associated array and let ȳ = Proj(y). Denote by yi the ith column of the array y and by ȳi the ith entry of ȳ. Let H be the parity-check matrix of C4 and denote the ith column of H by Hi. The projection decoding of C is carried out as follows.
Projection decoding algorithm
1. Check the parity of the columns and the first row of y.
2. Let yo, ye be the number of columns of y with odd or even parity, respectively, and let p = min(yo, ye).
3. Depending on p, perform the following:
(a) If p = 0, compute the syndrome s = ȳH^T.
    i. If s = 0, then we say that no error occurred.
    ii. If s ≠ 0, the syndrome is a scalar multiple of one of the columns of H, say s = eiHi for some ei ∈ GF(4). Hence, two errors occurred in the ith column of y. Replace the ith coordinate of ȳ by ȳi + ei ∈ GF(4) and replace the ith column of y with the vector corresponding to ȳi + ei (see Table 5) such that the parity conditions are satisfied.
(b) If p = 1, let yi be the column with the different parity. Compute the syndrome s = ȳH^T.
    i. If s = 0, check the parity of the first row.
        A. If the parity of the first row is the same as the parity of yi, then one error occurred on the first entry of yi. Change this entry to decode y.
        B. If the parity of the first row is different from the parity of yi, then three errors occurred on the 2nd to 4th entries of yi. Change these entries to decode y.
    ii. If 0 ≠ s = eiHi, then one or three errors occurred in the ith column of y. Replace the ith coordinate of ȳ by ȳi + ei ∈ GF(4) and replace the ith column of y with the vector corresponding to ȳi + ei such that the parity conditions are satisfied.
    iii. If s = ejHj for some j ≠ i, then two errors occurred in column j and one error in the first entry of column i. Replace the jth coordinate of ȳ by ȳj + ej ∈ GF(4) and replace the jth column of y with the vector corresponding to ȳj + ej with the same parity as yj and of distance 2 from yj. Finally, replace the first entry of yi and check that the parity conditions are satisfied.
(c) If p = 2, let yi, yj be the columns with the different parity. Compute the syndrome s = ȳH^T. We know that s = eiHi + ejHj.
    i. If s = 0, then the two errors occurred on the first coordinates of yi and yj. Replace both coordinates and check if the parity conditions are satisfied.
    ii. If ei ≠ 0 and ej = 0, then errors occurred in the ith column of y and the first coordinate of yj. Replace the first coordinate of yj. Then replace the ith coordinate of ȳ by ȳi + ei ∈ GF(4) and replace the ith column of y with the vector corresponding to ȳi + ei such that the parity conditions are satisfied.
    iii. If ei, ej ≠ 0, then errors occurred in the ith and jth columns of y. Replace the ith coordinate of ȳ by ȳi + ei ∈ GF(4) and the jth coordinate by ȳj + ej ∈ GF(4), then replace the ith and jth columns of y with the corresponding vectors such that the parity conditions are satisfied.
(d) If p = 3, let yi, yj and yk be the columns with different parity. Then there is one error in each of these columns. Compute the syndrome s = ȳH^T. We know that s = eiHi + ejHj + ekHk.
    i. If s = 0, then the three errors occurred on the first coordinates of yi, yj and yk. Replace the first coordinates in these columns and check if the parity conditions are satisfied.
    ii. If ei ≠ 0 and ej, ek = 0, then one error occurred in the ith column of y and one error each on the first coordinates of yj and yk. Replace the first coordinates of yj and yk. Then replace the ith coordinate of ȳ by ȳi + ei ∈ GF(4) and replace the ith column of y with the vector corresponding to ȳi + ei such that the parity conditions are satisfied.
    iii. If ei ≠ 0, ej ≠ 0 and ek = 0, then one error each occurred in the ith and jth columns of y and the first coordinate of yk. Replace the first coordinate of yk. Then replace the ith coordinate of ȳ by ȳi + ei ∈ GF(4) and the jth coordinate by ȳj + ej ∈ GF(4), then replace the ith and jth columns of y with the corresponding vectors such that the parity conditions are satisfied.
    iv. If ei, ej, ek ≠ 0, then let y′ = ȳ + e, where e is the vector of length n with ei, ej and ek on the ith, jth and kth coordinates and zero elsewhere. Replace one coordinate each in the ith, jth and kth columns of y so that its projection becomes y′ and the parity conditions are satisfied.
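Steps 1 and 2 of the algorithm only need column and first-row parities, so they are easy to sketch in code. The fragment below computes p and the columns in the minority parity class for a received vector given as a 0/1 string; the syndrome computation of Step 3 is not included because it requires the parity-check matrix H. The function name and the test vector are hypothetical: the test vector is d1 = (1000…1000) for m = 9 with two bits flipped, chosen only to show a case with p = 2.

```python
def parity_profile(bits):
    """Return (p, minority_columns, first_row_parity) for a received 0/1 string.

    Columns are numbered from 1, as in the text. minority_columns lists the
    columns whose parity differs from that of the majority of the columns.
    """
    cols = [bits[i:i + 4] for i in range(0, len(bits), 4)]
    parities = [c.count("1") % 2 for c in cols]
    odd = [i + 1 for i, par in enumerate(parities) if par == 1]
    even = [i + 1 for i, par in enumerate(parities) if par == 0]
    minority = odd if len(odd) <= len(even) else even
    first_row_parity = sum(c[0] == "1" for c in cols) % 2
    return len(minority), minority, first_row_parity

# Hypothetical received vector: d1 for m = 9 with two flipped bits
received = list("1000" * 9)
received[17] = "1"   # second entry of column 5
received[20] = "0"   # first entry of column 6
received = "".join(received)
p, minority, top = parity_profile(received)
```

Here columns 5 and 6 become even while the other seven stay odd, so p = 2 and the decoder would continue with Step 3(c).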
Remark 2. Our algorithm for the codes C_36^O, C_36^E, C_40^O, and C_40^E is complete and terminates, because we considered all possible values of p = min(yo, ye) ≤ 3 and adjusted at most three errors according to the top row parity so that Propositions 1 and 2 are satisfied.
EXAMPLES
In this section, we provide examples to illustrate how the given decoding algorithm works. Although these are only samples, most of the remaining cases are handled similarly. The following two examples illustrate how to decode the codes of length 36 by hand. As a linear code over GF(4), C_4^9 has the following parity check matrix.
Example 1. In this example, we illustrate the projection decoding of C_36^O. Let y ∈ GF(2)^36 be a codeword in C_36^O with some error of weight up to three. Consider the following projection of y.
All the columns have odd parity except for the 5th and 6th columns, and hence p = 2. Then we have the syndrome
Therefore we proceed to ii of Step 3(c) with e5 = ω and e6 = 0. We replace ȳ5 = ω by ȳ5 + e5 = ω + ω = 0 and then replace y5 by the vector in Table 5 corresponding to 0 that is closest to it. Finally, we replace the first entry of y6. So there are two errors in y. Therefore, we decode y as
Example 2. Let y ∈ GF(2)^36 be a codeword of C_36^E with some error of weight up to three and have the following projection
Note that p = 1 and the parity of the 8th column differs from that of the remaining columns, i.e., i = 8 in Step 3(b). We then compute the syndrome:
So we have j = 5 ≠ i and e5 ≠ 0. Proceeding with iii of Step 3(b), we replace ȳ5 by ȳ5 + e5 = 0 and replace the fifth column by the vector in Table 5 corresponding to 0 which is of the same parity and distance 2 from y5. Then we change the first entry of y8. There are three errors in y. We decode y as
Next we show examples of projection decoding of C_40^O and C_40^E. We use the following parity check matrix for C_4^{10}.
Example 3. Assume y ∈ GF(2)^40 is a codeword of C_40^O with error of weight up to three and with the following projection
All the columns and the first row are of odd parity, so p = 0. Computing the syndrome, we have
Since s ≠ 0, by ii of Step 3(a) with e4 ≠ 0, two errors occurred in the 4th column of y. We replace ȳ4 by ȳ4 + e4 = 0 and replace the fourth column y4 by the vector in Table 5 corresponding to 0 and of distance 2 from y4.
There are two errors in y. Therefore, y is decoded as
Example 4. Assume y ∈ GF(2)^40 is a codeword of C_40^E with some error of weight up to 3. Let y have the following projection
All the columns except columns 3, 8 and 10 are even, so p = 3. The syndrome is
Hence, from iii of Step 3(d), we have e8 = 1, e10 ≠ 0 and e3 = 0. So we replace ȳ8 by ȳ8 + e8 = ω and y8 by the vector corresponding to ω in Table 5 closest to it. We also replace ȳ10 by ȳ10 + e10 = 1 and y10 by the vector closest to it corresponding to 1 in Table 5. Finally we replace the first entry of column 3. There are three errors in y. Therefore, y is decoded as
Remark 3. The most time-consuming part of the algorithm (the part that dominates its complexity) is decoding the projected vector in the linear code C_4^9 or C_4^{10} using syndrome decoding. However, since we know which positions (or columns) have errors, this syndrome decoding can be done in at most 3×3×3 = 27 possible ways, because the syndrome equation in Step 3(d) of the algorithm involves at most three nonzero scalars, each of which can take three values.
CONCLUSIONS
In this paper, we described how to decode binary optimal [36,19,8] and [40,22,8] codes by projecting them onto the linear [9,5,4] and [10,6,4] codes over GF(4). Even though similar decoding algorithms existed for self-dual codes of lengths 24, 32, and 40, there was no known efficient decoding algorithm for these non-self-dual optimal codes. In fact, our algorithm works for any linear code with a projection onto a linear or additive code over GF(4) for errors of weight at most 3.
AUTHOR CONTRIBUTIONS
Conceptualization, J.-L.K.; methodology, J.-L.K.; validation, L.G.; formal analysis, L.G.; investigation, L.G.; resources, J.-L.K.; writing–original draft preparation, L.G.; writing–review and editing, J.-L.K.; visualization, L.G.; supervision, J.-L.K.; funding acquisition, J.-L.K. All authors have read and agreed to the published version of the manuscript.
FUNDING
J.-L.K. was supported by Basic Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF-2016R1D1A1B03933259).
CONFLICTS OF INTEREST
The authors declare no conflict of interest.
REFERENCES
1. Huffman, W.C.; Pless, V. Fundamentals of Error-Correcting Codes; Cambridge University Press: Cambridge, UK, 2003.
2. MacWilliams, F.J. Permutation decoding of systematic codes. Bell Syst. Tech. J. 1964, 43, 485–505.
3. MacWilliams, F.J.; Sloane, N.J.A. The Theory of Error-Correcting Codes; North-Holland: Amsterdam, The Netherlands, 1983.
4. Gallager, R. Low-density parity-check codes. IEEE Trans. Inform. Theory 1962, 8, 21–28.
5. Loeliger, H.-A. An introduction to factor graphs. IEEE Signal Proc. Mag. 2004, 21, 28–41.
6. Arikan, E. Channel polarization: A method for constructing capacity-achieving codes for symmetric binary-input memoryless channels. IEEE Trans. Inf. Theory 2009, 55, 3051–3073.
7. Pless, V. Decoding the Golay codes. IEEE Trans. Inf. Theory 1986, 32, 561–567.
8. Gaborit, P.; Kim, J.-L.; Pless, V. Decoding binary R(2, 5) by hand. Discret. Math. 2003, 264, 55–73.
9. Kim, J.-L.; Pless, V. Decoding some doubly even self-dual [32,16,8] codes by hand. In Proceedings of the Conference Honoring Professor Dijen Ray-Chaudhuri on the Occasion of His 65th Birthday, Columbus, OH, USA, 18–21 May 2000; pp. 165–178.
10. Kim, J.-L.; Lee, N. A projection decoding of a binary extremal self-dual code of length 40. Des. Codes Cryptogr. 2017, 83, 589–609.
11. Amrani, O.; Be’ery, Y. Reed-Muller codes: Projections on GF(4) and multilevel construction. IEEE Trans. Inform. Theory 2001, 47, 2560–2565.
12. Kim, J.-L.; Mellinger, K.E.; Pless, V. Projections of binary linear codes onto larger fields. SIAM J. Discret. Math. 2003, 16, 591–603.
13. Brouwer, A.E. Bounds on the size of linear codes. In Handbook of Coding Theory; Pless, V.S., Huffman, W.C., Eds.; Elsevier: Amsterdam, The Netherlands, 1998; pp. 295–461.
CHAPTER 9
Reed-Solomon Turbo Product Codes for Optical Communications: From Code Optimization to Decoder Design
Raphaël Le Bidan, Camille Leroux, Christophe Jego, Patrick Adde, and Ramesh Pyndiah
Institut TELECOM, TELECOM Bretagne, CNRS Lab-STICC, Technopole Brest-Iroise, CS 83818, 29238 Brest Cedex 3, France
ABSTRACT
Turbo product codes (TPCs) are an attractive solution to improve link budgets and reduce system costs by relaxing the requirements on expensive optical devices in high-capacity optical transport systems. In this paper, we investigate the use of Reed-Solomon (RS) turbo product codes for 40 Gbps transmission over optical transport networks and 10 Gbps transmission over passive optical networks. An algorithmic study is first performed in order to design RS TPCs that are compatible with the performance requirements imposed by the two applications. Then, a novel ultrahigh-speed parallel architecture for turbo decoding of product codes is described. A comparison with binary Bose-Chaudhuri-Hocquenghem (BCH) TPCs is performed. The results show that high-rate RS TPCs offer a better complexity/performance tradeoff than BCH TPCs for low-cost Gbps fiber optic communications.
Citation: Le Bidan, R., Leroux, C., Jego, C. et al. “Reed-Solomon Turbo Product Codes for Optical Communications: From Code Optimization to Decoder Design”. J Wireless Com Network 2008, 658042 (2008). https://doi.org/10.1155/2008/658042 Copyright: © 2008 Raphaël Le Bidan et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
INTRODUCTION
The field of channel coding has undergone major advances over the last twenty years. With the invention of turbo codes [1] followed by the rediscovery of low-density parity-check (LDPC) codes [2], it is now possible to approach the fundamental limit of channel capacity within a few tenths of a decibel over several channel models of practical interest [3]. Although this has been a major step forward, there is still a need for improvement in forward-error correction (FEC), notably in terms of code flexibility, throughput, and cost.
In the early 1990s, coinciding with the discovery of turbo codes, the deployment of FEC began in optical fiber communication systems. For a long time, there was no real incentive to use channel coding in optical communications since the bit error rate (BER) in lightwave transmission systems can be as low as 10^-9 to 10^-15. Then, the progressive introduction of in-line optical amplifiers and the advent of wavelength division multiplexing (WDM) technology accelerated the use of FEC up to the point that it is now considered almost routine in optical communications. Channel coding is seen as an efficient technique to reduce system costs and to improve margins against various line impairments such as beat noise, channel crosstalk, or nonlinear dispersion. On the other hand, the design of channel codes for optical communications poses remarkable challenges to the system engineer. Good codes are expected to provide at the same time low overhead (high code rate) and guaranteed large coding gains at very low BER [4]. Furthermore, the issue of decoding complexity should not be overlooked since data rates have now reached 10 Gbps and beyond (up to 40 Gbps), calling for FEC devices with low power consumption.
FEC schemes for optical communications are commonly classified into three generations. The reader is referred to [5, 6] for an in-depth historical perspective of FEC for optical communication.
First-generation FEC schemes mainly relied on the (255, 239) Reed-Solomon (RS) code over the Galois field GF(256), with only 6.7% overhead. In particular, this code was recommended by the ITU for long-haul submarine transmissions. Then, the development of WDM technology provided the impetus for moving to second-generation FEC systems, based on concatenated codes with higher coding gains [7]. Third-generation FEC based on soft-decision decoding is now the subject of intense research since stronger FEC schemes are seen as a promising way to reduce costs by relaxing the requirements on expensive optical devices in high-capacity transport systems.
Figure 1: Codewords of the product code P = C1 ⊗ C2.
First introduced in [8], turbo product codes (TPCs) based on binary Bose-Chaudhuri-Hocquenghem (BCH) codes are an efficient and mature technology that has found its way into several (either proprietary or public) wireless transmission systems [9]. Recently, BCH TPCs have received considerable attention for third-generation FEC in optical systems since they show good performance at high code rates and have a high minimum distance by construction. Furthermore, their regular structure is amenable to very-high-data-rate parallel decoding architectures [10, 11]. Research on TPCs for lightwave systems culminated recently with the experimental demonstration of a record coding gain of 10.1 dB at a BER of 10^-13 using a (144, 128) × (256, 239) BCH turbo product code with 24.6% overhead [12]. This gain was measured using a turbo decoding very-large-scale-integration (VLSI) circuit operating on 3-bit soft input at a data rate of 12.4 Gbps. LDPC codes are also considered serious candidates for third-generation FEC. Impressive coding gains have notably been demonstrated by Monte Carlo simulation [13]. To date, however, to the best of the authors' knowledge, no high-rate LDPC decoding architecture has been proposed in order to demonstrate the practicality of LDPC codes for Gbps optical communications.
In this work, we investigate the use of Reed-Solomon TPCs for third-generation FEC in fiber optic communication. Two specific applications are envisioned, namely 40 Gbps line rate transmission over optical transport networks (OTNs), and 10 Gbps data transmission over passive optical networks (PONs). These two applications have different requirements with respect to FEC. An algorithmic study is first carried out in order to design RS product codes for the two applications. In particular, it is shown that high-rate RS TPCs based on carefully designed single-error-correcting RS codes realize an excellent performance/complexity trade-off for both scenarios, compared to binary BCH TPCs of similar code rate. In a second step, a novel parallel decoding architecture is introduced. This architecture allows decoding of turbo product codes at data rates of 10 Gbps and beyond. Complexity estimations show that RS TPCs offer a better trade-off between area and throughput than BCH TPCs for full-parallel decoding architectures. An experimental setup based on field-programmable gate array (FPGA) devices has been successfully designed for 10 Gbps data transmission. This prototype demonstrates the practicality of RS TPCs for next-generation optical communications.
The remainder of the paper is organized as follows. Construction and properties of RS product codes are introduced in Section 2. Turbo decoding of RS product codes is described in Section 3. Product code design for optical communication and related algorithmic issues are discussed in Section 4. The challenging issue of designing a high-throughput parallel decoding architecture for product codes is developed in Section 5. A comparison of throughput and complexity between decoding architectures for RS and BCH TPCs is carried out in Section 6. Section 7 describes the successful realization of a turbo decoder prototype for 10 Gbps transmission. Conclusions are finally given in Section 8.
REED-SOLOMON PRODUCT CODES
Code construction and systematic encoding
Let C1 and C2 be two linear block codes over the Galois field GF(2^m), with parameters (N1, K1, D1) and (N2, K2, D2), respectively. The product code P = C1 ⊗ C2 consists of all N1 × N2 matrices such that each column is a codeword in C1 and each row is a codeword in C2. It is well known that P is an (N1N2, K1K2) linear block code with minimum distance D1D2 over GF(2^m) [14]. The direct product construction thus offers a simple way to build long block codes with relatively large minimum distance using simple, short component codes with small minimum distance. When C1 and C2 are two RS codes over GF(2^m), we obtain an RS product code over GF(2^m). Similarly, the direct product of two binary BCH codes yields a binary BCH product code.
Starting from a K1 × K2 information matrix, systematic encoding of P is easily accomplished by first encoding the K1 information rows using a systematic encoder for C2. Then, the N2 columns are encoded using a systematic encoder for C1, thus resulting in the N1 × N2 coded matrix shown in Figure 1.
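The two-step encoding described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: two small binary codes, a (3, 2) single-parity-check code for C1 and a (7, 4) Hamming code for C2, stand in for the RS component codes, and the names `G1`, `G2`, `encode_product`, and `min_distance` are hypothetical helpers introduced for this example.

```python
import itertools
import numpy as np

# Systematic generator matrices over GF(2) (illustrative stand-ins only).
G1 = np.array([[1, 0, 1],
               [0, 1, 1]])                      # C1: (3, 2, 2) single parity check
G2 = np.array([[1, 0, 0, 0, 1, 1, 0],
               [0, 1, 0, 0, 1, 0, 1],
               [0, 0, 1, 0, 0, 1, 1],
               [0, 0, 0, 1, 1, 1, 1]])          # C2: (7, 4, 3) Hamming code

def encode_product(info, G1, G2):
    """Encode a K1 x K2 information matrix into an N1 x N2 product codeword."""
    rows = info @ G2 % 2        # step 1: encode the K1 information rows with C2
    return G1.T @ rows % 2      # step 2: encode the N2 columns with C1

def min_distance(G1, G2):
    """Brute-force minimum Hamming weight over all nonzero product codewords."""
    k1, k2 = G1.shape[0], G2.shape[0]
    best = None
    for bits in itertools.product([0, 1], repeat=k1 * k2):
        if any(bits):
            info = np.array(bits).reshape(k1, k2)
            w = int(encode_product(info, G1, G2).sum())
            best = w if best is None else min(best, w)
    return best

info = np.array([[1, 0, 1, 1],
                 [0, 1, 1, 0]])
codeword = encode_product(info, G1, G2)
print(codeword.shape)            # (3, 7)
print(min_distance(G1, G2))      # 6 = D1 * D2
```

Because both generator matrices are systematic, the information matrix reappears unchanged in the top-left K1 × K2 corner of the coded matrix, and the brute-force search confirms the product distance D1D2 for this toy case.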
Binary image of RS product codes
Binary modulation is commonly used in optical communication systems. A binary expansion of the RS product code is then required for transmission. The extension field GF(2^m) forms a vector space of dimension m over GF(2). A binary image Pb of P is thus obtained by expanding each code symbol in the product code matrix into m bits using some basis B for GF(2^m). The polynomial basis B = {1, α, ..., α^(m−1)}, where α is a primitive element of GF(2^m), is the usual choice, although other bases exist [15, Chapter 8]. By construction, Pb is a binary linear code with length mN1N2, dimension mK1K2, and minimum distance d at least as large as the symbol-level minimum distance D = D1D2 [14, Section 10.5].
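A minimal sketch of the binary expansion follows. It assumes that each GF(2^m) symbol is stored as an integer whose bit k is the coefficient of α^k in the polynomial basis; this storage convention and the helper name `binary_image` are assumptions made for illustration.

```python
import numpy as np

def binary_image(symbols, m):
    """Expand an array of GF(2^m) symbols (integers 0..2^m-1) into bits.

    Bit k of each integer is taken as the coefficient of alpha^k in the
    polynomial basis {1, alpha, ..., alpha^(m-1)} (assumed convention).
    """
    symbols = np.asarray(symbols)
    bits = (symbols[..., None] >> np.arange(m)) & 1
    return bits.reshape(*symbols.shape[:-1], symbols.shape[-1] * m)

matrix = np.array([[0, 5, 17],
                   [31, 0, 2]])          # toy 2 x 3 matrix over GF(32)
bits = binary_image(matrix, m=5)
symbol_weight = int(np.count_nonzero(matrix))
bit_weight = int(bits.sum())
print(bits.shape)                        # (2, 15)
print(symbol_weight, bit_weight)         # 4 10
```

Every nonzero symbol expands into at least one nonzero bit, so the bit weight of a codeword can never fall below its symbol weight, which is why d ≥ D.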
TURBO DECODING OF RS PRODUCT CODES
Product codes usually have high dimension, which precludes maximum-likelihood (ML) soft-decision decoding. Yet the particular structure of the product code lends itself to an efficient iterative “turbo” decoding algorithm offering close-to-optimum performance at high-enough signal-to-noise ratios (SNRs). Assume that a binary transmission has taken place over a binary-input channel. Let Y = (y_{i,j}) denote the matrix of samples delivered by the receiver front-end. The turbo decoder soft input is the channel log-likelihood ratio (LLR) matrix, R = (r_{i,j}), with
\[ r_{i,j} = A \ln \frac{f_1(y_{i,j})}{f_0(y_{i,j})}. \tag{1} \]
Here A is a suitably chosen constant term, and f_b(y) denotes the probability of observing the sample y at the channel output given that bit b has been transmitted. Turbo decoding is realized by decoding successively the rows and columns of the channel matrix R using soft-input soft-output (SISO) decoders, and by exchanging reliability information between the decoders until a reliable decision can be made on the transmitted bits.
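As a concrete instance of the channel LLR, the sketch below evaluates A ln(f1(y)/f0(y)) for Gaussian likelihoods. The signal levels `mu0` and `mu1`, the common standard deviation `sigma`, and the function name `channel_llr` are assumptions for illustration; in particular, equal noise variance on both levels is a simplification.

```python
import numpy as np

def channel_llr(y, mu0=0.0, mu1=1.0, sigma=0.25, A=1.0):
    """Return A * ln(f1(y) / f0(y)) for Gaussian f_b with mean mu_b, std sigma."""
    y = np.asarray(y, dtype=float)
    # ln f1 - ln f0 = [(y - mu0)^2 - (y - mu1)^2] / (2 sigma^2)
    return A * ((y - mu0) ** 2 - (y - mu1) ** 2) / (2.0 * sigma ** 2)

samples = np.array([0.0, 0.5, 1.0])
llr = channel_llr(samples)
```

With these default values, a sample halfway between the two levels yields a zero LLR (no information), while samples at the nominal levels yield LLRs of equal magnitude and opposite sign.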
SISO decoding of the component codes
In this work, SISO decoding of the RS component codes is performed at the bit level using the Chase-Pyndiah algorithm. First introduced in [8] for binary BCH codes and later extended to RS codes in [16], the Chase-Pyndiah decoder consists of a soft-input hard-output Chase-2 decoder [17] augmented by a soft-output computation unit. Given a soft-input sequence r = (r_1, ..., r_{mN}) corresponding to a row (N = N2) or column (N = N1) of R, the Chase-2 decoder first forms a binary hard-decision sequence y = (y_1, ..., y_{mN}). The reliability of the hard decision y_i on the ith bit is measured by the magnitude |r_i| of the corresponding soft input. Then, N_ep error patterns are generated by testing different combinations of 0 and 1 in the L_r least reliable bit positions. In general, N_ep ≤ 2^{L_r}, with equality if all combinations are considered. Those error patterns are added modulo 2 to the hard-decision sequence y to form candidate sequences. Algebraic decoding of the candidate sequences returns a list with at most N_ep distinct candidate codewords. Among them, the codeword d at minimum Euclidean distance from the input sequence r is selected as the final decision.
Soft-output computation is then performed as follows. For a given bit i, the list of candidate codewords is searched for a competing codeword c at minimum Euclidean distance from r and such that c_i ≠ d_i. If such a codeword exists, then the soft output r'_i on the ith bit is given by Equation (2), where ‖·‖^2 denotes the squared norm of a sequence. Otherwise, the soft output is computed according to Equation (3), where β is a positive value, computed on a per-codeword basis, as suggested in [18]. Following the so-called “turbo principle,” the soft input r_i is finally subtracted from the soft output r'_i to obtain the extrinsic information
\[ w_i = r'_i - r_i, \tag{4} \]
which will be sent to the next decoder.
Figure 2. Block diagram of the turbo decoder at the kth half-iteration.
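The SISO decoding steps can be sketched as follows. This is a hedged illustration, not the authors' implementation: a binary (7, 4) Hamming code stands in for the SEC RS component code, the soft output uses the widely cited Pyndiah forms r'_i = ((‖r − c‖^2 − ‖r − d‖^2)/4)·s_i when a competitor exists and r'_i = β·s_i otherwise (with s_i = ±1 the sign of the decided bit), and β is a fixed constant rather than the per-codeword value of [18]. All function and parameter names are hypothetical.

```python
import itertools
import numpy as np

# Parity-check matrix of the binary (7, 4) Hamming code (stand-in component code).
H = np.array([[1, 1, 0, 1, 1, 0, 0],
              [1, 0, 1, 1, 0, 1, 0],
              [0, 1, 1, 1, 0, 0, 1]])

def hamming_decode(y):
    """Algebraic (hard-decision) decoder: corrects at most one bit error."""
    s = H @ y % 2
    if s.any():
        # A nonzero syndrome equals the column of H at the error position.
        pos = int(np.flatnonzero((H.T == s).all(axis=1))[0])
        y = y.copy()
        y[pos] ^= 1
    return y

def chase_pyndiah(r, n_lrb=3, beta=0.5):
    """Return (decision d, extrinsic w) for one soft-input vector r.

    Convention (assumed): r_i > 0 favours bit 1, and bit b maps to 2b - 1.
    """
    r = np.asarray(r, dtype=float)
    y = (r > 0).astype(int)                      # hard decisions
    lrb = np.argsort(np.abs(r))[:n_lrb]          # least reliable positions
    candidates = {}                              # codeword -> squared distance
    for flips in itertools.product([0, 1], repeat=n_lrb):
        t = y.copy()
        t[lrb] ^= np.array(flips)                # apply the error pattern
        c = hamming_decode(t)
        candidates[tuple(c)] = float(np.sum((r - (2 * c - 1)) ** 2))
    d = np.array(min(candidates, key=candidates.get))
    m_d = candidates[tuple(d)]
    s = 2 * d - 1
    soft = np.empty_like(r)
    for i in range(len(r)):
        rivals = [m for c, m in candidates.items() if c[i] != d[i]]
        # Competitor found: distance-based reliability; otherwise fall back to beta.
        soft[i] = (min(rivals) - m_d) / 4 * s[i] if rivals else beta * s[i]
    return d, soft - r                           # extrinsic = soft output - input

d_hat, w = chase_pyndiah([-1, -1, 0.2, -1, -1, -1, -1])
```

In this usage example the third sample has the wrong sign and a low reliability; the decoder decides on the all-zero codeword and returns a negative extrinsic value for that bit, pushing it toward the correct decision at the next half-iteration.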
Iterative decoding of the product code
The block diagram of the turbo decoder at the kth half-iteration is shown in Figure 2. A half-iteration stands for a row or column decoding step, and one iteration comprises two half-iterations. The input of the SISO decoder at half-iteration k is given by
\[ R_k = R + \alpha_k W_k, \tag{5} \]
where α_k is a scaling factor used to attenuate the influence of extrinsic information during the first iterations, and where W_k = (w_{i,j}) is the extrinsic information matrix delivered by the SISO decoder at the previous half-iteration. The decoder outputs an updated extrinsic information matrix W_{k+1}, and possibly a matrix D_k of hard decisions.
Decoding stops when a given maximum number of iterations have been performed, or when an early-termination condition (stop criterion) is met.
The use of a stop criterion can improve the convergence of the iterative decoding process and also reduce the average power consumption of the decoder by decreasing the average number of iterations required to decode a block. An efficient stop criterion taking advantage of the structure of the product codes was proposed in [19]. Another simple and effective solution is to stop when the hard decisions do not change between two successive half-iterations (i.e., no further corrections are made).
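The iteration schedule and the simple stop criterion can be sketched as below. The sketch assumes the SISO input R_k = R + α_k W_k, a square matrix with the same component code on rows and columns, and rows decoded first; `siso` is a placeholder callable returning hard bits and extrinsic values for one vector, and all names are hypothetical.

```python
import numpy as np

def turbo_decode(R, siso, alphas, max_iterations=8):
    """Alternate row and column SISO decoding on the channel LLR matrix R.

    siso(r) must return (hard_bits, extrinsic) for one row or column.
    Returns the last hard-decision matrix and the number of half-iterations.
    """
    R = np.asarray(R, dtype=float)
    W = np.zeros_like(R)                       # no extrinsic information at start
    previous = None
    for k in range(2 * max_iterations):        # k counts half-iterations
        alpha = alphas[min(k, len(alphas) - 1)]
        Rk = R + alpha * W                     # SISO input at half-iteration k
        if k % 2:                              # odd k: decode the columns
            Rk = Rk.T
        results = [siso(r) for r in Rk]
        D = np.array([bits for bits, _ in results])
        W = np.array([ext for _, ext in results], dtype=float)
        if k % 2:
            D, W = D.T, W.T
        if previous is not None and np.array_equal(D, previous):
            break                              # hard decisions unchanged: stop
        previous = D
    return D, k + 1

# Toy SISO that only takes hard decisions and returns no extrinsic information.
toy_siso = lambda r: ((r > 0).astype(int), np.zeros_like(r))
D, n_half = turbo_decode([[1.0, -1.0], [-2.0, 3.0]], toy_siso, alphas=[0.5])
```

With the toy SISO the decisions are identical after the first row and column passes, so decoding stops after two half-iterations.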
RS PRODUCT CODE DESIGN FOR OPTICAL COMMUNICATIONS
Two optical communication scenarios have been identified as promising applications for third-generation FEC based on RS TPCs: 40 Gbps data transport over OTN, and 10 Gbps data transmission over PON. In this section, we first review the specific requirements of each application with respect to FEC. Then, we discuss the algorithmic issues that have been encountered and solved in order to design RS TPCs that are compatible with these requirements.
FEC design for data transmission over OTN and PON
40 Gbps transport over OTN calls for both high coding gains and low overhead (10 Kbits) with high coding gain. BER requirements are less stringent than for OTN and are typically of the order of 10^-11. High coding gains result in increased link budget [20]. On the other hand, decoding complexity should be kept at a minimum in order to reduce the cost of optical network units (ONUs) deployed at the end-user side. Channel codes for PON are also expected to be robust against burst errors.
Choice of the component codes
On the basis of the above-mentioned requirements, we have chosen to focus on RS product codes with less than 20% overhead. Higher overheads lead to larger signal bandwidth, thereby increasing in return the complexity of electronic and optical components. Since the rate of the product code is the product of the individual rates of the component codes, RS component codes with code rate R ≥ 0.9 are necessary. Such code rates can be obtained by considering multiple-error-correcting RS codes over large Galois fields, that is, GF(256) and beyond. Another solution is to use single-error-correcting (SEC) RS codes over Galois fields of smaller order (32 or 64). The latter solution has been retained in this work since it leads to low-complexity SISO decoders.
First, it is shown in [21] that 16 error patterns are sufficient to obtain near-optimum performance with the Chase-Pyndiah algorithm for SEC RS codes. In contrast, more sophisticated SISO decoders are required with multiple-error-correcting RS codes (e.g., see [22] or [23]) since the number of error patterns necessary to obtain near-optimum performance with the Chase-Pyndiah algorithm grows exponentially with mt for a t-error-correcting RS code over GF(2^m).
In addition, SEC RS codes admit low-complexity algebraic decoders. This feature further contributes to reducing the complexity of the Chase-Pyndiah algorithm. For multiple-error-correcting RS codes, the Berlekamp-Massey algorithm and the Euclidean algorithm are the preferred algebraic decoding methods [15]. But they introduce unnecessary overhead computations for SEC codes. Instead, a simpler decoder is obtained from the direct decoding method devised by Peterson, Gorenstein, and Zierler (PGZ decoder) [24, 25]. First, the two syndromes S1 and S2 are calculated by evaluating the received polynomial r(x) at the two code roots α^b and α^(b+1):
\[ S_1 = r(\alpha^{b}), \qquad S_2 = r(\alpha^{b+1}). \tag{6} \]
If S1 = S2 = 0, r(x) is a valid codeword and decoding stops. If only one of the two syndromes is zero, a decoding failure is declared. Otherwise, the error locator X is calculated as
\[ X = \frac{S_2}{S_1}, \tag{7} \]
from which the error location i is obtained by taking the discrete logarithm of X. The error magnitude E is finally given by
\[ E = \frac{S_1}{X^{b}}. \tag{8} \]
Hence, apart from the syndrome computation, at most two divisions over GF(2^m) are required to obtain the error position and value with the PGZ decoder (only one is needed when b = 0). The overall complexity of the PGZ decoder is usually dominated by the initial syndrome computation step. Fortunately, the syndromes need not be fully recomputed at each decoding attempt in the Chase-2 decoder. Rather, they can be updated in a very simple way by taking into account only the bits that are flipped between successive error patterns [26]. This optimization further alleviates SISO decoding complexity.
On the basis of the above arguments, two RS product codes have been selected for the two envisioned applications. The (31, 29)^2 RS product code over GF(32) has been retained for PON systems since it combines a moderate overhead of 12.5% with a moderate code length of 4805 coded bits. This is only about twice the code length of the classical (255, 239) RS code over GF(256). On the other hand, the (63, 61)^2 RS product code over GF(64) has been preferred for OTN, since it has a smaller overhead (6.3%), similar to the one introduced by the standard (255, 239) RS code, and also a larger coding gain, as we will see later.
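The PGZ decoding steps for a SEC RS code can be sketched as follows for the (31, 29) code over GF(32). The sketch assumes the primitive polynomial x^5 + x^2 + 1, the relations S1 = E·X^b and S2 = E·X^(b+1) for a single error of magnitude E at locator X, and a simple non-systematic encoder used only to produce test codewords; all names are hypothetical.

```python
M, PRIM_POLY = 5, 0b100101            # GF(32) built from x^5 + x^2 + 1 (assumed)
SIZE = (1 << M) - 1                   # 31 nonzero field elements
EXP, LOG = [0] * (2 * SIZE), [0] * (SIZE + 1)
_x = 1
for _i in range(SIZE):                # build antilog/log tables for alpha
    EXP[_i] = EXP[_i + SIZE] = _x
    LOG[_x] = _i
    _x <<= 1
    if _x >> M:
        _x ^= PRIM_POLY

def mul(a, b):
    """Multiply two field elements using the log tables."""
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]

def encode(msg, b=0):
    """Non-systematic encoding c(x) = msg(x) g(x), g(x) = (x - a^b)(x - a^(b+1))."""
    g = [EXP[(2 * b + 1) % SIZE], EXP[b % SIZE] ^ EXP[(b + 1) % SIZE], 1]
    c = [0] * (len(msg) + 2)
    for i, mi in enumerate(msg):
        for j, gj in enumerate(g):
            c[i + j] ^= mul(mi, gj)
    return c

def evaluate(r, power):
    """Evaluate r(x) at alpha^power; r[k] is the coefficient of x^k."""
    acc = 0
    for k, coef in enumerate(r):
        if coef:
            acc ^= EXP[(LOG[coef] + power * k) % SIZE]
    return acc

def pgz_decode(r, b=0):
    """Correct at most one symbol error; return None on decoding failure."""
    s1, s2 = evaluate(r, b), evaluate(r, b + 1)
    if s1 == 0 and s2 == 0:
        return list(r)                          # valid codeword
    if s1 == 0 or s2 == 0:
        return None                             # exactly one zero syndrome
    loc = (LOG[s2] - LOG[s1]) % SIZE            # X = S2 / S1 = alpha^loc
    mag = EXP[(LOG[s1] - b * loc) % SIZE]       # E = S1 / X^b
    out = list(r)
    out[loc] ^= mag
    return out

codeword = encode(list(range(1, 30)), b=0)
received = list(codeword)
received[7] ^= 19                               # single symbol error
corrected = pgz_decode(received, b=0)
```

When b = 0 the exponent b·loc vanishes, so the error magnitude is simply S1 and only the division S2/S1 remains, in line with the operation count given above.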
Performance analysis and code optimization
RS product codes built from SEC RS component codes are very attractive from the decoding complexity point of view. On the other hand, they have a low minimum distance D = 3 × 3 = 9 at the symbol level. Therefore, it is essential to verify that this low minimum distance does not introduce error flares in the code performance curve that would penalize the effective coding gain at low BER. Monte Carlo simulations can be used to evaluate the code performance down to BER of 10^-10 to 10^-11 within a reasonable computation time. For lower BER, analytical bounding techniques are required.
In the following, binary on-off keying (OOK) intensity modulation with direct detection over additive white Gaussian noise (AWGN) is assumed. This model was adopted here as a first approximation which simplifies the analysis and also facilitates the comparison with other channel codes. More sophisticated models of optical systems for the purpose of assessing the performance of channel codes are developed in [27, 28]. Under the previous assumptions, the BER of the RS product code at high SNRs and under ML soft-decision decoding is well approximated by the first term of the union bound, Equation (9), where Q is the input Q-factor (see [29, Chapter 5]), d is the minimum distance of the binary image Pb of the product code, and B_d the corresponding multiplicity (number of codewords with minimum Hamming weight d in Pb). This expression shows that the asymptotic performance of the product code is determined by the bit-level minimum distance d of the product code, not by the symbol minimum distance D1D2.
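To make the role of d and B_d concrete, the sketch below evaluates a dominant union-bound term numerically. The exact expression of Equation (9) is not reproduced in this text, so the form used here is an assumption: each of the B_d minimum-weight codewords contributes a pairwise error probability ½ erfc(Q√(d/2)), weighted by the fraction d/n of erroneous bits, with n the binary code length. The uncoded reference BER = ½ erfc(Q/√2) is the standard Q-factor relation. Function names and the multiplicity in the usage line are hypothetical.

```python
from math import erfc, sqrt

def uncoded_ber(q):
    """BER of uncoded binary transmission at Q-factor q: 0.5 * erfc(q / sqrt(2))."""
    return 0.5 * erfc(q / sqrt(2.0))

def asymptotic_ber(q, d, b_d, n_bits):
    """Dominant union-bound term (assumed form, see the lead-in).

    d      : bit-level minimum distance of the binary image
    b_d    : number of binary codewords of weight d
    n_bits : binary code length m * N1 * N2
    """
    return (d * b_d / n_bits) * 0.5 * erfc(q * sqrt(d / 2.0))

# Illustrative call only: b_d = 1000 is a made-up multiplicity, not a Table 1 value.
example = asymptotic_ber(q=4.0, d=14, b_d=1000, n_bits=4805)
```

Because d appears inside the erfc argument while B_d only scales the result, increasing the bit-level minimum distance lowers the error floor far more effectively than reducing the multiplicity.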
Knowledge of the quantities d and B_d is required in order to predict the asymptotic performance of the code in the high Q-factor (low BER) region using (9). These parameters depend in turn on the basis B used to represent the 2^m-ary symbols as bits, and are usually unknown. Computing the exact binary weight enumerator of RS product codes is indeed a very difficult problem. Even the symbol weight enumerator is hard to find since it is not completely determined by the symbol weight enumerators of the component codes [30]. An average binary weight enumerator for RS product codes was recently derived in [31]. This enumerator is simple to calculate. However, simulations are still required to assess the tightness of the bounds for a particular code realization.
A computational method that allows the determination of d and B_d under certain conditions was recently suggested in [32]. This method exploits the fact that product codewords with minimum symbol weight D1D2 are readily constructed as the direct product of a minimum-weight row codeword with a minimum-weight column codeword. Specifically, the exact number A_{D1D2} of distinct codewords with symbol weight D1D2 in the product code C1 ⊗ C2 is given by Equation (10). These codewords can be enumerated with the help of a computer provided that A_{D1D2} is not too large. Estimates of d and B_d are then obtained by computing the Hamming weight of the binary expansion of those codewords.
Table 1. Minimum distance d and multiplicity B_d for the binary image of the (31, 29)^2 and (63, 61)^2 RS product codes as a function of the first code root α^b.
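Let \hat{d} denote the smallest binary weight found among the product codewords of symbol weight D1D2. Necessarily, d ≤ \hat{d}. If it can be shown that product codewords of symbol weight > D1D2 necessarily have binary weight > \hat{d} at the bit level (this is not always the case, depending on the value of \hat{d}), then it follows that d = \hat{d}.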
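The size of the enumeration can be estimated with a short calculation. The sketch assumes two standard facts that are not spelled out in this text: an MDS code of length N and minimum distance D over GF(q) has (q − 1)·C(N, D) codewords of weight D, and the minimum-weight codewords of a product code are direct products x ⊗ y counted up to the q − 1 scalar factors that can be moved between x and y. Function names are hypothetical.

```python
from math import comb

def mds_min_weight_count(n, d, q):
    """Number of weight-d codewords in an (n, n-d+1, d) MDS code over GF(q)."""
    return (q - 1) * comb(n, d)

def product_min_weight_count(n, d, q):
    """Codewords of symbol weight d*d in the product of the code with itself."""
    a = mds_min_weight_count(n, d, q)
    return a * a // (q - 1)              # (a x) ⊗ y equals x ⊗ (a y)

print(product_min_weight_count(31, 3, 32))   # 626355775
print(product_min_weight_count(63, 3, 64))
```

Under these assumptions the (31, 29)^2 code has a few hundred million codewords of symbol weight 9, which remains within reach of an exhaustive computer enumeration.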
This method has been used to obtain the binary minimum distance and multiplicity of the (31, 29)^2 and (63, 61)^2 RS product codes using narrow-sense component codes with generator polynomial g(x) = (x − α)(x − α^2). This is the classical definition of SEC RS codes that can be found in most textbooks. The results are given in Table 1. We observe that in both cases, we are in the most unfavorable case where the bit-level minimum distance d is equal to the symbol-level minimum distance D, and no greater. Simulation results for the two RS TPCs after 8 decoding iterations are shown in Figures 3 and 4, respectively. The corresponding asymptotic performance curves calculated using (9) are plotted in dashed lines. For comparison purposes, we have also included the performance of algebraic decoding of RS codes of similar code rate over GF(256). We observe that the low minimum distance introduces error flares at BER of 10^-8 and 10^-9 for the (31, 29)^2 and (63, 61)^2 product codes, respectively. Clearly, the two RS TPCs do not match the BER requirements imposed by the envisioned applications.
One solution to increase the minimum distance of the product code is to resort to code extension or expurgation. However, this approach increases the overhead. It also increases decoding complexity since a higher number of error patterns is then required to maintain near-optimum performance with the Chase-Pyndiah algorithm [21]. In this work, another approach has been considered. Specifically, investigations have been conducted in order to identify code constructions that can be mapped into binary images with minimum distance larger than 9. One solution is to investigate different bases B. How to find a basis that maps a nonbinary code into a binary code with bit-level minimum distance strictly larger than the symbol-level designed distance remains a challenging research problem. Thus, the problem was relaxed by fixing the basis to be the polynomial basis, and studying instead the influence of the choice of the code roots on the minimum distance of the binary image. Any SEC RS code over GF(2^m) can be compactly described by its generator polynomial
\[ g(x) = (x - \alpha^{b})(x - \alpha^{b+1}), \tag{11} \]
where b is an integer in the range 0, ..., 2^m − 2.
Figure 3. BER performance of the (31, 29)^2 RS product code as a function of the first code root α^b, after 8 iterations.
Narrow-sense RS codes are obtained by setting b = 1 (which is the usual choice for most applications). Note however that different values for b generate different sets of codewords, and thus different RS codes with possibly different binary weight distributions. In [32], it is shown that alternate SEC RS codes obtained by setting b = 0 have minimum distance d = D + 1 = 4 at the bit level. This is a notable improvement over classical narrow-sense (b = 1) RS codes for which d = D = 3. This result suggests that RS product codes should preferably be built from two RS component codes with first root α^0. RS product codes constructed in this way will be called alternate RS product codes in the following.
We have computed the binary minimum distance d and multiplicity B_d of the (31, 29)^2 and (63, 61)^2 alternate RS product codes. The values are reported in Table 1. Interestingly, the alternate product codes have a minimum distance d as high as 14 at the bit level, at the expense of an increase of the error coefficient B_d. Thus, we get most of the gain offered by extended or expurgated codes (for which d = 16, as verified by computer search) but without reducing the code rate. It is also worth noting that this extra coding gain is obtained without increasing decoding complexity. The same SISO decoder is used for both narrow-sense and alternate SEC RS codes. In fact, the only modifications occur in (6)–(8) of the PGZ decoder, which actually simplify when b = 0.
Simulated performance and asymptotic bounds for the alternate RS product codes are shown in Figures 3 and 4. A notable improvement is observed in comparison with the performance of the narrow-sense product codes since the error flare is pushed down by several decades in both cases. By extrapolating the simulation results, the net coding gain (as defined in [5]) at a BER of 10^-13 is estimated to be around 8.7 dB and 8.9 dB for the RS(31, 29)^2 and RS(63, 61)^2 codes, respectively. As a result, the two selected RS product codes are now fully compatible with the performance requirements imposed by the respective envisioned applications. More importantly, this improvement has been obtained at no additional cost.
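The effect of the first root on the binary image can be explored by brute force on a small code. The sketch below uses the (7, 5) SEC RS code over GF(8) as a toy stand-in (the paper's codes over GF(32) and GF(64) are far too large for exhaustive search), with the assumed primitive polynomial x^3 + x + 1 and the polynomial basis; names are hypothetical. One property is easy to see: with b = 0, the root α^0 = 1 forces the symbols of every codeword to sum to zero, so each bit plane has even parity and every binary weight is even.

```python
import itertools

M, PRIM_POLY = 3, 0b1011              # GF(8) built from x^3 + x + 1 (assumed)
SIZE = (1 << M) - 1
EXP, LOG = [0] * (2 * SIZE), [0] * (SIZE + 1)
_x = 1
for _i in range(SIZE):
    EXP[_i] = EXP[_i + SIZE] = _x
    LOG[_x] = _i
    _x <<= 1
    if _x >> M:
        _x ^= PRIM_POLY

def mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]

def binary_weights(b):
    """Set of binary weights of all nonzero codewords of the (7, 5) RS code
    with generator polynomial g(x) = (x - alpha^b)(x - alpha^(b+1))."""
    g = [EXP[(2 * b + 1) % SIZE], EXP[b % SIZE] ^ EXP[(b + 1) % SIZE], 1]
    weights = set()
    for msg in itertools.product(range(SIZE + 1), repeat=SIZE - 2):
        if not any(msg):
            continue
        c = [0] * SIZE
        for i, mi in enumerate(msg):
            if mi:
                for j, gj in enumerate(g):
                    c[i + j] ^= mul(mi, gj)
        # Polynomial-basis binary image: count the set bits of each symbol.
        weights.add(sum(bin(s).count("1") for s in c))
    return weights

narrow_sense = binary_weights(1)      # b = 1
alternate = binary_weights(0)         # b = 0
print(min(narrow_sense), min(alternate))
```

Since the bit-level minimum distance is at least D = 3 and must be even when b = 0, the alternate code cannot have binary minimum distance below 4, which is consistent with the result of [32] quoted above.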
Figure 4. BER performance of the (63, 61)^2 RS product code as a function of the first code root α^b, after 8 decoding iterations.
Comparison with BCH product codes
A comparison with BCH product codes is in order since BCH product codes have already found application in optical communications. A major limitation of BCH product codes is that very large block lengths (>60000 coded bits) are required to achieve high code rates (R > 0.9). On the other hand, RS product codes can achieve the same code rate as BCH product codes, but with a block size about 3 times smaller [21]. This is an interesting advantage since, as shown later in the paper, large block lengths increase the decoding latency and also the memory complexity in the decoder architecture. RS product codes are also expected to be more robust to error bursts than BCH product codes. Both coding schemes inherit burst-correction properties from the row-column interleaving in the direct product construction. But RS product codes also benefit from the fact that, in the most favorable case, m consecutive erroneous bits may cause a single symbol error in the received word.
A performance comparison has been carried out between the two selected RS product codes and extended BCH (eBCH) product codes of similar code rate: the eBCH(128, 120)^2 and the eBCH(256, 247)^2. Code extension has been used for BCH codes since it increases the minimum distance without increasing decoding complexity or significantly decreasing the code rate, in contrast to RS codes. Both eBCH TPCs have minimum distance 16 with multiplicities 853442 and 6908802, respectively. Simulation results after 8 iterations are shown in Figures 3 and 4. The corresponding asymptotic bounds are plotted in dashed lines. We observe that eBCH TPCs converge at lower Q-factors. As a result, a 0.3-dB gain is obtained at BER in the range 10^-8 to 10^-10. However, the large multiplicities of eBCH TPCs introduce a change of slope in the performance curves at lower BER. In fact, examination of the asymptotic bounds shows that alternate RS TPCs are expected to perform at least as well as eBCH TPCs in the BER range of interest for optical communication, for example, 10^-10 to 10^-15. Therefore, we conclude that RS TPCs compare favorably with eBCH TPCs in terms of performance. We will see in the next sections that RS TPCs have additional advantages in terms of decoding complexity and throughput for the target applications.
Figure 5. BER performance for the (63, 61)^2 RS product code as a function of the number of quantization bits for the soft input (sign bit included).
Soft-input quantization
The previous performance study assumed unquantized soft values. In a practical receiver, a finite number q of bits (sign bit included) is used to represent soft information. Soft-input quantization is performed by an analog-to-digital converter (ADC) in the receiver front-end. The very high bit rate in fiber optic systems makes analog-to-digital conversion a challenging issue. It is therefore necessary to study the impact of soft-input quantization on the performance. Figure 5 presents simulation results for the (63, 61)^2 alternate RS product code using q = 3 and q = 4 quantization bits, respectively. For comparison purposes, the performance without quantization is also shown. Using q = 4 bits yields virtually no degradation with respect to ideal (infinite) quantization, whereas q = 3 bits of quantization introduce a 0.5 dB penalty. Similar conclusions have been obtained with the (31, 29)^2 RS product code and also with various eBCH TPCs, as reported in [27, 33] for example.
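A q-bit soft-input quantizer can be sketched as follows. The quantization law used in the simulations is not specified here, so the sketch assumes a uniform, symmetric, sign-magnitude quantizer with saturation at a hypothetical full-scale value `r_max`; the function name is also an assumption.

```python
import numpy as np

def quantize(r, q_bits, r_max=1.0):
    """Uniform symmetric quantizer using q_bits bits (sign bit included).

    Returns integer levels in [-(2^(q-1) - 1), 2^(q-1) - 1]; inputs beyond
    +/- r_max saturate at the extreme levels.
    """
    levels = 2 ** (q_bits - 1) - 1               # largest magnitude index
    scaled = np.round(np.asarray(r, dtype=float) / r_max * levels)
    return np.clip(scaled, -levels, levels).astype(int)

soft = [2.0, -2.0, 0.0, 0.4]
q3 = quantize(soft, q_bits=3)                    # levels -3 .. 3
q4 = quantize(soft, q_bits=4)                    # levels -7 .. 7
```

Going from q = 3 to q = 4 roughly doubles the number of magnitude levels (3 versus 7 in this sketch), which gives an intuition for why the fourth bit recovers most of the quantization loss.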
FULL-PARALLEL TURBO DECODING ARCHITECTURE DEDICATED TO PRODUCT CODES
Designing turbo decoding architectures compatible with the very high line rate requirements imposed by fiber optic systems at reasonable cost is a challenging issue. Parallel decoding architectures are the only solution to achieve data rates above 10 Gbps. A simple architectural solution is to duplicate the elementary decoders in order to achieve the given throughput. However, this solution results in a turbo decoder with unacceptable cumulative area. Thus, smarter parallel decoding architectures have to be designed in order to achieve a better trade-off between performance and complexity under the constraint of a high throughput. In the following, we focus on an (N^2, K^2) product code obtained from two identical (N, K) component codes over GF(2^m). For 2^m-ary RS codes, m > 1, whereas m = 1 for binary BCH codes.
Previous work
Many turbo decoder architectures for product codes have been proposed in the literature. The classical approach involves decoding all the rows or all the columns of a matrix before the next half-iteration. When an application requires high-speed decoders, an architectural solution is to cascade SISO elementary decoders for each half-iteration. In this case, memory blocks are necessary between each half-iteration to store channel data and extrinsic information. Each memory block is composed of four memories of mN^2 soft values. Thus, duplicating a SISO elementary decoder results in duplicating the memory block, which is very costly in terms of silicon area. In 2002, a new architecture for turbo decoding product codes was proposed [10]. The idea is to store several data at the same address and to perform semiparallel decoding to increase the data rate. However, it is necessary to process these data by row and by column. Let us consider l adjacent rows and l adjacent columns of the initial matrix. The l^2 data constitute a word of the new matrix, which has l^2 times fewer addresses. This data organization does not require any particular memory architecture. The results obtained show that the turbo decoding throughput is increased by l^2 when l elementary decoders processing l data simultaneously are used. Turbo decoding latency is divided by l. The area of the l elementary decoders is increased by l/2 while the memory is kept constant.
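The word organization of [10] can be pictured with a small address-mapping sketch. The row-major layout, the requirement that l divides N, and the helper names below are assumptions made for illustration; the cited architecture may arrange words differently.

```python
def word_address(i, j, n, l):
    """Return (address, offset) of matrix entry (i, j) when each l x l block
    of the N x N matrix is stored as one memory word.

    Assumptions: n is a multiple of l and blocks are numbered row by row.
    """
    words_per_row = n // l
    address = (i // l) * words_per_row + (j // l)
    offset = (i % l) * l + (j % l)
    return address, offset


def addresses_for_rows(first_row, n, l):
    """Distinct word addresses touched when reading l adjacent rows."""
    return sorted({word_address(i, j, n, l)[0]
                   for i in range(first_row, first_row + l)
                   for j in range(n)})


if __name__ == "__main__":
    n, l = 32, 4
    print(len(addresses_for_rows(0, n, l)), "words cover", l, "rows of length", n)
```

Reading l adjacent rows then costs N/l word accesses instead of l × N single-value accesses, and the same holds for l adjacent columns, which is why the same memory can serve both decoding directions.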
Full-parallel decoding principle
All rows (or all columns) of a matrix can be decoded in parallel. If the architecture is composed of 2N elementary decoders, an appropriate treatment of the matrix eliminates the need to reconstruct the matrix between successive half-iterations. Specifically, let i and j be the indices of a row and a column of the N × N matrix. In full-parallel processing, row decoder i begins decoding with the soft value in the ith position and then processes the soft values by increasing the index by one modulo N. Similarly, column decoder j begins decoding with the soft value in the jth position and then processes the soft values by decreasing the index by one modulo N. Full-parallel decoding of turbo product codes is possible thanks to the cyclic property of BCH and RS codes. Indeed, every cyclic shift c' = (c_{N−1}, c_0, ..., c_{N−3}, c_{N−2}) of a codeword c = (c_0, c_1, ..., c_{N−2}, c_{N−1}) is also a valid codeword in a cyclic code. Therefore, only one clock period is necessary between two successive matrix decoding operations. The full-parallel decoding of an N × N product code matrix is described in Figure 6. A similar strategy was previously presented in [34], where memory access conflicts are resolved by means of an appropriate treatment of the matrix.
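The access order described above can be checked with a short sketch. It reads "position" as the column index for a row decoder and as the row index for a column decoder, which is an interpretation of the text; the function names are hypothetical, and pipeline latency inside the decoders is ignored.

```python
def row_position(i, t, n):
    """Matrix entry (row, column) handled by row decoder i at clock period t."""
    return (i, (i + t) % n)


def column_position(j, t, n):
    """Matrix entry (row, column) handled by column decoder j at clock period t."""
    return ((j - t) % n, j)


def schedule_is_aligned(n):
    """Check that, at every clock period, the N entries visited by the row
    decoders are exactly the N entries visited by the column decoders."""
    for t in range(n):
        rows = {row_position(i, t, n) for i in range(n)}
        cols = {column_position(j, t, n) for j in range(n)}
        if rows != cols or len(rows) != n:
            return False
    return True


if __name__ == "__main__":
    print(schedule_is_aligned(31), schedule_is_aligned(63))
```

Under this reading, the value produced by row decoder i at period t is the one that column decoder (i + t) mod N expects at the same step of its own sequence, so each output can be forwarded directly and no matrix has to be rebuilt in memory. Every decoder sees a cyclic shift of its word, which is harmless because a cyclic shift of a codeword is again a codeword.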
Figure 6. Full-parallel decoding of a product code matrix.
The elementary decoder latency depends on the structure of the decoder (i.e., the number of pipeline stages) and on the code length N. Here, since the matrix reconstruction step is removed, the latency between row and column decoding is zero.
Full-parallel architecture for product codes
The major advantage of our full-parallel architecture is that it enables the memory block of 4mN^2 soft values between each half-iteration to be removed. However, the codeword soft values exchanged between the row and column decoders have to be routed. One solution is to use a connection network for this task. In our case, we have chosen an Omega network. The Omega network is one of several connection networks used in parallel machines [35]. It is composed of log_2 N stages, each having N/2 exchange elements. The Omega network complexity is thus N × log_2 N in terms of number of connections and (N/2) × log_2 N in terms of 2 × 2 switch transfer blocks. For example, the equivalent gate complexity of a 31 × 31 network can be estimated at 200 logic gates per exchanged bit. Figure 7 depicts a full-parallel architecture for the turbo decoding of product codes. It is composed of cascaded modules for the turbo decoder. Each module is dedicated to one iteration. However, it is possible to process several iterations with the same module. In our approach, 2N elementary decoders and 2 connection blocks are necessary for one module. A connection block is composed of 2 Omega networks exchanging the R and Rk soft values. Since the Omega network has low complexity, the full-parallel turbo decoder complexity essentially depends on the complexity of the elementary decoder.
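As an illustration of why an Omega network suits this task, the sketch below routes cyclic shifts through a textbook Omega network using destination-tag routing and checks that no two values ever need the same line. It assumes N = 32 (a power of two), since how the 31 × 31 case is mapped onto the network is not detailed here; all function names are hypothetical.

```python
def omega_route(src, dst, n_bits):
    """Line positions occupied after each stage when routing src -> dst."""
    mask = (1 << n_bits) - 1
    pos, path = src, []
    for stage in range(n_bits):
        # Perfect shuffle: rotate the n_bits-wide address left by one bit.
        pos = ((pos << 1) | (pos >> (n_bits - 1))) & mask
        # 2x2 exchange element: output chosen by the destination bit (MSB first).
        pos = (pos & ~1) | ((dst >> (n_bits - 1 - stage)) & 1)
        path.append(pos)
    return path


def shift_is_conflict_free(shift, n_bits):
    """True if the cyclic shift s -> (s + shift) mod N reaches every target
    and uses each line at most once per stage."""
    size = 1 << n_bits
    paths = [omega_route(s, (s + shift) % size, n_bits) for s in range(size)]
    reaches_target = all(p[-1] == (s + shift) % size for s, p in enumerate(paths))
    no_collision = all(len({p[k] for p in paths}) == size for k in range(n_bits))
    return reaches_target and no_collision


def omega_cost(n_bits):
    """Stage, connection and 2x2 switch counts for N = 2**n_bits."""
    size = 1 << n_bits
    return {"stages": n_bits,
            "connections": size * n_bits,
            "switches": (size // 2) * n_bits}


if __name__ == "__main__":
    print(all(shift_is_conflict_free(s, 5) for s in range(32)))
    print(omega_cost(5))
```

An Omega network cannot realize every permutation without blocking, but cyclic shifts pass without conflict, and a cyclic shift per clock period is the pattern that the row-to-column exchange of the full-parallel schedule requires.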
Elementary SISO decoder architecture
The block diagram of an elementary SISO decoder is shown in Figure 2, where k stands for the current half-iteration number. Rk is the soft-input matrix computed from the previous half-iteration, whereas R denotes the initial matrix delivered by the receiver front-end (Rk = R for the first half-iteration). Wk is the extrinsic information matrix. αk is a scaling factor that depends on the current half-iteration and is used to mitigate the influence of the extrinsic information during the first iterations. The decoder architecture is structured in three pipelined stages identified as the reception, processing, and transmission units [36]. During each stage, the N soft values of the received word Rk are processed sequentially in N clock periods. The reception stage computes the initial syndromes Si and finds the Lr least reliable bits in the received word. The main function of the processing stage is to build and then to correct the Nep error patterns obtained from the initial syndrome and from combinations of the least reliable bits. The processing stage also has to produce a metric (the Euclidean distance between the error pattern and the received word) for each error pattern. Finally, a selection function identifies the maximum-likelihood codeword d and the competing codewords c (if any). The transmission stage performs several functions: computing the reliability of each binary soft value, computing the extrinsic information, and correcting the received soft values. The N soft values of the codeword are thus corrected sequentially. The decoding process needs to access the R and Rk soft values during the three decoding phases. For this reason, these words are stored in six random access memories (RAMs) of size q × m × N controlled by a finite-state machine. In summary, a full-parallel TPC decoder architecture requires low-complexity decoders.
COMPLEXITY AND THROUGHPUT ANALYSIS OF THE FULL-PARALLEL REED-SOLOMON TURBO DECODERS
Increasing the throughput regardless of the turbo decoder complexity is not relevant. In order to compare the throughput and complexity of RS and BCH turbo decoders, we propose to measure the efficiency η of a parallel architecture by the ratio η = T/C, (12) where T is the throughput and C is the complexity of the design.
An efficient architecture is expected to have a high η ratio, that is, a high throughput with low hardware complexity. In this section, we determine and compare the efficiency of TPC decoders based on SEC BCH and RS component codes, respectively.
Figure 7. Full-parallel architecture for decoding of product codes.
Turbo decoder complexity analysis
The complexity of a turbo decoder for product codes corresponds to the cumulative area of its computation resources, memory resources, and communication resources. In a full-parallel turbo decoder, the main part of the complexity comes from memory and computation resources. Indeed, the major advantage of our full-parallel architecture is that it enables the memory blocks between each half-iteration to be replaced by Omega connection networks. Communication resources thus represent less than 1% of the total area of the turbo decoder. Consequently, the following study focuses only on memory and computation resources.
Complexity analysis of computation resources
The computation resources of an elementary decoder are split into three pipelined stages. The reception and transmission stages have O(log(N)) complexity. For these two stages, replacing a BCH code by an RS code of the same code length N (at the symbol level) over GF(2^m) results in an increase of both complexity and throughput by a factor m. As a result, efficiency is constant in these parts of the decoder. However, the hardware complexity of the processing stage increases linearly with the number Nep of error patterns. Consequently, the increase in the local parallelism rate has no influence on the area of this stage and thus increases the efficiency of an RS SISO decoder. In order to verify these general considerations, turbo decoders for the (15, 13)^2, (31, 29)^2, and (63, 61)^2 RS product codes were described in a hardware description language (HDL) and synthesized. Logic syntheses were performed using the Synopsys Design Compiler tool with an STMicroelectronics 90 nm CMOS process. All designs were clocked at 100 MHz. The complexity of BCH turbo decoders was estimated using a generic complexity model that delivers an estimate of the gate count for any code size and any set of decoding parameters. Taking into account the implementation and performance constraints, this model can therefore be used to select a code size N and a set of decoding parameters [37]. In particular, the number of error patterns Nep and the number of competing codewords kept for soft-output computation directly affect both the hardware complexity and the decoding performance. Increasing these parameter values improves performance but also increases complexity.
Table 2. Computation resource complexity of selected TPC decoders in terms of gate count.
Table 2 summarizes some computation resource complexities in terms of gate count for different BCH and RS product codes. Firstly, the complexity of an elementary decoder for each product code is given. The results clearly show that RS elementary decoders are more complex than BCH elementary decoders over the same Galois field. Complexity results for a full-parallel module of the turbo decoding process are also given in Table 2. As described in Figure 7, a full-parallel module is composed of 2N elementary decoders and 2 connection blocks for one iteration. In this case, full-parallel modules composed of RS elementary decoders are seen to be less complex than full-parallel modules composed of BCH elementary decoders when comparing eBCH and RS product codes of similar code rate R. For instance, for a code rate R = 0.88, the computation resource complexity is about 892,672 gates for the BCH(128, 120)^2 code and about 267,220 gates for the RS(31, 29)^2 code. This is due to the fact that RS codes need a smaller code length N (at the symbol level) to achieve a given code rate, in contrast to binary BCH codes. Considering again the previous example, only 31 × 2 decoders are necessary in the RS case for full-parallel decoding, compared to 128 × 2 decoders in the BCH case. Similarly, Figure 8 gives the computation resource area of BCH and RS turbo decoders for 1 iteration and different parallelism degrees. We verify that a higher P (i.e., a higher throughput) can be obtained with fewer computation resources using RS turbo decoders. This means that RS product codes are more efficient in terms of computation resources for full-parallel architectures dedicated to turbo decoding.
Figure 8. Comparison of computation resource complexity.
Complexity analysis of memory resources
A half-iteration of a parallel turbo decoder contains N banks of q × m × N bits. The internal memory complexity of a parallel decoder for one half-iteration can be approximated by γ × q × m × N^2, (13) where γ is a technological parameter specifying the number of equivalent gate counts per memory bit, q is the number of quantization bits for the soft values, and m is the number of bits per Galois field element. Using (17), it can also be expressed as γ × q × P^2/m, (14) where P is the parallelism degree, defined as the number of generated bits per clock period (t0). Let us consider a BCH code and an RS code of similar code length N = 2^m − 1. For BCH codes, a symbol corresponds to 1 bit, whereas it is made of m bits for RS codes. Calculating the SISO memory area for both BCH and RS codes at a given parallelism degree P gives the following ratio: (memory area for BCH)/(memory area for RS) = m. (15) This result shows that RS turbo decoders have lower memory complexity for a given parallelism rate. This was confirmed by the memory area estimation results shown in Figure 9. The random access memory (RAM) areas of BCH and RS turbo decoders for a half-iteration and different parallelism degrees are plotted using a memory area estimation model provided by STMicroelectronics. We can observe that a higher P (i.e., a higher throughput) can be obtained with less memory when using an RS turbo decoder. Thus, full-parallel decoding of RS codes is more memory-efficient than BCH turbo decoding.
Figure 9. Comparison of internal RAM complexity.
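The memory argument can be checked numerically with a small sketch. It assumes that the approximation is γ × q × m × N^2 (N banks of q × m × N bits, weighted by γ gates per bit) and that P = N × m; the value γ = 1 is a placeholder, not a figure from the text, and the function names are hypothetical.

```python
def ram_gates(gamma, q, m, n):
    """Approximate memory complexity of one half-iteration:
    N banks of q * m * N bits, weighted by gamma gates per bit."""
    return gamma * q * m * n * n


def ram_gates_from_parallelism(gamma, q, m, p):
    """Same quantity rewritten with the parallelism degree P = N * m."""
    return gamma * q * p * p / m


if __name__ == "__main__":
    gamma, q = 1.0, 5          # gamma is a placeholder technology factor
    p = 155                    # parallelism degree of the RS(31, 29)^2 decoder
    rs = ram_gates_from_parallelism(gamma, q, 5, p)
    bch = ram_gates_from_parallelism(gamma, q, 1, p)
    print(rs, bch, bch / rs)   # the ratio equals m
```

At equal P, a binary code needs P one-bit symbols per word where the RS code needs only P/m symbols, so the N^2 term shrinks faster than the factor m grows.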
Turbo decoder throughput analysis
In order to maximize the data rate, decoding resources are assigned for each decoding iteration. The throughput of a turbo decoder can be defined as T = P × R × f0, (16) where R is the code rate and f0 = 1/t0 is the maximum frequency of an elementary SISO decoder. Ultrahigh throughput can be reached by increasing these three parameters.
(i) R is a parameter that exclusively depends on the code considered. Thus, using codes with a higher code rate (e.g., RS codes) would provide larger throughput.
(ii) In a full-parallel architecture, a maximum throughput is obtained by duplicating N elementary decoders generating m soft values per clock period. The parallelism degree can be expressed as P = N × m. (17) Therefore, an enhanced parallelism degree can be obtained by using nonbinary codes (e.g., RS codes) with larger code length N.
(iii) Finally, in a high-speed architecture, each elementary decoder has to be optimized in terms of working frequency f0. This is accomplished by including pipeline stages within each elementary SISO decoder. RS and BCH turbo decoders of equivalent code size have equivalent working frequency f0 since RS decoding is performed by introducing some local parallelism at the soft value level. This result was verified during logic syntheses. The main drawback of pipelining elementary decoders is the extra complexity generated by internal memory requirements.
Table 3. Hardware efficiency of selected TPC decoders.
Since RS codes have higher P and R for equivalent f0, an RS turbo decoder can reach a higher data rate than an equivalent BCH turbo decoder. However, the increase in throughput cannot be considered independently of the turbo decoder complexity.
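The dependence of throughput on R, P, and f0 can be illustrated with a short calculation. The sketch assumes the throughput takes the form T = P × R × f0 with P = N × m, and it evaluates the efficiency T/C using only the computation-resource gate counts quoted earlier for the two R = 0.88 codes, so the efficiency values are not expected to match those of Table 3. Function names are hypothetical.

```python
def code_rate(n, k):
    """Rate of the (N^2, K^2) product code built from two (N, K) component codes."""
    return (k / n) ** 2


def throughput_bps(n, k, m, f0_hz):
    """T = P x R x f0 with parallelism degree P = N x m (assumed form)."""
    return n * m * code_rate(n, k) * f0_hz


def efficiency_kbps_per_gate(throughput, gates):
    """eta = T / C, expressed in kbps per gate."""
    return throughput / gates / 1e3


F0 = 100e6                                   # clock used for the comparison
t_rs = throughput_bps(31, 29, 5, F0)         # RS(31, 29)^2 over GF(32)
t_bch = throughput_bps(128, 120, 1, F0)      # eBCH(128, 120)^2, binary

if __name__ == "__main__":
    print(round(t_rs / 1e9, 2), "Gbps vs", round(t_bch / 1e9, 2), "Gbps")
    # Gate counts below cover computation resources only (one module).
    print(efficiency_kbps_per_gate(t_rs, 267_220),
          efficiency_kbps_per_gate(t_bch, 892_672))
```

Both codes have nearly the same rate, so the throughput advantage of the RS code comes from its larger P, while its efficiency advantage comes mostly from the smaller gate count.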
Turbo product code comparison: throughput versus complexity
The efficiency η, i.e., the ratio of decoder throughput to decoder complexity, can be used to compare eBCH and RS turbo product codes. Table 3 reports the code rate R, the parallelism degree P, the throughput T (Gbps), the complexity C (kgate), and the efficiency η (kbps/gate) for each code. All designs have been clocked at f0 = 100 MHz for the computation of the throughput T. An average ratio of 3.5 between RS and BCH decoder efficiency is observed. The good compromise between performance, throughput, and complexity clearly makes RS product codes good candidates for next-generation PON and OTN. In particular, the (31, 29)^2 RS product code is compatible with the 10 Gbps line rate envisioned for PON evolutions. Similarly, the (63, 61)^2 RS product code can be used for data transport over OTN at 40 Gbps provided the turbo decoder is clocked at a frequency slightly higher than 100 MHz.
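The clock frequencies implied by these line rates follow from a one-line calculation. The sketch assumes that the line rate counts coded bits entering the decoder and that P = N × m bits are absorbed per clock period; the function name is hypothetical.

```python
def required_clock_hz(line_rate_bps, n, m):
    """Clock needed so that P = N * m coded bits per period match the line rate."""
    return line_rate_bps / (n * m)


if __name__ == "__main__":
    f_pon = required_clock_hz(10e9, 31, 5)   # (31, 29)^2 RS code over GF(32)
    f_otn = required_clock_hz(40e9, 63, 6)   # (63, 61)^2 RS code over GF(64)
    print(round(f_pon / 1e6, 1), "MHz for 10 Gbps")
    print(round(f_otn / 1e6, 1), "MHz for 40 Gbps")
```

Under these assumptions the 10 Gbps case needs roughly 65 MHz and the 40 Gbps case roughly 106 MHz, which matches the remark that the second decoder must run slightly above 100 MHz.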
IMPLEMENTATION OF AN RS TURBO DECODER FOR ULTRA HIGH THROUGHPUT COMMUNICATION
An experimental setup based on FPGA devices has been designed in order to show that RS TPCs can effectively be used in the physical layer of 10 Gbps optical access networks. Based on the previous analysis, the (31, 29)^2 RS TPC was selected since it offers the best compromise between performance and complexity for this kind of application.
10 Gbps experimental setup
The experimental setup is composed of a board that includes 6 Xilinx Virtex-5 LX330 FPGAs [38]. A Xilinx Virtex-5 LX330 FPGA contains 51,840 slices that can emulate up to 12 million gates of logic. It should be noted that Virtex-5 slices are organized differently from previous generations. Each Virtex-5 slice contains four look-up tables (LUTs) and four flip-flops instead of two LUTs and two flip-flops in previous-generation devices. The board is hosted on a 64-bit, 66 MHz PCI bus that enables communication at full PCI bandwidth with a computer. An FPGA embedded memory block containing 10 encoded and noisy product code matrices is used to generate input data for the turbo decoder. This memory block exchanges data with a computer over the PCI bus. One decoding iteration was implemented on each FPGA, resulting in a 6 full-iteration turbo decoder as shown in Figure 10. Each decoding module corresponds to a full-parallel architecture dedicated to the decoding of a matrix of 31 × 31 coded soft values. We recall here that a coded soft value over GF(32) is mapped onto 5 LLR values, each LLR being quantized on 5 bits. Besides, the decoding process needs to access the 31 coded soft values from each of the matrices R and Rk during the three decoding phases of a half-iteration, as explained in Section 4. For these reasons, 31 × 5 × 5 × 2 = 1,550 bits have to be exchanged between the decoding modules during each clock period (f0 = 65 MHz). The board offers 200 chip-to-chip LVDS links for each FPGA-to-FPGA interconnect. Unfortunately, this number of LVDS links is insufficient to enable the transmission of all the bits between the decoding modules. To overcome this implementation constraint, we have chosen to add SERializer/DESerializer (SERDES) modules for the parallel-to-serial and serial-to-parallel conversions in each FPGA. Indeed, SERDES is a pair of functional blocks commonly used in high-speed communications to convert data between parallel and serial interfaces in each direction. SERDES modules are clocked at f1 = 2 × f0 = 130 MHz and operate at 8:1 serialization or 1:8 deserialization. In this way, all data can be exchanged between the different decoding modules. Finally, the total occupation rate of the FPGA that contains the most complex design (decoding module + two SERDES modules + memory block + PCI protocol module) is slightly higher than 66%. This corresponds to 34,215 Virtex-5 slices. Note that the decoding module represents only 37% of the total design complexity. More details about this are given in the next section. Currently, a new design phase of the experimental setup is in progress. The objective is to include channel emulator and BER measurement facilities in order to verify the decoding performance of the turbo decoder by plotting BER curves, as in our previous experimental setup [37].
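The interconnect budget can be verified with simple arithmetic. The sketch assumes that each LVDS link carries 8 bits per f0 period once 8:1 serialization is applied and that no links are reserved for clocks or control; the function name is hypothetical.

```python
import math


def links_needed(bits_per_clock, serialization_factor):
    """Number of serial links required to move bits_per_clock bits per period."""
    return math.ceil(bits_per_clock / serialization_factor)


BITS_PER_CLOCK = 31 * 5 * 5 * 2     # symbols x LLRs per symbol x bits per LLR x (R, Rk)
AVAILABLE_LINKS = 200

if __name__ == "__main__":
    print(BITS_PER_CLOCK, "bits per clock period")
    print(links_needed(BITS_PER_CLOCK, 1), "links without SERDES")
    print(links_needed(BITS_PER_CLOCK, 8), "links with 8:1 SERDES")
```

Under these assumptions the serialized exchange needs 194 links, which fits within the 200 available, whereas the unserialized exchange would need 1,550.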
Characteristics and performance of the implemented decoding module
A decoding module for one iteration is composed of 31 × 2 = 62 elementary decoders and 2 connection blocks. Each elementary decoder uses information quantized on 5 bits with Nep = 8 error patterns and only 1 competing codeword. These reduced parameter values decrease the required area at the cost of a performance degradation that remains below 0.5 dB. A (31, 29) RS elementary decoder occupies 729 slice LUTs, 472 slice flip-flops, and 3 BlockRAMs of 18 Kbits. A connection block occupies only 2,325 slice LUTs. The computation resources of a decoding module take up 29,295 slice flip-flops and 49,848 slice LUTs. This means that the occupation rates are about 14% and 24% of a Xilinx Virtex-5 LX330 FPGA for slice registers and slice LUTs, respectively. Besides, the memory resources of the decoding module take up 186 BlockRAMs of 18 Kbits, which represents 32% of the total BlockRAM available in the Xilinx Virtex-5 LX330 FPGA. Note that one BlockRAM of 18 Kbits is allocated by the Xilinx ISE tool to store only 31 × 5 × 5 = 775 bits in our design. The occupation rate of each BlockRAM of 18 Kbits is then only about 4%. Input data are clocked at f0 = 65 MHz, resulting in a data rate of Tin = 10 Gbps at the turbo decoder input. Taking into account the code rate R = 0.87, the information rate becomes Tout = 8.7 Gbps. In conclusion, the implementation results show that a turbo decoder dedicated to the (31, 29)^2 RS product code can effectively be integrated into the physical layer of a 10 Gbps optical access network.
Figure 10. 10 Gbps experimental setup for turbo decoding of (31, 29)2 RS product code.
(63, 61)^2 RS TPC complexity estimation for a 40 Gbps transmission over OTN
A similar prototype based on the (63, 61)^2 RS TPC can be designed for 40 Gbps transmission over OTN. Indeed, the architecture of one decoding iteration is the same for the two RS TPCs considered in this work. For the (63, 61)^2 RS product code, a decoding module for one iteration is composed of 63 × 2 = 126 elementary decoders and 2 connection blocks. Logic syntheses were performed using the Xilinx ISE tool to estimate the complexity of a (63, 61) RS elementary decoder. This decoder occupies 1,070 slice LUTs, 660 slice flip-flops, and 3 BlockRAMs of 18 Kbits. These estimates immediately give the complexity of a decoding module dedicated to one iteration. The computation resources of a (63, 61)^2 RS decoding module take up 83,160 slice flip-flops and 134,820 slice LUTs. The occupation rates are then about 40% and 65% of a Xilinx Virtex-5 LX330 FPGA for slice registers and slice LUTs, respectively. The memory resources of a (63, 61)^2 RS decoding module take up 378 BlockRAMs of 18 Kbits, which represents 65% of the total BlockRAM available in the considered FPGA device. One BlockRAM of 18 Kbits is allocated by the Xilinx ISE tool to store only 63 × 6 × 5 = 1,890 bits. For a (63, 61) RS elementary decoder, the occupation rate of each BlockRAM of 18 Kbits is only about 10.5%.
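The module-level figures for both RS product codes can be reproduced from the per-decoder numbers. The sketch derives the device capacity from the 51,840 slices with four LUTs and four flip-flops each quoted for the Virtex-5 LX330. No connection-block figure is given for the (63, 61)^2 case, and the quoted LUT total equals 126 × 1,070, so the sketch passes zero for that term; this is an inference from the arithmetic, not a statement from the text. Function names are hypothetical.

```python
SLICES = 51_840                 # Virtex-5 LX330 slices, as quoted in the text
CELLS_PER_SLICE = 4             # four LUTs and four flip-flops per slice
DEVICE_LUTS = SLICES * CELLS_PER_SLICE
DEVICE_FFS = SLICES * CELLS_PER_SLICE


def module_total(n, per_decoder, per_connection_block=0, connection_blocks=2):
    """Resources of one decoding module: 2N decoders plus connection blocks."""
    return 2 * n * per_decoder + connection_blocks * per_connection_block


def occupation_percent(used, available):
    """Share of the device used, in percent."""
    return 100.0 * used / available


if __name__ == "__main__":
    luts_31 = module_total(31, 729, per_connection_block=2_325)
    luts_63 = module_total(63, 1_070)
    ffs_63 = module_total(63, 660)
    print(luts_31, round(occupation_percent(luts_31, DEVICE_LUTS), 1))
    print(luts_63, round(occupation_percent(luts_63, DEVICE_LUTS), 1))
    print(ffs_63, round(occupation_percent(ffs_63, DEVICE_FFS), 1))
```

The same helper gives 62 × 3 = 186 and 126 × 3 = 378 BlockRAMs for the two modules, in line with the quoted memory figures.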
CONCLUSION
We have investigated the use of RS product codes for forward error correction in high-capacity fiber-optic transport systems. A complete study considering all aspects of the problem, from code optimization to turbo product code implementation, has been performed. Two specific applications were envisioned: 40 Gbps line rate transmission over OTN and 10 Gbps data transmission over PON. Algorithmic issues have been addressed and solved in order to design RS turbo product codes that are compatible with the respective requirements of the two transmission scenarios. A novel full-parallel turbo decoding architecture has been introduced. This architecture allows decoding of TPCs at data rates of 10 Gbps and beyond. In addition, a comparative study has been carried out between eBCH and RS TPCs in the context of optical communications. The results have shown that high-rate RS TPCs offer similar performance at reduced hardware complexity. Finally, we have described the successful realization of an RS turbo decoder prototype for 10 Gbps data transmission. This experimental setup demonstrates the practicality and also the benefits offered by RS TPCs in lightwave systems. Although only fiber-optic communications have been considered in this work, RS TPCs may also be attractive FEC solutions for next-generation free-space optical communication systems.
ACKNOWLEDGMENTS
The authors wish to acknowledge the financial support of France Telecom R&D. They also thank Gerald Le Mestre for his significant help during the experimental setup design phase. This paper was presented in part at the IEEE International Conference on Communications, Glasgow, Scotland, in June 2007.
REFERENCES
1. C. Berrou, A. Glavieux, and P. Thitimajshima, “Near Shannon limit error-correcting coding and decoding: turbo-codes. 1,” in Proceedings of the IEEE International Conference on Communications (ICC ’93), vol. 2, pp. 1064–1070, Geneva, Switzerland, May 1993.
2. R. G. Gallager, “Low-density parity-check codes,” IEEE Transactions on Information Theory, vol. 8, no. 1, pp. 21–28, 1962.
3. D. J. Costello Jr. and G. D. Forney Jr., “Channel coding: the road to channel capacity,” Proceedings of the IEEE, vol. 95, no. 6, pp. 1150–1177, 2007.
4. S. Benedetto and G. Bosco, “Channel coding for optical communications,” in Optical Communication: Theory and Techniques, E. Forestieri, Ed., chapter 8, pp. 63–78, Springer, New York, NY, USA, 2005.
5. T. Mizuochi, “Recent progress in forward error correction for optical communication systems,” IEICE Transactions on Communications, vol. E88-B, no. 5, pp. 1934–1946, 2005.
6. T. Mizuochi, “Recent progress in forward error correction and its interplay with transmission impairments,” IEEE Journal of Selected Topics in Quantum Electronics, vol. 12, no. 4, pp. 544–554, 2006.
7. “Forward error correction for high bit rate DWDM submarine systems,” International Telecommunication Union, ITU-T Recommendation G.975.1, February 2004.
8. R. Pyndiah, A. Glavieux, A. Picart, and S. Jacq, “Near optimum decoding of product codes,” in Proceedings of the IEEE Global Telecommunications Conference (GLOBECOM ’94), vol. 1, pp. 339–343, San Francisco, Calif, USA, November-December 1994.
9. K. Gracie and M.-H. Hamon, “Turbo and turbo-like codes: principles and applications in telecommunications,” Proceedings of the IEEE, vol. 95, no. 6, pp. 1228–1254, 2007.
10. J. Cuevas, P. Adde, S. Kerouedan, and R. Pyndiah, “New architecture for high data rate turbo decoding of product codes,” in Proceedings of the IEEE Global Telecommunications Conference (GLOBECOM ’02), vol. 2, pp. 1363–1367, Taipei, Taiwan, November 2002.
11. C. Jégo, P. Adde, and C. Leroux, “Full-parallel architecture for turbo decoding of product codes,” Electronics Letters, vol. 42, no. 18, pp. 1052–1054, 2006.
12. T. Mizuochi, Y. Miyata, T. Kobayashi, et al., “Forward error correction based on block turbo code with 3-bit soft decision for 10-Gb/s optical communication systems,” IEEE Journal of Selected Topics in Quantum Electronics, vol. 10, no. 2, pp. 376–386, 2004.
13. I. B. Djordjevic, S. Sankaranarayanan, S. K. Chilappagari, and B. Vasic, “Low-density parity-check codes for 40-Gb/s optical transmission systems,” IEEE Journal of Selected Topics in Quantum Electronics, vol. 12, no. 4, pp. 555–562, 2006.
14. F. J. MacWilliams and N. J. A. Sloane, The Theory of Error-Correcting Codes, North-Holland, Amsterdam, The Netherlands, 1977.
15. R. E. Blahut, Algebraic Codes for Data Transmission, Cambridge University Press, Cambridge, UK, 2003.
16. O. Aitsab and R. Pyndiah, “Performance of Reed-Solomon block turbo code,” in Proceedings of the IEEE Global Telecommunications Conference (GLOBECOM ’96), vol. 1, pp. 121–125, London, UK, November 1996.
17. D. Chase, “A class of algorithms for decoding block codes with channel measurement information,” IEEE Transactions on Information Theory, vol. 18, no. 1, pp. 170–182, 1972.
18. P. Adde and R. Pyndiah, “Recent simplifications and improvements in block turbo codes,” in Proceedings of the 2nd International Symposium on Turbo Codes and Related Topics, pp. 133–136, Brest, France, September 2000.
19. R. Pyndiah, “Iterative decoding of product codes: block turbo codes,” in Proceedings of the 1st International Symposium on Turbo Codes and Related Topics, pp. 71–79, Brest, France, September 1997.
20. J. Briand, F. Payoux, P. Chanclou, and M. Joindot, “Forward error correction in WDM PON using spectrum slicing,” Optical Switching and Networking, vol. 4, no. 2, pp. 131–136, 2007.
21. R. Zhou, R. Le Bidan, R. Pyndiah, and A. Goalic, “Low-complexity high-rate Reed-Solomon block turbo codes,” IEEE Transactions on Communications, vol. 55, no. 9, pp. 1656–1660, 2007.
22. P. Sweeney and S. Wesemeyer, “Iterative soft-decision decoding of linear block codes,” IEE Proceedings: Communications, vol. 147, no. 3, pp. 133–136, 2000.
23. M. Lalam, K. Amis, D. Leroux, D. Feng, and J. Yuan, “An improved iterative decoding algorithm for block turbo codes,” in Proceedings of the IEEE International Symposium on Information Theory (ISIT ’06), pp. 2403–2407, Seattle, Wash, USA, July 2006.
24. W. W. Peterson, “Encoding and error-correction procedures for the Bose-Chaudhuri codes,” IEEE Transactions on Information Theory, vol. 6, no. 4, pp. 459–470, 1960.
25. D. Gorenstein and N. Zierler, “A class of error correcting codes in p^m symbols,” Journal of the Society for Industrial and Applied Mathematics, vol. 9, no. 2, pp. 207–214, 1961.
26. S. A. Hirst, B. Honary, and G. Markarian, “Fast Chase algorithm with an application in turbo decoding,” IEEE Transactions on Communications, vol. 49, no. 10, pp. 1693–1699, 2001.
27. G. Bosco, G. Montorsi, and S. Benedetto, “Soft decoding in optical systems,” IEEE Transactions on Communications, vol. 51, no. 8, pp. 1258–1265, 2003.
28. Y. Cai, A. Pilipetskii, A. Lucero, M. Nissov, J. Chen, and J. Li, “On channel models for predicting soft-decision error correction performance in optically amplified systems,” in Proceedings of the Optical Fiber Communications Conference (OFC ’03), vol. 2, pp. 532–533, Atlanta, Ga, USA, March 2003.
29. G. P. Agrawal, Lightwave Technology: Telecommunication Systems, John Wiley & Sons, Hoboken, NJ, USA, 2005.
30. L. M. G. M. Tolhuizen, “More results on the weight enumerator of product codes,” IEEE Transactions on Information Theory, vol. 48, no. 9, pp. 2573–2577, 2002.
31. M. El-Khamy and R. Garello, “On the weight enumerator and the maximum likelihood performance of linear product codes,” IEEE Transactions on Information Theory, arXiv:cs.IT/0601095 (preprint), January 2006.
32. R. Le Bidan, R. Pyndiah, and P. Adde, “Some results on the binary minimum distance of Reed-Solomon codes and block turbo codes,” in Proceedings of the IEEE International Conference on Communications (ICC ’07), pp. 990–994, Glasgow, Scotland, June 2007.
33. P. Adde, R. Pyndiah, and S. Kerouedan, “Block turbo code with binary input for improving quality of service,” in Multiaccess, Mobility and Teletraffic for Wireless Communications, X. Lagrange and B. Jabbari, Eds., vol. 6, Kluwer Academic Publishers, Boston, Mass, USA, 2002.
34. Z. Chi and K. K. Parhi, “High speed VLSI architecture design for block turbo decoder,” in Proceedings of the IEEE International Symposium on Circuits and Systems (ISCAS ’02), vol. 1, pp. 901–904, Phoenix, Ariz, USA, May 2002.
35. D. H. Lawrie, “Access and alignment of data in an array processor,” IEEE Transactions on Computers, vol. C-24, no. 12, pp. 1145–1155, 1975.
36. S. Kerouedan and P. Adde, “Implementation of a block turbo decoder on a single chip,” in Proceedings of the 2nd International Symposium on Turbo Codes and Related Topics, pp. 243–246, Brest, France, September 2000.
37. C. Leroux, C. Jégo, P. Adde, and M. Jezequel, “Towards Gb/s turbo decoding of product code onto an FPGA device,” in Proceedings of the IEEE International Symposium on Circuits and Systems (ISCAS ’07), pp. 909–912, New Orleans, La, USA, May 2007.
38. http://www.dinigroup.com/DN9000k10PCI.php.
CHAPTER 10
Enhancing BER Performance Limit of BCH and RS Codes Using Multipath Diversity
Alyaa Al-Barrak 1,2, Ali Al-Sherbaz 1, Triantafyllos Kanakis 1 and Robin Crockett 3
1 Department of Computing & Immersive Technologies, University of Northampton, Northampton NN2 6JD, UK
2 Department of Computer Science, College of Science, University of Baghdad, Baghdad 10071, Iraq
3 Department of Environmental and Geographical Sciences, University of Northampton, Northampton NN2 6JD, UK
Citation: Al-Barrak, A.; Al-Sherbaz, A.; Kanakis, T.; Crockett, R. “Enhancing BER Performance Limit of BCH and RS Codes Using Multipath Diversity”. Computers 2017, 6, 21. https://doi.org/10.3390/computers6020021 Copyright: © 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license
ABSTRACT
Modern wireless communication systems suffer from phase shifting and, more importantly, from interference caused by multipath propagation. Multipath propagation results in an antenna receiving two or more copies of a signal sequence that was sent from the same source but delivered via different paths. Multipath components are treated as redundant copies of the original data sequence and are used to improve the performance of forward error correction (FEC) codes without extra redundancy, in order to improve data transmission reliability and increase the bit rate over the wireless communication channel. As a proof of concept, Bose, Ray-Chaudhuri, and Hocquenghem (BCH) and Reed-Solomon (RS) codes have been used for FEC to compare their bit error rate (BER) performances. The results showed that the wireless multipath components significantly improve the performance of FEC. Furthermore, FEC codes with low error correction capability that employ the multipath phenomenon perform better than FEC codes that have a slightly higher error correction capability but do not utilise the multipath. Consequently, the bit rate is increased, and communication reliability is improved without extra redundancy.
Keywords: Multipath; Propagation; Phenomenon; FEC; BCH; Reed-Solomon; Column Weight Multipath Combiner; SNR
INTRODUCTION
The past decade or so has witnessed remarkable growth in the demand for reliable communication links and high transmission rates. A reliable digital communication system involves the sending and receiving of data with vanishingly small error rates [1, 2]. Any wireless communication system is prone to a certain level of noise, reflection, diffraction, shadowing, and fading. Furthermore, the signal that is transmitted through a wireless channel arrives at the receiver via a number of different paths, referred to as multipath transmission, and this leads to fading (signal distortion and burst errors) [3]. Therefore, transmission reliability is very challenging on wireless channels. One of the most widely used techniques to provide reliable communication is forward error correction (FEC). Employing FEC requires either increasing the channel bandwidth or decreasing the transmission rate [4]. Therefore, combining a high transmission rate with transmission reliability needs high bandwidth, but bandwidth is a significant constraint in communication, which means that increasing the bandwidth is not a wise decision [5]. In contrast, the multipath phenomenon can be utilised to improve communication reliability and increase the transmission rate without increasing bandwidth. In this paper, the effectiveness of the multipath phenomenon in improving the error correction capability (transmission reliability) with as little redundancy as possible is considered. Reed–Solomon (RS) and Bose, Ray–Chaudhuri, and Hocquenghem (BCH) codes with different parameters are used to provide high data rate transmission, and the communication performance is analysed with and without multipath propagation. The paper is organised as follows: Related work is given in Section 2. A brief overview of FEC is given in Section 3. The multipath phenomenon is described in Section 4. The methodology is explained in Section 5. In Section 6 the proposed combiner is described. Simulation parameters and results are presented in Section 7, with the conclusions being reported in Section 8.
RELATED WORK
Recent research has shown intense interest in analysing the performance of various FEC techniques rather than in how to improve them without extra redundancy. Researchers have not taken into consideration the positive effect of the multipath phenomenon on the performance of FEC and how it could be utilised to improve FEC capability. Some authors have compared the bit error rate (BER) performance of different forward error correction codes such as RS, convolutional code (CC), RS-CC, and CC-RS codes [6]. They evaluated the BER of CC at various code rates. Likewise, they evaluated the performance of RS codes for different code rates, as well as block lengths. Furthermore, they compared the performance of both CC-RS and RS-CC concatenated codes with the individual codes and with uncoded data transmission. Some authors examined the performance of RS codes with binary phase-shift keying (BPSK) and quadrature phase-shift keying (QPSK) modulation over an additive white Gaussian noise (AWGN) channel [7]. Additionally, they compared the performance of RS codes with that of BCH codes. After examining the results, they found that the RS code performs better than the BCH code. Some authors implemented RS codes for phase-shift keying (PSK) modulation over the AWGN communication channel. They performed the simulation of RS codes for the same code rates. They showed that the BER performance is poor for lower signal-noise-ratios (SNRs). On the other hand, the BER performance improved for large block lengths [8]. Moreover, other authors have simulated RS and BCH codes in the presence of a Rayleigh fading channel and have shown that the BCH code outperforms the RS code in the binary environment [1]. This paper investigates the possibility of utilising the multipath phenomenon to improve the performance of FEC over the AWGN and Rayleigh channels. Additionally, the effect of utilising multipath propagation on the error correction capability of FEC with low redundancy is analysed. Furthermore, a combiner based on Hamming weight is proposed, which combines the selected paths (redundant copies of the transmitted signal) into one strong signal for decoding.
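The definition of the proposed combiner is not reproduced in this part of the text, so the following is only a rough sketch of the idea of combining redundant copies by Hamming weight. It assumes, as an illustration and not as the authors' exact rule, that each bit position (column) is decided by a majority of the received copies. The function name `column_weight_combine` and the sample bit patterns are hypothetical.

```python
def column_weight_combine(paths):
    """Combine equal-length bit sequences by the Hamming weight of each column.

    `paths` holds one row per received path (LoS and selected NLoS copies).
    ASSUMPTION: a column is decided as 1 when more than half of the rows
    carry a 1 (simple majority); the paper's actual rule may differ.
    """
    if not paths:
        raise ValueError("at least one path is required")
    n = len(paths[0])
    if any(len(row) != n for row in paths):
        raise ValueError("all paths must have the same length")
    rows = len(paths)
    combined = []
    for j in range(n):
        weight = sum(row[j] for row in paths)  # Hamming weight of column j
        combined.append(1 if 2 * weight > rows else 0)
    return combined


# Hypothetical example: three copies, each with one bit error in a different place.
sent = [1, 0, 1, 1, 0, 0, 1]
los = [1, 0, 1, 0, 0, 0, 1]    # error at position 3
nlos1 = [1, 1, 1, 1, 0, 0, 1]  # error at position 1
nlos2 = [1, 0, 1, 1, 0, 1, 1]  # error at position 5
combined = column_weight_combine([los, nlos1, nlos2])
print(combined == sent)
```

With an odd number of paths there are no ties, which is one reason a three, five, or seven path setup is convenient for this kind of rule.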
FORWARD ERROR CORRECTION
Appropriate techniques are necessary to overcome the errors that are introduced during transmission by inter-symbol interference (ISI), the multipath phenomenon, and noise. FEC is used to combat this issue in a communication system. FEC techniques add some redundancy to the data, which enables the receiver to detect and correct errors. BCH and RS codes are FEC techniques that work by appending redundant data to the end of each message; the result is known as a codeword (see Figure 1). This section presents some preliminaries of BCH and RS codes.
Figure 1. BCH and RS code codeword block.
BCH Codes
A BCH code is a kind of binary cyclic code discovered by Bose, Ray-Chaudhuri, and Hocquenghem [9]. This code has been studied intensively because of its strict algebraic structure. In BCH codes, the codewords are created by dividing a polynomial $m_i(x)$ by a generator polynomial $g(x)$ and taking the remainder, which is presented as the parity check bits $r(x)$. The encoded data $C(x)$ is constituted as:
$C(x) = m_i(x) + r(x)$ (1)
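As a minimal sketch of Equation (1), the following encodes one message with the BCH (15,11) code. It assumes the usual systematic convention, in which the message is first multiplied by $x^{n-k}$ before the division (so $m_i(x)$ in Equation (1) is read as the shifted message), and it uses the generator $g(x) = x^4 + x + 1$. Polynomials over GF(2) are stored as Python integers (bit i is the coefficient of $x^i$); the helper names and the sample message are hypothetical.

```python
def gf2_mod(dividend, divisor):
    """Remainder of polynomial division over GF(2); polynomials are ints."""
    dlen = divisor.bit_length()
    while dividend.bit_length() >= dlen:
        # Cancel the leading term of the dividend.
        dividend ^= divisor << (dividend.bit_length() - dlen)
    return dividend


def encode_systematic(message, n, k, g):
    """Return C(x) = x^(n-k) m(x) + r(x), with r(x) the remainder modulo g(x)."""
    shifted = message << (n - k)   # make room for the n-k parity bits
    r = gf2_mod(shifted, g)        # parity check bits r(x)
    return shifted ^ r             # addition over GF(2) is XOR


G_15_11 = 0b10011                  # x^4 + x + 1
message = 0b10110011101            # 11 message bits (hypothetical)
codeword = encode_systematic(message, 15, 11, G_15_11)
print(format(codeword, "015b"))
```

Every codeword produced this way is divisible by $g(x)$, which is what a decoder checks when it computes the syndrome.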
The characteristics of the code are determined by the selected generator polynomial g(x). For integers m1 and t, the BCH code can correct up to t independent errors, and its possible codes are [1]:
where m ≥ 3 and t [...]

[...] k12, and RS codes (n2, k21) and (n2, k22), where k21 > k22, are simulated with three, five, and seven paths. The threshold value was set at 40–75% of the SNR of the LoS signal to choose the NLoS signals. The BER was computed by changing Eb/N0 from 1 to 25. In the simulation results, the red curve represents the uncoded signal (LoS). The black curve and the black curve with stars represent the LoS signal encoded using codes with t = 1 and t = 2, respectively. The blue curve represents the combining of the coded LoS with two coded NLoS signals, where LoS and NLoS are encoded using codes with t = 1. Figure 6 and Figure 7 show that the BCH and RS codes in a multipath transmission consistently perform better than the BCH and RS codes without CWMC in AWGN and Rayleigh channels under a binary environment. It can be seen that combining three, five, and seven coded paths results in a better curve than the coded LoS alone. The BER performance of BCH and RS codes is improved by approximately 3 dB and between 8 and 9 dB over AWGN and Rayleigh channels, respectively, for three combined paths. The combining of five paths improved the BER performance by approximately 4 dB and from 11 to 12 dB over AWGN and Rayleigh channels, respectively. The combining of seven paths improved the BER performance more than the combining of three or five paths. The improvement was 6 dB over the AWGN channel and between 13 and 14 dB over the Rayleigh channel at a BER of 10⁻³.
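The gains above are quoted in decibels. As a small worked example, and only under the standard assumption that these figures are power ratios in Eb/N0, a gain of G dB corresponds to a linear factor of 10^(G/10). The function name `db_to_linear` is hypothetical.

```python
def db_to_linear(gain_db):
    """Convert a gain in dB to the equivalent linear power ratio."""
    return 10 ** (gain_db / 10)


# Gains reported in the text for three, five, and seven combined paths (AWGN).
for gain_db in (3, 4, 6):
    print(f"{gain_db} dB is a power ratio of about {db_to_linear(gain_db):.2f}")
```

Read this way, a 3 dB gain means the same BER is reached with roughly half the Eb/N0.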
Figure 6. BER vs. SNR of the BCH (15,11) code with and without CWMC. (a) The AWGN channel; and (b) the Rayleigh channel.
Figure 7. BER vs. SNR of the RS (15,13) code with and without CWMC. (a) Over the AWGN channel; and (b) over the Rayleigh channel.
BCH (15,11), BCH (127,120), and BCH (255,247) codes have an error correction capability t1 = 1, while the error correction capability of BCH (15,7), BCH (127,113), and BCH (255,239) codes is t2 = 2. Figures 8, 9, and 10 show that the combining of three paths improved the performance of BCH (15,11), BCH (127,120), and BCH (255,247) codes. The performance is improved by approximately 3 dB, 2.8 dB, and 2.75 dB, respectively, over the AWGN channel, and 7 dB, 10 dB, and 11.5 dB, respectively, over the Rayleigh channel at a BER of 10⁻³. Furthermore, with combining, these codes perform better than BCH (15,7), BCH (127,113), and BCH (255,239) codes by approximately 0.75 dB, 0.85 dB, and 1 dB, respectively, over the AWGN channel, and 2 dB, 6.2 dB, and 7.5 dB, respectively, over the Rayleigh channel at a BER of 10⁻³.
Figure 8. BER vs. SNR of the BCH (15,11) code with and without CWMC vs. the BCH (15,7) code without CWMC, where 3P represents the three paths that were combined to improve the error correction capability of the BCH(15,11) code. (a) The AWGN channel; and (b) the Rayleigh channel.
Figure 9. BER vs. SNR of the BCH (127,120) code with and without CWMC vs. the BCH (127,113) code without CWMC, where 3P represents the three paths that were combined to improve the error correction capability of the BCH (127,120) code. (a) The AWGN channel; and (b) the Rayleigh channel.
Figure 10. BER vs. SNR of the BCH (255,247) code with and without CWMC vs. the BCH (255,239) code without CWMC, where 3P represents the three paths that were combined to improve the error correction capability of the BCH (255,247) code. (a) The AWGN channel; and (b) the Rayleigh channel.
Similarly, the RS (15,13), RS (127,125), and RS (255,253) codes have error correction capability t1 = 1 and, for RS (15,11), RS (127,123), and RS (255,251) codes, t2 = 2. Figures 11, 12, and 13 show that the combining of three paths improved the performance of RS (15,13), RS (127,125), and RS (255,253) codes. The performance is improved by approximately 2 dB, 2.5 dB, and 2.6 dB, respectively, over the AWGN channel, and 9 dB, 12 dB, and 13.5 dB, respectively, over the Rayleigh channel at a BER of 10⁻³. Furthermore, with combining, these codes perform better than RS (15,11), RS (127,123), and RS (255,251) codes by approximately 0.5 dB, 0.75 dB, and 1.5 dB, respectively, over the AWGN channel, and 5 dB, 10 dB, and 11.5 dB, respectively, over the Rayleigh channel at a BER of 10⁻³.
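To make the redundancy trade-off behind these comparisons concrete, the sketch below tabulates the code rate k/n of the codes named in the text. It relies on two standard facts that are not spelled out in this section: an RS (n, k) code corrects t = (n − k)/2 symbol errors, and a binary BCH code of length n = 2^m − 1 needs at most m·t parity bits. The helper names are hypothetical.

```python
# (n, k, t) triples for the codes compared in the text; t as stated there.
BCH_CODES = [(15, 11, 1), (127, 120, 1), (255, 247, 1),
             (15, 7, 2), (127, 113, 2), (255, 239, 2)]
RS_CODES = [(15, 13, 1), (127, 125, 1), (255, 253, 1),
            (15, 11, 2), (127, 123, 2), (255, 251, 2)]


def code_rate(n, k):
    """Fraction of each codeword that carries message data."""
    return k / n


def rs_correctable_symbols(n, k):
    """Symbol errors an RS(n, k) code can correct: floor((n - k) / 2)."""
    return (n - k) // 2


def bch_field_degree(n):
    """Return m such that n = 2**m - 1 (primitive binary BCH length)."""
    m = (n + 1).bit_length() - 1
    if 2 ** m - 1 != n:
        raise ValueError("n is not of the form 2**m - 1")
    return m


for name, codes in (("BCH", BCH_CODES), ("RS", RS_CODES)):
    for n, k, t in codes:
        print(f"{name}({n},{k}) t={t} rate={code_rate(n, k):.3f}")
```

For every block length, the t = 1 code has the higher rate, which is why matching or beating the t = 2 code with combining translates into a higher transmission rate.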
Figure 11. BER vs. SNR of the RS (15,13) code with and without CWMC vs. the RS (15,11) code without CWMC, where 3P represents the three paths that were combined to improve the error correction capability of the RS (15,13) code. (a) The AWGN channel; and (b) the Rayleigh channel.
Figure 12. BER vs. SNR of the RS (127,125) code with and without CWMC vs. the RS (127,123) code without CWMC, where 3P represents the three paths that were combined to improve the error correction capability of the RS (127,125) code. (a) The AWGN channel; and (b) the Rayleigh channel.
Figure 13. BER vs. SNR of the RS (255,253) code with and without CWMC vs. the RS (255,251) code without CWMC, where 3P represents the three paths that were combined to improve the error correction capability of the RS (255,253) code. (a) The AWGN channel; and (b) the Rayleigh channel.
CONCLUSIONS
This paper shows that the performance of FEC codes can be improved in order to enhance BER performance. Furthermore, it demonstrates that an FEC code with low redundancy and low error correction capability can perform better than one with higher redundancy and higher error correction capability. This is achieved by utilising an existing phenomenon in wireless communication, multipath propagation, and by proposing a new low-complexity combiner known as CWMC. The CWMC combiner improved BER performance, and the improvement grows with the number of combined paths, as shown in the simulation results. Additionally, the improvement of BER performance depends on the error correction capability of the FEC codes: the BER performance can be improved by increasing the error correction capability (t). In other words, increasing the redundancy improves the BER performance but reduces the transmission rate. It is shown in the simulations that the BCH and RS codes with n = 15, k = 7, and t = 2, and n = 15, k = 11, and t = 2, respectively, improved the BER performance more than when n = 15, k = 11, and t = 1, and n = 15, k = 13, and t = 1, respectively. In contrast, the results show that, by using CWMC, BCH and RS codes with t = 1 can achieve better BER performance than BCH and RS codes with t = 2, and the transmission rate is increased. Moreover, the simulation shows that the CWMC improved BCH performance more than RS performance over the AWGN channel. However, the CWMC improved RS performance more than BCH performance over the Rayleigh channel, because RS codes are well suited to correcting burst errors.
In possible future work, this research could be extended by analysing and evaluating the performance of FEC techniques with high modulation schemes, multiple-input multiple-output (MIMO) systems, and over different wireless channel models. As open research topics, it is recommended to further investigate the following:
• The performance of FEC codes which utilise the multipath phenomenon can be compared with turbo code performance.
• The performance of LDPC code can be compared with the performance of FEC codes which employ the multipath phenomenon.
• The performance analysis can be extended to codewords with different lengths.
• Analyse the overall system and compare it with turbo and LDPC codes in terms of complexity and overhead.
ACKNOWLEDGEMENTS
We thank the Iraq Higher Committee for Education Development (HCED) for their financial support. We are indebted to the Faculty of Arts, Science and Technology, University of Northampton, for providing the postgraduate studentship and the software, resources, and support required.
CONFLICTS OF INTEREST
The authors declare no conflict of interest.
ABBREVIATIONS
The following abbreviations are used in this manuscript:
FEC: Forward Error Correction
BER: Bit Error Rate
BCH: Bose, Ray–Chaudhuri and Hocquenghem
RS: Reed–Solomon
CC: Convolutional Code
BPSK: Binary Phase-Shift Keying
QPSK: Quadrature Phase-Shift Keying
AWGN: Additive White Gaussian Noise
PSK: Phase-Shift Keying
SNR: Signal-Noise-Ratio
ISI: Inter-Symbol Interference
LDPC: Low-Density Parity-Check
LoS: Line of Sight
NLoS: Non-Line of Sight
WCDMA: Wideband Code Division Multiple Access
CWMC: Column Weight Multipath Combiner
Nomenclature
K: Length of the uncoded word
N: Length of the coded word
N−K: Number of redundant bits/symbols
x: Data
mi(x): Uncoded word polynomial
g(x): Generator polynomial
r(x): Remainder polynomial
C(x): Codeword polynomial
m1: Any positive integer greater than or equal to 3
t: Number of errors that can be corrected in a codeword
d: Hamming distance
m2: Number of bits per symbol
q(x): Quotient polynomial
TL: Path length (the travelling distance between paths)
ω: Delay time for NLoS
L: Number of selective NLoS
Lc: Represents the matrix of selective NLoS with L rows and N columns
cij: Represents the bits of each codeword received from NLoS and LoS paths
wh: Hamming weight
yij: Represents the combined output for each column
n1: Codeword length for BCH code
n2: Codeword length for RS code
k11, k12: Number of uncoded bits used for BCH codes
k21, k22: Number of uncoded symbols used for RS codes
REFERENCES
1. Lone, F.R.; Puri, A.; Kumar, S. Performance comparison of Reed Solomon Code and BCH Code over Rayleigh Fading Channel. Int. J. Comput. Appl. 2013, 71, 23–26.
2. Sanghvi, A.S.; Mishra, N.B.; Waghmode, R.; Talele, K.T. Performance of Reed-Solomon Codes in AWGN. Int. J. Electron. Commun. Eng. 2001, 4, 259–266.
3. Zigangirov, K.S. Theory of Code Division Multiple Access Communication; John Wiley & Sons: Hoboken, NJ, USA, 2004.
4. Bagad, V.S. Wireless Communication, 1st ed.; Technical Publications: Pune, India, 2009.
5. Nandaniya, J.S.; Kalani, N.B.; Kulkarni, G.R. Comparative analysis of different channel coding techniques. Int. J. Comput. Netw. Wirel. Commun. 2014, 4, 84–89.
6. Kumar, S.; Gupta, R. Performance comparison of different forward error correction coding techniques for wireless communication systems. Int. J. Comput. Sci. Technol. 2011, 2, 553–557.
7. Ratnam, D.V.; SivaKumar, S.; Sneha, R.; Reddy, N.S.; Brahmanandam, P.S.; Krishna, S.G. A Study on performance evaluation of Reed-Solomon (RS) Codes through an AWGN Channel Model in a Communication System. Int. J. Comput. Sci. Commun. 2012, 3, 37–40.
8. Korrapati, V.; Prasad, M.V.D. A Study on performance evaluation of Reed Solomon Codes through an AWGN Channel model for an efficient Communication System. Int. J. Eng. Trends Technol. 2013, 4, 1038–1041.
9. Sweeney, P. Error Control Coding: From Theory to Practice; John Wiley & Sons: New York, NY, USA, 2002.
10. Wallace, H. Error Detection and Correction Using the BCH Code. Available online: http://bbs.hwrf.com.cn/downpcbe/bch-6143.pdf
11. Rao, K.D. Channel Coding Techniques for Wireless Communications; Springer: New Delhi, India, 2015.
12. Di, Y. The evaluation and application of forward error coding. In Proceedings of the 2011 International Conference on Computer Science and Network Technology (ICCSNT), Harbin, China, 24–26 December 2011.
13. Shrivastava, P.; Singh, U.P. Error detection and correction using Reed Solomon Codes. Int. J. Adv. Res. Comput. Sci. Softw. Eng. 2013, 3, 965–969.
14. Wicker, S.B.; Bhargava, V.K. Reed-Solomon Codes and Their Applications; John Wiley & Sons: New York, NY, USA, 1999.
15. Adamek, J. Foundations of Coding: Theory and Applications of Error-Correcting Codes with an Introduction to Cryptography and Information Theory, 1st ed.; Wiley-Interscience: Hoboken, NJ, USA, 1991.
16. Rashmi; Nag, V.R. Performance study on the suitability of Reed Solomon codes in communication system. CT Int. J. Inf. Commun. Technol. 2013, 1, 13–15.
17. Holma, H.; Toskala, A. WCDMA for UMTS: HSPA Evolution and LTE; John Wiley & Sons: Chichester, UK, 2007.
18. Al-Barrak, A.; Al-Sherbaz, A.; Kanakis, T.; Crockett, R. Utilisation of Multipath Phenomenon to Improve the Performance of BCH and RS Codes. In Proceedings of the 8th Computer Science and Electronic Engineering Conference (CEEC), Essex, UK, 28–30 September 2016.
SECTION 5
QUASI CYCLIC CODES
CHAPTER
11
Quasi-Cyclic Codes via Unfolded Cyclic Codes and Their Reversibility
Ramy Taki Eldin 1,2 and Hajime Matsui 2
1 (On leave) Faculty of Engineering, Ain Shams University, Cairo 11517, Egypt
2 Toyota Technological Institute, Nagoya 468-8511, Japan
Citation: R. Taki Eldin and H. Matsui, “Quasi-Cyclic Codes Via Unfolded Cyclic Codes and Their Reversibility,” in IEEE Access, vol. 7, pp. 184500-184508, 2019. https://doi.org/10.1109/ACCESS.2019.2960569 Copyright: © This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see http://creativecommons.org/licenses/by/4.0/
ABSTRACT
The finite field $\mathbb{F}_{q^\ell}$ of $q^\ell$ elements contains $\mathbb{F}_q$ as a subfield. If $\theta \in \mathbb{F}_{q^\ell}$ is of degree $\ell$ over $\mathbb{F}_q$, it can be used to unfold elements of $\mathbb{F}_{q^\ell}$ to vectors in $\mathbb{F}_q^\ell$. We apply the unfolding to the coordinates of all codewords of a cyclic code C over $\mathbb{F}_{q^\ell}$ of length n. This generates a quasi-cyclic code Q over $\mathbb{F}_q$ of length $n\ell$ and index $\ell$. We focus on the class of quasi-cyclic codes resulting from the unfolding of cyclic codes. Given a generator polynomial g(x) of a cyclic code C, we present a formula for a generator polynomial matrix for the unfolded code Q. On the other hand, for any quasi-cyclic code Q with a reduced generator polynomial matrix G, we provide a necessary and sufficient condition on G that determines whether or not the code Q can be represented as the unfolding of a cyclic code. Furthermore, as an application, we discuss the reversibility of the class of quasi-cyclic codes resulting from the unfolding of cyclic codes. Specifically, we provide a necessary and sufficient condition on the defining set T of the cyclic code C that ensures the reversibility of the unfolded code. Numerical examples are used to illustrate the theoretical results. Some of these examples show that the reversibility of quasi-cyclic codes does not necessarily require a self-reciprocal generator polynomial for the cyclic code. Since reversibility is essential in constructing DNA codes, some DNA codes are designed as examples.
SECTION I. INTRODUCTION
We refer to a finite field of order q by $\mathbb{F}_q$, a cyclic code by C, and a quasi-cyclic (QC) code by Q. A cyclic code C of length n over $\mathbb{F}_q$ is a linear subspace of $\mathbb{F}_q^n$ that is invariant under cyclic shifts of its codewords. The rich algebraic structure of cyclic codes makes them one of the most prominent classes of codes in many applications. It is known that a cyclic code C over $\mathbb{F}_q$ of length n is an ideal in the ring R, where R is the quotient ring $\mathbb{F}_q[x]/\langle x^n-1\rangle$. Therefore, C is uniquely identified by a monic generator polynomial g(x) of least degree such that $C=\langle g(x)\rangle$. A broader class of codes that includes cyclic codes is the class of QC codes. We denote a QC code over $\mathbb{F}_q$ of length $n\ell$ and index $\ell$ by Q. It is known that Q is a linear subspace of $\mathbb{F}_q^{n\ell}$ that is invariant under cyclic shifts of its codewords by $\ell$ coordinates. The QC codes have been shown to be asymptotically good [2]–[4]. Although QC codes are a generalization of cyclic codes, they have a rather similar algebraic structure [7]. A QC code Q over $\mathbb{F}_q$ can be considered as an $\mathbb{F}_q[x]$-submodule of $R^\ell$. Therefore, Q is uniquely identified by a reduced generator polynomial matrix G [1]. There are several subclasses of QC codes. One is the subclass of QC codes that can be represented as a direct sum of $\ell$ cyclic codes of length n; the QC codes in this subclass are identified by a diagonal generator polynomial matrix G. Another subclass contains the QC codes generated by unfolding cyclic codes over the extension field $\mathbb{F}_{q^\ell}$ [7]. By unfolding, we mean applying a one-to-one map $\varphi_\theta$ that represents elements of $\mathbb{F}_{q^\ell}$ as vectors in $\mathbb{F}_q^\ell$ using an appropriate basis. The unfolding of a cyclic code C of length n, dimension k and minimum distance $d_{\min}$ results in a QC code Q of length $n\ell$, dimension $k\ell$ and minimum distance not less than $d_{\min}$.
A code is called reversible if it is invariant under reversing the coordinates of its codewords. Massey [8] showed that a cyclic code is reversible if and only if its generator polynomial g(x) is self-reciprocal, i.e., $g(x)=x^{\deg g(x)}g(1/x)$. In some applications it is necessary to construct reversible codes, for example, in the construction of DNA codes [9]–[13]. In [9]–[11], the reversibility of cyclic DNA codes over $\mathbb{F}_4$ has been investigated. In [12], [13], the construction of reversible quasi-cyclic DNA codes over other algebraic structures is considered. In [5], we investigated the reversibility of QC codes represented as the direct sum of cyclic codes and those with index $\ell=2$. In this paper, we present some results on the subclass of QC codes resulting from the unfolding of cyclic codes using a one-to-one map $\varphi_\theta: \mathbb{F}_{q^\ell}\to\mathbb{F}_q^\ell$. The map $\varphi_\theta$ unfolds a cyclic code C over $\mathbb{F}_{q^\ell}$ of length n to a QC code Q over $\mathbb{F}_q$ of length $n\ell$ and index $\ell$. Our main results are divided into three parts. In the first part, we present a formula for a generator polynomial matrix of the QC code Q resulting from the unfolding of a cyclic code C. The formula provides the generator polynomial matrix in terms of the transpose companion matrix N of $\theta$, an upper Toeplitz matrix U and a lower Toeplitz matrix L. The matrices U and L depend only on the generator polynomial g(x) of C. Although the obtained generator polynomial matrix is not necessarily in reduced form, the reduced form can easily be obtained by applying row operations.
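Massey's criterion is easy to check mechanically. The sketch below does so for binary polynomials only, which is an assumption made to keep the example short: over GF(2) the reciprocal $x^{\deg g}g(1/x)$ is just the coefficient list reversed, and no scaling to make it monic is needed. Polynomials are stored as integers (bit i is the coefficient of $x^i$); the function names are hypothetical.

```python
def reciprocal(poly):
    """Return x^deg(g) * g(1/x) for a GF(2) polynomial stored as an int."""
    deg = poly.bit_length() - 1
    out = 0
    for i in range(deg + 1):
        if (poly >> i) & 1:
            out |= 1 << (deg - i)  # coefficient of x^i moves to x^(deg - i)
    return out


def is_self_reciprocal(poly):
    """True when a nonzero GF(2) polynomial equals its own reciprocal."""
    return poly != 0 and reciprocal(poly) == poly


print(is_self_reciprocal(0b111))    # x^2 + x + 1
print(is_self_reciprocal(0b1011))   # x^3 + x + 1
print(is_self_reciprocal(0b11111))  # x^4 + x^3 + x^2 + x + 1
```

Generator polynomials of cyclic codes divide $x^n-1$ and therefore have a nonzero constant term, so the reversal never lowers the degree in that setting.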
In the second part, we specify the condition on the generator polynomial matrix G of any QC code Q that ensures that Q is the unfolding of some cyclic code C. Namely, we demonstrate that Q is the unfolding of a cyclic code if and only if GN is a generator polynomial matrix of Q. The latter condition is equivalent to GN = MG for some invertible matrix M. In the third part, we consider the application of reversibility in the subclass of QC codes Q resulting from the unfolding of cyclic codes C. The zeros of the generator polynomial g(x) of C specify a defining set $T=\{s \mid g(\gamma^s)=0\}$, where $\gamma$ is a primitive nth root of unity. For even $\ell$, we characterize the defining set T of C to ensure the reversibility of the unfolded QC code Q. This characterization requires that $-sq^{\ell/2}\in T$ for every $s\in T$. The simplicity of this condition suggests designing the generator polynomial g(x) of C in a way that constructs a reversible QC code Q with an even index $\ell$. Lemma 1 specifies an element $\theta\in\mathbb{F}_{q^\ell}$ of degree $\ell$ over $\mathbb{F}_q$ for the unfolding of C. In addition, the construction of a reversible QC code with odd $\ell$ is also considered. Specifically, the reversible QC code with odd $\ell$ is constructed as a direct sum of a reversible QC code with even $\ell$ and a reversible cyclic code. Some numerical examples of even and odd $\ell$ are given to illustrate the theoretical results. Some of these examples show that the reversible QC code Q resulting from the unfolding of the cyclic code $C=\langle g(x)\rangle$ does not necessarily require a self-reciprocal generator polynomial g(x). In other examples, our construction is applied to generate DNA codes with an enormous number of long codewords. The rest of the paper is organized as follows. Some necessary preliminaries are summarized in Section II to substantiate our results. In Section III-A, we present a formula for a generator polynomial matrix of the QC code resulting from the unfolding of a cyclic code C. In Section III-B, we identify the generator polynomial matrix G of any QC code Q to represent Q as the unfolding of a cyclic code. In Section IV-A, we propose the condition that ensures the reversibility of the unfolded code Q for even index $\ell$. The case of odd $\ell$ is considered in Section IV-B. Some examples that illustrate our construction are given in Section V. We conclude the work in Section VI.
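The set condition quoted in the third part, $-sq^{\ell/2}\in T$ for every $s\in T$ with $\ell$ even, can be tested directly on a candidate defining set. The sketch below checks only that arithmetic condition; whether the unfolded code is then reversible also depends on the choice of $\theta$ made in Lemma 1, which is not reproduced here. The function name and the sample sets (for q = 2, $\ell$ = 2, n = 15) are hypothetical.

```python
def satisfies_defining_set_condition(T, q, ell, n):
    """Check that -s * q**(ell/2) mod n lies in T for every s in T (ell even)."""
    if ell % 2 != 0:
        raise ValueError("the condition is stated for even ell")
    factor = pow(q, ell // 2, n)
    members = {s % n for s in T}
    return all((-s * factor) % n in members for s in members)


# q = 2, ell = 2, n = 15: T is a union of 4-cyclotomic cosets modulo 15.
print(satisfies_defining_set_condition({1, 4}, 2, 2, 15))
print(satisfies_defining_set_condition({1, 4, 7, 13}, 2, 2, 15))
```

In the second call, multiplying by −2 modulo 15 sends 1, 4, 7, 13 to 13, 7, 1, 4, so the set is mapped onto itself.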
SECTION II. PRELIMINARIES
Throughout this paper, q is a prime power, n is a positive integer coprime to q, R is the quotient ring $\mathbb{F}_q[x]/\langle x^n-1\rangle$, C is a cyclic code over $\mathbb{F}_{q^\ell}$ of length n, the polynomial $g(x)\in\mathbb{F}_{q^\ell}[x]$ generates C, Q is a QC code over $\mathbb{F}_q$ of length $n\ell$ and G is the reduced generator polynomial matrix of Q.
A. Algebraic Structures of C and Q
An $\mathbb{F}_{q^\ell}$-linear code C is cyclic if it is invariant under the cyclic shift of codewords. There is a polynomial representation for codewords of C. In [6, Chapter 4], the codeword $c=(\alpha_0,\alpha_1,\ldots,\alpha_{n-1})\in C$ is represented by the polynomial
$c(x)=\alpha_0+\alpha_1x+\alpha_2x^2+\cdots+\alpha_{n-1}x^{n-1}\in C,$
where $\alpha_i\in\mathbb{F}_{q^\ell}$ for $0\le i\le n-1$. In polynomial representation, cyclicity means that C is an ideal in the principal ideal ring $\mathbb{F}_{q^\ell}[x]/\langle x^n-1\rangle$. Thus, C is generated by a polynomial g(x) that divides $x^n-1$ in $\mathbb{F}_{q^\ell}[x]$; this is written $g(x)\mid(x^n-1)$. The $q^\ell$-cyclotomic coset modulo n of the representative s is denoted by $C_s$. Namely,
$C_s=\{s, sq^{\ell}, sq^{2\ell}, \ldots, sq^{(\mu-1)\ell}\} \pmod n$ (1)
where $\mu$ is the least positive integer such that $n\mid s(q^{\mu\ell}-1)$. There is a correspondence between $q^\ell$-cyclotomic cosets modulo n and the irreducible factors of $x^n-1$ in $\mathbb{F}_{q^\ell}[x]$. Therefore, the factorization of $x^n-1$ can be obtained by listing all the $q^\ell$-cyclotomic cosets modulo n.
The splitting field of $x^n-1$ is $\mathbb{F}_{q^{\ell\tau}}$, where $\tau$ is the least positive integer such that $n\mid(q^{\ell\tau}-1)$. If $\eta$ is a primitive element of $\mathbb{F}_{q^{\ell\tau}}$, then $\gamma=\eta^{(q^{\ell\tau}-1)/n}$ is a primitive nth root of unity. The zeros of $x^n-1$ in $\mathbb{F}_{q^{\ell\tau}}$ are precisely $\{\gamma^i \mid i=0,1,\ldots,n-1\}$. Because $g(x)\mid(x^n-1)$, the zeros of g(x) are defined by a set of integers T, called the defining set of C, such that
$g(x)=\prod_{s\in T}(x-\gamma^s).$
Theorem 4.2.1(vii) in [6] shows that the defining set T is partitioned into disjoint $q^\ell$-cyclotomic cosets modulo n. That is,
$T=\bigcup_{s}C_s,$
where s runs over a set of representatives for the cyclotomic cosets $C_s\subseteq T$. So, we have the following two consequences:
c(x) is a codeword of C if and only if $c(\gamma^s)=0$ for every $s\in T$, (2)
$c(\gamma^s)=0$ for every codeword $c(x)\in C$ if and only if $s\in T$. (3)
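Listing the $q^\ell$-cyclotomic cosets modulo n is a purely arithmetic task, so a short sketch may help. It assumes only what the preliminaries state, namely that n is coprime to q, and it repeatedly multiplies by $Q=q^\ell$ modulo n until the coset closes. The function names and the example parameters (q = 2, $\ell$ = 2, n = 15) are hypothetical.

```python
def cyclotomic_coset(s, Q, n):
    """Q-cyclotomic coset of s modulo n, where Q = q**ell and gcd(Q, n) == 1."""
    coset = []
    x = s % n
    while x not in coset:
        coset.append(x)
        x = (x * Q) % n  # next element: multiply by q**ell modulo n
    return coset


def all_cosets(Q, n):
    """Partition {0, ..., n-1} into Q-cyclotomic cosets, smallest representative first."""
    seen = set()
    cosets = []
    for s in range(n):
        if s not in seen:
            coset = cyclotomic_coset(s, Q, n)
            cosets.append(coset)
            seen.update(coset)
    return cosets


for coset in all_cosets(2 ** 2, 15):
    print(coset)
```

A defining set T is then any union of these cosets, and each coset corresponds to one irreducible factor of $x^{15}-1$ over the field with four elements.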
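An $\mathbb{F}_q$-linear code Q is a quasi-cyclic code if, for any codeword
$c=(a_{0,0},a_{0,1},\ldots,a_{0,\ell-1},a_{1,0},a_{1,1},\ldots,a_{1,\ell-1},\ldots,a_{n-1,0},a_{n-1,1},\ldots,a_{n-1,\ell-1})\in Q,$ (4)
the word
$(a_{n-1,0},\ldots,a_{n-1,\ell-1},a_{0,0},\ldots,a_{0,\ell-1},\ldots,a_{n-2,0},\ldots,a_{n-2,\ell-1})$
obtained by $\ell$ cyclic shifts is a codeword, where $a_{i,j}\in\mathbb{F}_q$ for $0\le i\le n-1$ and $0\le j\le\ell-1$.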
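B. Folding Q and Unfolding C
If $\theta$ is a zero of a monic irreducible polynomial p(x) of degree $\ell$ over $\mathbb{F}_q$, then
$\mathbb{F}_{q^\ell}=\mathbb{F}_q(\theta)=\{a_0+a_1\theta+\cdots+a_{\ell-1}\theta^{\ell-1} \mid a_j\in\mathbb{F}_q\}.$
The unfolding map $\varphi_\theta$ is a one-to-one map from $\mathbb{F}_{q^\ell}$ to $\mathbb{F}_q^\ell$ defined as follows.
Definition 1: For any $\alpha\in\mathbb{F}_{q^\ell}$, if
$\alpha=\sum_{j=0}^{\ell-1}a_j\theta^j$
for some $a_j\in\mathbb{F}_q$, then
$\varphi_\theta(\alpha)=(a_0,a_1,\ldots,a_{\ell-1}).$
Folding is the process $\varphi_\theta^{-1}:(a_0,a_1,\ldots,a_{\ell-1})\mapsto\alpha$, while unfolding is $\varphi_\theta:\alpha\mapsto(a_0,a_1,\ldots,a_{\ell-1})$. One can extend the map $\varphi_\theta$ to fold codewords of Q or unfold codewords of C as follows.
$\varphi_\theta:(\alpha_0,\alpha_1,\ldots,\alpha_{n-1})\mapsto(a_{0,0},\ldots,a_{0,\ell-1},a_{1,0},\ldots,a_{1,\ell-1},\ldots,a_{n-1,0},\ldots,a_{n-1,\ell-1}),$
where $\varphi_\theta(\alpha_i)=(a_{i,0},a_{i,1},\ldots,a_{i,\ell-1})$ with $a_{i,l}\in\mathbb{F}_q$, for $0\le i\le n-1$ and $0\le l\le\ell-1$.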
In [7], unfolding C generates a QC code of length $n\ell$ over $\mathbb{F}_q$. However, folding Q generates a code over $\mathbb{F}_{q^\ell}$ that is invariant under cyclic shifts but not necessarily $\mathbb{F}_{q^\ell}$-linear. Whenever the latter is $\mathbb{F}_{q^\ell}$-linear, it is a cyclic code C and we say that Q is generated by unfolding C. We are interested in the class of QC codes resulting from the unfolding of cyclic codes. In particular, we distinguish these QC codes from other QC codes.
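A small sketch can make the unfolding concrete. It assumes the field with four elements written as $\mathbb{F}_2(\theta)$ with $\theta^2=\theta+1$, so $\ell=2$, and stores $\alpha=a_0+a_1\theta$ as the integer $a_0+2a_1$; this encoding, the function names, and the sample word are illustrative choices, not taken from the paper. The check at the end shows why the unfolded code has index $\ell$: one cyclic shift of the word over the extension field becomes a shift by $\ell$ coordinates after unfolding.

```python
ELL = 2  # degree of theta over F_2 in this example


def unfold(alpha):
    """phi_theta: alpha = a_0 + a_1*theta (stored as a_0 + 2*a_1) -> (a_0, a_1)."""
    return tuple((alpha >> j) & 1 for j in range(ELL))


def fold(coords):
    """Inverse map: (a_0, a_1) -> integer encoding of a_0 + a_1*theta."""
    return sum(bit << j for j, bit in enumerate(coords))


def unfold_word(word):
    """Unfold every coordinate of a word over F_4 and concatenate the results."""
    return tuple(bit for alpha in word for bit in unfold(alpha))


def cyclic_shift(word, steps=1):
    """Cyclically shift a tuple to the right by `steps` positions."""
    steps %= len(word)
    return tuple(word[-steps:]) + tuple(word[:-steps])


word = (1, 2, 3, 0, 2)  # a hypothetical length-5 word over F_4
print(unfold_word(word))
print(unfold_word(cyclic_shift(word)) == cyclic_shift(unfold_word(word), ELL))
```

Only the coordinate expansion is needed here; field multiplication would enter when checking linearity over the extension field, which this sketch does not attempt.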
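C. Generator Polynomial Matrix G of Q
Let Q be a QC code over $\mathbb{F}_q$ of length $n\ell$. Similar to cyclic codes, Q has a polynomial vector representation of its codewords [14]. For this purpose, we divide each codeword $c\in Q$ into $\ell$ sub-words, each a cyclic interval of length n. From (4), each codeword $c\in Q$ is partitioned as follows
$c=(a_{0,0},a_{1,0},\ldots,a_{n-1,0},a_{0,1},a_{1,1},\ldots,a_{n-1,1},\ldots,a_{0,\ell-1},a_{1,\ell-1},\ldots,a_{n-1,\ell-1}).$
In this representation, Q is invariant under the local cyclic shift, where the local cyclic shift of c is
$(a_{n-1,0},a_{0,0},\ldots,a_{n-2,0},a_{n-1,1},a_{0,1},\ldots,a_{n-2,1},\ldots,a_{n-1,\ell-1},a_{0,\ell-1},\ldots,a_{n-2,\ell-1}).$
Thus, the polynomial vector representation c(x) of c is
$c(x)=(c_1(x),c_2(x),\ldots,c_\ell(x)),$
where $c_j(x)=\sum_{i=0}^{n-1}a_{i,j-1}x^i$. However, other polynomial representations of QC codes exist [7]. The local cyclic shift corresponds to multiplication by x followed by reduction modulo $x^n-1$. Therefore, Q can be considered as an $\mathbb{F}_q[x]$-submodule of $R^\ell$. A generator polynomial matrix is a matrix whose rows are codewords that generate Q as an $\mathbb{F}_q[x]$-submodule. In [1], the reduced form of a generator polynomial matrix is the upper triangular matrix
$G=\begin{pmatrix} g_{1,1} & g_{1,2} & \cdots & g_{1,\ell} \\ 0 & g_{2,2} & \cdots & g_{2,\ell} \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & g_{\ell,\ell} \end{pmatrix}$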
of size ℓ×ℓ that meets the following conditions: 1.
For each 1 ≤ i ≤ ℓ, g_{i,i} is monic and has minimum degree among all codewords of the form (0, …, 0, c_i, …, c_ℓ) with c_i ≠ 0. 2. For 1 ≤ i ≠ j ≤ ℓ, we have deg(gi,j) 0.
Let c(x) = c_0 + c_1 x + ... + c_{n−1} x^{n−1} be a codeword in C. Note that x^{at} c(x) = θ^{at}(c_0) x^{1+ln} + θ^{at}(c_1) x^{2+ln} + ... + θ^{at}(c_{n−1}) x^{n+ln} = c_{n−1} + c_0 x + ... + c_{n−2} x^{n−1} ∈ C. Thus C is a cyclic code of length n.
Proposition 2.2 A code C of length n over R is a skew cyclic code if and only if C is a left R[x; θ]-submodule of the left R[x; θ]-module R[x; θ]/(x^n − 1).
Proof Let c(x) = c_0 + c_1 x + ... + c_{n−1} x^{n−1} be a codeword in C. Since C is skew cyclic, it follows that x c(x), x^2 c(x), ..., x^i c(x) are all elements of C, where all the indices are taken modulo n. Therefore, r(x)c(x) ∈ C for any r(x) ∈ R[x; θ]. Thus C is a left R[x; θ]-submodule of R[x; θ]/(x^n − 1).
Conversely, suppose C is a left R[x; θ]-submodule of the left R[x; θ]-module R[x; θ]/(x^n − 1). Then for any codeword c(x) ∈ C, x c(x) ∈ C. Therefore, C is skew cyclic.
Note that not all left R[x; θ]-submodules are R-free, but in the following we will focus on those that are. As in the case where the order of θ divides n, the following proposition characterizes free skew cyclic codes of any length n.
Proposition 2.3 A skew cyclic code C of length n over R is free if and only if it is generated by a monic right divisor g(x) of x^n − 1 with degree k. The set {g(x), x g(x), ..., x^{n−k−1} g(x)} forms a basis of C and the rank of C is n − k.
SKEW QUASI-CYCLIC CODES
Let θ be an automorphism of R and n = ls. A linear code C over R is called skew quasi-cyclic with index l if and only if (c_{0,0}, c_{0,1}, ..., c_{0,l−1}, c_{1,0}, c_{1,1}, ..., c_{1,l−1}, ..., c_{s−1,0}, c_{s−1,1}, ..., c_{s−1,l−1}) ∈ C ⇒ (θ(c_{s−1,0}), θ(c_{s−1,1}), ..., θ(c_{s−1,l−1}), θ(c_{0,0}), θ(c_{0,1}), ..., θ(c_{0,l−1}), ..., θ(c_{s−2,0}), θ(c_{s−2,1}), ..., θ(c_{s−2,l−1})) ∈ C. If θ is the identity map, we call C a quasi-cyclic code over R. In the following, we illustrate the relationship between skew cyclic codes and quasi-cyclic codes over R.
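The shift in this definition can be made concrete with a minimal Python sketch. The function name `skew_quasi_cyclic_shift` and the representation of ring elements as plain Python values are illustrative assumptions; `theta` stands for any callable implementing the automorphism, and the default is the identity (the ordinary quasi-cyclic case).

```python
def skew_quasi_cyclic_shift(c, l, theta=lambda a: a):
    """Move the last block of l symbols to the front and apply theta
    to every symbol, as in the definition of a skew quasi-cyclic code.

    c     : list of ring elements of length n = l * s
    l     : index of the code
    theta : automorphism of R given as a callable (identity by default)
    """
    if len(c) % l != 0:
        raise ValueError("the length must be a multiple of the index l")
    shifted = c[-l:] + c[:-l]
    return [theta(a) for a in shifted]


# Toy usage with n = 6, l = 2 (s = 3) and theta = identity.
word = [0, 1, 2, 3, 4, 5]
once = skew_quasi_cyclic_shift(word, 2)
```

With the identity automorphism, applying the shift s times returns the original word, which is the usual quasi-cyclic behaviour.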
Proposition 3.1 Let C be a skew cyclic code of length n over R and let θ be an automorphism of order t. If gcd(t, n) = l, then C is equivalent to a quasi-cyclic code of length n with index l over R.
Proof Let n = sl and (c_{0,0}, c_{0,1}, ..., c_{0,l−1}, c_{1,0}, c_{1,1}, ..., c_{1,l−1}, ..., c_{s−1,0}, c_{s−1,1}, ..., c_{s−1,l−1}) ∈ C. Since gcd(t, n) = l, there exist integers a, b such that at + bn = l. Therefore, at = l − bn = l + gn, where g is a nonnegative integer. Applying the skew cyclic shift at = l + gn times to this codeword gives (c_{s−1,0}, c_{s−1,1}, ..., c_{s−1,l−1}, c_{0,0}, c_{0,1}, ..., c_{0,l−1}, ..., c_{s−2,0}, c_{s−2,1}, ..., c_{s−2,l−1}) ∈ C, because θ^{at} is the identity and a shift by gn positions leaves the word unchanged. Thus, C is equivalent to a quasi-cyclic code of length n with index l over R.
From Proposition 3.1, we directly obtain the following corollary.
Corollary 3.2 Let C be a skew quasi-cyclic code of length n with index l over R and let θ be an automorphism of order t. If gcd(t, n) = k, then C is equivalent to a quasi-cyclic code of length n with index lk over R.
Let C be a skew quasi-cyclic code of length n with index l over R. As in the traditional study of quasi-cyclic codes, we can identify an element (c_{0,0}, c_{0,1}, ..., c_{0,l−1}, c_{1,0}, c_{1,1}, ..., c_{1,l−1}, ..., c_{s−1,0}, c_{s−1,1}, ..., c_{s−1,l−1}) ∈ C with the polynomial vector (c_0(x), c_1(x), ..., c_{l−1}(x)) ∈ (R[x; θ]/(x^s − 1))^l, where c_j(x) = c_{0,j} + c_{1,j} x + ... + c_{s−1,j} x^{s−1} ∈ R[x; θ]/(x^s − 1), j = 0, 1, ..., l−1. Then, as in the case of skew cyclic codes in Section 2, it is easy to see that a skew quasi-cyclic code of length n with index l over R is a left R[x; θ]-submodule of (R[x; θ]/(x^s − 1))^l; conversely, a left R[x; θ]-submodule of (R[x; θ]/(x^s − 1))^l is a skew quasi-cyclic code of length n with index l over R. This allows us to compute the number of distinct skew cyclic and quasi-cyclic codes over R.
A 1-generator skew quasi-cyclic code C is a code generated by a single element (g_1(x), g_2(x), ..., g_l(x)) ∈ (R[x; θ]/(x^s − 1))^l. For 1-generator skew quasi-cyclic codes, we have the following property.
Proposition 3.3 Let C be a 1-generator skew quasi-cyclic code over R generated by (g_1(x), g_2(x), ..., g_l(x)) ∈ (R[x; θ]/(x^s − 1))^l. If, for each i = 1, 2, ..., l, g_i(x) generates an R-free skew cyclic code over R, then C is R-free with rank s − deg g(x), where g(x) = gcld(g_1(x), g_2(x), ..., g_l(x), x^s − 1) and gcld denotes the greatest common left divisor.
EXAMPLES
Example 4.1 Let R = GR(4, 2) and let θ be the Frobenius automorphism. Let g(x) = x^3 + 2x^2 + x + 3, which is a right divisor of x^7 − 1. Since gcd(2, 7) = 1, by Proposition 2.1 and Proposition 2.3, the skew cyclic code C = ⟨g(x)⟩ is a free cyclic code with rank 7 − 3 = 4 over R. In fact, it is a [7, 4, 3] cyclic code.
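The divisibility claim in Example 4.1 can be checked numerically. The sketch below reads the generator as g(x) = x^3 + 2x^2 + x + 3 (an assumption consistent with the stated degree 3) and works in Z_4[x]. Because the coefficients of g(x) lie in Z_4, which the Frobenius automorphism of GR(4, 2) fixes, ordinary polynomial division is assumed to be enough here. The helper name `divmod_monic` is illustrative.

```python
def divmod_monic(num, den, modulus):
    """Divide num by a monic polynomial den over Z_modulus.

    Polynomials are coefficient lists, lowest degree first.
    Returns (quotient, remainder).
    """
    if den[-1] % modulus != 1:
        raise ValueError("the divisor must be monic")
    rem = [c % modulus for c in num]
    dq = len(num) - len(den)
    quo = [0] * (dq + 1)
    for shift in range(dq, -1, -1):
        coef = rem[shift + len(den) - 1]
        quo[shift] = coef
        # Subtract coef * x^shift * den from the running remainder.
        for j, d in enumerate(den):
            rem[shift + j] = (rem[shift + j] - coef * d) % modulus
    return quo, rem[:len(den) - 1]


# x^7 - 1 over Z_4 (note that -1 = 3 mod 4) and g(x) = x^3 + 2x^2 + x + 3.
x7_minus_1 = [3, 0, 0, 0, 0, 0, 0, 1]
g = [3, 1, 2, 1]
quotient, remainder = divmod_monic(x7_minus_1, g, 4)
```

A zero remainder confirms that g(x) divides x^7 − 1 over Z_4; the quotient has degree 4, matching the rank 7 − 3 = 4 stated in the example.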
Example 4.2 Let R = GR(9, 2) and let θ be the Frobenius automorphism. The polynomial g(x) = x + α^2 is a right divisor of x^4 − 1, where α is a primitive element in R. This polynomial generates an MDS skew cyclic code with parameters [4, 3, 2] over R. Since gcd(2, 4) = 2, this code is equivalent to a quasi-cyclic code of length 4 with index 2 generated by g_1(x) = 1 and g_2(x) = α^2 x over R.
Example 4.3 Let R = GR(9, 2) and let θ be the Frobenius automorphism. The polynomial g(x) = x + α^2 is a right divisor of x^4 − 1, where α is a primitive element in R. Let C be the 1-generator skew quasi-cyclic code of length 12 with index 3 over R generated by (g(x), g(x), g(x)). Then by Corollary 3.2 and Proposition 3.3, C is an R-free quasi-cyclic code of length 12 with index 2 × 3 = 6 over R. In fact, it is a [12, 3, 6] code over R.
REFERENCES
T. Abualrub, A. Ghrayeb, N. Aydin and I. Siap, On the construction of skew quasi-cyclic codes, IEEE Trans. Inform. Theory, 56(2010), 2081-2090.
D. Boucher, F. Ulmer, Coding with skew polynomial rings, Journal of Symbolic Computation, 44(2009), 1644-1656.
D. Boucher, W. Geiselmann and F. Ulmer, Skew cyclic codes, AAECC, 18(2007), 379-389.
M. Bhaintwal, Skew quasi-cyclic codes over Galois rings, Designs, Codes and Cryptography, 62(2012), 85-101.
I. Siap, T. Abualrub, N. Aydin and P. Seneviratne, Skew cyclic codes of arbitrary length, Int. J. Inf. Coding Theory, 2(2011), 10-20.
SECTION 6
LOW DENSITY PARITY CHECK CODES
CHAPTER 13
On the use of ordered statistics decoders for low-density parity-check codes in space telecommand links
Marco Baldi 1,2, Nicola Maturo 1,2, Enrico Paolini 3 and Franco Chiaraluce 1,2
1 Dipartimento di Ingegneria dell’Informazione, Università Politecnica delle Marche, Ancona, Italy.
2 Consorzio Nazionale Interuniversitario per le Telecomunicazioni (CNIT), Parma, Italy.
3 Department of Electrical, Electronic, and Information Engineering “G. Marconi,” University of Bologna, Cesena, Italy.
Citation: Baldi, M., Maturo, N., Paolini, E. et al. “On the use of ordered statistics decoders for low-density parity-check codes in space telecommand links”. J Wireless Com Network 2016, 272 (2016). https://doi.org/10.1186/s13638-016-0769-z Copyright: © The Author(s). 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
ABSTRACT
The performance of short low-density parity-check (LDPC) codes that will be included in the standard for next-generation space telecommanding is analyzed. The paper focuses on the use of a well-known ordered statistics decoder, the most reliable basis (MRB) algorithm. Although its complexity may appear prohibitive in space applications, this algorithm is shown to be a realistic option for short LDPC codes, enabling significant gains over more conventional iterative algorithms. This is made possible by a hybrid approach that combines the MRB decoder with an iterative decoding procedure in a sequential manner. The effect of quantization is also addressed, by considering two different quantization laws and comparing their performance. Finally, the impact of limited memory availability onboard spacecraft is analyzed and some solutions are proposed for efficient processing, towards a practical onboard decoder implementation.
INTRODUCTION
Wireless communication links in space missions extensively use error-correcting codes in order to improve transmission reliability [1–4]. There are two types of space links: telemetry (TM) links from space to ground and telecommand (TC) links from ground to space. The purpose of a TM link is to reliably and transparently convey remote measurement information to users located on Earth. For such a purpose, several coding schemes are employed. In particular,
• convolutional codes,
• Reed–Solomon codes,
• parallel turbo codes,
• low-density parity-check (LDPC) codes.
A comprehensive description of all these families of codes can be found, for example, in [5]. A TC link, in turn, is used to initiate, modify, or terminate equipment functions onboard (O/B) of space objects. From the communication viewpoint, TCs have a number of distinctive features. Among them,
• the data rates (measured in bits/s) are usually very low compared with TM links,
• TCs are originated and assembled on the ground.
The first feature above implies milder requirements for the error-correcting code. Similarly, because of the second feature, fewer limits and constraints are imposed on the available transmitted power. Recommendations for TC space links are traditionally issued by two organizations: the Consultative Committee for Space Data Systems (CCSDS) and the European Cooperation for Space Standardization (ECSS). The only error-correcting code so far included in these recommendations [6, 7] is a Bose–Chaudhuri–Hocquenghem (BCH) code. This linear block code has codeword length n = 63 bits and information block length k = 56 bits, and is typically decoded O/B via hard-decision decoding.
Reliable space telecommanding is of fundamental importance, as the success of a mission may be compromised by an error corrupting a TC message. This imposes strict constraints on the maximum tolerable error rates. In particular, the codeword error rate (CER) is defined as the ratio of the number of decoding failures to the total number of received codewords. Requirements are often specified in terms of average CER, and a typical value is CER ≤ 10^−5. TC links are currently evolving to include new and more challenging mission profiles, characterized by much more demanding requirements. Compliance with these new requirements calls for more advanced coding techniques than the BCH(63, 56) code.
In space telecommanding, it is always of fundamental importance to ensure the integrity of very short emergency commands, with a very low data rate and limited latency. Such constraints impose the adoption of short codes. The BCH code mentioned above is short enough, but its performance is far from meeting the requirements of next-generation missions. In the following analysis, we will refer to classical binary phase shift keying (BPSK) modulation over an additive white Gaussian noise (AWGN) channel. This model is particularly suitable to describe deep-space communications.
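The BPSK/AWGN model can be tied to a given E_b/N_0 with a short sketch. It assumes a normalization that the text does not state: unit-energy symbols ±1, so that the energy per coded symbol is R_c·E_b = 1 and N_0 = 2σ². The function names are illustrative.

```python
import math


def noise_sigma(ebn0_db, rate):
    """Noise standard deviation for BPSK symbols +1/-1 at a given
    Eb/N0 (in dB) and code rate, assuming N0 = 2 * sigma^2."""
    ebn0 = 10 ** (ebn0_db / 10)
    return math.sqrt(1.0 / (2.0 * rate * ebn0))


def raw_symbol_error_probability(ebn0_db, rate):
    """Probability that a hard decision on one received symbol is wrong,
    i.e. Q(1/sigma) = 0.5 * erfc(1 / (sigma * sqrt(2)))."""
    sigma = noise_sigma(ebn0_db, rate)
    return 0.5 * math.erfc(1.0 / (sigma * math.sqrt(2.0)))


# For the BCH(63, 56) code the rate is 56/63; 9.1 dB is the working point
# quoted in the text for hard-decision decoding.
sigma_bch = noise_sigma(9.1, 56 / 63)
p_raw = raw_symbol_error_probability(9.1, 56 / 63)
```

Under this normalization a rate-1/2 code at 0 dB gives σ = 1, which is a convenient sanity check.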
Using the standard BCH(63, 56) code with hard-decision decoding, the signal-to-noise ratio required to achieve CER ≈ 10^−5 is E_b/N_0 ≈ 9.1 dB, where E_b is the energy per information bit and N_0 the one-sided noise power spectral density. This value of E_b/N_0 can be reduced, thus improving performance, by applying soft-decision decoding. For example, by using a decoder based on the BCJR algorithm [8], a gain of about 2.1 dB with respect to the hard-decision decoder can be achieved. These E_b/N_0 values, however, remain too large for next-generation space missions. These missions will be characterized by a significant increase of the supported data rate and/or
maximum distance with respect to the present scenarios. Both these factors degrade the signal-to-noise ratio, so that the target CER must be achieved over a worse channel. In order to achieve higher coding gains, new advanced coding schemes have been designed and evaluated [9]. After a long campaign of simulations and comparisons, two short binary LDPC codes have emerged as the most attractive candidates. Most of the steps necessary for their inclusion in the standard for space TC have already been completed, so that they will certainly be employed in next-generation missions. LDPC codes can be decoded through classical soft-decision iterative algorithms (IAs), such as the sum-product algorithm (SPA) [10] or its simplified versions, e.g., min-sum (MS) [11] or normalized min-sum (NMS) [12]. As regards SPA, throughout this paper we will refer to its log-likelihood ratio (LLR) implementation. By using LDPC codes with IAs, the coding gain is larger than with the BCH code, although the performance remains relatively far from the theoretical limits. More substantial improvements can result from the adoption of algorithms based on ordered statistics decoding (OSD) [13]. In this paper, in particular, we consider the well-known most reliable basis (MRB) algorithm [14]. MRB is a non-iterative soft-decision decoding algorithm able to approach the performance of a maximum likelihood (ML) decoder. The latter is optimal, in the sense of minimizing the decoding error probability. The main drawback of MRB is its complexity, which makes its use problematic in TC links where decoding is performed O/B with very limited computational and memory resources. As mentioned, however, the length of the TC LDPC codes is small, namely, n = 128 bits and n = 512 bits, which makes MRB a potential candidate, especially for the shorter code. Complexity can be reduced by resorting to a hybrid solution, which combines MRB with IAs.
More precisely, the hybrid approach consists of performing low-complexity decoding through an IA, at first, and invoking the MRB algorithm only when the IA is not able to find any valid codeword (detected error). The hybrid decoder has recently been used to decode also LDPC codes constructed on non-binary finite fields [15], which represent another option for space TC links [16, 17]. Due to their higher decoding complexity, however, non-binary LDPC codes are less attractive than their binary counterparts. For the LDPC codes analyzed in this paper, the hybrid decoder outperforms the IA or the MRB algorithm when used individually. A
qualitative explanation of this favorable behavior is as follows. The IA decoders considered here are not bounded-distance decoders. Therefore, they may be able to successfully decode soft-decision sequences from the channel even if they have relatively large Euclidean distances from the BPSK-modulated transmitted codeword and, at the same time, they may fail to decode soft-decision sequences at relatively small Euclidean distances from it. The MRB decoder instead works in a completely different way: once its order i is fixed, it is able to correct all the error patterns involving i errors or fewer in the most reliable basis. The two decoders thus complement each other, improving on their individual performances. On the other hand, if the IA is characterized by a high undetected codeword error rate (UCER), the MRB algorithm is not invoked in many cases where it could correct the errors, and the performance of the plain MRB is better than that of the hybrid approach. This event does not occur for the considered codes that, by design, are characterized by values of UCER which are orders of magnitude lower than the CER. As an example, while the CER value at the working point must be at least as low as 10^−5, the UCER value at the same point must be at least as low as 10^−9 [18]. Consequently, there are very few cases in which the MRB decoder is not invoked when it should be, and the advantage offered by the SPA-LLR, which sporadically corrects a (relatively) large number of errors, dominates. While ensuring excellent error rate performance, the hybrid decoder is characterized by a significantly lower average complexity than MRB decoding used alone. Moreover, such performance is not seriously affected by practical issues such as quantization, provided that suitable quantization rules are applied.
Motivated by the above considerations, in this paper, a thorough analysis of the performance of the MRB algorithm, used individually or in hybrid form, is developed with reference to space TC applications. While a preliminary version of this analysis has been considered in [19], the study is here deepened by addressing all technical issues in a more complete way. Among them, we investigate the impact of limited memory availability (a typically very stringent constraint in O/B processing), with the aim to discuss also the problems arising in practical implementations. In fact, while the presence of enough O/B memory would help fast convergence of the decoding algorithm, memories are usually considered not reliable in hostile environments like the one characterizing space missions, so that their usage is generally minimized. As a consequence, “on-the-fly” computations are preferred to memory accesses. Therefore, we extend the analysis in [19] by
addressing both the case of a limited amount of memory and the case of total absence of memory. We also study a parallel implementation for controlling other important variables, like the decoding latency. The organization of the paper is as follows. In Section 2, the considered error-correcting codes are described. In Section 3, the various decoding procedures are presented and expressions for computing their complexities are provided. The performance of these schemes is then assessed in Section 4, also taking into account the quantization issues. Section 5 is devoted to the analysis of the impact of limited memory and to the latency evaluation. Concluding remarks are given in Section 6.
LDPC CODES FOR SPACE TELECOMMAND LINKS
The two considered LDPC codes are described in detail in [16]. The first code has information block length k = 64 bits and codeword length n = 128 bits. The second code has information block length k = 256 bits and codeword length n = 512 bits. Hence, both codes have code rate R_c = k/n = 1/2. A short overview of their main features is reported in the following. In particular, in Section 2.1, their structure is described, by specifying their parity-check matrices, while in Section 2.2, their weight spectrum properties are addressed.
Parity-check and generator matrices
For the codes we consider, the parity-check matrix H is an array of M×M square submatrices, where M = k/4 = n/8. This is specified, for both codes, in Fig. 1.
Figure 1. Parity-check matrices of the considered LDPC codes
In the figure, I_M and 0_M denote the M×M identity and zero matrices, respectively, and Φ is the first right circular shift of I_M. This means that Φ has a non-zero entry at row i and column j if and only if j = (i+1) mod M. Moreover, Φ^2 represents the second right circular shift of I_M, that is, Φ^2 has a non-zero entry at row i and column j if and only if j = (i+2) mod M, and so on. The ⊕ operator indicates element-wise modulo-2 addition.
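The circulant building blocks described here are easy to generate. The sketch below uses plain Python lists; the helper names are illustrative, and M = 16 corresponds to the (128, 64) code since M = n/8. It does not reproduce the full matrix of Fig. 1, only the blocks it is assembled from.

```python
def circulant_shift(M, s=1):
    """M x M binary matrix with a 1 at (i, j) iff j == (i + s) mod M.
    s = 1 gives Phi, s = 2 gives Phi^2, and s = 0 gives the identity."""
    return [[1 if j == (i + s) % M else 0 for j in range(M)] for i in range(M)]


def add_gf2(A, B):
    """Element-wise modulo-2 addition of two binary matrices."""
    return [[a ^ b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def matmul_gf2(A, B):
    """Matrix product over GF(2)."""
    return [[sum(a & b for a, b in zip(row, col)) % 2 for col in zip(*B)]
            for row in A]


M = 16
I_M = circulant_shift(M, 0)
Phi = circulant_shift(M, 1)
Phi2 = matmul_gf2(Phi, Phi)   # second right circular shift of I_M
block = add_gf2(I_M, Phi)     # example of a block of the form I_M + Phi (mod 2)
```

Every row of `block` has weight 2, which illustrates how sums of shifted identities control the row and column weights of H.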
As an alternative representation, a k×n generator matrix G can be obtained from the parity-check matrix H. The length-n codeword c corresponding to a length-k information block u can be then generated as c=u G. The matrices G for the two considered codes are reported in [16].
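As an illustration of c = uG and of the relation between G and H, the following sketch uses a toy (7, 4) Hamming code rather than the actual matrices of [16], which are not reproduced in the text. The matrices and helper names are assumptions chosen for brevity.

```python
# Toy systematic generator matrix G = [I | P] and matching H = [P^T | I].
G = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]
H = [
    [1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1],
]


def encode(u, G):
    """Compute c = uG over GF(2) by XOR-ing the rows of G selected by u."""
    c = [0] * len(G[0])
    for bit, row in zip(u, G):
        if bit:
            c = [a ^ b for a, b in zip(c, row)]
    return c


def syndrome(c, H):
    """Compute H c^T over GF(2); it is all-zero iff c is a codeword."""
    return [sum(h & x for h, x in zip(row, c)) % 2 for row in H]


u = [1, 0, 1, 1]
c = encode(u, G)
```

The same two helpers apply unchanged to the k×n matrices of the TC codes once G and H are loaded.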
Weight distribution properties
The Hamming weight of a codeword is defined as the number of its nonzero elements. The performance of a linear block code under optimum ML decoding is governed by its weight spectrum, that is, the number of codewords of Hamming weight w for all integers 0 ≤ w ≤ n. Unfortunately, even when the weight spectrum is perfectly known (which is a very favorable condition, normally not satisfied for LDPC codes), it is not possible to find an exact analytical expression of the error rate achieved by the ML decoder, even for very simple communication channel models. Thus, it is common practice to resort to analytical bounds, an example of which is the union bound, which establishes an upper bound on the error rate of the considered code under ML decoding [20]. For a binary linear block code over the AWGN channel and BPSK modulation, let us denote by A_w the multiplicity of the weight-w codewords and by d_min the code minimum distance. Then, the expression of the union bound for the CER is [21]
\mathrm{CER}_{UB} = \frac{1}{2} \sum_{w=d_{\min}}^{n} A_w \, \mathrm{erfc}\left( \sqrt{ w \, \frac{k}{n} \, \frac{E_b}{N_0} } \right) \qquad (1)
The first term of the sum, corresponding to w = d_min, is also known as the “error floor.” For sufficiently high values of E_b/N_0, the error floor provides an excellent approximation of the performance of ML decoding. In this sense, it represents a first benchmark for any sub-optimum decoding algorithm: the closer the error rate offered by the decoding algorithm is to the error floor, the smaller the gap between the sub-optimum decoder and the optimum ML decoder. As seen from (1), the computation of CER_UB requires knowledge of the complete weight spectrum of the code. It is known that for LDPC codes this may be a non-trivial task. For the considered codes, however, much work has been done to overcome this issue.
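A union bound of this kind can be evaluated numerically once a (possibly truncated) weight spectrum is available. The sketch below assumes the standard form for BPSK over AWGN, namely one half of the sum over w of A_w·erfc(sqrt(w·(k/n)·E_b/N_0)). The spectrum used is that of a toy (7, 4) Hamming code, not of the TC codes, whose multiplicities are not reproduced here.

```python
import math


def union_bound_cer(spectrum, n, k, ebn0_db):
    """Union bound on the codeword error rate under ML decoding.

    spectrum : dict mapping Hamming weight w > 0 to multiplicity A_w
    n, k     : codeword and information block lengths
    ebn0_db  : Eb/N0 in dB
    """
    ebn0 = 10 ** (ebn0_db / 10)
    rate = k / n
    return 0.5 * sum(a_w * math.erfc(math.sqrt(w * rate * ebn0))
                     for w, a_w in spectrum.items() if w > 0)


def error_floor(spectrum, n, k, ebn0_db):
    """Only the minimum-weight term of the union bound."""
    d_min = min(w for w in spectrum if w > 0)
    return union_bound_cer({d_min: spectrum[d_min]}, n, k, ebn0_db)


# Toy example: nonzero part of the (7, 4) Hamming weight spectrum.
hamming_spectrum = {3: 7, 4: 7, 7: 1}
ub = union_bound_cer(hamming_spectrum, 7, 4, 6.0)
floor = error_floor(hamming_spectrum, 7, 4, 6.0)
```

At high E_b/N_0 the ratio between `floor` and `ub` approaches 1, which is the sense in which the first term approximates ML performance.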
In particular, the first and most significant terms of the weight distribution for the LDPC(128, 64) code have been specified as
(2)
where the presence of the term A_w x^w means that there are A_w codewords with Hamming weight w. The multiplicities A_14, A_16 and A_18 are exact [22]; this part of the weight spectrum has been obtained through computer searches using a carefully tuned “error impulse” technique [23]. The other multiplicities are lower bounds on the actual values and have been obtained by using the approach proposed in [24]. The overall estimate is in any case sufficiently stable and allows a reliable union bound to be drawn, as will be done in Section 4.
The most accurate evaluation to date of the weight distribution of the LDPC(512, 256) code has been reported in [22], where the estimated first terms of the spectrum have been specified as
(3)
The multiplicities A_40 and A_42 appear to be exact, while A_44 and A_46 are approximate and, in general, there is not yet sufficient confidence in the reliability of the estimate, even as regards the value of d_min. This does not allow a sufficiently reliable union bound to be drawn for the LDPC(512, 256) code.
DECODING ALGORITHMS
As decoding algorithms we consider one OSD (the MRB algorithm) and three IAs (SPA, MS, and NMS). We investigate their performance when used alone or combined in the aforementioned hybrid decoding scheme.
MRB decoder
Let us denote by 1 the length-n vector with all coordinates equal to 1. Then, let c = (c_0, c_1, …, c_{n−1}) and x = (x_0, x_1, …, x_{n−1}) = 1 − 2c ∈ {−1, +1}^n be the transmitted codeword and its baseband BPSK-modulated version (that is, before being modulated with the carrier frequency), respectively. Moreover, let y = (y_0, y_1, …, y_{n−1}) be the corresponding received vector of soft values after transmission over the AWGN channel and BPSK demodulation. The MRB decoder relies on a transformation from the original generator matrix, G, to a new matrix, G⋆, based on the k most reliable independent bits corresponding to y. Upon reception of y, the reliability of a bit is measured in terms of the magnitude of its a priori LLR. More specifically, the a priori LLR for the ith codeword bit is defined as
L(c_i) = \ln \frac{\Pr\{y_i \mid c_i = 0\}}{\Pr\{y_i \mid c_i = 1\}} = \frac{2 y_i}{\sigma^2}
where σ^2 is the variance of the Gaussian thermal noise. The reliability of bit c_i is then defined as |L(c_i)|. The MRB algorithm of order i may be summarized as follows:
• Upon receiving the signal from the AWGN channel and demodulating it into the sequence y, find the k most reliable received bits and collect them in a length-k vector v⋆.
• Perform Gauss-Jordan elimination on the matrix G, with respect to the positions of the k most reliable bits, to obtain a systematic generator matrix G⋆ corresponding to such bits.¹
• Encode v⋆ to obtain a candidate codeword c⋆ = v⋆G⋆.
• Consider all (or an appropriate subset of) test error patterns (TEPs). By definition, a TEP, denoted by e, is a binary vector of length k and Hamming weight w ≤ i. For each TEP e:
  • Calculate v′ = v⋆ ⊕ e.
  • Encode v′ to obtain a codeword c′ = v′G⋆.
  • If the Euclidean distance between x′ = 1 − 2c′ and y is smaller than the one between x⋆ = 1 − 2c⋆ and y, then update the candidate codeword as c⋆ = c′.
• At the end of the process, the algorithm delivers the codeword c⋆ which minimizes the Euclidean distance from y, limited to the codeword set associated with the considered TEP set.
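The steps above can be sketched in a few dozen lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: G is a full-rank binary matrix given as a list of lists, |y_j| is used as the reliability (it is proportional to |L(c_j)|), TEPs are enumerated exhaustively in ascending weight, and none of the speed-up strategies is applied. The function name `mrb_decode` and the toy (7, 4) code are illustrative.

```python
from itertools import combinations


def mrb_decode(G, y, order):
    """Most reliable basis decoding of order `order` (exhaustive TEP search)."""
    k, n = len(G), len(G[0])
    # Positions sorted by decreasing reliability |y_j|.
    by_reliability = sorted(range(n), key=lambda j: -abs(y[j]))

    # Gauss-Jordan elimination, choosing pivots among the most reliable
    # positions and skipping columns that depend on those already chosen.
    A = [row[:] for row in G]
    basis = []
    for j in by_reliability:
        r = len(basis)
        if r == k:
            break
        pivot = next((p for p in range(r, k) if A[p][j]), None)
        if pivot is None:
            continue
        A[r], A[pivot] = A[pivot], A[r]
        for p in range(k):
            if p != r and A[p][j]:
                A[p] = [a ^ b for a, b in zip(A[p], A[r])]
        basis.append(j)
    # A now plays the role of G*: row t is the only row with a 1 in column basis[t].

    def encode(v):
        c = [0] * n
        for bit, row in zip(v, A):
            if bit:
                c = [a ^ b for a, b in zip(c, row)]
        return c

    def sq_distance(c):
        # Squared Euclidean distance between y and x = 1 - 2c.
        return sum((yj - (1 - 2 * cj)) ** 2 for yj, cj in zip(y, c))

    hard = [1 if yj < 0 else 0 for yj in y]   # x = 1 - 2c: negative means bit 1
    v_star = [hard[j] for j in basis]
    best = encode(v_star)
    best_d = sq_distance(best)
    for w in range(1, order + 1):
        for support in combinations(range(k), w):
            v = v_star[:]
            for t in support:
                v[t] ^= 1                      # v' = v* XOR e
            cand = encode(v)
            d = sq_distance(cand)
            if d < best_d:
                best, best_d = cand, d
    return best


# Toy (7, 4) code; the transmitted codeword is [1, 0, 1, 1, 0, 1, 0] and the
# least reliable position (index 1) has the wrong sign at the receiver.
G_toy = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]
y_toy = [-0.9, -0.1, -1.1, -0.8, 1.2, -1.0, 0.7]
decoded = mrb_decode(G_toy, y_toy, order=1)
```

In this example the wrong hard decision sits outside the most reliable basis, so re-encoding alone (order 0) already recovers the transmitted codeword.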
Hereafter, we denote by MRB(i) an instance of the algorithm with order i. Clearly, MRB(k) coincides with ML decoding, which is intractable in practice. In fact, the maximum number of TEPs to be tested, for each received sequence y, is equal to \sum_{w=0}^{i} \binom{k}{w}, and this number tends to become very large even for i ≪ k. So, in practice, only very small values of i can be considered.
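The growth of the TEP count with the order can be checked directly. The sketch assumes that the maximum number of TEPs is the number of binary vectors of length k and weight at most i, as follows from the definition of a TEP; the function name is illustrative.

```python
from math import comb


def max_num_teps(k, order):
    """Number of binary vectors of length k with Hamming weight <= order."""
    return sum(comb(k, w) for w in range(order + 1))


# Values for the two TC codes (k = 64 and k = 256) and small orders.
counts_64 = {i: max_num_teps(64, i) for i in range(5)}
counts_256 = {i: max_num_teps(256, i) for i in range(5)}
```

For k = 64 and order 4 the count is 679121, while for k = 256 the same order already requires more than 10^8 patterns, which illustrates why only very small orders are practical.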
There exist several possible strategies to reduce the complexity of the MRB(i) decoder. Among them, our simulations confirm the effectiveness of the following sub-optimal strategies to stop the TEP analysis before it reaches the maximum number of TEPs, \sum_{w=0}^{i} \binom{k}{w}:
1. The number of patterns to be tested is reduced by properly choosing a reliability threshold A such that, if the distance between the current candidate codeword and y is lower than A, then the decoding algorithm outputs the candidate codeword without testing the remaining TEPs.
2. The TEPs are commonly generated in ascending weight order. However, it is possible to precompute a list of most likely TEPs [25], that is, a list containing the TEPs ordered according to the probability of yielding codewords at small distances from y, regardless of their weight w ≤ i. The threshold criterion at the previous point can then be applied to the ordered list.
One or both approaches allow testing an average number of patterns, denoted by N_TEP in the following, that is significantly smaller than \sum_{w=0}^{i} \binom{k}{w}. As we will see in Section 3.3, this may have a significant impact on the decoding complexity. There are also other techniques that may be used to reduce the MRB decoder implementation complexity. A very effective one is based on the observation that a large number of TEPs have overlapping supports, where the support of a TEP is defined as the set of its nonzero coordinates. Due to the code linearity, we can compute the XOR of two TEPs and encode it through G⋆. The resulting vector can then be added to the test codeword corresponding to the first TEP, in order to obtain the codeword corresponding to the second TEP. Since, for the small values of the order i we consider, the XOR of the two TEPs has a small Hamming weight, computing the corresponding codeword requires summing only a small number of rows of G⋆, thus reducing the computational burden. The procedure can, obviously, be iterated.
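The incremental update based on the XOR of two TEPs can be sketched as follows. The toy matrix stands in for G⋆, and the names `encode` and `next_test_codeword` are illustrative assumptions.

```python
def encode(v, G):
    """c = vG over GF(2)."""
    c = [0] * len(G[0])
    for bit, row in zip(v, G):
        if bit:
            c = [a ^ b for a, b in zip(c, row)]
    return c


def next_test_codeword(c_prev, e_prev, e_next, G_star):
    """Codeword for TEP e_next, obtained from the codeword c_prev of TEP e_prev
    by adding only the rows of G* selected by e_prev XOR e_next."""
    c = c_prev[:]
    for t, (a, b) in enumerate(zip(e_prev, e_next)):
        if a ^ b:
            c = [x ^ y for x, y in zip(c, G_star[t])]
    return c


# Toy systematic matrix used in place of G*.
G_star = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]
v_star = [0, 1, 1, 1]
e1 = [1, 1, 0, 0]   # two TEPs of weight 2 with overlapping supports
e2 = [1, 0, 1, 0]

c1 = encode([v ^ e for v, e in zip(v_star, e1)], G_star)
c2_incremental = next_test_codeword(c1, e1, e2, G_star)
c2_direct = encode([v ^ e for v, e in zip(v_star, e2)], G_star)
```

Here the update touches two rows of G⋆ instead of re-encoding the whole length-k vector, and by linearity it gives the same codeword as direct encoding.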
Hybrid decoder
The complexity of the MRB(i) algorithm, with an order i that achieves good performance, may be too high for practical implementation. In this case, it is possible to resort to a hybrid approach, which consists of applying the MRB decoder downstream of the iterative decoder, invoking it only when the iterative decoder terminates reporting a failure because it was not able to find any codeword [26]. The procedure is summarized by the flow-chart in Fig. 2. We note that two possible error events characterize an iterative LDPC decoding algorithm, namely, detected error and undetected error. A detected error occurs when the decoder reaches the maximum number of iterations without converging to any valid codeword. On the other hand, an undetected error occurs when the decoder converges to a codeword different from the transmitted one. Note also that every failure of the MRB decoder yields an undetected error, since a codeword is always found by this decoder (i.e., the MRB decoder is complete, as opposed to LDPC iterative decoders, which are incomplete).
Figure 2. Flow-chart for the hybrid algorithm (Y means “yes” and N means “no”)
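The control flow of the hybrid scheme can be sketched as below. The two decoders are passed in as callables, so `iterative_decoder` and `mrb_decoder` are hypothetical placeholders rather than real library functions; the iterative decoder is assumed to return either a hard-decision word or None after reaching its maximum number of iterations. Note that the MRB stage is given the received sequence y, not the output of the iterative stage.

```python
def is_codeword(c, H):
    """True iff every parity check of H is satisfied by c over GF(2)."""
    return all(sum(h & x for h, x in zip(row, c)) % 2 == 0 for row in H)


def hybrid_decode(y, H, iterative_decoder, mrb_decoder):
    """Run the iterative decoder first; fall back to MRB on a detected error.

    Returns (codeword, mrb_invoked).
    """
    c_hat = iterative_decoder(y)
    if c_hat is not None and is_codeword(c_hat, H):
        return c_hat, False          # the IA converged to a valid codeword
    return mrb_decoder(y), True      # detected error: MRB works on y itself


# Stub decoders used only to exercise the control flow.
H_toy = [
    [1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1],
]
good = [1, 0, 1, 1, 0, 1, 0]
always_fails = lambda y: None
always_good = lambda y: good
```

Counting how often the second element of the returned pair is True over many decoding attempts gives an estimate of the rate of calls to MRB.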
As shown in Fig. 2, unlike similar proposals in the previous literature, the MRB decoder is applied to the received sequence y and not to the output of the IA decoder. This allows us to circumvent the effects of the deteriorated soft values after a failure of the IA.
Decoding complexity
As mentioned in the previous sections, besides the error rate, another fundamental parameter for measuring the efficiency of a decoding algorithm is complexity. In the case of the MRB algorithm, it is even the factor that determines the applicability of the method. A possible definition of complexity is in terms of the average number of binary operations required for decoding a single codeword. More practical issues, like chip area requirements, power consumption, number of look-up tables, number of flip-flops, and memory occupation, should be considered as well. These factors, however, are strongly dependent on the hardware architecture and design choices, while we aim at performing a more general complexity assessment, for which the number of operations turns out to be a suitable metric. The average number of binary operations required by the MRB algorithm can be expressed by the following formula
(4)
In (4), we have taken into account that some of the operations are performed on real values; q is the number of quantization bits used to represent a real number. As we can see, the third term on the right-hand side (r.h.s.) is proportional to the number of TEPs; therefore, using N_TEP instead of \sum_{w=0}^{i} \binom{k}{w}, as permitted by the application of the speed-up procedures described in Section 3.1, can yield a significant advantage. Equation (4) results from the evaluation of the computational effort required by the steps described in Section 3.1. In detail, the basic MRB decoding algorithm needs
• to order n real values;
• to perform Gauss-Jordan elimination on the k×n original generator matrix G to obtain the generator matrix G⋆ in systematic form, in which the information bits coincide with v⋆;
• to perform the vector-matrix multiplication v⋆G⋆;
• to generate, on average, N_TEP TEPs and perform the relevant calculations.
Next, the average complexity of the hybrid approach is
C_{hybrid} = C_{IA} + \alpha \, C_{MRB} \qquad (5)
where C_IA is the complexity of the IA preceding the (possible) MRB decoding attempt, C_MRB is the complexity of the MRB algorithm, and α represents the fraction of decoding instances in which MRB is invoked, i.e., the rate of the calls to MRB. This is because, in the hybrid approach, the MRB algorithm is invoked only when the IA fails. To be more precise, looking at the flow-chart of the hybrid decoder in Fig. 2, it should be noted that the rate of calls to the MRB decoder equals the detected error rate of the IA. In fact, when an undetected error occurs for the IA, decoding terminates unsuccessfully without any call to MRB. So, in principle, α is different from the CER, as the latter captures both detected and undetected errors. However, since the undetected error rate of an IA is usually orders of magnitude smaller than the detected one (unless the minimum distance of the code is very poor, which is not the case for the codes considered in this paper), we can assume α ≈ CER with a negligible error. The expression of C_IA in (5) depends on the adopted IA and can easily be obtained from the algorithm specification (details are omitted to save space). For SPA, MS, and NMS, we have
(6)
(7)
(8)
respectively, where I_ave is the average number of iterations and d_v is the average column weight of the parity-check matrix. The considered codes are both characterized by d_v = 4. Finally, it must be noted that, starting from the previous expressions, the corresponding decoding complexities per information bit can be obtained by dividing the r.h.s. of (4)–(8) by k.
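The effect of the call rate α on the average complexity can be illustrated with a few lines of Python. The sketch assumes that the hybrid complexity is the IA complexity plus α times the MRB complexity, as the description of (5) suggests. The operation counts below are hypothetical placeholders, not values from the paper.

```python
def hybrid_complexity(c_ia, c_mrb, alpha):
    """Average number of binary operations per codeword for the hybrid
    decoder, with MRB invoked in a fraction alpha of the decoding attempts."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be a probability")
    return c_ia + alpha * c_mrb


def per_information_bit(complexity, k):
    """Complexity normalized to the number of information bits."""
    return complexity / k


# Hypothetical operation counts chosen only to show the trend.
C_IA = 1.0e5      # assumed cost of the iterative algorithm
C_MRB = 1.0e8     # assumed cost of one MRB decoding attempt
avg_high_cer = hybrid_complexity(C_IA, C_MRB, 1e-1)
avg_low_cer = hybrid_complexity(C_IA, C_MRB, 1e-5)
```

With these placeholder numbers, the MRB term dominates at α = 0.1 but becomes a small correction at α = 10^−5, which is the regime of interest when α ≈ CER at the working point.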
ERROR RATE VERSUS COMPLEXITY TRADEOFF EVALUATION
In this section, the CER/complexity tradeoff offered by each of the decoding algorithms considered in Section 3 is assessed, when used to decode the two LDPC codes described in Section 2. As previously stated, the modulation format is BPSK. Moreover, a maximum number of iterations I max=100 has been used for each iterative algorithm. This value has been determined, from simulations, as sufficient to obtain the best possible performance from the adopted IAs for the considered codes. This means that no significant extra gain is achievable by using a value of I max larger than 100. An explicit example will be presented in Section 4.2. On the other hand, by using (6), (7), and (8), but with I max in place of I ave, we see that the complexity in the worst case (that is, when the decoder reaches the maximum number of iterations) grows linearly with I max. As we will show in Section 5, the worst-case complexity determines the worst-case latency, the latter quantity being another design parameter that must be properly limited. As another argument to justify the limited gain induced by an increase in the maximum number of iterations, we can observe that in the region of medium/low error rates, I ave is usually much lower than I max, thus confirming that the maximum number of iterations is rarely reached (but it determines the maximum latency in the sense specified above).
LDPC(128, 64) code
The CER curves for the LDPC(128, 64) code are shown in Fig. 3 in the ideal case in which all involved soft reliability values are unquantized, i.e., are represented with a very high precision (e.g., 32-bit floating point for each reliability value). The best performance is exhibited by the MRB algorithm with order i=4, either employed alone or in hybrid form. For the latter, we observe that performance is practically independent of the adopted IA; in fact, the CER curves obtained by using SPA + MRB(4) or NMS + MRB(4) overlap in the region of interest. The use of MRB achieves a gain on the order of 1.6 dB at CER = $10^{-5}$ over the IAs. Moreover, the gap to the union bound is very limited, on the order of 0.5 dB. The union bound has been plotted by making use of (1) and (2) (i.e., including the terms up to weight 24).
Figure. 3. Performance comparison among the considered decoding algorithms for the LDPC (128, 64) code: unquantized case
To confirm the effectiveness of the hybrid algorithm, we have carried out a further experiment, by simulating the performance of a mixed (purely theoretical and impractical) decoder that feeds the received sequence to the inputs of both decoders but keeps only the output of the successful decoder. We have verified that the CER performance of this “ideal” decoder practically coincides with that of the “real” hybrid decoder over the whole range of error rate values of interest. To analyze the impact of quantization on the performance curves shown in Fig. 3, two quantization rules have been adopted, namely, a linear rule and a logarithmic rule. The input/output characteristics of these two quantizers are shown, with an example, in Fig. 4a, b, respectively. While the linear quantization law is obvious, some explanations and comments about the
logarithmic one are reported in the Appendix. Further details can be found in [27]. Due to the large number of possible combinations of IA, quantization law, and number of quantization bits, we do not report the simulation curves for all considered decoding algorithms.
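For reference, a linear law of the kind mentioned above can be sketched as follows. This is a minimal sketch under stated assumptions, not the quantizer used in the simulations: it assumes a symmetric mid-rise characteristic with 2^q levels, a clipping threshold T, and a step d = 2T/2^q; the function name and the value T = 8 in the usage lines are hypothetical.

```python
import math


def linear_quantize(x: float, q: int, clip: float) -> float:
    """Uniform mid-rise quantizer with 2**q levels on [-clip, clip] (assumed layout)."""
    levels = 2 ** q
    step = 2.0 * clip / levels              # assumed relation between step, clip and q
    index = math.floor(x / step)            # cell containing x
    index = max(-(levels // 2), min(levels // 2 - 1, index))  # clipping
    return (index + 0.5) * step             # reconstruction point at the cell centre


# q = 4 as in the example of Fig. 4; the clipping threshold is a placeholder.
for value in (-100.0, -0.3, 0.3, 2.2, 100.0):
    print(value, "->", linear_quantize(value, q=4, clip=8.0))
```

A mid-rise layout has no zero output level, so the sign of a reliability value is never lost; whether the original implementation makes the same choice is not stated in the text.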
Figure. 4. Examples of (a) linear and (b) logarithmic quantization laws; q=4
In Fig. 5, the impact of quantization is shown for q∈{4,5,6} when using the hybrid algorithm, based on the SPA + MRB combination. The corresponding curves for the NMS algorithm are reported in Fig. 6. Looking at the figures, it is possible to conclude that q=6 is sufficient to reach practically the same CER of the ideal (i.e., unquantized) case. If q […] 200. On the other hand, as mentioned above, doubling the maximum number of iterations also doubles the worst-case latency and, taking into account typical design constraints (see Section 5 for details), the disadvantage is more significant than the advantage.
Figure. 8. Performance comparison among the considered decoding algorithms for the LDPC(512, 256) code: unquantized case
Figure. 9. CER for the LDPC(512, 256) code, with SPA decoding in the unquantized case, by assuming different values of the maximum number of iterations I max
As regards the MRB algorithm, the simulations have been performed with an order i=3, when the algorithm is used alone, as a larger order becomes too complex to manage for k=256 and n=512. This limited MRB order has a negative impact on the error rate performance, which turns out to be unsatisfactory. On the contrary, when the MRB algorithm is employed in the framework of the hybrid decoder, it is possible to increase its order up to i=4 with beneficial effects in terms of CER. Consequently, the performance of the hybrid decoder (here implemented with the SPA) proves to be the best one, although the gain over the SPA or the NMS algorithms used alone is much less remarkable than for the short code, being now on the order of 0.15 dB. Similarly to Figs. 5 and 6, in Figs. 10 and 11, the CER curves for the LDPC(512, 256) code are shown for a finite number of quantization bits. Specifically, Fig. 10 refers to the hybrid algorithm and Fig. 11 to the NMS algorithm. In both figures, results are shown for the two considered quantization laws with q∈{4,5,6}. The conclusions we can draw from this analysis are similar to those valid for the LDPC(128, 64) code. In this case, however, the loss resulting from the adoption of the linear law with q=4 is more pronounced (on the order of 0.8 dB). Hence, using q=4 is not advisable in this case, while q≥5 is enough to ensure a negligible loss. We also note that the NMS algorithm exhibits a sensitivity to the number of quantization bits that is slightly more evident than for the LDPC(128, 64) code.
Figure. 10. Impact of quantization on the performance of the hybrid algorithm (SPA + MRB(4)) for the LDPC(512, 256) code
Figure. 11. Impact of quantization on the performance of the NMS algorithm for the LDPC (512, 256) code
The complexity curves for the LDPC(512, 256) code, by assuming q=6, are illustrated in Fig. 12. The gap between the hybrid algorithm and the IAs obviously exists also in this case, but it is relatively less evident. In particular, at CER = $10^{-5}$, the hybrid algorithm and the SPA need almost the same average number of binary operations. At low E b /N 0 (or, equivalently, high CER), the MRB algorithm used alone is less complex than the hybrid algorithm. This is because, for the LDPC(512, 256) code, the plain MRB decoder uses order 3 while the hybrid decoder, for which the MRB algorithm is invoked more intensively when the channel quality is poor, uses order 4.
Figure. 12. Average number of binary operations per decoding attempt for the LDPC(512, 256) code in the case q = 6
IMPACT OF LIMITED MEMORY
The results presented in Section 4 have been obtained under the assumption that no constraints are put on the O/B available memory. Actually, in the optimal implementation of the MRB algorithm, an ordered TEP list is required. The size of this list can be very large, as it depends on the maximum number of TEPs, which for the LDPC(128, 64) code and i=4, for example, is $N_{\mathrm{TEP}}^{\max} = 679{,}121$. Such a large number of TEPs requires the availability of more than 2.7 MB of memory (thanks to the sparse character of the TEPs, it is convenient to store only the positions of the set bits). This value may be significantly larger than the memory available for decoding O/B of a spacecraft: looking at recent missions, a typical size of the O/B memory is in fact 0.5 MB. For this reason, it is of paramount importance to investigate the performance degradation resulting from:
• the adoption of a non-ordered list (that occurs when the TEP list is generated on-the-fly, rather than storing it in a memory);
• the adoption of an incomplete list (that occurs when only part of the list can be stored).
In the following, we consider the hybrid decoding algorithm, as it is more suitable (than plain MRB) in view of practical implementation. For the sake of simplicity, the numerical analysis is focused on the LDPC(128, 64) code under the assumption of using an MRB decoder of order i=4. Moreover, according to the sub-optimal mechanisms described in Section 3.1, we use a stopping threshold A=24.5. Such a value of A has been heuristically optimized for the considered code.
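The list size and memory figure quoted above can be checked with a short calculation. The sketch assumes that the TEPs are all error patterns of weight 0 to i over the k most reliable positions, and that each TEP is stored as i position entries of one byte each (consistent with the length-4 vectors of Example 1, but the byte width is an assumption); the function names are hypothetical.

```python
from math import comb


def max_tep_count(k: int, order: int) -> int:
    """Number of test error patterns of weight 0..order over k positions."""
    return sum(comb(k, weight) for weight in range(order + 1))


def tep_list_bytes(k: int, order: int, bytes_per_position: int = 1) -> int:
    """Memory for the full list when each TEP is stored as `order` position entries."""
    return max_tep_count(k, order) * order * bytes_per_position


count = max_tep_count(64, 4)
print(count)                          # 679121
print(tep_list_bytes(64, 4) / 1e6)    # 2.716484 (MB, with 1 byte per position)
```

Under these assumptions the count matches the value 679,121 given in the text, and the storage comes to about 2.72 MB, in line with the "more than 2.7 MB" stated above.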
As a first step of the analysis, we investigate the tradeoff between performance and complexity with and without the ordered TEP list. This is done in Fig. 13, by neglecting the impact of quantization, by considering the NMS + MRB(4) hybrid algorithm, and by assuming that all TEPs can potentially be tested. It is clear from the figure that ordering has no impact on the performance if all the TEPs can be considered as possible candidates. This is because the order of the TEPs in the ordered TEP list turns out to be not substantially different from the order in which they are generated on-the-fly. More precisely, when the TEPs are generated on-the-fly, they are typically organized in “ascending weight,” that is, from the smallest weight to the highest weight. Ordering, instead, looks at the probability of each TEP being chosen, organizing them from the highest probability to the lowest probability. The procedure for doing this is reported in [25].
Figure. 13. CER performance of the hybrid NMS + MRB(4) algorithm for the LDPC(128, 64) code with and without an ordered complete TEP list
In practice, ordering can be seen as a perturbation of the ascending weight rule that can cause some TEPs of weight j+z, with z an integer greater than 0, but with a higher probability, to be processed before some TEPs of weight j. An illustrative example is reported below.
Example 1
In case of i=4, the ordered TEP list is a collection of vectors with length 4, whose nonzero elements represent the positions of the symbols 1 in each TEP. Let us suppose that a portion of the ordered TEP list is
....
10 0 0 0
63 64 0 0
9 0 0 0
62 64 0 0
62 63 0 0
61 64 0 0
61 63 0 0
60 64 0 0
8 0 0 0
....
We see that patterns of weight 2, that is, with two nonzero coordinates, are intermingled with those of weight 1. For example, the pattern (63 64 0 0) is considered before the pattern (9 0 0 0). Similarly, the weight-1 pattern (8 0 0 0) is considered after some weight-2 patterns. The result shown in Fig. 13 is not surprising since, when the complete TEP list is considered, the perturbation induced by ordering does not affect the error rate performance (the assumption is that, potentially, all TEPs can be tested both with and without ordering). However, it may have an impact on the complexity. The latter statement is confirmed in Fig. 14, where the curve without TEP list ordering exhibits complexity values higher than those in the presence of ordering.
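The difference between the two orderings in Example 1 can be sketched in a few lines. The generator below produces TEPs on-the-fly in ascending weight; the lexicographic order inside each weight class is an assumption, since the text only specifies the ordering by weight. The list portion is the one from the example, and the helper names are hypothetical.

```python
from itertools import combinations


def teps_ascending_weight(k: int, order: int):
    """Yield TEPs on-the-fly as tuples of set-bit positions (1-based), lowest weight first."""
    for weight in range(order + 1):
        yield from combinations(range(1, k + 1), weight)


def tep_weight(row) -> int:
    """Weight of a stored TEP row: number of nonzero position entries."""
    return sum(1 for position in row if position != 0)


# Portion of the ordered list from Example 1 (zero entries are padding).
ordered_portion = [
    (10, 0, 0, 0), (63, 64, 0, 0), (9, 0, 0, 0), (62, 64, 0, 0), (62, 63, 0, 0),
    (61, 64, 0, 0), (61, 63, 0, 0), (60, 64, 0, 0), (8, 0, 0, 0),
]
print([tep_weight(row) for row in ordered_portion])   # weights are not monotone
print(list(teps_ascending_weight(3, 2)))              # weights never decrease
```

In the stored list the weight sequence goes up and down, whereas the on-the-fly generator never emits a lighter pattern after a heavier one; this is the "perturbation" referred to above.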
Figure. 14. Average number of binary operations per decoding attempt for the LDPC(128, 64) code in the case q=6 with and without ordered TEP list; the NMS + MRB(4) algorithm is used for decoding
Related to the complexity issue, it is also interesting to have a preliminary evaluation of the latency due to the MRB decoder. Latency is a measure of the time required for decoding and its value is normally subject to restrictions, which are expected to be particularly severe when the TC link is used in emergency conditions. To have a first estimate of the average latency, let us consider (4), which provides the average complexity for the MRB algorithm, and remove from it the parameter q (number of quantization bits). This is because we realistically suppose that, in a hardware implementation, a vector of q bits is processed simultaneously during each clock cycle. The first two terms in the resulting equation are due to sorting the received word and computing G⋆; as such, they are performed only once per decoding attempt. We call these terms the “fixed cost.” The third term, instead, depends on the number of TEPs and may be different for each decoding operation. We call this term the “variable cost.” As variable cost operations have to be performed for each TEP, they can be parallelized by assuming the availability of a number, denoted by N Teu, of different “TEP evaluation units.” This way, the average number of vector operations to be performed at each unit results in
(9)
where ⌈·⌉ represents the ceiling function. Denoting by f clock the clock frequency, the average latency can be estimated as
(10)
Examples of average latency estimates are reported in Table 1, assuming f clock=100 MHz, for three different values of E b /N 0. From the last column we see that at E b /N 0=3.5 dB (which, according to Fig. 5, is sufficient to ensure CER ≈ $10^{-5}$ for the case of q=6), a clock frequency equal to 100 MHz and a number of TEP evaluation units equal to 100 yield an average latency on the order of 3 ms. Note that both these values of f clock and N Teu are feasible in a field-programmable gate array (FPGA) implementation. In the absence of parallelization, i.e., N Teu=1, the average latency is about two orders of magnitude larger.
Table 1. Average latency (in seconds) due to the MRB decoding process, by assuming f clock=100 MHz
N Teu     E b /N 0=1 dB    E b /N 0=2 dB    E b /N 0=3.5 dB
1         4.05             2.21             2.45·10^−1
10        4.05·10^−1       2.21·10^−1       2.48·10^−2
100       4.09·10^−2       2.24·10^−2       2.77·10^−3
1000      4.41·10^−3       2.54·10^−3       5.60·10^−4
10,000    7.53·10^−4       5.81·10^−4       3.35·10^−4
As previously pointed out, another case that deserves investigation is the one where we have a reduced number of TEPs. Even in this case, it is interesting to distinguish the case when we use an ordered list stored in memory from the case when the TEPs are progressively generated, starting from the ones with the lowest weight. Figure 15 shows the CER, under hybrid decoding (NMS + MRB(4)), when exploiting the ordering; Fig. 16 illustrates the corresponding results in the absence of ordering. The number of TEPs is assumed to be variable between 10,000 and $N_{\mathrm{TEP}}^{\max} = 679{,}121$. As expected, while the performance is independent of ordering when the complete list is considered (as shown in Fig. 13), this is no longer true for a reduced list size. More precisely, when an ordered TEP list is adopted, considering 200,000 TEPs is enough to ensure that practically no loss occurs. On the contrary, if the list is non-ordered, the same result is achieved by using 400,000 TEPs or more, whereas if the maximum number of TEPs is set equal to 200,000, there is a loss on the order of 0.25 dB.
Figure. 15. CER performance of the LDPC(128, 64) code with hybrid decoding for different sizes of the ordered TEP list (K = $10^{3}$).
Figure. 16. CER performance of the LDPC(128, 64) code with hybrid decoding for different numbers of non-ordered TEPs (K = $10^{3}$).
The maximum number of TEPs can be used, in turn, to estimate the maximum latency (i.e., the latency in the worst case), according to the method described before. In this case, there is no dependence on the E b /N 0 value, as the worst-case latency occurs when all TEPs are needed for a single decoding operation. Table 2 reports the maximum latency, considering f clock=100 MHz, for both the cases with and without ordering, by assuming 200,000 TEPs and 400,000 TEPs respectively.
Table 2. Worst-case latency (in seconds) due to the MRB decoding process, with and without ordered list, by assuming f clock=100 MHz and a reduced number of TEPs
N Teu     Ordered list - 200,000 TEPs    Non-ordered list - 400,000 TEPs
1         8.40                           16.79
10        8.40·10^−1                     1.68
100       8.43·10^−2                     1.68·10^−1
1000      8.73·10^−3                     1.71·10^−2
10,000    1.17·10^−3                     2.01·10^−3
For N Teu=100, when the list is ordered the latency is at most about 84 ms while, when the list is non-ordered, it is at most about 170 ms. By tolerating a maximum latency of this size, it is possible to generate the TEPs on-the-fly with a minimum impact on the performance.
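The structure of the latency estimate (a fixed cost paid once, plus a variable cost paid for each batch of ⌈N TEP / N Teu⌉ patterns, divided by the clock frequency) can be sketched as follows. The two cost constants are not given in the text: they are back-fitted here so that the sketch reproduces Table 2 to the printed precision, and should be read as illustrative placeholders rather than as the terms of (4). The function and constant names are hypothetical.

```python
C_FIXED = 33_000     # assumed fixed cost in vector operations (back-fitted to Table 2)
C_PER_TEP = 4_198    # assumed cost per tested TEP (back-fitted to Table 2)
F_CLOCK = 100e6      # clock frequency in Hz, as in the text


def mrb_latency(n_tep: int, n_teu: int, f_clock: float = F_CLOCK) -> float:
    """Latency in seconds: (fixed cost + ceil(n_tep / n_teu) * per-TEP cost) / f_clock."""
    batches = -(-n_tep // n_teu)          # integer ceiling of n_tep / n_teu
    return (C_FIXED + batches * C_PER_TEP) / f_clock


print("N_Teu   ordered (200,000)   non-ordered (400,000)")
for n_teu in (1, 10, 100, 1000, 10_000):
    print(f"{n_teu:<7} {mrb_latency(200_000, n_teu):<19.2e} {mrb_latency(400_000, n_teu):.2e}")
```

The fitted fixed cost corresponds to roughly 0.33 ms at 100 MHz, which is why the latency stops scaling with N Teu once the number of evaluation units becomes large.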
In Table 2, the worst-case latency has been determined by considering the MRB algorithm only. In the hybrid approach, the latency due to the IA must be taken into account as well. To fix ideas, in the following we assume that the SPA is applied but, obviously, the analysis can be repeated for any other IA. Combining (9) with (6), where a generic number of iterations I (i.e., not necessarily the average value) is considered and the number q of quantization bits is omitted, for the reasons explained above, the total latency results in
(11)
where L IA represents the contribution due to the IA and L MRB the contribution due to the MRB algorithm.
Let us denote the worst-case latency by $L_{\mathrm{TOT}}^{w}$ and let us suppose that, because of mission constraints, it is forced to not exceed a maximum value L max, i.e.,
$$L_{\mathrm{TOT}}^{w} \le L_{\max} \qquad (12)$$
When SPA is used alone, (12) is satisfied by assuming the maximum admissible number of iterations, even larger than I max=100, although in Section 4.2 we have shown this may yield a negligible improvement. For the hybrid algorithm, instead, in order to satisfy (12), besides I we can choose the value of N TEP as well as the parallelization parameter N Teu. A numerical example is reported next.
Example 2
Let us assume L max = $10^{-2}$ seconds and N Teu=100. Figure 17 shows the tradeoff between I and N TEP that yields $L_{\mathrm{TOT}}^{w} \approx 10^{-2}$ seconds. The maximum admissible number of iterations, for the assigned value of L max, is I=244, which corresponds to applying only the IA, since no margin exists for invoking MRB. According to (11), Fig. 17 can also be used for different values of N Teu by properly scaling the values of N TEP.
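The tradeoff of Example 2 amounts to splitting a fixed budget of clock cycles between IA iterations and TEP evaluations. The sketch below illustrates the bookkeeping only: it assumes that the IA cost is linear in the number of iterations and that the MRB cost has a fixed part plus a per-batch part, and every cost constant in the usage lines is a hypothetical placeholder, not a value from the paper.

```python
def max_teps_within_budget(l_max: float, iterations: int, c_iter: int,
                           c_fixed: int, c_per_tep: int, n_teu: int,
                           f_clock: float = 100e6) -> int:
    """Largest N_TEP whose worst-case latency still fits in l_max (0 if MRB cannot run)."""
    budget_ops = round(l_max * f_clock)                    # clock cycles available
    remaining = budget_ops - iterations * c_iter - c_fixed
    if remaining < c_per_tep:
        return 0                                           # no margin for invoking MRB
    batches = remaining // c_per_tep                       # batches of n_teu parallel TEPs
    return batches * n_teu


# Placeholder costs: 4000 ops per iteration, 30,000 fixed ops, 4000 ops per TEP.
for iterations in (20, 100, 200, 250):
    print(iterations, max_teps_within_budget(1e-2, iterations, 4000, 30_000, 4000, n_teu=100))
```

Because the admissible N TEP is a whole number of batches times N Teu, changing N Teu rescales the N TEP axis, which is the scaling property of Fig. 17 mentioned in the example.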
Figure. 17. Combinations I−N TEP that yield $L_{\mathrm{TOT}}^{w} \approx 10^{-2}$ seconds
The degrees of freedom offered by the choice of I and N TEP, for a given value of N Teu, can be used to optimize the error rate performance. In other words, we can search for the combination which allows obtaining the minimum value of E b /N 0 for a given target CER. An example is reported next.
Example 3
As in Example 2, let us assume L max = $10^{-2}$ seconds. Moreover, we fix CER = $10^{-5}$. Table 3 reports the minimum values of E b /N 0 we have found, through a numerical search, for the case of N Teu=1 and N Teu=100, respectively, together with the corresponding optimal choice of I and N TEP. The choice of using SPA alone is also reported for the sake of reference.
Table 3. Values of E b /N 0 required to achieve CER = $10^{-5}$ for different decoder configurations
                 N Teu=1    N Teu=100    SPA used alone
E b /N 0 [dB]    4.75       4.09         5.10
I                170        20           244
N TEP            70         21,125       -
From the table, we see that the usage of the hybrid algorithm is advantageous with respect to the SPA used alone for both the considered cases, with a gain in the order of 0.35 dB when N Teu=1 and more than 1 dB when N Teu=100.
CONCLUSIONS
We have investigated the advantages resulting from the application of the MRB algorithm, used alone or in hybrid form, to the new short LDPC codes for TC space links. We have discussed the impact of quantization, showing that rather small numbers of quantization bits are sufficient to obtain performance very close to that of the ideal, unquantized case. Special attention has been devoted to the evaluation of the complexity, expressed as the average number of binary operations required per decoded codeword. Closed-form expressions have been adopted for such a purpose. Our investigation has revealed that, as opposed to the common belief, the hybrid algorithm is a realistic and appealing option, and that it allows achieving, for both codes, the best known performance with an acceptable complexity. An optimal implementation of the MRB algorithm would require the availability of an ordered TEP list and a rather large memory, which may be unavailable O/B. Therefore, we have also investigated the implications of using a non-ordered list and/or an incomplete list. We have shown that an “on-the-fly” implementation is possible with limited loss, and we have estimated the required decoding latency. Although the proposed analysis refers to a very specific application, that is, space TC links, many findings are useful in a wider sense, e.g., for the application of short codes to machine-to-machine communication or for ultra-reliable communication with short codes.
ENDNOTE
1. Since G is a full rank matrix by definition, its k rows being linearly independent vectors of length n>k, the Gauss-Jordan elimination always succeeds. However, in case the k columns of G corresponding to the initial k most reliable positions are not linearly independent, it may be necessary to replace some of the most reliable bits in v⋆ with some other bit outside the initial set.
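The replacement mechanism described in the endnote can be sketched as a Gauss-Jordan elimination over GF(2) that visits the columns in order of decreasing reliability and skips any column that depends on those already selected. This is a minimal sketch, not the authors' implementation; the function name, the (7, 4) Hamming generator matrix, and the reliability values are chosen only for illustration.

```python
def most_reliable_basis(generator, reliabilities):
    """Return (G_sys, basis): G reduced so that the chosen basis columns form an identity."""
    k, n = len(generator), len(generator[0])
    rows = [row[:] for row in generator]                  # work on a copy
    by_reliability = sorted(range(n), key=lambda j: -reliabilities[j])
    basis, current = [], 0
    for col in by_reliability:
        if current == k:
            break
        pivot = next((r for r in range(current, k) if rows[r][col]), None)
        if pivot is None:
            continue      # column depends on the ones already chosen: take the next one
        rows[current], rows[pivot] = rows[pivot], rows[current]
        for r in range(k):
            if r != current and rows[r][col]:
                rows[r] = [a ^ b for a, b in zip(rows[r], rows[current])]
        basis.append(col)
        current += 1
    if current < k:
        raise ValueError("generator matrix is not full rank")
    return rows, basis


# (7, 4) Hamming code; column 4 equals the sum of columns 0, 1 and 3.
G = [[1, 0, 0, 0, 1, 1, 0],
     [0, 1, 0, 0, 1, 0, 1],
     [0, 0, 1, 0, 0, 1, 1],
     [0, 0, 0, 1, 1, 1, 1]]
reliab = [0.9, 0.8, 0.3, 0.7, 0.6, 0.2, 0.1]   # positions 0, 1, 3, 4 are the most reliable
G_sys, basis = most_reliable_basis(G, reliab)
print(basis)
```

In this example the four most reliable positions are 0, 1, 3, and 4, but column 4 is dependent on the other three, so the elimination replaces it with position 2, the next one in reliability order.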
APPENDIX
Given a real number x, its logarithmic quantization is performed by using the following rule
(13)
where F is the so-called log-quantization factor. In our simulations, we have set F=0.67. Moreover
(14)
where T is the clipping threshold (i.e., the maximum value admitted) and d is the quantization step. Both T and d are related to the number of quantization bits, q. The main advantage of the logarithmic rule over the uniform one is that the quantization levels are denser for small input values. Since the LLR values close to 0 are responsible for maximum decoder uncertainty, reducing the quantization error on these values reduces the decoding errors. Further details can be found in [27].
ACKNOWLEDGEMENTS
The authors wish to thank Dr. Kenneth Andrews for helpful discussion on the weight spectrum estimation. They also wish to thank Prof. Ioannis Chatzigeorgiou for having suggested the test with the ideal decoder. This work was supported in part by the European Space Agency (ESA/ESTEC) under the contract 4000111690/14/NL/FE “Next Generation of Uplink Coding Techniques – NEXCODE”. The final goal of the NEXCODE Project is the hardware and software implementation of LDPC decoding schemes for TC.
COMPETING INTERESTS
The authors declare that they have no competing interests.
REFERENCES
1. JL Massey, in Advanced Methods for Satellite and Deep Space Communications, Lecture Notes in Control and Information Science, 182, ed. by J Hagenauer. Deep-space communications and coding: a marriage made in heaven (Springer, Heidelberg and New York, 1992), pp. 1–17.
2. GP Calzolari, M Chiani, F Chiaraluce, R Garello, E Paolini, Channel coding for future space missions: new requirements and trends. Proc. IEEE. 95(11), 2157–2170 (2007).
3. T de Cola, E Paolini, G Liva, GP Calzolari, Reliability options for data communications in the future deep-space missions. Proc. IEEE. 99(11), 2056–2074 (2011).
4. F Chiaraluce, in Proc. 22nd Conference on Software, Telecommunications and Computer Networks (SoftCOM 2014). Error correcting codes in telecommand and telemetry for European Space Agency missions: an overview and new perspectives (Split, 2014).
5. TK Moon, Error Correction Coding: Mathematical Methods and Algorithms (Wiley, Hoboken, 2005).
6. CCSDS, TC synchronization and channel coding. Blue Book. CCSDS 231.0-B-2 (2010).
7. ECSS, Space data links—telecommand protocols, synchronization and channel coding. ECSS-E-ST-50-04C (2008).
8. LR Bahl, J Cocke, F Jelinek, J Raviv, Optimal decoding of linear codes for minimizing symbol error rate. IEEE Trans. Inf. Theory. IT-20(2), 284–287 (1974).
9. M Baldi, F Chiaraluce, R Garello, N Maturo, I Aguilar Sanchez, S Cioni, Analysis and performance evaluation of new coding options for space telecommand links—Part I: AWGN channels. Int. J. Sat. Commun. Netw. 33(6), 509–525 (2015).
10. J Hagenauer, E Offer, L Papke, Iterative decoding of binary block and convolutional codes. IEEE Trans. Inf. Theory. 42(2), 429–445 (1996).
11. M Fossorier, M Mihaljevic, H Imai, Reduced complexity iterative decoding of low-density parity check codes based on belief propagation. IEEE Trans. Commun. 47(5), 673–680 (1999).
12. J Chen, MP Fossorier, Near optimum universal belief propagation based decoding of low-density parity check codes. IEEE Trans. Commun. 50(3), 406–414 (2002).
13. M Fossorier, S Lin, Soft-decision decoding of linear block codes based on ordered statistics. IEEE Trans. Inf. Theory. 41(5), 1379–1396 (1995).
14. Y Wu, CN Hadjicostis, Soft-decision decoding using ordered recodings on the most reliable basis. IEEE Trans. Inf. Theory. 53(2), 829–836 (2007).
15. M Baldi, F Chiaraluce, N Maturo, G Liva, E Paolini, A hybrid decoding scheme for short non-binary LDPC codes. IEEE Commun. Lett. 18(12), 2093–2096 (2014).
16. CCSDS, Short block length LDPC codes for TC synchronization and channel coding. Orange Book CCSDS 231.1-O-1 (2015).
17. M Baldi, M Bianchi, F Chiaraluce, R Garello, I Aguilar Sanchez, S Cioni, in Proc. IEEE 78th Vehicular Technology Conference (VTC Fall 2013). Advanced channel coding for space mission telecommand links (Las Vegas, 2013).
18. M Baldi, N Maturo, G Ricciutelli, F Chiaraluce, in 21st IEEE Symposium on Computer and Communications (ISCC 2016). On the error detection capability of combined LDPC and CRC codes for space telecommand transmissions (Messina, 2016), pp. 1105–1112.
19. M Baldi, N Maturo, F Chiaraluce, E Paolini, in Proc. 6th International Conference on Information and Communication Systems (ICICS 2015). On the applicability of the most reliable basis algorithm for LDPC decoding in telecommand links (Amman, 2015).
20. I Sason, S Shamai, Performance analysis of linear codes under maximum-likelihood decoding: a tutorial. Found. Trends Commun. Inf. Theory. 3(1–2), 1–225 (2006).
21. GC Clark, JB Cain, Error Correction Coding for Digital Communications (Plenum Press, New York, 1981).
22. K Andrews, Weight enumerators for LDPC codes. CCSDS 2015 spring meetings presentation, Pasadena (2015).
23. D Declercq, M Fossorier, in Proc. IEEE International Symposium on Information Theory (ISIT 2008). Improved impulse method to evaluate the low weight profile of sparse binary linear codes (Toronto, 2008), pp. 1963–1967.
24. X-Y Hu, MPC Fossorier, E Eleftheriou, in Proc. 2004 IEEE International Conference on Communications (ICC 2004), 2. On the computation of the minimum distance of low-density parity-check codes (Paris, 2004), pp. 767–771.
25. A Kabat, F Guilloud, R Pyndiah, in Proc. IEEE Global Telecommunications Conference (GLOBECOM ’07). New approach to order statistics decoding of long linear block codes (Washington, 2007), pp. 1467–1471.
26. M Fossorier, Iterative reliability-based decoding of low-density parity check codes. IEEE J. Select. Areas Commun. 19(5), 908–917 (2001).
27. M Baldi, F Chiaraluce, G Cancellieri, Finite-precision analysis of demappers and decoders for LDPC-coded M-QAM systems. IEEE Trans. Broadcast. 55(2), 239–250 (2009).
CHAPTER
14
Optimization of LDPC Codes over the Underwater Acoustic Channel
Shengxing Liu (1, 2) and Aijun Song (3)
(1) Department of Applied Marine Physics and Engineering, Xiamen University, Xiamen 361102, China
(2) Key Laboratory of Underwater Acoustic Communication and Marine Information Technology, Ministry of Education, Xiamen University, Xiamen 361102, China
(3) Department of Electrical and Computer Engineering, University of Alabama, Tuscaloosa, AL 35487, USA
Citation: Liu, S., & Song, A. (2016). “Optimization of LDPC Codes over the Underwater Acoustic Channel”. International Journal of Distributed Sensor Networks. https://doi.org/10.1155/2016/8906985
Copyright: © 2016 S. Liu and A. Song. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
ABSTRACT
To combat severe intersymbol interference incurred by multipath propagation of sound waves in the underwater acoustic environment, we introduce an iterative equalization and decoding scheme that iteratively exchanges soft information between a low-density parity check (LDPC) decoder and a decision feedback equalizer. We apply extrinsic information transfer (EXIT) charts to analyze the performance of LDPC codes over the acoustic multipath channel. Furthermore, using the differential evolution technique, we develop an EXIT-aided method to optimize LDPC codes for the underwater acoustic channel. Design examples are presented for two different realizations of the underwater acoustic channel that are generated by an acoustic ray tracing model. Computer simulations show that the optimized LDPC codes outperform their regular counterparts and Turbo codes under the same coding rate and block length, with gains of 1.0 and 0.8 dB, respectively, at a bit error rate of $10^{-5}$.
INTRODUCTION
Underwater acoustic communications have attracted much attention in recent years due to their broad application prospects in the civil and defense domains. Applications may include oceanographic data collection, ocean pollution monitoring, ocean exploration, and subsea tactical surveillance [1–4]. However, it is a challenging task to achieve reliable communication due to the difficulties imposed by the acoustic propagation physics. The major difference between the underwater acoustic and radio frequency electromagnetic channels is that the former is characterized by large multipath delays and Doppler dispersion. Most of the difficulties can be attributed to the low speed of sound in seawater, about 1500 m/s, which is five orders of magnitude slower than the speed of electromagnetic waves [5, 6]. Decision feedback equalizers (DFEs) are commonly used to combat intersymbol interference associated with multipath fading in the underwater acoustic channel. An adaptive DFE based on the combination of a recursive least squares (RLS) algorithm and a second-order phase-locked loop was developed for high-rate underwater acoustic communication systems [7]. Experiments conducted in multiple environments demonstrated significant data rate and performance advancements. An adaptive multichannel combining and DFE equalization scheme was proposed in [8]. Near-optimal spatial and temporal processing was achieved through joint minimum mean-squared error (MMSE) multichannel combining and equalization. Multichannel processing of underwater acoustic communication signals often leads to high computational costs. Complexity reduction was achieved by exploiting trade-offs between optimal diversity combining and beamforming in [9]. A sparse DFE structure that exploits the channel sparsity was developed in [10] to reduce receiver complexity.
Powerful error correcting coding techniques are often employed to further reduce the bit error rate (BER) in the challenging underwater acoustic channel conditions. Due to channel variability resulting from environmental fluctuations such as ocean mixing and dynamic surface waves, the DFE alone cannot achieve adequate performance for practical communication applications. For example, a BER of less than $10^{-3}$ is required for digital voice communication systems to synthesize natural speech. Convolutional codes and Reed-Solomon block codes are commonly used for real-time underwater acoustic communication systems due to their simplicity. These codes were still not adequate to satisfy the BER requirement [11, 12]. Space-time trellis codes, layered space-time codes, and schemes combining them were developed for high-reliability and high-data-rate communications [13]. A turbo equalizer that jointly performs channel estimation, maximum a posteriori (MAP) equalization, and channel decoding was introduced for single-carrier coherent underwater acoustic communication [14, 15]. Several studies have applied low-density parity check (LDPC) codes to the underwater acoustic environment, mostly for orthogonal frequency division multiplexing (OFDM) systems [3, 16]. In [16], nonbinary LDPC codes were proposed to enhance the performance of uncoded OFDM communication in the underwater acoustic environment. In [3], LDPC decoding is coupled with soft MMSE equalization for iterative detection on each subcarrier for multiple-antenna OFDM communications in the ocean. Irregular LDPC codes with an appropriate degree distribution may outperform Turbo codes under the same coding rate and block length [17]. However, LDPC codes exhibit a noise threshold phenomenon [18] when the iterative belief propagation decoding algorithm is used.
If the channel noise level is below the noise threshold, the BER converges to zero as the block length goes to infinity. Otherwise, an error floor exists. When the belief propagation algorithm is used to decode LDPC codes, extrinsic information is exchanged iteratively between the variable nodes and the check nodes, and the performance of the LDPC codes converges as the number of iterations increases. We can therefore analyze the performance, and determine the noise threshold, by tracking the evolution of the exchanged extrinsic information. Two types of methods, the density evolution algorithm [17] and the extrinsic information transfer (EXIT) chart [19], can be used to calculate the noise threshold. The density evolution algorithm determines the threshold by tracking the evolution of the variables’ average distributions, under the
assumption that the extrinsic information exchanged between the variable nodes and check nodes consists of independent random variables [17]. The density evolution algorithm has been widely used for analyzing and designing LDPC codes over different channels. Through the density evolution algorithm and related optimization techniques, multiple high-performance LDPC codes and their noise thresholds were given for the additive white Gaussian noise (AWGN) channel and the binary symmetric channel in [17], and for the flat Rayleigh fading channel in [20]. However, these LDPC codes may not perform well over intersymbol interference channels [21]. The density evolution algorithm was extended to investigate the performance limits of LDPC codes over binary linear intersymbol interference channels with AWGN [22]. A simplified version of the density evolution algorithm, namely, the Gaussian approximation, was developed by approximating message densities as Gaussians (for regular LDPC codes) or Gaussian mixtures (for irregular LDPC codes) [23]. The aim is to reduce the computational load involved in calculating the noise threshold and in optimizing degree distributions with the density evolution algorithm, tasks that are often difficult for common channel conditions. This simplification provides a faster way to calculate the noise threshold and an easier way to design optimized LDPC codes for the AWGN channel. The EXIT chart was developed initially to analyze the performance of parallel concatenated codes by tracking the evolution of average mutual information [19]. It was extended to analyze the performance of LDPC codes by matching the EXIT curves of the variable node decoder (VND) to those of the check node decoder (CND) [24]. Optimized LDPC codes were designed for the AWGN channel and for the multiple-input multiple-output (MIMO) setting [24].
High-quality LDPC codes obtained by optimizing the EXIT charts were given for a flat Rayleigh fading channel in [25], for the Poisson pulse-position modulation channel in [26], and for high-dimension MIMO channels in [27]. Compared with the density evolution algorithm, the EXIT chart is simpler to implement and can be readily applied to multipath environments such as the underwater acoustic channel. Here we focus our efforts on a single-carrier communication system in the underwater acoustic environment. The underwater acoustic channel generally has both deterministic and stochastic properties. In a typical scenario, the deterministic characteristics depend on the sound frequency, surface and bottom acoustic properties, ocean sound speed
profile, and locations of the transmitter and receiver. This means that the multipath structure, or impulse response, of the underwater acoustic channel may differ considerably from one environment to another. The stochastic characteristics result from the rough and time-varying sea surface. We deal with the underwater acoustic channel at different transmission ranges. We use the BELLHOP ray tracing model [28] to generate the impulse responses of specific channel realizations. We employ a DFE to process the received signal first and then use the belief propagation algorithm to decode the information bits. To further improve performance, an iterative equalization and decoding scheme, termed the iterative DFE-LDPC structure, is proposed. In the DFE-LDPC structure, the soft information from the LDPC decoder is fed to the DFE as prior information. Since the DFE uses the channel information as well as the soft information from the LDPC decoder, the performance of the iterative scheme improves with iterations. Few investigations have been devoted to LDPC codes for single-carrier systems in the acoustic channel. One related effort was presented in [29], where soft-decision feedback equalization was combined with LDPC codes to provide robust detection performance for MIMO communications with different modulation schemes at different symbol rates, over different transmission ranges in the ocean. In this paper, we extend the EXIT chart to analyze the performance of LDPC codes over the underwater acoustic channel. Since the performance of LDPC codes is strongly related to their degree distributions, and different impulse responses may call for different degree distributions, we propose an EXIT-aided optimization for specific channel realizations. Optimization of LDPC codes often turns into minimization of a nonlinear cost function, for which differential evolution has been shown to be effective and robust [30].
Differential evolution has been successfully applied to the optimization of LDPC codes for both the AWGN channel [17] and the flat Rayleigh fading channel [20]. We show that this technique is also effective for the underwater acoustic channel. Design examples are given for two different channel realizations. Performance comparisons among the optimized LDPC codes, a regular LDPC code, and a Turbo code are provided over the two channel realizations. The remainder of the paper is organized as follows. The underwater acoustic channel model is introduced and two impulse responses are given in Section 2. The iterative DFE-LDPC structure and its EXIT charts are presented in Section 3. Section 4.1 discusses the EXIT-aided method of code optimization by using the differential evolution technique.
Section 4.2 presents design examples of LDPC codes. Section 5 provides the conclusion.
UNDERWATER ACOUSTIC CHANNEL MODEL
Transmission characteristics of underwater acoustic channels are affected by multiple factors, including the operating frequency, sound speed profile, sea surface conditions, bathymetry, sediment properties, and source-receiver geometry and mobility. BELLHOP is a beam tracing model for predicting underwater acoustic sound propagation [28]. With known physical parameters of the ocean environment and source-receiver settings, both time-invariant and time-varying impulse responses can be simulated efficiently with the model [28, 31]. We focus on investigating the effects of multipath propagation on the performance of LDPC codes; therefore, only time-invariant impulse responses are considered in this paper. In simulating the impulse responses, we assume a flat sea surface and a flat ocean bottom in 100 m water depth, with the sound speed profile shown in Figure 1, left [32]. We assume a silt ocean bottom with a sound speed of 1600 m/s, a density of 1.1 km/m3, and an attenuation coefficient of 0.8 dB/m. We assume a maximum sea surface wind speed of 10 m/s to infer the reflection characteristics. The transmitting and receiving nodes are positioned at 10 and 80 m above the seafloor. The acoustic operating frequency is 15 kHz.
Figure 1. Ray diagrams of the communication ranges: (a) UWAC1 and (b) UWAC2.
We consider two communication ranges, 1 and 2 km, to investigate the multipath structures and their impacts on equalization performance. The two associated impulse responses are referred to as UWAC1 and UWAC2, respectively, throughout the paper. Figures 1 and 2 show the ray diagrams
and the impulse responses for the two ranges. As shown, the number of paths increases from 7 to 11 as the communication range increases from 1 to 2 km, and the delay spread increases from 33 to 72 ms. In both cases, the amplitude of the channel taps decreases with arrival delay because of the losses from sea surface reflection and bottom attenuation. The reflection loss from the sea surface is more severe than the bottom attenuation because of the high acoustic operating frequency and the relatively high wind speed.
Figure 2. Impulse responses: (a) UWAC1 and (b) UWAC2.
Let the impulse response of the underwater acoustic channel be given by
$$h(t) = \sum_{l=1}^{L} A_l \, \delta(t - \tau_l), \tag{1}$$
where $L$ is the number of channel taps and $A_l$ and $\tau_l$ are the amplitude and time delay of the $l$th tap, respectively. Then, the received signal $y(t)$ is
$$y(t) = \sum_{l=1}^{L} A_l \, x(t - \tau_l) + w(t), \tag{2}$$
where $x(t)$ is the transmitted signal and $w(t)$ is the ocean ambient noise. The ocean ambient noise may include turbulence noise, shipping traffic noise, thermal noise, and wind-driven noise [33]. We model $w(t)$ as AWGN here.
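The tap-delay model in (1) and (2) can be sketched numerically as follows. This is only a minimal illustration, not the simulation used in the chapter: the function name, the sample rate, and the tap amplitudes and delays are made-up values (only the 33 ms maximum delay echoes the UWAC1 delay spread), the signal is real-valued baseband, and each delay is rounded to the nearest sample.

```python
import numpy as np

def multipath_channel(x, fs, amplitudes, delays_s, noise_std, rng):
    """Sampled form of (2): y[n] = sum_l A_l * x[n - round(tau_l * fs)] + w[n]."""
    x = np.asarray(x, dtype=float)
    shifts = np.round(np.asarray(delays_s) * fs).astype(int)  # delays in samples
    y = np.zeros(len(x) + shifts.max())
    for a, d in zip(amplitudes, shifts):
        y[d:d + len(x)] += a * x           # delayed, scaled copy for one path
    w = rng.normal(0.0, noise_std, size=y.shape)  # AWGN stand-in for ambient noise
    return y + w

rng = np.random.default_rng(0)
fs = 10_000.0                  # hypothetical sample rate in Hz
amps = [1.0, 0.5, 0.2]         # illustrative tap amplitudes A_l
delays = [0.0, 0.004, 0.033]   # illustrative tap delays tau_l in seconds
x = rng.choice([-1.0, 1.0], size=200)   # random BPSK-like symbols
y = multipath_channel(x, fs, amps, delays, noise_std=0.0, rng=rng)
```

With the noise switched off, each output sample is just the sum of the delayed copies that overlap at that instant, which makes the intersymbol interference introduced by the later taps easy to inspect.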
ITERATIVE DFE-LDPC STRUCTURE AND ITS EXIT CHARTS
We propose an iterative equalization and decoding scheme, referred to as the iterative DFE-LDPC structure, to combat the severe multipath in the underwater acoustic channel. We obtain EXIT charts of the DFE-LDPC structure to analyze the performance of LDPC codes over underwater acoustic channels.
In the proposed structure shown in Figure 3, during the initial iteration the received signal distorted by the underwater acoustic channel is first processed by a DFE and then passed to an LDPC decoder. In subsequent iterations, the soft decoding result from the LDPC decoder is fed back to the DFE as a priori information. Based on the received signal and the a priori information, the DFE improves its performance, and the LDPC decoder in turn improves its decoding performance. This iterative process continues until a termination criterion is met: either the iteration number exceeds the maximum iteration number or decoding is successful. Two different iterative processes exist in this structure. One is the iteration between the VND and the CND within LDPC decoding. The other is the iteration between the DFE and the LDPC decoder. These are referred to as the inner and outer iterative processes, respectively. We set the maximum numbers of inner and outer iterations to 2 and 20, respectively, in our simulations. Figure 3 also illustrates the flow of information exchanged between the LDPC decoder and the DFE and within the LDPC decoder.
Figure 3. Iterative DFE-LDPC structure with the information exchange flow illustrated.
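The control flow of the outer and inner iterations described above might be organized roughly as in the following sketch. It shows only the loop structure: `equalize` and `decode` are hypothetical callables standing in for the DFE and the belief propagation decoder, the toy stand-ins at the bottom have no physical meaning, and the convention that a positive LLR means bit 0 is an assumption.

```python
import numpy as np

def iterative_dfe_ldpc(y, equalize, decode, max_outer=20, max_inner=2):
    """Outer loop: DFE <-> LDPC decoder. The inner VND/CND loop lives in `decode`.

    equalize(y, prior_llr) -> LLRs for the coded bits   (hypothetical DFE)
    decode(llr, max_inner) -> (soft output LLRs, success flag)  (hypothetical decoder)
    """
    prior = None          # no a priori information on the first pass
    bits, outer = None, 0
    for outer in range(1, max_outer + 1):
        llr_eq = equalize(y, prior)
        llr_dec, ok = decode(llr_eq, max_inner)
        bits = (llr_dec < 0).astype(int)   # assumed convention: LLR > 0 means bit 0
        if ok:                              # stop as soon as decoding succeeds
            break
        prior = llr_dec                     # feed soft output back to the equalizer
    return bits, outer

# Toy stand-ins, only to exercise the loop.
def toy_equalize(y, prior):
    return y if prior is None else y + 0.5 * prior

def toy_decode(llr, max_inner):
    out = llr.copy()
    for _ in range(max_inner):
        out = 1.5 * out   # placeholder for one VND/CND message-passing sweep
    return out, bool(np.all(np.abs(out) > 4.0))

bits, n_outer = iterative_dfe_ldpc(np.array([1.0, -2.0, 0.5]), toy_equalize, toy_decode)
```

In a real receiver the success flag would typically come from a parity check on the hard decisions, and the quantity fed back would be whatever soft information the equalizer is designed to accept.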
The DFE performs detection by considering all $2^M$ possible hypotheses of the modulation constellation on the symbol $s$. The DFE computes the log-likelihood ratio (LLR): (3) where $c_i$ is the $i$th coded bit mapped onto the vector symbol $s$ for the channel output $y$ and $c'$ is the a priori information from the LDPC decoder. The EXIT function of the DFE is given by
(4) where $E_b$ is the energy per bit and $N_0$ is the noise power spectral density; $I_{A,\mathrm{DFE}}$ and $I_{E,\mathrm{DFE}}$ are the a priori and a posteriori mutual information for the DFE, respectively. The EXIT curve of the DFE is very complex, since it depends not only on the channel state but also on the parameters of the DFE, and a closed-form solution cannot be obtained. The EXIT curve of the DFE can, however, be obtained by Monte Carlo simulations. If we assume that both the information exchanged between the DFE and the VND and the information exchanged between the VND and the CND follow a Gaussian distribution, the EXIT function of the VND with degree $d_v$ is given by
(5) where $I_{A,\mathrm{VND}}$ and $I_{E,\mathrm{VND}}$ are the a priori and the a posteriori mutual information of the VND, respectively; $R$ is the code rate of the LDPC code; and $J(\cdot)$ is a function of mutual information [24]. The EXIT function of the CND with degree $d_c$ is given by (6) where $I_{A,\mathrm{CND}}$ and $I_{E,\mathrm{CND}}$ are the a priori and a posteriori mutual information of the CND, respectively. After one or more iterations, the a priori mutual information $I_{A,\mathrm{DFE}}$ of the DFE is given by (7) where $I_{E,\mathrm{DFE}}$ is the mutual information from the previous iteration.
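As a rough numerical illustration of the $J(\cdot)$ function and of node EXIT curves, the sketch below evaluates $J$ by direct integration of its usual definition (the mutual information between a bit and a Gaussian LLR with mean $\sigma^2/2$ and variance $\sigma^2$). The VND and CND expressions are the commonly used Gaussian-approximation forms for a plain AWGN channel with BPSK, which is an assumption made here: they are not the chapter's equations (5) and (6), which also involve the DFE output. All function names are illustrative.

```python
import numpy as np

def J(sigma, n=4001):
    """Mutual information between a bit and its LLR ~ N(sigma^2/2, sigma^2)."""
    if sigma < 1e-9:
        return 0.0
    mu = sigma**2 / 2.0
    xi = np.linspace(mu - 10 * sigma, mu + 10 * sigma, n)
    pdf = np.exp(-(xi - mu) ** 2 / (2 * sigma**2)) / np.sqrt(2 * np.pi * sigma**2)
    f = pdf * np.logaddexp(0.0, -xi) / np.log(2.0)   # pdf * log2(1 + exp(-xi))
    dx = xi[1] - xi[0]
    return 1.0 - float(np.sum(0.5 * (f[1:] + f[:-1])) * dx)  # trapezoid rule

def J_inv(I, lo=0.0, hi=40.0):
    """Invert J by bisection; J is increasing in sigma."""
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if J(mid) < I:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def exit_vnd(I_A, d_v, ebn0_db, R):
    """Assumed AWGN/BPSK form: J(sqrt((d_v - 1) * J_inv(I_A)^2 + 8 R Eb/N0))."""
    sigma_ch2 = 8.0 * R * 10 ** (ebn0_db / 10.0)
    return J(np.sqrt((d_v - 1) * J_inv(I_A) ** 2 + sigma_ch2))

def exit_cnd(I_A, d_c):
    """Common approximation: 1 - J(sqrt(d_c - 1) * J_inv(1 - I_A))."""
    return 1.0 - J(np.sqrt(d_c - 1) * J_inv(1.0 - I_A))

# Example: sample both curves on a coarse grid for a (3, 6)-regular code at 1 dB.
grid = np.linspace(0.0, 1.0, 6)
vnd_curve = [exit_vnd(i, d_v=3, ebn0_db=1.0, R=0.5) for i in grid]
cnd_curve = [exit_cnd(i, d_c=6) for i in grid]
```

Plotting `vnd_curve` against `grid` and the inverse of `cnd_curve` on the same axes would show whether a decoding tunnel stays open at the chosen signal-to-noise ratio, which is the matching criterion the EXIT analysis relies on.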
For irregular LDPC codes, the EXIT curve of the VND is the weighted average of the EXIT curves of the VND with degrees $d_i$ ($i = 2, 3, \ldots, d_{v,\max}$): that is,
(8) where $d_{v,\max}$ is the maximal degree of the variable nodes and $\lambda_i$ is the fraction of edges in the bipartite graph that are connected to variable nodes of degree $i$. Similarly, the EXIT curve of the CND is the weighted average of the EXIT curves of the CND with degrees $d_j$ ($j = 2, 3, \ldots, d_{c,\max}$): that is, (9) where $d_{c,\max}$ is the maximal degree of the check nodes and $\rho_j$ is the fraction of edges in the bipartite graph that are connected to check nodes of degree $j$. In Monte Carlo simulations, it is easy to obtain the EXIT curves of the VND through (5) and the EXIT curves of the CND through (6). Figure 4 shows the EXIT curves at coding rate $R = 1/2$ and SNR $E_b/N_0 = 7$ over the two impulse responses, UWAC1 and UWAC2. In the calculation, the DFE has 100 taps: 50 for the feedback filter and 50 for the feedforward filter. The recursive least squares (RLS) algorithm with a forgetting factor of 0.95 is used to update the DFE filter coefficients. As shown, the EXIT curves of the VND vary with the degree of the variable nodes. They also differ between the two impulse responses, even under the same SNR, coding rate, and DFE parameters. However, the EXIT curves of the CND depend only on the degree of the check nodes.
Figure 4. EXIT curves of the iterative DFE-LDPC structure over the two types of impulse responses: (a) EXIT curves of the VND and (b) EXIT curves of the CND.
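The weighted averaging used for irregular codes amounts to a dot product between the edge fractions and the per-degree curves, as in the sketch below. The per-degree curves here are synthetic placeholders rather than simulated EXIT curves, the degree set and fractions are made up, and `design_rate` uses the standard relation between edge-perspective degree distributions and code rate, which is included as background and is not quoted from the text.

```python
import numpy as np

def mix_exit_curves(edge_fractions, curves):
    """Weighted average sum_i lambda_i * I_E(d_i) of per-degree EXIT curves."""
    w = np.asarray(edge_fractions, dtype=float)
    c = np.asarray(curves, dtype=float)    # shape (num_degrees, num_points)
    if not np.isclose(w.sum(), 1.0):
        raise ValueError("edge fractions must sum to 1")
    return w @ c

def design_rate(lam, rho):
    """R = 1 - (sum_j rho_j / j) / (sum_i lambda_i / i), edge-perspective fractions."""
    num = sum(r / j for j, r in rho.items())
    den = sum(l / i for i, l in lam.items())
    return 1.0 - num / den

# Illustrative values only: three variable-node degrees with made-up fractions.
I_A = np.linspace(0.0, 1.0, 5)
degrees = (2, 3, 10)
curves = np.array([I_A ** (1.0 / d) for d in degrees])   # synthetic placeholders
mixed = mix_exit_curves([0.3, 0.4, 0.3], curves)
rate_regular = design_rate({3: 1.0}, {6: 1.0})           # (3, 6)-regular code
```

Because the mixture is linear in the edge fractions, the combined VND curve can be recomputed cheaply for every candidate degree distribution once the per-degree curves have been obtained.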
OPTIMIZATION OF LDPC CODES
By matching the EXIT curves of the VND to those of the CND, we can analyze the coding performance and obtain the noise threshold of any LDPC code over the underwater acoustic channel. Furthermore, good LDPC codes can be found by using an optimization algorithm to search the space of degree distributions for the highest noise threshold. The search is very complex because many parameters must be determined simultaneously. Differential evolution is a fairly fast and reasonably robust method that optimizes a solution by iteratively improving candidate solutions with respect to a given measure of quality [30]. We use the differential evolution technique to search for an optimized LDPC code here.
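A differential evolution search over vectors that sum to 1 could look roughly like the sketch below. Several choices here are assumptions made for illustration: the mutation rule (best candidate plus a scaled sum of two difference vectors), the clip-and-renormalize repair that keeps candidates valid, and above all the toy score, which stands in for the EXIT-based noise threshold that the actual design would evaluate.

```python
import numpy as np

def project_to_simplex(v, eps=1e-9):
    """Clip negatives and renormalize so the entries sum to 1 (simple repair rule)."""
    v = np.clip(v, eps, None)
    return v / v.sum()

def differential_evolution(score, D, M=None, F=0.5, iters=100, seed=0):
    """Maximize score(u) over D-dimensional vectors u with nonnegative entries summing to 1."""
    rng = np.random.default_rng(seed)
    M = M or 10 * D                                # population size
    U = rng.dirichlet(np.ones(D), size=M)          # random candidates on the simplex
    s = np.array([score(u) for u in U])
    for _ in range(iters):
        best = U[np.argmax(s)].copy()
        V = np.empty_like(U)
        for m in range(M):
            others = np.delete(np.arange(M), m)    # indices different from m
            m1, m2, m3, m4 = rng.choice(others, size=4, replace=False)
            # Assumed mutation: best vector plus two scaled difference vectors.
            V[m] = project_to_simplex(best + F * (U[m1] - U[m2] + U[m3] - U[m4]))
        sv = np.array([score(v) for v in V])
        improved = sv > s                          # keep a trial only if it scores higher
        U[improved], s[improved] = V[improved], sv[improved]
    k = int(np.argmax(s))
    return U[k], s[k]

# Toy objective standing in for the noise threshold: closeness to a made-up target.
target = np.array([0.2, 0.3, 0.5])
u_best, s_best = differential_evolution(lambda u: -np.sum((u - target) ** 2), D=3)
```

Replacing the toy score with a routine that returns the noise threshold for a given vector of edge fractions would turn this loop into a degree-distribution search of the kind discussed in this section.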
Differential Evolution Algorithm
Because the performance of an LDPC code depends mainly on the degree distribution of the variable nodes and only slightly on that of the check nodes, we assume for simplicity that the check nodes take one (or two) distinct degrees, namely $d_c$ (or $d_c$ and $d_c + 1$). The number of distinct variable-node degrees is greater than or equal to 3 and is kept fixed during the optimization. Once the parameters $\lambda_i$ ($i = 2, 3, \ldots, d_{v,\max}$) have been determined, $d_c$ and its corresponding $\rho_{d_c}$ are determined by the following equation:
(10) With the constraint in (10), the differential evolution algorithm for searching for a good degree distribution of the LDPC codes is described as follows.
Initialization.
(i) Choose $D$ distinct integers from the set $\{2, 3, \ldots, d_{v,\max}\}$ as the degrees of the variable nodes. Note that the three numbers 2, 3, and $d_{v,\max}$ must be chosen, and the other $D - 3$ numbers are chosen randomly from the remaining numbers.
(ii) For the initial iteration $k = 0$, randomly generate $M$ ($M \geq 10D$) $D$-dimensional vectors $U_{m,k}$, $m = 0, 1, \ldots, M - 1$, whose elements sum to 1. For each vector $U_{m,k}$, we obtain its corresponding noise threshold $\sigma_{m,k}$ of the LDPC code by matching the EXIT curves of the VND to those of the CND. We label the largest $\sigma_{m,k}$ as $\sigma_{\mathrm{best},k}$, and its corresponding vector as $U_{\mathrm{best},k}$.
(1) Mutation. For each $m = 0, 1, \ldots, M - 1$, randomly choose four distinct integers $m_1$, $m_2$, $m_3$, and $m_4$ from the set $\{0, 1, \ldots, M - 1\}$, each different from the index $m$. Define (11) where $F$ is a real constant that controls the amplification of the differential variation. We choose $F = 0.5$ in our optimization. For each $V_{m,k}$, we obtain its corresponding noise threshold $\sigma'_{m,k}$ of the LDPC code by matching the EXIT curves of the VND to those of the CND. We label the largest $\sigma'_{m,k}$ as $\sigma'_{\mathrm{best},k}$, and its corresponding vector as $V_{\mathrm{best},k}$.
(2) Evolution. For iteration $k + 1$, if $\sigma_{m,k}$ is smaller than $\sigma'_{m,k}$, $U_{m,k+1}$ is set to $V_{m,k}$; otherwise, $U_{m,k+1}$ is set to $U_{m,k}$. If $\sigma_{\mathrm{best},k}$ is smaller than $\sigma'_{\mathrm{best},k}$, $\sigma_{\mathrm{best},k+1}$ is set to $\sigma'_{\mathrm{best},k}$ and $U_{\mathrm{best},k+1}$ is set to $V_{\mathrm{best},k}$. Otherwise, $\sigma_{\mathrm{best},k+1}$ is set to $\sigma_{\mathrm{best},k}$ and $U_{\mathrm{best},k+1}$ is set to $U_{\mathrm{best},k}$.
(3) Stop. If k0, then it is more likely that xi=0; conversely, if ri