English Pages XVI+458 [475] Year 2016
Nataša Živić Modern Communications Technology De Gruyter Graduate
Also of Interest Coding and Cryptography N. Živić, 2013 ISBN 978-3-486-75212-0, e-ISBN (PDF) 978-3-486-78126-7
Chaotic Secure Communication K. Sun, 2016 ISBN 978-3-11-042688-5, e-ISBN (PDF) 978-3-11-043406-4, e-ISBN (EPUB) 978-3-11-043326-5, Set-ISBN 978-3-11-043407-1
Computation and Communication Technologies S.T. Kumar, B. Mathivanan (Eds.), 2016 ISBN 978-3-11-045007-1, e-ISBN (PDF) 978-3-11-045010-1, Set-ISBN 978-3-11-045293-8
Communication and Power Engineering R. Rajesh, B. Mathivanan (Eds.), 2016 ISBN 978-3-11-046860-1, e-ISBN (PDF) 978-3-11-046960-8, Set-ISBN 978-3-11-046961-5
Author Dr.-Ing. habil. Nataša Živić University of Siegen Chair for Data Communications Systems Hoelderlinstrasse 3, D-57068 Siegen Germany
ISBN 978-3-11-041337-3 e-ISBN (PDF) 978-3-11-041338-0 e-ISBN (EPUB) 978-3-11-042390-7 Library of Congress Cataloging-in-Publication Data A CIP catalog record for this book has been applied for at the Library of Congress. Bibliographic information published by the Deutsche Nationalbibliothek The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available on the Internet at http://dnb.dnb.de. © 2016 Walter de Gruyter GmbH, Berlin/Boston cover image: agsandrew/iStock/thinkstock Printing and binding: CPI books GmbH, Leck ♾ Printed on acid-free paper Printed in Germany www.degruyter.com
Special Thanks
I would like to thank everybody involved in the writing, proofreading, material search, consulting and publishing of this book. In particular, I would like to express my special thanks to Professor Christoph Ruland for his general support. I am much obliged to my esteemed colleague Obaid ur Rehman for his great help in preparing this manuscript, as well as to my colleagues Iva Salom and Dejan Todorovic for their contribution on the audio signal in Chapter 2. I express my appreciation to all my colleagues from the Institute for Data Communications Systems in Siegen for their collegiality and support. My special thanks go to Amir Tabatabaei, as well as to Robin Fay for consultancy and to Tao Wu for preparing some of the figures. I am very grateful to Professor Ljiljana Knezevic for her help in writing the book in English. I would also like to express my gratitude to my family for their patience, their support in everyday life and their tolerance of the time spent on preparing the book. Finally, I thank the publisher for their patience and professionalism throughout our cooperation. The author Siegen, June 2016
Foreword
This book draws on a decade of experience in teaching “Fundamentals of Communications” and “Digital Communications Technology” at the University of Siegen, Germany. One part of the material is therefore based on the lecture notes used for these courses, as well as on the lecture notes for “Cryptographic methods and applications”. Another part is based on the author’s professional experience, which contributes material needed in engineering practice. Finally, several parts are the result of research work. There are numerous books about communications technologies that provide basic knowledge for students and engineers. These books more or less cover the topics that are indispensable for learning communications technologies. With that in mind, this book also presents these essential topics, but without the level of detail that can be found in the existing literature. For example, modulation, line coding and the transmission channel are placed in one chapter instead of three separate chapters, as is common practice in most of the literature. Similarly, information theory and source and channel coding are merged into one chapter. In return, several topics that are not emphasized in most general books about communications technologies are given more space and greater emphasis here, as they are important for the state of the art and possibly for the future development of communications. Therefore, one chapter is devoted solely to transmission over the wireless channel, as wireless communications dominate nowadays; another chapter is dedicated to wired transmission, with an emphasis on modern digital technologies and optical transmission; and a separate chapter addresses cryptography, which is indispensable in today’s communication systems.
It is never easy to find the optimal amount of content and to introduce the mathematical and physical background necessary for an understandable and sufficient explanation of the different topics, terms and concepts. This task is even more difficult when past technology has to be explained together with modern technology, especially considering the fast progress and convergence of communications technologies. It is up to the reader to judge how far the book has succeeded in its attempt to combine basic knowledge with modern trends. The author Siegen, June 2016
Contents 1 Signals and Systems 1 1.1 Communication System 1 1.2 Data and Signals 3 1.2.1 Representation of Signals 7 1.2.2 Harmonic Analysis of Periodic Signals 11 1.2.3 Harmonic Analysis of Aperiodic Signals 20 1.2.4 Correlation and Convolution 22 1.2.4.1 Correlation of Periodic Signals 23 1.2.4.2 Convolution of Periodic Signals 26 1.2.4.3 Correlation of Aperiodic Signals 28 1.2.4.4 Convolution of Aperiodic Signals 29 1.2.5 Bandwidth and Filtering 31 1.3 Discretization 32 1.4 Compressive Sensing 36 1.4.1 Background 36 1.4.2 Compressive Sensing Model 39 1.4.3 Conditions for Signal (Sparse) Recovery in Compressive Sensing 39 1.5 Discrete and Digital Signals 41 1.5.1 Discrete Signals 41 1.5.1.1 Finite Impulse Response 43 1.5.1.2 Infinite Impulse Response 44 1.5.2 Digital Signals 46 1.6 Discrete and Fast Fourier Transform 47 1.6.1 Discrete Fourier Transform (DFT) 47 1.6.2 Spectrum Forming and Window Functions 48 1.6.3 Fast Fourier Transform (FFT) 51 2 Typical Communications Signals 57 2.1 Speech 57 2.1.1 Production and Modelling 57 2.1.2 Speech Channel Bandwidth and Power of Speech Signal 60 2.1.3 Current Values of Speech Signal 63 2.1.4 Coding and Compression 64 2.2 Audio 69 2.2.1 Sound and Human Auditory System 70 2.2.2 Audio Systems 73 2.2.3 Digital Audio Systems 76 2.2.3.1 Audio Coding 77 2.2.4 Audio in transmission systems 79
2.2.4.1 File Formats and Metadata 79 2.2.4.2 Digital Broadcasting, Network and File Transfer 80 2.3 Image 81 2.3.1 Digital Image Processing System 82 2.3.2 Digital Image Processing Operations and Methods 83 2.3.2.1 Image Representation and Modelling 84 2.3.2.2 Image Improvement 84 2.3.2.3 Image Restoration 86 2.3.2.4 Image Compression 86 2.3.2.5 Image Analysis 92 2.4 Television 92
3 Random Processes 99 3.1 Probability Theory 99 3.1.1 Terms and axioms of the probability theory 100 3.1.2 Conditional Probability, Total Probability and Bayes’ Theorem 102 3.2 Random signals 104 3.2.1 Random Variables and Random Vectors 105 3.2.1.1 Distribution Function and Probability Density Function 106 3.2.1.2 Random Vectors 107 3.2.1.3 Conditional Probabilities of Random Vectors 108 3.2.2 Examples of Often Used Distributions 109 3.2.2.1 Uniform Distribution 109 3.2.2.2 Normal (Gaussian) Distribution Function 110 3.2.2.3 Exponential Distribution Function 111 3.2.3 Variance and Higher Order Moments 113 3.2.4 Moment Generating Function 116 3.2.5 Characteristic Function 118 3.2.6 Distribution of Function of Random Variable 119 3.2.7 Central Limit Theorem 121 3.3 Stochastic Processes 123 3.3.1 Ensemble, Stationarity and Ergodicity 123 3.3.2 Power Spectral Density and Wiener-Khinchin Theorem 126 3.3.2.1 White and Colored Noise 128
4 Information Theory and Coding 129 4.1 Information Theory 129 4.1.1 Coding Components of Communication System 129 4.1.2 Definition of Information 131 4.1.3 Entropy 132 4.2 Source Coding 136
4.2.1 Code Definition 138 4.2.2 Compression Algorithms 140 4.2.2.1 Huffman Coding 140 4.2.2.2 Arithmetic Coding 142 4.3 Channel Coding 144 4.3.1 Block Coding 145 4.3.1.1 Hamming Codes 147 4.3.1.2 Cyclic Codes 148 4.3.1.3 Cyclic Redundancy Check Codes 150 4.3.1.4 Reed Solomon Codes 152 4.3.1.5 Low Density Parity Check Codes 155 4.3.2 Convolutional Coding 159 4.3.2.1 Viterbi Algorithm 162 4.3.2.2 Turbo Codes 163 4.4 Concatenated Codes 171 4.5 Joint Source and Channel Coding 172
5 Digital Transmission 175 5.1 Model of a Digital Transmission System 175 5.2 Channel Model 175 5.3 Channel Capacity 182 5.4 Base-band Transmission 186 5.4.1 Line Coding 186 5.4.1.1 Non-Return-To-Zero (NRZ) and Non-Return-To-Zero Inverted (NRZI) 188 5.4.1.2 Return-To-Zero (RZ) and Return-To-Zero Inverted (RZI) 189 5.4.1.3 Alternate Mark Inversion and Inverted Alternate Mark Inversion 191 5.4.1.4 Manchester 192 5.4.1.5 Differential Manchester 192 5.4.1.6 High Density Bipolar n (HDBn) 193 5.4.1.7 Binary 3 Ternary / Modified Monitored Sum 43 194 5.4.1.8 Scrambling 195 5.4.2 Intersymbol Interference 196 5.4.3 Partial Response Signalling 203 5.4.4 Optimization of Transmission System 204 5.4.4.1 Optimum and Matched Filter 204 5.4.4.2 Correlation Receiver 205 5.4.4.3 Integrate & Dump Receiver 205 5.4.5 Equalization 206 5.5 Digital Modulation 207 5.5.1 Amplitude Shift Keying 208
5.5.2 Frequency Shift Keying 210 5.5.3 Phase Shift Keying 216 5.5.4 ASK/PSK 225 5.5.5 M-Quadrature Amplitude Modulation (M-QAM) 226
6 Multiplexing 229 6.1 Introduction 229 6.2 Space Division Multiplexing 230 6.3 Time Division Multiplexing 231 6.3.1 Synchronous Time Division Multiplexing 231 6.3.2 Plesiochronous Digital Hierarchy 233 6.3.3 Synchronous Digital Hierarchy 235 6.3.4 Asynchronous (statistical) Time Division Multiplex 238 6.4 Frequency Division Multiplexing 240 6.5 Orthogonal Frequency Division Multiplexing 242 6.5.1 Principles 242 6.5.2 OFDM Scheme 245 6.5.2.1 OFDM Transmitter 245 6.5.2.2 OFDM Receiver 247 6.5.2.3 OFDM Signal 248 6.5.3 Coded Orthogonal Frequency Division Multiplexing 249 6.6 Wavelength Division Multiplexing 252 6.7 Polarization Division Multiplexing 254 6.8 Orbital Angular Momentum Multiplexing 256 6.9 Code Division Multiplexing (CDM) 257 6.9.1 Code Sequences 257 6.9.1.1 Pseudonoise Sequences 258 6.9.1.2 Walsh Sequences 260 6.9.2 Spread Spectrum 261 6.9.2.1 Frequency Hopping 261 6.9.2.2 Direct Sequence 264 6.9.2.3 Time Hopping 268 6.9.2.4 Chirp Spread Spectrum 268 7 Wired Transmission Media 271 7.1 Introduction 271 7.2 Cables with Copper Conductors 271 7.2.1 Communication Lines 272 7.2.2 Applications of Copper Cables 284 7.2.2.1 Cables with Twisted Pairs 284 7.2.2.2 DSL Technology 289
7.2.2.3 Coaxial Cables 293 7.3 Optical Cables 297 7.3.1 Propagation of Light through Optical Waveguides 297 7.3.2 Attenuation in Glass Fibers 305 7.3.3 Dispersion in optical fibers 310 7.3.4 Types of optical fibers and cables 313 7.3.5 Construction Structures of Optical Cables and Their Applications 318 7.3.6 Historical Milestones and Development Trends in Fiber Optics 322
8 Wireless Channel 329 8.1 Channel Characteristics 329 8.2 Multipath Propagation 331 8.3 Propagation Model 334 8.3.1 One-Way Propagation 334 8.3.2 Two-Way Propagation 335 8.3.3 N-Way Propagation 336 8.4 Fading Model 336 8.4.1 Rayleigh Fading 336 8.4.2 Rician Fading 337 8.5 Doppler Shift 339 8.6 Diversity 342 8.6.1 Diversity and Combining Techniques 342 8.6.1.1 Selection Combining 344 8.6.1.2 Switched Combining 345 8.6.1.3 Maximal Ratio Combining 346 8.6.1.4 Equal Gain Combining 348 8.6.2.1 Precoding 351 8.6.2.2 Spatial multiplexing 351 8.6.2.3 Diversity coding 352 8.7 Propagation and Path Loss in Free Space 352 8.7.1 Concept of Free Space 352 8.7.2 Path Loss 352 8.7.3 Path Loss in Free Space 353 8.7.4 Effective Isotropic Radiated Power 353 8.8 Path Loss Prediction Models 354 8.9 Software Defined Radio 354 8.10 Cognitive Radio 356
9 Cryptography 361 9.1 Basic terminologies 361
9.1.1 Crypto ABC 361 9.1.2 Cryptographic Design Principles 362 9.1.3 Encryption/Decryption 362 9.1.4 Key Based Encryption 363 9.1.5 Symmetric Cryptography 364 9.1.6 Asymmetric Cryptography 365 9.1.6.1 Asymmetric Encryption 366 9.1.6.2 Digital Signatures 366 9.1.6.3 Man-in-the-Middle Attack 368 9.1.6.4 Certificate 368 9.2 One Way Collision Resistant Hash Function 370 9.2.1 Characteristics 370 9.2.2 Security of a One Way Collision Resistant Hash Function 371 9.2.2.1 Random Oracle and Avalanche Effect 373 9.2.3 Hash Functions in Practice 373 9.3 Block Cipher 374 9.3.1 Product Cipher 374 9.3.2 Padding 375 9.3.3 Block Ciphers in Practice 376 9.3.3.1 Advanced Encryption Standard 376 9.3.3.2 Lightweight Cipher PRESENT 377 9.4 Modes of Operations for Block Ciphers 379 9.4.1 Electronic Codebook (ECB) 379 9.4.2 Cipher Block Chaining (CBC) 381 9.4.3 Cipher Feedback (CFB) 382 9.4.4 Output Feedback (OFB) 383 9.4.5 Counter Mode (CTR) 384 9.4.6 Other Modes of Operation 385 9.5 Bit Stream Ciphers 386 9.6 Message Authentication Codes 388 9.6.1 Generation 388 9.6.2 MAC generation using symmetric block cipher 389 9.6.3 MAC Generation Using Dedicated Hash Function 390 9.6.4 Security Aspects 390 9.6.4.1 Length Extension Attack 390 9.7 Digital Signatures 392 9.7.1 Digital Signatures with Appendix 392 9.7.2 Digital Signatures with Message Recovery 394 9.7.3 RSA 395 9.7.3.1 Introduction 395 9.7.3.2 Generation of RSA key system 396
9.7.4 El-Gamal 398 9.7.4.1 Introduction 398 9.7.4.2 Authentication of Message 398 9.7.4.3 Verification of Message 399 9.7.5 Digital Signature Algorithm (DSA) 400 9.7.5.1 Introduction 400 9.7.5.2 Authentication of Message 400 9.7.5.3 Verification of Message 401 9.7.6 Elliptic curve digital signature algorithm (ECDSA) 402 9.7.6.1 Elliptic Curves 402 9.7.6.2 ECDSA 405 9.8 Random Numbers 407 9.8.1 Randomness 407 9.8.2 Random Number Generation 408 9.8.2.1 True Random Number Generation 408 9.8.2.2 Pseudo Random Number Generation 408 9.8.2.3 Cryptographically Secure Pseudo Random Number Generation 409
References 411 List of Acronyms 431 Index 447
1 Signals and Systems 1.1 Communication System Signals are electrical equivalents of data to be transmitted through a communication system. A complete communication system consists of two stations, each equipped with a transmitter and a receiver, or combined into a single device called transceiver (Fig. 1.1). The medium of signal transmission can be wired (see Chapter 7) or wireless (see Chapter 8).
Fig. 1.1: Elements of communication system (two data stations connected by a transmission line, together forming the data communication system).
A data station consists of Data Terminal Equipment (DTE) and Data Circuit-terminating Equipment (DCE). The DTE converts user data into signals or reconverts received signals into user data. The DCE is the intermediate equipment between the DTE and a data transmission circuit (Fig. 1.2). The boundary between DTE and DCE is called the interface and is defined according to the properties of the transmission lines and of the signals exchanged between DTE and DCE. Intermediate devices (e.g. error control devices, synchronization devices etc.) can be added at the interface. Generally, an interface also marks the boundary of a network provider's services, ownership and responsibility. Interfaces are internationally standardized, e.g. by the ITU-T recommendation series:
– V: Data communication over the telephone network (e.g. V.24/V.28, V.10, V.11)
– X: Data networks, open system communications and security (e.g. X.20, X.21, X.25, X.26, X.27)
– I: Integrated Services Digital Network (ISDN)
– G: Transmission systems and media, digital systems and networks
– H: Audiovisual and multimedia systems
– T: Terminals for telematic services
– Z: Languages and general software aspects for telecommunication systems
– etc.
Fig. 1.2: Elements of data station.
A DTE is a functional unit that serves as a data source or a data sink and provides the control functions for data communication according to the link protocol. A DTE can be the user himself or a device interacting with the user, e.g. through a human-machine interface. A DTE consists of:
– A data source or data sink in the form of data producers (input devices), data processing devices and data consumers (output devices).
– A controller with a data preparation device, parallel-serial and serial-parallel converters, an error control device, an address recognizer, a synchronizer, a sending or receiving part, and management and data transmission control.
Examples of DTEs are terminals, memories, keyboards, printers, data concentrators and computers.
A DCE (also called Data Communication Equipment or Data Carrier Equipment) performs functions such as line clocking, conversion of signals from the DTE into a form suitable for transmission (line coding and modulation, see Chapter 5) and, in the opposite direction, conversion of transmitted signals into a form understandable to the DTE (line decoding and demodulation, see Chapter 5). A DCE can be realized as:
– Modem (modulator/demodulator) for broadband transmission
– Data connection device for leased lines
– Data remote control device for data lines
– Network Terminator (NT) for ISDN and xDSL (see Chapter 7)
A DCE can consist of a signal converter, a connection device, a synchronization device and automatic dialling equipment (if needed for connection setup). The technical realization of a DCE depends on data rates, synchronous/asynchronous transmission, line grouping and speed switching, multiplexing, synchronous/asynchronous conversion and baseband or broadband transmission (see Chapter 5).
Fig. 1.3: Interfaces.
Interfaces between two DTEs and between two DCEs have to differ from interfaces between a DCE and a DTE; separate interface variants are therefore needed for connecting two DTEs or two DCEs. The signalling protocols used on the different interfaces can be the same, but several parameters have to be set differently.
1.2 Data and Signals
Data to be transmitted from the transmitter to the receiver over the communication system can be of various kinds: speech, text, numbers, images, videos etc. Data first have to be converted into an electrical signal, i.e. into a form convenient for transmission. Data are generally divided into two big groups:
1. Continuous data: continuous functions of time whose values belong to an infinite continuous set; they are transmitted using analog communication systems. Examples: classic analog telephony, radio and TV broadcasting etc.
2. Discrete data: data that are not functions of time and whose values, so-called symbols, belong to a finite set called the alphabet; they are transmitted using digital communication systems. Examples: computer data etc.
Signals, as electrical equivalents of data, can be represented in a two-dimensional coordinate system. There are four groups of signals depending on their representation:
– Signals with continuous values and time (analog signal) – see Figure 1.4
– Signals with continuous values and discrete time – see Figure 1.5
– Signals with discrete values and continuous time – see Figure 1.6
– Signals with discrete values and time (digital signal) – see Figure 1.7
Digital signals will be a subject of this book.
Fig. 1.4: Signal with continuous values and time (analog signal).
Fig. 1.5: Signal with continuous values and discrete time.
Fig. 1.6: Signal with discrete values and continuous time.
Fig. 1.7: Signal with discrete values and time (digital signal).
Signals can be generally divided into:
1. Deterministic signals: they can be represented in the form of a defined time function and can be further divided into:
a) Periodic signals: they repeat a pattern over identical subsequent periods, so-called cycles (Fig. 1.8)
b) Aperiodic (non-periodic) signals: they never repeat (Fig. 1.9)
2. Random signals: they cannot be represented in the form of a defined time function (Fig. 1.10); hence, they are represented using statistical parameters (average values, quadratic values, moments of order n, probability density function etc.)
Fig. 1.8: Periodic signal.
Fig. 1.9: Aperiodic signal.
Fig. 1.10: Random signal.
User data, and therefore also the signals to be transmitted over a communication system, are random in nature. Nevertheless, this randomness is problematic for communication, because:
– The receiver can recognize only deterministic signals, and
– The properties of a communication system can be analysed only with deterministic signals, not with random ones.
Therefore, the aim of communication is to eliminate the randomness, and the uncertainty connected with it, of the signal at the receiver.
1.2.1 Representation of Signals
The most convenient signals in communications are periodic signals: they are deterministic and fully defined by one period. Furthermore, sine and cosine functions (so-called harmonic functions) are the most convenient among periodic functions, as they are completely described by three parameters:
– Amplitude U (maximal signal value)
– Frequency f, or equivalently period T, where:
$$f = 1/T \quad [\mathrm{Hz} = \mathrm{s}^{-1}] \tag{1.1}$$
– Phase ϕ (shift of the function's zero crossing relative to t = 0)
The resulting harmonic signal is:
$$u(t) = U \sin(2\pi f t \pm \varphi) \tag{1.2}$$
The sine signal (1.2) is shown in Figure 1.11 as a function of time t, i.e. as the time function of a sine signal.
Fig. 1.11: Sine signal (s(t) = U sin(2πft + ϕ), with amplitude U and period T = 1/f).
An example of a digital periodic function is shown in Figure 1.12.
Fig. 1.12: Pulse signal.
Besides the time representation of the signal (time function), where the signal is represented as a function of time t, there is a frequency representation of the signal (frequency function), where the signal is represented as a function of frequency f. The frequency function of a signal is also called the signal spectrum. Two types of spectrum are needed for a complete signal description:
– Amplitude spectrum: dependence of the signal amplitude on frequency
– Phase spectrum: dependence of the signal phase on frequency
Harmonic functions, for example a sine function, can be represented mathematically by a pointer rotating counter-clockwise around a circle (Fig. 1.13). The pointer rotates with a constant angular velocity ω and needs one period T to complete one revolution of 2π:
$$\omega = \frac{2\pi}{T} = 2\pi f \tag{1.3}$$
Fig. 1.13: Relation between a circle function and a sine function.
This results in the time function u(t):
$$u(t) = U \sin(\omega t \pm \varphi) \tag{1.4}$$
If the phase ϕ is positive, the sine function leads, and if ϕ is negative, it lags, compared to the case ϕ = 0. A cosine function is a sine function with ϕ = π/2 (Fig. 1.14):
$$u(t) = U \sin(\omega t \pm \pi/2) = \pm U \cos(\omega t) \tag{1.5}$$
Fig. 1.14: Connection between sine and cosine.
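The relations (1.1) to (1.5) can be checked numerically. The following Python sketch is only an illustration: the function name `harmonic` and the example values (U = 2, f = 50 Hz) are assumptions chosen for the demonstration and are not taken from the text.

```python
import math

def harmonic(t, U, f, phi):
    """u(t) = U*sin(omega*t + phi) with omega = 2*pi*f, cf. (1.3) and (1.4)."""
    omega = 2 * math.pi * f
    return U * math.sin(omega * t + phi)

U, f = 2.0, 50.0      # assumed example amplitude and frequency
T = 1 / f             # period according to (1.1)

for k in range(8):
    t = k * T / 8
    lead = harmonic(t, U, f, +math.pi / 2)   # phase +pi/2, cf. (1.5)
    lag = harmonic(t, U, f, -math.pi / 2)    # phase -pi/2, cf. (1.5)
    cos_value = U * math.cos(2 * math.pi * f * t)
    print(f"t = {t:.4f} s: lead = {lead:+.4f}, lag = {lag:+.4f}, U*cos = {cos_value:+.4f}")
```

In every printed row the value for ϕ = +π/2 should coincide with U cos(ωt) and the value for ϕ = –π/2 with –U cos(ωt).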
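The representation of harmonic functions by a pointer in a circle is important for the transition to the complex signal representation, where the complex signal is written with an underline:
$$\underline{u}(t) = U \cdot e^{j(\omega t + \varphi)} = U \cdot e^{j\omega t} \cdot e^{j\varphi} \tag{1.6}$$
The instantaneous value of the voltage is the real part of the complex signal:
$$u(t) = \mathrm{Re}\{\underline{u}(t)\} = \mathrm{Re}\{U \cdot e^{j(\omega t + \varphi)}\} = U \cdot \cos(\omega t + \varphi) \tag{1.7}$$
Using Euler's formula [Sti02], Equation (1.7) can be written as:
$$\begin{aligned} u(t) = U \cos(\omega t + \varphi) &= \frac{1}{2} U e^{j(\omega t + \varphi)} + \frac{1}{2} U e^{-j(\omega t + \varphi)} \\ &= \frac{1}{2} U e^{j\omega t} e^{j\varphi} + \frac{1}{2} U e^{-j\omega t} e^{-j\varphi} = c_1 e^{j\omega t} + c_1^{*} e^{-j\omega t} \end{aligned} \tag{1.8}$$
with the complex pointer:
$$c_1 = \frac{1}{2} U e^{j\varphi} = \frac{1}{2} U \cos\varphi + j \frac{1}{2} U \sin\varphi \tag{1.9}$$
and the conjugate complex pointer:
$$c_1^{*} = \frac{1}{2} U e^{-j\varphi} = \frac{1}{2} U \cos\varphi - j \frac{1}{2} U \sin\varphi \tag{1.10}$$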
The voltage signal u(t) is the sum of two conjugate complex pointers c1 and c1∗ in the complex plane, where c1 rotates counter-clockwise (ω) and c1∗ clockwise (–ω). Hence, their sum is real at every instant and equals u(t) (Fig. 1.15). Formally, –ω represents negative frequencies, which cannot be measured physically as they do not exist in nature. They are a consequence of the complex representation of harmonic signals. Every negative frequency –ω is accompanied by a positive frequency +ω, as (1.8) shows. Therefore, the spectral representation can be one-sided, if only positive frequencies are shown, with an amplitude equal to the sum of |c1| and |c1∗|, or two-sided, if both positive and negative spectral components are shown (Fig. 1.16).
Fig. 1.15: Pointers on a complex plane.
Fig. 1.16: Positive and negative frequencies.
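As a numerical illustration of the two counter-rotating pointers c1·e^{jωt} and c1∗·e^{–jωt} shown in Fig. 1.15, the following Python sketch checks that their sum is real and equals U cos(ωt + ϕ). The values of U, ϕ and ω and the helper names `pointer_sum` and `u` are assumptions made for this example only.

```python
import cmath
import math

U, phi = 3.0, 0.7                 # assumed example amplitude and phase
omega = 2 * math.pi * 10.0        # assumed angular frequency (10 Hz)

c1 = 0.5 * U * cmath.exp(1j * phi)    # complex pointer, cf. (1.9)
c1_conj = c1.conjugate()              # conjugate complex pointer, cf. (1.10)

def pointer_sum(t):
    """Sum of the counter-clockwise and the clockwise rotating pointer."""
    return c1 * cmath.exp(1j * omega * t) + c1_conj * cmath.exp(-1j * omega * t)

def u(t):
    """Real signal U*cos(omega*t + phi), cf. (1.7)."""
    return U * math.cos(omega * t + phi)

for t in (0.0, 0.013, 0.05):
    z = pointer_sum(t)
    print(f"t = {t}: sum = {z.real:+.6f} {z.imag:+.1e}j, u(t) = {u(t):+.6f}")

# One-sided amplitude: sum of the magnitudes of both pointers, which equals U
one_sided_amplitude = abs(c1) + abs(c1_conj)
print("one-sided amplitude:", one_sided_amplitude)
```

The imaginary part of the sum should vanish up to rounding errors, which is the numerical counterpart of the statement that the negative frequency component always appears together with its positive partner.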
Figure 1.16 shows the two-sided amplitude spectrum of a sine or cosine function. The spectrum is very simple, as it has components only at the frequencies ±ω. The aim of modelling transmission systems is to represent the signals and the system using very simple functions, i.e. harmonic functions: the input signal s(t), the transmission system and the output signal g(t) (Fig. 1.17) are described by a superposition of sine and cosine functions (Fig. 1.18). For this purpose, Fourier analysis is used.
1.2.2 Harmonic Analysis of Periodic Signals
A periodic signal s(t) with period T can be approximated by a function f(t) consisting of a direct current (DC) component and sine and cosine components:
$$f(t) = \frac{a_0}{2} + \sum_{n=1}^{\infty} a_n \cos\left(n \cdot \frac{2\pi t}{T}\right) + \sum_{m=1}^{\infty} b_m \sin\left(m \cdot \frac{2\pi t}{T}\right) \tag{1.11}$$
Equation (1.11) is known as the Fourier series [Fou08]. The coefficients a0, an and bm are the Fourier coefficients. They have to be calculated so that the approximation of s(t) by f(t) gives a minimum error according to the minimum squared error criterion:
$$F = \int_{t_0}^{t_0+T} \left(s(t) - f(t)\right)^2 dt = \min \tag{1.12}$$
i.e. the following has to be fulfilled:
$$\frac{\partial F}{\partial a_n} = \frac{\partial F}{\partial b_m} = 0, \quad \text{for all } n, m \tag{1.13}$$
and:
$$\frac{\partial^2 F}{\partial a_n^2} > 0, \ \text{for all } n; \qquad \frac{\partial^2 F}{\partial b_m^2} > 0, \ \text{for all } m \tag{1.14}$$
The error function F is calculated as:
$$F = \int_{t_0}^{t_0+T} \left( s(t) - \frac{a_0}{2} - \sum_{n=1}^{\infty} a_n \cos\left(n \cdot \frac{2\pi t}{T}\right) - \sum_{m=1}^{\infty} b_m \sin\left(m \cdot \frac{2\pi t}{T}\right) \right)^2 dt \tag{1.15}$$
The coefficient a0 can be calculated as follows:
$$\begin{aligned} \frac{\partial F}{\partial a_0} &= \frac{\partial}{\partial a_0} \int_{t_0}^{t_0+T} \left( s(t) - \frac{a_0}{2} \right)^2 dt \\ &= \frac{\partial}{\partial a_0} \left[ \int_{t_0}^{t_0+T} s(t)^2 \, dt - \int_{t_0}^{t_0+T} s(t) \, a_0 \, dt + \int_{t_0}^{t_0+T} \frac{a_0^2}{4} \, dt \right] \\ &= 0 - \int_{t_0}^{t_0+T} s(t) \, dt + \frac{2 a_0 T}{4} = 0 \end{aligned} \tag{1.16}$$
with:
$$\frac{\partial^2 F}{\partial a_0^2} = \frac{T}{2} > 0 \tag{1.17}$$
Fig. 1.17: Modeling of a communication system (input signal s(t), transmission system, output signal g(t)).
Fig. 1.18: Rectangular periodic signal presented by three overlapped harmonic functions (bit pattern of amplitude A; 1st harmonic (2A/π) cos(2πft), 2nd harmonic (2A/3π) cos(2π·3ft), 3rd harmonic (2A/5π) cos(2π·5ft); sum of the 1st, 2nd and 3rd harmonics).
From (1.16) the coefficient a0 is given as:
$$\frac{a_0}{2} = \frac{1}{T} \int_{t_0}^{t_0+T} s(t) \, dt \tag{1.18}$$
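Similarly, the coefficients an can be calculated as:
$$\begin{aligned} \frac{\partial F}{\partial a_n} &= \frac{\partial}{\partial a_n} \int_{t_0}^{t_0+T} \left( s(t) - a_n \cos\frac{2\pi n t}{T} \right)^2 dt \\ &= \frac{\partial}{\partial a_n} \left[ \int_{t_0}^{t_0+T} s(t)^2 \, dt - 2 \int_{t_0}^{t_0+T} s(t) \, a_n \cos\frac{2\pi n t}{T} \, dt + \int_{t_0}^{t_0+T} a_n^2 \cos^2\frac{2\pi n t}{T} \, dt \right] \\ &= 0 - 2 \int_{t_0}^{t_0+T} s(t) \cos\frac{2\pi n t}{T} \, dt + \frac{2 a_n T}{2} = 0 \end{aligned} \tag{1.19}$$
with:
$$\frac{\partial^2 F}{\partial a_n^2} = T > 0 \tag{1.20}$$
From (1.19) the coefficients an are given as:
$$a_n = \frac{2}{T} \int_{t_0}^{t_0+T} s(t) \cos\frac{2\pi n t}{T} \, dt \tag{1.21}$$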
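The derivation of the equation for the coefficients bm is the same as for an:
$$\begin{aligned} \frac{\partial F}{\partial b_m} &= \frac{\partial}{\partial b_m} \int_{t_0}^{t_0+T} \left( s(t) - b_m \sin\frac{2\pi m t}{T} \right)^2 dt \\ &= \frac{\partial}{\partial b_m} \left[ \int_{t_0}^{t_0+T} s(t)^2 \, dt - 2 \int_{t_0}^{t_0+T} s(t) \, b_m \sin\frac{2\pi m t}{T} \, dt + \int_{t_0}^{t_0+T} b_m^2 \sin^2\frac{2\pi m t}{T} \, dt \right] \\ &= 0 - 2 \int_{t_0}^{t_0+T} s(t) \sin\frac{2\pi m t}{T} \, dt + \frac{2 b_m T}{2} = 0 \end{aligned} \tag{1.22}$$
with:
$$\frac{\partial^2 F}{\partial b_m^2} = T > 0 \tag{1.23}$$
Coefficients bm are then given as:
$$b_m = \frac{2}{T} \int_{t_0}^{t_0+T} s(t) \sin\frac{2\pi m t}{T} \, dt \tag{1.24}$$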
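The integrals (1.18), (1.21) and (1.24) can be approximated numerically. The following Python sketch uses the midpoint rule over one period; the function name `fourier_coefficients`, the number of samples, the period and the test signal are all assumptions chosen for illustration, not part of the text.

```python
import math

def fourier_coefficients(s, T, n_max, t0=0.0, samples=2000):
    """Approximate a_n (1.21) and b_n (1.24) for n = 0..n_max with the midpoint rule.

    a[0] is a_0, so the DC component of (1.18) is a[0] / 2.
    """
    dt = T / samples
    ts = [t0 + (k + 0.5) * dt for k in range(samples)]
    a, b = [], []
    for n in range(n_max + 1):
        a.append(2 / T * dt * sum(s(t) * math.cos(2 * math.pi * n * t / T) for t in ts))
        b.append(2 / T * dt * sum(s(t) * math.sin(2 * math.pi * n * t / T) for t in ts))
    return a, b

T = 0.02   # assumed period

def s(t):
    # assumed test signal: DC component 1, cosine term at n = 1, sine term at n = 2
    return 1.0 + 2.0 * math.cos(2 * math.pi * t / T) + 3.0 * math.sin(2 * 2 * math.pi * t / T)

a, b = fourier_coefficients(s, T, n_max=3)
print("a =", [round(x, 6) for x in a])
print("b =", [round(x, 6) for x in b])
```

For this test signal the sketch should recover a0/2 = 1, a1 = 2 and b2 = 3, with all other coefficients close to zero, because the harmonic functions are orthogonal over one period.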
The calculation of the coefficients shows that every periodic function can be approximated by the superposition of an infinite number of sine and cosine functions. This also holds for an aperiodic function within a finite time interval [t0, t0 + T]. A function fulfils Dirichlet’s conditions [ChCh05] if it has a finite number of jump discontinuities and if the function value at each jump discontinuity equals the mean of the left and right limits. When a function with jump discontinuities is approximated by a Fourier series, the series reacts with overshoots (Fig. 1.19), the so-called Gibbs phenomenon [Wil48] [Gib27] [Gro12]. The decay time of the ringing after the overshoot depends on the number of superposed sine and cosine functions. The amplitude of the overshoot decreases as the number of superposed sine and cosine functions increases and approaches a non-zero limit, which is the value obtained for an infinite number of superposed harmonic functions.
Fig. 1.19: Gibbs phenomenon.
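The behaviour of the overshoot can be observed numerically. The sketch below is an assumed example, not the signal of Fig. 1.19: it uses the well-known Fourier series of a square wave that alternates between +1 and –1, (4/π)·Σ sin(2πkt/T)/k over odd k, and searches for the largest value of the partial sum on a fine grid. The function names and the grid size are hypothetical choices.

```python
import math

def square_wave_partial_sum(t, T, harmonics):
    """Fourier partial sum of a +/-1 square wave using the first `harmonics` odd harmonics."""
    total = 0.0
    for i in range(harmonics):
        k = 2 * i + 1
        total += math.sin(2 * math.pi * k * t / T) / k
    return 4 / math.pi * total

def peak_value(T, harmonics, grid=4000):
    """Largest value of the partial sum on a fine grid over the first quarter period."""
    return max(square_wave_partial_sum(T / 4 * (i + 1) / grid, T, harmonics)
               for i in range(grid))

T = 1.0   # assumed period
peaks = {h: peak_value(T, h) for h in (1, 5, 25, 100)}
for h, p in peaks.items():
    print(f"{h:4d} harmonics: peak = {p:.4f}")
```

The peak should decrease as harmonics are added but stay clearly above the signal level of 1, approaching roughly 1.18, which corresponds to an overshoot of about 9 % of the jump height of 2.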
The Fourier series can also be represented using only sine or only cosine functions, instead of both sine and cosine [Veetal14].
The representation of the Fourier series in sine form is given as:
$$f(t) = \frac{a_0}{2} + \sum_{n=1}^{\infty} A_n \cdot \sin\left(n \cdot \frac{2\pi t}{T} + \varphi_n\right) \tag{1.25}$$
with coefficients An:
$$A_n = \sqrt{a_n^2 + b_n^2} \tag{1.26}$$
and the phase ϕn:
$$\varphi_n = \arctan\frac{a_n}{b_n} \tag{1.27}$$
Similarly, the representation of the Fourier series in cosine form is given as:
$$f(t) = \frac{a_0}{2} + \sum_{n=1}^{\infty} A_n \cdot \cos\left(n \cdot \frac{2\pi t}{T} - \gamma_n\right) \tag{1.28}$$
with the same coefficients An as in the sine form and the phase γn:
$$\gamma_n = \arctan\frac{b_n}{a_n} \tag{1.29}$$
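The conversion from the pair (an, bn) to the sine form (1.25) or the cosine form (1.28) can be sketched as follows. The helper names `to_sine_form` and `to_cosine_form` and the example coefficients are assumptions. As a design choice, the sketch uses the two-argument arctangent `math.atan2`, which agrees with arctan(an/bn) in (1.27) for bn > 0 and with arctan(bn/an) in (1.29) for an > 0, and additionally selects the correct quadrant when the denominator is negative or zero.

```python
import math

def to_sine_form(a_n, b_n):
    """Return (A_n, phi_n) so that a_n*cos(x) + b_n*sin(x) = A_n*sin(x + phi_n)."""
    return math.hypot(a_n, b_n), math.atan2(a_n, b_n)

def to_cosine_form(a_n, b_n):
    """Return (A_n, gamma_n) so that a_n*cos(x) + b_n*sin(x) = A_n*cos(x - gamma_n)."""
    return math.hypot(a_n, b_n), math.atan2(b_n, a_n)

a_n, b_n = -1.5, 2.0     # assumed example coefficients
A_s, phi_n = to_sine_form(a_n, b_n)
A_c, gamma_n = to_cosine_form(a_n, b_n)

for x in (0.0, 0.4, 2.5):
    direct = a_n * math.cos(x) + b_n * math.sin(x)
    sine_form = A_s * math.sin(x + phi_n)
    cosine_form = A_c * math.cos(x - gamma_n)
    print(f"x = {x}: {direct:+.6f} {sine_form:+.6f} {cosine_form:+.6f}")
```

All three columns should agree, since both forms describe the same harmonic with amplitude An.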
Another possibility for representing a Fourier series is the complex Fourier series, which uses positive and negative frequencies. The approximation function f(t) is given as:
$$f(t) = \sum_{n=-\infty}^{\infty} c_n \cdot e^{j 2\pi n t / T} \tag{1.30}$$
In order to calculate the complex Fourier coefficients cn, the equation for the Fourier series (1.11) has to be written in the form:
$$\begin{aligned} f(t) &= \frac{a_0}{2} + \sum_{n=1}^{\infty} \frac{a_n}{2} \cdot \left(e^{j 2\pi n t / T} + e^{-j 2\pi n t / T}\right) + \sum_{n=1}^{\infty} \frac{b_n}{2j} \cdot \left(e^{j 2\pi n t / T} - e^{-j 2\pi n t / T}\right) \\ &= \frac{a_0}{2} + \frac{1}{2} \sum_{n=1}^{\infty} (a_n - j b_n) \cdot e^{j 2\pi n t / T} + \frac{1}{2} \sum_{n=1}^{\infty} (a_n + j b_n) \cdot e^{-j 2\pi n t / T} \end{aligned} \tag{1.31}$$
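As a–n = an (see (1.21)) and b–n = –bn (see (1.24)), the last equation can be further developed as:
$$\begin{aligned} f(t) &= \frac{a_0}{2} + \frac{1}{2} \sum_{n=1}^{\infty} (a_n - j b_n) \cdot e^{j 2\pi n t / T} + \frac{1}{2} \sum_{n=-\infty}^{-1} (a_n - j b_n) \cdot e^{j 2\pi n t / T} \\ &= \frac{1}{2} \sum_{n=-\infty}^{\infty} (a_n - j b_n) \cdot e^{j 2\pi n t / T} \end{aligned} \tag{1.32}$$
From (1.30) and (1.32) it follows that:
$$c_n = \frac{1}{2} (a_n - j b_n) \tag{1.33}$$
Equation (1.33) can be further developed using equations (1.21) and (1.24) as:
$$\begin{aligned} c_n &= \frac{1}{T} \int_{t_0}^{t_0+T} s(t) \cos\frac{2\pi n t}{T} \, dt - j \cdot \frac{1}{T} \int_{t_0}^{t_0+T} s(t) \sin\frac{2\pi n t}{T} \, dt \\ &= \frac{1}{T} \int_{t_0}^{t_0+T} s(t) \, e^{-j 2\pi n t / T} \, dt \end{aligned} \tag{1.34}$$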
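For n = 0, cn becomes:
$$c_0 = \frac{1}{T} \int_{t_0}^{t_0+T} s(t) \, dt = \frac{a_0}{2} \tag{1.35}$$
Generally, it holds:
$$c_n = \begin{cases} \dfrac{1}{2} (a_n - j b_n), & n > 0 \\[2mm] \dfrac{1}{2} (a_{|n|} + j b_{|n|}), & n < 0 \end{cases} \tag{1.36}$$
Consequently, for a real signal s(t) it can be written that:
$$c_n = c_{-n}^{*} \tag{1.37}$$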
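The complex coefficients can also be obtained directly from the integral (1.34). The following Python sketch approximates that integral with the midpoint rule and prints the coefficients for n = –2 … 2; the function name `complex_coefficient`, the period and the real test signal are assumptions made for this illustration.

```python
import cmath
import math

def complex_coefficient(s, T, n, t0=0.0, samples=2000):
    """Approximate c_n from (1.34) with the midpoint rule over one period."""
    dt = T / samples
    total = 0j
    for k in range(samples):
        t = t0 + (k + 0.5) * dt
        total += s(t) * cmath.exp(-2j * math.pi * n * t / T)
    return total * dt / T

T = 1.0   # assumed period

def s(t):
    # assumed real test signal with a_0/2 = 1, a_1 = 2 and b_2 = 3
    return 1.0 + 2.0 * math.cos(2 * math.pi * t / T) + 3.0 * math.sin(4 * math.pi * t / T)

c = {n: complex_coefficient(s, T, n) for n in range(-2, 3)}
for n in sorted(c):
    print(n, complex(round(c[n].real, 6), round(c[n].imag, 6)))
```

For this signal the coefficients at +n and –n should come out as complex conjugates of each other, in line with the two cases of (1.36), and c0 should equal the DC component a0/2 as in (1.35).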
Fig. 1.20: Periodic digital signal (rectangular pulses of amplitude A for |t| < T/4, period T).
A typical signal used for testing and analysing transmission systems is a periodic digital signal of period T (Fig. 1.20). Its spectral function can be determined in two ways, as shown in the previous text:
1. Using the Fourier series with sine and cosine functions (equation (1.11)), where the coefficients an and bn are calculated using equations (1.21) and (1.24):
$$a_n = \frac{2}{T} \int_{-T/2}^{T/2} s(t) \cos\frac{2\pi n t}{T} \, dt = \frac{2}{T} \int_{-T/4}^{T/4} A \cos\frac{2\pi n t}{T} \, dt = A \, \frac{\sin(\pi n / 2)}{\pi n / 2} = A \, \mathrm{si}(\pi n / 2) \tag{1.38}$$
where the si-function is often used in communications and has the form:
$$\mathrm{si}(x) = \frac{\sin(x)}{x} \tag{1.39}$$
The si-function, which for positive values of the argument represents the envelope of the coefficients an, is shown in Figure 1.21.
Fig. 1.21: Si-function.
Coefficients bn = 0, as s(t) is an even function. In this way, the function f(t) given by (1.11) for the approximation of the signal s(t) can be written as:
$$f(t) = \frac{a_0}{2} + \sum_{n=1}^{\infty} a_n \cdot \cos\left(n \cdot \frac{2\pi t}{T}\right) = \frac{A}{2} + A \sum_{n=1}^{\infty} \mathrm{si}\left(\frac{n\pi}{2}\right) \cdot \cos\left(n \cdot \frac{2\pi t}{T}\right) \tag{1.40}$$
2. Using the complex Fourier series (1.30), where the complex coefficients cn can easily be calculated from known coefficients an and bn using (1.33):
$$c_n = \frac{1}{2} (a_n - j b_n) = \frac{a_n}{2} = \frac{A}{2} \, \mathrm{si}\left(\frac{n\pi}{2}\right) \tag{1.41}$$
In case the coefficients $a_n$ and $b_n$ are not known, the complex coefficients $c_n$ can be calculated directly using (1.34):

$$c_n = \frac{1}{T}\int_{t_0}^{t_0+T} s(t)\, e^{-j2\pi nt/T}\,dt = \frac{1}{T}\int_{-T/4}^{T/4} A\, e^{-j2\pi nt/T}\,dt = \frac{A}{2}\,\mathrm{si}\left(\frac{n\pi}{2}\right) \tag{1.42}$$
The approximation function f(t) is then given, according to (1.30), as:

$$f(t) = \sum_{n=-\infty}^{\infty} c_n\cdot e^{j2\pi nt/T} = \frac{A}{2}\sum_{n=-\infty}^{\infty}\mathrm{si}\left(\frac{n\pi}{2}\right)\cdot e^{j2\pi nt/T} \tag{1.43}$$
which, after the summation of paired members for positive and negative n, equals the result given by (1.40).

The graphic spectral presentation can also be done in two ways:

1. Using an amplitude and phase spectrum (Fig. 1.22 and Fig. 1.23) given by the equations:

$$A_n = \sqrt{a_n^2 + b_n^2} \tag{1.44}$$

and:

$$\varphi_n = \mathrm{arctg}\,\frac{a_n}{b_n} \tag{1.45}$$
2. Using a complex amplitude and phase spectrum (Fig. 1.24 and Fig. 1.25) given by the following equations, respectively:

$$|c_n| = \frac{1}{2}\sqrt{a_n^2 + b_n^2} = \frac{1}{2}A_n \tag{1.46}$$

$$\varphi_n = -\mathrm{arctg}\,\frac{b_n}{a_n} \tag{1.47}$$

Fig. 1.22: Amplitude spectrum (|an| versus n; |a0| = 2 × DC component).

Fig. 1.23: Phase spectrum (φn versus n).
Fig. 1.24: Complex amplitude spectrum (|cn| versus n for –7 ≤ n ≤ 7; |c0| = DC component).

Fig. 1.25: Complex phase spectrum (φn versus n for –7 ≤ n ≤ 7).
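The closed-form coefficients (1.41) and (1.42) can be cross-checked numerically. The following is a minimal sketch, assuming A = 1 and T = 1 and a simple midpoint rule for the integral in (1.42); the function names `c_numeric` and `c_closed` are illustrative and not taken from the text.

```python
import cmath
import math


def si(x):
    """si(x) = sin(x)/x with si(0) = 1, as in (1.39)."""
    return 1.0 if x == 0 else math.sin(x) / x


def c_numeric(n, A=1.0, T=1.0, steps=20000):
    """Midpoint-rule approximation of the integral in (1.42).

    The pulse of Fig. 1.20 equals A only for -T/4 <= t <= T/4,
    so the integration is restricted to that interval.
    """
    dt = (T / 2) / steps
    total = 0j
    for k in range(steps):
        t = -T / 4 + (k + 0.5) * dt
        total += A * cmath.exp(-2j * math.pi * n * t / T) * dt
    return total / T


def c_closed(n, A=1.0):
    """Closed form c_n = (A/2) * si(n*pi/2) from (1.41)."""
    return A / 2 * si(n * math.pi / 2)


# Numerical and closed-form values side by side for n = -3 ... 3
for n in range(-3, 4):
    print(n, round(c_numeric(n).real, 6), round(c_closed(n), 6))
```

Since s(t) is real and even, the coefficients come out real and symmetric in n, which agrees with the relation between $c_n$ and $c_{-n}$ in (1.36).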
1.2.3 Harmonic Analysis of Aperiodic Signals

Since the duration Ti of an aperiodic signal is finite, the spectrum of this type of signal cannot be expressed via the Fourier series coefficients, as is the case with periodic signals. Instead, every aperiodic signal can (theoretically) be extended, i.e. replicated periodically, by adding versions of it shifted in time by integer multiples of an assumed period T0. In this way, an artificial periodic signal is obtained, for which the Fourier coefficients can be found, as well as the envelope of these coefficients in the frequency domain (as shown in Section 1.2.2). Further, the Fourier coefficients can be considered as samples of this envelope (as illustrated, e.g. in Fig. 1.24) for the discrete (integer) values of n.

In order to express the spectrum of the original aperiodic signal, the “period” T0 of the derived periodic signal can be (theoretically) increased, in which case the envelope of the Fourier series coefficients becomes broader as a function of n but keeps its shape, i.e. if observed in the frequency domain with samples at the points nf0 = n/T0, the envelope stays the same while the samples (Fourier coefficients) become denser, since the distance between two neighbouring samples is 1/T0. If T0 is increased indefinitely, the artificial periodic signal becomes the original aperiodic one, the Fourier coefficients “vanish” and the spectrum of the aperiodic signal is represented by the “spectral envelope”, which is a continual function of frequency f. In fact, instead of Fourier coefficients, there are complex “spectral components” $e^{j2\pi ft}$ for continual values of frequency (–∞ < f […]

[…] where B = fs/2 = 1/(2Ts), the output of such a filter is the reconstructed signal sr(t), whose spectral function is given as:
$$S_r(j\omega) = H(j\omega)\cdot S_s(j\omega) = T_s\cdot S_s(j\omega) = T_s\cdot\sum_{n=-\infty}^{\infty} s(nT_s)\, e^{-jn\omega T_s} \tag{1.103}$$
Performing an inverse Fourier transform, the resulting signal sr(t) equals:

$$s_r(t) = T_s\cdot\sum_{n=-\infty}^{\infty} s(nT_s)\,\frac{1}{2\pi}\int_{-2\pi B}^{2\pi B} e^{j\omega(t-nT_s)}\,d\omega = \sum_{n=-\infty}^{\infty} s(nT_s)\,\frac{\sin[2\pi B(t-nT_s)]}{2\pi B(t-nT_s)} = \sum_{n=-\infty}^{\infty} s(nT_s)\,\mathrm{si}[2\pi B(t-nT_s)] \tag{1.104}$$
The interpolation function (1.104) shows that every signal s(t) of a bandwidth B can be presented as an infinite sum of products of the signal values at the sampling moments, s(nTs), and the function si(2πB⋅(t – nTs)). The form of the si-function determines the form of the terms of the infinite sum: the si-function centred at nTs has its maximum of 1 at t = nTs, i.e. at the sampling moment, while all other si-functions equal 0 there, so the reconstructed signal sr(t) takes exactly the value s(nTs) at each sampling moment nTs.
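The interpolation formula (1.104) can be tried out on a short sample sequence. The sketch below assumes a hypothetical test signal (a 1 Hz cosine sampled at fs = 8 Hz) and truncates the infinite sum to the 64 available samples, so values between the sampling moments are only approximate; the name `reconstruct` is illustrative.

```python
import math


def si(x):
    """si(x) = sin(x)/x with si(0) = 1, as in (1.39)."""
    return 1.0 if x == 0 else math.sin(x) / x


def reconstruct(samples, Ts, t):
    """Evaluate (1.104) at time t from the samples s(n*Ts), n = 0..len-1.

    The infinite sum is truncated to the available samples, which is an
    approximation for any t that is not a sampling moment.
    """
    B = 1.0 / (2.0 * Ts)  # B = fs/2
    return sum(s_n * si(2 * math.pi * B * (t - n * Ts))
               for n, s_n in enumerate(samples))


# Hypothetical test signal: 1 Hz cosine sampled at fs = 8 Hz
Ts = 1 / 8
samples = [math.cos(2 * math.pi * n * Ts) for n in range(64)]

t_mid = 31.5 * Ts  # halfway between two sampling moments
print(reconstruct(samples, Ts, t_mid), math.cos(2 * math.pi * t_mid))
```

At a sampling moment such as t = 10⋅Ts only the term with n = 10 contributes, so the sample value is returned up to rounding error; between samples the remaining difference comes from the truncation of the sum.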
Fig. 1.34: Signal s(t).

Fig. 1.35: Sequence of Delta-pulses δseq(t) (pulses at t = 0, ±Ts, ±2Ts, ±3Ts,…).

Fig. 1.36: Amplitude spectrum of the signal s(t) (|S(jf)| limited to –B ≤ f ≤ B).

Fig. 1.37: Amplitude spectrum of the signal ss(t) (|Ss(jf)| with copies of |S(jf)| centred at 0, ±fs, ±2fs,…).

Fig. 1.38: Signal ss(t) (samples at t = 0, ±Ts, ±2Ts, ±3Ts,…).
1.4 Compressive Sensing

There are many scenarios in digital communications and signal processing where a data bit stream with specific patterns or statistical properties results in a better compression and a very accurate reconstruction. One such property is sparsity, achieved by a sparse approximation of the signal. A signal approximation process which provides a sparse representation of the signal is the basis for some important techniques used in signal compression (see Chapter 4) and in recovery algorithms [Daetal12]. In this regard, compressive sensing, also known as compressive sampling (CS), has recently attracted significant attention in the fields of modern communication techniques, applied mathematics and computer science. Compressive sensing is a theory of signal recovery from a smaller number of observed samples or measurements than is required by the classic signal recovery theory stated by the Shannon sampling theorem [Nyq28][Sha49][Kot33][Whi15]. In the classic Shannon-based signal recovery approach the signal is sampled at a rate of at least twice its highest frequency, then compressed and stored or transmitted. In the compressive sensing approach, however, compression and processing are integrated into one step. CS builds on the fact that, by means of nonlinear mathematical optimization, the original signal can be recovered from its compressed sparse representation in a proper domain. A sparse representation of a signal means that the signal can be represented with only a few nonzero coefficients; the representation must, however, be a good approximation of the original compressible signal [Daetal12]. Nowadays, CS is established as a theory of efficient signal acquisition at low sampling rates, which reconstructs the original signal from a small set of measurements [CaWa08]. CS has emerged only in recent decades; however, its application area is already vast and growing. Applications include, for example, Magnetic Resonance Imaging (MRI) [LiJa13], radar [Yaetal13] and machine learning [SeNi08].
1.4.1 Background

Since mathematical norms are the main building blocks of calculations in CS, the definition of a norm is given first: given a vector space V over a field F, a vector norm ||⋅|| is defined as a real-valued non-negative function on V with the following properties for all k ∈ F and all x, y ∈ V:
a) ||x|| ≥ 0, and ||x|| = 0 if and only if x = 0
b) ||k⋅x|| = |k|⋅||x||
c) ||x + y|| ≤ ||x|| + ||y||

A normed vector space is a vector space on which a norm is defined; it is commonly denoted by the pair (V, ||⋅||). Two important norms on the n-dimensional Euclidean space $V = \mathbb{R}^n$, frequently used in CS, are as follows:
a) lp-norms: $\|x\|_p = \left(\sum_{i=1}^{n} |x_i|^p\right)^{1/p}$, $p \in [1, \infty)$
b) l∞-norm: $\|x\|_\infty = \max_{i=1,\dots,n} |x_i|$
Two important properties frequently used in CS are signal sparsity and signal compressibility. To formalize the concept of sparsity, the support of a vector must be defined:

Definition 1. The support of an n-dimensional vector $x \in V$, denoted by supp(x), is defined as the set of indices of its nonzero elements, i.e.:

$$\mathrm{supp}(x) = \{\, i = 1,\dots,n : x_i \neq 0 \,\} \tag{1.105}$$

The vector $x \in V$ is called k-sparse when at most k of its elements are nonzero. The notion of k-sparse can be formalized as follows:

$$\|x\|_0 = \mathrm{card}(\mathrm{supp}(x)) \le k \tag{1.106}$$

where “card” indicates the cardinality of a set, i.e. the number of elements of the set. The notation $\|x\|_0$ should not be mistaken for a norm of x as defined above. This notation is used due to the following observation:

$$\lim_{p\to 0}\|x\|_p^p = \mathrm{card}(\mathrm{supp}(x)) \tag{1.107}$$
Obviously, sparsity is not an easy property for a signal to achieve. In fact, what is important in CS is the sparse representation of a signal in some domain, which introduces the concept of signal compressibility or compressible signals. Approximating a signal with a sparse representation is an efficient way of compressing the signal. To formalize the concept of signal compressibility, the following definition of the error incurred by approximating a signal with a sparse representation is necessary:

Definition 2. The error of the best k-sparse approximation to a signal x in a vector space V is defined as follows [Whi15][Daetal12][FoRa13]:

$$\sigma_k(x)_p = \inf\{\|x - z\|_p : z \in V,\ z \text{ is } k\text{-sparse}\} \tag{1.108}$$
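Definitions 1 and 2 translate directly into a few lines of code. The sketch below is only an illustration on a hypothetical vector x; the function names are assumptions. It relies on the fact that, for lp-norms, the infimum in (1.108) is attained by keeping the k entries of largest magnitude.

```python
def lp_norm(x, p):
    """l_p norm of a list of numbers, for p >= 1."""
    return sum(abs(v) ** p for v in x) ** (1.0 / p)


def support(x):
    """Indices (1-based, as in (1.105)) of the nonzero entries of x."""
    return {i + 1 for i, v in enumerate(x) if v != 0}


def l0(x):
    """Number of nonzero entries, i.e. card(supp(x)) from (1.106)."""
    return len(support(x))


def sigma_k(x, k, p=1):
    """Error of the best k-sparse approximation (1.108) in the l_p norm.

    The k largest-magnitude entries are kept; the norm of the
    remaining entries is the approximation error.
    """
    rest = sorted((abs(v) for v in x), reverse=True)[k:]
    return lp_norm(rest, p) if rest else 0.0


x = [0.0, 3.0, 0.0, -0.5, 0.0, 0.1, 4.0, 0.0]   # hypothetical 4-sparse vector
print(support(x), l0(x), sigma_k(x, 2), sigma_k(x, 4))

# Observation (1.107): the sum of |x_i|^p approaches card(supp(x)) for small p
print(sum(abs(v) ** 1e-6 for v in x))
```

For this x the two largest entries are 4 and 3, so the best 2-sparse approximation leaves the entries –0.5 and 0.1 behind, and the error vanishes once k reaches the number of nonzero entries.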
Using the above definition, a compressible signal in a normed vector space V is a vector x for which the error of its best k-sparse approximation decays quickly as k increases. There are some special conditions under which a signal becomes a compressible signal [FoRa13]. Approximately sparse, relatively sparse and nearly k-sparse are different expressions used interchangeably in the literature for the concept of signal compressibility.

Definition 3. Let $A \in \mathbb{R}^{m\times n}$. A is said to satisfy the Null-Space Property (NSP) of order s if there exists a positive constant K such that the following inequality holds for all vectors x in the null space of A and for all index sets S with card(S) ≤ s:

$$\|x_S\|_2 \le K\,\frac{\|x_{S^c}\|_1}{\sqrt{s}} \tag{1.109}$$

where $x_S$ denotes the vector x whose elements outside S are set to zero, and $S^c$ is the complement of S.
NSP is a strong condition required for the efficiency of signal recovery algorithms in CS. Another important property in the theory of CS is the Restricted Isometry Property (RIP). RIP is a property imposed on the measurement matrix which, along with other imposed conditions like NSP, enables an accurate signal recovery by providing stability. For a measurement matrix, RIP is defined as follows [CaTa05]:

Definition 4. A matrix A satisfies the RIP of order k if there exists a positive constant δk (δk < 1) such that the following inequalities hold for all k-sparse signal vectors x:

$$(1 - \delta_k)\,\|x\|_2^2 \le \|Ax\|_2^2 \le (1 + \delta_k)\,\|x\|_2^2 \tag{1.110}$$
The constant δk introduced in the latter definition is known as the restricted isometry constant. The RIP was first introduced by Candès and Tao in [CaTa05] and later became a baseline for the signal recovery algorithms discussed in CS theory. In particular, it has a key role in sparse signal recovery when the measurements are contaminated with noise [Whi15][Dav10]. One of the main pillars of CS is the design of a measurement matrix satisfying the RIP condition, which is necessary for the sparse signal recovery algorithm. It has been shown that randomized matrices with some special properties are a proper basis for such a design, which can be achieved, for example, by generating a matrix whose entries are chosen according to a Gaussian-related statistical distribution. However, the design is linked with some other basic notions like coherence and spark (the smallest number of linearly dependent columns of a matrix) [Daetal12][TrGi07][DoEl03].
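The role of a random Gaussian measurement matrix can be illustrated empirically. The sketch below assumes i.i.d. entries with variance 1/m (a common choice, not stated in the text) and hypothetical sizes m = 64, n = 128, k = 4. It only samples random k-sparse vectors, so the printed value is a lower estimate of δk from (1.110) and not a proof that the RIP holds.

```python
import math
import random


def gaussian_matrix(m, n, rng):
    """m x n matrix with i.i.d. N(0, 1/m) entries (an assumed normalisation)."""
    s = 1.0 / math.sqrt(m)
    return [[rng.gauss(0.0, s) for _ in range(n)] for _ in range(m)]


def energy_ratio(A, x):
    """||Ax||_2^2 / ||x||_2^2, the quantity bounded in (1.110)."""
    Ax = [sum(a * v for a, v in zip(row, x)) for row in A]
    return sum(v * v for v in Ax) / sum(v * v for v in x)


def random_sparse(n, k, rng):
    """Random k-sparse vector with Gaussian nonzero entries."""
    x = [0.0] * n
    for i in rng.sample(range(n), k):
        x[i] = rng.gauss(0.0, 1.0)
    return x


rng = random.Random(0)
m, n, k = 64, 128, 4
A = gaussian_matrix(m, n, rng)
ratios = [energy_ratio(A, random_sparse(n, k, rng)) for _ in range(200)]

# Largest observed deviation of the ratio from 1: a lower estimate of delta_k
delta_est = max(1 - min(ratios), max(ratios) - 1)
print(min(ratios), max(ratios), delta_est)
```

With these sizes the observed ratios cluster around 1, which is the behaviour that (1.110) requires of a measurement matrix.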
1.4.2 Compressive Sensing Model

Encoder

The typical compressive sensing model is concerned with reconstructing a sparse vector x in $V = \mathbb{R}^n$ from its m measurements, presented by a system of linear equations as:

$$y = Ax \tag{1.111}$$

where $A \in \mathbb{R}^{m\times n}$ and $y \in \mathbb{R}^m$. The matrix A is called the measurement matrix or, equivalently, the sensing matrix, and y is the output vector. Since the number of measurements (m) is usually much smaller than the original vector dimension (n), the above system of linear equations is underdetermined, so additional conditions are necessary for recovering the original signal vector x. The vector x might also be represented by x = Ψα, where Ψ is a basis whose columns are orthogonal, i.e. α consists of the coefficients of x with respect to the basis Ψ. Ψ is also called a dictionary, which maps the compressible input vector α to a sparse vector x. So, instead of (1.111), the equivalent underdetermined system of equations y = AΨα can be considered. One important problem in CS is generating the measurement matrix A and imposing the sparsity condition on x such that the original signal x can be recovered accurately. In fact, the measurements extracted from the original signal vector x must result in a representation of x containing the necessary and sufficient information about the signal for an accurate recovery.
Decoder

In order to reconstruct the signal x from the compressed measurements y, an underdetermined system of linear equations has to be solved. Generally, such systems have an infinite number of solutions, which makes it infeasible to find x with no further information than A and y. In compressive sensing, the sparsity constraints on x restrict the problem, since only solutions with a small number of nonzero coefficients qualify. Properties like the NSP or RIP ensure that the systems used in compressive sensing have a unique sparse solution. Therefore, a compressive sensing decoder has to find the sparsest solution to the linear system.
1.4.3 Conditions for Signal (Sparse) Recovery in Compressive Sensing

There are many algorithms in the literature for sparse signal recovery, which formulate the problem in different ways. Originally, a sparse or compressible solution for a standard CS problem is sought by solving the following optimization problem:

$$\min_{z}\ \|z\|_0 \quad \text{subject to} \quad Az = y \tag{1.112}$$
The objective function of the aforementioned optimization problem is not convex, which creates a big barrier to finding the solution. In fact, it has been shown that the above l0-minimization and its generalization are NP-hard problems. So, the following l1-minimization problem is solved instead:

$$\min_{z}\ \|z\|_1 \quad \text{subject to} \quad Az = y \tag{1.113}$$
The objective function of the new optimization problem is convex, which facilitates solving the problem by classic optimization methods. Also, the solution to the latter problem gives a more efficient approach to the main CS problem model. However, when the measurements are contaminated with noise, both problems are subjected to the constraint $\|Az - y\|_2 \le \varepsilon$ instead. So, in general, two categories of algorithms exist for sparse signal recovery in CS, which recover the signal from noiseless measurements or in the presence of noise. The conditions for a successful signal recovery from noisy measurements are different from those for noiseless ones. In both categories the RIP condition plays an important role.

For the noiseless case it is stated and proven in [Daetal12] and [Can08] that, if the measurement matrix A satisfies the RIP condition of order 2k with the RIP constant $\delta_{2k} < \sqrt{2} - 1$, then the inequality $\|\hat{x} - x\|_2 \le C\,\sigma_k(x)_1/\sqrt{k}$ holds for the solution $\hat{x}$ of problem (1.113), where C is a constant. Since $\sigma_k(x)_1 = 0$ for any k-sparse signal x, it can be concluded that, under the above condition for A, the exact signal x can be recovered with only few measurements, i.e. the reconstruction is perfect. Also, it is shown in [Whi15] that the RIP condition can be replaced by the weaker NSP condition in the noiseless setting, where the same error bound can be achieved.

To tackle real-world problems in CS, signal recovery from noisy data generally attracts more interest. The above results can be extended to this case as well. As mentioned before, when the measurements are contaminated with noise, the following minimization problem is considered:

$$\min_{z}\ \|z\|_1 \quad \text{subject to} \quad \|Az - y\|_2 \le \varepsilon \tag{1.114}$$
Several promising approaches exist in the literature for some special types of noise [Can08][HaNo06]. However, only the first results for the solution of problem (1.114), proven in [Can08], will be mentioned here. In fact, under the same assumption on A as in the noiseless case, the solution $\hat{x}$ of problem (1.114) satisfies the following inequality:

$$\|\hat{x} - x\|_2 \le C_0\,\frac{\sigma_k(x)_1}{\sqrt{k}} + C_1\,\varepsilon \tag{1.115}$$

where $C_0$ and $C_1$ are two constants depending on $\delta_{2k}$. The constants are rather small, although the signal recovery in this case is not as simple as in the noiseless case and requires more prior information about the signal [Whi15][Daetal12][Can08][HaNo06].

The mentioned approaches are among the first results for solving the CS model. The l1-minimization recovery algorithms provide a powerful tool for an accurate and efficient signal recovery, benefitting mainly from convex optimization methods. However, there are other approaches to signal recovery applied in CS in the literature [Whi15][Daetal12][FoRa13]. The two important classes of such algorithms are known as greedy algorithms [BlDa09] and combinatorial algorithms [Shetal09].
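As an illustration of the greedy class mentioned above, the following sketch implements orthogonal matching pursuit (OMP) with the standard library only. It is a swapped-in example and not the l1-minimization of (1.113) or (1.114), and it is not claimed to be the method of [BlDa09]; the matrix sizes, the test vector and all function names are hypothetical.

```python
import math
import random


def solve(G, b):
    """Solve G c = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(G)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    c = [0.0] * n
    for i in reversed(range(n)):
        c[i] = (M[i][n] - sum(M[i][j] * c[j] for j in range(i + 1, n))) / M[i][i]
    return c


def omp(A, y, k, tol=1e-9):
    """Orthogonal matching pursuit: greedily select up to k columns of A."""
    m, n = len(A), len(A[0])
    cols = [[A[i][j] for i in range(m)] for j in range(n)]
    norms = [math.sqrt(sum(v * v for v in col)) for col in cols]
    chosen, coef, r = [], [], y[:]
    for _ in range(k):
        if math.sqrt(sum(v * v for v in r)) <= tol:
            break
        # Column most correlated with the current residual
        j = max((j for j in range(n) if j not in chosen),
                key=lambda j: abs(sum(a * v for a, v in zip(cols[j], r))) / norms[j])
        chosen.append(j)
        # Least squares on the chosen columns via the normal equations
        G = [[sum(a * b for a, b in zip(cols[p], cols[q])) for q in chosen]
             for p in chosen]
        rhs = [sum(a * v for a, v in zip(cols[p], y)) for p in chosen]
        coef = solve(G, rhs)
        r = [y[i] - sum(c * cols[p][i] for c, p in zip(coef, chosen))
             for i in range(m)]
    x = [0.0] * n
    for c, p in zip(coef, chosen):
        x[p] = c
    return x


rng = random.Random(1)
m, n, k = 20, 40, 3
A = [[rng.gauss(0.0, 1.0) / math.sqrt(m) for _ in range(n)] for _ in range(m)]
x_true = [0.0] * n
for idx, val in zip((3, 17, 29), (1.5, -2.0, 0.7)):
    x_true[idx] = val
y = [sum(a * v for a, v in zip(row, x_true)) for row in A]   # y = A x, as in (1.111)
x_hat = omp(A, y, k)
print(max(abs(a - b) for a, b in zip(x_hat, x_true)))        # largest recovery error
```

For a 1-sparse signal the first selection step is always correct, because the measured vector is a multiple of a single column. For larger k, success depends on the properties of A discussed above, so the printed error should be read as an experiment and not as a guarantee.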
1.5 Discrete and Digital Signals

1.5.1 Discrete Signals

Discrete signals are, as shown in Section 1.2, signals with values known at discrete time moments, whereby their amplitude can be analog or discrete. Discrete signals are the result of sampling analog signals, or they are discrete by nature. The discrete signal s(n) is represented as a sequence of numbers, i.e. the amplitudes of the signal at the discrete moments n. If a discrete signal s(n) is passed through a system which outputs a discrete signal sout(n), the system is called a “discrete system”. Linear, time invariant systems which are stable and causal are of the greatest importance for communications. Linear systems have the properties of:
– additivity:

$$s_{out1}(n) + s_{out2}(n) = T[s_1(n) + s_2(n)] = T[s_1(n)] + T[s_2(n)] \tag{1.116}$$

where sout1(n) and sout2(n) represent the discrete system responses to the input signals s1(n) and s2(n), i.e.:

$$s_{out1}(n) = T[s_1(n)], \qquad s_{out2}(n) = T[s_2(n)] \tag{1.117}$$

– homogeneity:

$$k\cdot s_{out}(n) = T[k\cdot s(n)] = k\cdot T[s(n)], \qquad k = \text{const.} \tag{1.118}$$

Additivity and homogeneity can be generalized using superposition as:

$$s_{out}(n) = T\left[\sum_i k_i\cdot s_i(n)\right] = \sum_i k_i\cdot s_{out\,i}(n) \tag{1.119}$$
Discrete systems are stable iff a bounded input sequence produces a bounded output sequence, i.e. |s(n)| ≤ K < ∞ ⇒ |sout(n)| < ∞ (for a constant K and n ∈ Z). Discrete systems are causal if the output signal sout(n0) depends only on the input signal at the current and previous moments, s(n ≤ n0), which means that there is no output before an input exists. Linear time invariant (LTI) systems [MaIn11] are systems which are both linear and time invariant. If h(n) = T[δ(n)] denotes the response of a linear time-invariant system to a Dirac impulse at its input, the response to an input signal s(n) is as follows:

$$s_{out}(n) = \sum_{i=-\infty}^{\infty} s(i)\cdot T[\delta(n-i)] = \sum_{i=-\infty}^{\infty} s(i)\cdot h(n-i) = s(n) * h(n) \tag{1.120}$$

where “∗” represents the convolution operation. In the case that a linear time invariant system is causal as well, it can be written that:

$$s_{out}(n) = \sum_{i=0}^{\infty} h(i)\cdot s(n-i) \tag{1.121}$$
because h(n) = 0 for n < 0. In the case that a linear time invariant system is stable as well, it can be written (for a bounded input sequence, |s(n)| ≤ K) that:

$$|s_{out}(n)| = \left|\sum_{i=-\infty}^{\infty} s(n-i)\cdot h(i)\right| \le \sum_{i=-\infty}^{\infty} |h(i)|\cdot|s(n-i)| \le K\sum_{i=-\infty}^{\infty} |h(i)| < \infty \tag{1.122}$$

i.e. it is sufficient that the system impulse response fulfils:

$$\sum_{i=-\infty}^{\infty} |h(i)| < \infty$$
Discrete causal linear time invariant systems are divided into two big groups:
– Finite Impulse Response (FIR) systems [MaIn11], described as:

$$s_{out}(n) = \sum_{i=0}^{N-1} h(i)\cdot s(n-i) \tag{1.123}$$

– Infinite Impulse Response (IIR) systems [MaIn11], described as:

$$s_{out}(n) = \sum_{i=0}^{\infty} h(i)\cdot s(n-i) \tag{1.124}$$
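The convolution sum (1.123) and the superposition property (1.119) can be checked with a few lines of code. This is a minimal sketch with hypothetical coefficients and input sequences; s(n) is taken as zero for n < 0, and the name `fir_filter` is illustrative.

```python
def fir_filter(h, s):
    """Causal FIR system (1.123): s_out(n) = sum_i h(i) * s(n - i).

    The input is assumed to be zero for n < 0; len(s) output samples
    are returned.
    """
    out = []
    for n in range(len(s)):
        acc = 0.0
        for i, h_i in enumerate(h):
            if n - i >= 0:
                acc += h_i * s[n - i]
        out.append(acc)
    return out


h = [0.25, 0.5, 0.25]                  # hypothetical coefficients, N = 3
delta = [1.0, 0.0, 0.0, 0.0, 0.0]      # Dirac impulse
print(fir_filter(h, delta))            # impulse response: h followed by zeros

# Superposition (1.119): T[2*s1 + 3*s2] equals 2*T[s1] + 3*T[s2]
s1 = [1.0, 2.0, 0.0, -1.0, 4.0]
s2 = [0.0, 1.0, 1.0, 3.0, -2.0]
lhs = fir_filter(h, [2 * a + 3 * b for a, b in zip(s1, s2)])
rhs = [2 * a + 3 * b for a, b in zip(fir_filter(h, s1), fir_filter(h, s2))]
print(lhs, rhs)
```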
1.5.1.1 Finite Impulse Response

A system with finite impulse response, i.e. a FIR filter, is defined by the set of N coefficients h(0), h(1),…, h(N – 1) according to the convolution sum in (1.123), whose direct realization is shown in Figure 1.39.

Fig. 1.39: Direct realization of FIR filter (chain of delay elements T producing s(n – 1),…, s(n – N + 1), weighted by h(0),…, h(N – 1) and summed to sout(n) = Σ h(i)s(n – i)).
As the name suggests, when a Dirac impulse is applied at the input of a FIR filter, the output signal sout(n) will have the finite length N, whereby sout(n) = h(n). The direct consequence is that a FIR filter is always stable, i.e. the condition (1.122) is always satisfied. Besides stability, another useful property is that it is very easy (with a proper choice of the coefficients h(n) by design) to achieve a linear phase characteristic of a FIR filter in the frequency domain. Namely, the frequency response H(jω) of a FIR filter, defined as the Fourier transform of its impulse response h(n):

$$H(j\omega) = \sum_{n=-\infty}^{\infty} h(n)\cdot e^{-jn\omega} = \sum_{n=0}^{N-1} h(n)\cdot e^{-jn\omega} \tag{1.125}$$
will have a linear phase if h(n) is symmetric or anti-symmetric around the middle point M = (N – 1)/2, i.e. if one of the following two conditions is fulfilled:

$$h(n) = h(N - n - 1) \quad \text{or} \quad h(n) = -h(N - n - 1) \tag{1.126}$$
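For example, in the case of even N and a symmetric impulse response, (1.125) becomes:

$$H(j\omega) = \sum_{n=-\infty}^{\infty} h(n)\cdot e^{-jn\omega} = e^{-j\frac{N-1}{2}\omega}\sum_{n=0}^{N-1} h(n)\cdot e^{-j\left(n-\frac{N-1}{2}\right)\omega}$$

$$= e^{-j\frac{N-1}{2}\omega}\left[\sum_{n=0}^{N-1} h(n)\cos\left(\left(n-\frac{N-1}{2}\right)\omega\right) - j\sum_{n=0}^{N-1} h(n)\sin\left(\left(n-\frac{N-1}{2}\right)\omega\right)\right] \tag{1.127}$$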
In (1.127) the last sum (i.e. the imaginary part) equals zero because the sine function within the sum is anti-symmetric and h(n) is symmetric around the middle point (N – 1)/2. Therefore, the last expression can be written in the following form:

$$H(j\omega) = A(\omega)\, e^{-j\frac{N-1}{2}\omega} = A(\omega)\, e^{-jK\omega} \tag{1.128}$$

where A(ω) is a real function of ω and K = (N – 1)/2 is a constant, i.e. the phase ϕ(ω) = –Kω of the frequency response H(jω) is a linear function of ω. In a similar way the linearity of the phase characteristic can be proven for the other three types of FIR filters: when N is even and h(n) is anti-symmetric, and when N is odd and h(n) is symmetric or anti-symmetric around (N – 1)/2. Linear phase is a desirable characteristic of linear systems since all frequency components of a processed input signal are delayed equally in time (in the case of FIR filters the delay is (N – 1)/2 sample periods), which means that there is no additional distortion (i.e. dispersion) of the signal. On the other hand, other properties of FIR filters are not as favourable as stability and linear phase. For example, a better selectivity (or sharpness) of the frequency response H(jω) can be achieved only with greater values of the length N, which (as a consequence) demands a greater number of coefficients to be stored in memory and more processing power for the realization of such FIR filters.
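The linear-phase property can be verified numerically for a concrete symmetric impulse response. In the sketch below the coefficients are hypothetical (N = 6, symmetric as in (1.126)); multiplying H(jω) from (1.125) by e^{+jKω} with K = (N – 1)/2 removes the linear phase and should leave the real function A(ω) of (1.128).

```python
import cmath


def freq_response(h, w):
    """H(jw) = sum_n h(n) * e^{-jnw}, equation (1.125)."""
    return sum(h_n * cmath.exp(-1j * n * w) for n, h_n in enumerate(h))


h = [1.0, 2.0, 3.0, 3.0, 2.0, 1.0]   # hypothetical symmetric h(n), N = 6 (even)
K = (len(h) - 1) / 2                  # K = (N - 1)/2

for w in (0.1, 0.5, 1.0, 2.0):
    A_w = freq_response(h, w) * cmath.exp(1j * K * w)   # remove the linear phase
    print(w, A_w.real, abs(A_w.imag))                   # imaginary part is ~0
```

The imaginary part stays at rounding-error level for every ω, which confirms that the phase of H(jω) is –Kω up to the sign of A(ω).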
1.5.1.2 Infinite Impulse Response

Unlike FIR filters, IIR filters can achieve much better characteristics of the frequency response with a smaller number of coefficients. This is obtained by feedback from the output, since the output signal sout(n) depends not only on the input s(n), but also on the delayed output samples sout(n – i), as shown in Figure 1.40. Therefore, the stability condition (1.122) is not necessarily satisfied. IIR filters are also called recursive filters.

Fig. 1.40: Realization of IIR filter (branch with the coefficients b0,…, bN–1 acting on s(n),…, s(n – N + 1), and feedback branch with the coefficients a1,…, aM–1 acting on sout(n – 1),…, sout(n – M + 1)).
Due to the recursion, the length of the response h(n) to a Dirac impulse is not finite in general, but in the case of properly designed IIR filters, the amplitudes of the impulse response decrease towards zero after a certain number of sample intervals. In line with (1.124), the frequency response H(jω) of an IIR filter is equal to the Fourier transform of h(n), but it cannot be expressed as a finite sum like in (1.125) for FIR filters. Instead, the frequency response of an IIR filter can be defined by the coefficients am and bn (Fig. 1.40):

$$H(j\omega) = \frac{B(j\omega)}{A(j\omega)} = \frac{\sum_{n=0}^{N-1} b_n\cdot e^{-jn\omega}}{\sum_{m=1}^{M-1} a_m\cdot e^{-jm\omega}} \tag{1.129}$$
Since the denominator A(jω) in (1.129) may take values very close to zero (in the complex plane) at certain frequencies, the frequency response (transfer function) can have very good characteristics with regard to sharpness and selectivity, i.e. much better than a FIR filter with a similar number of coefficients. On the other hand, this may cause instability of the system, especially when errors in the numeric representation of the coefficients come into play.
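The effect of the feedback on stability can be seen with the simplest recursive filter. The sketch below assumes a first-order difference equation sout(n) = b0⋅s(n) + a1⋅sout(n – 1), i.e. the fed-back sample is added to the output; this sign convention and the coefficient values are assumptions made for illustration only.

```python
def iir_first_order(s, b0, a1):
    """Recursive filter s_out(n) = b0*s(n) + a1*s_out(n-1), with s_out(-1) = 0.

    The feedback term is added (assumed sign convention).
    """
    out, prev = [], 0.0
    for x in s:
        prev = b0 * x + a1 * prev
        out.append(prev)
    return out


delta = [1.0] + [0.0] * 19                            # Dirac impulse, 20 samples
h_stable = iir_first_order(delta, b0=1.0, a1=0.5)     # h(n) = 0.5**n, decays
h_unstable = iir_first_order(delta, b0=1.0, a1=1.1)   # h(n) = 1.1**n, grows

print(h_stable[:4], sum(abs(v) for v in h_stable))
print(h_unstable[-1])
```

For |a1| < 1 the impulse response decays and its absolute sum stays finite, so condition (1.122) is met; for |a1| > 1 the response grows without bound, which is the instability mentioned above.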
1.5.2 Digital Signals

Digital signals are, as shown in Section 1.2, signals with discrete values at discrete time moments. The simplest digital signals are binary signals, having only two values which can be represented by the two bit values “0” and “1” (Fig. 1.41). If exactly 1 bit per period T is transmitted, the transmission rate equals 1/T bit/s. The unit used for the transmission rate is [bit/s].

Fig. 1.41: Binary signal.
If a multi-level transmission is used, e.g. with 4, 8, 16,… (generally $2^n$) levels, where one transmitted symbol consists of n bits (Fig. 1.42), the transmission rate increases to n/T bit/s. The unit used for the symbol rate is [symbol/s] or [baud]. The connection between [baud] and [bit/s] for a transmission with M levels is given as: [bit/s] = ld(M)⋅[baud]; only in the case of a binary transmission (M = 2) are [baud] and [bit/s] equal.

Fig. 1.42: Multi-level signal (four levels 0 to 3, one symbol per period T).
Some transmission types are multi-level, but with a number of levels different from $2^n$ (e.g. ternary transmission (see Chapter 5) with the data rate of ld(3)/T ≈ 1.58/T bit/s).
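The relation between symbol rate and bit rate can be written as a small helper. This is a sketch with a hypothetical symbol period of 1 ms; `bit_rate` and `symbol_rate` are illustrative names, and ld denotes the logarithm to base 2.

```python
import math


def symbol_rate(T):
    """Symbol rate in baud for one symbol every T seconds."""
    return 1.0 / T


def bit_rate(levels, T):
    """Bit rate in bit/s: ld(levels) bits per symbol, one symbol every T seconds."""
    return math.log2(levels) / T


T = 1e-3   # hypothetical symbol period of 1 ms, i.e. 1000 baud
for M in (2, 3, 4, 16):
    print(M, symbol_rate(T), bit_rate(M, T))
```

Only for M = 2 do the two rates coincide; for the ternary case the bit rate is about 1.58 times the symbol rate.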
1.6 Discrete and Fast Fourier Transform

1.6.1 Discrete Fourier Transform (DFT)

The Discrete Fourier Transform (DFT) is used for the conversion of a sampled function from its original time domain to the frequency domain. The DFT is defined on a set of N complex equidistant samples f(iT) = {f(0), f(T), f(2T),…, f[(N – 1)T]}, with i = 0, 1,…, N – 1, of the function f(t) on the interval (0, NT), where T is the sample period [Veetal14]:

$$F_D(jn\Omega) = \sum_{i=0}^{N-1} f(iT)\cdot e^{-jn\Omega iT}, \qquad n = 0, 1, 2, \dots, N-1 \tag{1.130}$$
The function f(t) is not unambiguously determined by a finite set of N complex samples, and therefore a new time-limited function $\tilde{f}(t)$ is introduced:

$$\tilde{f}(t) = \begin{cases} f(t), & 0 \le t \le NT \\ 0, & \text{otherwise} \end{cases} \tag{1.131}$$

whose Fourier transform is given as:

$$\tilde{F}(j\omega) = \int_{0}^{NT} f(t)\, e^{-j\omega t}\,dt \tag{1.132}$$

If nΩ is used instead of ω, iT instead of t and T instead of dt, the following applies:

$$\tilde{F}(jn\Omega) \cong \sum_{i=0}^{N-1} f(iT)\, e^{-jn\Omega iT}\, T \tag{1.133}$$

After comparing (1.130) and (1.133), it can be written that:

$$\tilde{F}(j\omega)\Big|_{\omega = n\Omega} \cong T\, F_D(jn\Omega) \tag{1.134}$$
i.e. the continual and the discrete Fourier transform are equivalent if the signal f(t) is represented as a sequence of N equidistant samples in a time-limited interval (0, NT), which can be periodically extended to get the harmonic frequencies nΩ = 2πn/(NT), i.e. a periodic signal with the period NT.
The Inverse Discrete Fourier Transform (IDFT) is calculated as:

$$f(iT) = \frac{1}{N}\sum_{n=0}^{N-1} F_D(jn\Omega)\, e^{j\Omega Tin} \tag{1.135}$$

where DFT and IDFT form a Fourier transform pair.
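The transform pair (1.130) and (1.135) can be written down directly. The sketch below uses ΩT = 2π/N, which follows from nΩ = 2πn/(NT), and a hypothetical test signal of one cosine period over N = 8 samples; it is the direct O(N²) evaluation, not an FFT.

```python
import cmath
import math


def dft(f):
    """F_D(n) = sum_i f(i) * e^{-j*2*pi*n*i/N}, i.e. (1.130) with Omega*T = 2*pi/N."""
    N = len(f)
    return [sum(f[i] * cmath.exp(-2j * math.pi * n * i / N) for i in range(N))
            for n in range(N)]


def idft(F):
    """f(i) = (1/N) * sum_n F_D(n) * e^{+j*2*pi*n*i/N}, equation (1.135)."""
    N = len(F)
    return [sum(F[n] * cmath.exp(2j * math.pi * n * i / N) for n in range(N)) / N
            for i in range(N)]


N = 8
f = [math.cos(2 * math.pi * i / N) for i in range(N)]   # one period of a cosine
F = dft(f)
print([round(abs(v), 6) for v in F])     # peaks of N/2 at n = 1 and n = N - 1
print([round(v.real, 6) for v in idft(F)])
```

Applying `idft` to the result of `dft` returns the original samples up to rounding error, which is what "transform pair" means in practice.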
1.6.2 Spectrum Forming and Window Functions

The DFT is a powerful tool for the spectrum analysis of discrete signals, especially when implemented using one of the Fast Fourier Transform (FFT) algorithms (see Section 1.6.3). Still, there are some issues caused by the discontinuities at the ends of the original signal to be analysed, since the DFT by definition works with a limited set of input values, i.e. with N samples of the original signal. One of the issues is “spectral leakage”, expressed through the appearance of new frequency components (side lobes) after the DFT of a signal which really contains only one spectral component (i.e. a pure sine signal at a certain frequency). In order to reduce these “leakage aberrations” in the DFT/FFT output, the so-called spectrum forming of an input signal is usually performed by the use of a window, i.e. a weight function w(n) [MaIn11]. A window function smoothly decreases to zero at each end, so by multiplying an input signal x(n) by a window w(n) the negative effects of sudden changes of x(n) at the beginning and the end are mitigated. The formed signal y(n) after the “windowing” of x(n) is given as:

$$y(n) = w(n)\cdot x(n) \tag{1.136}$$

and its DFT as the convolution of the DFTs of the window function and the input signal:

$$Y(n) = \sum_{i=0}^{N-1} W(i)\, X(n-i) \tag{1.137}$$
There are many different window functions, and some of the most used are the Hamming, Hanning, Blackman, Bartlett, Kaiser, Poisson and Dolph-Chebyshev window functions [Nut81][Har78][BlTu58][KaSc80][GaHe87][Dol46][RaGo75]. Which of the many window functions is used depends on the concrete application, since each window has its specific properties. The purpose of windowing in the time domain is to analyse the periodic behaviour of the function over a short duration, e.g. for the extraction of pitch or formant frequencies from a speech signal, which is considered stationary only in a short time interval (see Chapter 2). Windowing can also be performed in the frequency domain when only a specific subset of frequencies is to be analysed, i.e. when zooming into the finer details of a signal is necessary.

The length N of a (time-domain) window is a trade-off between the resolutions in the time and frequency domains. A greater N gives a better frequency resolution (after a DFT of the same length), i.e. close frequency components are better separated. On the other hand, a longer window is a poorer choice when the time detection of spectral changes is important, and in those cases it is better to use a narrower window (i.e. smaller blocks of analysed data). Since, according to (1.137), the “windowed input signal” Y(n) in the frequency domain is the convolution of the spectrum of the input signal X(n) and the DFT of the window function W(n), in the resulting spectrum Y(n) the leakage frequencies from X(n) (around some real spectral component) will be more or less suppressed, whereby the level of this suppression and the frequency selectivity highly depend on the shape of W(n), i.e. on the shape of the window function w(n) in the time domain. In this regard, the two main properties of window functions are the width of the main lobe and the attenuation of the side lobes in the frequency characteristic W(n). While the width of the main lobe affects the selectivity (the narrower the lobe, the better the selectivity), the attenuation of the side lobes affects the suppression (of unwanted leakage frequencies). In an ideal case, W(n) should have a main lobe as narrow as possible and side lobes as low as possible, but in reality the choice of a window function is always a compromise between these two factors.

For illustration, some of the window functions are listed below, while Figure 1.43 presents their shapes in the time and frequency domains (obtained in MATLAB for the window length N = 64), which can easily be compared with each other regarding the mentioned properties:
– Rectangular window (Dirichlet):
$$w(n) = \begin{cases} 1, & 0 \le n \le N-1 \\ 0, & \text{otherwise} \end{cases} \tag{1.138}$$
– Triangular window (Bartlett):
$$w(n) = \begin{cases} \dfrac{2n}{N-1}, & 0 \le n \le \dfrac{N-1}{2} \\ 2 - \dfrac{2n}{N-1}, & \dfrac{N-1}{2} < n \le N-1 \\ 0, & \text{otherwise} \end{cases} \tag{1.139}$$
– Hanning (Hann) window:
$$w(n) = \begin{cases} 0.5 - 0.5\cos\dfrac{2\pi n}{N-1}, & 0 \le n \le N-1 \\ 0, & \text{otherwise} \end{cases} \tag{1.140}$$
– Hamming window:
$$w(n) = \begin{cases} 0.54 - 0.46\cos\dfrac{2\pi n}{N-1}, & 0 \le n \le N-1 \\ 0, & \text{otherwise} \end{cases} \tag{1.141}$$
– Blackman window:
$$w(n) = \begin{cases} 0.42 - 0.5\cos\dfrac{2\pi n}{N-1} + 0.08\cos\dfrac{4\pi n}{N-1}, & 0 \le n \le N-1 \\ 0, & \text{otherwise} \end{cases} \tag{1.142}$$
[Figure 1.43: two panels. “Time domain”: amplitude versus samples. “Frequency domain”: magnitude (dB) versus normalized frequency (×π rad/sample). Curves: rectangular, Bartlett (triangular), Hanning, Hamming and Blackman windows.]
Fig. 1.43: Window functions in time and frequency domains.
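As a rough illustration, the window definitions (1.138)–(1.142) can be evaluated directly. The following Python sketch uses only the standard library. The function names and the helper `mean_gain` are illustrative assumptions and do not come from the text. For even N the second branch of the triangular window is applied for n > (N − 1)/2, which is an assumption about how the two ranges meet.

```python
import math

def rectangular(N):
    """Rectangular (Dirichlet) window, eq. (1.138)."""
    return [1.0] * N

def bartlett(N):
    """Triangular (Bartlett) window, eq. (1.139)."""
    return [2.0 * n / (N - 1) if n <= (N - 1) / 2 else 2.0 - 2.0 * n / (N - 1)
            for n in range(N)]

def hann(N):
    """Hanning (Hann) window, eq. (1.140)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / (N - 1)) for n in range(N)]

def hamming(N):
    """Hamming window, eq. (1.141)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (N - 1)) for n in range(N)]

def blackman(N):
    """Blackman window, eq. (1.142)."""
    return [0.42 - 0.5 * math.cos(2 * math.pi * n / (N - 1))
            + 0.08 * math.cos(4 * math.pi * n / (N - 1)) for n in range(N)]

def mean_gain(w):
    """Mean value of the window samples (hypothetical helper for comparison)."""
    return sum(w) / len(w)

if __name__ == "__main__":
    N = 64  # same window length as used for Figure 1.43
    for name, func in [("rectangular", rectangular), ("bartlett", bartlett),
                       ("hann", hann), ("hamming", hamming),
                       ("blackman", blackman)]:
        print(f"{name:12s} mean gain = {mean_gain(func(N)):.3f}")
```

The tapered windows have a smaller mean gain than the rectangular one, which is the time-domain counterpart of the wider main lobe visible in the frequency-domain panel.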
The rectangular window corresponds to applying “no window”, i.e. y(n) = x(n), and it is included in Figure 1.43 for comparison with the other window functions. Besides the “fixed” window functions shown above, there are many generalised versions with adjustable parameters, which makes them flexible enough for use in different applications (e.g. for mitigating the Gibbs phenomenon in the design of FIR filters by the window method). One such “parametric” function is the Kaiser (i.e. Kaiser-Bessel) window with a parameter β:
$$w(n) = \begin{cases} \dfrac{I_0\!\left[\beta\sqrt{1 - (n-\alpha)^2/\alpha^2}\,\right]}{I_0[\beta]}, & 0 \le n \le N-1 \\ 0, & \text{otherwise} \end{cases} \tag{1.143}$$
where α = (N – 1)/2 and I0[⋅] is the zero-order modified Bessel function of the first kind. The parameter β provides a high level of flexibility, since many other popular window functions can be approximated or exactly generated by a proper choice of its value. Figure 1.44 shows several shapes of the Kaiser function, together with their frequency characteristics, for different values of β. As in Figure 1.43, the compromise between the main lobe width and the attenuation of the side lobes is noticeable.
[Figure 1.44: two panels. “Time domain”: amplitude versus samples. “Frequency domain”: magnitude (dB) versus normalized frequency (×π rad/sample). Curves for β = 0 (rectangular window), 3, 6, 9 and 10.]
Fig. 1.44: Kaiser window functions in time and frequency domains.
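The Kaiser window of (1.143) can be sketched in a few lines. The example below is a minimal illustration, assuming the square-root form of the argument of I0 and evaluating I0 by its power series; the names `bessel_i0` and `kaiser` and the number of series terms are choices made here, not taken from the text.

```python
import math

def bessel_i0(x, terms=50):
    """Zero-order modified Bessel function of the first kind,
    I0(x) = sum_k ((x/2)^k / k!)^2, truncated after `terms` terms."""
    total, term = 1.0, 1.0
    for k in range(1, terms):
        term *= (x / (2.0 * k)) ** 2   # ratio of consecutive series terms
        total += term
    return total

def kaiser(N, beta):
    """Kaiser window of length N with shape parameter beta, eq. (1.143)."""
    alpha = (N - 1) / 2.0
    window = []
    for n in range(N):
        # max() guards against tiny negative values caused by rounding
        arg = max(0.0, 1.0 - ((n - alpha) / alpha) ** 2)
        window.append(bessel_i0(beta * math.sqrt(arg)) / bessel_i0(beta))
    return window

if __name__ == "__main__":
    for beta in (0, 3, 6, 9, 10):   # values shown in Figure 1.44
        w = kaiser(64, beta)
        print(f"beta = {beta:2d}: end value = {w[0]:.5f}")
```

For β = 0 every sample equals 1, i.e. the rectangular window is obtained, and the end values 1/I0(β) shrink as β grows, which matches the increasingly tapered shapes in the time-domain panel.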
1.6.3 Fast Fourier Transform (FFT)

Fast Fourier Transform (FFT) is a group of algorithms for the efficient calculation of the DFT. The direct calculation of the DFT in N equidistant points requires:
– N² complex multiplications and
– N⋅(N – 1) complex additions.
This quadratic growth of the number of operations with the number of samples makes the direct DFT calculation impractical for large N. Therefore, many FFT algorithms have been developed that calculate the DFT with fewer operations. These more efficient algorithms exploit the many redundant calculations that occur when equation (1.130) is applied directly. If Ω in (1.130) is replaced by 2π/(NT), the discrete Fourier transform X(n) of a discrete signal x(i) becomes:
$$X(n) = \sum_{i=0}^{N-1} x(i)\, e^{-jn\frac{2\pi}{N}i} \tag{1.144}$$
and it can be further simplified as:
$$X(n) = \sum_{i=0}^{N-1} x(i)\, W_N^{ni} \tag{1.145}$$
where $W_N = e^{-j2\pi/N}$. The last DFT expression can be rewritten as two sums, where the first one collects the terms with even indices and the second one the terms with odd indices:
$$X(n) = \sum_{i=0}^{\frac{N}{2}-1} x(2i)\, W_N^{2i\cdot n} + \sum_{i=0}^{\frac{N}{2}-1} x(2i+1)\, W_N^{(2i+1)n} \tag{1.146}$$
Since:
$$W_N^{2i\cdot n} = e^{-j\frac{2\pi}{N}\cdot 2i\cdot n} = e^{-j\frac{2\pi}{N/2}\cdot i\cdot n} = W_{N/2}^{i\cdot n} \tag{1.147}$$
from (1.146) it follows:
$$X(n) = \sum_{i=0}^{\frac{N}{2}-1} x(2i)\, W_{N/2}^{i\cdot n} + W_N^n \sum_{i=0}^{\frac{N}{2}-1} x(2i+1)\, W_{N/2}^{i\cdot n} = A(n) + W_N^n \cdot B(n) \tag{1.148}$$
where A(n) and B(n) represent sums with half the number of terms compared to (1.145). The same procedure can be applied recursively to the newly derived sums A(n) and B(n) and so on, until the number of terms in the sums at the lowest decimation level is minimal, e.g. equal to 2. In this way, the computational complexity is decreased from O(N²) to O(N⋅log₂ N). One of the most used FFT algorithms is the Decimation-in-time algorithm (also known as the radix-2 Cooley–Tukey algorithm [CoTu65]), which works with data blocks of N elements, where N is a power of 2 (the most common is N = 256). In such a case, for the deepest recursion level in (1.148), i.e. in the first stage of the algorithm, there are N/2 parallel 2-element DFTs, where the inputs and the multiplication coefficients of each DFT are simply $A^{(1)}(n) = x(0)$, $B^{(1)}(n) = x(1)$ for n = 0, 1 and $W_N^n = W_2^n = e^{-jn\pi}$, i.e. $W_2^0 = 1$ and $W_2^1 = -1$. So, the DFT output {X(0), X(1)} of a basic 2-element block with input data {x(0), x(1)} equals:
$$\begin{aligned} X(0) &= A^{(1)}(0) + W_2^0 B^{(1)}(0) = x(0) + W_2^0 x(1) = x(0) + x(1) \\ X(1) &= A^{(1)}(1) + W_2^1 B^{(1)}(1) = x(0) + W_2^1 x(1) = x(0) - x(1) \end{aligned} \tag{1.149}$$
The computations from (1.149) are performed in the basic FFT element called “butterfly”, which has two inputs and two outputs. The symbolic representation of a butterfly in a general case is shown in Figure 1.45.
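[Figure 1.45: butterfly with inputs A and B, branch gains 1, $W_N^k$ and $-W_N^k$, and outputs $A + W_N^k \cdot B$ and $A - W_N^k \cdot B$.]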
Fig. 1.45: Butterfly diagram.
In each succeeding stage of the algorithm, there are half as many parallel DFT blocks as in the previous stage, but each of these blocks has twice as many inputs and outputs, to which equation (1.148) is applied. Further, due to the circular symmetry of the coefficients $W_N^n = e^{-j2\pi n/N}$ (along the unit circle in the complex plane), only N/2 of the N coefficients in each block differ by more than a sign. For example, when N = 8, the first four coefficients are:
$$W_8^0 = 1, \quad W_8^1 = e^{-j\pi/4}, \quad W_8^2 = e^{-j\pi/2}, \quad W_8^3 = e^{-j3\pi/4} \tag{1.150}$$
while the last four coefficients are negative versions of the first four:
$$\begin{aligned} W_8^4 &= e^{-j\pi} = -1 = -W_8^0 \\ W_8^5 &= e^{-j5\pi/4} = -e^{-j\pi/4} = -W_8^1 \\ W_8^6 &= e^{-j3\pi/2} = -e^{-j\pi/2} = -W_8^2 \\ W_8^7 &= e^{-j7\pi/4} = -e^{-j3\pi/4} = -W_8^3 \end{aligned} \tag{1.151}$$
This property decreases the total number of coefficient calculations, as does the fact that the same coefficient values appear in different stages of the algorithm (e.g. $W_8^2 = W_4^1 = e^{-j\pi/2}$), which makes the FFT algorithm highly efficient. Finally, the implementation of (1.148) in the third stage of the FFT (for N = 8) consists of four butterfly calculations:
$$\begin{aligned} \text{1st:}\quad X(0) &= A^{(3)}(0) + W_8^0 B^{(3)}(0) \\ X(4) &= A^{(3)}(0) + W_8^4 B^{(3)}(0) = A^{(3)}(0) - W_8^0 B^{(3)}(0) \\ \text{2nd:}\quad X(1) &= A^{(3)}(1) + W_8^1 B^{(3)}(1) \\ X(5) &= A^{(3)}(1) + W_8^5 B^{(3)}(1) = A^{(3)}(1) - W_8^1 B^{(3)}(1) \\ \text{3rd:}\quad X(2) &= A^{(3)}(2) + W_8^2 B^{(3)}(2) \\ X(6) &= A^{(3)}(2) + W_8^6 B^{(3)}(2) = A^{(3)}(2) - W_8^2 B^{(3)}(2) \\ \text{4th:}\quad X(3) &= A^{(3)}(3) + W_8^3 B^{(3)}(3) \\ X(7) &= A^{(3)}(3) + W_8^7 B^{(3)}(3) = A^{(3)}(3) - W_8^3 B^{(3)}(3) \end{aligned} \tag{1.152}$$
The diagram of the 8-point FFT is shown in Figure 1.46. The same pattern of butterfly calculations is used in the higher stages of the FFT for greater N (e.g. when N = 256).
[Figure 1.46: signal-flow graph from the inputs x(0), …, x(7) to the outputs X(0), …, X(7) in three stages. Stage 1: 4 × 2-point DFT. Stage 2: 2 × 4-point DFT (coefficients $W_4^0 = 1$, $W_4^1$, $W_4^3 = -W_4^1$), producing A(3)(n) and B(3)(n). Stage 3: 1 × 8-point DFT (coefficients $W_8^0 = 1$, $W_8^1$, $W_8^2$, $W_8^3$ and their negatives, e.g. $W_8^7 = -W_8^3$).]
Fig. 1.46: Diagram of the 3-stage Decimation-In-Time FFT (DIT FFT) algorithm (N = 8).
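The recursion (1.148) translates almost literally into code. The sketch below is a minimal, unoptimised illustration in Python: it recurses down to 1-point blocks (rather than stopping at 2-point blocks as in the text), uses the sign symmetry of the coefficients from (1.151) for the second half of the outputs, and compares the result with the direct evaluation of (1.145). The function names are chosen here for illustration.

```python
import cmath

def dft(x):
    """Direct evaluation of (1.145): about N^2 complex multiplications."""
    N = len(x)
    return [sum(x[i] * cmath.exp(-2j * cmath.pi * n * i / N) for i in range(N))
            for n in range(N)]

def fft(x):
    """Recursive radix-2 decimation-in-time FFT, following (1.148).
    The length of x must be a power of 2."""
    N = len(x)
    if N == 1:
        return list(x)
    if N % 2:
        raise ValueError("length must be a power of 2")
    A = fft(x[0::2])          # N/2-point DFT of the even-indexed samples
    B = fft(x[1::2])          # N/2-point DFT of the odd-indexed samples
    X = [0j] * N
    for n in range(N // 2):
        t = cmath.exp(-2j * cmath.pi * n / N) * B[n]   # W_N^n * B(n)
        X[n] = A[n] + t                 # butterfly, upper output
        X[n + N // 2] = A[n] - t        # W_N^(n + N/2) = -W_N^n
    return X

if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0, 0.0, -1.0, -2.0, -3.0]
    err = max(abs(a - b) for a, b in zip(fft(x), dft(x)))
    print(f"max |FFT - DFT| = {err:.2e}")
```

Each recursion level performs N/2 butterflies and there are log₂ N levels, which is where the O(N⋅log₂ N) operation count comes from.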
The presented FFT algorithm can also be used in the opposite direction, i.e. for the fast calculation of the Inverse DFT (IDFT). In this case, from N discrete spectral values {X(0), X(1),..., X(N–1)} the time sample values {x(0), x(1),..., x(N–1)} can be obtained using the definition of the IDFT:
$$x(n) = \frac{1}{N}\sum_{i=0}^{N-1} X(i)\, e^{jn\frac{2\pi}{N}i} \tag{1.153}$$
Comparing (1.153) with the definition of the DFT given by (1.145), it is easy to show that if the complex conjugate values {X*(0), X*(1),..., X*(N–1)} are brought to the inputs of an FFT module, the conjugated time samples {x*(0), x*(1),..., x*(N–1)} multiplied by N will appear at the output. Namely, (1.153) can be written as:
$$x(n) = \frac{1}{N}\left[\sum_{i=0}^{N-1} X^*(i)\, e^{-jn\frac{2\pi}{N}i}\right]^* \tag{1.154}$$
which is equivalent to:
$$N x^*(n) = \sum_{i=0}^{N-1} X^*(i)\, e^{-jn\frac{2\pi}{N}i} = \mathrm{DFT}\{X^*(n)\} \tag{1.155}$$
i.e.:
$$x(n) = \frac{1}{N}\left[\mathrm{DFT}\{X^*(n)\}\right]^* \tag{1.156}$$
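So, it only remains to conjugate the results of the FFT and divide each of them by N. Used in this way, the same FFT module computes the inverse transform (often called the inverse FFT, IFFT). This should not be confused with the Decimation-In-Frequency FFT algorithm (DIF FFT), which is an alternative decomposition of the forward DFT in which the output (frequency) indices, rather than the input samples, are split into even and odd groups.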
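The conjugation identity (1.156) can be checked numerically. The following sketch is an illustration only: for self-containment it uses a direct DFT as the forward transform, but any forward DFT or FFT routine could be passed in through the (hypothetical) `forward` argument.

```python
import cmath

def dft(x):
    """Forward DFT by direct evaluation of (1.145)."""
    N = len(x)
    return [sum(x[i] * cmath.exp(-2j * cmath.pi * n * i / N) for i in range(N))
            for n in range(N)]

def idft_via_forward(X, forward=dft):
    """Inverse DFT computed with a forward transform, eq. (1.156):
    x(n) = (1/N) * conj( DFT{ conj(X) } )."""
    N = len(X)
    y = forward([complex(v).conjugate() for v in X])   # equals N * conj(x)
    return [v.conjugate() / N for v in y]

if __name__ == "__main__":
    x = [1.0, -2.0, 3.5, 0.25, 0.0, 4.0, -1.0, 2.0]
    x_rec = idft_via_forward(dft(x))
    print(max(abs(a - b) for a, b in zip(x, x_rec)))   # close to zero
```

The design point is that no separate inverse routine is needed: two conjugations and one scaling by 1/N wrap the existing forward transform.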
2 Typical Communications Signals

2.1 Speech

The most significant type of signal in human communication is speech. In order to provide good-quality speech communication between two persons over a longer distance, a communication system (i.e. telephony) has to fulfil some requirements which are derived from the properties of the speech signal. On the other hand, for reasons of efficiency and economy, the system introduces some restrictions which should not significantly affect the quality of the communication. Since every person has different voice characteristics, a proper design of a system for the transmission and/or processing of speech has to consider the speech signal from the statistical point of view. Once produced as a wave of moving air particles, human speech, like any other sound, can be converted by a microphone into the electrical domain and transmitted in real time by a transmission system, or it can be recorded by a voice recorder and processed, i.e. reproduced, in an audio system. No matter whether the application which deals with speech works in real time or off-line, the speech can be treated as any other signal, i.e. as a random process s(t) with its specific statistical properties (see Chapter 3). Observed in this way, s(t) may correspond, for example, to the voltage u(t) or current i(t) of the electrical signal from a microphone, while the power of the speech signal, defined as p(t) = s²(t), may correspond to the power of the electrical signal delivered to the input resistance R of an operational amplifier, i.e. p(t) = u²(t)/R = i²(t)R.
2.1.1 Production and Modelling

The speech signal is created at the vocal cords, which vibrate as the air flows from the lungs toward the vocal tract; produced at the speaker’s mouth, it reaches the listener’s ear or a microphone as a pressure wave. Consequently, the properties of the produced sound depend on the physiology and the volume of the vocal tract as well as on the way the speaker pronounces words (which is also significantly determined by the person’s vocal tract physiology). Since the vocal tract is a cavity (between the vocal cords and the lips), it spectrally shapes the periodic input from the vocal cords in the same manner as the resonator cavity of a musical wind instrument gives the instrument its characteristic timbre. There are two major classes of speech sounds: vowels and consonants. In vowel production the air flow causes a periodic vibration of the vocal cords, and the fundamental frequency of these vibrations is known as the pitch of the sound. The pitch (also denoted as F0) is followed by 4–5 higher frequencies called formants (F1, F2, F3, F4, F5). The typical pitch of male speech (shown in Fig. 2.1) is ~ 85–155 Hz and that of female speech is ~ 165–255 Hz. While the vowels are always “voiced”, i.e. created by a periodic source (vibrations of the vocal cords), consonants may also be “unvoiced”, i.e. produced by an aperiodic source (of a noisy or impulsive nature); in the word “shop”, for example, the consonants “sh” and “p” are generated from a noisy, i.e. impulsive, source, while the source of the vowel “o” is periodic.
Fig. 2.1: Time representation of a pitch of male speech.
The basic sounds of speech are called phonemes. A phoneme is any of the perceptually distinct units of sound in a specified language that distinguish one word from another, for example “b”, “d”, “p” and “t” in the English words “bad” and “pat”, or “a” in the word “March”. A typical speech utterance is a series of vowel and consonant phonemes whose spectral and temporal characteristics vary with time. The way of speech production described above (glottal airflow as input and vocal tract airflow as output) can be approximated by a linear system (filter) with resonant frequencies (formants) which change with different configurations of the vocal tract (corresponding to different resonant cavities and thus to different phonemes). In general, the frequencies of the formants decrease as the length (i.e. volume) of the vocal tract increases, so an average male speaker has lower formants than a female, and a female has lower formants than a child. Under the assumption of linearity and time invariance (LTI assumption, see Chapter 1) of the vocal tract during speech production, the resulting speech signal s(t) can be represented as the convolution of the respective excitation signal e(t) and the impulse response h(t) of the vocal tract filter:
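$$s(t) = e(t) * h(t) = \int_0^t e(\tau)\, h(t-\tau)\, d\tau \tag{2.1}$$
or, in the frequency domain, as the product of the spectral characteristic (Fourier transform) of the excitation E(ω) and the transfer function H(ω) of the filter (as a linear system):
$$S(\omega) = E(\omega) H(\omega) \tag{2.2}$$
[Figure 2.2: amplitude versus ω, showing the excitation spectrum E(ω) with the pitch F0 and the harmonics ω1, ω2, ω3, ω4, …, and the vocal tract characteristic H(ω) with the formants F1, F2, …, FM.]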
Fig. 2.2: Glottal source harmonics, vocal tract formants and spectral envelope
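The source-filter relations (2.1) and (2.2) can be tried out in discrete time. The sketch below is a toy illustration under stated assumptions: the excitation is an ideal impulse train at an assumed pitch of 100 Hz, the vocal tract is replaced by a single damped resonance at 500 Hz, and the sampling rate is 8 kHz. None of these values come from the text. The code convolves the two sequences and checks that the DFT of the result equals the product of the individual DFTs.

```python
import cmath
import math

FS = 8000   # sampling rate in Hz (assumed)
F0 = 100    # pitch of the excitation in Hz (assumed)

def impulse_train(n_samples, period):
    """Idealised voiced excitation e(n): one unit impulse per pitch period."""
    return [1.0 if n % period == 0 else 0.0 for n in range(n_samples)]

def resonator(n_samples, f_res, decay):
    """Toy vocal tract impulse response h(n): one damped resonance."""
    return [math.exp(-k / decay) * math.sin(2 * math.pi * f_res * k / FS)
            for k in range(n_samples)]

def convolve(e, h):
    """Discrete counterpart of the convolution integral (2.1)."""
    s = [0.0] * (len(e) + len(h) - 1)
    for i, ei in enumerate(e):
        if ei == 0.0:
            continue
        for k, hk in enumerate(h):
            s[i + k] += ei * hk
    return s

def dft(x, length):
    """DFT of x zero-padded to `length` samples."""
    padded = list(x) + [0.0] * (length - len(x))
    return [sum(v * cmath.exp(-2j * cmath.pi * n * i / length)
                for i, v in enumerate(padded)) for n in range(length)]

e = impulse_train(160, FS // F0)      # two pitch periods
h = resonator(64, 500.0, 16.0)
s = convolve(e, h)
L = len(s)
S, E, H = dft(s, L), dft(e, L), dft(h, L)
max_err = max(abs(S[n] - E[n] * H[n]) for n in range(L))   # checks (2.2)

if __name__ == "__main__":
    print(f"max |S - E*H| = {max_err:.2e}")
```

Zero-padding all three sequences to the length of the convolution result is what makes the product of DFTs correspond to the linear (not circular) convolution.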
Figure 2.2 illustrates the relation between the glottal source harmonics at frequencies ω1, ω2,…, ωN (and the pitch F0) and the resulting spectral envelope S(ω) with the characteristic formant areas around the frequencies F1, F2,…, FM (M is typically 4 or 5). On the other hand, this simplified model with the LTI assumption (which is correct for steady vowels) does not represent the speech signal properly in general, since the source is not strictly time invariant and the system (vocal tract) can also interact with the “source signal” in a complex and nonlinear way. As a result, the spectrum of a speech signal contains many frequencies and these components change continuously with time, i.e. the duration of a certain spectral component may be as short as 10–30 ms. This non-stationary nature of speech can be mathematically represented (approximated) as:
$$s(t) = \sum_{k=1}^{K(t)} A_k(t) \cos\big(\omega_k(t) + \varphi_k(t)\big) \tag{2.3}$$
The spectrum of a real speech signal for a sustained voiced phoneme (vowel) is shown in Figure 2.3 (obtained using the Sound Forge application), where the formant areas can be noticed.
[Figure 2.3: magnitude spectrum in dB (axis from −47 to −143 dB) versus frequency from 58 Hz to about 6 kHz; the pitch is marked at 107 Hz.]
Fig. 2.3: The spectrum of “Aaaaa”.
Although non-stationary, speech is often considered as a random process which is stationary within short time intervals (~20 ms). In analysis, a speech signal can be divided into groups of short sound segments (taken equidistantly) which have some common acoustic properties, and in this sense it may be considered as a cyclostationary, i.e. quasi-periodic, process. A typical speech signal in the time domain (also obtained in Sound Forge) is shown in Figure 2.4.
[Figure 2.4: waveform with amplitude between −0.1 and 0.1 versus time t from 0 to 200 ms.]
Fig. 2.4: Real voiced sound (“Aaaaa”): 200 ms, 8820 samples.
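The short-time view of speech mentioned above (segments of about 20 ms) amounts to cutting the sample sequence into frames. The sketch below is a minimal illustration: the 200 ms and 8820 samples of Figure 2.4 imply a sampling rate of 44.1 kHz, and the function names and the use of non-overlapping frames are assumptions made here (practical systems often use overlapping, windowed frames).

```python
def frame_signal(samples, fs_hz, frame_ms=20.0):
    """Split `samples` into consecutive, non-overlapping frames of
    `frame_ms` milliseconds; an incomplete last frame is dropped."""
    frame_len = int(round(fs_hz * frame_ms / 1000.0))
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, frame_len)]

def short_time_power(frames):
    """Average power of each frame (mean of the squared samples)."""
    return [sum(v * v for v in frame) / len(frame) for frame in frames]

if __name__ == "__main__":
    fs = 8820 / 0.200                 # sampling rate implied by Figure 2.4
    signal = [0.0] * 8820             # placeholder for the recorded samples
    frames = frame_signal(signal, fs)
    print(len(frames), "frames of", len(frames[0]), "samples")
```

With these numbers each 20 ms frame holds 882 samples, so the 200 ms recording yields 10 frames.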
2.1.2 Speech Channel Bandwidth and Power of Speech Signal

The spectrum of a speech signal occupies the frequency range from under 100 Hz to as much as 12 kHz, but the vast majority of the power is concentrated between 300 and 3400 Hz (as shown in Figure 2.5). It is considered that the range of 300–3400 Hz contains a sufficient quantity of information for the reconstruction of a speech signal, so that in classical communication systems the bandwidth reserved for speech transmission (also known as a telephony channel) is limited to 3.1 kHz.
Fig. 2.5: Power spectrum of an average speech signal.
The attenuation characteristics of a real and an ideal telephone channel are shown in Figure 2.6. The speech signal is theoretically undistorted if all spectral components of the signal are within the provided bandwidth and if the magnitude of the signal is less than the maximum allowed by the system.
Fig. 2.6: Frequency characteristics of a) real; b) ideal telephone channel.
The value of 3.1 kHz has not been taken as the standard bandwidth of the speech (telephony) channel by chance. From the very beginning of the development of telephony, thorough investigations and experiments were performed regarding the influence of the bandwidth on the characteristics of a reconstructed speech signal. These experiments included a great number of speakers and listeners, in order to obtain statistically valid results. The main characteristics investigated were the average spectral power density and the articulation of a speech signal. The spectrum of a speech signal changes, as mentioned, within very short time intervals, and its nature varies between discrete and continuous. A discrete (harmonic) spectrum corresponds to voiced sounds such as vowels, which are produced by a periodic source, and a continuous one to unvoiced consonants, which are produced by a noisy source. An analytical expression for the average power spectrum (spectral power density) of speech can be found in [GaJo03] as an acceptable approximation of the characteristics shown in Figure 2.5:
$$S(f) = \overline{s^2}\,\frac{f_1}{\pi}\left[\frac{1}{f_1^2 + (f_2 + f)^2} + \frac{1}{f_1^2 + (f_2 - f)^2}\right] \tag{2.4}$$
where f1 = 181.5 Hz, f2 = 1475 Hz and $\overline{s^2}$ is the average power of a speech signal. The proposed expression is often used in the analysis and design of systems for speech processing. The average power of a speech signal in a time interval T is defined by:
$$P_0 = \frac{1}{T}\int_0^T p(t)\, dt \tag{2.5}$$
where p(t) is the current (instantaneous) power of the signal. If the interval T is long enough, the average power P0 from (2.5) can be equated with $\overline{s^2}$ from (2.4). Besides the average power, the peak power Pε is also a significant characteristic of speech. The peak power is defined via a parameter ε, so that during ε % of the time (within an interval T) the current power p(t) is above the value Pε, i.e.:
$$\varepsilon\,[\%] = \frac{100}{T}\sum_i \tau_i \,\Big|_{\,p(t) \ge P_\varepsilon} \tag{2.6}$$
Fig. 2.7: Peak power of a speech signal.
The ratio between the peak and the average power of a speech signal is called the peak factor νε:
$$\nu_\varepsilon = 10 \log \frac{P_\varepsilon}{P_0} \tag{2.7}$$
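For sampled signals, the definitions (2.5)–(2.7) have simple discrete counterparts: the average power becomes the mean of the squared samples and Pε becomes the level exceeded by ε % of the squared samples. The sketch below illustrates this under those assumptions; the function names and the synthetic test signal are hypothetical and only serve as an example.

```python
import math
import random

def average_power(s):
    """Discrete counterpart of (2.5): mean of the instantaneous power s^2."""
    return sum(v * v for v in s) / len(s)

def peak_power(s, eps_percent):
    """Level P_eps reached or exceeded by the instantaneous power
    in eps_percent % of the samples, cf. (2.6)."""
    p = sorted((v * v for v in s), reverse=True)
    k = max(1, int(round(len(p) * eps_percent / 100.0)))
    return p[k - 1]

def peak_factor_db(s, eps_percent):
    """Peak factor (2.7) in dB."""
    return 10.0 * math.log10(peak_power(s, eps_percent) / average_power(s))

if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic, Laplace-like samples standing in for recorded speech
    s = [rng.choice((-1.0, 1.0)) * rng.expovariate(1.0) for _ in range(10000)]
    for eps in (10.0, 1.0, 0.1):
        print(f"eps = {eps:4.1f} %: peak factor = {peak_factor_db(s, eps):.1f} dB")
```

The smaller ε is chosen, the rarer the power peaks that define Pε and the larger the resulting peak factor.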
When a set of many consecutive short intervals T is observed, the maximal average power during the utterance of a syllable (phoneme) can be found, again by the use of (2.5). This maximal value varies for different syllables in a person's speech, and the dynamics, i.e. the logarithmic power ratio between the most and the least powerful syllables, can be as high as 70 dB. For an average person the dynamics of speech is about 40 dB, and this is also an important parameter for the design of any system which deals with speech processing and transmission.
2.1.3 Current Values of Speech Signal

The current (instantaneous) values of a speech signal at adjacent instants of time are not mutually independent, and their relation can be described by an empirical expression for the autocorrelation function of a speech signal:
$$R(\tau) = \overline{s^2}\, e^{-2\pi f_1 |\tau|} \cos(2\pi f_2 \tau) \tag{2.8}$$
where f1 = 181.5 Hz, f2 = 475 Hz and $\overline{s^2}$ is the average power of a speech signal. It can be concluded from Figure 2.8, which shows the normalized autocorrelation function of a continual speech signal (i.e. without breaks in speech), that the current values are correlated only within a narrow interval of ~ 0.5 ms.
Fig. 2.8: Autocorrelation function of a speech signal.
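The autocorrelation function (2.8) and the power spectrum (2.4) describe the same average power: R(0) equals the mean square value, and so does the integral of S(f) over f ≥ 0. The sketch below checks this numerically, assuming a decaying exponential exp(−2πf1|τ|) in (2.8). Since the text quotes f2 = 1475 Hz with (2.4) and f2 = 475 Hz with (2.8), f2 is left as a parameter; the function names, the integration limit and the step size are choices made here.

```python
import math

F1 = 181.5   # Hz, as given in the text

def autocorr(tau, f2, s2=1.0):
    """Empirical autocorrelation function of speech, cf. (2.8)."""
    return s2 * math.exp(-2.0 * math.pi * F1 * abs(tau)) * math.cos(2.0 * math.pi * f2 * tau)

def psd(f, f2, s2=1.0):
    """Average power spectrum of speech, cf. (2.4), for f >= 0."""
    return s2 * F1 / math.pi * (1.0 / (F1 ** 2 + (f2 + f) ** 2)
                                + 1.0 / (F1 ** 2 + (f2 - f) ** 2))

def integrate_psd(f2, f_max=2.0e5, df=2.0):
    """Trapezoidal integral of the spectrum from 0 to f_max (in Hz)."""
    n = int(f_max / df)
    total = 0.5 * (psd(0.0, f2) + psd(f_max, f2))
    for k in range(1, n):
        total += psd(k * df, f2)
    return total * df

if __name__ == "__main__":
    for f2 in (475.0, 1475.0):
        print(f"f2 = {f2:6.1f} Hz: R(0) = {autocorr(0.0, f2):.3f}, "
              f"integral of S(f) = {integrate_psd(f2):.4f}")
```

The integral comes out slightly below 1 because the slowly decaying tail above `f_max` is cut off.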
The distribution (i.e. the probability density function pdf, see Chapter 3) of the current values can be approximated by the Gamma distribution:
$$pdf(s) = \sqrt{\frac{\sqrt{3}}{8\pi\sigma_s |s|}}\; e^{-\frac{\sqrt{3}\,|s|}{2\sigma_s}} \tag{2.9}$$
where s is the current value of a speech signal and σs is the standard deviation. A simplified approximation is given by the Laplace probability density function:
$$pdf(s) = \frac{1}{\sqrt{2}\,\sigma_s}\; e^{-\frac{\sqrt{2}\,|s|}{\sigma_s}} \tag{2.10}$$
Both the Gamma and the Laplace distributions are shown in Figure 2.9.
[Figure 2.9: pdf on a logarithmic axis (10⁻⁴ to 10⁻¹) versus s/σs from −5 to 5. Curves: Laplace distribution, Gamma distribution and distribution of speech signal.]
Fig. 2.9: Probability density function of speech signal.
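A quick numerical check can confirm that the two densities, in the form commonly used for speech (with square roots and |s|, which is an assumption about the notation of (2.9) and (2.10)), integrate to 1 and have the variance σs². The sketch below uses the substitution s = σ·u², which removes the integrable singularity of the Gamma density at s = 0; the function names and the integration parameters are illustrative choices.

```python
import math

def laplace_pdf(s, sigma):
    """Laplace density, cf. (2.10)."""
    return math.exp(-math.sqrt(2.0) * abs(s) / sigma) / (math.sqrt(2.0) * sigma)

def gamma_pdf(s, sigma):
    """Gamma density for speech amplitudes, cf. (2.9); undefined at s = 0."""
    a = abs(s)
    return (math.sqrt(math.sqrt(3.0) / (8.0 * math.pi * sigma * a))
            * math.exp(-math.sqrt(3.0) * a / (2.0 * sigma)))

def moment(pdf, sigma, order, u_max=8.0, steps=20000):
    """Integral of |s|^order * pdf(s) over all s (midpoint rule).
    Uses s = sigma * u^2, ds = 2 * sigma * u du, and the symmetry of the pdf."""
    du = u_max / steps
    total = 0.0
    for k in range(steps):
        u = (k + 0.5) * du
        s = sigma * u * u
        total += (s ** order) * pdf(s, sigma) * 2.0 * sigma * u
    return 2.0 * total * du   # factor 2: negative and positive half-axes

if __name__ == "__main__":
    sigma = 1.0
    for name, pdf in (("Laplace", laplace_pdf), ("Gamma", gamma_pdf)):
        print(f"{name:8s} area = {moment(pdf, sigma, 0):.6f}, "
              f"variance = {moment(pdf, sigma, 2):.6f}")
```

Both densities share the same variance but the Gamma density is more sharply peaked around zero, consistent with the comparison in Figure 2.9.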
2.1.4 Coding and Compression

Speech communication is a relatively inefficient way of communication, since it includes many breaks (pauses). Firstly, in a common conversation between two people, while one person talks the other one listens; so, on average, the communication channel is used for half of the total time in each direction. Secondly, there are pauses between words, when the speaker takes a breath or when the lips are closed. And finally, when the speaker is preparing a new sound (phoneme) which is going to be pronounced, there are pauses known as occlusions. This happens because the vocal tract takes different positions while uttering vowels and consonants, and this transition takes some time. The duration of occlusions is different for different phoneme transitions, but in most cases it is between 60 and 100 ms. In communications, the speech activity coefficient is defined as the fraction of time during which the speech channel is really active. It has been found experimentally that the average speech activity is 3/8. This “inefficiency” is well exploited in communication and audio systems (an approach also known as speech interpolation), since it leaves plenty of room for the implementation of various techniques for speech compression or for the simultaneous transmission of many speech signals over a limited bandwidth. In the case of the “classic” digital transmission of a speech signal in telephone systems, the standard Pulse Code Modulation (PCM) technique is used (see Chapter 6), which assumes sampling of the analog speech signal every 125 µs, i.e. at the rate of 8000 samples/s. The samples are digitized using 8-bit A/D conversion, which gives a digital speech signal with the bit rate of 64 kbit/s and with 2⁸ = 256 amplitude quantization levels. Since human hearing perceives the amplitude variations of a louder sound less readily than the amplitude variations of a less intense sound, the 256 amplitude quantization levels are not uniformly distributed along the amplitude, i.e. dynamic, range. Instead, the coding is realized in such a way that the finest resolution is in the amplitude range closely above and below zero, while in the areas with greater amplitudes (both positive and negative) the distribution of quantization levels is coarser. Such a transformation from a continuous amplitude range to the “constricted” range of discrete quantization levels (i.e. 8-bit words) is achieved by applying a logarithmic companding function to the amplitude samples before A/D conversion in a separate device (compander). The other way is to use a 12-bit A/D converter and an appropriate mapping of the obtained 12-bit words into 8-bit words. The well-known and so far the most used standard in digital telephony for the encoding of the speech signal, the ITU-T recommendation G.711 (see Chapter 6), provides two options for the companding function: the so-called A-law and µ-law. The µ-law is used in North America and Japan, while the A-law is used in the rest of the world. These functions are defined as follows:
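$$Y_{A\text{-law}}(s) = \begin{cases} \operatorname{sgn}(s)\,\dfrac{A\,|s|}{1 + \ln(A)}, & \text{for } |s| < \dfrac{1}{A} \\[2ex] \operatorname{sgn}(s)\,\dfrac{1 + \ln(A\,|s|)}{1 + \ln(A)}, & \text{for } \dfrac{1}{A} \le |s| \le 1 \end{cases} \tag{2.11}$$
$$Y_{\mu\text{-law}}(s) = \operatorname{sgn}(s)\,\frac{\ln(1 + \mu\,|s|)}{\ln(1 + \mu)} \tag{2.12}$$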
where s is the amplitude of a speech sample. The A-law and µ-law companding functions are shown in Figure 2.10 for the standard values of the parameters: A = 87.6 and µ = 255.
[Figure 2.10: companding characteristics for input and output amplitudes between −1 and 1. Curves: A-law (A = 87.6) and µ-law (µ = 255).]
Fig. 2.10: A-law and µ-law companding functions.
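The companding functions (2.11) and (2.12) are easy to evaluate for normalized samples |s| ≤ 1. The sketch below is a minimal illustration of the continuous characteristics only; it is not the segmented 8-bit encoding tables of G.711. The expander `mu_law_expand`, obtained by inverting (2.12), and all function names are additions made here for illustration.

```python
import math

def a_law(s, A=87.6):
    """A-law compression characteristic, eq. (2.11), for |s| <= 1."""
    x = abs(s)
    if x < 1.0 / A:
        y = A * x / (1.0 + math.log(A))
    else:
        y = (1.0 + math.log(A * x)) / (1.0 + math.log(A))
    return math.copysign(y, s)

def mu_law(s, mu=255.0):
    """mu-law compression characteristic, eq. (2.12), for |s| <= 1."""
    return math.copysign(math.log(1.0 + mu * abs(s)) / math.log(1.0 + mu), s)

def mu_law_expand(y, mu=255.0):
    """Inverse of mu_law (expander), obtained by solving (2.12) for s."""
    return math.copysign(((1.0 + mu) ** abs(y) - 1.0) / mu, y)

if __name__ == "__main__":
    for s in (0.001, 0.01, 0.1, 0.5, 1.0):
        print(f"s = {s:5.3f}: A-law = {a_law(s):.4f}, mu-law = {mu_law(s):.4f}")
```

Small amplitudes are strongly expanded (for example, an input of 0.01 is mapped to roughly 0.16 by the A-law), which is how the uniform 8-bit quantizer that follows obtains a finer effective resolution near zero.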
As can be seen, both functions are very similar to each other; while the A-law function has a slightly smaller dynamic range, the minimal non-zero sample value that can be coded is smaller for the µ-law than for the A-law. The described treatment of the sampled speech signal lowers the signal to quantization noise ratio on the one hand, while on the other hand the dynamic range of the digitized signal is significantly smaller, as is the number of bits necessary for coding. In other words, the compression of speech is achieved using only 8 bits per sample (so the necessary bandwidth is also smaller) but with the effectiveness of 12-bit resolution (for smaller amplitudes), and the price is a certain amount of distortion of the speech signal (to which the human auditory system is less sensitive anyway). A greater level of speech compression can be achieved by encoding the difference between the amplitudes of adjacent samples, in which case fewer than 8 bits per “sample difference” are used. There are a few variants of this compression technique, known as Differential PCM (DPCM), which differ in the number of bits used for encoding. The simplest version is Delta-Modulation (see Chapter 5), where only one bit is used, i.e. where a positive difference is encoded with “1” and a negative one with “0”. In all DPCM variants the penalty for the increased compression is a lower quality and/or intelligibility of the reconstructed speech: the fewer bits are used, the poorer the quality.
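The one-bit case, delta modulation, is simple enough to sketch completely. In the example below, the difference is taken between the current sample and the decoder's running approximation (a common practical form of the scheme), and the fixed step size is an assumed parameter that is not specified in the text.

```python
import math

def dm_encode(x, step):
    """Delta modulation: emit 1 if the sample is at or above the running
    approximation (positive difference), otherwise 0."""
    bits, approx = [], 0.0
    for v in x:
        bit = 1 if v >= approx else 0
        approx += step if bit else -step   # track what the decoder will see
        bits.append(bit)
    return bits

def dm_decode(bits, step):
    """Rebuild the staircase approximation from the bit stream."""
    out, approx = [], 0.0
    for b in bits:
        approx += step if b else -step
        out.append(approx)
    return out

if __name__ == "__main__":
    fs, f, amp, step = 8000, 100.0, 0.5, 0.05   # illustrative values
    x = [amp * math.sin(2 * math.pi * f * n / fs) for n in range(800)]
    y = dm_decode(dm_encode(x, step), step)
    print("max reconstruction error:", max(abs(a - b) for a, b in zip(x, y)))
```

As long as the signal changes by less than one step per sample, the reconstruction error stays within one step size; a faster signal would cause slope overload, which is one reason why multi-bit DPCM variants give better quality.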
Adaptive DPCM (ADPCM) also belongs to the group of differential speech compression techniques, and it exploits the earlier mentioned autocorrelation properties of speech. Namely, since each new PCM sample is partly correlated to a few previous samples, these samples can be used for the prediction of the amplitude of the new sample, and the difference (i.e. error) between the real sample and its predicted value, which is then coded, is smaller. With the use of prediction, the dynamic range of the encoded differences is smaller than that of “pure DPCM”, so ADPCM gives better results, i.e. the quality of the reconstructed speech is better for the same number of bits used for encoding. The bit rate of (A)DPCM coded speech also depends on the sample rate and the number of bits used for coding. For example, a series of 4-bit ADPCM samples with the sampling rate of 8 kHz gives the bit rate of 32 kbit/s, while 2-bit encoding gives 16 kbit/s, so one standard PCM 64 kbit/s speech channel can carry two 4-bit or four 2-bit ADPCM channels. The 2-, 3-, 4- and 5-bit ADPCM signals with the bit rates of 16, 24, 32 and 40 kbit/s are defined in the recommendation ITU-T G.726 [ITUG726]. There is also the ITU-T G.722 standard [ITU-G722] (or G.722 codec or “split-band ADPCM”), which defines ADPCM signals at 48, 56 and 64 kbit/s; here the analog speech signal is divided by filtering into two sub-channels (a low- and a high-frequency band) before sampling, then each sub-channel is separately digitized using ADPCM, and finally the two resulting bit-streams are multiplexed into one. The speech compression methods mentioned above (PCM, ADPCM) belong to the so-called waveform-based techniques, since they aim to remove the redundancy in the waveform of a speech signal. As already shown, their implementation is relatively simple, but their compression ratios are low. Further, for bit rates lower than 16 kbit/s the level of quantization errors is too high and the speech quality is poor.
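The bit-rate arithmetic in this paragraph is simply the sampling rate multiplied by the number of bits per sample. A short sketch (with hypothetical names) reproduces the quoted figures:

```python
def bit_rate(fs_hz, bits_per_sample):
    """Bit rate in bit/s of a stream with a fixed number of bits per sample."""
    return fs_hz * bits_per_sample

FS = 8000                                   # samples per second (telephony)
PCM_RATE = bit_rate(FS, 8)                  # standard PCM channel
ADPCM_RATES = {bits: bit_rate(FS, bits) for bits in (2, 3, 4, 5)}
# Number of ADPCM channels that fit into one PCM channel
CHANNELS_PER_PCM = {bits: PCM_RATE // rate for bits, rate in ADPCM_RATES.items()}

if __name__ == "__main__":
    print("PCM:", PCM_RATE // 1000, "kbit/s")
    for bits, rate in ADPCM_RATES.items():
        print(f"{bits}-bit ADPCM: {rate // 1000} kbit/s, "
              f"{CHANNELS_PER_PCM[bits]} channel(s) per 64 kbit/s")
```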
The other two groups of speech compression methods are the parametric-based and the hybrid coding techniques. The parametric-based methods rely on a model of the production of speech in the vocal tract of a speaker, i.e. they are based on the parameters of speech in short time intervals (with durations of about 20 ms) during which these parameters are stable and the speech signal can be taken as stationary. During these intervals (speech segments), the speech signal is characterised either as voiced or unvoiced and the vocal tract is represented as a digital filter (under the previously mentioned LTI assumption). Instead of encoding the sample amplitudes (or their differences), a stable segment of speech is first analysed in order to extract its source parameters, such as its nature (voiced or unvoiced), the pitch period and the signal energy (gain), and then the coefficients of a digital filter which simulates the vocal tract cavity. After that, the obtained parameters are encoded and the resulting bit stream is sent to the channel. At the receiving side the original speech is reconstructed (synthesized) on the basis of the received information as the output of the digital filter applied to the excitation sequence (which is also generated from the received information on e.g. pitch period and gain). The parametric-based methods (i.e. codecs) give a low quality of synthesized speech and their implementation is more complex, but the intelligibility is relatively good and the compression ratio is better than in the case of ADPCM. A typical representative of the parametric-based voice encoders (vocoders) is the Linear Prediction Coding (LPC) vocoder [JaNo84][Deetal93], which gives bit rates from 1.2 to 4.8 kbit/s. Since parametric-based vocoders provide a low quality of synthesized speech, due to the very small set of simple parameters which are coded and transmitted (voiced/unvoiced, pitch period etc.), an enhancement is achieved by combining parametric- and waveform-based methods. The difference made by these hybrid coding techniques is that the information on the source (excitation) of a voiced segment (that is to be coded) is no longer just the pitch period. Instead, a “waveform-like” excitation signal is used, such as “multi-pulse excitation” or “codebook excitation”. The most prominent of the hybrid coding techniques is Code-Excited Linear Prediction (CELP) [ScAt84], which gives a satisfactory quality of speech, in proportion to the bit rate, in the range from 4.8 to 16 kbit/s. CELP codecs are used in mobile, wireless and satellite communications, and most modern speech codecs are based on CELP, such as the G.729 codec [ITU-G729], the G.723.1 codec [ITU-G723.1], the Adaptive Multi-Rate audio codec (AMR) [RFC4867] or the SILK codec [IETF09] developed by Skype Ltd. Among the many speech codecs, the Internet Low Bitrate Codec (iLBC) [Anetal02] is very popular in Voice over IP (VoIP) applications.
It is a version of linear predictive coding that offers a choice between speech segments of 20 ms at the bit rate of 15.2 kbit/s and speech segments of 30 ms at the bit rate of 13.3 kbit/s. All the speech compression methods mentioned above, from PCM to the different CELP variants, are aimed at the standard 4 kHz wide speech signal with the sampling rate of 8 kHz. In mobile and VoIP applications, compression techniques for the so-called Wideband (WB) speech, with the frequency range of 0–7 kHz and the sampling rate of 16 kHz, are increasingly used because of their higher fidelity. At the moment, there are three compression methods for WB speech:
– Waveform compression based on sub-band (SB) ADPCM (e.g. G.722 [ITU-G722])
– Hybrid compression based on CELP (e.g. G.722.2 or AMR-WB [ITU-G722.2])
– Transform compression coding (e.g. G.722.1 [ITU-G722.1])
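The analysis step of the parametric (LPC-type) coders described in this section, i.e. estimating the coefficients of a filter that models one stationary segment, can be sketched as follows. This is only an illustration under stated assumptions: it uses the autocorrelation method with the Levinson-Durbin recursion (a common textbook choice, not necessarily what a particular vocoder uses), and the "speech" is a synthetic second-order autoregressive signal with known coefficients, so that the estimate can be checked. All names are hypothetical.

```python
import random

def synth_ar2(n, a1=1.3, a2=-0.6, seed=1):
    """Synthetic test signal x[n] = a1*x[n-1] + a2*x[n-2] + white noise."""
    rng = random.Random(seed)
    x = [0.0, 0.0]
    for _ in range(n):
        x.append(a1 * x[-1] + a2 * x[-2] + rng.gauss(0.0, 1.0))
    return x[2:]

def autocorr(x, max_lag):
    """Biased autocorrelation estimates r[0..max_lag] of one segment."""
    N = len(x)
    return [sum(x[n] * x[n - k] for n in range(k, N)) / N
            for k in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve the normal equations of linear prediction.
    Returns (a, err) where x[n] is predicted as sum_k a[k-1] * x[n-k]
    and err is the remaining prediction error power."""
    a = [0.0] * (order + 1)
    err = r[0]
    for m in range(1, order + 1):
        acc = r[m] - sum(a[k] * r[m - k] for k in range(1, m))
        k_m = acc / err                      # reflection coefficient
        new_a = a[:]
        new_a[m] = k_m
        for k in range(1, m):
            new_a[k] = a[k] - k_m * a[m - k]
        a = new_a
        err *= (1.0 - k_m * k_m)
    return a[1:], err

if __name__ == "__main__":
    x = synth_ar2(20000)
    coeffs, err = levinson_durbin(autocorr(x, 2), 2)
    print("estimated predictor coefficients:", [round(c, 3) for c in coeffs])
    print("prediction error power:", round(err, 3))
```

In a vocoder, only a handful of such coefficients per segment (plus gain, pitch and the voiced/unvoiced decision) would be quantized and transmitted instead of the samples themselves, which is where the low bit rates come from.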
2.2 Audio1 Sound was the first means of human communication and to this day it remains the basic and the most powerful way for direct transmission of information in everyday life, conveying not only the bare meaning but also emotions. At its primary level, sound communication pertains to human voice (speech), but more generally there are other types of sounds (such as music) that need to be transmitted from the sound source to the listener. Audio communications present any form of transmission of the sound information that is based on hearing. Audio systems provide means for such a transmission, i.e. they represent an interface between a source of the sound information and a listener who is dislocated from it either in space or in time, or both [Mij11]. Audio processing, recording and transmission have been part of communication and entertainment systems for more than a century [Spetal07]. From the very early stages of this field, bandwidth issues associated with audio recording, transmission, and storage occupied engineers. Historically, the era of audio communications started at the end of nineteen century with the inventions of the telephone by A. G. Bell and the phonograph by T.A. Edison. These technological breakthroughs have laid foundations of the entire field and attested that an audio channel could indeed be recorded and transmitted as a continuous mechanical, magnetic, optical, or electrical analogue of the sound vibrations – the principle that is still almost universally employed in telephony, broadcast, and audio recording. On the other hand, the advent of new technologies introduced a multitude of problems: inadequate recording sensitivity and playback volume, noise, distortion, speed irregularities (such as wow and flutter), nonlinear and limited frequency response, media wear, etc. Although most of these problems were tackled and to some degree ameliorated in decades that followed, the true solution had to wait for the invention of the digital audio. 
This new era effectively began in 1982 with the appearance of the first audio compact disc (CD). Since then, efforts have focused on reducing the required data rate (audio coding) and on extending the realm of audio from single isolated channels to full three-dimensional (spatial audio) systems. In spite of the progress of digitalization, some parts of the audio chain are still analog, e.g. transducers such as microphones and loudspeakers, and the associated interface electronics [Ros07]. Ultimately, complete digitalization is precluded by the analog nature of the human ear itself (although the first steps toward a direct interface to the human nervous system have already been made in some hearing aid devices). The requirements of novel communication systems and technologies continuously influence the development of audio systems.
1 Contributed by: Iva Salom and Dejan Todorovic
70 Typical Communications Signals
A completely different field of communications uses sound as an information carrier, with digital data encoded in the properties of the transmitted sound. Numerous methods and protocols have been designed for this type of transmission; they depend significantly on the frequency range used and on the medium through which the sound signal travels.
2.2.1 Sound and Human Auditory System
By its physical definition, sound is a phenomenon represented by mechanical vibrations transmitted through an elastic medium. The other, perceptual definition is that sound is an auditory sensation in the ear. This definition does not include the variety of sounds that cannot be perceived. On the other hand, in audio communications, where the sound information is processed and transmitted to the listener, the human ear is the information receiver and the main measurement instrument. Thus, in this context it is usually assumed that the sound is transmitted through air. Sound waves in air (and in all fluids) are longitudinal waves, meaning that the oscillatory movements of the air particles are along the direction of wave propagation. Various parameters can be used to characterize the sound. In acoustics (the science of sound), sound waves are most commonly described by the pressure variations, called acoustic or sound pressure $p_a$, expressed in pascals (Pa). Because of the wide range of the pressure stimuli and the very large dynamic range of the human ear, the sound pressure is expressed on a logarithmic decibel scale (dB). The sound pressure level is defined as:

$$\mathrm{SPL\ (dB)} = 20 \log_{10}\left(\frac{p_a}{p_{\mathrm{ref}}}\right) \tag{2.13}$$

where $p_a$ is the acoustic pressure (the root mean square value) and $p_{\mathrm{ref}} = 2 \cdot 10^{-5}$ Pa is the reference acoustic pressure, defined as the threshold of hearing in the most sensitive region of a healthy human ear [Kut09]. The hearing process, which is carried out by the human auditory system, involves the transformation of the physical characteristics of sound into their perceptual attributes and the creation of a complex subjective sensation, called the sound image. The hearing process is presented in the block diagram in Figure 2.11 [Mij11].
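The sound pressure level definition can be tried out numerically. The following is a minimal Python sketch, assuming the commonly used reference pressure of 20 µPa for sound in air; the function name `spl_db` and the sample pressures are illustrative choices, not taken from the text.

```python
import math

# Reference RMS pressure in pascals (20 micropascals), the usual value for sound in air.
P_REF = 20e-6


def spl_db(p_rms: float) -> float:
    """Return the sound pressure level in dB for an RMS pressure in pascals."""
    if p_rms <= 0:
        raise ValueError("RMS pressure must be positive")
    return 20.0 * math.log10(p_rms / P_REF)


# Illustrative pressures spanning the range from the hearing threshold upwards.
for p in (20e-6, 0.02, 1.0, 20.0):
    print(f"{p:g} Pa -> {spl_db(p):.1f} dB")
```

Each factor of ten in pressure adds 20 dB, which is why a pressure ratio of one million corresponds to a span of 120 dB.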
Audio 71
Fig. 2.11: Block diagram of the hearing process [Mij11].
The human ear, as a part of the auditory system, is the receiving interface for sound communication. Anatomically and physically it is fully adapted for perceiving sound stimuli in air. The ear comprises the outer ear, consisting of the pinna, the ear canal and the eardrum; the inner ear, represented essentially by the cochlea with the vestibular apparatus; and the middle ear, between the two, which transfers the sound vibrations from the outer ear to the inner ear. Together these parts convert airborne sound waves into mechanical and fluid vibrations, analyse their temporal and frequency content and transduce the components into electrical and neural signals for transmission to the central nervous system. These signals are then further processed in the cochlear nucleus, brain-stem nuclei and higher levels of the brain, leading finally to their subjective recognition.
The range of frequencies and the range of sound intensities (the dynamic range) to which the human auditory system responds are quite remarkable, as shown in Figure 2.12 [Kut09]. On average, the human auditory system is capable of hearing sound frequencies from 16 Hz to around 20 kHz (the upper limit varies considerably between individuals and decreases with age). However, the ear is not uniformly sensitive throughout this range: sensitivity is highest between 3 and 4 kHz and falls steeply towards the edges of the audible spectrum. The span of audible sound intensities is even more impressive, as it covers more than 120 dB. The effective range of sound intensities depends on the type of sound information transmitted: regions of audible sound for speech and music are shown in Figure 2.12. The dashed curve in the figure represents the "threshold of pain", which gives the upper limit of the sound intensity (defined as the pressure level of a sine tone at the given frequency that causes an unpleasant or painful sensation and can lead to temporary or even permanent hearing loss).
Fig. 2.12: Average audible sound pressure level range and frequency range, with approximate speech and music regions.
Not only does the human auditory system respond to stimuli within these wide ranges, but it is also capable of precisely identifying many perceptual sound properties, such as the pitch, timbre, duration, loudness and direction of a sound. Each of these perceptual attributes is often a complicated function of one or more physical attributes: frequency content, time, amplitude and, indirectly, spatial distribution of the sound. For example, in general terms, pitch is the perceptual response to the frequency content of a source. Yet it is far more complex than a simple mapping of frequency to a perceptual scale, and can be affected by many parameters such as the loudness and the timing of the signal. A well-known phenomenon is the virtual pitch, where the frequency of the perceived pitch is actually not present in the objective sound spectrum (for example, the missing fundamental) [Ros07].
Furthermore, the ear exhibits a curious duality with respect to the degree to which it can differentiate between frequencies. On the one hand, certain frequency-dependent processes, such as so-called masking, appear to indicate a grouping of the frequency range into somewhat coarse sub-bands, called critical bands. Masking is a perceptual phenomenon in which the audibility of a sine tone is modified by the presence of other spectral components. Frequency masking involves an upward shift of the hearing threshold of a tone presented simultaneously with another sound (the masker) of similar frequency range, while time masking may occur when the masker appears shortly before or after the tone. On the other hand, frequency discrimination of individual sine waves on the basis of their perceived pitch has a typical resolution limit corresponding to a frequency change of about 0.7 % (the just noticeable difference) [Ros07][Kut09]. Though the physiological mechanisms underlying these phenomena are not fully understood, they are nevertheless of utmost importance for the functioning of a class of contemporary digital audio compression algorithms, so-called perceptual coding.
2.2.2 Audio Systems
The general block diagram of a communication system can be applied to an audio system, as presented in Figure 2.13.
Fig. 2.13: Block diagram of an audio system: A – Acoustic domain, Synth – generator of audio signal, SP&C – Signal Processing and Coding, CH – Transmission channel, D&SP – Decoding and Signal Processing, ADS – Audio Data Storage.
With the basic limits of auditory characteristics reviewed above, consideration can be given to the requirements for high-quality audio systems. The domain of sound information is based on the frequency and dynamic range of the auditory system, shown in Figure 2.12, as well as on its resolution. This domain can be transformed to the domain of information of audio signals, as shown in Figure 2.14 [Mij11]. The upper limits of the dynamic range and bandwidth are defined by the physical constraints of the channel, while the lower limits of the dynamic range are determined by the noise level. As shown in Figure 2.14, dynamic range is the amplitude range (in dB) from the highest to the lowest signal amplitude an audio device or channel can handle. In cases where there is effectively no such minimum signal other than zero level, the minimum useable level may be taken as the noise level of the channel. If the noise level of the channel is substantially independent of signal level, the dynamic range of the channel may alternately be referred to as the signal-to-noise ratio (SNR). If the
noise is signal dependent, the audibility of the noise will depend on the instantaneous signal-to-noise ratio, which is likely to be quite different from the dynamic range.
Fig. 2.14: Domain of information of analog and digital audio signals [Mij11].
In order to preserve speech intelligibility and overall audio quality, as well as to protect the transmission path, proper levelling of the audio signal should be applied. Looking at the audio signal, several properties can be emphasized: noise floor, RMS value and peak value. Due to the stochastic nature of the audio signal, the RMS and peak levels lie in a wide range of values. In the analog domain, linearity and a low noise floor are essential for good audio dynamics, while in the digital domain it is the number of bits per word. If the signal level exceeds a certain value, audible distortion may result. Similarly, if the audio signal level is comparable to the noise floor, the information could be damaged or completely lost. There are a number of standards that introduce audio level scales, and there is a tendency to unify those scales. Thus, cinema and broadcasting professionals have formed workgroups with the goal of defining and unifying the audio level standards used in practice. One of the most advanced solutions was the result of the EBU PLOUD group, which introduced the loudness level as the most natural basis for sound recording, transmission and reproduction [Cam10]. As described in ITU-R BS.1770 and ITU-R BS.1771, frequency weighting (K-weighting) and true peak measurement should be applied ([RBS1770-4][RBS1771-1]). A target loudness level of -23 LUFS (Loudness Units relative to Full Scale) for each programme material (music, movie, commercials, etc.) gives the impression of equal loudness while keeping the valuable dynamic range of the audio signal [R128]. A maximum true peak level of an audio signal, which should not exceed -1 dBTP (dB True Peak), is recommended to
comply with the technical limits of the complete signal chain. Headroom between the programme loudness and the true peak grants the natural quality of the audio.
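The arithmetic behind loudness normalisation can be illustrated with a short Python sketch. It assumes that the programme loudness (in LUFS) and the true peak (in dBTP) have already been measured by some ITU-R BS.1770 meter; that measurement is not implemented here, and the function names and sample values are hypothetical.

```python
TARGET_LUFS = -23.0          # target programme loudness quoted in the text
MAX_TRUE_PEAK_DBTP = -1.0    # maximum permitted true peak quoted in the text


def normalisation_gain_db(measured_lufs: float, target_lufs: float = TARGET_LUFS) -> float:
    """Gain in dB that moves a programme from its measured loudness to the target."""
    return target_lufs - measured_lufs


def db_to_linear(gain_db: float) -> float:
    """Convert a gain in dB to a linear amplitude factor."""
    return 10.0 ** (gain_db / 20.0)


def normalise(measured_lufs: float, measured_true_peak_dbtp: float):
    """Return (gain in dB, true peak after gain, whether the peak limit is respected)."""
    gain_db = normalisation_gain_db(measured_lufs)
    peak_after = measured_true_peak_dbtp + gain_db
    return gain_db, peak_after, peak_after <= MAX_TRUE_PEAK_DBTP


# A loud commercial is turned down; a quiet, dynamic recording is turned up.
print(normalise(-16.0, -0.5))
print(normalise(-30.0, -6.0))
```

In the second case the required gain pushes the true peak above the limit, so a real workflow would need additional peak limiting or a lower gain; the sketch only flags the condition.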
Fig. 2.15: Peak normalisation vs. Loudness normalisation of a series of programmes.
Noise in an audio system is, in general, an unwanted signal. It reduces the dynamic range of the usable signal from its lower end and influences the overall signal quality. Noise can appear in any part of the audio chain and accumulates in the output signal. There are several efficient noise reduction systems, such as Dolby A, B and C, Telcom C-4, dbx I and II, dbx/MTS TV compressor, Dolby SR and S [Ros07]. A noise reduction system usually consists of a compressor/encoder and an expander/decoder. The compressor boosts low-level signals above the noise floor, while the expander restores the signals to their appropriate level, which also pushes the noise picked up in between back down. Similar principles are used in digital low-bitrate coders. However, if the output signal already contains audible noise, it is a difficult task to remove the noise without noticeable loss of quality. In state-of-the-art digital audio systems, by using low-noise electronic components and topology and by employing quantization and coding at high bit rates, it is possible to keep the noise level under the threshold of hearing during reproduction.
The audio signal is fairly sensitive to distortion. If the distortion exceeds a certain level, the reproduced sound becomes unpleasant and tiresome and loses its intelligibility. Every nonlinearity of the audio chain results in some level of distortion. Typically, nonlinearities can be found in transducers, A/D and D/A converters and amplifiers. The acceptable level of total harmonic distortion is less than 2 %, and 0.1 % for intermodulation distortion (interaction of two or more spectral components with a nonlinearity).
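The compressor/expander principle of noise reduction can be sketched with a simple square-root compander. This is a generic power-law illustration under assumed values, not the characteristic of Dolby, dbx or any other named system; the function names and the noise amplitude are hypothetical.

```python
import math


def compress(x: float) -> float:
    """Encoder: boost low-level samples (2:1 compression on a dB scale)."""
    return math.copysign(math.sqrt(abs(x)), x)


def expand(y: float) -> float:
    """Decoder: exact inverse of compress, restoring the original level."""
    return math.copysign(y * y, y)


def through_channel(x: float, noise: float, companded: bool) -> float:
    """Send one sample through a channel that adds a fixed noise amplitude."""
    if companded:
        return expand(compress(x) + noise)
    return x + noise


quiet_sample = 0.01      # a low-level signal, close to the noise floor
channel_noise = 0.001    # assumed noise added by the channel

plain_error = abs(through_channel(quiet_sample, channel_noise, False) - quiet_sample)
companded_error = abs(through_channel(quiet_sample, channel_noise, True) - quiet_sample)
print(plain_error, companded_error)
```

For quiet samples the expander attenuates the channel noise together with the signal, so the residual error is smaller than without companding. For loud samples the error grows instead, which such systems accept because loud signals tend to mask the noise.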
2.2.3 Digital Audio Systems
As in other fields of communication, digitalization was introduced to audio in accordance with the well-known laws of sampling and quantization, with respect to audio signal characteristics, bringing numerous advantages. To fully appreciate these advantages, it is important to keep in mind the difficulties analog audio technology had in complying with the traditional audio specifications, whereas digital technology conforms to these requirements fairly easily. The digital techniques used for recording, reproduction, storage, processing, and transmission of digital audio signals entail entirely new concepts alien to analog audio technology [Poh05]. Regardless of the fact that a number of audio enthusiasts strongly argue that certain parts of the analog audio chain cannot be replaced, and in spite of some substantial anomalies of digital systems, it is beyond doubt that digital technology has many advantages, such as distortion and noise immunity, higher sound quality, wider dynamic range, lower power consumption, smaller dimensions, reduced costs, age and temperature immunity, increased reliability, robustness, flexibility, etc. Additionally, digital signal processing provided new designs for which there was arguably never a truly effective analog electronic solution.
As presented in Figure 2.14, the domain of information of an analog audio signal is represented by frequency range and dynamic range. The corresponding counterparts in the digital domain are sampling frequency and quantization resolution (bits per sample). It is commonly accepted that the introduction of the CD in 1982 marked the beginning of the digital audio era. Conventional CD and Digital Audio Tape (DAT) systems are typically sampled at either 44.1 or 48 kHz using pulse code modulation with a 16-bit sample resolution [Spetal07][Poh05]. This results in uncompressed data rates of 705.6/768 kb/s for a monaural channel, and a 96 dB dynamic range.
However, the listed specifications were soon judged insufficient by most audiophiles, which pushed the development of new standards, such as DVD-Audio (DVD-A) and Super-Audio-CD (SACD), improved both in resolution (24 bits per sample) and in sampling rate (96 kHz or more). Such progress certainly resulted in expanding the effective bandwidth beyond the 20 kHz limit (thereby providing an additional safety margin to avoid alias distortion), avoiding possible intermodulation distortion, and lessening potential filter distortion. On the other hand, it demanded higher channel capacities. In the early years of digital audio communications, limited bandwidth and storage space were always a significant issue. This was the main motivation for the design of many types of compression coding for speech and audio, which consequently have found important applications in cellular phone systems, music delivery over networks, and portable players.
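The data rates and dynamic ranges quoted for PCM audio follow directly from the sampling frequency and the word length. The short Python sketch below reproduces them, assuming the simple approximation of about 6.02 dB per bit (20 log10 of the number of quantization levels); the function names are illustrative.

```python
import math


def pcm_bit_rate(fs_hz: int, bits: int, channels: int = 1) -> int:
    """Uncompressed PCM data rate in bits per second."""
    return fs_hz * bits * channels


def pcm_dynamic_range_db(bits: int) -> float:
    """Approximate dynamic range of linear PCM: ratio of full scale to one step."""
    return 20.0 * math.log10(2 ** bits)


for fs, bits in ((44_100, 16), (48_000, 16), (96_000, 24)):
    rate = pcm_bit_rate(fs, bits)
    print(f"{fs} Hz, {bits} bit: {rate / 1000:.1f} kb/s mono, "
          f"about {pcm_dynamic_range_db(bits):.0f} dB")
```

Moving from 16 bit at 44.1 kHz to 24 bit at 96 kHz more than triples the data rate per channel, which is the channel-capacity cost mentioned in the text.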
2.2.3.1 Audio Coding
The goal of audio coding and audio compression is to reduce the number of information bits required to store or transmit audio data while still maintaining a high level of audio quality, that is, without introducing noticeable changes to the sound in the process. The basic building blocks of an audio coder are: analysis filter bank, perceptual model, quantization and entropy coding of spectral coefficients, and bitstream multiplexer [Spetal07]. In general, digital audio coding algorithms can be divided into two categories: lossless and lossy (also called perceptual) [Spetal07][Ros07][Poh05]. Lossless coding applies the usual data compression methods, which can reproduce the original sound data with bit-for-bit accuracy, without resorting to any psychoacoustic or perceptual considerations. The goal of lossy methods, on the other hand, is only to preserve and convey all of the perceptual aspects of the audio content, while being allowed to introduce changes in the sound data (aimed at reducing the data size) that correspond to inaudible differences. In other words, lossy methods rely predominantly on specific psychoacoustic effects, such as using a masking curve (which specifies the threshold of hearing at a given frequency as a function of the amplitudes of accompanying nearby frequencies) to remove all inaudible components of the sound that fall below the curve. The primary audio encoding scheme, basic PCM itself, does not contain any method of compression. However, there are more advanced quantization methods that embed mechanisms for some level of redundancy removal, such as DPCM, delta modulation, and ADPCM.
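The redundancy removal in DPCM can be sketched in a few lines of Python. This is a minimal first-order illustration, assuming the predictor is simply the previously reconstructed sample and the quantizer is uniform with a fixed step; practical DPCM and ADPCM coders use better predictors and adaptive step sizes. The sample values and function names are hypothetical.

```python
def dpcm_encode(samples, step):
    """Encode samples as quantized differences from the previous reconstruction."""
    codes = []
    prediction = 0
    for s in samples:
        code = round((s - prediction) / step)   # quantized prediction error
        codes.append(code)
        prediction += code * step               # mirror what the decoder will rebuild
    return codes


def dpcm_decode(codes, step):
    """Rebuild the signal by accumulating the dequantized differences."""
    out = []
    value = 0
    for c in codes:
        value += c * step
        out.append(value)
    return out


signal = [0, 3, 7, 12, 14, 13, 9, 4]   # slowly varying, hence highly predictable
step = 2
codes = dpcm_encode(signal, step)
decoded = dpcm_decode(codes, step)
print(codes)
print(decoded)
```

Because the encoder predicts from its own reconstruction rather than from the original samples, the quantization error does not accumulate: each decoded sample stays within half a step of the original, while the transmitted codes span a much smaller range than the samples themselves.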
Perceptual audio coding algorithms flourished in the early 1990s, when several large workgroups and organizations such as the International Organization for Standardization/International Electrotechnical Commission (ISO/IEC), the International Telecommunication Union (ITU), AT&T, Dolby Laboratories, Digital Theater Systems (DTS), Lucent Technologies, Philips, and Sony set about developing and defining novel audio coding standards. Table 2.1 lists chronologically some of the prominent audio coding standards [Spetal07]. The development of these audio codecs soon also led to the emergence of several multimedia storage formats, some of them listed in Table 2.2. ISO/IEC formed the Moving Picture Experts Group (MPEG) in 1988 in order to develop data reduction techniques for audio and video. There are several highly successful MPEG standards [Poh05]. Of particular interest is the MPEG-1/-2 layer III (MP3) algorithm, which appeared in 1993 and has since become the major standard for the compression and transmission of music in the Internet era. The MPEG advanced audio coding (AAC) standard, offering somewhat better overall sound quality for the same compression rate, subsequently appeared as a potential successor but has (as yet) not been able to replace MP3 as the most widely used standard.
Tab. 2.1: List of perceptual and lossless audio coding standards/algorithms [Spetal07].
Standard/algorithm
ISO/IEC MPEG-1 audio
Philips’ PASC (for DCC applications)
AT&T/Lucent PAC/EPAC
Dolby AC-2
AC-3/Dolby Digital
ISO/IEC MPEG-2 (BC/LSF) audio
Sony’s ATRAC (MiniDisc and SDDS)
SHORTEN
Audio processing technology (APT-x100)
ISO/IEC MPEG-2 AAC
DTS coherent acoustics
The DVD Algorithm
MUSICompress
Lossless transform coding of audio (LTAC)
AudioPaK
ISO/IEC MPEG-4 audio version 1
Meridian lossless packing (MLP)
ISO/IEC MPEG-4 audio version 2
Audio coding based on integer transforms
Direct-stream digital (DSD) technology
Tab. 2.2: The most popular audio storage formats.
Audio storage format | Appeared in | Developed by
Compact disc | 1982 | Sony and Philips
Digital audio tape (DAT) | 1987 | Sony
Digital compact cassette (DCC) | 1992 | Philips and Panasonic
MiniDisc | 1992 | Sony
Digital versatile disc (DVD) | 1996 | Philips, Sony, Toshiba, and Panasonic
DVD-audio (DVD-A) | 2000 | DVD Forum
Super audio CD (SACD) | 1999 | Sony and Philips
Pure Audio Blu-ray | 2009 | Blu-ray Disc Association
High Fidelity Pure Audio | 2013 | Sony, Universal Music Group
The need for streaming audio applications influenced the development of various techniques, such as combined speech and audio architectures, as well as joint source-channel coding algorithms that are optimized for the packet-switched Internet, Bluetooth, and in some cases wideband cellular networks. Multimedia applications such as online radio, web jukeboxes, and teleconferencing require transmission of real-time wireless audio content, which can be accomplished with compression algorithms that deliver high-quality audio at low bit rates with robustness to bit errors. The MPEG-4 standard provided an integrated family of algorithms for high-quality audio (and video) at low bit rates, with provisions for scalable, object-based speech and audio coding, allowing operation over the Internet and other networks, and in wired, wireless, streaming, and digital broadcasting applications.
Audio codecs have further evolved in a few directions. One of the goals was to improve behavior for some specific, otherwise critical types of input signals, such as transient signals and tonal solo instruments. Another direction was to improve algorithm properties for stereo and surround sound, e.g. MPEG Parametric Stereo (PS), MPEG Surround, MPEG Spatial Audio Object Coding (SAOC), and, as a more complex system, MPEG-H 3D Audio. Codecs that combine audio and speech coding in one have also become common. Examples include MPEG-D Unified Speech and Audio Coding (USAC) and 3GPP’s codec for Enhanced Voice Services (EVS) [AES15]. The requirement of real-time audio data streaming, especially in the context of mobile communications, produced a demand for compressed audio formats that adapt to variable bit rates, so that playback remains uninterrupted and as seamless as possible under temporary but sudden bit rate reductions. An example of such a format is the MPEG-H 3D Audio coding format in combination with the MPEG DASH delivery format.
2.2.4 Audio in Transmission Systems
2.2.4.1 File Formats and Metadata
In order to provide means for the exchange of audio data and compatibility between different platforms, audio streaming, multimedia transmission etc., as well as to simplify system integration, several audio file formats have been developed (e.g. WAV, AIFF, MPEG etc.) [Poh05]. The defined file formats contain the essence (the content data) and the metadata (addressed locations in the file where specific information about the file is stored). Metadata can comprise data such as sampling frequency, bit resolution, number of channels, type of compression, title, copyrights, information on synchronization, etc. The AES standard for digital audio, AES41, specifies the method by which a metadata
set is embedded in the audio samples (audio-embedded metadata); the metadata types are presented in Table 2.3.
Tab. 2.3: Metadata types [AES41].
Type (decimal) | Encoding scheme
0 | Undefined
1 | MPEG-1 Layer I or MPEG-2 low sampling frequency (LSF) Layer I
2 | MPEG-1 Layer II or MPEG-2 LSF Layer II
3 | MPEG-1 Layer III or MPEG-2 LSF Layer III
4 | ISO/IEC 14496-3:2005 minimal metadata
5 | Dolby E audio minimal metadata
6 | EBU loudness, true-peak, and downmix metadata
7 to 31 | Reserved for future standardization
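The file-level metadata mentioned earlier in this section (sampling frequency, bit resolution, number of channels, type of compression) can be read directly from the header of a WAV file. The following Python sketch uses the standard-library `wave` module and a small in-memory test file; it illustrates file-header metadata only and does not implement the audio-embedded metadata of AES41. The helper name `describe_wav` is hypothetical.

```python
import io
import wave


def describe_wav(source):
    """Return basic header metadata of a WAV file (path or binary file object)."""
    with wave.open(source, "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "bits_per_sample": 8 * wav.getsampwidth(),
            "sampling_frequency_hz": wav.getframerate(),
            "frames": wav.getnframes(),
            "compression": wav.getcomptype(),
        }


# Build a tiny WAV in memory: 0.1 s of silence, mono, 16 bit, 48 kHz.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as wav:
    wav.setnchannels(1)
    wav.setsampwidth(2)          # bytes per sample
    wav.setframerate(48000)
    wav.writeframes(b"\x00\x00" * 4800)

buffer.seek(0)
info = describe_wav(buffer)
print(info)
```

The essence (the samples) and the metadata (the header fields) live in the same file, so a receiving system can configure its decoder without any side channel.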
2.2.4.2 Digital Broadcasting, Network and File Transfer
The final goal of every audio production is to reach the intended listeners. This means that a communication channel of some kind has to be included in the signal transmission path. The transmission path can be either analog or digital, but in the digital era of communications, audio is definitely moving towards digital broadcasting and Information Technology (IT) networking, leaving analog transmission and recordings to history (except for some rare and rather esoteric examples of premium quality analog audio). Another way to reach the end-users is file transfer. File transfer is particularly important in the exchange of audio between production and distribution companies. Storage media based on IT, including data disks and tapes, have penetrated all areas of audio production for radio broadcasting. This technology offers significant advantages in terms of operating flexibility, production flow and station automation. A minimum set of broadcast-related information must be included in the file to document the metadata related to the audio signal. Recommendation ITU-R BS.646 defines the digital audio format used in audio production for radio and television broadcasting [RBS646-1]. The quality of an audio signal is influenced by the signal processing it undergoes, particularly by the use of non-linear coding and decoding during bit rate reduction processes. Future audio systems will require metadata associated with the audio to be carried in the file. Recommendation ITU-R BS.2088-0 defines the long-form file format for the international exchange of audio programme materials with metadata [RBS2088-0]. Broadcasting platforms for radio and TV are being improved and changed all the time. The current state of the art in audio broadcasting is represented by DAB, DAB+ and DMB
Image 81
technologies. The new DVB-T2 digital television broadcasting, with its Multiple Physical Layer Pipe (MPLP) and FEF (Future Extension Frame) options, gives more possibilities for extending service transmission to mobile platforms. First experiments with DVB-T2 radio broadcast show that the performance of DVB-T2 Lite can be considered superior to DAB+ in terms of audio quality, number of channels, signal reception quality and overall transmission cost efficiency. Another platform for delivering digital audio to listeners is DRM30/DRM+ (Digital Radio Mondiale), with a similar overall score to DVB-T2 and certain benefits that the platform could bring. DVB-T2 and DRM30/DRM+ radios are still absent from the market.
IT networking is a huge, standards-driven industry, which means that systems built on modern networking technology remain future-proof. IT networking is considered ubiquitous, fast, cost-effective and highly reliable, and it is deeply integrated with computer and mobile device applications. The convergence of audio and IT networking is an eminently logical step, and audio transmission is irreversibly migrating to IT networks.
2.3 Image
“An image is worth a thousand words” is a well-known proverb attributed to Confucius. Images help in communicating ideas, events or memories very efficiently and effectively. They represent objects by carrying information about them. Some sources of images are cameras, satellites, communications devices, memory media, radars, sonars, medical devices etc. Images are represented using continuous functions f(x, y) of two variables, which give the space coordinates of a point in the image. The value of the function f(x, y) at the point (x, y) represents the color at that point. Such a 2-dimensional single-plane image may represent a monochrome (black-and-white) or a gray image. A digital monochrome image has discrete space coordinates and intensity and therefore a finite number of elements, so-called picture elements or pixels (pels). Usually, a digital image is represented by a matrix of pixels, whereby the row and column indices give the pixel position in the image and the value of a pixel represents its gray level. A grayscale image typically has its pixel values represented by a single byte (8 bits), so its range varies from 0 to 255, where 0 is black, 255 is white and all the values in between represent the various gray levels. For color images, additional planes are required to represent the colors, e.g., one plane for red, one for green and one for blue. The combination of the three colors can produce a wide range of colors, e.g., if each color is represented by 8 bits, then red, green and blue combined can produce 2^24, i.e. about 16.8 million, colors. Modern image processing applications create and store images as raster images or vector images, but sometimes also as a combination of both.
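The pixel-matrix representation and the colour count can be illustrated with a few lines of Python. This is a minimal sketch using plain nested lists; the tiny example image, the image dimensions and the helper names are illustrative assumptions.

```python
def colour_count(bits_per_channel: int, channels: int = 3) -> int:
    """Number of distinct values a pixel can take."""
    return 2 ** (bits_per_channel * channels)


def raster_size_bytes(width: int, height: int,
                      bits_per_channel: int = 8, channels: int = 1) -> int:
    """Uncompressed storage needed for a raster image."""
    return width * height * channels * bits_per_channel // 8


# A 2 x 3 grayscale image: rows and columns index the pixel, the value is its gray level.
gray = [
    [0, 128, 255],
    [64, 192, 32],
]
rows, cols = len(gray), len(gray[0])
print(f"{rows} x {cols} image, pixel (1, 2) has gray level {gray[1][2]}")

print(colour_count(8, channels=1))          # gray levels of an 8-bit grayscale image
print(colour_count(8))                      # colours of a 24-bit RGB image
print(raster_size_bytes(1920, 1080, 8, 3))  # bytes for an uncompressed 1920 x 1080 RGB frame
```

The storage figure grows with the pixel count regardless of the image content, which is one reason compression matters for image transmission.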
Raster image: A raster image, also called a bitmap or scalar image, is an image made up of and represented by a collection of pixels. Each object in the image is represented by a set of pixels. For example, a line is represented by the pixels that approximate it from its start point to its end point. Thus, the memory needed for storage is proportional to the dimensions of the objects in the image.
Vector image: A vector image is an image created using mathematical equations and forms to represent the objects in the image. For example, to represent a line only the start point and the end point are needed, whereas to represent a circle, its centre point together with its radius is needed for storage and processing.
Raster images allow editing at a fine-grained (pixel) level. However, they become blurred when zoomed. Vector images, on the other hand, allow editing of the individual objects in the image. Objects can easily be scaled, because the mathematical equations can be used to plot the modified objects without loss of information. Vector images also require less memory for storage.
Digital image processing is interdisciplinary, unifying many disciplines such as communications, computer science, electronics, physics, psychology, artificial intelligence etc.
2.3.1 Digital Image Processing System The basic structure of a digital image processing system is shown in Figure 2.16:
[Figure 2.16 blocks: LAN; disc memory; communication with operator; input devices; computer (application); output devices; digitalization; monitor]
Fig. 2.16: Structure of a digital image processing system.
The central part of the system is an image processing application, implemented in hardware or software, which is connected to the other parts: memory, video monitor, camera with a digitalization device, input and output devices, local computer network and devices for communication with an operator. Memory is used for the storage of digital images on magnetic media (discs and, rarely, tapes) and optical media (CD-ROM and digital optical disc). A video monitor is used for presenting the image to the operator or end user. Cameras with a digitalization device, or cameras with an integrated A/D converter, enable the input of digital images into the computer. Other input devices provide alternative ways of getting images into the computer (scanners, analog and digital video recorders, non-standard cameras, sensors of medical devices etc.). Images can also be input in digital form using memory media, the local computer network and the Internet. Output devices depend on the application: devices for hard copies of images such as laser and inkjet printers, video printers for photos, devices for slides, analog and digital video recorders, devices for saving images on analog optical disc etc. Devices for communication with an operator are usually a computer monitor, keyboard and mouse (rarely a joystick, graphics tablet and similar).
2.3.2 Digital Image Processing Operations and Methods
There are three different levels of digital image processing operations:
1. Low level operations: noise cancelation, contrast reparation and image sharpness,
2. Medium level operations: edge detection, segmentation (image division into objects), object description for computer processing and classification,
3. High level operations: understanding of classified objects, simulating the visual part of the human cognitive system.
Digital image processing methods can be divided further into the methods for:
1. Image representation and modelling
2. Image improvement
3. Image restoration
4. Image compression
5. Image analysis
2.3.2.1 Image Representation and Modelling
Image representation means the description of pixels, depending on the source of the image, i.e., on the part of the spectrum captured by the camera used (visual part of the spectrum, infra-red, ultrasound, X-ray etc.). The most important property of image representation is the fidelity with which the image presents an object, which depends on the quality of the camera sensors as well as on the sampling frequency and A/D conversion used (see Chapter 1). Image representation also involves methods of image transformation used for a better description of image content, improvement of image quality etc.
Image modelling is connected to image representation. Usually, images are modelled statistically, using statistical characteristics of the 1st and 2nd order. This kind of modelling is based on the assumption that the image is stationary, and it is applied even in non-stationary cases. Modelling is also a part of image analysis (see Section 2.3.5), whereby each image is considered as a set of objects and the modelling itself defines connections between these objects. Image representation and modelling become more complex in the case of image sequences, whereby time also has to be considered as a parameter and the 2-dimensional image function becomes 3-dimensional.
2.3.2.2 Image Improvement
Image improvement aims at easier usage of the information carried by the image, rather than at improving the image content. This means changing particular properties of an image in order to improve its further analysis or its presentation on a monitor. The aim is to make the image visually more pleasing and information extraction easier for the viewer. There are numerous methods for image improvement, which are often interactive or adaptive, as they depend on how the image was produced and how it is applied. Some methods treat each pixel independently of other pixels, while others treat each pixel together with its surrounding area. These methods are mostly simple to use and can be applied in real time. Typical methods for image improvement are contrast improvement, noise removal, edge detection, de-blurring and image sharpening.
Contrast improvement: Contrast improvement, or contrast stretching, stretches the range of pixel intensities of an image, so that they cover a wider range in the output. Contrast makes an object distinguishable from other objects in the image. One way to achieve contrast stretching is histogram equalization. In simple words, the image histogram is the histogram of the frequency of occurrence of each pixel value. Histogram equalization is the method of increasing the global contrast by spreading the intensity values over the whole histogram.
Noise removal: Noise removal reduces or eliminates the noise in an image. Error correcting codes can be used to identify and eliminate noise in an image. However, images have certain special kinds of noise due to pre-processing, storage or transmission, which can be reduced using noise filters. Such kinds of noise include “Salt & pepper” noise, Gaussian noise, blurring etc. Methods such as mean filtering, median filtering and Gaussian filtering are used to reduce the image noise. Figure 2.17 shows the effect of “Salt & pepper” noise of magnitude 0.2 on the “Lena image”. When the median noise reduction filter is applied to the noisy image, the resulting image is free from the “Salt & pepper” noise, as also shown in Figure 2.17.
Fig. 2.17: Noise reduction.
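Histogram equalization and median filtering, as described above, can be sketched in a few lines of Python with NumPy. This is only a minimal illustration under stated assumptions: 8-bit grayscale images, a 3×3 filter window, edge replication at the borders, and synthetic test images instead of the “Lena image”. The function names are illustrative and are not taken from the book.

```python
import numpy as np


def equalize_histogram(image: np.ndarray) -> np.ndarray:
    """Global histogram equalization of an 8-bit grayscale image."""
    hist = np.bincount(image.ravel(), minlength=256)  # occurrences of each grey level
    cdf = hist.cumsum()                                # cumulative histogram
    cdf_min = cdf[cdf > 0][0]                          # value at the darkest level present
    if cdf[-1] == cdf_min:                             # constant image: nothing to stretch
        return image.copy()
    # Look-up table that makes the cumulative histogram approximately linear.
    lut = np.round((cdf - cdf_min) / (cdf[-1] - cdf_min) * 255.0)
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[image]


def median_filter3(image: np.ndarray) -> np.ndarray:
    """3x3 median filter; borders are handled by edge replication (an assumption)."""
    padded = np.pad(image, 1, mode="edge")
    rows, cols = image.shape
    # Nine shifted copies of the image form the 3x3 neighbourhood of every pixel.
    windows = [padded[r:r + rows, c:c + cols] for r in range(3) for c in range(3)]
    return np.median(np.stack(windows), axis=0).astype(image.dtype)


def add_salt_and_pepper(image: np.ndarray, density: float, rng) -> np.ndarray:
    """Set roughly a fraction `density` of the pixels to 0 or 255 at random."""
    noisy = image.copy()
    mask = rng.random(image.shape) < density
    values = np.array([0, 255], dtype=image.dtype)
    noisy[mask] = rng.choice(values, size=int(mask.sum()))
    return noisy


rng = np.random.default_rng(0)

# Low-contrast test image: grey levels only between 100 and 139.
low_contrast = rng.integers(100, 140, size=(64, 64)).astype(np.uint8)
stretched = equalize_histogram(low_contrast)

# Flat grey image corrupted by salt-and-pepper noise, then median filtered.
flat = np.full((64, 64), 128, dtype=np.uint8)
noisy = add_salt_and_pepper(flat, 0.05, rng)
cleaned = median_filter3(noisy)

print("grey range before/after equalization:",
      (low_contrast.min(), low_contrast.max()), (stretched.min(), stretched.max()))
print("corrupted pixels before/after filtering:",
      int((noisy != 128).sum()), int((cleaned != 128).sum()))
```

The median is used here because an isolated extreme value (0 or 255) never becomes the middle element of its sorted neighbourhood, whereas a mean filter would smear it over the surrounding pixels.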
Edge detection and image sharpening: Edges in the image play a vital role in identifying objects and differentiating them from each other. Edge detection is based on detecting discontinuities in the image, i.e. sharp changes in brightness levels. Edge detection methods use differential operators to detect changes in the gradients of the colour levels in the image. They include the Prewitt, Canny, Sobel, Laplacian and Laplacian of Gaussian edge detectors [Umb10]. Figure 2.18 shows the result of applying the Prewitt, Canny and Sobel edge detection operators to the “Lena image”. Edge enhancement, also known as image sharpening, is the process by which the edges in the image are enhanced by adding the edges to the blurred version of the image which is to be sharpened. The resulting image has sharper edges than the original and is therefore visually more pleasing.
Fig. 2.18: Edge detection using different operators.
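As an illustration of a differential edge operator, the following sketch applies the two 3×3 Sobel kernels to a synthetic step edge and thresholds the gradient magnitude. The threshold value, the border handling and the test image are assumptions made for this example only; Figure 2.18 was not produced with this code.

```python
import numpy as np

# Sobel kernels: horizontal and vertical first-derivative approximations.
SOBEL_X = np.array([[-1.0, 0.0, 1.0],
                    [-2.0, 0.0, 2.0],
                    [-1.0, 0.0, 1.0]])
SOBEL_Y = SOBEL_X.T


def filter3x3(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Correlate a 2-D image with a 3x3 kernel (edge-replicated borders)."""
    padded = np.pad(image.astype(float), 1, mode="edge")
    rows, cols = image.shape
    out = np.zeros((rows, cols))
    for r in range(3):
        for c in range(3):
            out += kernel[r, c] * padded[r:r + rows, c:c + cols]
    return out


def sobel_edges(image: np.ndarray, threshold: float) -> np.ndarray:
    """Boolean edge map: True where the gradient magnitude exceeds the threshold."""
    gx = filter3x3(image, SOBEL_X)   # response to horizontal brightness changes
    gy = filter3x3(image, SOBEL_Y)   # response to vertical brightness changes
    return np.hypot(gx, gy) > threshold


# Synthetic image with a vertical step edge between columns 3 and 4.
image = np.zeros((8, 8))
image[:, 4:] = 255.0
edges = sobel_edges(image, threshold=200.0)
print(edges.astype(int))
```

For this test image only the two columns adjacent to the brightness step are marked, because the gradient vanishes in the flat regions on both sides.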
2.3.2.3 Image Restoration
Image restoration aims at improving the image quality in cases where the image degradation is known or can be modelled. Unlike methods for image improvement, methods for image restoration are much more complex and usually cannot be performed in real time. Typical examples of image restoration are the correction of geometric distortion caused by the optical part of a camera, the elimination of noise with a known statistical model, and the correction of image sharpness reduced by movement or defocusing of the camera.
Image restoration is different from image enhancement (image improvement). Image enhancement does not consider the process which has led to the image degradation; it only makes the image visually more interesting. Image restoration, on the other hand, improves the scientific quality of the image using techniques in the spatial or frequency domain. Methods used for image restoration include Inverse Filtering, Wiener Filtering, Wavelet Restoration and Blind De-convolution.
2.3.2.4 Image Compression
Image compression involves techniques for reducing the memory required to store images. Reducing the memory needed for storage also reduces the bandwidth required to transmit the image, e.g., over the Internet. Different techniques and algorithms are available for image compression. JPEG is the most commonly used technique on the Internet and is discussed here.
JPEG
The efforts toward the standardization of image compression started in 1982 and resulted in the formation of the Joint Photographic Experts Group (JPEG) [PeMi93] in 1987 and in the first image compression standard in 1992 [AcTs04]. The JPEG standard is designated ISO/IEC IS 10918-1 [ISO10918-1] and ITU-T Recommendation T.81 [ITU-T81], reflecting the collaboration between the International Organization for Standardization (ISO), the International Electrotechnical Commission (IEC) and the ITU-T [PeMi93][AcTs04]. JPEG contains a family of image compression methods applicable to both grayscale and color images.
[Block diagram. (a) DCT coder: Source image (8x8 blocks) → DCT → Quantizer (with quantization table) → Entropy coder (with Huffman table) → Compressed data. (b) DCT decoder: Compressed data → Entropy decoder (with Huffman table) → De-quantizer (with quantization table) → IDCT → Reconstructed image (8x8 blocks).]
Fig. 2.19: JPEG a) Compression; b) Reconstruction.
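The transform and quantization stages of Figure 2.19 can be sketched for a single 8×8 block as follows. This is a simplified illustration, not a JPEG codec: the quantization table `Q` is a made-up example (an assumption, not a table from the standard), the level shift by 128 is included as in baseline JPEG, and the DPCM, run-length and Huffman stages of the entropy coder are omitted.

```python
import numpy as np

N = 8


def dct_matrix(n: int = N) -> np.ndarray:
    """Orthonormal DCT-II matrix C; the 2-D DCT of a block B is C @ B @ C.T."""
    k = np.arange(n).reshape(-1, 1)          # frequency index (rows)
    m = np.arange(n).reshape(1, -1)          # sample index (columns)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * m + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c


def zigzag_indices(n: int = N):
    """(row, column) pairs of an n x n block in zigzag scan order."""
    cells = [(r, c) for r in range(n) for c in range(n)]
    # Anti-diagonals are visited in turn; the direction alternates between them.
    return sorted(cells, key=lambda rc: (rc[0] + rc[1],
                                         rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))


C = dct_matrix()
ZIGZAG = zigzag_indices()
# Illustrative quantization table (assumption): coarser steps at higher frequencies.
Q = 8.0 + 4.0 * np.add.outer(np.arange(N), np.arange(N))


def encode_block(block: np.ndarray) -> list:
    """Level shift, 2-D DCT, quantization and zigzag scan of one 8x8 block."""
    shifted = block.astype(float) - 128.0
    coeffs = C @ shifted @ C.T
    quantized = np.round(coeffs / Q).astype(int)
    return [int(quantized[r, c]) for r, c in ZIGZAG]


def decode_block(zz: list) -> np.ndarray:
    """Inverse of encode_block up to the quantization error."""
    quantized = np.zeros((N, N))
    for value, (r, c) in zip(zz, ZIGZAG):
        quantized[r, c] = value
    block = C.T @ (quantized * Q) @ C + 128.0
    return np.clip(np.round(block), 0, 255).astype(np.uint8)


# Smooth test block: a brightness ramp, block[r, c] = 100 + 2 r + 4 c.
block = (100 + np.add.outer(2 * np.arange(N), 4 * np.arange(N))).astype(np.uint8)
zz = encode_block(block)
restored = decode_block(zz)
max_error = int(np.abs(restored.astype(int) - block.astype(int)).max())
print("zigzag coefficients:", zz[:10], "...")
print("zero coefficients:", zz.count(0), "of 64, max pixel error:", max_error)
```

For a smooth block almost all quantized coefficients are zero, which is what the subsequent run-length and entropy coding stages exploit.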
JPEG is a compression technique which can be applied in four modes of operation [AcTs04], of which one is lossless and the other three are lossy. The Discrete Cosine Transform (DCT) and a quantization procedure are used in the lossy compression modes. The most widely used form of the JPEG still image compression standard is a DCT-based lossy compression algorithm referred to as JPEG baseline [JPEG2000][RaJo02]. The encoding process of the JPEG compression algorithm splits an image into non-overlapping blocks of 8×8 pixels. In the case of a color image, the Red, Green and Blue planes are individually subjected to the splitting and to the further operations. The DCT is calculated on each individual block. The DCT coefficients of each block are then quantized by dividing them element-wise by the entries of a pre-defined quantization matrix and rounding the results. Different quantization matrices are pre-defined, each one of which
essentially dictates the quality and the amount of compression. This is because, depending on the chosen quantization matrix, a different number of elements of the result of the division will be zero. More zeroes mean better compression, but at the cost of reduced reconstruction quality. The quantized coefficients are then zigzag scanned; the Direct Current (DC) element is subjected to Differential Pulse Code Modulation (DPCM), whereas the Alternating Current (AC) elements are subjected to Run Length Encoding (RLE). The output of DPCM and RLE is then further subjected to arithmetic or Huffman coding in order to represent it by a smaller number of bits. The decoding process performs the inverse steps in reverse order. The process of JPEG compression is shown in Figure 2.19.
The JPEG compression standard attracted almost all business and academic users for more than a decade. However, it is not a perfect match for the new features of communication technology, Internet and multimedia applications [RaJo02] and information security. Therefore, a new optimized compression standard with higher efficiency and more interoperability and adaptability in networks, communication channels and noisy mobile environments was needed [AcTs04]. Thanks to the properties of the Discrete Wavelet Transform (DWT), new classes of still image compression algorithms based on the DWT have been proposed.
JPEG 2000
The JPEG 2000 compression technique has been developed based on the mathematical characteristics of the DWT and became a comprehensive standard, referred to as ISO/IEC 15444-x and ITU-T Recommendation T.800, in 2000 [RaJo02]. The JPEG 2000 standard currently consists of 14 parts [RaJo02]:
– Part 1. Core coding system [T800]
– Part 2. Extensions [T801]
– Part 3. Motion JPEG 2000 [T802]
– Part 4. Conformance [T803]
– Part 5. Reference software [T804]
– Part 6. Compound image file format [IT15444-6]
– Part 7. Has been abandoned
– Part 8. JPSEC [T807]
– Part 9. JPIP [T808]
– Part 10. JP3D [T809]
– Part 11. JPWL [T810]
– Part 12. ISO base media file format [IT15444-12]
– Part 13. An entry level encoder based on the specification of Part 1 [IT15444-13]
– Part 14. JPXML [IT15444-14]
The JPEG 2000 image compression standard can compress an image and encode the compressed data so that it can be decoded in many ways, in order to meet the requirements of a wide range of applications [PeMi93][JPEG2000]. This is possible thanks to the features which distinguish JPEG 2000 from the JPEG standard [PeMi93][AcTs04][JPEG2000]:
1. Low bit rate performance
2. Support of both lossless and lossy compression
3. Support of region of interest coding
4. Object-based functionality
5. Spatial and quality scalability
The whole compression technique consists of three phases (see Figure 2.20):
1. image preprocessing,
2. compression and
3. compressed bit stream formation.
Further details can be found in [AcTs04][TaMa02].
[Block diagram: Source image → Preprocessing → Forward 2D-DWT → Quantization → Tier-1 encoding → Tier-2 encoding → Compressed image; additional block: Rate control.]
Fig. 2.20: JPEG 2000 Compression.
Image preprocessing: This phase includes three main steps. The first step is image tiling, in which the original input image is partitioned into non-overlapping rectangular blocks of equal size, each called a tile. The size of a tile depends on the original size of the image and can vary from a single pixel to the whole image [RaJo02]. The second step is DC level shifting. The purpose of this optional step is to ensure that the input image samples have a dynamic range distributed symmetrically around zero [AcTs04]. The last step in this phase is the multicomponent transformation. Its purpose is to reduce the possible correlations between the components of a multicomponent image (decorrelation). This step plays a key role in reducing the redundancy and in increasing the compression rate and efficiency of the compression technique.
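As a rough illustration of the DC level shift and of a decorrelating component transform that is exactly invertible in integer arithmetic, the following sketch uses the formulas usually given for the reversible colour transform of JPEG 2000 Part 1. The function names, the 8-bit sample depth and the random test data are assumptions made for this example.

```python
import numpy as np


def dc_level_shift(samples: np.ndarray, bit_depth: int = 8) -> np.ndarray:
    """Shift unsigned samples so that their range is symmetric around zero."""
    return samples.astype(np.int32) - (1 << (bit_depth - 1))


def rct_forward(r, g, b):
    """Reversible (integer) colour transform: one luminance-like and two
    difference components."""
    y = (r + 2 * g + b) >> 2      # floor((R + 2G + B) / 4), also for negative sums
    cb = b - g
    cr = r - g
    return y, cb, cr


def rct_inverse(y, cb, cr):
    """Exact inverse of rct_forward."""
    g = y - ((cb + cr) >> 2)
    r = cr + g
    b = cb + g
    return r, g, b


rng = np.random.default_rng(3)
rgb = rng.integers(0, 256, size=(3, 16, 16)).astype(np.uint8)

r, g, b = (dc_level_shift(plane) for plane in rgb)
y, cb, cr = rct_forward(r, g, b)
r2, g2, b2 = rct_inverse(y, cb, cr)

lossless = bool(np.array_equal(r, r2) and np.array_equal(g, g2) and np.array_equal(b, b2))
print("sample range after level shift:", int(r.min()), "to", int(r.max()))
print("transform is exactly invertible:", lossless)
```

Because only integer additions and shifts are used, no rounding error accumulates, which is why a transform of this kind can also be used for lossless compression.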
Two types of transformation are supported by the JPEG 2000 standard, as specified in Part 1 of the standard, ISO 15444-1 [ISO15444-1]. The first is the Reversible Color Transformation (RCT), which can be applied in both lossless and lossy image compression. The second is the Irreversible Color Transformation (ICT), which is applied only in lossy image compression.
Image compression: Image compression is the core of the JPEG 2000 coding system, which is specified in Part 1. The compressed code of the image is generated in this phase. Like the first phase, the compression phase consists of three main steps. The first step is to apply a two-dimensional DWT to the preprocessed image. The DWT decomposes each component into several sub-bands with different resolution levels, depending on the parameters of the applied DWT. Each sub-band is then quantized independently when lossy compression is intended. The use of the DWT is the main feature of the JPEG 2000 compression technique and marks the difference between the compression phases of JPEG and JPEG 2000. In the first decomposition level, four sub-bands LL1, HL1, LH1 and HH1 are generated. Four further sub-bands, LL2, HL2, LH2 and HH2, are generated by wavelet decomposition of LL1. Higher levels of decomposition are obtained in the same way by reapplying the wavelet decomposition to LL2, LL3, etc. The technical background and the properties of the DWT are described in detail in [Dau90][RaBo98].
The second step in the lossy compression phase is quantization. Quantization is the main source of information loss in the encoder. The quantization function quantizes the wavelet coefficients in each sub-band, and its step size can be chosen with regard to the expected compression rate and reconstruction quality of the image. In the JPEG 2000 standard, the quantization is applied according to the following equation:

$$q_b(u,v) = \operatorname{sign}\big(a_b(u,v)\big) \left\lfloor \frac{|a_b(u,v)|}{\Delta_b} \right\rfloor \tag{2.14}$$

where $a_b(u,v)$ is the wavelet coefficient at position $(u,v)$ of sub-band $b$, $\Delta_b$ is the quantization step size of that sub-band and $q_b(u,v)$ is the resulting quantization index.
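The two steps described above, sub-band decomposition and quantization according to (2.14), can be illustrated with the following sketch. It is a simplification under stated assumptions: the Haar wavelet is used as a stand-in for the wavelet filters of the standard, the image dimensions are assumed to be even, and all function names and the step size are chosen for illustration only. The quantizer has the form sign(a)·⌊|a|/Δ⌋ for a coefficient a and a step size Δ.

```python
import numpy as np


def haar_dwt2(x: np.ndarray):
    """One level of a 2-D Haar decomposition (stand-in for the JPEG 2000 filters).

    Returns the sub-bands (LL, HL, LH, HH); the first letter refers to the
    horizontal and the second to the vertical filtering (L = low, H = high)."""
    x = x.astype(float)
    lo = (x[:, 0::2] + x[:, 1::2]) / 2.0      # horizontal low-pass
    hi = (x[:, 0::2] - x[:, 1::2]) / 2.0      # horizontal high-pass
    ll = (lo[0::2, :] + lo[1::2, :]) / 2.0    # vertical low-pass of lo
    lh = (lo[0::2, :] - lo[1::2, :]) / 2.0    # vertical high-pass of lo
    hl = (hi[0::2, :] + hi[1::2, :]) / 2.0    # vertical low-pass of hi
    hh = (hi[0::2, :] - hi[1::2, :]) / 2.0    # vertical high-pass of hi
    return ll, hl, lh, hh


def quantize(coeffs: np.ndarray, step: float) -> np.ndarray:
    """Quantization index sign(a) * floor(|a| / step), cf. equation (2.14)."""
    return (np.sign(coeffs) * np.floor(np.abs(coeffs) / step)).astype(int)


# Two decomposition levels of a constant 8x8 image: all detail sub-bands vanish.
image = np.full((8, 8), 10.0)
ll1, hl1, lh1, hh1 = haar_dwt2(image)
ll2, hl2, lh2, hh2 = haar_dwt2(ll1)          # second level is applied to LL1

# Quantization of a few example coefficients with step size 2.
coeffs = np.array([-7.9, -2.0, -0.4, 0.0, 0.4, 3.1, 9.6])
indices = quantize(coeffs, step=2.0)
print("LL1 shape:", ll1.shape, "LL2 shape:", ll2.shape)
print("quantization indices:", indices.tolist())
```

All coefficients with a magnitude below the step size are mapped to the index zero, so small wavelet coefficients are discarded and the step size controls the trade-off between compression rate and reconstruction quality.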
Before proceeding with the entropy coding, a special feature of JPEG 2000 called Region Of Interest (ROI) coding has to be carried out. The region of interest is one of the most important features of the JPEG 2000 standard. It allocates different compression rates to different regions of an image depending on their level of importance, which is specified by the user. In some applications the user might wish to encode a certain part of the image, referred to as the ROI, at a higher level of quality than the rest of the image, which is called the background [RaJo02]. The compression rate of the ROI parts must therefore be adjusted during the compression procedure according to the quality and performance specified by the user.
The general method for performing ROI encoding is as follows [RaJo02][AtFa98][NiCh99]:
– The ROI area of the image is specified by the user;
– a binary mask is generated in the wavelet domain, in which the value one is assigned to each coefficient of the ROI portion and the value zero elsewhere (background);
– after ROI shape adjustment, the bit-planes of the coefficients corresponding to the ROI mask are shifted up and those corresponding to the background are shifted down according to some desired values.
ROI coding is an interesting feature of JPEG 2000; however, it might increase the computational cost of the encoding/decoding algorithm due to the necessity of mask generation. To overcome this drawback, JPEG 2000 uses a technique called the MAXSHIFT method, as adopted in Part 1 of the standard [T800][Chetal00]. The main idea of the MAXSHIFT method is to shift the wavelet coefficients in the ROI so that the smallest coefficient in the ROI portion is greater than the largest coefficient in the image background. The specification of the MAXSHIFT method is given in more detail in [T800]. After the quantization and the ROI encoding, the quantized DWT elements are converted into sign-magnitude representation before the entropy coding step [AcTs04].
The third and last step in the compression phase of JPEG 2000 is entropy encoding. The purpose of the entropy encoding of the quantized wavelet coefficients is to compress the data of the code-blocks in each sub-band. This step generates the compressed bit stream in JPEG 2000 and consists of two main coding steps, referred to as Tier-1 coding and Tier-2 coding. In the first coding step, Tier-1, the bit-planes of the code-blocks are arithmetically encoded independently [RaJo02]. This step consists of complicated coding procedures, the details of which can be found in [AcTs04].
The second coding step is performed by the Tier-2 bit stream organization.
Compressed bit stream formation: The structure of the compressed bit stream produced in the entropy coding stage must be organized in order to make the encoded data handier to use. Tier-2 coding is responsible for organizing the formation and structure of the bit stream in the entropy coding block. Tier-2 represents the information of each compressed code-block after the Tier-1 coding step in the form of constructs known as layers and packets. The precise definition of each construct, along with the detailed specification of the Tier-2 part, can be found in [AcTs04][T800].
The phases mentioned above review the main building blocks of the JPEG 2000 core coding system from a very general viewpoint. However, there are other important features of JPEG 2000 which especially concern the implementation, efficiency and performance of the compression technique. One such issue is rate control. Rate control is a process in JPEG 2000 in which an optimal compressed image with a desired (allocated) encoding bit rate is generated. A common method to achieve this optimality is to minimize the errors introduced by the quantization and truncation operations [AcTs04], using the Mean Square Error (MSE) as a metric. Different approaches to rate control in JPEG 2000 have been described in the literature. For example, one method is simply an extension of the idea of the q-table used in JPEG image compression [PeMi93][RaJo02][RoSt95]. Another method is based on the compression algorithm called Embedded Block Coding with Optimized Truncation (EBCOT) [Tau00].
The JPEG 2000 standard does not limit further development of the phases and building blocks toward improvement, better optimization and adaptability to applications. In fact, the standard is continuously being extended (e.g., [JPEG2000][Feetal15][Ohetal13]) by further expansions and developments which match new applications and outperform the former parts.
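The idea behind the MAXSHIFT method mentioned earlier in this section can be illustrated on a small array of quantized coefficients. The sketch below is an assumption-laden simplification: it works on integer magnitudes of a single array rather than on bit-planes of code-blocks, and the function names are hypothetical. It shows why, after the shift, the ROI coefficients can be recognized from their magnitude alone.

```python
import numpy as np


def maxshift_encode(q: np.ndarray, roi_mask: np.ndarray):
    """Scale the ROI magnitudes by 2**s, with s chosen such that 2**s exceeds
    every background magnitude."""
    mag = np.abs(q)
    background = mag[~roi_mask]
    background_max = int(background.max()) if background.size else 0
    s = background_max.bit_length()            # smallest s with 2**s > background_max
    shifted = np.where(roi_mask, mag << s, mag)
    return np.sign(q) * shifted, s


def maxshift_decode(shifted: np.ndarray, s: int):
    """Recover the coefficients; the ROI is identified by magnitude only."""
    mag = np.abs(shifted)
    is_roi = mag >= (1 << s)                   # only shifted ROI values are this large
    restored = np.where(is_roi, mag >> s, mag)
    return np.sign(shifted) * restored, is_roi


# Quantized coefficients (example values) with a 2x2 region of interest.
q = np.array([[5, -3, 0, 1],
              [2, 7, -1, 0],
              [0, -2, 4, 1],
              [1, 0, -6, 3]])
roi_mask = np.zeros(q.shape, dtype=bool)
roi_mask[1:3, 1:3] = True

shifted, s = maxshift_encode(q, roi_mask)
restored, detected_roi = maxshift_decode(shifted, s)
print("shift s =", s)
print(shifted)
```

In this example the largest background magnitude is 6, so s = 3 and every non-zero ROI coefficient ends up with a magnitude of at least 8. A decoder of this kind needs only the value of s, not the mask itself; ROI coefficients that are zero remain zero and cannot be distinguished from the background, which does not affect the reconstruction.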
2.3.2.5 Image Analysis
Image analysis consists of several methods which simulate human perception in order to investigate quantitative properties of the objects in the image, as well as their mutual relationships. Applications of image analysis are broad: from simple inspection in automated production and automatic medical image analysis to complex computer vision. Typical methods of image analysis are edge detection, segmentation, object identification etc. While the image processing operations make the images look better, image analysis tries to identify the content of the image. Image analysis includes segmentation, which is the process of grouping together similar parts of the image. This includes the segmentation of individual objects in an image. Edge detection is one of the many methods used to segment the objects in an image.
2.4 Television
Over more than 80 years of development, television technology has changed constantly, especially in the area of electronic components. For a very long time, however, the basic principle of producing, transmitting and reproducing the TV signal remained the same, i.e. the TV signal was analog. The slow change of the general concept was the consequence of many factors, such as the huge number of analog TV sets in the homes
of people around the world and the insufficient development of the technology needed to support the digitalization process. The transition to digital standards in all parts of TV systems required maintaining the old TV service with many spectrally inefficient channels, each occupying a broadcast frequency range of 7 MHz. Once all the necessary conditions were fulfilled, the full digitalization of television started. Having begun in the second half of the 2000s, this process is now in its final phase, so that the analog TV broadcast signal has been shut down in the vast majority of countries. Nowadays, this type of TV signal can still be found in the coaxial cables of CATV operators, but even there it will soon be replaced by a digital signal.
Unlike a speech signal, which is still present in analog form in the relatively short terminal segments of communication and audio systems (e.g. in the analog local loops of the Public Switched Telephone Network (PSTN) or in the cables between analog microphones and (pre)amplifiers), the “moving pictures” are digitized at the very first point of their creation, in digital video cameras. Further, different compression/decompression and coding techniques can be applied to this signal during transmission, as well as “packing” together with other digital signals into “containers” with standardised bandwidths (this process is called multicasting). At the other end of television systems, in TV receivers, an analog signal can still be reproduced (if delivered), while digital signals (which are becoming dominant) are supported by default. One issue that can still arise is that some modern TV receivers may not support the digital standard delivered by the broadcaster; in those cases converters (the so-called set-top boxes) should be used for the conversion from that digital standard (e.g. DVB-T2) to one which can be reproduced by the receiver.
During the transition of television from totally analog to fully digital, the analog parts of TV systems were gradually replaced by digital counterparts that satisfied the specific demands of the time. As a consequence, many different standards for the digital TV signal were developed. In fact, in today’s “digital age” the term “TV signal” is not quite suitable; a more appropriate term would be “video stream”, since all modern PCs, notebooks, gadgets etc. are able to reproduce most of the video content available in various digital standards. Many new multimedia services are present on the market today, e.g. the “IPTV service” delivered via the digital local loops of telecom operators, or “internet television”, which can be watched via the Internet on computers or smartphones. In other words, the “classic TV receivers”, which use broadcast terrestrial or satellite signals received by antennas or the signals provided by CATV operators, are no longer the only devices that can be used for the reproduction of a TV signal.
In spite of the fact that different digital TV standards are used in different countries, the main concept of creating the original signal is the same, and the variations between the standards are mostly related to three main factors:
– Image resolution
– Aspect ratio
– Scan mode (progressive or interlaced)
The raw digital video signal is created in video cameras by Charge-Coupled Device (CCD) sensors [BoSm74][BoSm74-2] or by Complementary Metal-Oxide Semiconductor (CMOS) active pixel sensors [Dietal94]. For each of the three basic colours (red, green, blue) there is one sensor, onto which the filtered image of the appropriate colour, captured by the objective lens of the camera, is projected through the optical system of the camera. The sensor consists of a two-dimensional array of photosensitive elements, i.e. pixels, each of which accumulates a certain amount of charge, i.e. gives a voltage value proportional to the intensity of the light falling on that particular pixel. The image resolution depends on the number of pixels in the sensors if the image is created by a digital camera, or on the way an analog image is scanned (i.e. analysed) if a conversion from analog to digital content is applied. In a digital camera, the two-dimensional image, represented by the matrix of charges accumulated in the capacitive elements of the pixels, is converted into a sequence of voltage values by means of a shift register (i.e. the array of concatenated capacitive elements of all pixels) and a charge amplifier at the end of the shift register. The voltage values obtained after the amplifier are then sampled, digitized and stored in an internal memory. This process is repeated periodically, e.g. 60 times per second, so that the memory accumulates a sequence of images in a digital format, which de facto represents a digital video signal. The raw data (video signals) in memory from the three sensors for the basic colours can be combined internally (in the camera) or externally, i.e. converted into one of the standardized formats for digital video, in which the digitized sound signal of the captured scene from internal or external microphone(s) is usually also included.
In order to keep compatibility with the huge legacy of analog TV formats, movies stored on film and other video content which also had to be transmitted and/or reproduced by new digital TV equipment, numerous techniques for the conversion between different analog and digital formats have been developed. The various standardised video formats use many different combinations of horizontal and vertical image resolution, expressed as the number of pixels in the horizontal and vertical lines of the analysed images. These values are mostly determined by the video/TV equipment manufacturers and international standards organizations with regard to numerous factors such as the desired quality of the reproduced picture, aspect ratio, compatibility with analog TV standards, bandwidth
demands in transmission, memory requirements of system elements and many other technical (and financial) reasons. Typical resolutions are: 1920×1080, 1280×720, 720×576, 720×480, 640×480.
The aspect ratio of a digital TV image is the ratio between the width and the height of the image. It depends on the technology used in the country where the signal is broadcast. The two most common aspect ratios are 16:9 and 4:3.
The scan mode used in image analysis was also the subject of many disputes between manufacturers of consumer electronics and computers, as well as the film industry and public interest groups. In the so-called “progressive scan mode” the lines of an image are scanned sequentially from top to bottom, while in the “interlaced mode” first the even-numbered lines are scanned and then the odd-numbered lines. The formats implementing progressive mode have the letter “p” in the name of the standard, while the formats with interlaced scan are designated with “i”. A shortened list of some current digital TV formats with their main details is given in Table 2.4.
Tab. 2.4: Actual Digital TV (DTV) standards.
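| Category | Video format | Aspect ratio | Resolution |
|---|---|---|---|
| Standard-Definition TV (SDTV) | 480i | 4:3 or 16:9 or 3:2 | 640x480, 704x480, 720x480, 852x480 |
| Standard-Definition TV (SDTV) | 576i | 4:3 or 16:9 | 480x576, 544x576, 704x576, 720x576, 768x576 |
| Enhanced-Definition TV (EDTV) | 480p | 4:3 or 16:9 or 3:2 | 640x480, 704x480, 720x480, 852x480 |
| Enhanced-Definition TV (EDTV) | 576p | 4:3 or 16:9 | 480x576, 544x576, 704x576, 720x576, 768x576 |
| High-Definition TV (HDTV) | 720p | 16:9 | 1280x720 |
| High-Definition TV (HDTV) | 1080i | 16:9 | 1920x1080 |
| High-Definition TV (HDTV) | 1080p | 16:9 | 1920x1080 |
| Ultra-High-Definition TV (UHDTV) | 2160p (4K UHD) | 16:9 | 3840x2160 |
| Ultra-High-Definition TV (UHDTV) | 4320p (8K UHD) | 16:9 | 7680x4320 |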
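To get a feeling for the data rates behind the formats in Table 2.4, the raw (uncompressed) bit rate of a progressive video stream can be estimated from its resolution. The sketch below assumes 60 frames per second, as in the camera example given earlier, and 24 bits per pixel (8 bits for each of the three basic colours); both values are assumptions for this illustration and are not prescribed by the formats themselves.

```python
from math import gcd

# Resolutions taken from Table 2.4 (width, height in pixels).
FORMATS = {
    "576p": (720, 576),
    "720p": (1280, 720),
    "1080p": (1920, 1080),
    "2160p": (3840, 2160),
}


def raw_bit_rate(width: int, height: int,
                 frames_per_second: int = 60, bits_per_pixel: int = 24) -> int:
    """Uncompressed bit rate in bit/s (assumed frame rate and colour depth)."""
    return width * height * frames_per_second * bits_per_pixel


def pixel_ratio(width: int, height: int):
    """Ratio of horizontal to vertical pixel count in lowest terms."""
    d = gcd(width, height)
    return width // d, height // d


for name, (w, h) in FORMATS.items():
    rate = raw_bit_rate(w, h)
    print(f"{name}: {rate / 1e6:9.1f} Mbit/s raw, pixel ratio {pixel_ratio(w, h)}")
```

Under these assumptions a 1080p stream needs almost 3 Gbit/s before compression, which illustrates why compression is indispensable for broadcasting. The example also shows that the ratio of pixel counts does not always equal the aspect ratio: 720x576 reduces to 5:4, although the picture is displayed as 4:3 or 16:9, because the pixels of such formats are not square.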
One of the greatest advantages of digital over analog TV, and perhaps the most important advance since colour TV was introduced in the 1950s, is the efficient and flexible usage of spectrum resources in transmission, i.e. in the broadcasting of digital content.
Namely, the development of digital broadcast platforms, which include modern techniques of data coding, compression, multiplexing and modulation of many different digital audio and video formats, has made it possible to distribute several TV and radio programmes (i.e. broadcast them through the air, transmit them via the Internet, deliver them by CATV, IPTV etc.) using a single transmission channel with a predetermined bandwidth. For example, in analog TV each 7 MHz wide transmission channel was occupied by only one programme (analog TV signal), while in e.g. DVB-T2 broadcast (see Table 2.5) an 8 MHz wide channel can transmit many digital streams (programmes) in different digital formats with an overall bit rate of up to approximately 50 Mbit/s. The transition to digital TV broadcasting freed huge spectrum resources that can be used for new applications, one of which is e.g. 4G/LTE mobile communication.
The “multicast” of several digital formats through one transmission channel is achieved by splitting the digital data stream into a large number of slower digital streams. Each of these slower streams modulates one of a set of closely spaced adjacent subcarrier frequencies, in contrast to the scenario where one fast data stream modulates a single carrier of one radio frequency (RF) channel. The broadcasting of digital TV signals can be performed through the air at radio frequencies (terrestrial and mobile), via satellite or through cable. Different standards are used in different countries as the platform for digital TV broadcast. A platform denotes the first two layers (physical and data link layer) of a broadcasting system.
The following groups of digital TV broadcast standards (platforms) are used worldwide:
– Digital Video Broadcast (DVB) – for terrestrial (DVB-T/T2), satellite (DVB-S/S2/S2X/SH), cable (DVB-C/C2) and mobile/handheld (DVB-H/NGH) broadcast
– Advanced Television Systems Committee (ATSC) – for terrestrial (ATSC 1.0/2.0/3.0) and mobile/handheld (ATSC-M/H) broadcast
– Integrated Services Digital Broadcasting (ISDB) – for terrestrial (ISDB-T), terrestrial international (ISDB-T International), satellite (ISDB-S), cable (ISDB-C) and mobile/handheld (1seg) broadcast
– Digital Terrestrial Multimedia Broadcast (DTMB) – for terrestrial (DTMB-T) and mobile/handheld (CMMB) broadcast
– Digital multimedia broadcasting (DMB) – for terrestrial (T-DMB) and satellite (S-DMB) broadcast
Among the standards for terrestrial digital broadcasting, the most used are DVB (in Europe, Australia, New Zealand, parts of Asia and Africa and a few other countries), ATSC (in North America), ISDB (in Japan and South America) and DTMB (in China). The DVB standards are listed in Table 2.5.
Tab. 2.5: DVB standards and their characteristics.

| DVB standard | Modulation (modulation schemes) | Available bitrates in an 8 MHz channel |
|---|---|---|
| DVB-T | OFDM, COFDM (16-QAM, 64-QAM, QPSK) | 4.976–31.668 Mbit/s |
| DVB-T2 | OFDM (QPSK, 16-QAM, 64-QAM, 256-QAM) | 7.444–50.324 Mbit/s |
| DVB-(S)H (for handhelds) | COFDM with SH-A or DMT with SH-B (QPSK, 16-QAM) | up to 9.9 Mbit/s (for 5 MHz LMS channel) |
| DVB-C | Single Carrier QAM (16-QAM, 32-QAM, 64-QAM, 128-QAM, 256-QAM) | 25.6–51.3 Mbit/s (32.1–64.1 Mbit/s for 10 MHz channel) |
| DVB-C2 | Absolute OFDM (16- to 4096-QAM) | up to 83.1 Mbit/s |
| DVB-S | Single Carrier QPSK (BPSK, QPSK, 8-PSK, 16-QAM) | |
| DVB-S2 (DVB-SX) | Single Carrier QPSK with Multiple Streams (BPSK, QPSK, 8-PSK, 16-APSK, 32-APSK + 64-, 128- and 256-APSK with DVB-SX) | |
| DVB-SH | QPSK, 16-QAM | |
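Several of the standards in Table 2.5 are based on OFDM, i.e. on the splitting of one fast data stream into many slower streams on closely spaced subcarriers, as described above. The following sketch shows this principle with QPSK-modulated subcarriers and an inverse FFT. It is a bare-bones illustration under stated assumptions: 64 subcarriers, an ideal noiseless channel, no guard interval, no pilots and no channel coding; none of these parameters correspond to an actual DVB mode.

```python
import numpy as np


def qpsk_map(bits: np.ndarray) -> np.ndarray:
    """Map pairs of bits to QPSK symbols (bit 0 -> +, bit 1 -> -)."""
    pairs = bits.reshape(-1, 2)
    return ((1 - 2 * pairs[:, 0]) + 1j * (1 - 2 * pairs[:, 1])) / np.sqrt(2)


def qpsk_demap(symbols: np.ndarray) -> np.ndarray:
    """Inverse of qpsk_map by sign decision."""
    bits = np.empty((symbols.size, 2), dtype=int)
    bits[:, 0] = symbols.real < 0
    bits[:, 1] = symbols.imag < 0
    return bits.ravel()


def ofdm_modulate(bits: np.ndarray, n_subcarriers: int) -> np.ndarray:
    """Serial-to-parallel conversion followed by an IFFT per OFDM symbol."""
    # Each row holds the symbols carried in parallel by the subcarriers.
    parallel = qpsk_map(bits).reshape(-1, n_subcarriers)
    return np.fft.ifft(parallel, axis=1)        # time-domain OFDM symbols


def ofdm_demodulate(signal: np.ndarray) -> np.ndarray:
    """FFT per OFDM symbol followed by parallel-to-serial conversion."""
    parallel = np.fft.fft(signal, axis=1)
    return qpsk_demap(parallel.ravel())


n_subcarriers = 64
n_ofdm_symbols = 10
rng = np.random.default_rng(7)
bits = rng.integers(0, 2, size=2 * n_subcarriers * n_ofdm_symbols)

signal = ofdm_modulate(bits, n_subcarriers)
recovered = ofdm_demodulate(signal)
print("OFDM symbols x samples:", signal.shape)
print("bit errors over ideal channel:", int(np.count_nonzero(bits != recovered)))
```

Each subcarrier carries only one QPSK symbol per OFDM symbol, so its symbol rate is the overall rate divided by the number of subcarriers, which is the "slower stream" referred to in the text.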
3 Random Processes
3.1 Probability Theory
The founder of modern probability theory is the Russian mathematician Kolmogorov, who introduced its axioms in 1933 [Kol33]. Probability can be easily explained using a random experiment which fulfils the following conditions:
– The experiment is repeated under the same conditions
– The result of the experiment is random
– The experiment is repeated a huge number of times.
The last condition is important for achieving the statistical regularity of the experiment: the averaged results of the experiment converge to fixed values. Statistical regularity cannot be mathematically proved. The probability of an event A is calculated as the ratio between the number of occurrences of the event A, N(A), and the total number of trials n:
$$P(A) = \frac{N(A)}{n} \tag{3.1}$$
As 0 ≤ P(A) ≤ 1, there are two extreme values:
– P(A) = 0: the event A never happens (impossible event)
– P(A) = 1: the event A always happens (certain event).
A statistically regular experiment is one in which the probability of an event A can be expressed as:
$$P(A) = \lim_{n \to \infty} \frac{N(A)}{n} \tag{3.2}$$
A typical example in digital communications is the definition of the Bit Error Rate (BER) as the ratio of the number of erroneous bits to the total number of bits: e.g. BER = 10⁻⁶ means one erroneous bit out of 10⁶. Another typical example is the throwing of a die: the elementary events are the appearances of the individual sides of the cube, i.e. the probability of each of the 6 possible events (appearance of 1, 2, 3, 4, 5 or 6) equals 1/6.
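The statistical regularity expressed by (3.2) can be observed in a simple simulation of the two examples above. The sketch is only illustrative: the random generator seed, the numbers of trials and the assumed bit error probability of 10⁻³ (chosen larger than 10⁻⁶ so that the simulation stays short) are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(1)


def relative_frequency(event_occurred: np.ndarray) -> float:
    """N(A) / n for a boolean array marking the trials in which A occurred."""
    return np.count_nonzero(event_occurred) / event_occurred.size


# Throwing a die: relative frequency of the event "a six appears".
estimates = {}
for n in (100, 10_000, 1_000_000):
    throws = rng.integers(1, 7, size=n)        # values 1..6, equally likely
    estimates[n] = relative_frequency(throws == 6)
    print(f"n = {n:>9}: N(A)/n = {estimates[n]:.4f}   (1/6 = {1 / 6:.4f})")

# Bit error rate: ratio of erroneous bits to transmitted bits.
p_error = 1e-3                                  # assumed error probability
n_bits = 2_000_000
bit_errors = rng.random(n_bits) < p_error       # True marks an erroneous bit
ber = relative_frequency(bit_errors)
print(f"measured BER = {ber:.2e} for an error probability of {p_error:.0e}")
```

With a growing number of trials the relative frequency settles ever closer to 1/6, and the measured BER approaches the error probability of the simulated channel.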
3.1.1 Terms and axioms of the probability theory
Operations used in the probability theory [Gha05] are:
1. The intersection of events A⋂B is an event that is realized if both events A and B are realized. Instead of A⋂B, AB is often used for the intersection.
2. The union of events A⋃B is an event that is realized if at least one of the events A or B is realized. If A and B are mutually exclusive, it can be written as A + B instead.
3. The complement of an event A is the event Aᶜ or A̅, whose probability is:

$$P(\bar{A}) = \lim_{n \to \infty} \frac{n - N(A)}{n} \tag{3.3}$$
4. If an event A implies an event B, then A ⊂ B.
Consider a set Ω and a set of its subsets. The function P defined on the subsets of Ω is called a probability on Ω if the following axioms are fulfilled:
1. P(Ω) = 1 (3.4)
2. 0 ≤ P(A) ≤ 1 for A ⊂ Ω (3.5)
3. P(A1 ⋃ A2 ⋃ …) = P(A1) + P(A2) + … (3.6)
if the events A1, A2,… (a finite number of them) are mutually exclusive.
Probability functions have the following properties:
– Probability of an empty set:
$$P(\emptyset) = 0 \tag{3.7}$$
– Difference between sets A and B:
$$P(A \setminus B) = P(A) - P(AB) \tag{3.8}$$
– Symmetric difference between sets A and B:
$$P(A \,\Delta\, B) = P(A \cup B) - P(AB) \tag{3.9}$$
– Implication:
$$A \subseteq B \Rightarrow P(A) \le P(B) \tag{3.10}$$
– If A1, A2,…, Ak are mutually exclusive:
$$P\left(\bigcup_{i=1}^{k} A_i\right) = \sum_{i=1}^{k} P(A_i) \tag{3.11}$$
– If A and B are arbitrary (not necessarily mutually exclusive) events:
$$P(A \cup B) = P(A) + P(B) - P(AB) \tag{3.12}$$
– The conditional probability of an event A, under the condition that the event B happened, whereby P(B) ≠ 0, is defined as:
$$P(A \mid B) = \frac{P(AB)}{P(B)} \tag{3.13}$$
and analogously, for P(A) ≠ 0:
$$P(B \mid A) = \frac{P(AB)}{P(A)} \tag{3.14}$$
– Events A1, A2,…, Ak are statistically independent if:
$$P\left(\prod_{i=1}^{k} A_i\right) = \prod_{i=1}^{k} P(A_i) \tag{3.15}$$
– Boole's inequality is given as:
$$P(A_1 \cup A_2 \cup \dots \cup A_n) \le \sum_{i=1}^{n} P(A_i) \tag{3.16}$$
– If A1, A2, A3,… is an infinite sequence of events such that A1 ⊆ A2 ⊆ A3 ⊆ …, the continuity of probability is expressed by the relation:
$$P\left(\bigcup_{i=1}^{\infty} A_i\right) = \lim_{n \to \infty} P(A_n) \tag{3.17}$$
and, if A1 ⊇ A2 ⊇ A3 ⊇ …, then:
$$P\left(\bigcap_{i=1}^{\infty} A_i\right) = \lim_{n \to \infty} P(A_n) \tag{3.18}$$
3.1.2 Conditional Probability, Total Probability and Bayes’ Theorem

The probability of the intersection (junction) of events P(A⋂B) is often written as the probability of the product of events P(AB), but also as a joint probability P(A, B). Considering two events A and B, where it is known that the event B has happened, the probability of A is called the conditional probability of A under the condition B (in short: “probability of A if B”) and is denoted by P(A | B):
$$P(A \mid B) = \frac{P(A, B)}{P(B)} \tag{3.19}$$
if B happened, i.e. P(B) ≠ 0. Similarly, if A happened, i.e. P(A) ≠ 0:
$$P(B \mid A) = \frac{P(A, B)}{P(A)} \tag{3.20}$$
Consequently, the joint probability is:
$$P(A, B) = P(A)\,P(B \mid A) = P(B)\,P(A \mid B) \tag{3.21}$$
The equation (3.21) can be generalized using mathematical induction:
$$P(A_1, A_2, \dots, A_n) = P(A_1) \cdot P(A_2 \mid A_1) \cdot P(A_3 \mid A_1, A_2) \cdots P(A_n \mid A_1, A_2, \dots, A_{n-1}) \tag{3.22}$$
If A and B are independent events:
$$P(A \mid B) = P(A) \tag{3.23}$$
i.e.:
$$P(B \mid A) = P(B) \tag{3.24}$$
and further:
$$P(A, B) = P(A)\,P(B) \tag{3.25}$$
which is the mathematical definition of the (statistical or stochastic) independence of events. Generally:
$$P(A_1, A_2, \dots, A_n) = P(A_1) \cdot P(A_2) \cdots P(A_n) \tag{3.26}$$
If the complete system of events (the so-called hypotheses) is Bi (i = 1, 2,…, n), the probability of any event A can be calculated using the total probability law:
$$P(A) = \sum_{i=1}^{n} P(B_i)\,P(A \mid B_i) \tag{3.27}$$
The probabilities P(Bi) are also called a-priori probabilities. Bayes’ theorem gives the so-called a-posteriori probabilities of the hypotheses Bi, P(Bi | A):
$$P(B_i \mid A) = \frac{P(B_i)\,P(A \mid B_i)}{P(A)} = \frac{P(B_i)\,P(A \mid B_i)}{\sum_{i=1}^{n} P(B_i)\,P(A \mid B_i)} \tag{3.28}$$
In the case of a binary channel, which is often used in information theory (see Chapter 4), there are two possible input bits x (0 or 1). Their probabilities are the input or so-called a-priori probabilities. The channel is defined using the conditional probabilities of the output values y, the so-called transition probabilities P(yj | xi). The following equations are obvious:
$$P(x_1) + P(x_2) = 1 \tag{3.29}$$
$$P(y_1) + P(y_2) = 1 \tag{3.30}$$
$$P(y_1 \mid x_1) + P(y_2 \mid x_1) = 1 \tag{3.31}$$
$$P(y_1 \mid x_2) + P(y_2 \mid x_2) = 1 \tag{3.32}$$
The probabilities of the outputs are:
$$P(y_1) = P(x_1)\,P(y_1 \mid x_1) + P(x_2)\,P(y_1 \mid x_2) \tag{3.33}$$
$$P(y_2) = P(x_1)\,P(y_2 \mid x_1) + P(x_2)\,P(y_2 \mid x_2) \tag{3.34}$$
A-posteriori probabilities are calculated as:
$$P(x_1 \mid y_1) = \frac{P(x_1)\,P(y_1 \mid x_1)}{P(y_1)} \tag{3.35}$$
and:
$$P(x_2 \mid y_2) = \frac{P(x_2)\,P(y_2 \mid x_2)}{P(y_2)} \tag{3.36}$$
The remaining a-posteriori probabilities can be found using (3.35) and (3.36) as:
$$P(x_1 \mid y_2) = 1 - P(x_2 \mid y_2) \tag{3.37}$$
and:
$$P(x_2 \mid y_1) = 1 - P(x_1 \mid y_1) \tag{3.38}$$
In the case of the binary symmetric channel, both transition probabilities of an error are equal:
$$P(y_1 \mid x_2) = P(y_2 \mid x_1) = p \tag{3.39}$$
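The relations (3.33) to (3.39) can be followed step by step in a short numerical sketch. The function name `bsc_posteriors`, the dictionary keys and the example values P(x1) = 0.6 and p = 0.1 are assumptions chosen only for illustration; the sketch also assumes that both output probabilities are non-zero.

```python
def bsc_posteriors(p_x1, p):
    """Output and a-posteriori probabilities of a binary symmetric channel.

    p_x1 : a-priori probability P(x1); P(x2) = 1 - P(x1), see (3.29)
    p    : error (crossover) probability P(y1|x2) = P(y2|x1), see (3.39)
    """
    p_x2 = 1.0 - p_x1
    # Transition probabilities, see (3.31), (3.32) and (3.39).
    p_y1_x1, p_y2_x1 = 1.0 - p, p
    p_y1_x2, p_y2_x2 = p, 1.0 - p
    # Output probabilities by the total probability law.
    p_y1 = p_x1 * p_y1_x1 + p_x2 * p_y1_x2  # (3.33)
    p_y2 = p_x1 * p_y2_x1 + p_x2 * p_y2_x2  # (3.34)
    # A-posteriori probabilities by Bayes' theorem.
    post = {
        ("x1", "y1"): p_x1 * p_y1_x1 / p_y1,  # (3.35)
        ("x2", "y2"): p_x2 * p_y2_x2 / p_y2,  # (3.36)
    }
    post[("x1", "y2")] = 1.0 - post[("x2", "y2")]  # (3.37)
    post[("x2", "y1")] = 1.0 - post[("x1", "y1")]  # (3.38)
    return p_y1, p_y2, post


if __name__ == "__main__":
    p_y1, p_y2, post = bsc_posteriors(p_x1=0.6, p=0.1)
    print("P(y1) =", p_y1, "P(y2) =", p_y2)
    for (x, y), value in sorted(post.items()):
        print(f"P({x}|{y}) = {value:.4f}")
```

With these example values, P(y1) = 0.6·0.9 + 0.4·0.1 = 0.58, so observing y1 raises the probability of x1 from the a-priori value 0.6 to 0.54/0.58.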
3.2 Random signals

Many stochastic signals appear random when observed over a short time interval, but show regular behavior when observed over a long time interval. Therefore, the following statistical values are used for stochastic signals:
– Arithmetic mean value, for continuous values:
$$\bar{x} = \int_{-\infty}^{+\infty} x \cdot p(x)\,dx \tag{3.40}$$
and for discrete values:
$$\bar{x} = \sum_{i=1}^{n} x_i \cdot p(x_i) \tag{3.41}$$
– Quadratic mean value, for continuous values:
$$\overline{x^2} = \int_{-\infty}^{\infty} x^2\,p(x)\,dx \tag{3.42}$$
and for discrete values:
$$\overline{x^2} = \sum_{i=1}^{n} x_i^2 \cdot p(x_i) \tag{3.43}$$
– Effective value, for continuous values:
$$x_{eff} = \sqrt{\overline{x^2}} = \sqrt{\int_{-\infty}^{\infty} x^2\,p(x)\,dx} \tag{3.44}$$
and for discrete values:
$$x_{eff} = \sqrt{\overline{x^2}} = \sqrt{\sum_{i=1}^{n} x_i^2 \cdot p(x_i)} \tag{3.45}$$
– Variance, for continuous values:
$$\sigma_X^2 = \int_{-\infty}^{+\infty} (x - \bar{x})^2\,p(x)\,dx = \overline{x^2} - (\bar{x})^2 \tag{3.46}$$
and for discrete values:
$$\sigma_X^2 = \sum_{i=1}^{n} (x_i - \bar{x})^2\,p(x_i) \tag{3.47}$$
– Standard deviation:
$$\sigma_X = \sqrt{\sigma_X^2} = \sqrt{\overline{x^2} - (\bar{x})^2} \tag{3.48}$$
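As an illustration of the discrete formulas (3.41), (3.43), (3.45), (3.47) and (3.48), the following sketch evaluates them for a fair die. The function name `discrete_statistics` and the choice of the die as an example are assumptions made here for illustration.

```python
import math


def discrete_statistics(values, probs):
    """Return mean, quadratic mean, effective value, variance and
    standard deviation of a discrete distribution."""
    mean = sum(x * p for x, p in zip(values, probs))                    # (3.41)
    mean_sq = sum(x * x * p for x, p in zip(values, probs))             # (3.43)
    x_eff = math.sqrt(mean_sq)                                          # (3.45)
    variance = sum((x - mean) ** 2 * p for x, p in zip(values, probs))  # (3.47)
    std = math.sqrt(variance)                                           # (3.48)
    return mean, mean_sq, x_eff, variance, std


if __name__ == "__main__":
    faces = [1, 2, 3, 4, 5, 6]
    probs = [1 / 6] * 6
    mean, mean_sq, x_eff, variance, std = discrete_statistics(faces, probs)
    print("mean =", mean, "quadratic mean =", mean_sq)
    print("effective value =", x_eff)
    print("variance =", variance, "standard deviation =", std)
    # The variance also equals the quadratic mean minus the squared mean.
    print("check:", mean_sq - mean ** 2)
```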
3.2.1 Random Variables and Random Vectors

Consider a set Ω of possible experiment results ω, i.e. ω ∈ Ω, and a set of possible events ε. A function X(⋅) which is defined on the set Ω and whose values are real numbers is called a random variable of the experiment. The random variable X is defined as a mapping of the set Ω to the set R of real numbers, i.e. X: Ω → R, such that [Duk08]:
$$(\forall x \in R)\;\{\omega \in \Omega \mid X(\omega) \le x\} \in \varepsilon \tag{3.49}$$
The mapping X can be discrete, if it gives a finite number of values from a finite interval, or continuous (continual). There are also mixed or combined random variables. The simplest experiment is a binary one with two possible results (0 and 1), whereby the random variable is called a Bernoulli random variable:
$$P(X = 1) = P(A) \tag{3.50}$$
and:
$$P(X = 0) = 1 - P(A) \tag{3.51}$$
3.2.1.1 Distribution Function and Probability Density Function

The distribution function of a random variable X is a real, monotonic non-decreasing function defined as:
$$F_X(x) = P\{\omega \in \Omega \mid X(\omega) \le x\} = P(X \le x) \tag{3.52}$$
The probability density function is defined as:
$$f_X(x) = \frac{dF_X(x)}{dx} \tag{3.53}$$
whereby:
$$\int_{-\infty}^{\infty} f_X(x)\,dx = 1 \tag{3.54}$$
As the random variable can be continuous or discrete (see Chapter 1), the distribution function is calculated respectively as:
$$F_X(x) = P(X \le x) = \int_{-\infty}^{x} f_X(y)\,dy \tag{3.55}$$
and:
$$F_X(x) = P(X \le x) = \sum_{y \le x} P(X = y) = \sum_{i} P(X = x_i)\,h(x - x_i) \tag{3.56}$$
where h is the Heaviside step function. In the discrete case the probability density function equals:
$$f_X(x) = \sum_{i} P(X = x_i)\,\delta(x - x_i) \tag{3.57}$$
where δ is the Dirac delta function.
The distribution function of a random variable X has the following properties:
– $0 \le F_X(x) \le 1$ for ∀x ∈ R (3.58)
– $F_X(-\infty) = \lim_{x \to -\infty} F_X(x) = 0$ (3.59)
– $F_X(\infty) = \lim_{x \to \infty} F_X(x) = 1$ (3.60)
– If a and b are real numbers and a < b, then, with $F_X(b^-)$ denoting the left-hand limit of $F_X$ at b:
$$P(a < X \le b) = F_X(b) - F_X(a) \tag{3.61}$$
$$P(a \le X < b) = F_X(b^-) - F_X(a^-) \tag{3.62}$$
$$P(a < X < b) = F_X(b^-) - F_X(a) \tag{3.63}$$
$$P(a \le X \le b) = F_X(b) - F_X(a^-) \tag{3.64}$$
$$P(X = b) = F_X(b) - F_X(b^-) \tag{3.65}$$
$$P(X \le b) = F_X(b) \tag{3.66}$$
$$P(X < b) = F_X(b^-) \tag{3.67}$$
$$P(X > a) = 1 - F_X(a) \tag{3.68}$$
3.2.1.2 Random Vectors

A random vector or multidimensional random variable X is a set of N random variables on the same event space:
$$X = [X_1, X_2, \dots, X_N]^T \tag{3.69}$$
The joint distribution function of a random vector X is defined as:
$$F_{X_1, X_2, \dots, X_N}(x_1, x_2, \dots, x_N) = P(X_1 \le x_1, X_2 \le x_2, \dots, X_N \le x_N) \tag{3.70}$$
where x1, x2,…, xN ∈ (–∞, ∞). In the case of a continuous random vector X, the joint probability distribution function equals:
$$F_X(x) = \int_{-\infty}^{x_N} \cdots \int_{-\infty}^{x_2} \int_{-\infty}^{x_1} f_{X_1, X_2, \dots, X_N}(y_1, y_2, \dots, y_N)\,dy_1\,dy_2 \cdots dy_N \tag{3.71}$$
and in the case of a discrete random vector:
$$F_X(x) = \sum_{y_N \le x_N} \cdots \sum_{y_2 \le x_2} \sum_{y_1 \le x_1} P(X_1 = y_1, X_2 = y_2, \dots, X_N = y_N) \tag{3.72}$$
The joint probability density function in the case of a continuous random vector is calculated as:
$$f_{X_1, X_2, \dots, X_N}(x_1, x_2, \dots, x_N) = \frac{\partial^N F_{X_1, X_2, \dots, X_N}(x_1, x_2, \dots, x_N)}{\partial x_1\,\partial x_2 \cdots \partial x_N} \tag{3.73}$$
and in the case of a discrete random vector:
P ( X = x ,Y = y ) = = P( X 1 = x1 , Y1 = y1 ) ⋅ P( X 2 = x2 , Y2 = y 2 ) ⋅ ⋅ ⋅ P( X N = x N , YN = y N )
(3.74)
The components of a random vector are independent iff:
$$F_{X_1, X_2, \dots, X_N}(x_1, x_2, \dots, x_N) = F_{X_1}(x_1) \cdot F_{X_2}(x_2) \cdots F_{X_N}(x_N) \tag{3.75}$$
In the case of a continuous random vector, it holds:
$$f_{X_1, X_2, \dots, X_N}(x_1, x_2, \dots, x_N) = f_{X_1}(x_1)\,f_{X_2}(x_2) \cdots f_{X_N}(x_N) \tag{3.76}$$
and in the case of a discrete random vector:
$$P(X_1 = x_1, X_2 = x_2, \dots, X_N = x_N) = P(X_1 = x_1) \cdot P(X_2 = x_2) \cdots P(X_N = x_N) \tag{3.77}$$
3.2.1.3 Conditional Probabilities of Random Vectors

Consider X and Y as random vectors:
$$X = [X_1, X_2, \dots, X_N]^T \tag{3.78}$$
$$Y = [Y_1, Y_2, \dots, Y_M]^T \tag{3.79}$$
and the sets $A \subseteq R^N$ and $B \subseteq R^M$. The conditional probability that X ∈ A under the condition that Y ∈ B, with P(Y ∈ B) ≠ 0, is calculated as:
$$P(X \in A \mid Y \in B) = \frac{P(X \in A, Y \in B)}{P(Y \in B)} \tag{3.80}$$
The conditional probability distribution function is then, in the case of continuous random vectors:
$$F_{X|Y}(x \mid y) = P(X \le x \mid Y = y) = \int_{-\infty}^{x_N} \cdots \int_{-\infty}^{x_2} \int_{-\infty}^{x_1} f_{X|Y}(u \mid y)\,du \tag{3.81}$$
and in the case of discrete random vectors:
$$F_{X|Y}(x \mid y) = P(X \le x \mid Y = y) = \frac{P(X \le x, Y = y)}{P(Y = y)} = \sum_{z \le x} P(X = z \mid Y = y) \tag{3.82}$$
The conditional probability density function is, for continuous random vectors, given as:
$$f_{X|Y}(x \mid y) = \frac{f_{X,Y}(x, y)}{f_Y(y)}, \quad f_Y(y) \ne 0 \tag{3.83}$$
and for discrete random vectors as:
$$P(X = z \mid Y = y) = \frac{P(X = z, Y = y)}{P(Y = y)} \tag{3.84}$$
3.2.2 Examples of Often Used Distributions

3.2.2.1 Uniform Distribution

The probability density function of the continuous uniform or rectangular distribution (notation: U(a, b) or unif(a, b)) is constant on a defined interval [a, b], i.e. all values within the interval are equally likely (Fig. 3.1):
$$f_X(x) = \begin{cases} \dfrac{1}{b - a}, & x \in [a, b] \\ 0, & \text{otherwise} \end{cases} \tag{3.85}$$
Fig. 3.1: Probability density function of a uniform distribution.
The most important values of a uniform distribution are given in the following table:
Tab. 3.1: Basic parameters of the uniform distribution.
Arithmetic mean value: $\bar{x} = \dfrac{a + b}{2}$
Quadratic mean value: $\overline{x^2} = \dfrac{a^2 + ab + b^2}{3}$
Effective value: $x_{eff} = \sqrt{\dfrac{a^2 + ab + b^2}{3}}$
Variance: $\sigma^2 = \dfrac{(b - a)^2}{12}$
Standard deviation: $\sigma = \dfrac{b - a}{2\sqrt{3}}$
The uniform distribution function is given as (Fig. 3.2):
$$F_X(x) = \begin{cases} 0, & x < a \\ \dfrac{x - a}{b - a}, & a \le x \le b \\ 1, & x > b \end{cases} \tag{3.86}$$
Fig. 3.2: Probability distribution function of a uniform distribution.
3.2.2.2 Normal (Gaussian) Distribution Function

The normal or Gaussian distribution (notation: Ɲ(µ, σ)) is very often used not only in communications, but also in many natural and social sciences. The probability density function is calculated as (Fig. 3.3):
$$f_X(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-\frac{(x - \mu)^2}{2\sigma^2}} \tag{3.87}$$
where µ and σ denote the mean value and the standard deviation, respectively (σ² is the variance).
Fig. 3.3: Probability density function of a normal distribution. The maximum $f_X(\mu) = 1/(\sqrt{2\pi}\,\sigma)$ is reached at x = µ; at x = µ ± σ the density equals approximately 0.61 $f_X(\mu)$.
The normal distribution function is given as (Fig. 3.4):
$$F_X(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \int_{-\infty}^{x} e^{-\frac{(y - \mu)^2}{2\sigma^2}}\,dy = \frac{1}{2}\left[1 + \mathrm{erf}\left(\frac{x - \mu}{\sqrt{2}\,\sigma}\right)\right] \tag{3.88}$$
with the error function erf(⋅):
$$\mathrm{erf}(x) = \frac{2}{\sqrt{\pi}} \int_{0}^{x} e^{-y^2}\,dy \tag{3.89}$$
Fig. 3.4: Probability distribution function of a normal distribution ($F_X(\mu) = 0.5$).
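The two forms of the normal distribution function in (3.88) can be compared numerically. The sketch below uses `math.erf` from the Python standard library for the closed form and a simple trapezoidal rule for the integral; the function names, the integration range of 10σ below the mean and the number of steps are assumptions chosen for illustration.

```python
import math


def normal_pdf(x, mu, sigma):
    """Probability density function of the normal distribution, see (3.87)."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)


def normal_cdf(x, mu, sigma):
    """Distribution function expressed with the error function, see (3.88)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (math.sqrt(2) * sigma)))


def normal_cdf_numeric(x, mu, sigma, steps=20000, span=10.0):
    """Trapezoidal integration of the density from mu - span*sigma up to x.

    The density below mu - span*sigma is negligible and is ignored.
    """
    lower = mu - span * sigma
    if x <= lower:
        return 0.0
    h = (x - lower) / steps
    total = 0.5 * (normal_pdf(lower, mu, sigma) + normal_pdf(x, mu, sigma))
    total += sum(normal_pdf(lower + k * h, mu, sigma) for k in range(1, steps))
    return total * h


if __name__ == "__main__":
    for x in (-1.0, 0.0, 1.0, 2.0):
        print(x, normal_cdf(x, 0.0, 1.0), normal_cdf_numeric(x, 0.0, 1.0))
```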
3.2.2.3 Exponential Distribution Function

The exponential distribution (notation: Exp(λ)) is a particular case of the Gamma distribution and describes the time between events of a Poisson process. Events of a Poisson process occur continuously and independently at a constant average rate. The probability density function is calculated, for x ≥ 0, as (Fig. 3.5):
$$f_X(x) = \lambda e^{-\lambda x} \tag{3.90}$$
whereby λ is a rate parameter (used for the arrival rate in traffic theory).
Fig. 3.5: Probability density function of an exponential distribution ($f_X(0) = \lambda$).
The most important values of an exponential distribution are given in the following table:
Tab. 3.2: Basic parameters of the exponential distribution.
Arithmetic mean value: $\bar{x} = \dfrac{1}{\lambda}$
Quadratic mean value: $\overline{x^2} = \dfrac{2}{\lambda^2}$
Effective value: $x_{eff} = \dfrac{\sqrt{2}}{\lambda}$
Variance: $\sigma^2 = \dfrac{1}{\lambda^2}$
Standard deviation: $\sigma = \dfrac{1}{\lambda}$
The exponential distribution function is given, for x ≥ 0, as (Fig. 3.6):
$$F_X(x) = 1 - e^{-\lambda x} \tag{3.91}$$
Fig. 3.6: Probability distribution function of an exponential distribution.
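The values in Tab. 3.2 can be checked by simulation. The following sketch generates exponentially distributed samples by inverting the distribution function (3.91), i.e. by solving u = 1 − e^(−λx) for x with u uniformly distributed on [0, 1). This inverse-transform approach, the function name `sample_exponential` and the value λ = 2 are assumptions introduced here for illustration and are not taken from the text.

```python
import math
import random


def sample_exponential(lam, n, seed=0):
    """Draw n samples of Exp(lam) by inverting F(x) = 1 - exp(-lam * x)."""
    rng = random.Random(seed)
    # rng.random() lies in [0, 1), so the argument of the logarithm is in (0, 1].
    return [-math.log(1.0 - rng.random()) / lam for _ in range(n)]


if __name__ == "__main__":
    lam = 2.0
    samples = sample_exponential(lam, 200_000)
    mean = sum(samples) / len(samples)
    mean_sq = sum(x * x for x in samples) / len(samples)
    print("mean           ", mean, "expected", 1 / lam)
    print("quadratic mean ", mean_sq, "expected", 2 / lam ** 2)
    print("variance       ", mean_sq - mean ** 2, "expected", 1 / lam ** 2)
```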
3.2.3 Variance and Higher Order Moments Let X be a random variable with the expected (i.e. mean) value E(X) = μ. While the expectation of a random variable is an average of its magnitudes, variance represents an average magnitude of the fluctuations from its expectation. In other words, variance is the measure of the spread (i.e. the dispersion) of distribution around the expected value E(X). Although it seems to be the same, in general the term “expected value” is referred to the probabilistic property of a random variable, unlike the term “average” (or “mean value”) which is in fact the statistical function of the series of concrete realizations of a random variable. In order to express the “spreading of distribution” in a mathematical way using a scalar value, first idea would be to find the expectation of the “drift of distribution of X” from the expected value of X, i.e. E[X − E(X)]. The problem which arises from
this quantity is that it is always zero due to the cancelation of the positive and negative deviations of X from E(X): E[X − E(X)] = E(X – μ) = E(X) – μ = E(X) – E(X) = 0
(3.92)
Another idea would be to find E[|X − E(X)|], but dealing with absolute values is mathematically inconvenient. Hence, the most appropriate quantity is $E[(X - E(X))^2]$, which is defined in (3.47) as the variance of X. The square root of $E[(X - E(X))^2]$ is defined in (3.48) as the standard deviation of X:
$$\mathrm{Var}(X) = E\left[(X - \mu)^2\right] \quad \text{and} \quad \sigma_X = \sqrt{E\left[(X - \mu)^2\right]} \tag{3.93}$$
The variance of X is equal to:
$$\mathrm{Var}(X) = E\left[(X - \mu)^2\right] = E(X^2) - [E(X)]^2 \tag{3.94}$$
Since Var(X) is non-negative, $E(X^2)$ is greater than or equal to $[E(X)]^2$. The linear transformation aX + b of a random variable X, where a and b are constants, has the expectation:
$$E[aX + b] = aE(X) + b \tag{3.95}$$
while the variance equals:
$$\mathrm{Var}(aX + b) = E\left[\left((aX + b) - E(aX + b)\right)^2\right] = a^2 E\left[(X - E(X))^2\right] = a^2\,\mathrm{Var}(X) \tag{3.96}$$
The expected value of X, i.e. E(X), is also called the first moment of X and the variance Var(X) is also called the second central moment of X. Because of their numerical and theoretical significance, the expected values of some important functions of X (also named moments) have been defined as well [Gha05]. Some of these functions are $g(X) = X^n$, $|X|^n$, $X - c$, $(X - c)^n$ and $(X - \mu)^n$, where c is a constant and n an integer (n ≥ 0). Under the condition that g(X) is limited in the sense that E(|g(X)|) < ∞, the higher order moments can be defined as follows:
Tab. 3.3: Different types of moments.
E[g(X)]: Definition of the moment
$E(X^n)$: The n-th moment of X
$E(|X|^n)$: The n-th absolute moment of X
$E(X - c)$: The first moment of X about c
$E[(X - c)^n]$: The n-th moment of X about c
$E[(X - \mu)^n]$: The n-th central moment of X
The existence of higher moments implies the existence of lower moments, i.e. if $E(X^{n+1})$ exists, then $E(X^n)$ also exists. In the particular case n = 2, the existence of $E(X^2)$ implies the existence of E(X) and, consequently, the existence of Var(X). Generally, the nth moment of a random variable X is given by the following equations:
$$E(X^n) = \begin{cases} \displaystyle\sum_{i} x_i^n\,P_X(x_i) & \text{for discrete } X \\[2ex] \displaystyle\int_{-\infty}^{\infty} x^n\,p_X(x)\,dx & \text{for continual } X \end{cases} \tag{3.97}$$
where $P_X(x_i)$ is the probability of occurrence of a particular value $x_i$ of the discrete variable X (probability mass function, pmf), and $p_X(x)$ is the probability density function of a continual variable X over the whole range of possible values x (–∞ < x < ∞). The first moment of the variable X is its expectation E(X) = μ. Similarly, the nth central moment (i.e. the nth moment about the expected value μ) is given by:
$$E\left[(X - \mu)^n\right] = \begin{cases} \displaystyle\sum_{i} (x_i - \mu)^n\,P_X(x_i) & \text{for discrete } X \\[2ex] \displaystyle\int_{-\infty}^{\infty} (x - \mu)^n\,p_X(x)\,dx & \text{for continual } X \end{cases} \tag{3.98}$$
The second central moment is the variance $\mathrm{Var}(X) = \sigma_X^2$. The variance can also be interpreted as the intermittent (changeable) power of a signal and, as already shown in (3.94), it represents the difference between the total power $E(X^2)$ and the direct (steady) power $[E(X)]^2$. The 3rd central moment can be used in calculating the “skewness” of the probability distribution (i.e. of a probability density function). In fact, it can be considered as a measure of asymmetry of a probability distribution, while the 4th central moment is a measure of “kurtosis”, i.e. the sharpness of the peak of the probability distribution curve. In the case of two (non-independent) random variables X and Y, the so-called mixed moments are defined. Hence, the (n + k)th mixed central moment $M_{n,k}$ of X and Y is defined by:
$$M_{n,k} = E\left[(X - \mu_X)^n (Y - \mu_Y)^k\right] = \begin{cases} \displaystyle\sum_{i}\sum_{j} (x_i - \mu_X)^n (y_j - \mu_Y)^k\,P_{XY}(x_i, y_j) & \text{for discrete } X, Y \\[2ex] \displaystyle\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} (x - \mu_X)^n (y - \mu_Y)^k\,p_{XY}(x, y)\,dx\,dy & \text{for continual } X, Y \end{cases} \tag{3.99}$$
where $P_{XY}$ is the joint probability mass function (joint pmf) of discrete variables X and Y, and $p_{XY}(x, y)$ is the joint probability density function (joint pdf) of continual variables X and Y. Two variables X and Y can also be taken as a two-dimensional (i.e. bivariate) variable over dimensions x and y. If n = k = 1, i.e. n + k = 2, the 2nd mixed central moment $M_{11}$ is defined, which is also known as the covariance Cov(X, Y) or $\sigma_{XY}$. Covariance shows the “measure of correlation” between two variables. When Cov(X, Y) = 0, X and Y are uncorrelated. Statistically independent variables are always uncorrelated, while uncorrelated variables are not necessarily independent. In the uncorrelated case:
$$\mathrm{Cov}(X, Y) = E\left[(X - \mu_X)(Y - \mu_Y)\right] = E(XY) - \mu_X \mu_Y = 0 \tag{3.100}$$
i.e.:
$$E(XY) = E(X)\,E(Y) = \mu_X \mu_Y \tag{3.101}$$
Another measure which shows the relation between two random variables is the correlation $R_{XY}$ (see Chapter 1) defined as:
$$R_{XY} = \frac{\mathrm{Cov}(X, Y)}{\sqrt{\mathrm{Var}(X)\,\mathrm{Var}(Y)}} = \frac{\sigma_{XY}}{\sigma_X \sigma_Y} \tag{3.102}$$
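The covariance and the correlation (3.102) of two discrete variables can be computed directly from their joint pmf using (3.99) with n = k = 1. The sketch below is illustrative only: the function name and both example distributions are assumptions. The first example (X uniform on {−1, 0, 1} and Y = X²) is a standard case of two variables that are uncorrelated although Y is completely determined by X.

```python
import math


def covariance_and_correlation(joint):
    """Covariance and correlation of two discrete variables.

    joint: dictionary mapping (x, y) to the joint probability P(X = x, Y = y).
    Both variances are assumed to be non-zero.
    """
    mu_x = sum(x * p for (x, _), p in joint.items())
    mu_y = sum(y * p for (_, y), p in joint.items())
    cov = sum((x - mu_x) * (y - mu_y) * p for (x, y), p in joint.items())
    var_x = sum((x - mu_x) ** 2 * p for (x, _), p in joint.items())
    var_y = sum((y - mu_y) ** 2 * p for (_, y), p in joint.items())
    return cov, cov / math.sqrt(var_x * var_y)  # (3.102)


if __name__ == "__main__":
    # Dependent but uncorrelated: Y = X^2 with X uniform on {-1, 0, 1}.
    dependent = {(-1, 1): 1 / 3, (0, 0): 1 / 3, (1, 1): 1 / 3}
    print(covariance_and_correlation(dependent))
    # Perfectly correlated: Y = X with X a fair bit.
    identical = {(0, 0): 0.5, (1, 1): 0.5}
    print(covariance_and_correlation(identical))
```

In the first example P(X = 0, Y = 0) = 1/3 differs from P(X = 0)·P(Y = 0) = 1/9, so the variables are not independent even though their covariance is zero.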
Correlation is the measure of linear dependence between the random variables X and Y. Since $\sigma_X > 0$ and $\sigma_Y > 0$, the correlation $R_{XY}$ is always between –1 and 1. When a variable is multidimensional with dimensions x1, x2,..., the (n1 + n2 +...)th central moment of a multivariate variable X with the joint probability density function p(x1, x2,...) can be defined in the same manner as:
$$M_{n_1, n_2, \dots} = E\left[(X_1 - \mu_{X_1})^{n_1} \cdot (X_2 - \mu_{X_2})^{n_2} \cdots\right] \tag{3.103}$$
3.2.4 Moment Generating Function

Besides the probability distribution function (i.e. pdf or pmf), another way to specify a random variable is by the use of the so-called moment generating function (m.g.f.) [Gha05]. The moment generating function $M_X(t)$ of a random variable X is defined as:
$$M_X(t) = E(e^{tX}) \tag{3.104}$$
where t is a real argument (–∞ < t < ∞), i.e.:
$$M_X(t) = \begin{cases} \displaystyle\sum_{i} e^{t x_i}\,P_X(x_i) & \text{for discrete } X \\[2ex] \displaystyle\int_{-\infty}^{\infty} e^{tx}\,p_X(x)\,dx & \text{for continual } X \end{cases} \tag{3.105}$$
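Defined in such a way, the moment generating function has several useful properties. First of all, it can be used for the determination of the (higher order) moments by calculating the values of its derivatives for t = 0:
$$E(X^n) = \left.\frac{d^n}{dt^n}\left[M_X(t)\right]\right|_{t=0} = M_X^{(n)}(0) \tag{3.106}$$
since (in the case of a continual variable X):
$$\begin{aligned}
\left.\frac{d}{dt}\left[M_X(t)\right]\right|_{t=0} &= M_X'(0) = \left[\int_{-\infty}^{\infty} \frac{d}{dt} e^{tx}\,p_X(x)\,dx\right]_{t=0} = \int_{-\infty}^{\infty} x\,p_X(x)\,dx = E(X) \\
\left.\frac{d^2}{dt^2}\left[M_X(t)\right]\right|_{t=0} &= M_X''(0) = \left[\int_{-\infty}^{\infty} \frac{d^2}{dt^2} e^{tx}\,p_X(x)\,dx\right]_{t=0} = \int_{-\infty}^{\infty} x^2\,p_X(x)\,dx = E(X^2) \\
&\;\;\vdots \\
\left.\frac{d^n}{dt^n}\left[M_X(t)\right]\right|_{t=0} &= M_X^{(n)}(0) = \left[\int_{-\infty}^{\infty} \frac{d^n}{dt^n} e^{tx}\,p_X(x)\,dx\right]_{t=0} = \int_{-\infty}^{\infty} x^n\,p_X(x)\,dx = E(X^n)
\end{aligned} \tag{3.107}$$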
In fact, the moments $E(X^n)$, n = 0, 1, 2,… determine the coefficients of the Taylor expansion of the moment generating function $M_X(t)$:
$$M_X(t) = \sum_{n=0}^{\infty} \frac{M_X^{(n)}(0)}{n!}\,t^n = \sum_{n=0}^{\infty} \frac{E(X^n)}{n!}\,t^n \tag{3.108}$$
A moment generating function uniquely determines the probability distribution. This means that, if two random variables have the same moment generating function, then they also have the same pdf (or the same pmf). A very useful property of the m.g.f. is that it simplifies finding the probability distribution of the sum of independent random variables, since the m.g.f. of such a sum is the product of the individual moment generating functions.
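The relation (3.106) between the derivatives of the moment generating function at t = 0 and the moments can be checked numerically. The sketch below evaluates the discrete form of (3.105) for a fair die and approximates the first two derivatives by central finite differences. The finite-difference approximation, the step size h and the function names are assumptions made for illustration; they are not part of the theory above.

```python
import math


def mgf(values, probs, t):
    """Moment generating function of a discrete variable, see (3.105)."""
    return sum(math.exp(t * x) * p for x, p in zip(values, probs))


def moment_from_mgf(values, probs, order, h=1e-3):
    """Approximate E(X^order) for order 1 or 2 as a derivative of the
    m.g.f. at t = 0, see (3.106), using central finite differences."""
    if order == 1:
        return (mgf(values, probs, h) - mgf(values, probs, -h)) / (2 * h)
    if order == 2:
        return (mgf(values, probs, h) - 2 * mgf(values, probs, 0.0)
                + mgf(values, probs, -h)) / h ** 2
    raise ValueError("only order 1 and 2 are implemented in this sketch")


if __name__ == "__main__":
    faces = [1, 2, 3, 4, 5, 6]
    probs = [1 / 6] * 6
    print("E(X)   from m.g.f.:", moment_from_mgf(faces, probs, 1), "exact:", 3.5)
    print("E(X^2) from m.g.f.:", moment_from_mgf(faces, probs, 2), "exact:", 91 / 6)
```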
3.2.5 Characteristic Function

The moment generating function does not exist for every random variable. On the other hand, the characteristic function, defined as the Fourier transform of the probability density function $p_X(x)$ of a random variable X:
$$\Phi_X(t) = \int_{-\infty}^{\infty} e^{jtx}\,p_X(x)\,dx \tag{3.109}$$
always exists when the argument t is real. Defined in this way, the characteristic function represents the expected value of $e^{jtX}$, i.e. $E(e^{jtX})$. If a random variable has a moment generating function $M_X(t)$, its relation to the characteristic function $\Phi_X(t)$ is obtained by replacing the argument t with −jt:
$$\Phi_X(-jt) = M_X(t) \tag{3.110}$$
Since $p_X(x)$ and $\Phi_X(t)$ are a Fourier transform pair, the density can be recovered as:
$$p_X(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \Phi_X(t)\,e^{-jtx}\,dt \tag{3.111}$$
By convention, the exponent of the term $e^{jtx}$ has a positive sign in the definition (3.109) of $\Phi_X(t)$, while for $p_X(x)$ in (3.111) this term has a negative exponent, i.e. $\Phi_X(t)$ is the complex conjugate of the continuous Fourier transform of $p_X(x)$. In the same manner as for a continual variable, the relation between the characteristic function of a discrete variable X and its probability mass function can be written in the form of a (conjugated) Fourier transform:
$$\Phi_X(t) = \sum_{i} e^{jtx_i}\,P_X(x_i) \tag{3.112}$$
The characteristic function of the sum of two independent random variables X and Y is equal to the product of the characteristic functions of each variable:
$$\Phi_{X+Y}(t) = E\left[e^{jt(X+Y)}\right] = E\left[e^{jtX} e^{jtY}\right] = E\left[e^{jtX}\right] \cdot E\left[e^{jtY}\right] = \Phi_X(t) \cdot \Phi_Y(t) \tag{3.113}$$
and this result can be generalized to more independent random variables X1, X2,…, Xn. Similarly to the case of the moment generating function in (3.106), the moments of a random variable X can be generated using the characteristic function:
$$E(X^n) = j^{-n} \cdot \left.\frac{d^n}{dt^n}\left[\Phi_X(t)\right]\right|_{t=0} = j^{-n} \cdot \Phi_X^{(n)}(0) \tag{3.114}$$
In the case of two (non-independent) variables X and Y (i.e. when a variable is bivariate), the joint characteristic function $\Phi_{X,Y}(t_1, t_2)$ is defined as:
$$\Phi_{X,Y}(t_1, t_2) = E\left[e^{j(t_1 X + t_2 Y)}\right] = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{j(t_1 x + t_2 y)}\,p_{XY}(x, y)\,dx\,dy \tag{3.115}$$
where $p_{XY}(x, y)$ is the joint pdf. From the previous definition it can be easily shown that, if the variables X and Y are independent, i.e. if:
$$p_{XY}(x, y) = p_X(x) \cdot p_Y(y) \tag{3.116}$$
then their joint characteristic function is equal to the product of the characteristic functions of each variable:
$$\Phi_{X,Y}(t_1, t_2) = \Phi_X(t_1) \cdot \Phi_Y(t_2) \tag{3.117}$$
and vice versa: if the condition (3.117) is fulfilled, the variables X and Y are independent.
3.2.6 Distribution of Function of Random Variable

When it is necessary to find the probability density function $p_Y(y)$ of a variable Y which is a function of a random variable X with the distribution $p_X(x)$, i.e. when Y = g(X), two common methods are used. The first method is known as the method of transformations. In order to find $p_Y(y)$, the first step is to equate the cumulative distributions of X and Y (in the case of an increasing function g(X)):
$$P(X \le x) = \int_{-\infty}^{x} p_X(X)\,dX = \int_{-\infty}^{y} p_Y(Y)\,dY = P(Y \le y) \tag{3.118}$$
By differentiating the left integral of the previous equation with respect to y, it can be written:
$$\frac{d}{dy} \int_{-\infty}^{x} p_X(X)\,dX = \left[\frac{d}{dx} \int_{-\infty}^{x} p_X(X)\,dX\right] \cdot \frac{dx}{dy} = p_X(x)\,\frac{dx}{dy} = p_X\left(g^{-1}(y)\right) \cdot \frac{d}{dy}\left[g^{-1}(y)\right] \tag{3.119}$$
The result of differentiating the right integral in (3.118) is just $p_Y(y)$, which is the wanted density. In the case when g(X) is a decreasing function, the upper bound of the right integral in (3.118) should be ∞ and the lower bound should be y, so the result of the differentiation with respect to y is the same as in (3.119), but with the negative sign. Since in this case the derivative of the inverse function $g^{-1}(y)$ with respect to y is also negative, the final result is positive, so the pdf of Y = g(X) can be calculated in general as:
$$p_Y(y) = p_X\left(g^{-1}(y)\right) \cdot \left|\frac{d}{dy}\,g^{-1}(y)\right| \tag{3.120}$$
The other way to find $p_Y(y)$ is known as the method of distribution functions. The first step of this method is finding the (cumulative) probability distribution function of Y:
$$F_Y(y) = P(Y \le y) \tag{3.121}$$
and the second step is differentiating $F_Y(y)$ with respect to y. The distribution $F_Y(y)$ is expressed through the distribution of X over the domain A into which the domain –∞ < Y ≤ y is mapped by the inverse transformation $X = g^{-1}(Y)$. For example, if $Y = X^2$, the domain 0 ≤ Y ≤ y (since Y is never negative) is mapped into the domain $X \in A = [-\sqrt{y}, \sqrt{y}]$, i.e.:
$$F_Y(y) = P(Y \le y) = P(X^2 \le y) = P(-\sqrt{y} \le X \le \sqrt{y}) \tag{3.122}$$
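Further, if X is e.g. uniformly distributed between positive values a and b, i.e. if $p_X(x) = 1/(b - a)$ for a ≤ X ≤ b (and 0 otherwise), $F_Y(y)$ is calculated as:
$$F_Y(y) = \begin{cases} 0, & \text{for } y < a^2 \\[1ex] P(a \le X \le \sqrt{y}) = \displaystyle\int_{a}^{\sqrt{y}} \frac{dX}{b - a} = \frac{\sqrt{y} - a}{b - a}, & \text{for } a^2 \le y \le b^2 \\[1ex] 1, & \text{for } y > b^2 \end{cases} \tag{3.123}$$
Finally, after differentiating with respect to y:
$$p_Y(y) = \frac{dF_Y(y)}{dy} = \begin{cases} 0, & \text{for } y < a^2 \\[1ex] \dfrac{1}{2(b - a)\sqrt{y}}, & \text{for } a^2 \le y \le b^2 \\[1ex] 0, & \text{for } y > b^2 \end{cases} \tag{3.124}$$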
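The result of the method of distribution functions for Y = X² can be compared with a simulation. The sketch below draws uniformly distributed samples of X, squares them and compares the empirical distribution function with the closed-form expression (√y − a)/(b − a) obtained above. The function names, the interval [1, 3] and the sample size are assumptions chosen for illustration.

```python
import math
import random


def cdf_y_theory(y, a, b):
    """Distribution function of Y = X^2 for X uniform on [a, b], 0 < a < b."""
    if y < a * a:
        return 0.0
    if y > b * b:
        return 1.0
    return (math.sqrt(y) - a) / (b - a)


def pdf_y_theory(y, a, b):
    """Density of Y = X^2, i.e. the derivative of cdf_y_theory, see (3.124)."""
    if a * a <= y <= b * b:
        return 1.0 / (2.0 * (b - a) * math.sqrt(y))
    return 0.0


def cdf_y_empirical(y, a, b, n=100_000, seed=0):
    """Relative frequency of the event X^2 <= y among n uniform samples of X."""
    rng = random.Random(seed)
    return sum(1 for _ in range(n) if rng.uniform(a, b) ** 2 <= y) / n


if __name__ == "__main__":
    a, b = 1.0, 3.0
    for y in (0.5, 2.0, 4.0, 6.0, 10.0):
        print(y, cdf_y_theory(y, a, b), cdf_y_empirical(y, a, b))
```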
The methods described above for finding $p_Y(y)$ when Y = g(X) are not always suitable and are often complicated in practice. In some cases, the method which uses characteristic functions may be the easiest way. Namely, as the characteristic function of the random variable Y = g(X) can be written using $p_X(x)$:
$$\Phi_Y(t) = E\left[e^{jtY}\right] = \int_{-\infty}^{\infty} e^{jty}\,p_Y(y)\,dy = E\left[e^{jtg(X)}\right] = \int_{-\infty}^{\infty} e^{jtg(x)}\,p_X(x)\,dx \tag{3.125}$$
for calculating $p_Y(y)$ it is enough, using suitable substitutions, to write the last integral in (3.125) in the form $\int_{-\infty}^{\infty} e^{jty} f(y)\,dy$, since in this case $p_Y(y) = f(y)$.

3.2.7 Central Limit Theorem

The central limit theorem (CLT) is one of the most important statements in the theory of probability and statistics. It answers the question: “What is the distribution of the mean values (or sums) of sample series of a random variable X?” And the answer is: “The distribution is normal if the size of the series is large enough, no matter what the distribution of X is.” For example, in an experiment with n = 2 dice, there is a set of possible sums: {1 + 1 = 2, 1 + 2 = 3, 1 + 4 = 5,…, 5 + 6 = 11, 6 + 6 = 12}, where the distribution of the sums is symmetric around the maximal value in the middle, i.e. the value 7 has the largest probability of 6/36 = 1/6 (as the result of any of the outcomes: 1 + 6, 2 + 5, 3 + 4, 4 + 3, 5 + 2 or 6 + 1). This distribution is not normal, but when the number of dice in the experiment increases, the distribution of the sums approaches the normal one. Often, a value of n ≥ 30 is large enough to say that the sum or the mean value of n outcomes of an experiment has a normal distribution. The CLT is important because it introduces a relation between the statistical functions (such as sum and average) of sample series of a stochastic process (i.e. experiment outcome or random variable) and the expected values and variances of those functions as the probability categories. Mathematically, the CLT can be presented as follows [Gha05]: Let X1, X2,..., Xn be a sequence of n independent and identically distributed random variables with the same expectation $\mu_X$ and variance $\sigma_X^2$. Then, the sum $S_n = X_1 + X_2 + \dots + X_n$ is also a random variable whose distribution converges (for large n) to the normal distribution with the expectation $E(S_n) = n\mu_X$ and with the variance $\mathrm{Var}(S_n) = n\sigma_X^2$, i.e.:
$$F_{S_n}(s) = \lim_{n \to \infty} P[S_n \le s] = \lim_{n \to \infty} P\left[\sum_{i=1}^{n} X_i \le s\right] = \frac{1}{\sqrt{2\pi n \sigma_X^2}} \int_{-\infty}^{s} e^{-\frac{(t - n\mu_X)^2}{2n\sigma_X^2}}\,dt \tag{3.126}$$
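One of the equivalent ways of presenting the CLT is via the arithmetic mean $\bar{X}$ of the random variables X1, X2,.., Xn:
$$\lim_{n \to \infty} P\left[\frac{\bar{X} - \mu_X}{\sigma_X / \sqrt{n}} \le x\right] = \lim_{n \to \infty} P\left[\frac{\frac{1}{n}\sum_{i=1}^{n} X_i - \mu_X}{\sigma_X / \sqrt{n}} \le x\right] = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-\frac{t^2}{2}}\,dt \tag{3.127}$$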
The proof of the CLT can be performed using the property (3.113) of the characteristic function applied to the sum of n independent variables. Let $Z_n$ denote a random variable:
$$Z_n = \frac{\bar{X} - \mu_X}{\sigma_X / \sqrt{n}} = \frac{\frac{1}{n}\sum_{i=1}^{n} X_i - \mu_X}{\sigma_X / \sqrt{n}} = \frac{\sum_{i=1}^{n} (X_i - \mu_X)}{\sigma_X \sqrt{n}} \tag{3.128}$$
and Y1, Y2,…, Yn the series of variables $Y_i = (X_i - \mu_X)/\sigma_X$ for i = 1,.., n, so that $Z_n = (Y_1 + Y_2 + \dots + Y_n)/\sqrt{n}$. Since all $X_i$ have identical distributions, the same applies to the variables $Y_i$, i.e. all $Y_i$ have the same characteristic function $\Phi_{Y_i}(t) = \Phi_Y(t)$. Also, $E(Y_i) = 0$ and $\mathrm{Var}(Y_i) = 1$. By generalizing the property (3.113) to the sum of n independent variables, and by expanding $\Phi_Y(t)$ in the Taylor series for values of the argument around zero, it can be written:
$$\Phi_{Z_n}(t) = \prod_{i=1}^{n} \Phi_{Y_i}\left(\frac{t}{\sqrt{n}}\right) = \left[\Phi_Y\left(\frac{t}{\sqrt{n}}\right)\right]^n = \left[1 + \frac{t}{\sqrt{n}}\,\Phi_Y'(0) + \frac{t^2}{2n}\,\Phi_Y''(0) + \dots\right]^n \tag{3.129}$$
The higher members of the Taylor series in (3.129) can be neglected for large n, and from (3.114) $\Phi_Y'(0) = j \cdot E(Y_i) = 0$ and $\Phi_Y''(0) = -E(Y_i^2) = -1$, so (3.129) can be rewritten as:
$$\lim_{n \to \infty} \Phi_{Z_n}(t) = \lim_{n \to \infty} \left[1 - \frac{t^2}{2n}\right]^n = e^{-\frac{t^2}{2}} \tag{3.130}$$
Finally, the pdf of the variable $Z_n$ (for large n) can be found using (3.111) as the inverse Fourier transform of its characteristic function:
$$p_{Z_n}(z) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \Phi_{Z_n}(t)\,e^{-jtz}\,dt = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-t^2/2}\,e^{-jtz}\,dt = \frac{1}{\sqrt{2\pi}}\,e^{-z^2/2} \tag{3.131}$$
which is just the normal (Gaussian) distribution Ɲ{µZ; σZ} with µZ = 0 and σZ = 1.
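The statement (3.127) can be observed in a simple simulation. The sketch below builds the standardized mean of n independent samples, estimates P(Zn ≤ x) as a relative frequency and compares it with the standard normal distribution function. The choice of the uniform distribution U(0, 1) for the samples (with mean 1/2 and variance 1/12), the function names and the number of trials are assumptions made only for this illustration.

```python
import math
import random


def standard_normal_cdf(x):
    """Distribution function of N(0, 1), see (3.88) with mu = 0, sigma = 1."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2)))


def standardized_mean_cdf(x, n, trials=20_000, seed=0):
    """Estimate P(Z_n <= x), where Z_n is the standardized mean (3.128)
    of n independent samples from U(0, 1)."""
    rng = random.Random(seed)
    mu = 0.5                    # mean of U(0, 1)
    sigma = math.sqrt(1 / 12)   # standard deviation of U(0, 1)
    hits = 0
    for _ in range(trials):
        mean = sum(rng.random() for _ in range(n)) / n
        z = (mean - mu) / (sigma / math.sqrt(n))
        if z <= x:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    for n in (1, 2, 30):
        print(n, standardized_mean_cdf(1.0, n), standard_normal_cdf(1.0))
```

For n = 1 the estimate reflects the uniform distribution itself, while for n = 30 it is already close to the value of the standard normal distribution function.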
3.3 Stochastic Processes

3.3.1 Ensemble, Stationarity and Ergodicity

The mathematical models that represent real-world phenomena, in which various parameters change randomly, should be probabilistic rather than deterministic. A model based on probability is also called a stochastic model, and the processes described by such models are called stochastic processes. A stochastic process is a family of random variables $X_t$, where the index t can take values from the domain Ʈ. If the set Ʈ of index values is continual and if Ʈ = [0, ∞), then the index t is often interpreted as time, and the stochastic process is interpreted as a continual function of time X(t) with values that can change at every instant [Zit10]. On the other hand, a discrete time process is a stochastic process where changes occur discretely, since in this case the values of the index t can only be positive integers (or zero) [Wil96][Jeetal00][Pap84].
If a set of “similar” i.e. “related” signals Xi(t) generated under the similar conditions is observed, where each signal has a different shape in time, the set is called ensemble, while each signal is called the member of the ensemble or (one particular) realization. An example of an ensemble could be a set of voltage signals of thermal noise emitted by many identical resistors at the same temperature. Also, an ensemble can be created from one “very long” stochastic process X(t) when split in segments with equal and “long enough” durations, whereby each segment represents a particular member Xi(t) of the ensemble [Lee60]. While a random variable X can take any value from the set of possible (discrete or continual) outcomes xn, a stochastic process (often called random process or random signal) can be considered as the generalization in time of a random variable X, i.e. a stochastic process X(A, t) is a function of two variables – random event A and time t. For a particular realization of the random event A = Ai, a stochastic process X(A, t) becomes a function of time X(Ai, t) = Xi(t). The set of all possible realizations Ai in time of the stochastic process X(A, t), i.e. the set of all possible time functions Xi(t) is a statistical ensemble (for i = 1, 2, 3,…). On the other hand, for a particular instant of time tk, the stochastic process X(A, t) becomes the random variable X(A, tk) = Xk (with possible outcome values xn,k), whose behavior can be described by its probability density function pXk(xk):
P( xk < X k < xk + dxk ) = p X k ( xk )dxk
(3.132)
Stochastic process is stationary when probability density function pXk(xk) does not depend on the moment tk, i.e. when pXk(xk) = pX(x) for each tk. Further, a stochastic process is “stationary in the strict sense” if all higher order joint probability functions of two or more random variables Xk, Xl, Xm,… in different instances of time tk, tl, tm,… do not depend on these moments or the time differences between them. The stationarity in the strict sense is hard to examine in praxis due to the necessity of knowing the higher order joint probability functions. Easier way is to find the mean value and the autocorrelation function (i.e. the first and the second order average functions), so that the stationarity of a stochastic process (i.e. ensemble) can be defined “in the wide sense”. The stochastic process is stationary in the wide sense if the mean value (over the ensemble members in a particular moment) does not depend on time i.e. if:
$$\overline{X_k} = \mu_X, \quad \forall t_k \qquad (3.133)$$
and if the autocorrelation function is also independent of time, i.e. of the moments t1 and t2:
$$R_X(t_1, t_2) = \overline{X_1 X_2} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x_1 x_2\, p_{X_1,X_2}(x_1, x_2, t_1, t_2)\,dx_1 dx_2 = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x_1 x_2\, p_{X_1,X_2}(x_1, x_2, t_2 - t_1)\,dx_1 dx_2 = R_X(\tau) \qquad (3.134)$$
where τ = t2 – t1, which means that the autocorrelation function RX(t1, t2) depends only on the time difference τ. An important type of stochastic process is the cyclostationary (or periodically stationary) process [Ben58]. A process is cyclostationary when all of its statistical properties vary periodically in time. There is also wide-sense cyclostationarity, where only the mean and the autocorrelation function have to be periodic in time. Mathematically, it can be expressed as:
$$\mu_X(t + mT) = \mu_X(t) \qquad (3.135)$$

and:

$$R_X(t_1 + mT, t_2 + mT) = R_X(t_1, t_2) \qquad (3.136)$$
where m is an integer and T is the time period. A cyclostationary process can be considered as the composition of several interleaved stationary processes – for example, the time division multiplex (see Chapter 6) of n discrete signals produced by sampling n continual signals with the sampling frequency 1/T. Another example would be the traffic load per hour of a communication line (e.g. between two telephone exchanges). Such traffic has periodic statistics with a 24-hour cycle, i.e. this process can be viewed as the combination of 24 different stationary processes (one for each particular hour of the day) observed over a period of several weeks or months. A stationary stochastic process can have the “time mean value” (i.e. the mean value over time) given as:
$$\overline{X_i} = \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} X_i(t)\,dt \qquad (3.137)$$

where Xi(t) is the ith member of the ensemble.
The “ensemble mean value”, where all the members of the ensemble are averaged (in a particular instant of time), is given by:
$$\overline{X} = \int_{-\infty}^{\infty} x\, p_X(x)\,dx \qquad (3.138)$$
If the time mean value is equal to the ensemble mean value, the process is “ergodic in the mean” (the term “ergodic” is borrowed from thermodynamics [Lee60]). In the similar manner, the time and the ensemble autocorrelation functions are defined as:
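$$\mathfrak{R}_X(\tau) = \overline{X(t)X(t+\tau)} = \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} X(t)X(t+\tau)\,dt \qquad (3.139)$$

$$R_X(\tau) = \overline{X_1 X_2} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x_1 x_2\, p_X(x_1, x_2, \tau)\,dx_1 dx_2 \qquad (3.140)$$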
A stationary process is “ergodic in the autocorrelation” if the time and the ensemble autocorrelation functions are equal [Hay94]. If a stochastic process is ergodic both in the mean and in the autocorrelation, the process is ergodic in the wide sense. The stationarity is a necessary but not a sufficient condition for the ergodicity of a stochastic process. Finally, a stochastic process is ergodic (in the strict sense) if all the average functions over time (the mean over time, the time autocorrelation, higher order time average functions) are equal to the corresponding ensemble average functions [Mid96].
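The difference between the two kinds of averaging can be tried out numerically. The following is only a minimal sketch under stated assumptions: the ensemble is an assumed random-phase sinusoid Xi(t) = cos(ωt + φi) with phases φi uniformly distributed over [0, 2π), which is not an example from the text, and all names and parameter values are hypothetical. For such a process both the ensemble mean (3.138) and the time mean (3.137) are expected to be close to zero.

```python
import math
import random

random.seed(1)  # fixed seed so that the sketch is reproducible

OMEGA = 2 * math.pi  # assumed angular frequency: one period per unit of time


def member(phase, t):
    """One realization X_i(t) of the assumed random-phase sinusoid."""
    return math.cos(OMEGA * t + phase)


# Ensemble mean: average over many members at one fixed instant t_k.
phases = [random.uniform(0.0, 2 * math.pi) for _ in range(20000)]
t_k = 0.3
ensemble_mean = sum(member(p, t_k) for p in phases) / len(phases)

# Time mean: average of one single member over a whole number of periods.
steps = 10000
periods = 50
time_mean = sum(member(phases[0], periods * n / steps) for n in range(steps)) / steps

print(f"ensemble mean = {ensemble_mean:.4f}, time mean = {time_mean:.4f}")
```

Both averages agree (up to the statistical fluctuation of the finite ensemble), which is what ergodicity in the mean requires; the sketch does not prove ergodicity, it only illustrates the two averaging directions.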
3.3.2 Power Spectral Density and Wiener-Khinchin Theorem

For stationary stochastic processes, e.g. for signals continual in time, it is often important to know how the average power of a signal is distributed over frequency, i.e. to know its Power Spectral Density (PSD). According to the Wiener-Khinchin theorem [Wie30][Gar87][Khi34], in the case of a stationary stochastic process, the power spectral density SX(ω) and the autocorrelation function RX(τ) form a Fourier transform pair [Tho69]:
$$S_X(\omega) = \int_{-\infty}^{\infty} R_X(\tau)\, e^{-j\omega\tau}\,d\tau \qquad (3.141)$$

$$R_X(\tau) = \frac{1}{2\pi} \int_{-\infty}^{\infty} S_X(\omega)\, e^{j\omega\tau}\,d\omega \qquad (3.142)$$
Defined in this way, SX(ω) and/or RX(τ) sometimes do not exist, since in general they do not have to be absolutely (or square) integrable, i.e. one of the following conditions does not have to be fulfilled:
$$\int_{-\infty}^{\infty} |R_X(\tau)|\,d\tau < \infty \quad \text{and} \quad \int_{-\infty}^{\infty} |S_X(\omega)|\,d\omega < \infty \qquad (3.143)$$
In order to obtain the validity of the Wiener-Khinchin theorem also for those cases, a generalization of a PSD is introduced as:
$$S_X(\omega) = S_X^{(c)}(\omega) + S_X^{(d)}(\omega) \qquad (3.144)$$
i.e. as the sum of the continual power spectrum of the random component, S_X^{(c)}(ω), and the discrete spectrum of a periodic component of the signal, S_X^{(d)}(ω), given by:

$$S_X^{(d)}(\omega) = \sum_{n=-\infty}^{\infty} 2\pi |F_n|^2\, \delta(\omega - n\omega_0) \qquad (3.145)$$
Hereby, |Fn|^2 is the power of the nth harmonic of the periodic component (with the basic frequency ω0), and the Dirac pulse δ(ω – nω0) differs from zero only for ω = nω0 [Lee60]. The (average) power spectral density is (in its physical interpretation) expressed in W/Hz. In fact, the function SX(ω) is the “mathematical” spectral density (of the average power of the signal), since it is defined for positive as well as for negative values of the frequency ω, which can be seen from the boundaries of the integral in (3.142). Therefore, the function SX(ω) is also called a “double-sided” PSD, while in the literature the “single-sided” or “natural” PSD is often defined, with values of SX(ω) doubled in comparison to the double-sided PSD (see Chapter 1), but only for positive ω (for ω < 0 the single-sided SX(ω) = 0).
Fig. 3.7: Double- and single-sided average power spectral density.
For a random signal X(t), the (double-sided) PSD is an even and non-negative function of ω, i.e. SX(–ω) = SX(ω) and SX(ω) ≥ 0 for each ω, while its maximal value for ω = 0 equals:
$$S_{X,\max} = S_X(0) = \int_{-\infty}^{\infty} R_X(\tau)\,d\tau \qquad (3.146)$$
The average power of a stochastic process X(t) is defined by:
$$P_X = \lim_{T \to \infty} \frac{1}{T} \int_{-T/2}^{T/2} E\left[X^2(t)\right] dt \qquad (3.147)$$
If the process is stationary in the wide sense, from (3.96) and (3.134) it can be shown that E[X^2(t)] = RX(0), so the average power equals:
$$P_X = \lim_{T \to \infty} \frac{1}{T} \int_{-T/2}^{T/2} R_X(0)\,dt = R_X(0) = \frac{1}{2\pi} \int_{-\infty}^{\infty} S_X(\omega)\,d\omega \qquad (3.148)$$
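As a worked illustration of (3.141), (3.146) and (3.148), consider an assumed exponential autocorrelation function with parameters σ² > 0 and a > 0. This particular function is chosen here only as an example and is not taken from the text:

```latex
% Assumed autocorrelation function (illustrative example):
R_X(\tau) = \sigma^2 e^{-a|\tau|}, \qquad a > 0

% Power spectral density according to (3.141):
S_X(\omega) = \int_{-\infty}^{\infty} \sigma^2 e^{-a|\tau|} e^{-j\omega\tau}\,d\tau
            = \sigma^2 \left( \frac{1}{a + j\omega} + \frac{1}{a - j\omega} \right)
            = \frac{2a\sigma^2}{a^2 + \omega^2}

% Value at \omega = 0, in agreement with (3.146):
S_X(0) = \frac{2\sigma^2}{a} = \int_{-\infty}^{\infty} \sigma^2 e^{-a|\tau|}\,d\tau

% Average power according to (3.148):
P_X = \frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{2a\sigma^2}{a^2 + \omega^2}\,d\omega
    = \frac{2a\sigma^2}{2\pi} \cdot \frac{\pi}{a} = \sigma^2 = R_X(0)
```

The resulting PSD is even and non-negative, and the power obtained by integrating over frequency equals RX(0), as stated above.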
3.3.2.1 White and Colored Noise

White noise is a stationary stochastic process n(t), whose power spectral density has a constant value in the entire frequency range –∞ < ω < ∞:
$$S_n(\omega) = \frac{N_0}{2}, \quad N_0 = \text{const}, \quad -\infty < \omega < \infty \qquad (3.149)$$
Fig. 3.8: Power spectral density of white noise.
When (3.142) and (3.149) are combined according to the Wiener-Khinchin theorem, the resulting autocorrelation function of white noise is a Dirac pulse with infinite amplitude and a “surface” (area) of N0/2:
$$R_n(\tau) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{N_0}{2}\, e^{j\omega\tau}\,d\omega = \frac{N_0}{2}\,\delta(\tau),$$
− ∞ ωc (the total average power is then N0ωc), and its autocorrelation function equals:
Rn ' (τ ) =
N 0ωc sin(ωcτ ) , − ∞ 0) In this case, the probability density function for maximum entropy equals:
$$w(s) = \frac{1}{\mu}\, e^{-\frac{s}{\mu}} \qquad (4.18)$$
and entropy:
$$H(S)_{\max} = \mathrm{ld}(\mu e) \qquad (4.19)$$
3. The random value s has a finite variance σ^2 (< ∞) and average µ = 0. In this case, the probability density function for maximum entropy equals:
$$w(s) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{s^2}{2\sigma^2}} \qquad (4.20)$$
and entropy:
$$H(S)_{\max} = \frac{1}{2}\,\mathrm{ld}(2\pi e \sigma^2) \qquad (4.21)$$
A Gauss’ distribution with an average µ has the same entropy as shown in (4.21) (see Chapter 3). All other distributions with the same variance have a smaller entropy. According to the central limit theorem (see Chapter 3), the sum of many independent random variables with finite variances tends to a Gauss’ distribution as the number of variables increases infinitely. Therefore, it is expected that entropy, as a measure of uncertainty, is maximal in the case of a Gauss’ distribution. The random value can take any real value, but with a finite variance, i.e. average power, which is typical for “natural” information sources, like noise [Dra04].
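The statement that the Gauss’ distribution has the largest entropy for a given variance can be checked numerically. The sketch below is an illustration under assumptions: it evaluates (4.21) and compares it with the well-known differential entropies of a uniform and a Laplace distribution of the same variance. These two comparison distributions, the value σ = 1 and all names are chosen here for illustration and do not appear in the text.

```python
import math


def ld(x):
    """Binary logarithm (logarithmus dualis), as used in the text."""
    return math.log2(x)


sigma = 1.0  # assumed standard deviation, identical for all three distributions

# Gauss' distribution, equation (4.21)
h_gauss = 0.5 * ld(2 * math.pi * math.e * sigma ** 2)

# Uniform distribution of width a has variance a^2 / 12 and entropy ld(a)
h_uniform = ld(sigma * math.sqrt(12))

# Laplace distribution with scale b has variance 2 b^2 and entropy ld(2 b e)
b = sigma / math.sqrt(2)
h_laplace = ld(2 * b * math.e)

print(f"Gauss:   {h_gauss:.4f} bit")
print(f"Laplace: {h_laplace:.4f} bit")
print(f"Uniform: {h_uniform:.4f} bit")
```

For equal variance, the Gauss’ value is the largest of the three, in line with the statement above.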
4.2 Source Coding

Source (or statistical) coding aims at an economical representation of the information emitted by the source, in order to save the resources needed for transmitting or storing the information. In the literature, source coding is often called data compression, which can be non-destructive or lossless (all information is transmitted or stored) or destructive or lossy (only a part of the information is transmitted or stored). In the case of discrete sources, both lossless and lossy compression can be performed, whereas in the case of continual sources only lossy compression is possible. Data compression enables faster transmission, the use of transmission media with lower data rates, the use of less storage, lower costs and data processing using embedded backup concepts. The transmission capacity can be reduced by up to 80 % and even 95 %. Typical applications of data compression are speech and image processing and transmission.

Historically, among the first source coding schemes were the shutter telegraph from ca. 1795 [Bur04], the Braille alphabet for blind persons from ca. 1825 [Bra29] and the Morse alphabet [ITU-M1677], built on dots and dashes, from 1835. The first standardized algorithm of statistical coding was the International Telegraph Alphabet No. 2 or CCITT-2 (coding of information symbols with 5 bits) from 1930 [Smi01], which was partially replaced by the more popular 7-bit ASCII (American Standard Code for Information Interchange), the code standardized by ANSI (American National Standards Institute).

There are a few simple procedures for the compression of symbol sequences:
1. “0”-Suppression [Buc97]: Longer “0”-sequences, with at least 3 “0”s, are replaced by 2 symbols, whereby the first one signals the beginning of compression, and the second one gives the length of the compressed “0”-sequence.
2. Bit-Mapping [Hel87]: Information is grouped into blocks of 8 symbols, whereby the most frequent one(s) is (are) erased and replaced with a bit pattern of 8 bits.
3. Run-Length Encoding (RLE) [Sal00]: Each sequence of a symbol is compressed if the symbol appears at least four times in a row. Encoding is performed using 3 symbols:
   – the first one signals that the compression begins,
   – the second one identifies the symbol whose sequence was compressed,
   – and the third one is a repetition counter (“0” stands for a 4-fold repetition etc.).
4. Half Byte Packing [HeMa96]: Sequences of symbols which begin with the same four bits are grouped. Compression is performed by packing the last four bits of two symbols into one byte.
5. Distance Coding [Bin00]: The first appearance of a symbol is transferred, followed by a code which describes the distance to the next appearance of the same symbol. Compression is performed using different coding procedures for the distances.

The simplest procedure for the compression of words is the use of vocabularies [ZiLe78]. In order to distinguish the coding of symbol sequences from the coding of words, the coding of words has to be signaled using a symbol which never or only very rarely appears in the information to be compressed.
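The run-length encoding procedure from the list above can be sketched as follows. This is only a minimal illustration under assumptions: the three-part token (marker, symbol, counter) is represented here as a Python tuple with a hypothetical marker string "ESC", instead of a reserved symbol in the data stream, and the function names are made up for this sketch.

```python
MIN_RUN = 4  # runs shorter than this are copied unchanged, as described above


def rle_encode(text):
    """Replace every run of at least MIN_RUN equal symbols by a 3-part token."""
    out = []
    i = 0
    while i < len(text):
        j = i
        while j < len(text) and text[j] == text[i]:
            j += 1
        run = j - i
        if run >= MIN_RUN:
            # (marker, repeated symbol, counter); counter 0 means a 4-fold repeat
            out.append(("ESC", text[i], run - MIN_RUN))
        else:
            out.extend(text[i:j])
        i = j
    return out


def rle_decode(tokens):
    """Invert rle_encode."""
    parts = []
    for token in tokens:
        if isinstance(token, tuple):
            _, symbol, counter = token
            parts.append(symbol * (counter + MIN_RUN))
        else:
            parts.append(token)
    return "".join(parts)


encoded = rle_encode("AAAABBBCCCCCC")
print(encoded)
print(rle_decode(encoded))
```

The run of three “B”s is left untouched because replacing it by three symbols would not save anything, which is why the threshold of four repetitions is used.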
4.2.1 Code Definition

A simple block scheme of an encoder with a discrete source with a finite number of symbols (q) is shown in Figure 4.5:
Fig. 4.5: Block Scheme of Encoder.
A list of input symbols, or source list, S is also called a source alphabet. A list of code symbols X, or code list, is also finite. A code or coding can be defined [Abr63] as the mapping of a symbol sequence from a source list to a symbol sequence of a code list. Symbol sequences of a code list are also called codewords Xi. The number of code symbols in a codeword is the length of the codeword. Generally, the length of a codeword can be fixed or variable.

Tab. 4.1: Examples of tables of codewords.

S     X          S     X
S1    00         S1    0
S2    01         S2    10
S3    10         S3    110
S4    11         S4    1110
Fig. 4.6: Examples of code trees.
Codewords can be represented using code tables (Tab. 4.1) and code trees (Fig. 4.6). Code trees are convenient to follow the decoding process, starting from a tree root and following one of the branches. The number of branches equals the number of different code symbols. When the next symbol arrives, the branch bifurcates and new branches come out from a node. After the last symbol of a codeword arrives, the decoding of a symbol is finished. The average length L of a codeword is defined as:
$$\bar{l} = L = \sum_{i=1}^{q} p_i l_i \qquad (4.22)$$
whereby the codeword of each symbol si has the length li, and each symbol appears with the probability pi = P(si). The aim of source coding is to find an optimal code, i.e. a code whose average codeword length is smaller than (or equal to) the average codeword lengths of all other codes with the same source and the same code list. An optimal length of a codeword lopt is given as:
$$l_{opt} = \mathrm{ld}\,\frac{1}{p_i} \qquad (4.23)$$
Obviously, an optimal length can be found using (4.23) only if the symbol probabilities are of the form 1/2^n, whereby n is an integer. Otherwise, the next higher integer has to be taken, i.e.:
$$\mathrm{ld}\,\frac{1}{p_i} \le l_i < \mathrm{ld}\,\frac{1}{p_i} + 1 \qquad (4.24)$$
or, if (4.24) is multiplied by pi and summed over i = 1,…,q:

$$\sum_{i=1}^{q} p_i \cdot \mathrm{ld}\,\frac{1}{p_i} \le \sum_{i=1}^{q} p_i \cdot l_i < \sum_{i=1}^{q} p_i \cdot \mathrm{ld}\,\frac{1}{p_i} + \sum_{i=1}^{q} p_i \qquad (4.25)$$

i.e.:

$$H(S) \le L < H(S) + 1 \qquad (4.26)$$
In the case of an extended source, i.e. of sequences of n symbols of the original source whose codewords have the average length Ln, (4.26) can be written as:

$$H(S) \le \frac{L_n}{n} < H(S) + \frac{1}{n} \qquad (4.27)$$
For n increasing infinitely, (4.27) becomes:

$$\lim_{n \to \infty} \frac{L_n}{n} = H(S) \qquad (4.28)$$
where Ln/n is the code rate i.e. the average length of a codeword per symbol of an original source S. (4.28) formulates the Shannon’s source coding theorem or a noiseless coding theorem [Sha48]: It is impossible to compress the data such that the code rate is less than the Shannon entropy of the source, without it being virtually certain that information will be lost. However, it is possible to get the code rate arbitrarily close to the Shannon entropy, with negligible probability of loss.
The code efficiency η determines how near the average length of a codeword is to the entropy:

$$\eta = \frac{H(S)}{L} \cdot 100 \; [\%] \qquad (4.29)$$
Similarly, code redundancy R is defined as:
$$R = \frac{L - H(S)}{L} \cdot 100 \; [\%] \qquad (4.30)$$
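Equations (4.22), (4.29) and (4.30) can be evaluated with a few lines of code. The following sketch uses the codeword lengths of the two codes from Tab. 4.1 together with assumed symbol probabilities (1/2, 1/4, 1/8, 1/8), which are hypothetical values chosen for illustration only, as the text gives no probabilities for Tab. 4.1.

```python
import math

# Assumed symbol probabilities for S1..S4 (hypothetical, for illustration only)
probabilities = [0.5, 0.25, 0.125, 0.125]

# Codeword lengths taken from the two codes of Tab. 4.1
lengths_fixed = [2, 2, 2, 2]      # codewords 00, 01, 10, 11
lengths_variable = [1, 2, 3, 4]   # codewords 0, 10, 110, 1110


def entropy(p):
    """Entropy H(S) in bit/symbol."""
    return sum(pi * math.log2(1 / pi) for pi in p)


def average_length(p, lengths):
    """Average codeword length L, equation (4.22)."""
    return sum(pi * li for pi, li in zip(p, lengths))


def efficiency(p, lengths):
    """Code efficiency in percent, equation (4.29)."""
    return entropy(p) / average_length(p, lengths) * 100


def redundancy(p, lengths):
    """Code redundancy in percent, equation (4.30)."""
    avg = average_length(p, lengths)
    return (avg - entropy(p)) / avg * 100


for name, lengths in (("fixed", lengths_fixed), ("variable", lengths_variable)):
    print(name, average_length(probabilities, lengths),
          round(efficiency(probabilities, lengths), 2),
          round(redundancy(probabilities, lengths), 2))
```

Under these assumed probabilities the variable-length code has the shorter average length (1.875 instead of 2 code symbols per source symbol), and in both cases the efficiency and the redundancy add up to 100 %.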
4.2.2 Compression Algorithms

4.2.2.1 Huffman Coding

Huffman coding is a non-optimal algorithm for lossless compression [Huf52], which replaces symbols of a fixed length with codes of a variable length. It was the leading algorithm for a long time, until arithmetic coding appeared as an optimal code. The aim of Huffman coding is, as with other entropy coding methods, to represent more common symbols with fewer bits. The starting point for Huffman coding is a table of symbols and their estimated probabilities (relative frequencies) of occurrence (weights). Using this table, a binary coding tree is constructed as follows: the two symbols with the lowest probability of occurrence are combined into one new symbol and their probabilities of occurrence are summed. If several symbols have the same probability of occurrence, the order in which they are combined into a new symbol makes no difference for the coding algorithm. The symbols are represented as leaves of the tree, and the new symbols are the tree nodes. The branches between the nodes and leaves, as well as between the
nodes, are coded with “0” and “1”. The combining is continued until the last new symbol, the so-called root of the tree, is reached with the probability of occurrence equal to 1. The following example with the source alphabet [A, B, C, D, E, F] shows the table with symbols and their estimated probabilities of occurrence (Tab. 4.2) and one possible coding tree (Fig. 4.7).

Tab. 4.2: Symbols and their probabilities of occurrence.

Symbol                       A      B      C      D      E      F
Probability of occurrence    0.4    0.2    0.1    0.1    0.1    0.1
Coding                       1      01     0000   0001   0010   0011
Fig. 4.7: Example of a coding tree for Table 4.2.
The compression effect can be explained using the example of the message “BAABCADEFA”. Without compression, each of the six symbols from Table 4.2 would be coded with 3 bits (2^3 > 6), i.e. 30 bits would be needed for the message of 10 symbols. With compression using the Huffman code, the total number of bits is the sum, over all symbols, of the number of occurrences in the message multiplied by the codeword length: 4∙1 (“A”) + 2∙2 (“B”) + 1∙4 (“C”) + 1∙4 (“D”) + 1∙4 (“E”) + 1∙4 (“F”) = 24 bits. The compression effect is very modest in this example, but it would be much more remarkable in the case of a source alphabet with a bigger set of symbols. The greatest application of the Huffman code is in Telefax devices [ITU-T4]. Nowadays,
Huffman coding is used as a back-end compression in case of multimedia codes like JPEG [ITU-T81] and MP3 [ISO11172], DEFLATE [Deu96] etc. Code trees of words or parts of words are used in V.42bis, a standard of dictionary based compression designed for modems.
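The tree construction described in this section can be sketched with a priority queue. This is a minimal illustration, not the book's implementation: the weights are taken as the symbol counts of the example message “BAABCADEFA” (which are proportional to the probabilities of Tab. 4.2), the function name is hypothetical, and ties are broken arbitrarily, so the individual codewords may differ from those of Tab. 4.2 while the total length stays the same.

```python
import heapq
from collections import Counter
from itertools import count


def huffman_code(weights):
    """Build a Huffman code (symbol -> bit string) from a dict of weights."""
    tiebreak = count()  # unique counter so that trees are never compared
    heap = [(w, next(tiebreak), symbol) for symbol, w in weights.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Combine the two least probable (new) symbols into one new symbol.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next(tiebreak), (left, right)))
    _, _, root = heap[0]

    code = {}

    def walk(node, prefix):
        if isinstance(node, tuple):      # inner node: label branches 0 and 1
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                            # leaf: an original symbol
            code[node] = prefix or "0"

    walk(root, "")
    return code


message = "BAABCADEFA"
code = huffman_code(Counter(message))
encoded = "".join(code[symbol] for symbol in message)
print(code)
print(len(encoded), "bits")
```

Any valid Huffman tree for these weights yields the same total of 24 bits for the message, compared with 30 bits for the fixed 3-bit representation.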
4.2.2.2 Arithmetic Coding

Arithmetic coding [Ris76] is an optimal algorithm for lossless compression. It produces the same output as Huffman coding if every symbol has a probability of occurrence of the form 1/2^n. Otherwise, arithmetic coding offers better compression than Huffman coding, especially for small alphabet sizes. Arithmetic coding starts, like Huffman coding, with a table of symbols and their probabilities of occurrence, which are represented by real numbers between 0 (in the extreme case that a symbol never occurs) and 1 (in the extreme case that a symbol always occurs). An interval consists of the numbers between the left and the right side of the interval. More frequent symbols narrow the interval less than less frequent symbols, i.e. more frequent symbols add fewer bits than less frequent symbols. The value with the lowest number of digits inside the resulting interval is chosen as the codeword. That also means that only decimal positions are considered as a codeword, as all possible codewords belong to the interval between 0 and 1, so that 0 and 1 do not have to be considered as a part of a codeword. A codeword represents not only the original message, but also all the messages beginning with the original message. Therefore, the length of the message has to be known, or the last symbol of the original message has to be followed by the symbol EOF (End Of File). The algorithm of arithmetic coding can be explained using the same example as for Huffman coding, in order to compare both of them. Hereby, the length of the message is known and sent to the receiver before the coded message is sent. The aim of arithmetic coding is the optimal coding of each symbol. In the case of the message “BAABCADEFA” with 10 symbols, the optimal length is calculated using Table 4.2 as:
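$$L = 10 \cdot (-0.4 \cdot \mathrm{ld}\,0.4 - 0.2 \cdot \mathrm{ld}\,0.2 - 0.1 \cdot \mathrm{ld}\,0.1 - 0.1 \cdot \mathrm{ld}\,0.1 - 0.1 \cdot \mathrm{ld}\,0.1 - 0.1 \cdot \mathrm{ld}\,0.1) = 23.22 \text{ bit} \qquad (4.31)$$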
The first step of arithmetic coding is to prepare a table with intervals according to the probabilities of occurrence of symbols:
Tab. 4.3: Interval assignment.

Symbol                       A          B            C            D            E            F
Probability of occurrence    0.4        0.2          0.1          0.1          0.1          0.1
Coding                       [0, 0.4)   [0.4, 0.6)   [0.6, 0.7)   [0.7, 0.8)   [0.8, 0.9)   [0.9, 1)
The first symbol to be coded is “B” and it gets the interval [0.4, 0.6) assigned. All the messages beginning with “B” belong to this interval. After this first step of arithmetic coding, the left side of the interval becomes 0.4, and the right one 0.6. This new interval is then divided into 6 sub-intervals according to the probabilities of occurrence of symbols:

Tab. 4.4: Interval assignment for encoding of “B”.

Symbol                 A             B              C              D              E              F
Prob. of occurrence    0.4           0.2            0.1            0.1            0.1            0.1
Coding                 [0.4, 0.48)   [0.48, 0.52)   [0.52, 0.54)   [0.54, 0.56)   [0.56, 0.58)   [0.58, 0.6)
Fig. 4.8: Arithmetic coding procedure.
The second symbol is “A” and all the messages beginning with “BA” are a part of the interval [0.4, 0.48). This interval is also divided into 6 sub-intervals according to the probabilities of occurrence of symbols. The same procedure is repeated until the whole message is encoded:

Tab. 4.5: Interval assignment for encoding of the symbol sequence B, A, A, B, C, A, D, E, A, F.

Symbol    Left side of the interval    Right side of the interval
–         0.0                          1.0
B         0.4                          0.6
A         0.4                          0.48
A         0.4                          0.432
B         0.4128                       0.4192
C         0.41664                      0.41728
A         0.41664                      0.416896
D         0.4168192                    0.4168448
E         0.41683968                   0.41684224
A         0.41683968                   0.416840704
F         0.416840601                  0.416840704
The arithmetic encoding procedure of Table 4.5 is graphically presented in Figure 4.8. A value with the minimal number of decimal positions from the last interval in Table 4.5, [0.416840601, 0.416840704), has to be chosen as the codeword. The solution is the value 0.4168407 with only 7 decimal positions. As every decimal position is represented using ld 10 ≈ 3.32 bits, 7 decimal positions (about 23.25 bits) correspond to the optimal length as calculated in (4.31). The advantages of arithmetic coding become obvious for longer sequences, but the computational effort needed is much higher than in the case of Huffman coding and decoding. Like Huffman coding, arithmetic coding is used as a back-end compression in multimedia codes.
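The interval narrowing can be reproduced with exact rational arithmetic. The sketch below is an illustration under assumptions: it uses the sub-interval assignment of Tab. 4.3 and encodes the symbols in the order in which they are listed row by row in Tab. 4.5 (B, A, A, B, C, A, D, E, A, F); the function names and the search for the shortest decimal value are choices made for this sketch, not a procedure prescribed by the text.

```python
import math
from fractions import Fraction

# Sub-intervals [left, right) of [0, 1) per symbol, as in Tab. 4.3
INTERVALS = {
    "A": (Fraction("0"), Fraction("0.4")),
    "B": (Fraction("0.4"), Fraction("0.6")),
    "C": (Fraction("0.6"), Fraction("0.7")),
    "D": (Fraction("0.7"), Fraction("0.8")),
    "E": (Fraction("0.8"), Fraction("0.9")),
    "F": (Fraction("0.9"), Fraction("1")),
}


def encode_interval(symbols):
    """Narrow [0, 1) step by step; return the final [low, high)."""
    low, width = Fraction(0), Fraction(1)
    for symbol in symbols:
        left, right = INTERVALS[symbol]
        low, width = low + width * left, width * (right - left)
    return low, low + width


def shortest_decimal(low, high):
    """Value in [low, high) with the fewest decimal positions."""
    digits = 1
    while True:
        scale = 10 ** digits
        candidate = Fraction(math.ceil(low * scale), scale)
        if candidate < high:
            return candidate, digits
        digits += 1


low, high = encode_interval("BAABCADEAF")   # symbol order of Tab. 4.5
codeword, digits = shortest_decimal(low, high)
print(float(low), float(high))
print(float(codeword), digits, round(digits * math.log2(10), 2))
```

Fractions are used instead of floating-point numbers because the interval becomes very narrow after only a few symbols, and rounding errors would otherwise distort the interval borders.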
4.3 Channel Coding

Channel coding aims at making the communication system less vulnerable to channel errors by recognizing and possibly correcting errors in the received signal. For this purpose, channel codes add redundancy to the information sequence, so that a message of k symbols becomes a codeword of n symbols after coding, whereby n > k. The ratio R = k/n is called the code rate and it defines the amount of redundancy. The disadvantage of the added redundancy is a lowered data transmission rate.

Important properties for the description of codes and codewords are the Hamming weight and the Hamming distance. The Hamming weight is the number of non-zero symbols of a codeword. The Hamming distance of two codewords is the number of symbol positions in which the codewords differ. The Hamming distance of a code is used as a measure of its error handling capability:
– t errors can be recognized in a codeword if the Hamming distance of the code equals d = t + 1, and
– t errors can be corrected in a codeword if the Hamming distance of the code equals d = 2t + 1.

In coding theory, coding gain is defined as the difference between the signal-to-noise ratio levels of the uncoded and the coded system needed to achieve the same BER. Coding gain depends not only on the applied code, but also on the decoding algorithm. An example of coding gain is shown in Figure 4.9.
Fig. 4.9: Coding gain. (Axes: bit-error probability (BER) versus Eb/N0 [dB]; curves: BER without coding (BPSK) and BER with coding; marked coding gains: 1.33 dB and 2.15 dB.)
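The Hamming weight, the Hamming distance and the resulting detection and correction capabilities defined in this section can be sketched as follows. The 3-bit repetition code used as the example, as well as all function names, are assumptions made for illustration and are not taken from the text.

```python
from itertools import combinations


def hamming_weight(word):
    """Number of non-zero symbols of a codeword."""
    return sum(1 for symbol in word if symbol != "0")


def hamming_distance(a, b):
    """Number of positions in which two codewords of equal length differ."""
    if len(a) != len(b):
        raise ValueError("codewords must have the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


def minimum_distance(code):
    """Smallest Hamming distance between any two different codewords."""
    return min(hamming_distance(a, b) for a, b in combinations(code, 2))


# Hypothetical example: the 3-bit repetition code
code = ["000", "111"]
d = minimum_distance(code)
detectable = d - 1          # from d = t + 1
correctable = (d - 1) // 2  # from d = 2t + 1

print(hamming_weight("0001011"), d, detectable, correctable)
```

For the repetition code the minimum distance is 3, so two errors can be recognized or one error can be corrected.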
4.3.1 Block Coding

An (n, k) block code gets a message of k symbols at its input and transforms it into a codeword of n symbols which depends only on the current input message and not on the past ones: the block code is memoryless.
A block code is linear if a modulo-2 sum of any two codewords results in another codeword and if the code contains the codeword with all zeros. For a given generator k×n matrix G of the code:
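$$G = \begin{bmatrix} g_0 \\ g_1 \\ \vdots \\ g_{k-1} \end{bmatrix} = \begin{bmatrix} g_{00} & g_{01} & \cdots & g_{0,n-1} \\ g_{10} & g_{11} & \cdots & g_{1,n-1} \\ \vdots & \vdots & & \vdots \\ g_{k-1,0} & g_{k-1,1} & \cdots & g_{k-1,n-1} \end{bmatrix} \qquad (4.32)$$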
and for a given input message c = (c0, c1,…, ck–1), the codeword is given by:
$$v = c \cdot G = c_0 \cdot g_0 + c_1 \cdot g_1 + \ldots + c_{k-1} \cdot g_{k-1} \qquad (4.33)$$
A linear block code is systematic if the message of length k is itself a part of the codeword of length n, whereby the redundancy or parity-check is added to the message:
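$$\underbrace{(c_0, c_1, \ldots, c_{k-1})}_{\text{message}} \rightarrow \underbrace{(\underbrace{v_0, v_1, \ldots, v_{n-k-1}}_{\text{parity-check}}, \underbrace{c_0, c_1, \ldots, c_{k-1}}_{\text{message}})}_{\text{codeword}} \qquad (4.34)$$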
The generator matrix G of a systematic block code consists of a parity-check k×(n–k) matrix P and a k×k identity matrix I:
$$G = [P \;\; I_k] \qquad (4.35)$$
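whereby P has the following form:

$$P = \begin{bmatrix} p_{00} & p_{01} & \cdots & p_{0,n-k-1} \\ p_{10} & p_{11} & \cdots & p_{1,n-k-1} \\ \vdots & \vdots & & \vdots \\ p_{k-1,0} & p_{k-1,1} & \cdots & p_{k-1,n-k-1} \end{bmatrix} \qquad (4.36)$$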
Using the fact that the code is systematic, together with equations (4.33) and (4.35), it can be written that:

$$v \cdot H^T = 0 \qquad (4.37)$$
whereby the (n – k)×n matrix H is the so called parity-check matrix and is used for error detection of a received codeword.
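Equations (4.33), (4.35) and (4.37) can be checked on a small example. The following sketch rests on assumptions: the matrix P is a hypothetical parity part of a (6, 3) code chosen only for illustration, and the parity-check matrix is built in the commonly used form H = [I(n–k) P^T], which matches G = [P Ik] but is not written out in the text.

```python
from itertools import product

# Hypothetical parity part P (k x (n - k)) of a (6, 3) code, for illustration only
P = [
    [1, 1, 0],
    [0, 1, 1],
    [1, 0, 1],
]
k = len(P)        # number of message symbols
r = len(P[0])     # number of parity-check symbols, n - k
n = k + r

# Generator matrix G = [P | I_k], equation (4.35)
G = [P[i] + [1 if i == j else 0 for j in range(k)] for i in range(k)]

# Parity-check matrix H = [I_(n-k) | P^T] (assumed standard form for this G)
H = [[1 if i == j else 0 for j in range(r)] + [P[m][i] for m in range(k)]
     for i in range(r)]


def encode(c):
    """Codeword v = c * G with modulo-2 arithmetic, equation (4.33)."""
    return [sum(c[i] * G[i][j] for i in range(k)) % 2 for j in range(n)]


def syndrome(v):
    """v * H^T with modulo-2 arithmetic; all zeros for a valid codeword."""
    return [sum(v[j] * H[i][j] for j in range(n)) % 2 for i in range(r)]


codewords = [encode(list(c)) for c in product([0, 1], repeat=k)]
print(encode([1, 0, 1]), syndrome(encode([1, 0, 1])))
```

Every codeword produced by G gives the all-zero result in (4.37), and the message can be read directly from the last k positions because the code is systematic.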
4.3.1.1 Hamming Codes

Hamming codes, which were proposed by Richard Hamming at Bell Laboratories in 1950 [Ham50], are the simplest error correcting codes. At the time, error detection was done using parity bits, but there was no way to correct errors. A parity code could detect an odd number of errors but could not correct any errors. Hamming proposed the single error correcting Hamming code (the minimum Hamming distance is therefore d = 2t + 1 = 3), which is the simplest Hamming code, also known as the (7, 4) code: n = 7 is the length of the codeword, k = 4 is the number of information bits and n – k = 3 is the number of parity-check bits. For the minimum Hamming distance of 3, the Hamming code can recognize t = d – 1 = 2 errors. Hamming codes are perfect codes as they achieve the highest possible rate for codes with their block length and minimum distance of 3. Although the Hamming codes were an extension of the simple parity codes, the idea of Hamming was to assign several parity-check bits to each codeword. By noticing which parity bits are wrong, the erroneous bit can be identified. For the information bits c1, c2, c3 and c4, the parity-check bits p1, p2 and p3 are calculated as: p1 = (c1 + c2 + c3) mod 2, p2 = (c2 + c3 + c4) mod 2 and p3 = (c1 + c2 + c4) mod 2. Accordingly, a list of all possible codewords is given as follows:

Tab. 4.6: Codewords of a (7, 4) Hamming code.
c1   c2   c3   c4   p1   p2   p3
0    0    0    0    0    0    0
0    0    0    1    0    1    1
0    0    1    0    1    1    0
0    0    1    1    1    0    1
0    1    0    0    1    1    1
0    1    0    1    1    0    0
0    1    1    0    0    0    1
0    1    1    1    0    1    0
1    0    0    0    1    0    1
1    0    0    1    1    1    0
1    0    1    0    0    1    1
1    0    1    1    0    0    0
1    1    0    0    0    1    0
1    1    0    1    0    0    1
1    1    1    0    1    0    0
1    1    1    1    1    1    1
In order to recognize the erroneous bits, the 3 parity-check bits have to be recalculated from the received codeword. These 3 bits form a so-called syndrome vector (s1, s2, s3). The following table shows all 8 possible cases which can be handled using the Hamming code, whereby the syndrome vector is calculated from the bits of the received codeword as: s1 = (c1 + c2 + c3 + p1) mod 2, s2 = (c2 + c3 + c4 + p2) mod 2 and s3 = (c1 + c2 + c4 + p3) mod 2.

Tab. 4.7: Syndromes for the (7, 4) Hamming code.

Case   s1   s2   s3   Interpretation
1      0    0    0    No error occurred
2      1    0    1    The error occurred at position 1 (c1 erroneous)
3      1    1    1    The error occurred at position 2 (c2 erroneous)
4      1    1    0    The error occurred at position 3 (c3 erroneous)
5      0    1    1    The error occurred at position 4 (c4 erroneous)
6      1    0    0    The error occurred at position 5 (p1 erroneous)
7      0    1    0    The error occurred at position 6 (p2 erroneous)
8      0    0    1    The error occurred at position 7 (p3 erroneous)
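Encoding and syndrome-based correction for this (7, 4) Hamming code can be sketched as follows. The parity equations and the syndrome-to-position mapping are taken from the text and from Tab. 4.7; the function names and the bit order (c1, c2, c3, c4, p1, p2, p3) of the list representation are assumptions of this sketch.

```python
def parity_bits(c1, c2, c3, c4):
    """Parity-check bits p1, p2, p3 of the (7, 4) Hamming code."""
    return ((c1 + c2 + c3) % 2, (c2 + c3 + c4) % 2, (c1 + c2 + c4) % 2)


def encode(bits):
    """Codeword (c1, c2, c3, c4, p1, p2, p3) for 4 information bits."""
    return list(bits) + list(parity_bits(*bits))


# Syndrome (s1, s2, s3) -> erroneous position (1-based), copied from Tab. 4.7
SYNDROME_TO_POSITION = {
    (1, 0, 1): 1, (1, 1, 1): 2, (1, 1, 0): 3, (0, 1, 1): 4,
    (1, 0, 0): 5, (0, 1, 0): 6, (0, 0, 1): 7,
}


def correct(received):
    """Return (corrected codeword, syndrome) for a received 7-bit word."""
    c1, c2, c3, c4, p1, p2, p3 = received
    q1, q2, q3 = parity_bits(c1, c2, c3, c4)   # recalculated parity bits
    s = ((q1 + p1) % 2, (q2 + p2) % 2, (q3 + p3) % 2)
    fixed = list(received)
    position = SYNDROME_TO_POSITION.get(s)     # None means "no error"
    if position is not None:
        fixed[position - 1] ^= 1               # flip the erroneous bit
    return fixed, s


codeword = encode([1, 0, 1, 1])
received = list(codeword)
received[1] ^= 1                               # single error at position 2
print(codeword, received, correct(received))
```

As with any single error correcting code, two or more errors in one codeword lead to a wrong correction, since the syndrome then points to an unaffected position.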
4.3.1.2 Cyclic Codes

A cyclic (n, k, d) code [Hil88] is completely specified by the codeword length n and its generator polynomial g(z) of degree n – k. The coding rule is given by:

$$v(z) = g(z) \cdot c(z) \qquad (4.38)$$
i.e. every codeword v(z) can be generated by multiplying the message polynomial c(z) by the generator polynomial g(z). The degree of the generator polynomial is equal to the number of parity bits, and the polynomial has the following form:

$$g(z) = g_{n-k} \cdot z^{n-k} + g_{n-k-1} \cdot z^{n-k-1} + \ldots + g_0 \qquad (4.39)$$
A codeword (v0, v1, v2,…, vn–1) is then presented as:

$$v(z) = v_0 \cdot z^{n-1} + v_1 \cdot z^{n-2} + \ldots + v_{n-1} \qquad (4.40)$$
Cyclic codes have the property that cyclic shifting of any codeword of the code by any number of positions results in another codeword, i.e. if (v0, v1, v2,…, vn–1) is a codeword then (vn–1, v0, v1,…, vn–2) is also a codeword. However, not all the codewords of the code can be produced by shifting alone; a combination of shifting and addition can be used to generate all the codewords. Most of the well-known linear block codes are cyclic codes. Examples of cyclic codes include Hamming codes (in a cyclic form), Golay codes [Gol49], BCH codes [ReCh99] and Reed Solomon codes [ReSo60]. The Reed Solomon code is different from the other examples as it is cyclic but not a binary code, whereas the others are binary cyclic codes.

Tab. 4.8: Codewords generation.
Step   Message polynomial    Message   Codeword polynomial                      Codeword   Remarks
1      0                     0000      0                                        0000000    Zero codeword
2      1                     0001      z^3 + z + 1                              0001011    Generator polynomial
3      z                     0010      z^4 + z^2 + z                            0010110    Shifted generator polynomial
4      z^2                   0100      z^5 + z^3 + z^2                          0101100    Two shifts
5      z^3                   1000      z^6 + z^4 + z^3                          1011000    Three shifts
6      z^2 + z + 1           0111      z^5 + z^4 + 1                            0110001    Four shifts
7      z^3 + z^2 + z         1110      z^6 + z^5 + z                            1100010    Five shifts
8      z^3 + z + 1           1011      z^6 + z^2 + 1                            1000101    Six shifts
9      z + 1                 0011      z^4 + z^3 + z^2 + 1                      0011101    Codeword of step 2 XOR codeword of step 3
10     z^2 + z               0110      z^5 + z^4 + z^3 + z                      0111010    Single left shift
11     z^3 + z^2             1100      z^6 + z^5 + z^4 + z^2                    1110100    Two left shifts
12     z^3 + z^2 + z + 1     1111      z^6 + z^5 + z^3 + 1                      1101001    Three left shifts
13     z^3 + 1               1001      z^6 + z^4 + z + 1                        1010011    Four left shifts
14     z^2 + 1               0101      z^5 + z^2 + z + 1                        0100111    Five left shifts
15     z^3 + z               1010      z^6 + z^3 + z^2 + z                      1001110    Six left shifts
16     z^3 + z^2 + 1         1101      z^6 + z^5 + z^4 + z^3 + z^2 + z + 1      1111111    Codeword of step 2 XOR codeword of step 11
Cyclic codes are error detection and correction codes which are particularly easy to implement in hardware using Linear Feedback Shift Registers (LFSR). They are applied in various practical scenarios such as remote data transmission via modem, as well as on data storage media such as CDs. One example of codeword generation using shift and (modulo 2) adding operations is given in Table 4.8 for the generator polynomial:

$$g(z) = z^3 + z + 1 \qquad (4.41)$$

As the generator polynomial (4.41) has a degree of 3 and the message consists of 4 bits (i.e. the message polynomial has a degree of at most 3), the length of the codeword is 7. The cyclic code generated in Table 4.8 represents a (7, 4) Hamming code in a cyclic form (cf. Tab. 4.6).
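The coding rule (4.38) with the generator polynomial (4.41) can be sketched by representing polynomials over GF(2) as integers, where bit i stands for the coefficient of z^i. This representation and the function names are assumptions of the sketch; the multiplication itself is the ordinary shift-and-add (XOR) procedure.

```python
G_POLY = 0b1011   # g(z) = z^3 + z + 1, equation (4.41)
N = 7             # codeword length


def gf2_multiply(a, b):
    """Multiply two polynomials over GF(2) given as bit patterns."""
    result = 0
    while b:
        if b & 1:
            result ^= a   # modulo-2 addition of the shifted polynomial
        a <<= 1
        b >>= 1
    return result


def cyclic_shift_left(word, n=N):
    """Cyclic shift of an n-bit word by one position to the left."""
    return ((word << 1) & ((1 << n) - 1)) | (word >> (n - 1))


# All 16 codewords v(z) = g(z) * c(z) for the 4-bit messages c(z)
codewords = {message: gf2_multiply(G_POLY, message) for message in range(16)}

for message, word in codewords.items():
    print(f"{message:04b} -> {word:07b}")
```

The set of generated codewords is closed under cyclic shifts, which is the defining property stated above; for example, message 0111 is mapped to 0110001, as in step 6 of Table 4.8.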
4.3.1.3 Cyclic Redundancy Check Codes

The Cyclic Redundancy Check (CRC) code [PeBr61] is one of the most widely used error detection codes after parity check codes. CRC is based on polynomial arithmetic in the Galois field with two elements, i.e. GF(2). CRC is performed by dividing the message polynomial C(z) by the generator polynomial G(z), which is agreed between the two communication partners, i.e. between the sender and the receiver. The remainder of the division is taken as the CRC parity-check value, the so-called CRC checksum. CRC codes are widely applied in communications: they are used for error detection in Ethernet, hard disks, Asynchronous Transfer Mode (ATM), Global System for Mobile Communications (GSM) etc. The popularity of CRC is due to:
– easy implementation (in software and in hardware, using LFSRs),
– the need for only very few redundancy bits, and
– good detection capability for burst errors.
Some example generator polynomials, along with the standards and protocols in which they are used, are given in Table 4.9:
Tab. 4.9: Generator polynomials used in international standards and protocols.

Name            Polynomial                                                                                                      Example Protocols / Standards
CRC-8-CCITT     z^8 + z^2 + z + 1                                                                                               Asynchronous Transfer Mode (ATM), Integrated Services Digital Network (ISDN), ITU-T I.432.1, etc.
CRC-16-CCITT    z^16 + z^12 + z^5 + 1                                                                                           X.25, High-Level Data Link Control (HDLC), Bluetooth, Secure Digital (SD) memory cards, etc.
CRC-32          z^32 + z^26 + z^23 + z^22 + z^16 + z^12 + z^11 + z^10 + z^8 + z^7 + z^5 + z^4 + z^2 + z + 1                     ISO 3309, American National Standards Institute (ANSI) X3.66, Ethernet, Serial Advanced Technology Attachment (SATA), Moving Picture Expert Group (MPEG)-2, Gzip, PKZip etc.
The following example for CRC-16 shows a generation of a codeword from a 16-bit (k = 16) message C(z), using the generator polynomial of a degree r = 16, which is standard for CRC-16-CCITT (Tab. 4.9):
G ( z ) = z 16 + z 12 + z 5 + 1
(4.42)
C ( z ) = z 15 + z 14 + z 7 + z 6 + z 5 + z 4 + z 3 + z 2
(4.43)
The message is M(z) given by:
The checksum is calculated in four steps:
1. Multiplication of C(z) by $z^{r}$, i.e. extension of the message by r = 16 zero bits on the right:
$C(z) \cdot z^{16} = z^{31} + z^{30} + z^{23} + z^{22} + z^{21} + z^{20} + z^{19} + z^{18}$
2. Division of $C(z) \cdot z^{r}$ by G(z):
$\frac{C(z) \cdot z^{r}}{G(z)} = \frac{z^{31} + z^{30} + z^{23} + z^{22} + z^{21} + z^{20} + z^{19} + z^{18}}{z^{16} + z^{12} + z^{5} + 1} = Q(z) + \frac{R(z)}{G(z)}$ (4.44)
$= (z^{15} + z^{14} + z^{11} + z^{10} + z^{5} + z^{2} + z + 1) + \frac{z^{13} + z^{12} + z^{11} + z^{7} + z^{6} + z^{2} + z + 1}{z^{16} + z^{12} + z^{5} + 1}$ (4.45)
3. The remainder $R(z) = z^{13} + z^{12} + z^{11} + z^{7} + z^{6} + z^{2} + z + 1$ is taken as the CRC checksum value.
4. The checksum is appended to the message to form the codeword V(z):
$V(z) = C(z) \cdot z^{r} + R(z) = z^{31} + z^{30} + z^{23} + z^{22} + z^{21} + z^{20} + z^{19} + z^{18} + z^{13} + z^{12} + z^{11} + z^{7} + z^{6} + z^{2} + z + 1$ (4.46)
or in binary form [1 1 0 0 0 0 0 0 1 1 1 1 1 1 0 0 0 0 1 1 1 0 0 0 1 1 0 0 0 1 1 1]. The checksum verification is performed by dividing the received codeword by the generator polynomial: if the remainder is zero, the transmission (or storage) is declared error-free and the message correct.
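The division steps of the CRC-16 example can be reproduced with a few lines of code. The following is a minimal sketch, not a production CRC routine: polynomials over GF(2) are stored as Python integers (bit i is the coefficient of z^i), and the helper name `gf2_mod` is an assumption introduced here for illustration.

```python
def gf2_mod(dividend: int, divisor: int) -> int:
    """Remainder of a polynomial division over GF(2).

    Polynomials are stored as integers: bit i is the coefficient of z^i.
    """
    deg = divisor.bit_length() - 1
    while dividend.bit_length() - 1 >= deg:
        # Align the divisor with the leading term and subtract (XOR) it.
        dividend ^= divisor << (dividend.bit_length() - 1 - deg)
    return dividend


G = (1 << 16) | (1 << 12) | (1 << 5) | 1   # G(z) = z^16 + z^12 + z^5 + 1
C = 0b1100000011111100                      # C(z) from (4.43)
R = gf2_mod(C << 16, G)                     # multiply by z^16, then divide by G(z)
V = (C << 16) | R                           # codeword: message followed by checksum

print(f"R = {R:016b}")                      # R = 0011100011000111
print(gf2_mod(V, G) == 0)                   # True: an error-free codeword leaves no remainder
print(gf2_mod(V ^ (1 << 7), G) == 0)        # False: a single flipped bit is detected
```

Practical implementations usually process the message bytewise with a lookup table and may use a non-zero initial register value; neither refinement is shown here.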
4.3.1.4 Reed Solomon Codes
Reed Solomon codes (RS codes) [ReSo60] are cyclic linear block codes. They are particularly suitable for the correction of burst errors. RS codes are optimal codes. Codes can be optimal with respect to the Hamming bound or the Singleton bound. The Hamming bound [WiSl77] is defined as:
$\sum_{i=0}^{t} \binom{n}{i} \le 2^{n-k}$ (4.47)
with n as the number of symbols of a codeword and k as the number of information symbols. The codes which satisfy the Hamming bound with equality are optimal with respect to the Hamming bound and are also referred to as perfect codes. The Singleton bound [Sin64] is defined as:
d ≤ n − k +1
(4.48)
The codes which satisfy the Singleton bound with equality are optimal with respect to the Singleton bound and are also called MDS (Maximum Distance Separable) codes. A (n, k, d)-RS block code is an MDS code and optimal with respect to the Hamming distance. A consequence is that any k symbols of the codeword uniquely determine the codeword. The greatest possible value of d is reached when d = n – k + 1, i.e. d = number of parity symbols + 1. If t symbol errors should be corrected, it is therefore:
n − k = 2⋅t
(4.49)
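Both bounds are easy to check numerically. The sketch below evaluates the Hamming bound (4.47) in the binary form given above and tests the Singleton bound (4.48) for equality. The function names and the example codes (the binary (7, 4) Hamming code, the binary (23, 12) Golay code and an RS code with n = 255, k = 223) are illustrative assumptions that do not come from the text.

```python
from math import comb


def hamming_bound_slack(n: int, k: int, t: int) -> int:
    """2^(n-k) minus sum_{i=0..t} C(n, i); zero means the bound (4.47) holds with equality."""
    return 2 ** (n - k) - sum(comb(n, i) for i in range(t + 1))


def is_mds(n: int, k: int, d: int) -> bool:
    """True if the Singleton bound (4.48) is met with equality."""
    return d == n - k + 1


print(hamming_bound_slack(7, 4, 1))    # 0: the (7, 4) Hamming code is perfect
print(hamming_bound_slack(23, 12, 3))  # 0: the (23, 12) Golay code is perfect
print(is_mds(255, 223, 33))            # True: d = n - k + 1
print(is_mds(7, 4, 3))                 # False: the (7, 4) Hamming code is not MDS
```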
The Reed-Solomon codes are cyclic q-ary codes. The codewords consist of symbols that are elements of GF(q), with $q = 2^{p}$, i.e. each symbol consists of p bits. For this reason, an RS code can correct a symbol with up to p wrong bits, provided that these all belong to that symbol. Thus, it is able to correct an error burst.
Fig. 4.10: Codeword (message symbols followed by parity symbols).
If t is the number of correctable symbol errors, the generator polynomial is constructed according to the following principle:
$g(z) = (z - \alpha^{1}) \cdot (z - \alpha^{2}) \cdot (z - \alpha^{3}) \cdot \ldots \cdot (z - \alpha^{2t})$ (4.50)
where α is a primitive element of GF(q). The encoding is performed using the generator polynomial, as usual in case of cyclic codes: at first, the message is represented as a polynomial c(z) with coefficients from GF(q). It must be filled up with “zeros” to a multiple of p, if needed. Then the message polynomial is divided by the generator polynomial, and the division remainder r is attached to the message as a sequence of parity-check symbols of GF(q):
$r(z) = c(z) \cdot z^{n-k} \bmod g(z)$ (4.51)
The message c(z) together with the attached division remainder r(z) forms the codeword v(z) that is transmitted over the channel. Since the generator polynomial has the degree of 2⋅t, the number of parity symbols that are the remainder after the division of the message polynomial by the generator polynomial is:
n − k = 2⋅t
(4.52)
The length of the codeword is:
$n = q - 1 = 2^{p} - 1$ (4.53)
whereby:
$k = 2^{p} - d$ (4.54)
and:
$d = 2 \cdot t + 1$ (4.55)
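The relations between p, t, n, k and d can be collected in a small helper. This is only a sketch; the function name `rs_parameters` and the two example parameter sets are assumptions chosen for illustration.

```python
def rs_parameters(p: int, t: int) -> tuple[int, int, int]:
    """Return (n, k, d) of an RS code over GF(2^p) that corrects t symbol errors."""
    q = 2 ** p
    n = q - 1          # codeword length in symbols
    d = 2 * t + 1      # minimum distance
    k = q - d          # number of information symbols
    return n, k, d


print(rs_parameters(8, 16))   # (255, 223, 33)
print(rs_parameters(4, 2))    # (15, 11, 5)
```

In both cases n − k = 2⋅t, i.e. the number of parity symbols is twice the number of correctable symbol errors.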
In shortened Reed-Solomon codes, the information part is filled with zero symbols to form k symbols, and then the parity check is calculated. The appended zero symbols are not transmitted, as the receiver knows their positions in advance and must (re-)insert them for decoding. The Reed-Solomon decoder can correct up to t errors or up to 2⋅t erasures. An erasure exists when the position of an erroneous symbol is known in advance. The information about the position of a wrong symbol is provided by the demodulator or a previous decoding component, if the received signal could not be uniquely assigned to a symbol. When a received word is decoded, there are three possible situations:
1. 2⋅s + r ≤ 2⋅t (s = number of errors, r = number of erasures): in this case, the original codeword can be reconstructed.
2. 2⋅s + r > 2⋅t: in this case, the decoder cannot correct the word and reports an error, or
3. the decoder corrects wrongly and the decoded codeword is not the original one.
Errors are not recognized if the error pattern is a multiple of the generator polynomial, so that there is no remainder after the division of the erroneous codeword. The received codeword v'(z) is given by:
v ' ( z ) = v ( z ) + e( z )
(4.56)
where v(z) is the original codeword polynomial and e(z) the error polynomial. At first, the syndrome is calculated, and then the decoder attempts to locate and correct the errors (up to t errors or up to 2⋅t erasures). An RS codeword has 2⋅t syndromes (= remainders), which depend only on the errors and not on the original codeword. The syndromes can be calculated by inserting the 2⋅t roots of the generator polynomial into v'(z). The localization of the positions of erroneous symbols is performed by solving a system of equations with t unknown variables using efficient algorithms that take into account the special form of the equation matrix that appears in RS codes. The calculation is done in two steps:
1. Calculation of the polynomial that describes the error positions using the Berlekamp-Massey [Ber68][Mas69] or Euclidean algorithm [Suetal75].
2. Calculation of the roots of this polynomial using the Chien search algorithm [Chi64].
Afterwards, the error values are calculated by solving another system of equations with t unknowns. The Forney algorithm [For66] is usually used for this purpose.
The encoding is considerably faster than the decoding, as the numerical complexity is much higher for decoding.
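The encoding rule (4.51) and the syndrome calculation can be tried out on a small code. The following sketch assumes an RS code over GF(2^4) with t = 2 (so n = 15, k = 11) and the primitive polynomial z^4 + z + 1; these parameters and all function names are choices made here for illustration and are not taken from the text. Polynomials are lists of symbols with the highest degree first. Error location and correction (Berlekamp-Massey, Chien search, Forney) are not shown.

```python
P, T = 4, 2                  # assumed: GF(2^4) symbols, t = 2 correctable errors
N = 2 ** P - 1               # codeword length n = 15, see (4.53)
K = N - 2 * T                # information symbols k = 11
PRIM = 0b10011               # assumed primitive polynomial z^4 + z + 1

# Power and logarithm tables for the primitive element alpha = 2.
EXP, LOG = [0] * (2 * N), [0] * (N + 1)
x = 1
for i in range(N):
    EXP[i] = EXP[i + N] = x
    LOG[x] = i
    x <<= 1
    if x >> P:
        x ^= PRIM


def gf_mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]


def poly_mul(a, b):
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] ^= gf_mul(ai, bj)
    return out


def poly_mod(num, den):
    """Remainder of num / den; den must be monic. Highest degree first."""
    rem = list(num)
    for i in range(len(num) - len(den) + 1):
        coef = rem[i]
        if coef:
            for j, d in enumerate(den):
                rem[i + j] ^= gf_mul(coef, d)
    return rem[len(rem) - (len(den) - 1):]


def poly_eval(poly, point):
    y = 0
    for c in poly:               # Horner's rule
        y = gf_mul(y, point) ^ c
    return y


# g(z) = (z - alpha^1)...(z - alpha^2t), see (4.50); minus equals plus in GF(2^p).
GEN = [1]
for i in range(1, 2 * T + 1):
    GEN = poly_mul(GEN, [1, EXP[i]])


def encode(msg):
    parity = poly_mod(msg + [0] * (2 * T), GEN)   # r(z) from (4.51)
    return msg + parity


def syndromes(word):
    """Evaluate the word at the 2t roots of the generator polynomial."""
    return [poly_eval(word, EXP[i]) for i in range(1, 2 * T + 1)]


msg = list(range(1, K + 1))
codeword = encode(msg)
error = [0] * N
error[3], error[9] = 7, 1                        # two symbol errors
received = [c ^ e for c, e in zip(codeword, error)]
print(syndromes(codeword))                       # [0, 0, 0, 0]
print(syndromes(received) == syndromes(error))   # True
```

The last line illustrates that the syndromes depend only on the error pattern and not on the transmitted codeword.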
4.3.1.5 Low Density Parity Check Codes
Low Density Parity Check codes (LDPC codes) belong to the class of linear block codes. They were first introduced by Gallager in his PhD thesis in 1960. The thesis was published three years later by MIT Press [Gal63]. LDPC codes were largely ignored for the next 30 years because their encoding and decoding were considered impractical for the technology of that time. Moreover, Reed Solomon codes covered the gap for the next few decades. The only notable work on LDPC codes during this time was done by Tanner in 1981 [Tan81], who introduced the graphical representation of LDPC codes now known as the Tanner graph representation. With the demonstration of the excellent performance of Turbo codes, researchers looked further for capacity-approaching codes, and LDPC codes were reinvented in 1993, largely credited to MacKay [KaNe95][Kay99] and Luby [AlLu96]. LDPC codes are widely used in many standards such as WiFi/IEEE 802.11, WiMAX/IEEE 802.16e, DVB-S2, Ethernet 10GBASE-T, etc. Like all linear block codes, LDPC codes can be represented via matrices. The percentage of ones in the parity check matrix of an LDPC code is quite low, hence the name low density parity check codes. For example, a (n, m) code with n = 6 and m = 4 can be described by the following simple m×n parity check matrix H:
$H = \begin{pmatrix} 1 & 1 & 0 & 1 & 0 & 0 \\ 0 & 1 & 1 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 & 1 & 1 \\ 0 & 0 & 1 & 1 & 0 & 1 \end{pmatrix}$ (4.57)
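The structure of such a matrix can be inspected with a few lines of code. The sketch below assumes that (4.57) reads, row by row, as the 4×6 matrix given in the code; it counts the ones per row and per column and uses H to test whether a word is a codeword (all parity checks satisfied). The test word and the function name `parity_checks` are illustrative assumptions. A matrix this small is of course not sparse; it only shows the mechanics.

```python
# Assumed row-by-row reading of the parity check matrix (4.57).
H = [
    [1, 1, 0, 1, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [1, 0, 0, 0, 1, 1],
    [0, 0, 1, 1, 0, 1],
]

row_weights = [sum(row) for row in H]          # ones per row
col_weights = [sum(col) for col in zip(*H)]    # ones per column
density = sum(row_weights) / (len(H) * len(H[0]))


def parity_checks(H, word):
    """Result of every parity check (H times word, modulo 2); all zeros for a codeword."""
    return [sum(h * w for h, w in zip(row, word)) % 2 for row in H]


print(row_weights)                              # [3, 3, 3, 3]
print(col_weights)                              # [2, 2, 2, 2, 2, 2]
print(density)                                  # 0.5
print(parity_checks(H, [1, 1, 0, 0, 1, 0]))     # [0, 0, 0, 0]: a valid codeword
print(parity_checks(H, [0, 1, 0, 0, 1, 0]))     # [1, 0, 1, 0]: first bit flipped
```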
If wc is the column weight (the number of ones in a column) and wr the row weight (the number of ones in a row), the following must hold for the low density matrix H: wc [...]
[...] n1 > n2 (c1 < c2), there is a critical incident angle φ1c at which the angle φ2 of the refracted ray is equal to 90° (i.e. sin φ2 = 1), determined by:
$\sin\varphi_{1c} = \frac{n_2}{n_1}$ (7.29)
Only the rays with incident angles greater than φ1c will be transmitted through a fiber, while the rays with incident angles less than φ1c will be refracted at the boundary between core and cladding and leak out through the cladding. The critical angle of total reflection also determines the maximal incident angle φ0,max of rays from the light source at the fiber input for which transmission through the fiber is still possible. The maximal angle is expressed by the parameter named numerical aperture NA:
$NA = n_0 \sin\varphi_{0,\max} = n_1 \sin(90^{\circ} - \varphi_{1c}) = n_1 \sqrt{1 - \sin^{2}\varphi_{1c}} = \sqrt{n_1^{2} - n_2^{2}}$
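A short numerical sketch makes the orders of magnitude visible. The refractive indices below (n1 = 1.450, n2 = 1.4457, launch from air with n0 = 1) are hypothetical values chosen for illustration, and the function names are assumptions as well.

```python
import math


def critical_angle_deg(n1: float, n2: float) -> float:
    """Critical angle of total reflection from sin(phi_1c) = n2 / n1, in degrees."""
    return math.degrees(math.asin(n2 / n1))


def numerical_aperture(n1: float, n2: float) -> float:
    """NA = sqrt(n1^2 - n2^2)."""
    return math.sqrt(n1 ** 2 - n2 ** 2)


def max_incident_angle_deg(n1: float, n2: float, n0: float = 1.0) -> float:
    """Maximal incident angle phi_0,max from NA = n0 * sin(phi_0,max), in degrees."""
    return math.degrees(math.asin(numerical_aperture(n1, n2) / n0))


n1, n2 = 1.450, 1.4457        # hypothetical core and cladding indices
print(f"critical angle   : {critical_angle_deg(n1, n2):.2f} deg")
print(f"NA               : {numerical_aperture(n1, n2):.4f}")
print(f"max. launch angle: {max_incident_angle_deg(n1, n2):.2f} deg")
```

With such a small index difference the critical angle is close to 90°, so only rays travelling almost parallel to the fiber axis are guided.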
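[Figure: ray path in an optical fiber, with labels n0, n1, n2, φ0,max, φ1c and φ1.]
[...]
$E_z(r, \varphi) = \begin{cases} A J_{\nu}(ur)\, e^{j\nu\varphi}, & \text{for } r \le a \\ B K_{\nu}(wr)\, e^{j\nu\varphi}, & \text{for } r > a \end{cases}$ (7.40a)
$H_z(r, \varphi) = \begin{cases} C J_{\nu}(ur)\, e^{j\nu\varphi}, & \text{for } r \le a \\ D K_{\nu}(wr)\, e^{j\nu\varphi}, & \text{for } r > a \end{cases}$ (7.40b)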
A, B, C and D are constants, a is the radius of the homogeneous core of the waveguide with the refractive index n1 (the refractive index of the cladding dielectric for r > a is n2), u and w are given by $u^{2} = k_0^{2} n_1^{2} - \beta^{2}$ and $w^{2} = \beta^{2} - k_0^{2} n_2^{2}$, where k0 is the free-space wave number, $k_0 = \omega / c_0 = \omega\sqrt{\mu_0 \varepsilon_0} = 2\pi / \lambda_0$ (c0 and λ0 are the speed and wavelength of the light wave in free space, while µ0 and ε0 are the magnetic permeability and electric permittivity of free space), Jν(x) is the ν-th order Bessel function of the first kind (Fig. 7.13), and Kν(x) is the modified ν-th order Bessel function of the second kind (Fig. 7.13).
Fig. 7.13: Bessel functions Jν(x) and Kν(x), and propagating modes. [Plots of J0 to J3 and of K0 and K1 versus x, with the roots of J0 marked at x = 2.405, 5.520 and 8.654, and a chart of the LP modes with their constituent HE, EH, TE and TM modes.]
From the previous expressions and Figure 7.13, it is obvious that electric and magnetic field are mostly distributed within the core of optical waveguide (i.e. fiber) but also they are present in the cladding. The energy of the fields in the cladding is much smaller than the energy in the core and it decays fast with radius r (for r > a), i.e. according to the descending trends of the modified Bessel functions Kν(wr). This fact is not in line with the geometric optics approach which does not assume the propagation of rays of light from the source through the cladding. The phase coefficient β can be found in a pretty complicated procedure [Tom92] [ZhBo90][Kiv93] where the boundary conditions at the intermediate surface between the core and cladding (for r = a) are applied on the tangential components of electric and magnetic field (Ez, Eφ, Hz, and Hφ). As a result, a characteristic equation for evaluation of β can be derived [Agr97] (avoided here due to complexity), which can be solved only numerically. Many solutions for β can be calculated from that characteristic equation, but all of them depend on ω, n1, n2 and integer parameter ν. Also,
the values of β for the real propagating waves can only be found within the following boundaries:
$k_0 n_2 \le \beta \le k_0 n_1 \quad (n_2 < n_1)$ (7.41)
and within this range the phase coefficient has a discrete number of solutions. For ν = 0 the (electric and magnetic) fields do not depend on the angular coordinate φ and there are two groups of solutions for β, i.e. two groups of waves:
1. transverse electric TE0m waves, where Ez = 0, and
2. transverse magnetic TM0m waves, where Hz = 0.
The second index m (m = 1, 2, …) is assigned to each particular solution for β, i.e. to a concrete value β0m. For ν ≠ 0 the fields change with φ and, instead of TEνm and TMνm, the solutions of the characteristic equation are two groups of so-called hybrid waves:
1. HEνm waves, where the influence of Ez is more dominant than that of Hz, and
2. EHνm waves, where the influence of Hz is more dominant than that of Ez.
Transverse TE0m and TM0m waves may also be considered as special cases of HEνm and EHνm for ν = 0. Each solution βνm represents one possible mode of field propagation, and each mode has its own (i.e. characteristic) spatial distribution of the field. Along the waveguide, the spatial distribution of the optical field is constant for one propagating mode, and the only thing that changes with the longitudinal coordinate z is the phase. The propagation coefficient β also depends on the frequency ω in a very complex way determined by its characteristic equation (in which Bessel functions and their derivatives are involved). Also, for each propagating mode the assigned βνm must lie within the range defined by the inequalities in (7.41). As a consequence, each mode has its critical frequency ωc below which that mode cannot propagate. In order to make the solutions for β universal, i.e. independent of other waveguide parameters (core radius a and refractive indices n1 and n2 of the core and cladding), a parameter known as the normalized frequency V is defined:
$V = k_0 a \sqrt{n_1^{2} - n_2^{2}} = \frac{\omega}{c_0}\, a \sqrt{n_1^{2} - n_2^{2}}$ (7.42)
For transverse modes TE0m and TM0m (i.e. when ν = 0) the characteristic equation for β has a simplified form [Mar98]:
$\frac{U J_0(U)}{J_0'(U)} = -\frac{W K_0(W)}{K_0'(W)}$ (7.43)
where U and W are normalized parameters defined as:
$U = ua = a \sqrt{k_0^{2} n_1^{2} - \beta^{2}}, \qquad W = wa = a \sqrt{\beta^{2} - k_0^{2} n_2^{2}}$ (7.44)
The connection between the normalized parameters is: $V^{2} = U^{2} + W^{2}$. According to (7.41), the critical frequencies for different transverse modes of propagation can be found from the condition:
$\beta = k_0 n_2$ (7.45)
As a consequence, since then W = 0 and V = U, the characteristic equation (7.43) is maximally simplified, i.e. it becomes:
$J_0(V_c) = 0$ (7.46)
where Vc is the normalized critical frequency. The roots of the zero-order Bessel function J0(Vc) are shown in Figure 7.13. From (7.42) it follows that:
$\omega_c = \frac{c_0 V_{c,0m}}{a} \cdot \frac{1}{\sqrt{n_1^{2} - n_2^{2}}}, \qquad \lambda_c = \frac{2\pi c_0}{\omega_c}$ (7.47)
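Equations (7.42) and (7.47) can be evaluated numerically for the first root of J0, Vc,01 = 2.405. The sketch below does this for a hypothetical step-index fiber (core radius a = 4.1 µm, n1 = 1.450, n2 = 1.4457); these values and the function names are assumptions made for illustration only.

```python
import math

V_C01 = 2.405                 # first root of J0, see Fig. 7.13


def normalized_frequency(a: float, wavelength: float, n1: float, n2: float) -> float:
    """V from (7.42), written with k0 = 2*pi / wavelength (lengths in metres)."""
    return 2 * math.pi * a / wavelength * math.sqrt(n1 ** 2 - n2 ** 2)


def cutoff_wavelength(a: float, n1: float, n2: float, vc: float = V_C01) -> float:
    """Critical wavelength from (7.47): lambda_c = 2*pi*a*sqrt(n1^2 - n2^2) / Vc."""
    return 2 * math.pi * a * math.sqrt(n1 ** 2 - n2 ** 2) / vc


a, n1, n2 = 4.1e-6, 1.450, 1.4457          # hypothetical fiber parameters
lam_c = cutoff_wavelength(a, n1, n2)
print(f"cut-off wavelength: {lam_c * 1e9:.0f} nm")
for lam in (850e-9, 1310e-9, 1550e-9):
    v = normalized_frequency(a, lam, n1, n2)
    print(f"{lam * 1e9:.0f} nm: V = {v:.3f}, below first J0 root: {v < V_C01}")
```

For these assumed parameters V stays below 2.405 at 1310 nm and 1550 nm, but not at 850 nm.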
The procedure of finding the critical frequencies of the hybrid HEνm and EHνm modes of propagation when ν ≠ 0 is also based on solving the characteristic equation for β with the condition β = k0n2, but it is more complicated. From the shapes of the Bessel functions Jν(Vc) for different values of ν it can be noticed that there is a solution which has a critical frequency at Vc = 0, meaning that such a propagating mode can exist at all frequencies. This solution is the HE11 mode, i.e. ν = 1 and m = 1, since only the function J1(Vc) has its first root at Vc = 0. The next two modes (see Fig. 7.13), which appear at Vc,01 = 2.405, are the modes TE01 and TM01, followed by HE21 and other higher modes (EH11, HE12, HE31, EH21, HE41); then at Vc,02 = 5.520 the modes TE02, TM02 and HE22 appear, etc. Equation (7.47) is significant for the design of optical fibers since it shows how to adjust the basic waveguide parameters with regard to the frequency, i.e. the wavelength. For example, the cut-off wavelength λc above which an optical fiber works in the so-called single-mode regime (only the HE11 mode) can be found from (7.47) for Vc,01 = 2.405. When the difference ∆ between the refractive indices of the core and cladding of a fiber is less than 0.2 %, the fiber is weakly guiding. For such fibers some approximations may be applied, after which pairs of HE and EH modes with the same propagation coefficient β appear as the solutions of the characteristic equation, i.e.:
$\beta^{(EH)}_{p-1,m} = \beta^{(HE)}_{p+1,m}, \quad p = 1, 2, \ldots$ (7.48)
It has been shown [Mar98] that the combinations of these pairs of HE and EH modes result in specific modes with linear polarization of the waves (LP modes), which means that at every point of a fiber the field vectors oscillate along a line passing through the point in both directions (and all the vectors oscillate along parallel lines). The basic mode with linear polarization is LP01, which consists of two identical (mutually orthogonal) HE11 waves. The same holds for the higher LP0m modes (each one consists of two orthogonal HE1m waves). The LP11 mode also has two orthogonal waves: the first is a combination of HE21 and TM01, and the second is a combination of HE21 and TE01 modes (Fig. 7.14). Higher LP1m modes are, in a similar way, combinations of HE2m, TM0m and TE0m waves. Other LPpm modes (when p ≥ 2) are made of pairs of HEp+1,m and EHp–1,m waves. In all LP modes the longitudinal components of the field vectors (Ez and Hz) are very small compared to the transverse ones, so LP modes can be regarded as transverse waves. In the single-mode working regime of a fiber the only mode which propagates is LP01.
Fig. 7.14: Field distribution for LP01 mode and for two variants of LP11 mode. [Panels: LP01 (HE11); LP11 as HE21 with TM01; LP11 as HE21 with TE01.]
7.3.2 Attenuation in Glass Fibers
In Section 7.3.1 the lossless propagation of light through a waveguide is analyzed, and the propagation in the direction of the longitudinal axis z of the waveguide is modeled using the phase coefficient βνm for each propagating mode. Since the transmission of light through glass fibers is lossy, a realistic model of propagation should also include the attenuation coefficient α, in a similar way as in the case of copper transmission lines.
Attenuation in an optical fiber is the reduction of optical power as the light travels along the fiber. It is expressed in decibels for a given fiber length or by an attenuation coefficient (in dB/km) at a specific transmission wavelength. In the past, attenuation was a big challenge for the production of optical fibers. For example, fibers made of common window glass would have an attenuation of around 50,000 dB/km! Today, optical fibers have an attenuation of less than 0.2 dB/km as a result of the very high purity of the quartz glass (SiO2) the fibers are made of. This high purity assumes the presence of less than 1 mg of impurities in 1 t of quartz glass (i.e. < 10–6).
The attenuation of optical power is the consequence of absorption and scattering. There are three types of absorption: ultraviolet (UV), infrared (IR) and the absorption due to impurities that are unwanted or added on purpose. The UV absorption, caused by the electrons of the material, is significant in the visible and ultraviolet range of wavelengths (for λ < 800 nm). The IR absorption affects wavelengths above 1600 nm as a consequence of oscillations of oxygen atoms in SiO2 molecules. Both UV and IR absorption are properties of the material and cannot be avoided. Because of absorption, a part of the optical energy is turned into heat, which increases the temperature of the fiber. The impurities in a fiber should be reduced to the lowest possible level, since they cause so-called resonant absorption at wavelengths specific to each type of impurity. It is especially important to free the fiber of water molecules, i.e. of OH– ions, since negative OH– ions are attracted by water molecules. The OH– absorption has its main resonant point at 2700 nm, with the first and the second harmonic around 1390 and 950 nm. There is also another point of OH– absorption around 1250 nm.
Scattering is a phenomenon in which a part of the light diverges from its direction of propagation.
When a wave of light comes upon some inhomogeneity in the density or in the composition of the material, or when the boundary between the core and cladding is not smooth enough due to variations of the core diameter and micro cracks, it may be diffracted in all directions. Scattering caused by inhomogeneities that are distributed so sparsely that each one can be considered in isolation is called linear scattering. The most important linear scattering is Rayleigh scattering [Str71][Raj08], caused by thermal fluctuations of density within the glass, since it contributes up to 90 % of the overall attenuation in the fiber. This type of scattering depends on the wavelength of light (it is proportional to $1/\lambda^{4}$) and is the dominant factor of attenuation for λ < 1300 nm. Summarizing the above-mentioned effects, the attenuation coefficient can be mathematically presented as:
$\alpha \left[\frac{dB}{km}\right] = -\frac{10}{L} \log\frac{P_2}{P_1} = \frac{A}{\lambda^{4}} + B + C(\lambda)$ (7.49)
where L is the length of a fiber, and P1 and P2 are the powers of an optical signal at the beginning and at the end of the fiber. The coefficient A models the influence of Rayleigh scattering, the coefficient B represents other effects which do not depend on wavelength, and C(λ) describes particular peak values of attenuation related to the absorption due to impurities. The typical curve of attenuation within the range of wavelengths of interest is shown in Figure 7.15, where three regions with local minima can be noticed, the so-called optical windows. Hence, the light sources used in optical communications, i.e. light emitting diodes (LEDs) and lasers, were developed to operate at or around 850 nm (first window), 1310 nm (second window), and 1550 nm (third window). Over time, the production process of optical fibers has constantly been improved, resulting in today's fibers with very low "peaks of attenuation" between the "windows" (so-called water peaks), which makes the division of the wavelength range into optical windows obsolete. Also, the attenuation of modern optical fibers can be as low as 0.17 dB/km (at wavelengths around 1550 nm), as shown in Figure 7.15.
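Fig. 7.15: Attenuation characteristic of optical glass fibers. [Plot of attenuation α in dB/km versus wavelength from 0.7 to 1.8 µm for early 1980's, late 1980's and modern fibers, marking the first window (850 nm), the second window (1310 nm) and the third window (1550 nm, C-band and L-band), the OH– absorption peak, Rayleigh scattering with UV absorption, and IR absorption.]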
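The definition of the attenuation coefficient in (7.49) can be illustrated with a short calculation. The sketch below assumes that "log" in (7.49) is the base-10 logarithm (as usual for decibels); the function names and the numerical values are assumptions chosen for illustration.

```python
import math


def attenuation_coefficient(p1: float, p2: float, length_km: float) -> float:
    """alpha in dB/km from input power p1 and output power p2 (same units), see (7.49)."""
    return -10.0 / length_km * math.log10(p2 / p1)


def output_power(p1: float, alpha_db_per_km: float, length_km: float) -> float:
    """Power remaining after length_km of fiber with attenuation alpha."""
    return p1 * 10 ** (-alpha_db_per_km * length_km / 10.0)


def rayleigh_ratio(wavelength_a: float, wavelength_b: float) -> float:
    """Ratio of the A / lambda^4 term at wavelength_a to that at wavelength_b."""
    return (wavelength_b / wavelength_a) ** 4


# 1 mW launched into 100 km of fiber with 0.2 dB/km leaves about 1 % of the power.
p_out = output_power(1.0, 0.2, 100.0)
print(f"output power: {p_out:.4f} mW")
print(f"alpha       : {attenuation_coefficient(1.0, p_out, 100.0):.3f} dB/km")
# The Rayleigh term is roughly 11 times larger at 850 nm than at 1550 nm.
print(f"Rayleigh 850 nm / 1550 nm: {rayleigh_ratio(850.0, 1550.0):.1f}")
```

The strong 1/λ^4 dependence is one reason why the windows at longer wavelengths offer lower attenuation than the first window.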
There are also other potential sources of attenuation in fibers. For example, when the optical power exceeds a certain threshold, so-called non-linear effects may arise, among them Raman [Ram28][Koetal12] and Brillouin scattering [BuBo96]. Once the non-linear processes in a fiber are triggered, large attenuation may appear and even the physical destruction of the fiber may occur. Very common are losses due to micro- and macro-bending of fibers, in which case the conditions for total reflection of the propagating light are violated, so that a part of the light is refracted out of the fiber. Micro-bendings are microscopic
irregularities in the fiber's geometry, e.g. deviations of the surface between the core and cladding from the ideally cylindrical shape, or circular asymmetry of the fiber's cross-section. Micro-bends may be the result of mechanical stresses during the production of fibers and optical cables, as well as of improper installation of optical cables, including their laying underground. The losses caused by micro-bending may be between 0.1 and 10 dB/km. When, for example, a fiber is curved (in a terminating or connection box) with a radius of curvature of less than ~ 10 cm, additional losses may appear. These losses, i.e. losses due to macro-bending, are a consequence of improper installation of an optical cable. Finally, there are losses at the connection points of fibers, whether the connections are realized by mechanical connectors or by fusion splicing of fibers. Mechanical connectors are placed at the termination points of an optical line, while fusion splice connections are distributed along the whole span of the optical route every 2–5 km (depending on the length of the optical cable segment delivered by the producer, which is wound on a drum before being buried underground). Losses introduced by mechanical connectors depend on their quality and are given by the manufacturer as nominal values (commonly 0.15 dB or more). On the other hand, these losses are very sensitive to particles of dirt, as well as to possible scratches on the end face of the fiber terminated with the connector. The additional attenuation may be several times greater than the nominal one, so optical connectors should be handled very carefully.
Fig. 7.16: Preparation of ends of fibers and splicing. [Drawing of fiber cleaving (fiber with primary coating, naked fiber with stripped primary coating, initial crack, tension, flat surface) and of a splicer (holding clamps, mounting blocks, electrodes, x-axis and y-axis positioning, display).]
A fusion splice between two fibers may also introduce some attenuation. This type of connection is made in the field using sophisticated and very precise instruments, fusion splicers, in which the ends of two fibers are merged by melting them with an electric arc, i.e. a spark between electrodes. The quality of a splice connection (and thus the splice attenuation) depends on many factors, such as: power and duration of the spark, cleanness of the electrodes, timing parameters of the splicer, environmental conditions, (im)proper handling of the fibers' ends, differences between the core diameters of the two fibers, angular misalignment of the fiber axes during splicing, etc. Generally, splice connections are better than mechanical connectors, since they introduce less attenuation, on average less than 0.1 dB. The attenuation may be significantly greater, so the quality of splicers and fibers, as well as careful handling of fibers by technicians in the field, are very important. All the above-mentioned factors of attenuation have to be taken into account when calculating the so-called power budget of an optical communication system. The total power loss of an "optical line" must be less than the difference between the optical power of the light emitting source and the sensitivity of the photo-diode in the receiver. The power budget of an optical fiber terminated at both sides with mechanical connectors (mounted in distribution frames or patch-panels) can be estimated using the following expression:
$A_{total}\,[dB] = 3 A_c + L \alpha + N_s \cdot 0.05\ dB + M$ (7.50)
where:
– Atotal is the total loss of an "optical path" between a transmitter and a receiver,
– Ac is the nominal attenuation of the connector specified by the manufacturer (it is assumed that each end of the fiber is connected to the transmitter, i.e. receiver, using a single fiber patch-cord with connectors at both ends; therefore, Ac is multiplied by 3 to take into account the losses of two mechanical connectors at the fiber ends, and two halves of the losses at the connection points of the patch-cords to the transmitter and receiver),
– L is the length and α is the attenuation coefficient of the fiber at the working wavelength,
– Ns is the number of splices along the fiber (taking into account also two additional splices with the "pig-tailed" connectors in distribution frames/patch-panels at both ends),
– M is the additional margin (1–2 dB).
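A power budget according to (7.50) can be sketched as follows. The link parameters used in the example (40 km of fiber, 0.25 dB/km, 12 splices, 0.15 dB connectors, 2 dB margin, 0 dBm transmitter power, −20 dBm receiver sensitivity) and the function names are hypothetical and serve only as an illustration.

```python
def total_loss_db(a_c: float, length_km: float, alpha: float,
                  n_splices: int, margin: float, splice_loss: float = 0.05) -> float:
    """Total loss according to (7.50): 3*Ac + L*alpha + Ns*0.05 dB + M."""
    return 3 * a_c + length_km * alpha + n_splices * splice_loss + margin


def link_is_feasible(tx_power_dbm: float, rx_sensitivity_dbm: float,
                     loss_db: float) -> bool:
    """The loss must be smaller than the difference between source power and sensitivity."""
    return loss_db < tx_power_dbm - rx_sensitivity_dbm


loss = total_loss_db(a_c=0.15, length_km=40.0, alpha=0.25, n_splices=12, margin=2.0)
print(f"total loss: {loss:.2f} dB")                          # total loss: 13.05 dB
print(link_is_feasible(0.0, -20.0, loss))                    # True: 13.05 dB < 20 dB
```

The splice loss of 0.05 dB is kept as a parameter so that a different average value can be tried without changing the formula.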
7.3.3 Dispersion in optical fibers
Dispersion is the main factor that limits the bandwidth, i.e. the information capacity, of an optical fiber. As in the case of copper lines, dispersion manifests itself as a broadening of (optical) pulse signals during propagation. It is caused by different delays of the propagating waves at the fiber's end. The differences of delay in optical fibers are caused by different velocities of the propagating modes at the same frequency, i.e. at the same wavelength (intermodal dispersion), but also by the dependence of the velocity on frequency for each propagating mode (chromatic dispersion). Hence, it is not enough to know the values of the propagation coefficients βνm of the various modes at the (optical) carrier frequency; it must also be known how each βνm changes over the whole range of signal frequencies. When a fiber works in the so-called multimode regime, both intermodal and chromatic dispersion are present. Which type of dispersion is dominant depends on the type of excitation of the fiber, i.e. on the light source of the transmitter. If an LED is used as the source, both intermodal and chromatic dispersion are significant (although intermodal dispersion is dominant), while when a laser is used, chromatic dispersion can be neglected in comparison with intermodal dispersion, since a laser source has a much narrower spectrum of emitted light than an LED. In terms of geometric optics, intermodal dispersion can be explained by various rays of light (at the same wavelength) which pass through the fiber core under different angles of total reflection (different "modes"), so that between the beginning and the end of the fiber there are many paths with different lengths, resulting in different signal delays at the receiving side. The explanation given by electromagnetic wave theory is that each mode has its own propagation coefficient βνm, and since the phase velocity of a mode is cνm = ω/βνm, each mode has a different delay for the same ω.
Because of intermodal dispersion, the bandwidth of fibers in multimode regime is much smaller than in the single-mode regime.
Fig. 7.17: Intermodal and chromatic dispersion. [Two panels, each with a light source, a fiber and a detector: intermodal dispersion, and chromatic dispersion with wavelengths λ1 to λ4; in both cases the components arrive at the detector with delay differences Δt.]
In the single-mode regime only the basic HE11 (i.e. LP01) mode propagates along the fiber, so intermodal dispersion does not exist. If the propagation coefficient β of that mode had a linear dependence on frequency, i.e. β(ω) = Kω, chromatic dispersion would not exist, since all the frequency components (of a digitally modulated optical carrier) would propagate at the same speed. This is not the case for real fibers. The existence of higher-order (non-linear) terms in the function β(ω) leads to different group velocities of a modulated signal at different wavelengths. This means that the group delay, defined as:
$\tau_g = L \frac{d\beta(\omega)}{d\omega}$ (7.51)
is a function of frequency, i.e. of wavelength. Chromatic dispersion represents the variation of the group delay with the wavelength λ for a given fiber length L:
$d_{CD} = \frac{1}{L} \cdot \frac{d\tau_g}{d\lambda}$ (7.52)
Chromatic dispersion is often expressed in ps/(nm⋅km), showing by how many ps the signal at a wavelength λ0 arrives at the end of a 1 km long fiber before or after the signal at a wavelength λ0 ± 1 nm. Since β depends in a very complex way on the frequency and on the fiber parameters (core diameter, refractive indices, and electric permittivity and magnetic permeability of core and cladding), it is obvious from (7.51) and (7.52) that dCD is an even more complex function of the same parameters (e.g. the second derivative of β(ω) has to be found). Nevertheless, it is possible to influence and to minimize the chromatic dispersion at the working wavelength by a proper choice of the mentioned parameters. Also, there are several empirical expressions that describe the group delay in a simpler way, as well as numerous techniques that mitigate the effects of chromatic dispersion. Another possible limiting factor of the information capacity of a single-mode optical fiber is the so-called polarization mode dispersion (PMD) [PoWa86]. This type of dispersion may become a problem at bitrates of 10 Gbit/s and more. Since real optical fibers are not ideal, they may have anisotropic refractive indices, meaning that different polarizations of light propagate at different velocities. As mentioned earlier, each propagating mode in a fiber may have its paired mode with the same propagation coefficient, but orthogonally polarized. The same holds for the single-mode regime, where two orthogonal linearly polarized LP01 waves propagate at the same time. Due to anisotropy, the two waves may have different delays and, as a consequence, the optical pulse at the fiber's end is broadened and additionally attenuated. So, besides PMD, there is also the so-called polarization dependent loss (PDL) [GaMe05].
Fig. 7.18: Polarization mode dispersion and polarization dependent loss. [Sketch of a pulse split between the fast axis and the slow axis, showing the DGD and the PDL.]
Unlike chromatic dispersion, PMD is not constant in time, since its value fluctuates due to external factors such as mechanical strain and pressure, or temperature. Therefore, PMD is treated statistically, and the main parameter that describes PMD in a fiber is the mean differential group delay (mean DGD or PMD delay), expressed in picoseconds. Since the PMD delay is proportional to the square root of the fiber length, a PMD coefficient expressed in $ps/\sqrt{km}$ is also introduced. Thanks to the improved production process, modern fibers have a lower PMD than e.g. 10 years ago. There are also several ways of mitigating the impairments caused by PMD. The rule of thumb is to keep the PMD delay below one tenth of the duration of one bit for a particular optical communication system.
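The two dispersion measures of this section lend themselves to a quick estimate. The sketch below assumes that, for a narrow source spectrum Δλ, the pulse broadening following from (7.52) can be approximated as Δτ ≈ |dCD|⋅L⋅Δλ, and it applies the one-tenth-of-a-bit rule of thumb to the mean DGD. The function names and all numerical values (17 ps/(nm⋅km), 80 km, 0.1 nm linewidth, PMD coefficients of 0.2 and 0.5 ps/√km) are assumptions for illustration.

```python
import math


def chromatic_broadening_ps(d_cd: float, length_km: float, linewidth_nm: float) -> float:
    """Approximate pulse broadening in ps: |d_CD| * L * delta_lambda."""
    return abs(d_cd) * length_km * linewidth_nm


def mean_dgd_ps(pmd_coefficient: float, length_km: float) -> float:
    """Mean differential group delay in ps for a PMD coefficient in ps/sqrt(km)."""
    return pmd_coefficient * math.sqrt(length_km)


def pmd_within_rule_of_thumb(pmd_coefficient: float, length_km: float,
                             bitrate_bps: float) -> bool:
    """True if the mean DGD is below one tenth of the bit duration."""
    bit_duration_ps = 1e12 / bitrate_bps
    return mean_dgd_ps(pmd_coefficient, length_km) < 0.1 * bit_duration_ps


print(f"{chromatic_broadening_ps(17.0, 80.0, 0.1):.0f} ps")   # 136 ps
print(f"{mean_dgd_ps(0.2, 100.0):.1f} ps")                     # 2.0 ps
print(pmd_within_rule_of_thumb(0.2, 100.0, 10e9))              # True: 2 ps < 10 ps
print(pmd_within_rule_of_thumb(0.5, 100.0, 40e9))              # False: 5 ps > 2.5 ps
```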
7.3.4 Types of optical fibers and cables Many types and constructions of optical fibers were designed for various applications during the last four decades of development. According to the material the optical fibers are made of, there are three types: silica glass fibers, plastic fibers and Plastic-Clad-Silica (PCS) fibers. Glass fibers have both the core and cladding made of doped SiO2. Glass fibers are the most used in communications systems, since they have small losses and provide high bandwidths. Plastic optical fibers (POFs) have large diameters of the core and cladding (480/500, 735/750 and 980/1000 µm), large attenuation (more than 1 dB/km at 650 nm) and small bandwidth (around 5 MHz⋅km). On the other hand, they are much cheaper than glass fibers. PCS fibers with SiO2 core and plastic cladding were the compromise between the first two types. The main categorization of optical fibers is done by the type of propagating waves. Regarding this, there are two categories: multi-mode (MM) and single-mode (SM) fibers. The single- i.e. the multi-mode “regime of work” has been mentioned in the previous sections, meaning that for given parameters only one or more modes propagate through a fiber. Namely, if the working wavelength λ of an optical communication system is greater than the critical wavelength λc (also called cut-off wavelength), the fiber is in the single-mode regime, otherwise – in the multi-mode regime. The cut-off wavelength is one of the main parameters of a fiber. The relation between λc and other basic parameters is given by (7.47) forVc,01 = 2.405, giving the condition for single-mode regime:
\[ \frac{a}{\lambda} < \frac{0.27}{n_1 \sqrt{\Delta}} \qquad (7.53) \]
Since the refractive indices n1 and n2 of the fiber’s core and cladding are adjusted to be around 1.45, with a relative difference ∆ in the range of 0.20–0.39 %, the previous condition is fulfilled only when the ratio 2a/λ is less than ~ 10. On the other hand, lasers in optical communications have been designed to work either in the second (around 1310 nm) or in the third optical window (around 1550 nm) due to the attenuation characteristic of glass fibers. Therefore, single-mode fibers must have a core diameter of less than ~ 11 µm. A wider core allows simpler and more efficient injection of light from the (laser) source into the fiber, and fibers with larger core diameters are easier (and less expensive) to produce. As a compromise, all standard single-mode fibers today have core diameters between 8 and 10 µm.
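The single-mode condition can be evaluated for concrete values. The sketch below assumes that the normalized frequency has the weak-guidance form V = (2πa/λ)·n1·√(2∆), which reproduces the constant 0.27 in (7.53); the function name and the chosen example values are illustrative assumptions.

```python
import math

V_CUTOFF = 2.405  # cut-off value V_c,01 quoted in the text


def max_core_diameter_um(wavelength_um: float, n1: float, delta: float) -> float:
    """Largest core diameter 2a (in µm) for which only one mode propagates.

    Assumes V = (2*pi*a/lambda) * n1 * sqrt(2*delta) < 2.405, which is
    equivalent to a/lambda < 0.27 / (n1 * sqrt(delta)).
    """
    max_radius_um = V_CUTOFF * wavelength_um / (2 * math.pi * n1 * math.sqrt(2 * delta))
    return 2 * max_radius_um


if __name__ == "__main__":
    # n1 = 1.45 and delta between 0.20 % and 0.39 %, as in the text.
    for wavelength_um in (1.31, 1.55):
        for delta in (0.0020, 0.0039):
            d = max_core_diameter_um(wavelength_um, 1.45, delta)
            print(f"lambda = {wavelength_um} µm, delta = {delta:.2%}: 2a < {d:.1f} µm")
```

For λ = 1310 nm and ∆ = 0.20 % this gives a maximal core diameter of roughly 11 µm, in line with the bound stated above; larger values of ∆ tighten the bound.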
The criterion for the cladding diameter is not so strict, since in the single-mode regime the distribution of optical power is such that only 4–8 % of the total power passes through the cladding. The cladding diameter has to be just large enough to prevent light from leaking out of the fiber, but it should also be chosen so that the fiber as a whole has desirable mechanical characteristics. These characteristics should enable easy handling, for example during splicing or when making a mechanical connection with another fiber, when the primary protection (made of plastic) has to be removed at the fibers’ ends. Typically, both SM and MM fibers have the same cladding diameter of 125 µm. Since intermodal dispersion cannot occur in SM fibers, they have much higher capacities than MM fibers, as well as lower attenuation.
After more than 30 years of research and development in optical communications, several types of SM fibers are standardized by ITU and used worldwide today. In the beginning, SM fibers were designed and optimized to work in the second optical window, resulting in the most widely deployed type: the standard mono-mode fiber (SMF), also referred to as “non-dispersion-shifted fiber” (NDSF) or ITU G.652 fiber [G652]. These fibers have the following characteristics:
– Cut-off wavelength: ≤ 1260 nm
– Diameter (core/cladding): 9/125 µm
– Attenuation: ≤ 0.4 dB/km at 1310 nm; ≤ 0.25 dB/km at 1550 nm
– Chromatic dispersion: 0 ps/(nm⋅km) at 1310 nm; 17 ps/(nm⋅km) at 1550 nm
– PMD coefficient: ≤ 0.5 ps/√km (earlier); ≤ 0.2 ps/√km (now)
– Bandwidth: ~ 100 THz
With the expansion of systems working at a wavelength of 1550 nm, the next type of SM fibers was optimized for the third optical window. The main goal was to minimize dispersion in this region. The outcome was the design of the “dispersion-shifted fiber” (DSF), i.e. the ITU G.653 fiber [G653]. DSF has characteristics similar to those of SMF except for chromatic dispersion, which is now 0 ps/(nm⋅km) at 1550 nm.
Still, as Dense-Wavelength-Division-Multiplexing (DWDM) systems spread, it became clear that DSFs are not suitable for such applications because of the emergence of a non-linear effect known as “four-wave mixing” [Blo79]. To solve this problem, a third type of SM fibers was developed: the “non-zero dispersion-shifted fiber” (NZ-DSF) or ITU G.655 fiber [G655]. It turned out that a small but non-zero amount of dispersion eliminates four-wave mixing, so NZ-DSFs have a chromatic dispersion of up to ±4 ps/(nm⋅km) at 1550 nm and up to ±10 ps/(nm⋅km) in the range between 1450 and 1600 nm. Several variants of NZ-DSFs are on the market today, with a small positive (NZD+) or a small negative (NZD–) dispersion at 1550 nm. NZ-DSF is almost the only type of SM fiber deployed nowadays.
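To get a feeling for the dispersion values quoted for the three fiber types, the pulse broadening can be estimated with the usual first-order relation ∆τ = |D|⋅L⋅∆λ. The sketch below is only an illustration: the link length of 100 km and the source spectral width of 0.1 nm are assumed values, not figures from the text.

```python
def chromatic_broadening_ps(dispersion_ps_per_nm_km: float,
                            length_km: float,
                            source_width_nm: float) -> float:
    """First-order pulse broadening in ps: |D| * L * spectral width of the source."""
    return abs(dispersion_ps_per_nm_km) * length_km * source_width_nm


if __name__ == "__main__":
    # Dispersion at 1550 nm as quoted in the text (NZ-DSF taken at its upper bound).
    fibers = {"SMF (G.652)": 17.0, "DSF (G.653)": 0.0, "NZ-DSF (G.655)": 4.0}
    for name, d in fibers.items():
        # Assumed link: 100 km, laser spectral width 0.1 nm.
        broadening = chromatic_broadening_ps(d, 100.0, 0.1)
        print(f"{name}: about {broadening:.0f} ps over 100 km")
```

With these assumptions the broadening on a standard SMF is already larger than the 100 ps bit duration of a 10 Gbit/s signal, while an NZ-DSF keeps it several times smaller without reducing the dispersion to zero.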
[Figure 7.19: plot of chromatic dispersion dCD in ps/(nm⋅km), from –20 to 20, versus wavelength λ in µm, from 1.2 to 1.7, with curves for SMF (G.652), DSF (G.653) and NZD+ (G.655) fibers. Marked values: 1310 nm, 1550 nm and 17 ps/(nm⋅km).]
Fig. 7.19: Dispersion characteristics of standard SM-Fibers.
Besides these three main types, a few other categories and variants of SM fibers have also been standardized, such as the “Low Water Peak Fiber” (ITU G.652C [G652] or TIA 492CAAB/OS2 [492CAAB]) or the “Bend-Insensitive Fiber” (ITU G.657 [G657]). Especially significant for future systems are the “polarization-maintaining” (PM) fibers. PM fibers deviate from the standard design, i.e. they may have an elliptical, rectangular or other non-circular cross-section of the core and cladding, or in some cases two additional circular structures (the so-called stress rods) outside the core. These modifications enable only one polarization of light to propagate through a fiber, thus increasing the overall bandwidth, i.e. the information capacity of the fiber.
The second large group of optical fibers, multi-mode fibers, started its development shortly after the historical breakthrough in 1970, when the first single-mode fiber with an attenuation of 20 dB/km was made [MaSc72]. Namely, the developers doubted that the very narrow tolerances required for efficient light coupling into SM fibers, as well as for connecting and splicing SM fibers, could be achieved. The result was the design of multi-mode fibers, which have a core several times larger than that of SM fibers. The first standardized MM fibers had a core diameter of 62.5 µm with a cladding diameter of 125 µm (the same as SM fibers). In order to support bit rates of 10 Gbit/s, newer generations of MM fibers have 50/125 µm core/cladding diameters. As mentioned earlier, due to intermodal dispersion and higher attenuation, MM fibers have a much lower capacity, i.e. a much lower effective bandwidth-distance product, than SM fibers.
On the other hand, the larger core allows the injection of light not only from laser sources of coherent light at 1300 nm, but also from LEDs with incoherent light at 850 or 1300 nm (in the first and the second optical window), which makes optical systems with MM fibers significantly cheaper than those with SM fibers. In the last few years, low-cost “Vertical Cavity Surface Emitting Lasers” (VCSELs), which can also work at wavelengths from 850 to 1300 nm, have been used instead of LEDs.
ISO/IEC and TIA/EIA recognize four categories of MM fibers, with the designation OMx (“optical multi-mode” x) [ISO11801]:
– OM1, the oldest; dimensions of core and cladding: 62.5/125 µm; bandwidth-distance product at 850/1300 nm: 200/500 MHz⋅km (i.e. GHz⋅m)
– OM2, typical 50/125 µm graded-index fiber, also standardized by the ITU G.651 recommendation; bandwidth-distance product at 850/1300 nm: 500/500 MHz⋅km
– OM3, laser-optimized 50/125 µm fiber (optimized for VCSELs at 850 nm and for 10 Gbit/s systems); bandwidth-distance product at 850/1300 nm: 2000/500 MHz⋅km; maximal link span: 300 m at 10 Gbit/s; can also be used at 40 Gbit/s
– OM4, laser-optimized 50/125 µm fiber, designed for 10, 40 and 100 Gbit/s; bandwidth-distance product at 850/1300 nm: 3600/500 MHz⋅km; maximal link span: 400 m at 10 Gbit/s, 150 m at 100 Gbit/s [492AAAD].
All standard multi-mode fibers have attenuations of around 3 dB/km at 850 nm and around 1 dB/km at 1300 nm, but these properties are not very important, since MM fibers are already limited to shorter distances by intermodal dispersion.
The third criterion for classifying optical fibers is the profile of the refractive index in the core and cladding. The refractive index of a light-transmitting material is the ratio between the speed of light in free space and the speed of light in that material at a given wavelength, i.e. at a given frequency. To avoid confusion, it should be mentioned that the wavelength always refers to the wavelength of light in free space and not in the fiber (where the actual wavelength is shorter). The only quantity that is the same in both free space and fiber is the frequency. For example, light with a free-space wavelength of λ = 800 nm has the frequency f = c0/λ = (3⋅10⁸ m/s) / (0.8⋅10⁻⁶ m) = 3.75⋅10¹⁴ Hz.
When this light travels through a glass with refractive index n = 1.45, the actual wavelength of light in the material is λm = (c0/n)/f = λ/n ≈ 552 nm, while the propagation velocity is around 2⋅10⁸ m/s. Silica glass has a refractive index of about 1.5. It can be finely tuned to the desired value by doping with various materials. For example, the refractive index increases when the glass is doped with germanium dioxide (GeO2) or aluminium oxide (Al2O3), and it decreases when e.g. boron trioxide (B2O3) is used for doping. In this way, the refractive index in the core and cladding of an optical fiber can be made constant or variable with the distance from the central axis of the fiber during production.
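The numerical example with λ = 800 nm and n = 1.45 can be reproduced with a few lines of code. This is a small sketch using the rounded value c0 = 3⋅10⁸ m/s from the text; the function name is illustrative.

```python
C0_M_PER_S = 3.0e8  # speed of light in free space, rounded as in the text


def light_in_medium(wavelength_vacuum_nm: float, n: float) -> tuple[float, float, float]:
    """Return (frequency in Hz, wavelength in the medium in nm, velocity in m/s)."""
    frequency_hz = C0_M_PER_S / (wavelength_vacuum_nm * 1e-9)  # same in vacuum and medium
    wavelength_medium_nm = wavelength_vacuum_nm / n            # shortened by the factor n
    velocity_m_per_s = C0_M_PER_S / n                          # slowed down by the factor n
    return frequency_hz, wavelength_medium_nm, velocity_m_per_s


if __name__ == "__main__":
    f, lam_m, v = light_in_medium(800.0, 1.45)
    print(f"f = {f:.3e} Hz, wavelength in glass = {lam_m:.0f} nm, v = {v:.2e} m/s")
```

The frequency is computed from the free-space wavelength only, which reflects the statement that it stays the same in free space and in the fiber.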
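[Figure 7.20: refractive index profiles (n1 in the core, n2 in the cladding) of three fiber types: MM fiber (step-index), MM fiber (graded-index) and SM fiber (step-index). For the MM fibers, the fastest and the slowest mode and the resulting dispersion are indicated.]
Fig. 7.20: Refractive index profiles.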
When the refractive index has a constant value n1 in the fiber’s core and another constant value n2 in the cladding, the fiber is referred to as a “step-index” (SI) fiber. If the refractive index has its maximal value n1 at the core center and gradually decreases with the distance from the center, the fiber is called a “graded-index” (GI) fiber. Different combinations of these two types of index profile in core and cladding are possible; some of them are shown in Figure 7.20. It is much easier to produce a step-index than a graded-index fiber.
A graded-index profile is applied to multi-mode fibers in order to reduce intermodal dispersion. Namely, different propagating modes have different velocities, but they also have different spatial field distributions. A fiber with a graded profile has a smaller refractive index at the core periphery than at the center. Therefore, the slower modes are sped up relative to the faster ones. Put more simply, the total reflection of light rays now occurs gradually within the core (not only at the core-cladding boundary), and the rays travelling along different paths converge towards each other. In this way, the intermodal dispersion of graded-index fibers is reduced about ten times in comparison to the earliest step-index multi-mode fibers. Also, the refractive index profile can be optimized for specific light sources, as is done with OM3 and OM4 fibers for VCSELs at 850 nm.
All standard MM fibers used nowadays are of the GI type. Single-mode fibers, on the other hand, have a stepped refractive index profile. Recently, SM fibers with a graded index have also been produced, which further improves their (chromatic) dispersion characteristics.
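The difference between the two profile types can be made concrete with a simple model. The sketch below uses the commonly used power-law (“alpha”) profile, which is an assumption here and is not given in the text; ∆ is taken as (n1² − n2²)/(2n1²), and the values n1 = 1.48, ∆ = 1 % and a core radius of 25 µm are purely illustrative.

```python
import math


def refractive_index(r_um: float, core_radius_um: float,
                     n1: float, delta: float, alpha: float) -> float:
    """Refractive index at distance r from the fiber axis (power-law model).

    alpha = 2 gives a parabolic graded-index profile; alpha = math.inf
    gives a step-index profile with constant n1 inside the core.
    """
    if r_um >= core_radius_um:
        # Cladding: constant index n2 = n1 * sqrt(1 - 2*delta)
        return n1 * math.sqrt(1.0 - 2.0 * delta)
    # Core: index falls from n1 on the axis towards n2 at the boundary
    return n1 * math.sqrt(1.0 - 2.0 * delta * (r_um / core_radius_um) ** alpha)


if __name__ == "__main__":
    # Illustrative 50 µm core (radius 25 µm), n1 = 1.48, delta = 1 %.
    for r in (0.0, 12.5, 24.0, 30.0):
        gi = refractive_index(r, 25.0, 1.48, 0.01, 2.0)
        si = refractive_index(r, 25.0, 1.48, 0.01, math.inf)
        print(f"r = {r:4.1f} µm: graded-index n = {gi:.4f}, step-index n = {si:.4f}")
```

In this model the graded-index value decreases smoothly from n1 on the axis to n2 at the core boundary, while the step-index value stays at n1 across the whole core and drops abruptly to n2 in the cladding.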
7.3.5 Construction Structures of Optical Cables and Their Applications
The basic construction element of an optical cable is an optical fiber with its primary protection made of polyethylene or another polymer material. The protective polymer layers, also known as a coating buffer, are applied around the fiber’s cladding during the fiber drawing process. Although a pure silica fiber with a 125 µm cladding diameter has a high mechanical strength against pulling, and to a certain extent against bending, a fiber without primary protection still could not endure all the mechanical stresses during the cabling and installation processes. The coating also protects the fiber from moisture. The diameter of the primary protection layer is most often 250 µm (coated fibers) or 900 µm (tight buffered fibers), which gives the whole structure an appearance very similar to a fishing line. For example, before the cabling process (i.e. before being placed into the internal construction elements of cables), coated fibers a few kilometers long are wound onto 25–35 cm wide coils. Unlike the older optical fibers with only one coating layer, the most recent fibers have two layers of primary protection: the first one (next to the cladding) acts as a “shock absorber” that prevents impairments due to micro-bending, and the second one acts as a shield that protects the fiber from external lateral forces. During fiber handling, e.g. when splicing two fibers or making connector terminations, the primary protection has to be removed mechanically at the fibers’ ends, and a specialized stripping tool is used for this purpose.
[Figure 7.21: cross-sections of an SM fiber 9/125 µm, an MM fiber 62.5/125 µm and an MM fiber 50/125 µm, showing the core (9, 62.5 or 50 µm), the 125 µm cladding, the 250 µm coating and the 900 µm tight buffer.]
Fig. 7.21: Optical fibers with primary protection.
Depending on the application and capacity demands, many different constructions of optical cables are available on the market. For indoor applications, e.g. in telecom buildings, where various short-range connections between terminal equipment ports and optical distribution frames or patch panels have to be made, cables with only one optical fiber are used. A typical one-fiber cable, also known as a patch cord, consists of a primary-protected optical fiber tightly enclosed by three additional protection layers: a thermoplastic over-coating (i.e. a 900 µm tight buffer made of hard plastic), a layer of strengthening aramid fibers (e.g. Kevlar), and the outer plastic jacket. Both ends of one-fiber cables are terminated with connectors.
[Figure 7.22: cross-sections of a one-fiber cable, a 12-fiber cable and a 72-fiber cable. Labeled parts: 0.9 mm buffered fiber, aramid yarns, plastic jacket, sub-unit jacket, central strength element, rip cord, PVC outer jacket.]
Fig. 7.22: Examples of indoor (tight buffered) optical cables.
Besides one-fiber cables, cables with two or more fibers are also used in indoor environments, especially when saving space is important. A typical “tight buffered cable” consists of several 900 µm buffered fibers within aramid yarn, enclosed by an outer jacket (sheath) made of PVC. Fibers may also be organized in sub-units stranded around the central strength element of the cable.
For longer distances, i.e. for practically all outdoor installations and outside-plant cable systems, more robust construction structures are necessary. The cables for these purposes may contain from several to even a thousand fibers, which are organized within loose structures of secondary protection. Each of these structures contains a certain number of 250 µm coated fibers (depending on the capacity of the cable), where each fiber can move freely and take various positions inside the structure. Because of that, all the fibers in a cable remain relaxed, e.g. after the cable is unrolled from the factory drum and buried underground, or when the cable elongates (or shrinks) due to temperature changes. There are two construction types of loose structures for secondary protection. The most widely used type is the loose-tube construction, i.e. a small tube made of semi-rigid plastic. The other type is the gutter structure, which is cut into the central element of the cable. Both tubes and gutters have a helical shape around the central element (Fig. 7.23) and may be filled with a gel compound to prevent water penetration. Because of stranding, the fibers within a cable may be up to 3 % longer than the cable itself; this fiber length is the so-called optical length of the cable.
The central element of an optical cable is mostly made of a lightweight solid dielectric, which serves as the strengthening element as well as the core foundation around which the tubes of secondary protection (with the fibers inside) are wrapped, or into which the gutters are cut in the case of a gutter structure. Finally, the whole structure is enclosed by an outer jacket (sheath), mostly made of polyethylene. There are also many different constructions with more jackets, as well as armored structures, e.g. for submarine cables, etc.
[Figure 7.23: cross-sections of a 48-fiber cable (loose-tube structure) and a 144-fiber cable (gutter structure). Labeled parts: rigid central strength member, hollow tube, gel filling, 0.25 mm coated fiber, 6-fiber ribbon, central element, gutter, strength member, rip cord, PE sheath.]
Fig. 7.23: Examples of outdoor (loose structure) optical cables.
One interesting example of a dual-purpose cable is the “optical ground wire” (OPGW) cable, which is used for the transmission of electric power and as a communication line at the same time. In OPGW cables, the central ground wire of the power line is hollow, containing either a complete plastic optical cable with a smaller diameter, or small metal tubes with primary-protected optical fibers inside.
The applications of optical fibers and cables are numerous. They are used for both short and long distance connections, from less than one meter to connect a desktop computer with the nearest router, to undersea cables several thousand kilometers long that carry a large number of different high-capacity channels. Internet traffic between the routers of internet providers of all levels, telephony traffic both nationwide between the switches of fixed and mobile telephony and international (i.e. inter-connecting traffic between operators), data traffic over long LAN segments (e.g. Carrier Ethernet), IPTV and aggregated internet traffic from multiservice access nodes (MSANs) and/or DSL access multiplexers (DSLAMs) to the content providing points, etc., cannot be imagined without the use of optical cables. There are also metropolitan and national CATV systems with backbone networks based on optical transmission, as well as different realizations of “FTTx” (Fiber-To-The-Home, -Premise, -Building, -Cabinet…), which are increasingly being installed in many countries.
Single-mode fibers, i.e. cables containing SM fibers, are typically used for distances longer than a few kilometers. Optical communication systems used by telecom companies, internet and other service providers mostly have transmission
rates of 10 or 40 Gbit/s per data channel over distances of 10 km and more, i.e. up to ~ 140 km without optical amplifiers and/or regenerators along the line. Systems with 100 Gbit/s are also in use, and even higher bit rates per channel are expected in the near future. Today’s DWDM systems may have up to 160 simultaneous optical data channels (i.e. wavelengths) per fiber, which gives a total throughput of several Tbit/s over a single SM fiber. Finally, considering that an optical cable may contain several hundred fibers, it might seem that the transmission capabilities of optical cables are practically unlimited. New services and multimedia applications continuously demand higher bandwidths and bit rates, but the “transmission bottlenecks” are not on the side of the optical cables, nor are they expected to be there in the near future. At the moment, the limiting factors are at network nodes and other points where large quantities of data have to be processed in a short time.
Multi-mode fibers are mostly used for distances of up to a few hundred meters as backbone segments of local area networks (LANs) or within storage area networks (SANs), since their installation is cheaper than the installation of SM fibers. The data rates are typically up to 10 Gbit/s, but rates of 40 and 100 Gbit/s are also possible using the newer types of multi-mode fibers (OM3 and OM4).
Besides communications, where optical cables play a dominant role in almost all segments, they are used in other areas as well. For example, in industrial applications and control systems with high electromagnetic noise, optical fibers are very suitable for the transmission of control signals, signals from sensors or video signals from thermal cameras etc., since they are not affected by external electromagnetic disturbances.
Also, in highly flammable environments, where very rigorous safety requirements have to be fulfilled and where copper cables and connectors are a big safety risk (due to the possibility of sparks, overheating, etc.), optical fibers are practically indispensable.
In modern medicine, the use of optical fibers is very important. Thanks to their very thin and robust structure, optical fibers are used in many medical instruments for the visual inspection of internal body parts, or in telemedicine for the transmission of high-resolution diagnostic images. Medical lasers use special optical fibers for the transmission of high-energy light pulses.
Plastic and multi-mode optical fibers are used in Hi-Fi and home entertainment devices over short distances. Fiber-optic based telemetry and intelligent transportation systems, sensory devices for measurements of temperature and pressure, and wiring in airplanes, automobiles and military vehicles are among the many other areas where optical fibers are used.
7.3.6 Historical Milestones and Development Trends in Fiber Optics
The amount of traffic carried by optical networks has been growing constantly; on average it is multiplied by 100 every ten years. This expansion is the result of continuous development and improvements in various domains. The increase of transmission capacity has been sustained by introducing multiplexing techniques that use the wavelength, amplitude, phase and polarization of light as new degrees of freedom for information encoding. Until the end of the 1990s, the only practical coding technique was “On-Off Keying”, i.e. binary amplitude coding, where the laser is turned on or off to define the bits “1” and “0”. In this way, the spectral efficiency (see Chapter 5) of optical fibers was limited to 1 bit/s/Hz, and it was much smaller in most practical implementations.
Several key discoveries in the domain of fiber optics have led to today’s optical networks. Among these breakthroughs, the development of the low-loss single-mode fiber (led by Charles K. Kao before 1970 [KaHo66]) should certainly be mentioned first. As a consequence of the specific attenuation characteristic of fibers (optical windows), development first concentrated on optical systems working around 850 or 1310 nm. After that, the development of InGaAs photo diodes [PeHo78][Kat79][Peetal83] moved the emphasis (at the beginning of the 1990s) to the third optical window, i.e. to wavelengths around 1550 nm (where the attenuation is the smallest), thus extending the maximal reach of optical systems. At about the same time, the invention of the erbium-doped fiber amplifier (EDFA) [Meetal87][Deetal87][Ale97], with its constant amplification characteristic over a wider range of wavelengths, i.e. in the “extended third optical window”, further increased the maximal spans by removing the need for expensive signal regenerators along the fiber.
The real revolution for EDFAs started with wavelength division multiplexing (WDM) [ToLi78][Isetal84], which led to a 1000-fold increase of the fiber capacity in the several years that followed. Namely, the ability of the EDFA to amplify multiple wavelengths simultaneously contributed hugely to the development of Dense-WDM (DWDM) systems [Chetal90][G694.1] with many 10 Gbit/s signals over different wavelengths. Numerous techniques for mitigating chromatic dispersion, PMD and other impairments, as well as the development of more advanced laser sources and other components, played an important role in this “explosion of bandwidth”. In the first decade of the 21st century, the increase of the fibers’ bandwidth came mostly as the result of new and more complex optical modulation formats together with coherent detection, enabling the use of the phase of light for encoding. Finally, a few years ago, polarization-division multiplexing (PDM) doubled the fibers’ capacity, so that DWDM systems with 100 Gbit/s signals per wavelength and a carrier spacing of 50 GHz are practically becoming standard. One of the most used modulation formats in such systems is Dual-Polarization Quadrature-Phase-Shift-Keying (DP-QPSK) [Ful08][Beetal09][LT10][Swe12][Swetal12][Cretal12].
Due to continuous technological improvements, which include some of those mentioned above, but also the newest techniques of electronic digital signal processing (DSP) and advanced coding schemes (e.g. turbo codes [Djetal09][Fuetal13][WuXi05][Koetal14]), the spectral efficiency of WDM systems is approaching its maximum. Spectral efficiency, as the amount of information that can be transported per unit bandwidth for each WDM channel, has a maximal theoretical value determined by Claude Shannon in 1948 as log2(1 + SNR) [Sha48] (see Chapter 5). The key factor that limits the useful signal power that can be sent through the fiber over a given transmission distance (and in this way limits the signal-to-noise ratio) is the so-called Kerr effect [Wei08]. This non-linear effect is manifested as a change of the refractive index of the fiber when the power of an optical signal at a certain wavelength is strong enough, which can consequently distort other wavelength channels in a (D)WDM system (see Chapter 6). Taking this into account, it is estimated, and taken as the consensus, that the maximal practical spectral efficiency of WDM systems is ~ 10 bit/s/Hz. Experimental results with high spectral efficiencies approaching this limit were achieved in 2010 (e.g. 8 bit/s/Hz over a distance of 320 km with a total throughput of 64 Tbit/s [Zhetal10]).
The total bandwidth of single-mode fibers that can be exploited by current fiber optic systems is around 11 THz. The main limiting factors are the characteristics of the optical amplifiers at the receiving side. A rough estimation of the maximal possible throughput of a single fiber would give a value of (11 THz)×(10 bit/s/Hz) = 110 Tbit/s. This value shows that there is still room for improvement in the forthcoming years.
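The Shannon expression and the rough throughput estimate above can be reproduced as follows. This is a minimal sketch for a single linear channel with additive noise; it ignores fiber nonlinearity and polarization multiplexing, and the function names are illustrative.

```python
import math


def shannon_spectral_efficiency(snr_db: float) -> float:
    """Upper bound on spectral efficiency in bit/s/Hz: log2(1 + SNR)."""
    snr_linear = 10.0 ** (snr_db / 10.0)
    return math.log2(1.0 + snr_linear)


def required_snr_db(spectral_efficiency: float) -> float:
    """SNR in dB needed to reach a given spectral efficiency (inverse of the bound)."""
    return 10.0 * math.log10(2.0 ** spectral_efficiency - 1.0)


def fiber_throughput_tbps(bandwidth_thz: float, spectral_efficiency: float) -> float:
    """Rough throughput estimate: THz times bit/s/Hz gives Tbit/s."""
    return bandwidth_thz * spectral_efficiency


if __name__ == "__main__":
    print(f"SNR needed for 10 bit/s/Hz: {required_snr_db(10.0):.1f} dB")
    print(f"Estimate from the text: {fiber_throughput_tbps(11.0, 10.0):.0f} Tbit/s")
```

In this simple model, the practical limit of ~ 10 bit/s/Hz corresponds to an SNR of about 30 dB, which indicates how much useful signal power relative to noise the Kerr effect would have to allow.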
At one moment, it seemed that the information capacities that optical cables and DWDM systems would provide in the near future would be practically endless, especially bearing in mind the new, even denser cable constructions with even larger numbers of fibers, as well as the fact that, after the huge expansion and deployment of optical cables in the early 2000s, there is still a significant amount of unused (i.e. dark) fiber worldwide. Although such large capacities seem to be “more than enough” for current demands, and the “legacy from the past” can support future growth, according to some predictions this will still not be enough to sustain the multiplication of information capacity by 100 every 10 years that is estimated for future capacity requirements. Once bandwidth demands exhaust the capacities offered by conventional single-mode fiber-based systems, and without new breakthroughs and innovations, the only way to increase capacity would be the installation of additional parallel systems, which is a very poor solution in terms of costs and power consumption.
There are several possible directions for the future development of optical communications. One of them includes the use of spatial multiplexing over new
fiber types different from standard single-mode fibers; another direction assumes the development of broader-bandwidth amplifiers. Further directions are the reduction of fiber attenuation, the mitigation of nonlinear effects, etc. Each direction opens different approaches.
Recently, Space Division Multiplexing (SDM) (see Chapter 6) has attracted great attention from many researchers, since it promises to open two entirely new degrees of freedom for orthogonally multiplexing data, which could extend the fibers’ capacity by more than 100 times. The first new degree of freedom introduced by SDM can be exploited using the so-called multi-core fibers (MCFs) [Saetal12][Saetal12-2]. Unlike standard single- or multi-mode fibers, MCFs incorporate more than one core within one common cladding, where each core is able to transmit its own optical signal (i.e. WDM signal) independently. The earliest MCFs typically had 3 or 7 cores, but there are also newer designs with 12, 19 or even 36 cores. The other approach in SDM, which exploits the second degree of freedom, is mode-division multiplexing (MDM) [Gretal12][Ryetal12], where information is transmitted using different propagating modes of the fiber as separate channels. For this purpose, “few-mode fibers” are designed, here also named multi-mode fibers (MMFs), which have a slightly larger core diameter than standard single-mode fibers. Therefore, MMFs transmit the first few propagation modes.
So far, some very impressive experimental results in the field of spatial multiplexing have been achieved, e.g. a spectral efficiency of 91 bit/s/Hz using a 12-core MCF over a 52 km long fiber [Taetal12], or 12 bit/s/Hz using a 6-mode MMF over a 112 km long fiber [Sletal12]. Also, the design of multi-core fibers containing multi-mode cores is very popular among developers. The latest record (from April 2015) was set by a team of Japanese researchers using a 5 km long 36-core MCF with 3-mode cores (i.e.
with 36×3 = 108 spatial channels) [NICT15][Saetal15]. This opened the possibility of transmission at 10 petabits per second over a single fiber.
On the other hand, both MCFs and MMFs have specific issues that have to be overcome. For example, multi-core fibers require more complex and more expensive manufacturing, since their structure is not circularly symmetric. Besides, an efficient way of coupling MCFs to standard single-mode fibers has to be found, in which each core (and even each spatial mode in a core) is coupled to a particular single-mode fiber. Other issues of MCFs include the uniformity and shape of the individual cores, mode control, and the reduction of signal interference. Unlike MCFs, MMFs are much easier to produce, since existing technologies can be used for this purpose without additional investment in development. Nevertheless, the main problem of mode-division multiplexing and MMFs is mode coupling due to random perturbations in fibers and in mode multiplexers and de-multiplexers. Several techniques can be used for mitigating the
mode coupling. One of them, borrowed from modern wireless systems, is the use of complex multiple-input multiple-output (MIMO) [Ryetal12-2] processing algorithms for information recovery (see Chapter 8). The other method, based on adaptive optics feedback, also requires computationally intensive digital signal processing (DSP) algorithms.
[Figure 7.24: cross-sections of different fiber types: single-mode fiber; multi-core fibers (3-core, 7-core, 19-core, and a 36-core fiber with 3 modes in each core); multi-mode fibers (few-mode, multimode); hollow-core fiber. Indicated dimension: ~ 0.3 mm.]
Fig. 7.24: Different fiber types for future optical systems.
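The idea behind MIMO processing for mode-coupled signals can be illustrated with the simplest possible case. The sketch below assumes two modes, a known and noise-free 2×2 coupling matrix H, and recovery by direct inversion (zero forcing); real systems have to estimate H adaptively and cope with noise, so the function names and numbers are hypothetical illustrations only.

```python
def couple(h, x):
    """Apply a 2x2 mode-coupling matrix h to the transmitted symbols x."""
    return (h[0][0] * x[0] + h[0][1] * x[1],
            h[1][0] * x[0] + h[1][1] * x[1])


def zero_forcing_2x2(h, y):
    """Recover the transmitted symbols from y = h * x by inverting h."""
    (h11, h12), (h21, h22) = h
    det = h11 * h22 - h12 * h21
    if abs(det) < 1e-12:
        raise ValueError("coupling matrix is singular, modes cannot be separated")
    x1 = (h22 * y[0] - h12 * y[1]) / det
    x2 = (-h21 * y[0] + h11 * y[1]) / det
    return x1, x2


if __name__ == "__main__":
    # Two QPSK-like symbols sent on two modes (illustrative values).
    sent = (1 + 1j, -1 + 1j)
    # Hypothetical coupling: part of each mode leaks into the other one.
    h = ((0.9, 0.3j), (0.2, 0.8))
    received = couple(h, sent)
    recovered = zero_forcing_2x2(h, received)
    print("received: ", received)
    print("recovered:", recovered)
```

Each received signal is a mixture of both transmitted symbols; once the coupling matrix is known, the receiver can undo the mixing, which is what the much larger and adaptive MIMO algorithms do for many modes.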
An alternative approach that utilizes the propagating modes in MMFs and simplifies (or even removes) the MIMO processing is to use the orbital angular momentum (OAM) of light (as opposed to the LP modes) for creating orthogonal, spatially distinct data streams that are multiplexed in a single fiber (see Chapter 6). OAM, one of the most fundamental physical quantities in classical and quantum electrodynamics, is associated with the spatial distribution of the electric field. Photons that carry OAM have a helical phase of the electric field. So far, the inherent orthogonality of OAM modes has been exploited in some free-space experiments, but not in conventional optical fibers, where OAM modes are unstable because of mode coupling. Still, there have been some breakthroughs in the concept of OAM mode-division multiplexing (OAM-MDM) in fibers (see Chapter 6). In one of them [Boetal12], a specially designed few-mode fiber (“vortex fiber”) was used to achieve a total transmission capacity of 400 Gbit/s at a single wavelength over a length of 1.1 km. While standard single-mode fibers support the propagation of two distinct, degenerate polarization states of the fundamental mode LP01, the vortex fiber is designed to support two transverse modes TE01 and TM01 of opposite spins (circular polarization) as well as two OAM modes with opposite topological charges (OAM+ and OAM–).
By keeping the modal crosstalk during propagation and the multipath interference (MPI) at low levels, a vortex fiber enables 4 orthogonal channels at a single wavelength to carry independent information signals without the need for MIMO or DSP processing. Also, with the use of two OAM modes and 10 wavelengths, a total data transmission rate of 1.6 Tbit/s was achieved over the same 1.1 km long fiber [Boetal12-2].
The transmission capacity of optical networks can also be increased by producing cables that contain an even greater number of fibers in a very small space (high-density cables). The so-called “small-diameter high-density optical cable technology” has already produced a 1000-fiber optical cable with an outer diameter of 23 mm and a weight of 0.45 kg/m. Even denser cable designs are in development. Nevertheless, this approach to capacity increase has its physical limits, even though further development in this direction is to be expected.
Research work in fiber optics today is spread over many other directions as well. Besides the significant efforts to increase the fibers’ capacity, further lowering of losses and signal attenuation in fibers is also one of the priorities. The rapid penetration of fiber-to-the-home (FTTH) services has involved many unskilled workers in the deployment of optical fibers, so fiber bending loss has become a frequent occurrence in the field. Consequently, the development of bending-loss insensitive fibers was boosted. As a result, the ITU G.657 Recommendation is in progress as the standard for low bending-loss optical fibers. Two categories of G.657 fibers are provided: Category A, which is fully compatible with standard G.652 fibers, and Category B, which is partially compatible [G657]. The methods for decreasing bending losses are based on two approaches.
The first one is to control the distribution of the refractive index (optimal-refractive-index-distribution fiber) and the second is to form air holes (microstructured, i.e. hollow-core fiber). The second approach gives better results in suppressing the bending loss, since the differences in the refractive index of hollow-core fibers are greater than those of optimal-refractive-index-distribution fibers, which gives more freedom in the design. Future optical fibers will certainly have improved properties of optical materials, and research in this field is also in progress. Currently, the most promising are silica glasses with a high content of fluoride, since they have attenuation losses even lower than today's highly efficient fibers. For example, experimental fibers with 50–60 % of zirconium fluoride (ZrF4) now have losses between 0.005 and 0.008 dB/km. Compared to the standard fibers (with a minimal attenuation of ~ 0.17 dB/km), fibers with enhanced optical properties will enable significantly longer optical spans than today. Other issues in optical systems are also being addressed. It is worth mentioning the mitigation of optical nonlinearity either by reducing the nonlinearity of the
fiber itself through improved design and/or new enhanced materials for fiber drawing, or by introducing active electronic or optical means to compensate for it. For example, the impairments caused by the Kerr effect and other nonlinearities in a fiber can be mitigated using new coding techniques initially developed for wireless systems. Among them is forward error correction (FEC) with the use of low-density parity-check (LDPC) codes (see Chapter 4). There are also turbo codes and turbo equalizers, which use the log-likelihood ratio (LLR) calculation at the receiver’s side. The above-mentioned directions of development represent only a small part of the research in fiber optics. Many other optical components (not mentioned in this chapter) are continuously being improved and new ones are being invented. A general trend is to achieve the so-called “all-optical networks”, where all paths between all communicating parts in a network will be entirely optical, i.e. without conversion into the electrical domain.
8 Wireless Channel
8.1 Channel Characteristics
Wireless communication has made a tremendous impact on modern life: it enables mobility, not only for voice traffic but also for high-volume data traffic. It also enables interconnectivity between many different, often tiny devices in the surroundings, eliminating the need for thousands of cables. The wireless communication channel, however, is very different from the wired channel. It experiences far more distortions and disruptions from the environment than the corresponding wired channel. Obstacles to communication include heat, pollution and physical obstacles such as buildings, trees and mountains. Due to natural hindrances and human-made obstacles, a wireless device, e.g. a mobile station (MS), usually has one or more non-line-of-sight paths to a base station (BS). Because of the lack of a direct line-of-sight radio propagation path, the radio waves between the wireless device and the base station experience scattering, reflections, refractions and diffractions (Figure 8.1). Besides, the signal arriving at the wireless device is additionally influenced by background noise and disturbances from other signals. As a consequence, the signal waves arrive from many different directions, resulting in a phenomenon known as multipath propagation. Each copy of the signal arrives with its own level of attenuation, delay and phase shift. The receiver sees multiple copies of the signal superimposed on each other. All this creates problems, especially for high-order modulations such as 64-QAM, because the phase of the received signal shifts in the constellation diagram (see Chapter 5). In order to reconstruct the original signal, the different copies are combined at the receiver [Sch05][Pro01][Lee08].
Fig. 8.1: Wireless propagation obstacles and their consequences: a) Shadowing b) Reflection from big surfaces c) Scattering from small surfaces d) Diffraction from sharp edges
Two major phenomena arise in wireless communication that are absent in wireline communication: fading and interference. Fading is the fluctuation of the amplitude or phase of a signal [Dav96]. The transmission channel changes with the location of the MS as well as in time. Therefore, the time delays and phases of the partial signals arriving at the MS are different. As a consequence, the signal arriving at the MS experiences short breaks or fades in power, because an arriving signal component is cancelled by a component of opposite phase (see Figure 8.3 b). This phenomenon is called fast fading. Additionally, the distance of the MS from the BS changes, as does the signal environment, e.g. because of physical obstacles. This leads to slow changes of the average power of the arriving signal, known as slow fading. Both fast and slow fading are shown in Figure 8.2.
Fig. 8.2: Fast and slow fading (signal power versus time t).
Interference is the superposition of multiple radio waves to produce a resultant wave, which is stronger, weaker or of the same amplitude as the original wave. The waves of a signal can add up either constructively or destructively, depending on the relative phase difference between them. Reinforcement occurs when multiple copies of the signal arrive at the receiver in phase and therefore add up constructively, reinforcing each other and making a stronger resultant signal. Cancellation, on the other hand, occurs when the copies of the signal arrive out of phase and add up destructively, thereby lowering the signal level relative to the noise and making the resulting signal harder to detect at the receiver.
Fig. 8.3: Phase difference between the direct and the echo (reflected) signal: a) Constructive interference (phase difference 0°) b) Destructive interference (phase difference 180°)
These phenomena are a result of multipath propagation, i.e., a signal taking multiple paths from the source to the receiver.
8.2 Multipath Propagation
In certain specialized scenarios, such as satellite to ground station communication, there is a line-of-sight (LoS) channel between the sender and the receiver. In other scenarios, such as mobile telephony, a direct LoS channel might not be available because too many obstacles lie between the sender and the receiver. However, there might still be one or many non-line-of-sight (NLoS) paths between them. Typical obstacles include buildings, trees, mountains, billboards, etc. Thus, the radio frequency (RF) signal may travel via multiple different paths to the receiver, a phenomenon called multipath propagation. Figure 8.4 shows the existence of multiple paths between the transmitter and the receiver (antenna). There is one direct LoS path, but the receiver also receives the signal via multiple indirect paths. The signal can be reflected, refracted, diffracted or scattered by the obstacles in the signal path, resulting in multiple copies of the signal arriving at the receiver with varying time delays and different levels of attenuation and phase shifts. The strength of the resultant signal at the receiver depends on the strength of the individual component signals received from the multiple paths.
Fig. 8.4: Multipath propagation.
Multipath propagation results in a signal deformation. In the case of a signal consisting of rectangular symbols, which are often used in communications, the signal deformation appears already at the signal generation: the spectrum of a rectangular signal, which is infinite (see Fig. 8.5), is deformed by the attenuation of high frequencies, due to the finite bandwidth of the forming low-pass filter at the transmitter. Depending on the low-pass transfer function of the transmission channel, the filtered signal experiences a stronger or weaker deformation, i.e. the attenuation of different frequency components.
Fig. 8.5: Spectrum of the rectangular signal with duration T (S(f) versus f, with the maximum value T at f = 0 and frequency ticks at multiples of 1/T from –3/T to 3/T).
Intersymbol Interference (ISI) occurs when one symbol interferes with subsequent symbols, thus creating an effect similar to noise (see Chapter 5). This happens due to the different time delays with which the different copies of the signal are received. The delayed signal copies cause the symbols of one copy to overlap with the symbols of the others, making information recovery very difficult. Figure 8.6 shows the case of three copies of a signal, which are received via paths A, B and C and are time delayed. The symbols S1, S2 and S3 overlap at the receiver, resulting in ISI.
The problem of ISI can be solved using error correcting codes (see Chapter 4), an adaptive equalizer (see Chapter 5) at the receiver, or by separating symbols in time with guard intervals. The guard interval is a period of time between symbols that accommodates the late arrival of symbols caused by multipath propagation. Typically, the ending part of the symbol is cyclically repeated at the beginning of the symbol, thus prolonging the symbol duration by a guard interval Tg. This results in a reduction of the spectral efficiency (see Chapter 5). Depending on the application, Tg varies between 1/32 and 1/4 of the symbol duration Ts.
Fig. 8.6: Intersymbol interference (the symbol sequences S1 S2 S3 S4 received via paths A, B and C are delayed relative to each other and overlap at the receiver, causing ISI).
Fig. 8.7: Guard intervals (received and filtered symbols –1, 0 and 1 on the time axis t, with the guard interval Tg preceding the symbol duration TS).
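The cyclic guard interval described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the symbol length of 64 samples and the ratio Tg/Ts = 1/4 are hypothetical choices and do not come from the text.

```python
import numpy as np

def add_guard_interval(symbol, ratio):
    """Prepend the last ratio*len(symbol) samples of the symbol as a cyclic guard interval."""
    n_guard = int(len(symbol) * ratio)
    # Slice from the end; written this way so that n_guard == 0 yields an empty guard
    return np.concatenate([symbol[len(symbol) - n_guard:], symbol])

def useful_time_fraction(ratio):
    """Fraction of the prolonged symbol (Ts + Tg) that carries the original symbol, Ts / (Ts + Tg)."""
    return 1.0 / (1.0 + ratio)

symbol = np.arange(64, dtype=float)          # hypothetical symbol of 64 samples
extended = add_guard_interval(symbol, 1 / 4) # guard interval Tg = Ts / 4

print(len(symbol), len(extended))
print(useful_time_fraction(1 / 4), useful_time_fraction(1 / 32))
```

The second function indicates why the guard interval reduces the spectral efficiency: the longer Tg is relative to Ts, the smaller the share of the transmission time that carries new information.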
8.3 Propagation Model
8.3.1 One-Way Propagation
A received sinusoidal signal r(t) is uniquely identified by its amplitude A, its frequency f0 = ω0/2π and its phase shift ϕ0 [Lee08]:
$$ r(t) = A e^{j(\omega_0 t + \varphi_0)} \tag{8.1} $$
When the receiver moves relative to the transmitter and experiences no reflections of the transmitted signal, the phase shift of the received signal varies with the distance travelled. This is shown in Figure 8.8. If the distance between the transmitter and the receiver changes, the transmitted signal experiences a phase change at the receiver that is proportional to the path change. For simplicity, it is assumed here that this change in path is small enough that the average path loss is not affected and the propagation loss is therefore approximately constant. For the arrangement shown in Figure 8.8, the expression for r(t) becomes:
$$ r(t) = A e^{j(2\pi f_0 t - \beta x \cos\theta)} \tag{8.2} $$
where β is the wavenumber such that β = 2π/λ. If the receiver moves with the speed v in the direction x and covers a distance x = v⋅t in time t, then:
$$ r(t) = A e^{j 2\pi t \left( f_0 - \frac{v}{\lambda}\cos\theta \right)} \tag{8.3} $$
Fig. 8.8: One-way signal propagation (receiver moving in the direction x; angle of arrival θ).
The term (v/λ)⋅cosθ represents the Doppler shift, which will be discussed later. Since the exponential factor in (8.3) has unit magnitude, the envelope of the received signal is a constant value given by:
$$ |r(t)| = A \tag{8.4} $$
The reception level is therefore approximately unaffected by small path changes, and no fading effect can be observed. This was to be expected: since only one propagation path exists, there can be neither constructive nor destructive interference.
8.3.2 Two-Way Propagation
If a reflection path exists between the sender and the receiver, then the above equation changes, as in this case the received signal is made up of two components (see Figure 8.9).
Fig. 8.9: Two-way signal propagation (angles of arrival θ1 and θ2; receiver velocity v).
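The propagation loss is the same on both paths, so the received signal r(t) can be written as [Dav96] (phase shifts at the reflection point are ignored):
$$ r(t) = \frac{A}{2} e^{j 2\pi f_0 t} \left[ e^{-j 2\pi \frac{vt}{\lambda} \cos\theta_1} + e^{-j 2\pi \frac{vt}{\lambda} \cos\theta_2} \right] \tag{8.5} $$
or:
$$ r(t) = \frac{A}{2} e^{j 2\pi f_0 t} \, e^{-j 2\pi \frac{vt}{\lambda} \cdot \frac{\cos\theta_1 + \cos\theta_2}{2}} \left[ e^{+j 2\pi \frac{vt}{\lambda} \cdot \frac{\cos\theta_1 - \cos\theta_2}{2}} + e^{-j 2\pi \frac{vt}{\lambda} \cdot \frac{\cos\theta_1 - \cos\theta_2}{2}} \right] \tag{8.6} $$
The envelope of the received signal is now given by:
$$ |r(t)| = A \left| \cos\left( 2\pi \frac{vt(\cos\theta_1 - \cos\theta_2)}{2\lambda} \right) \right| \tag{8.7} $$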
This shows that the envelope of the two-way propagation signal is not a constant anymore; it is rather dependent on the path and the angle of arrival of the signal components.
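The two-way result can be checked numerically. The following Python sketch evaluates the sum of the two components in (8.5) and compares its magnitude with the closed form (8.7). The carrier frequency, the speed and the two angles of arrival are arbitrary values assumed only for illustration.

```python
import numpy as np

# Assumed example values (not from the text)
A = 1.0                       # amplitude
f0 = 900e6                    # carrier frequency in Hz
lam = 3e8 / f0                # wavelength in m (speed of light rounded to 3e8 m/s)
v = 30.0                      # receiver speed in m/s
theta1 = np.deg2rad(0.0)      # angle of arrival of the first component
theta2 = np.deg2rad(120.0)    # angle of arrival of the second component

t = np.linspace(0.0, 0.05, 2001)

def two_way_signal(t):
    """Received signal according to (8.5): two components with equal propagation loss."""
    carrier = np.exp(1j * 2 * np.pi * f0 * t)
    p1 = np.exp(-1j * 2 * np.pi * v * t / lam * np.cos(theta1))
    p2 = np.exp(-1j * 2 * np.pi * v * t / lam * np.cos(theta2))
    return A / 2 * carrier * (p1 + p2)

def envelope_closed_form(t):
    """Envelope according to (8.7)."""
    arg = 2 * np.pi * v * t * (np.cos(theta1) - np.cos(theta2)) / (2 * lam)
    return A * np.abs(np.cos(arg))

envelope_direct = np.abs(two_way_signal(t))
envelope_formula = envelope_closed_form(t)
print(np.max(np.abs(envelope_direct - envelope_formula)))
print(envelope_direct.min(), envelope_direct.max())
```

With these values the envelope swings between almost zero and A, i.e. the receiver passes through deep fades although both components have constant amplitude.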
8.3.3 N-Way Propagation
Generally, a large number of waves (N) arrive at the receiver from different directions (θi) and with different amplitudes (Ai). The resultant received signal is derived from the vector superposition of all of these individual waves. The resultant vector is given by:
$$ r(t) = \sum_{i=1}^{N} A_i e^{j 2\pi f_0 t} e^{-j \beta v t \cos\theta_i} \tag{8.8} $$
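As an illustration of (8.8), the following Python sketch superimposes N waves with randomly drawn amplitudes and angles of arrival. The number of waves, the amplitude range, the carrier frequency and the speed are assumptions made for this example only.

```python
import numpy as np

def n_way_signal(t, amplitudes, thetas, f0, v, lam):
    """Vector superposition of N waves according to (8.8)."""
    beta = 2 * np.pi / lam                         # wavenumber
    carrier = np.exp(1j * 2 * np.pi * f0 * t)
    # One row per wave, one column per time instant
    phases = np.exp(-1j * beta * v * np.outer(np.cos(thetas), t))
    return carrier * (amplitudes[:, None] * phases).sum(axis=0)

rng = np.random.default_rng(0)
N = 20                                             # assumed number of waves
f0, v = 900e6, 30.0                                # assumed carrier frequency (Hz) and speed (m/s)
lam = 3e8 / f0
amplitudes = rng.uniform(0.5, 1.0, N)              # assumed amplitudes A_i
thetas = rng.uniform(0.0, 2 * np.pi, N)            # assumed angles of arrival theta_i
t = np.linspace(0.0, 0.1, 5001)

envelope = np.abs(n_way_signal(t, amplitudes, thetas, f0, v, lam))
print(envelope.min(), envelope.max(), amplitudes.sum())
```

At t = 0 all waves are in phase in this sketch, so the envelope equals the sum of the amplitudes. As the receiver moves, the phases drift apart and the envelope fluctuates below that bound.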
8.4 Fading Model
In the presence of multipath propagation, the waves of signals arrive via multiple paths at the receiver and are normally out of phase, resulting in a reduction of signal strength due to wave cancellation. Depending on the amount by which the signals are attenuated, delayed or phase shifted, fluctuations occur in the signal amplitude with time, leading to fading. Fading is the variation of the attenuation affecting the signal over a wireless channel. Due to the presence of multiple paths between the sender and the receiver in wireless communications, the amount of fading may vary with time and the location of the receiver, especially when the receiver is not stationary. Because it is caused by multipath propagation, such fading is generally called multipath fading. The faded signal’s amplitude varies with time and appears to be a random variable at the receiver. Fading is therefore often modelled as a random process. Two types of fading, Rayleigh fading and Rician fading, are discussed further [Skl97].
8.4.1 Rayleigh Fading
Consider the case in which the receiver receives a large number of reflected and scattered waves and there is no direct LoS signal reception, due to obstacles or the distance between the sender and the receiver. Due to the wave cancellation effects, the instantaneous received signal power appears to be a random variable at the moving receiver. The fading effect experienced by the receiver is called Rayleigh fading, and the Rayleigh distribution is used to model the multiple paths of the densely scattered signals reaching the receiver without a LoS component.
The probability density function (PDF) of a Rayleigh distribution is given by:
$$ f_X(x) = \frac{x}{\sigma^2} e^{-\frac{x^2}{2\sigma^2}}, \quad x > 0 \tag{8.9} $$
where x is the absolute value of the amplitude and σ is the standard deviation of the underlying Gaussian signal components (see Chapter 3). The mean (expected value) of a Rayleigh distribution is given by:
$$ E\{x\} = \sigma \sqrt{\frac{\pi}{2}} \tag{8.10} $$
whereas the variance of a Rayleigh distribution is given by:
$$ \mathrm{Var}\{x\} = \sigma^2 \left( 2 - \frac{\pi}{2} \right) \tag{8.11} $$
Rayleigh distribution for various values of σ is shown in Figure 8.10.
Fig. 8.10: Rayleigh distribution for different values of σ (fX(x) versus x; curves for σ = 0.5, 1, 2, 3 and 4).
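The expressions (8.9) to (8.11) can be checked with a short simulation. The sketch below assumes the usual construction of a Rayleigh-distributed envelope from two independent zero-mean Gaussian quadrature components, each with standard deviation σ. The value σ = 2, the sample size and the random seed are arbitrary choices for this example.

```python
import numpy as np

def rayleigh_pdf(x, sigma):
    """Rayleigh PDF according to (8.9); zero for x <= 0."""
    x = np.asarray(x, dtype=float)
    return np.where(x > 0, x / sigma**2 * np.exp(-x**2 / (2 * sigma**2)), 0.0)

sigma = 2.0                                   # assumed example value
rng = np.random.default_rng(1)
n_samples = 200_000

# Envelope of two independent Gaussian quadrature components (no LoS component)
in_phase = rng.normal(0.0, sigma, n_samples)
quadrature = rng.normal(0.0, sigma, n_samples)
envelope = np.hypot(in_phase, quadrature)

mean_theory = sigma * np.sqrt(np.pi / 2)      # (8.10)
var_theory = sigma**2 * (2 - np.pi / 2)       # (8.11)

# Numerical check that the PDF integrates to one (simple Riemann sum)
x = np.linspace(0.0, 40.0, 40001)
pdf_area = rayleigh_pdf(x, sigma).sum() * (x[1] - x[0])

print(envelope.mean(), mean_theory)
print(envelope.var(), var_theory)
print(pdf_area)
```

The sample mean and variance should agree with (8.10) and (8.11) up to the statistical error of the simulation.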
8.4.2 Rician Fading
Rician fading (also called Rice fading) is similar to Rayleigh fading, except that in Rician fading a dominant component of the signal is received by the receiver. This dominant component may be due to the presence of a direct LoS channel. Rician fading is modelled using a Rician distribution. The probability density function of a Rician distribution is given by:
$$ f_X(x) = \frac{x}{\sigma^2} e^{-\frac{x^2 + s^2}{2\sigma^2}} I_0\!\left( \frac{xs}{\sigma^2} \right) \tag{8.12} $$
where s represents the dominant component of the signal and I0(x) is the modified zero-order Bessel function of the first kind given by:
$$ I_0(x) = \frac{1}{\pi} \int_0^{\pi} e^{x\cos\theta} \, d\theta \tag{8.13} $$
If there is no sufficiently strong signal component s, i.e. if:
$$ \frac{s}{\sigma} \to 0 \tag{8.14} $$
then the PDF of the Rician distribution equals the PDF of the Rayleigh distribution. The Rician factor is defined as the ratio of the power in the dominant path to the power in the non-dominant (scattered) signal paths. The Rician factor K is given by:
$$ K = \frac{s^2}{2\sigma^2} \tag{8.15} $$
The Rician distribution for different K-factors is shown in Figure 8.11. A value of K = 0 indicates the absence of any dominant signal component, and the resultant curve is similar to that of the Rayleigh distribution. As the value of K increases, the Rician distribution gets closer to the Gaussian distribution.
Fig. 8.11: Rician distribution for different values of K (fX(x) versus x; curves for K = 0 dB, 3 dB, 7 dB and 13 dB).
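A minimal Python sketch of (8.12), (8.13) and (8.15) is given below. It uses NumPy's `np.i0` for the modified Bessel function and, as a cross-check, evaluates the integral (8.13) with the trapezoidal rule. The values s = 2 and σ = 1 and the helper names are assumptions chosen for illustration.

```python
import numpy as np

def rician_pdf(x, s, sigma):
    """Rician PDF according to (8.12) for x >= 0 (moderate arguments only, no overflow protection)."""
    x = np.asarray(x, dtype=float)
    return (x / sigma**2) * np.exp(-(x**2 + s**2) / (2 * sigma**2)) * np.i0(x * s / sigma**2)

def bessel_i0_integral(x, n=20001):
    """I0(x) evaluated from the integral (8.13) with the trapezoidal rule."""
    theta = np.linspace(0.0, np.pi, n)
    values = np.exp(x * np.cos(theta))
    step = theta[1] - theta[0]
    return (values.sum() - 0.5 * (values[0] + values[-1])) * step / np.pi

def rician_factor(s, sigma):
    """Rician factor K according to (8.15), as a linear ratio."""
    return s**2 / (2 * sigma**2)

s, sigma = 2.0, 1.0                                # assumed example values
x = np.linspace(0.0, 30.0, 30001)
pdf_area = rician_pdf(x, s, sigma).sum() * (x[1] - x[0])

K = rician_factor(s, sigma)
print(K, 10 * np.log10(K))                         # K as a ratio and in dB
print(bessel_i0_integral(1.5), float(np.i0(1.5)))
print(pdf_area)
```

Setting s = 0 in `rician_pdf` reproduces the Rayleigh PDF (8.9), in line with the limit (8.14).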
8.5 Doppler Shift
Apart from fading, there are many more effects which influence wireless transmission. One very common effect, observed in the case of a moving sender or receiver, is the Doppler shift (also called the Doppler effect). The Doppler shift is the change in the frequency of a wave when the receiver moves relative to the source. With a change in the distance between the source and the receiver, the phase of the received signal changes as well. As the distance between the source and the receiver decreases, e.g. when the source moves towards the receiver, each successive wave is emitted closer to the receiver than the previous one. This results in the waves bunching together. On the other hand, when the distance between the source and the receiver increases, e.g. when the source moves away from the receiver, each successive wave is emitted at a farther distance than the previous one. This results in the waves spreading out. Consider the case in which the mobile receiver moves with a velocity v from point A to point B. During the time ∆t, the receiver covers the distance d = v⋅∆t. This results in a change of the path length and of the reception phase at the receiver. As a consequence, the wave from the sender to the receiver experiences a shift in frequency. This shift in the signal’s frequency due to the motion is called the Doppler shift. The Doppler shift is given by:
$$ f_D = \frac{v}{\lambda} \cos\theta \tag{8.16} $$
where λ is the wavelength of the signal and θ is the receiving angle (Fig. 8.8).
The maximum Doppler shift fm is given by:
$$ f_m = \frac{v}{\lambda} \tag{8.17} $$
Two extreme values of the Doppler shift, i.e. fD = +fm and fD = –fm, arise when an MS moves directly towards a BS or directly away from it (Fig. 8.12).
Fig. 8.12: The zero and the two extreme Doppler shifts (received frequency fc for v = 0 or θ = 90°, and fc + fm or fc – fm for the two opposite directions of motion).
The Doppler shift fD can also be written as:
$$ f_D = \frac{v f_c}{c} \cos\theta \tag{8.18} $$
where λ = c/fc, c is the velocity of light and fc is the carrier frequency of the radio signal. When a moving MS receives a large number of signal components from different directions, each of these signal components experiences a specific Doppler shift in the interval [–fm, +fm]. A model of multipath propagation with Rayleigh or Rician fading assumes a random angle of reception (isotropic scattering) and a uniform distribution of the phase of each component within the interval [0, 2π]. In such a model each component has its own Doppler shift. As a consequence, when a sinusoid of frequency fc is transmitted, the received signal spectrum occupies a widened frequency band of width BD = 2fm around the carrier frequency fc. This spectrum is called the Doppler spectrum, which is (for Rayleigh fading) given by:
$$ S(f) = \begin{cases} \dfrac{\sigma^2}{\pi \sqrt{f_m^2 - (f - f_c)^2}}, & |f - f_c| < f_m \\ 0, & \text{otherwise} \end{cases} \tag{8.19} $$
and represented by the U-shape function [Cla68] shown in Figure 8.13.
Fig. 8.13: Power density spectrum of a sine wave suffering from a Doppler spread (S(f) versus f; U-shaped spectrum of width BD between fc – fm and fc + fm, centred at fc).
In reality, the average power spectrum follows this theoretical shape approximately, depending on how closely the model matches a concrete situation. Due to the Doppler shifts of the many scattered components that gather at the place of the (mobile) receiver, a single spectral line that represents the original sine signal (carrier frequency) is dispersed onto a range of non-zero frequency components with the minimum around fc and the (theoretical) asymptotes at fc ± fm. This broadening of a single spectral component is called Doppler spread. In a (mobile) communication system the effects of Doppler spread are negligible if the bandwidth of the baseband signal, i.e. the bandwidth of the signal around the carrier frequency, is much greater than the Doppler spread. The time variations of a channel affected by Doppler spread are characterized by a parameter called coherence time Tc. As the time-domain dual of Doppler spread, coherence time tells how long the channel characteristics remain unchanged, i.e. how long they can be considered unchanged. If the channel impulse response is observed, coherence time represents the duration of the time interval within which the impulse response is invariant, i.e. over which the amplitude correlation of two time-separated received signals (pulses) is significant. Inversely proportional to the Doppler spread (Tc ~ 1/fm), coherence time is used to estimate the influence of the Doppler effect on the mobile channel with regard to the velocity v of the mobile receiver, the carrier frequency fc and the symbol rate of the signal. Namely, if defined as the time interval over which the time correlation function of two received signals is above 0.5, coherence time (for the isotropic scattering model) can be approximated [Sha02][Rap02] as:
$$ T_c \approx \frac{9}{16\pi f_m} \tag{8.20} $$
i.e., when the maximal Doppler shift is replaced by fm = v/λ = vfc/c:
$$ T_c \approx \frac{9}{16\pi} \cdot \frac{c}{v f_c} \tag{8.21} $$
Two different signals, e.g. two adjacent symbols of a digital communication system, will be treated equally by the channel only if the time difference between them is less than Tc, i.e. if the symbol rate is greater than 1/Tc. So, if fc and the symbol rate of a communication channel are known, (8.21) can be used to estimate the maximal velocity of the receiver below which the Doppler effect will not influence mobile communication. In practice, (8.20) has proved to be too restrictive, so in modern mobile systems another expression is often used [Rap02]:
$$ T_c = \sqrt{\frac{9}{16\pi f_m^2}} = \frac{0.423}{f_m} \tag{8.22} $$
as the geometric mean of (8.20) and the expression Tc = 1/fm (which is, on the other hand, proven to be too loose).
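The relations (8.17), (8.18), (8.20) and (8.22) lend themselves to a small worked example. The Python sketch below uses an assumed receiver speed of 30 m/s (108 km/h) and an assumed carrier frequency of 2 GHz. The function names and the rounded value of the speed of light are illustrative choices.

```python
import math

C = 3.0e8  # velocity of light in m/s (rounded, assumed for this example)

def doppler_shift(v, fc, theta_deg):
    """Doppler shift according to (8.18)."""
    return v * fc / C * math.cos(math.radians(theta_deg))

def max_doppler_shift(v, fc):
    """Maximum Doppler shift according to (8.17), with lambda = c / fc."""
    return v * fc / C

def coherence_time_restrictive(fm):
    """Coherence time according to (8.20)."""
    return 9.0 / (16.0 * math.pi * fm)

def coherence_time_geometric(fm):
    """Coherence time according to (8.22): geometric mean of (8.20) and 1/fm."""
    return math.sqrt(9.0 / (16.0 * math.pi * fm**2))

v = 30.0      # assumed receiver speed in m/s
fc = 2.0e9    # assumed carrier frequency in Hz

fm = max_doppler_shift(v, fc)
print("fm =", fm, "Hz")
print("Tc (8.20) =", coherence_time_restrictive(fm), "s")
print("Tc (8.22) =", coherence_time_geometric(fm), "s")
print("minimum symbol rate 1/Tc (8.22) =", 1.0 / coherence_time_geometric(fm), "symbols/s")
```

For a given symbol rate, the same functions can be used in the opposite direction: the largest fm for which the symbol duration stays below Tc gives an estimate of the maximal receiver velocity.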
8.6 Diversity
8.6.1 Diversity and Combining Techniques
Diversity methods are used to combat the detrimental effects of channel fading experienced in wireless communications. The basic idea is to transmit multiple copies of a signal over multiple statistically independent fading channels. Signals are normally affected differently by the different channels, and the probability that two or more received copies of the signal experience deep fades at the same time is very low. At the receiver, the multiple copies are combined in such a manner that the effects of fading are minimized. The main advantage of receiver diversity is that it mitigates the fluctuations due to fading, so that the channel appears more like an AWGN channel. Assume that the receiver has two different antennas that measure the received signals at a slight offset from each other. The received signals with their fast and slow fading (mA and mB) are shown in Figure 8.14.
Fig. 8.14: Uncorrelated fading signals (received signal in dB versus time for signal A and signal B, with their slow-fading components mA and mB).
If, as an example, the strongest signal is selected, the influence of fading is significantly reduced. This method is called space diversity reception. The distance between the antennas must be chosen large enough to diminish the correlation between the signals. For fast fading, this means that the distances should be in the order of the wavelength. This situation is referred to as micro-diversity or a microscopic diversity scheme. Such schemes are used when the signals are obtained via independently fading paths through two or more different antennas at a single receiver site. In order to reduce the effects of slow fading, significantly greater distances between the antennas are normally required. The corresponding diversity schemes are called macroscopic diversity schemes, where a signal is received by multiple antennas (or BSs) and then combined. Due to the increased technical complexity in the receiver, space diversity is used primarily on the BS side. Macroscopic diversity is possible only if multiple BSs are able to receive a given mobile’s signal, which is typically the case for CDMA systems (see Chapter 6).
Multiple diversity combining schemes have been proposed in the literature. These techniques are mostly linear: the output is typically a weighted sum of the different fading paths (also called branches). The diversity schemes discussed here include selection combining, switched combining, maximal ratio combining and equal gain combining. Further details on these combining methods can be found in [Gol05].
8.6.1.1 Selection Combining
In selection combining, the idea is to select the best signal out of all the received signals. The combiner always chooses the signal on the branch which has the highest signal-to-noise ratio (SNR). A selection combiner is shown in Figure 8.15.
Fig. 8.15: Selection combining.
Let M be the number of signals received over the M branches of the selection combiner. Let the average SNR on the ith branch be Γi and let the instantaneous SNR be γi. The probability that γi of branch i is below a certain level γ0 is given by:
$$ P(\gamma_i < \gamma_0) = 1 - e^{-\gamma_0 / \Gamma_i} \tag{8.23} $$
If the average SNR is the same in all the branches, i.e.:
$$ \Gamma = \Gamma_i, \quad \forall i \in [1, M] \tag{8.24} $$
then the outage probability (the probability that the SNR stays below γ0 in all the branches) is given by:
$$ P_{out} = P(\gamma < \gamma_0) = \left( 1 - e^{-\gamma_0 / \Gamma} \right)^M \tag{8.25} $$
The outage probability of selection combining for various values of M is shown in Figure 8.16. One can clearly see a significant reduction in outage probability for increasing value of M.
Fig. 8.16: Outage probability Pout for various M (Pout on a logarithmic scale from 10^-4 to 10^0 versus 10log10(Γ/γ0) from –10 to 40; curves for M = 1, 2, 3, 4, 10 and 20).
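The outage probability (8.25) can be compared with a simple Monte Carlo experiment. The sketch below assumes independent Rayleigh-faded branches, for which the instantaneous SNR of each branch is exponentially distributed with mean Γ, in agreement with (8.23). The values of Γ, γ0 and M, the sample size and the seed are arbitrary example choices.

```python
import numpy as np

def outage_selection(gamma0, Gamma, M):
    """Outage probability of selection combining according to (8.25)."""
    return (1.0 - np.exp(-gamma0 / Gamma)) ** M

Gamma = 10.0     # assumed average SNR per branch (linear scale)
gamma0 = 1.0     # assumed SNR threshold (linear scale)
M = 3            # assumed number of branches

rng = np.random.default_rng(2)
# One row per channel realization, one column per branch
snr = rng.exponential(Gamma, size=(400_000, M))
selected = snr.max(axis=1)                 # the combiner picks the strongest branch
p_simulated = np.mean(selected < gamma0)
p_theory = outage_selection(gamma0, Gamma, M)

print(p_simulated, p_theory)
print([float(outage_selection(gamma0, Gamma, m)) for m in (1, 2, 3, 4)])
```

The second line of output illustrates the trend of Figure 8.16: each additional branch multiplies the outage probability by the single-branch probability (8.23).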
8.6.1.2 Switched Combining
Selection combining achieves very good results, but it is very costly to implement because of the need for M separate receivers. Switched combining is another possibility for diversity combining, in which only one receiver is necessary. Switching between the antennas is carried out prior to the receiver, as shown in Figure 8.17. With the arrangement of two antennas, the two branches have the signals r1(t) and r2(t). The signal r1(t) is connected through to the receiver as long as it is above a predetermined threshold T. As soon as the signal falls below the threshold level T, the receiver is switched to the second signal r2(t), regardless of the instantaneous level of this signal.
Fig. 8.17: Switched combining.
The probability that a signal ri(t) is smaller than the threshold level T is given by:
$$ p(T) = P(r_i < T) = 1 - e^{-T^2 / \Gamma} \tag{8.26} $$
where Γ is the average SNR of the received signal, as in selection combining. From the above equation, the dropout probability of the combined signal r(t) is given by:
$$ P(r < r_0) = \begin{cases} p(r_0) - p(T) + p(r_0)\,p(T), & r_0 > T \\ p(r_0)\,p(T), & r_0 \le T \end{cases} \tag{8.27} $$
The SNR of the resultant signal, based on two received fading signals, under switched combining is shown in Figure 8.18.
Fig. 8.18: SNR of the resultant switched combining signal (SNR γ versus time for branch 1, branch 2 and the switched combining signal, with the threshold γT).
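The following Python sketch evaluates the closed-form expressions (8.26) and (8.27) for two branches. The threshold T, the average SNR Γ and the function names are assumptions chosen only for this example.

```python
import math

def p_below(level, Gamma):
    """Probability that a branch signal is below the given level, according to (8.26)."""
    return 1.0 - math.exp(-level**2 / Gamma)

def dropout_switched(r0, T, Gamma):
    """Dropout probability of the switched combining signal according to (8.27)."""
    p_r0 = p_below(r0, Gamma)
    p_T = p_below(T, Gamma)
    if r0 > T:
        return p_r0 - p_T + p_r0 * p_T
    return p_r0 * p_T

Gamma = 10.0    # assumed average SNR
T = 1.0         # assumed switching threshold

for r0 in (0.5, 1.0, 2.0):
    single = p_below(r0, Gamma)              # one branch without diversity
    switched = dropout_switched(r0, T, Gamma)
    print(r0, single, switched)
```

At r0 = T both cases of (8.27) give p(T)², which is the value that (8.25) yields for selection combining with M = 2. For other levels the switched combiner lies between a single branch and selection combining.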
8.6.1.3 Maximal Ratio Combining
Selection combining as well as switched (threshold) combining output the signal received on one of the branches. In maximal ratio combining, on the other hand, the output of the combiner is a weighted sum of all of the signals received on the branches. Though it performs better than selection combining and switched combining, it is also the most expensive way of combining faded signals. In maximal ratio combining, the signals received on all of the M branches are weighted, phase corrected and added. Each branch signal is weighted with a gain factor proportional to its own SNR and all the weighted branch signals are summed. The block diagram of a maximal ratio combining scheme is shown in Figure 8.19. For a signal ri and a weighting factor ai for branch i, the weighted sum for M branches is:
$$ r = \sum_{i=1}^{M} a_i r_i \tag{8.28} $$
Fig. 8.19: Maximal ratio combining scheme (M receiver branches whose weighted envelopes are summed).
The coefficients ai are chosen such that:
$$ a_i^2 = \gamma_i \tag{8.29} $$
where γi is the instantaneous SNR of the ith channel. The signal-to-noise ratio at the output of the adder becomes:
$$ \gamma_{out} = \sum_{i=1}^{M} \gamma_i \tag{8.30} $$
This equation shows that for γi = γ for all i, i.e. a constant SNR on all branches, the output signal-to-noise ratio would be:
$$ \gamma_{out} = M \cdot \gamma \tag{8.31} $$
This shows that through maximal ratio combining, the signal-to-noise ratio is improved by a factor of M. The dropout probability for the maximal ratio combined signal is given as:
$$ P(r < r_0) = 1 - e^{-r_0^2 / \Gamma} \sum_{i=1}^{M} \frac{\left( r_0^2 / \Gamma \right)^{i-1}}{(i-1)!} \tag{8.32} $$
The drop out probability for maximal ratio combining is shown in Figure 8.20. The figure shows a clear gain as compared to the method of selection combining.
Fig. 8.20: Outage probability for various M (Pout on a logarithmic scale from 10^-4 to 10^0; horizontal axis labelled 10log10(r0²/Γ) [dB], from –10 to 40; curves for M = 1, 2, 3, 4, 10 and 20).
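A short Python sketch can be used to compare (8.32) with the selection combining result (8.25). It is assumed here that the SNR threshold γ0 of (8.25) corresponds to r0², as suggested by the exponent in (8.26). The numerical values are arbitrary examples.

```python
import math

def dropout_mrc(r0, Gamma, M):
    """Dropout probability of maximal ratio combining according to (8.32)."""
    x = r0**2 / Gamma
    series = sum(x ** (i - 1) / math.factorial(i - 1) for i in range(1, M + 1))
    return 1.0 - math.exp(-x) * series

def dropout_selection(r0, Gamma, M):
    """Outage probability of selection combining, (8.25) with gamma0 = r0**2 (assumption)."""
    return (1.0 - math.exp(-r0**2 / Gamma)) ** M

def output_snr_mrc(branch_snrs):
    """Output SNR of the maximal ratio combiner according to (8.30)."""
    return sum(branch_snrs)

Gamma = 10.0    # assumed average SNR per branch
r0 = 1.0        # assumed signal level

for M in (1, 2, 3, 4):
    print(M, dropout_mrc(r0, Gamma, M), dropout_selection(r0, Gamma, M))

print(output_snr_mrc([2.0, 2.0, 2.0]))   # equal branch SNRs: M times the branch SNR, see (8.31)
```

For M = 1 both expressions coincide, while for M > 1 maximal ratio combining gives the lower dropout probability, which is the gain referred to in the comparison of Figures 8.16 and 8.20.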
8.6.1.4 Equal Gain Combining
Equal gain combining is similar to maximal ratio combining except that all the weights are set to unity. This still provides an acceptable output SNR from unacceptable input signals over the branches. However, the performance is inferior compared to maximal ratio combining.
8.6.2 Multiple Input Multiple Output
Multiple Input Multiple Output (MIMO) is a wireless transmission technique based on multiple transmitting and receiving antennas (Fig. 8.21). Its early development started in the 1970s with research on multi-channel digital transmission systems. The first publication and the patent on the transmission of multiple, independent information streams using co-located antennas and multi-dimensional signal processing were made by G. Raleigh [RaCi96] in 1996, followed by the publication on the layered space-time architecture by G. J. Foschini [Fos96]. MIMO systems are much more efficient than a conventional Single Input Single Output (SISO) system because of their higher data rate and lower Bit Error Rate (BER), which imply a higher spectral efficiency (see Chapter 5) and Quality of Service (QoS). MIMO is a part of a variety of wireless standards: Wi-Fi [IEEE802.1][IEEE802.11ac], HSPA+ [HSPA11], WiMAX [IEEE802.16] and LTE [EUTRA16]. It has also been applied to power-line communication (multiple signals over phase, neutral and earth wires), and recently MIMO techniques have been used in (still experimental) optical systems with new types of optical fibers (e.g. few-mode multi-core fibers, see Chapter 7). The development of MIMO systems continues, as MIMO is a promising technique for future wireless applications.
Fig. 8.21: Multiple Input Multiple Output (transmit antennas 1, 2, 3, … of Tx and receive antennas 1, 2, 3, … of Rx, connected by the channels h11, h12, …, h33).
Unlike conventional wireless systems which suffer from the multipath propagation, MIMO systems exploit it for multiplying the capacity of the radio link. Namely, if more different data streams at various frequencies are brought to each of transmitting antennas (e.g. an OFDM signal, see Chapter 6), each data stream reaches the receiving antennas from different directions, i.e. each receiving antenna receives multiples of different signals from many directions. Since each receiver may have two or more antennas (the antenna array), it is possible to adjust the receiving gain of the array to be maximized in one or more directions, i.e. to make the receiver selective only for the signals from the chosen i.e. wanted directions, whereby one dominant signal (particular data stream) comes from each direction. The similar, so-called beamforming technique may be also applied on the transmitter’s side, which can further increase the spatial separation of different data streams coming to the receiver at the same frequency. The spatial multiplexing of more data streams on the same frequency is achieved in this way and the overall throughput of the system is multiplied. The above mentioned techniques of adjusting the amplitudes and phases of different signals on different transmitting and receiving antennas assume that the receiver and/or transmitter possesses the knowledge about each radio channel characteristic, i.e. the so called channel state information (CSI). The spatial multiplexing that takes advantage of multipath propagation would not be possible without CSI. A MIMO system can be mathematically modelled using the m×n matrix H(t) of the pulse responses hi,j(t) for each channel between the jth transmit and the ith receive antenna (j = 1,…, m; i = 1,…, n). The transmitter sends multiple streams by multiple
transmit antennas through a matrix channel consisting of all the paths between n transmit antennas and m receive antennas:
\[
H(t) =
\begin{pmatrix}
h_{11}(t) & h_{12}(t) & \cdots & h_{1n}(t) \\
h_{21}(t) & h_{22}(t) & \cdots & h_{2n}(t) \\
\vdots & \vdots & \ddots & \vdots \\
h_{m1}(t) & h_{m2}(t) & \cdots & h_{mn}(t)
\end{pmatrix}
\tag{8.33}
\]
In general, all the channels between transmit and receive antennas are considered to be uncorrelated and Gaussian distributed, which leads to a very complex analysis of a multipath environment. In reality, this assumption is not fulfilled in most cases, as many channels are correlated (to a certain extent) regarding the receiving angle, impulse delay and other parameters. Instead of describing a large number of individual paths, the propagation paths with similar parameters can be grouped into so-called clusters. The use of clusters simplifies a MIMO channel model, since only the clusters (as supersets of particular paths) have to be parameterized. With this in mind, the channel impulse response with multipath propagation between the jth transmit and the ith receive antenna can be described via cluster parameters as the superposition of N clusters with $K_N$ components of the multipath propagated signal:
\[
h_{i,j}(t) = \sum_{l=0}^{N} \sum_{k=0}^{K_N} \alpha_{l,k}^{i,j}\, \delta\!\left(t - T_l^{(i,j)} - \tau_{l,k}^{(i,j)}\right)
\tag{8.34}
\]
where $\alpha_{l,k}^{i,j}$ is the coefficient of the channel between the transmit antenna j and the receive antenna i, $T_l^{(i,j)}$ is the delay of the lth cluster and $\tau_{l,k}^{(i,j)}$ is the relative delay of the kth signal in the lth cluster.
At the beginning of its development, the MIMO concept assumed only one data signal, which was transmitted in parallel by more than one transmitter antenna and received by more than one receiver antenna. In this way, the reliability and robustness of a wireless system are increased solely through redundancy, i.e. spatial diversity. Over time, MIMO technology has grown into a concept that includes many techniques, which may be classified into three groups:
1. Precoding
2. Spatial multiplexing
3. Diversity coding
Today MIMO may denote a system with one multi-antenna transmitter and one multi-antenna receiver able to transmit/receive several different signals over the same
radio channel. There are also multi-user MIMO systems (MU-MIMO) with several multi-antenna receivers (i.e. users). One of the main properties of a MIMO system is that it uses information about the state of each channel between the jth transmit and the ith receive antenna. The already mentioned CSI can be used by both the transmitter and the receiver, or by only one of them. CSI is information about the channel properties, i.e. about how the signal propagation over the channel is influenced by fading, scattering, power decrease with distance etc. If CSI is known to the transmitter and/or the receiver, the transmission can be adapted to the channel conditions. In single-stream MIMO systems each transmit antenna sends the same signal weighted with an appropriate complex coefficient that represents amplitude and phase, and the set of weighting coefficients is chosen in such a way that the signal level at the receiver’s output is maximal. The proper choice of phases and amplitudes at each transmitting antenna creates a specific beam shape that leads to the maximal constructive superposition of the different multipath components at the receiver antenna. This beamforming technique is an excellent way to reduce multipath fading.
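The effect of the weighting coefficients can be illustrated with a small numerical sketch. It assumes a single receive antenna, a noise-free flat channel and the weight choice known as maximum ratio transmission (conjugate channel coefficients, normalized to unit total power); the channel values and the function names are hypothetical and serve only as an illustration.

```python
import cmath
import math

def mrt_weights(h):
    """Weights proportional to the conjugate channel coefficients, unit total power."""
    norm = math.sqrt(sum(abs(x) ** 2 for x in h))
    return [x.conjugate() / norm for x in h]

def received_signal(h, w, s=1.0):
    """Noise-free signal at one receive antenna: sum over antennas of h_j * w_j * s."""
    return sum(hj * wj for hj, wj in zip(h, w)) * s

# Hypothetical channel coefficients (amplitude, phase) of three transmit antennas
h = [cmath.rect(1.0, 0.3), cmath.rect(0.8, 2.1), cmath.rect(0.5, -1.2)]

w_equal = [1 / math.sqrt(3)] * 3   # same signal on every antenna, no CSI used
w_mrt = mrt_weights(h)             # phases aligned with the help of CSI

p_equal = abs(received_signal(h, w_equal)) ** 2
p_mrt = abs(received_signal(h, w_mrt)) ** 2
print(f"equal weights: {p_equal:.3f}, CSI-based weights: {p_mrt:.3f}")
```

With the conjugate weights all contributions add in phase, so the received power equals the sum of the squared channel magnitudes; any other unit-power weighting gives at most this value.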
8.6.2.1 Precoding
Precoding is based on the knowledge of CSI and is a generalization of beamforming. The transmitter predicts the channel transfer function and, based on this prediction, “precodes” the original signal to be sent in order to cancel the signal distortions introduced by the channel at the receiver side. While beamforming maximizes the signal power, precoding minimizes the errors at the receiver’s output. If the receiver has several antennas, it is impossible to maximize the signal power at all of them. Instead, several data streams are transmitted from different antennas in such systems, and the new goal is to maximize the throughput of the system.
8.6.2.2 Spatial multiplexing
With spatial multiplexing, a high data rate signal is divided into multiple low data rate signals, which are then transmitted from different antennas in the same frequency band. The receiver antenna array can separate the arriving streams into parallel channels if their spatial signatures are sufficiently different and the receiver possesses accurate CSI. The number of spatial streams cannot exceed the smaller of the number of transmit antennas and the number of receive antennas. Spatial multiplexing can also be used without the knowledge of CSI at the transmitter. If the transmitter possesses the knowledge of CSI, spatial multiplexing can be combined with precoding.
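As a minimal sketch of how a receiver with accurate CSI can separate two streams, the following example inverts a 2×2 channel matrix (a so-called zero-forcing receiver, chosen here only for illustration). The channel values, the symbols and the function name are assumptions; noise is ignored, and H[i][j] denotes the channel from transmit antenna j to receive antenna i as in (8.33).

```python
def zero_forcing_2x2(H, y):
    """Recover two transmitted symbols from y = H x by inverting the 2x2 matrix H."""
    (a, b), (c, d) = H
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("channel matrix is singular: streams cannot be separated")
    x1 = (d * y[0] - b * y[1]) / det
    x2 = (-c * y[0] + a * y[1]) / det
    return [x1, x2]

# Hypothetical flat-fading channel coefficients known to the receiver (CSI)
H = [[0.9 + 0.2j, 0.3 - 0.4j],
     [0.1 + 0.5j, 0.8 - 0.1j]]

x = [1 + 1j, -1 + 1j]                       # two parallel data symbols
y = [H[0][0] * x[0] + H[0][1] * x[1],       # signal at receive antenna 1
     H[1][0] * x[0] + H[1][1] * x[1]]       # signal at receive antenna 2

x_hat = zero_forcing_2x2(H, y)
print(x_hat)
```

If the two rows of H are nearly proportional (similar spatial signatures), the determinant approaches zero and the streams can no longer be separated reliably.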
8.6.2.3 Diversity coding
Diversity coding uses the independent fading of the multiple antenna links to increase the signal diversity. In contrast to spatial multiplexing, a single stream is transmitted, but the signal is coded using space-time coding. Diversity coding works without any knowledge of CSI at the transmitter. If the receiver possesses some knowledge of CSI, diversity coding can be combined with spatial multiplexing.
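One widely known space-time code is the Alamouti scheme for two transmit antennas; it is not named in the text and is used here only as an illustrative sketch. The example assumes a noise-free channel that stays constant over two symbol periods and a combiner that has estimated the coefficients h1 and h2; all values are hypothetical.

```python
def alamouti_encode(s1, s2):
    """Symbols sent by (antenna 1, antenna 2) in two consecutive symbol periods."""
    return [(s1, s2), (-s2.conjugate(), s1.conjugate())]

def alamouti_combine(r1, r2, h1, h2):
    """Linear combining of the two received samples using the channel estimates."""
    gain = abs(h1) ** 2 + abs(h2) ** 2
    s1_hat = (h1.conjugate() * r1 + h2 * r2.conjugate()) / gain
    s2_hat = (h2.conjugate() * r1 - h1 * r2.conjugate()) / gain
    return s1_hat, s2_hat

# Hypothetical channel coefficients from the two transmit antennas to one receive antenna
h1, h2 = 0.7 - 0.3j, -0.2 + 0.9j
s1, s2 = 1 + 1j, 1 - 1j

(t1_a, t1_b), (t2_a, t2_b) = alamouti_encode(s1, s2)
r1 = h1 * t1_a + h2 * t1_b      # received in the first symbol period
r2 = h1 * t2_a + h2 * t2_b      # received in the second symbol period

s1_hat, s2_hat = alamouti_combine(r1, r2, h1, h2)
print(s1_hat, s2_hat)
```

The transmitter needs no channel knowledge here; both symbols are recovered with the combined gain |h1|² + |h2|², which is the diversity benefit.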
8.7 Propagation and Path Loss in Free Space
8.7.1 Concept of Free Space
The concept of free space is that the propagation medium (the channel) between the sender and the receiver is free of obstacles and hindrances to radio frequency (RF) propagation. This means that the effects discussed earlier, such as reflection, refraction, diffraction, scattering and absorption, are not present on the path between the sender and the receiver. In general, it is assumed that the transmission between the sender and the receiver is a function only of the distance between them.
8.7.2 Path Loss
If the signal is transmitted with the transmit power Pt and received with the power Pr, the linear path loss (PL) is defined as the ratio of the transmit power to the receive power, i.e. PL = Pt/Pr. The path loss of the channel in dB is defined as the logarithmic ratio of the transmitted and received signal power:
\[
PL\,[\mathrm{dB}] = 10 \log_{10} \frac{P_t}{P_r}
\tag{8.35}
\]
The path gain (PG) is the negative value of the path loss:
\[
PG\,[\mathrm{dB}] = -PL\,[\mathrm{dB}] = -10 \log_{10} \frac{P_t}{P_r}
\tag{8.36}
\]
8.7.3 Path Loss in Free Space
If the distance between the transmitter and the receiver in free space equals d and Pt is the transmission power, then the power at the receiver (Pr) is given by:
\[
P_r = P_t G_t G_r \left( \frac{\lambda}{4 \pi d} \right)^{2}
\tag{8.37}
\]
where Gt is the gain of the transmitting antenna, Gr is the gain of the receiver antenna and λ is the carrier wavelength. d and λ are expressed in the same units, e.g. in meters. The path loss of the free space model in dB is given as:
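\[
PL\,[\mathrm{dB}] = 10 \log_{10} \frac{P_t}{P_r} = -10 \log_{10} \frac{P_r}{P_t} = -10 \log_{10} \left[ G_t G_r \left( \frac{\lambda}{4 \pi d} \right)^{2} \right]
\tag{8.38}
\]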
where the corresponding path gain of the free space model is given as:
\[
PG = -PL = 10 \log_{10} \left[ G_t G_r \left( \frac{\lambda}{4 \pi d} \right)^{2} \right]
\tag{8.39}
\]
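A short numerical sketch of (8.37) and (8.38) follows. The carrier frequency of 2.4 GHz, the distance of 100 m, the unity antenna gains and the 20 dBm transmit power are assumed values chosen only for illustration; the function name is hypothetical.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def free_space_path_loss_db(distance_m, frequency_hz, gt=1.0, gr=1.0):
    """Path loss in dB according to (8.38); gt and gr are linear antenna gains."""
    wavelength = C / frequency_hz
    path_gain = gt * gr * (wavelength / (4 * math.pi * distance_m)) ** 2
    return -10 * math.log10(path_gain)

pl_db = free_space_path_loss_db(100.0, 2.4e9)       # about 80 dB
pl_db_double = free_space_path_loss_db(200.0, 2.4e9)

pt_dbm = 20.0                                        # assumed transmit power
pr_dbm = pt_dbm - pl_db                              # received power in dBm
print(f"PL = {pl_db:.2f} dB, Pr = {pr_dbm:.2f} dBm")
```

Because the received power falls with the square of the distance, doubling d adds about 6 dB of path loss.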
8.7.4 Effective Isotropic Radiated Power
The product of the transmission power Pt and the gain of the transmitting antenna Gt is often called “effective transmit power” or “Effective Isotropic Radiated Power” (EIRP). The EIRP of a transmitting device therefore indicates the transmission power that would have to be fed into an isotropic radiating antenna (an antenna which distributes the power evenly in all directions, without preference for any particular direction) to generate the same power density at the receiver. A related term is effective radiated power (ERP). The difference is that for ERP the antenna gain is expressed relative to an ideal half-wave dipole antenna, whereas for EIRP the gain is expressed relative to an ideal isotropic antenna. EIRP can be calculated from ERP as:
\[
EIRP = ERP + 2.15\ \mathrm{dB}
\tag{8.40}
\]
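In logarithmic units the product Pt·Gt becomes a sum, which the following sketch uses together with (8.40). The transmit power of 30 dBm and the antenna gain of 10 dBi are assumed example values, and the function names are hypothetical.

```python
DIPOLE_GAIN_DBI = 2.15  # gain of an ideal half-wave dipole over an isotropic antenna

def eirp_dbm(pt_dbm, gt_dbi):
    """EIRP in dBm: transmit power (dBm) plus antenna gain relative to isotropic (dBi)."""
    return pt_dbm + gt_dbi

def erp_dbm(pt_dbm, gt_dbi):
    """ERP in dBm, obtained from EIRP by rearranging (8.40)."""
    return eirp_dbm(pt_dbm, gt_dbi) - DIPOLE_GAIN_DBI

def dbm_to_watt(p_dbm):
    """Convert a power level in dBm to watts."""
    return 10 ** ((p_dbm - 30) / 10)

eirp = eirp_dbm(30.0, 10.0)   # 1 W into a 10 dBi antenna
erp = erp_dbm(30.0, 10.0)
print(f"EIRP = {eirp:.2f} dBm ({dbm_to_watt(eirp):.1f} W), ERP = {erp:.2f} dBm")
```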
8.8 Path Loss Prediction Models
Path loss can normally be predicted analytically only for simple cases such as free space propagation. The propagation of electromagnetic waves in real environments with obstacles such as buildings, mountains and valleys can be predicted theoretically only to a limited degree. However, for the practical design of mobile networks and the correct placement of base stations, it is of great importance to be able to predict the expected reception level. For this reason, many empirical propagation models have been developed from very extensive series of measurements. These models include the prediction models of Lee [Lee93], Okumura [Oketal68] and Hata [Hat80], the EURO COST model [ECOST231] (also known as COST231) and the Walfish/Bertoni model [WaBe88].
The Lee model [Lee93] is used to predict the local mean of a received signal along the path of the mobile receiver unit. This model accounts for two factors affecting a mobile signal: the strong ground reflection, which depends on the terrain contour, and the human-made structures, which cause signal loss along the path of travel. The model was developed for use at a frequency of 900 MHz.
The Okumura model [Oketal68] was originally developed from data collected in the city of Tokyo. It is applicable over a frequency range of 150–1500 MHz and over distances of 1–100 km. This model provides a set of curves giving the median attenuation, relative to free space, of signal propagation in irregular terrain. Okumura’s model is the most widely used model for signal prediction in urban areas and serves as the basis for the Hata model.
The Hata model [Hat80] is based on the Okumura model and is used as a prediction model for urban areas. It incorporates the graphical information from the Okumura model to capture the effects of diffraction, reflection and scattering caused by urban structures. It operates over the same range of frequencies as the Okumura model, i.e. 150–1500 MHz.
The EURO COST model [ECOST231] was developed by the European Cooperation in the field of Scientific and Technical Research (COST) and extends the Hata model to 2 GHz. The COST-231 model does not consider the impact of diffraction from rooftops and buildings. The Walfish/Bertoni model [WaBe88] considers these impacts as well.
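As an illustration of what such an empirical model looks like in practice, the following sketch evaluates the commonly published closed-form expression of the Hata model for urban areas with the mobile antenna correction for small and medium-sized cities. The formula is not given in the text above and is quoted here from the general literature; the parameter values in the example are assumptions.

```python
import math

def hata_urban_path_loss_db(f_mhz, h_base_m, h_mobile_m, d_km):
    """Median urban path loss in dB (Hata, small/medium city correction)."""
    if not 150.0 <= f_mhz <= 1500.0:
        raise ValueError("Hata model is specified for 150-1500 MHz")
    log_f = math.log10(f_mhz)
    log_hb = math.log10(h_base_m)
    # Correction factor for the height of the mobile antenna
    a_hm = (1.1 * log_f - 0.7) * h_mobile_m - (1.56 * log_f - 0.8)
    return (69.55 + 26.16 * log_f - 13.82 * log_hb - a_hm
            + (44.9 - 6.55 * log_hb) * math.log10(d_km))

# Assumed example: 900 MHz, 30 m base station, 1.5 m mobile antenna
loss_1km = hata_urban_path_loss_db(900.0, 30.0, 1.5, 1.0)
loss_5km = hata_urban_path_loss_db(900.0, 30.0, 1.5, 5.0)
print(f"{loss_1km:.1f} dB at 1 km, {loss_5km:.1f} dB at 5 km")
```

For comparison, the free-space loss at 900 MHz and 1 km is roughly 91.5 dB, so the urban environment adds several tens of dB in this example.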
8.9 Software Defined Radio
The term radio is often understood to mean the AM/FM radio device in a car. However, the term is much broader and includes any kind of transmission through the air, such as cell phone data and wireless computer data communication. Software defined radio (SDR) means that the receiver is not a pure hardware component; rather, all or most of it is built in reconfigurable software. The concept of SDR was introduced by J. Mitola [Mit92]. He defined SDR as an identifier of a class of radios (transceivers and receivers) that could be reprogrammed and reconfigured through software. Theoretically, an "ideal SDR" has only the antenna and the A/D converter as hardware and the rest as software, but in practice more components can be implemented in hardware instead of software [JoSe03].
The block diagram of an SDR receiver is shown in Figure 8.22. An analog radio frequency (RF) signal is received via an antenna. It is converted from RF to an analog intermediate frequency (IF) signal by the RF tuner. The A/D converter converts this analog IF signal to digital IF samples. The next step is to pass the digital IF samples through the digital downconverter (DDC). A DDC is typically implemented as a field programmable gate array (FPGA) circuit with a so-called intellectual property (IP) core, i.e. a portable logic or data block. In Figure 8.22, the DDC is shown as a dotted box with three integral components: a digital mixer, a digital local oscillator and a finite impulse response (FIR) lowpass filter (see Chapter 1). The digital mixer and the local oscillator shift the digital IF samples to baseband, while the FIR lowpass filter limits the bandwidth of the final signal. The digital baseband samples produced by the DDC are fed to a digital signal processor (DSP). The DSP performs tasks such as demodulation and decoding.
[Figure: analog RF signal → RF tuner → analog IF signal → A/D converter → digital IF samples → DDC (digital mixer, digital local oscillator, lowpass filter) → digital baseband samples → DSP.]
Fig. 8.22: Software defined radio-receiver.
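The three DDC components can be sketched in a few lines of software. The example below is a simplified model, not an FPGA implementation: the local oscillator is a complex exponential, the FIR lowpass filter is a plain moving average, and the sample rate, IF and decimation factor are assumed values.

```python
import cmath
import math

def digital_down_convert(samples, f_if, f_s, taps, decimation):
    """Mix real IF samples to baseband, lowpass filter them and reduce the sample rate."""
    # Digital mixer fed by the digital local oscillator exp(-j*2*pi*f_if*n/f_s)
    mixed = [x * cmath.exp(-2j * math.pi * f_if * n / f_s)
             for n, x in enumerate(samples)]
    # FIR lowpass filter (direct-form convolution, only fully overlapping outputs)
    filtered = []
    for n in range(len(taps) - 1, len(mixed)):
        filtered.append(sum(taps[k] * mixed[n - k] for k in range(len(taps))))
    # Keep every 'decimation'-th sample
    return filtered[::decimation]

F_S = 1000.0     # assumed A/D sample rate in Hz
F_IF = 250.0     # assumed intermediate frequency in Hz
TAPS = [1 / 8] * 8   # 8-tap moving average as a very simple lowpass filter

# Unmodulated carrier at the IF as test input
if_samples = [math.cos(2 * math.pi * F_IF * n / F_S) for n in range(64)]
baseband = digital_down_convert(if_samples, F_IF, F_S, TAPS, decimation=4)
print(baseband[:3])
```

Mixing the cosine produces a constant term of 0.5 and an image at twice the IF; the lowpass filter removes the image, so the baseband samples of the unmodulated carrier are constant.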
The block diagram of an SDR transmitter is shown in Figure 8.23. The input signal is converted to (or produced as) a digital baseband signal by a DSP. This signal is fed to a digital up converter (DUC) block, shown as a dotted box. The DUC translates the baseband samples to digital IF samples. The D/A converter converts these digital IF samples to an analog IF signal. This is followed by the analog RF upconverter, which converts the analog IF signal to RF. Finally, the power amplifier boosts the signal energy delivered to the transmitter antenna.
[Figure: DSP → digital baseband samples → DUC (interpolation filter, digital mixer, digital local oscillator) → digital IF samples → D/A converter → analog IF signal → RF upconverter → analog RF signal → power amplifier.]
Fig. 8.23: Software defined radio-transmitter.
8.10 Cognitive Radio
Cognitive radio technology, like software defined radio, was introduced by J. Mitola [MiMa99] as a novel approach to describe intelligent radios capable of making decisions based on information about the RF environment and on previous experience. As such, intelligent radios are able to learn and plan. Cognitive radio (CR) is considered an effective technology for improving the utilization of the radio spectrum [Hay05][Goetal09]. Nowadays, the radio spectrum in use is allocated on a dedicated basis: each wireless system has an exclusive licence for operation in a given frequency band. Such a spectrum allocation policy has its advantages and disadvantages. The main advantage is that licensed systems can use sufficient transmit power without interfering with other licensed systems, reaching a satisfactory Quality of Service (QoS) in a wide coverage area. The main disadvantage is that it becomes more and more difficult to provide dedicated spectrum for new systems and to enhance existing systems. The other significant disadvantage is that the assigned spectrum bands are heavily underutilized [Daetal09] in time, space and frequency. The most promising techniques for increasing the spectrum utilization are CR techniques, which offer a new approach to spectrum sharing:
− Horizontal: distributed spectrum allocation, such as listen-before-talk and equal rights to access the radio spectrum, and
− Vertical: licensed users have a higher priority to access the radio spectrum than unlicensed ones.
Licensed users are so-called primary users (PUs) and unlicensed users are so-called secondary users (SUs) or cognitive users. The benefits of spectrum sharing are numerous: the same frequency band can be accessed by more than one wireless system; operators can more easily access spectrum and deploy their own wireless infrastructure; the spectrum management is simplified while the spectrum utilization is maximized. Therefore, CR techniques can be applied to various communication systems such as cellular networks, mesh networks, public safety and emergency networks, wireless medical networks, leased networks, military networks etc.
The CR cycle (Fig. 8.24) includes measurements of the physical radio environment, analysis of the measurements, estimation, prediction and learning from current and previous measurements, and adaptation of the transmission to the spectrum characteristics and user requirements [Goetal09]. Unlike a traditional radio, the cognitive radio possesses:
− cognition capability: the ability to identify the available spectrum by sensing the radio environment, to analyse information from the environment and to decide which part of the spectrum to access and which transmission strategy is best, and
− configurability: the ability to dynamically adapt transceiver parameters to the radio environment.
[Figure: cycle around the radio environment: Sensing → Analysis → Reasoning/Deciding → Reconfigurability/Adaptation.]
Fig. 8.24: Cognitive radio cycle.
Spectrum sensing describes the ability of the SUs to sense and be aware of the spectrum availability, access policies and radio channel characteristics (transmit power, interference and noise). The main types of spectrum sensing methods are:
1. Primary transmitter detection or indirect spectrum sensing: based on the detection of the surrounding PU transmitters [Lietal11]. The best-known primary transmitter detection techniques are:
− Energy detection [GhSo08]: the received energy is compared with a defined threshold in order to find out whether the PU signal is present or absent
− Cyclostationary feature detection [Suetal08]: the SU can decide on the frequency band occupancy based on the analysis of the cyclic autocorrelation function of the received signals (see Chapter 1), which enables recognition of cyclostationary characteristics (modulation type, carrier frequency and symbol duration)
− Matched filter detection (see Chapter 5): an optimal detection technique if the SU has a-priori knowledge about the PU transmitter signal (operating frequency, bandwidth, modulation type and packet format).
2. Interference based detection: based on the direct detection of the PU receiver. This technique avoids the hidden node problem [YuAs09], which is specific to the primary transmitter detection techniques and occurs when the transmission range of the PU transmitter reaches the PU receiver, but not the SU transmitter. The best-known interference based detection techniques are:
− Primary receiver detection or direct spectrum sensing [GhSo08]: detection of the nearby PU receiver using local oscillator detection [Aketal08] or proactive detection [Zhetal09]
− Interference temperature model: based on setting the maximum interference level which the PU receiver can tolerate for a given frequency band B and geographic location [Cla07], whereby the temperature depends on the RF power P available at the receiving antenna (fc – the carrier frequency, k – the Boltzmann constant):
\[
T(f_c, B) = \frac{P(f_c, B)}{kB}, \qquad k = 1.38 \cdot 10^{-23}\ \mathrm{J/K}
\tag{8.41}
\]
3. Cooperative spectrum sensing: based on the collaboration of multiple SUs and the combination of their sensing information in order to improve the sensing performance [Aketal11], i.e. to decrease the probability of missed detection and false alarm thanks to multiuser diversity and independent fading channels, and to solve the hidden node problem. The main steps of the cooperative spectrum sensing technique are local sensing, reporting and data fusion.
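Of the methods listed above, energy detection is the simplest to sketch. The following example assumes unit-variance Gaussian noise, a PU signal modelled as a unit-power sinusoid and a hand-picked threshold of 1.5; in a real detector the threshold would be derived from the noise power and the target false alarm probability.

```python
import math
import random

def energy_detect(samples, threshold):
    """Return (decision, average energy); decision is True if the band appears occupied."""
    energy = sum(x * x for x in samples) / len(samples)
    return energy > threshold, energy

rng = random.Random(1)          # fixed seed so that the sketch is reproducible
N = 2000
THRESHOLD = 1.5                 # assumed value between noise power 1 and signal+noise power 2

# Band without PU activity: noise only
idle_band = [rng.gauss(0.0, 1.0) for _ in range(N)]

# Band with PU activity: unit-power sinusoid plus noise
busy_band = [math.sqrt(2.0) * math.cos(2 * math.pi * 0.05 * n) + rng.gauss(0.0, 1.0)
             for n in range(N)]

idle_decision, idle_energy = energy_detect(idle_band, THRESHOLD)
busy_decision, busy_energy = energy_detect(busy_band, THRESHOLD)
print(idle_decision, round(idle_energy, 2), busy_decision, round(busy_energy, 2))
```

The detector needs no knowledge of the PU signal, which is its main attraction; its weakness is that the decision depends directly on how well the noise power is known.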
Spectrum access approaches depend on the spectrum access policy and applications. They can be divided into three access models:
1. Dynamic exclusive use model: based on maintaining the current spectrum regulation policy with added flexibility to improve the spectrum efficiency, i.e. the licensees can lease the underutilized spectrum to a third party under an agreement. Two approaches are possible:
− Spectrum property rights: the licensees are allowed to sell and trade spectrum, which improves the economic profit and the market for the limited radio resources
− Dynamic spectrum allocation: improvement of the spectrum usage by a dynamic spectrum assignment which exploits the traffic statistics of different services.
2. Open sharing model: based on the rule that anyone can access any spectrum range without permission, while respecting a minimum set of technical standards which are required for sharing spectrum.
3. Hierarchical access model: the most promising technique for increasing the spectrum utilisation, which is based on the access priority between the primary and secondary networks. There are three spectrum sharing approaches and a hybrid of them:
− Interweave approach: SUs access the spectrum during the periods of non-activity of PUs; if the PU becomes active, the SUs have to stop transmitting in order to avoid interference to the PU
− Overlay approach: the PUs share their knowledge (messages and signal codebooks) with SUs; SUs can use this knowledge in order to improve the performance of the primary transmission or to eliminate the interference of the primary transmission at the SU receiver
− Underlay approach: simultaneous primary and secondary transmissions are allowed to coexist in the same frequency band and in the same geographic area as long as the interference from SUs to the PU receivers stays under a defined threshold [Sietal15]
− Hybrid approach: the advantages of the previously mentioned approaches are combined.
CR is being developed continuously for different applications as an indispensable future solution for the intelligent and effective use of the wireless spectrum.
9 Cryptography
9.1 Basic terminologies
9.1.1 Crypto ABC
Cryptography
Cryptography is the science of mathematical methods for the encryption and decryption of data. These methods are used not only for data confidentiality but also for the verification of data integrity and for the authentication of the communication partner and the source of data.
Cryptanalysis
Cryptanalysis is the method of defeating the objectives of cryptography. This can be done mathematically by circumventing the goals of the cryptographic algorithms, or by exploiting weaknesses in the implementation of cryptographic algorithms.
Goals of an Attacker
The main goal of an attacker is to defeat the objectives of cryptography. An attacker is interested in learning the plaintext corresponding to a cipher text, or in faking a message with a valid signature; both fall into the category called partial breaking. However, the ideal scenario for an attacker is to extract the secret or private key used in cryptography. In order to extract the key, an attacker can always use a brute force attack, i.e. try all possible combinations until the key is found. For a key length of 64 bits, the number of possible keys, or the key space, is 2^64. With increasing key length, the effort required to perform a brute force attack increases exponentially and becomes impractical for a very large key space, e.g. for a key length of 256 bits, the key space is 2^256. Since it is impractical with today’s technology to search the complete key space, an attacker might resort to intelligent ways of reducing the key space. For example, an attacker might use a dictionary attack, in which, instead of searching the whole key space, the attacker tries the most likely words for a key (i.e. dictionary words) in order to reduce the search space.
Cryptology
Cryptology involves the study of both cryptography and cryptanalysis.
Cryptosystem
A cryptosystem comprises a set of cryptographic elements that offer information and/or communication security.
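The exponential growth of the brute force effort mentioned under Goals of an Attacker can be made concrete with a short calculation. The search rate of 10^9 keys per second is an assumed figure chosen only for illustration, and the function names are hypothetical.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def key_space(key_length_bits):
    """Number of possible keys for a given key length."""
    return 2 ** key_length_bits

def brute_force_years(key_length_bits, keys_per_second):
    """Time in years to try every key once at the given search rate."""
    return key_space(key_length_bits) / keys_per_second / SECONDS_PER_YEAR

RATE = 1e9  # assumed: one billion keys tested per second

years_64 = brute_force_years(64, RATE)
years_256 = brute_force_years(256, RATE)
print(f"64 bits:  {key_space(64)} keys, about {years_64:.0f} years")
print(f"256 bits: about {years_256:.2e} years")
```

Each additional key bit doubles the search time, so going from 64 to 256 bits multiplies the effort by 2^192.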
9.1.2 Cryptographic Design Principles
Confusion
The aim of confusion is to make the relationship between the plaintext and the corresponding cipher text as complicated as possible, i.e. after confusion there should not be any relationship between the statistics of the plaintext and the statistics of the cipher text. This makes it infeasible to derive the plaintext or parts of it from a given cipher text. In modern block ciphers, confusion is achieved through bit or byte substitution.
Diffusion
The aim of diffusion is to dissipate the statistical structure of the plaintext into the cipher text. This is achieved when each bit of the input affects many bits of the output, i.e. a change of a single input bit changes many bits of the output. Implicitly, this also means that each bit of the output is affected by all input bits. This is also implied by a phenomenon known as the Avalanche Effect, which states that even the change of a single input bit results in a change of approximately 50 % of the output bits, i.e. that any change of the input will change each output bit with a probability of 0.5.
Random Oracle
A random oracle is a black box which produces a random output for each input. The same output is returned only when the same input is repeated. Diffusion can also be seen in the context of a random oracle, whereby every input bit gets reflected in many output bits to achieve the desired effect of randomness.
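The avalanche effect can be observed experimentally. The sketch below uses the SHA-256 hash function from the Python standard library as a convenient stand-in for a well-diffusing transformation (it is not a block cipher, and the text does not prescribe it); each input bit of an example message is flipped in turn and the changed output bits are counted.

```python
import hashlib

def flip_bit(data, bit_index):
    """Return a copy of data with a single bit inverted."""
    out = bytearray(data)
    out[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(out)

def hamming_distance(a, b):
    """Number of bit positions in which two equally long byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

def avalanche_fraction(message):
    """Average fraction of output bits that change when one input bit is flipped."""
    reference = hashlib.sha256(message).digest()
    n_bits = len(message) * 8
    changed = 0
    for i in range(n_bits):
        digest = hashlib.sha256(flip_bit(message, i)).digest()
        changed += hamming_distance(reference, digest)
    return changed / (n_bits * len(reference) * 8)

fraction = avalanche_fraction(b"Modern Communications Technology")
print(f"average fraction of changed output bits: {fraction:.3f}")
```

A value close to 0.5 indicates that a single-bit change of the input affects each output bit with a probability of about one half.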
9.1.3 Encryption/Decryption
The aim of encryption is to map data, through mathematical transformations, into a form from which an attacker cannot reconstruct the original data. However, a legitimate recipient should be able to reconstruct the original data through an inverse mathematical transformation, i.e. decryption.
[Figure: Plaintext → Encryption → Ciphertext → Decryption → Plaintext.]
Fig. 9.1: Encryption/decryption.
The following definitions are taken from [ISO18033-1]:
– The un-ciphered information is known as plaintext. It is the original information which has some significance for its creator and consumer and needs to be protected from unintended recipients.
– The plaintext transformed so that its information content is hidden is known as the cipher text (Fig. 9.1).
– A reversible operation by a cryptographic algorithm that transforms the plaintext into cipher text so as to hide its information content is called encryption or encipherment.
– The reversal of the encryption is known as decryption or decipherment. Only the legitimate recipient should be able to perform the decryption and no one else. In certain cases, even the sender should not be able to perform the decryption.
Kerckhoffs’s Principle
In simple words, Kerckhoffs’s principle [Ker83] states that the security of a cryptosystem lies in the key alone. This means that a cryptosystem should be secure even if an attacker knows everything about the system, i.e. the encryption and decryption algorithms. The only exception is the secret key, which is not known to the attacker.
9.1.4 Key Based Encryption
It is advisable to use algorithms which, after years of investigation, are known to have few or no weaknesses. Most such algorithms depend on an additional parameter, such as a key Ke for encryption. A key is defined as a sequence of symbols that controls the operations of a cryptographic transformation, e.g. encryption and decryption [ISO9798-1].
[Figure: Sender: Plaintext → Encryption (key Ke) → Ciphertext → Decryption (key Kd) → Plaintext: Receiver.]
Fig. 9.2: Key based encryption.
Decryption is not possible without the knowledge of the key Kd. Therefore, the key Kd must be kept secret. The encryption and decryption algorithms are themselves made public as per Kerckhoffs’s principle. The keys Ke and Kd are identical for symmetric encryption. For asymmetric encryption, Ke and Kd are different and Ke can be made public under the condition that Kd cannot be derived from Ke (Fig. 9.2).
9.1.5 Symmetric Cryptography
Symmetric cryptography is based on the use of a single key for both encryption and decryption. This key must be kept secret and is therefore called the secret key.
Secret Key
A secret key is used with symmetric cryptographic techniques by a specified set of entities [ISO11770-3]. Cryptography based on a single shared key is also known as private key cryptography. The key needs to be exchanged over a secure channel between the communication partners (Fig. 9.3).
[Figure: Sender: Plaintext → Encryption → Ciphertext → Decryption → Plaintext: Receiver; both sides use the same secret key K.]
Fig. 9.3: Symmetric encryption.
If there are n participants in symmetric cryptography, then n⋅(n – 1)/2 keys must be exchanged in total. Thus, the number of keys increases quadratically with the number of participants. This, together with the need for a secure channel for key exchange, makes the key management very difficult with an increasing number of participants.
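The quadratic growth is easy to tabulate. In the following sketch the function name and the group sizes are chosen only for illustration.

```python
def symmetric_keys_needed(participants):
    """Number of pairwise secret keys, n*(n-1)/2, for n participants."""
    return participants * (participants - 1) // 2

for n in (10, 100, 1000):
    print(f"{n:5d} participants -> {symmetric_keys_needed(n):7d} secret keys")
```

Increasing the number of participants tenfold increases the number of keys roughly a hundredfold.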
9.1.6 Asymmetric Cryptography
Asymmetric cryptography was proposed in [DiHe76]. The communication partners use different keys for encryption and decryption. Text encrypted with one key can only be decrypted using the other key; therefore, it is called asymmetric cryptography. Each participant in such a system gets a key pair. One of the keys of the key pair is made publicly available and is called the public key, whereas the other key is called the private key and is kept private by the participant. It is not possible to obtain the private key from the knowledge of the public key, and therefore the public key can be distributed to other participants.
Asymmetric Key Pair
An asymmetric key pair is a pair of related keys for an asymmetric cryptographic technique, where the private key defines the private transformation and the public key defines the public transformation [ISO18033-1].
Private Key
A private key is the key of an entity’s asymmetric key pair which should only be used by that entity [ISO11770-1]. A private key is supposed to be kept secret.
Public Key
A public key is the key of an entity’s asymmetric key pair which can be made public [ISO11770-1]. For the public key to be made public, at least the following conditions must be met:
– It should not be possible to derive the private key from the public key by inverse operations.
– Even if someone can transform a chosen plaintext with the public key, the private key cannot be deduced from it.
In principle, the private key is always derived from the public key or vice versa. For this derivation to be possible for the key owner but sufficiently difficult for everyone else, algorithms based on computationally hard problems from complexity theory are chosen. Such functions are called one way trapdoor functions. One way functions have the property that it is easy to find the function value for a given input argument, but it is not possible, with a reasonable effort, to do the inverse transformation, i.e. to go back from the function value to the input argument. If there is a parameter, such as a private key, which makes the inverse transformation easy, the function is called a one way trapdoor function. Asymmetric cryptography can be used for multiple purposes, such as encryption and digital signatures.
9.1.6.1 Asymmetric Encryption
Asymmetric encryption can be used to encrypt information. In this case, the public key is used for encryption and the private key for decryption (Fig. 9.4). If participant A wants to send a confidential message to participant B:
1. A takes the public key (eB) of B, e.g. from a public directory
2. A encrypts the message with the public key of B.
As only B has the corresponding private key dB, only B can decrypt the message encrypted with the public key of B. Asymmetric encryption can be used for the encryption of secret (shared) keys. The secret key for symmetric encryption can be encrypted using the public key of the intended recipient. Only the intended recipient, with the corresponding private key, can decrypt and make use of the secret key.
[Figure: a public directory lists users and their public keys; the sender encrypts the plaintext with the public key of the receiver, and the receiver decrypts the ciphertext with its private key.]
Fig. 9.4: Asymmetric encryption.
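The two steps above can be sketched with the RSA algorithm, which is used here only as a familiar example of an asymmetric scheme and is not prescribed by the text. The primes are deliberately tiny textbook values that offer no security, the message is a small integer, and the variable names are hypothetical. The modular inverse via `pow(e, -1, phi)` requires Python 3.8 or later.

```python
# Key generation by participant B (toy parameters, NOT secure)
p, q = 61, 53
n = p * q                      # modulus, part of both keys
phi = (p - 1) * (q - 1)
e_B = 17                       # public exponent: (e_B, n) is B's public key
d_B = pow(e_B, -1, phi)        # private exponent: (d_B, n) is B's private key

def encrypt(message, public_key):
    """Encryption with the receiver's public key."""
    e, modulus = public_key
    return pow(message, e, modulus)

def decrypt(ciphertext, private_key):
    """Decryption with the receiver's private key."""
    d, modulus = private_key
    return pow(ciphertext, d, modulus)

# Step 1: A obtains B's public key, e.g. from a public directory
public_directory = {"B": (e_B, n)}

# Step 2: A encrypts the message (an integer smaller than n) with B's public key
plaintext = 65
ciphertext = encrypt(plaintext, public_directory["B"])

# Only B, holding d_B, can recover the plaintext
recovered = decrypt(ciphertext, (d_B, n))
print(ciphertext, recovered)
```

Computing d_B from e_B is easy only for someone who knows the factors p and q of n; this knowledge plays the role of the trapdoor.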
9.1.6.2 Digital Signatures
By virtue of asymmetric cryptography, encrypting a message with the private key and decrypting it with the public key makes it clearly evident who has sent the message (Fig. 9.5). Because this is equivalent to signing, it is known as a digital signature. Digital signature mechanisms can be used to provide entity authentication, data origin authentication, data integrity and non-repudiation services [Ker83].
Fig. 9.5: Digital signature.
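A minimal sketch of signing and verification follows. It assumes a toy RSA key pair with tiny textbook parameters (no security) and, as is common in practice but not stated in the text, signs a SHA-256 hash of the message instead of the message itself; all names are hypothetical.

```python
import hashlib

# Toy RSA key pair of the signer (NOT secure): n = 61 * 53
N = 3233
E = 17       # public exponent
D = 2753     # private exponent

def digest_mod_n(message):
    """Hash the message and reduce it to the range of the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def sign(message):
    """Signing: transform the message digest with the private key."""
    return pow(digest_mod_n(message), D, N)

def verify(message, signature):
    """Verification: transform the signature with the public key and compare."""
    return pow(signature, E, N) == digest_mod_n(message)

message = b"Transfer 100 EUR to account 12345"
signature = sign(message)

print(verify(message, signature))              # valid signature
print(verify(message, (signature + 1) % N))    # manipulated signature
```

Anyone holding the public key can run the verification, but only the owner of the private key can produce a signature that passes it.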
Combination of Encryption and Digital Signatures: A message can be encrypted and signed at the same time (Fig. 9.6). This can be done by signing (encrypting) the message using the private key of the sender and then additionally encrypting the message using the public key of the receiver. Thus the digitally signed message is transmitted as encrypted over a public network.
Fig. 9.6: Combination of encryption and digital signatures.
9.1.6.3 Man-in-the-Middle Attack In a man-in-the-middle attack, the attacker is located between two legitimate communication partners and intercepts their messages. In public key cryptography, the man-in-the-middle attack can be launched by the attacker E, to intercept the communication between the legitimate communication partners A and B, as follows (Fig. 9.7):
1. A sends his public key to B, which is intercepted by E and replaced with E's own public key.
2. B sends his public key to A, which is also intercepted by E and replaced with E's own public key.
3. Now when A sends an encrypted message to B, E intercepts the message and, since it is encrypted with E's public key, E decrypts it with its own private key to read its content. E then encrypts the message with B's public key and sends it to B. B decrypts the message with its private key.
Here B is unable to notice that the message was already disclosed. The same is true for communication from B to A. Considering that the public keys are publicly known, the man-in-the-middle attack has more significance in asymmetric cryptography than in symmetric cryptography.
Fig. 9.7: Man-in-the-middle attack (the attacker sits on the original connection between User A and User B).
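The direction from A to B of this attack can be simulated in a few lines. Toy RSA keys stand in for the public key system, and all names (`make_keypair`, `apply_key`, the primes) are hypothetical and chosen only for this sketch.

```python
def make_keypair(p: int, q: int, e: int = 17):
    """Return (public, private) toy RSA keys as (exponent, modulus) pairs."""
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))
    return (e, n), (d, n)

def apply_key(value: int, key) -> int:
    """Encrypt or decrypt, depending on which key is passed in."""
    exponent, modulus = key
    return pow(value, exponent, modulus)

pub_B, priv_B = make_keypair(61, 53)     # legitimate receiver B
pub_E, priv_E = make_keypair(89, 97)     # attacker E

# B's public key is replaced in transit: A receives E's key instead.
key_A_believes_is_Bs = pub_E

# A encrypts for "B"; E decrypts, reads, re-encrypts for the real B.
message = 65
intercepted = apply_key(message, key_A_believes_is_Bs)
read_by_E = apply_key(intercepted, priv_E)
forwarded = apply_key(read_by_E, pub_B)
received_by_B = apply_key(forwarded, priv_B)
```

B receives the correct message and has no indication that E has read it, which is why the authenticity of public keys has to be assured by other means.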
9.1.6.4 Certificate The problem of the man-in-the-middle attacker being able to replace the public key of a communication partner by his own can be addressed by using authenticated communication channels. It needs to be ensured that the public key is authentic, i.e., that the public key belongs to the claimed entity and that this binding can be relied upon. This can be ensured using different mechanisms [Rul93]. One of these mechanisms is the use of a public key certificate. A certificate, in short, is a proof of the ownership of a public key. It is someone’s public key signed by a trusted third party, called the certification authority (CA). Public key certificates are normally available in public key databases and contain information about the owner of the certificate, such as name, address etc., together with the public key. The signer of the certificate guarantees that the information about the owner is correct and that the public key really belongs to the owner. Certificates thus bind a public key to a subject. In practice, a certificate contains a lot of information. As an example, consider X.509 v3 certificates, where X.509 is a standard of the International Telecommunication Union (ITU-T) for public key infrastructure. An X.509 certificate binds a public key to a distinguished name, an email address etc. The standard format of an X.509 v3 certificate is shown in Figure 9.8.
Fig. 9.8: X.509 v3 certificate.
9.2 One Way Collision Resistant Hash Function 9.2.1 Characteristics A one way hash function H maps an input M of variable length to an output H(M) of fixed length (Fig. 9.9). This property has many uses, e.g., calculating a checksum for data transmission. Hash functions are standardized in ISO/IEC 10118.
Fig. 9.9: Principle of hash function (input, hash function, output).
The one way collision resistant hash function has the following properties:
– When an input M is given, it should be easy to calculate the output value H(M) = h.
– If a hash value h is given, it should be difficult to find a suitable input string M such that H(M) = h holds. This property is called pre-image resistance.
– If M and H(M) = h are given, it should be difficult to find another suitable M′, such that H(M′) = H(M) = h holds. This property is known as second pre-image resistance.
– It should be difficult to find any two different messages M and M′, such that H(M) = H(M′). This property is called collision resistance.
“It should be difficult to find a suitable input string or another input string” means that it should not be possible with a reasonable effort. The “reasonable effort” depends on the level of development of resources and the security requirements of the user. Such one way collision resistant hash functions are useful in the following cases:
– To compress a message to a shorter “fingerprint” which serves as input to an electronic signature process, or to confirm a given value without disclosing the value itself.
– As a pseudorandom number generator, since the hash must have random characteristics.
The output of the hash function is called the hash code. Sometimes, the hash code is also called the hash value or just the hash. It can be used, for example, as a modification or manipulation detection code, a checksum or a verification code. The hash code should not be confused with a message authentication code (MAC), whose goal is to authenticate a message and whose calculation requires a secret key. MACs are explained in Section 9.6. The hash function is key independent and publicly known. In the context of digital signatures and cryptographic algorithms, “hash function” is implicitly understood as “one way collision resistant hash function”. The probability of finding two messages with the same hash code depends on the length of the hash code.
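As a small illustration of the fixed output length and the key independence, the following sketch uses SHA-256 from Python's standard `hashlib` module. The choice of SHA-256 and the sample inputs are assumptions made for this example only.

```python
import hashlib

short_input = b"a"
long_input = b"a" * 1_000_000

# No key is involved: anyone can recompute these values.
short_hash = hashlib.sha256(short_input).hexdigest()
long_hash = hashlib.sha256(long_input).hexdigest()

# Both hash codes have the same length (256 bits = 64 hex digits),
# regardless of the length of the input.
lengths = (len(short_hash), len(long_hash))
```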
9.2.2 Security of a One Way Collision Resistant Hash Function As the hash value is shorter than the message, many messages map to the same hash value. Let h = H(M), where h is the hash value of a message M, generated using the hash function H. An adversary might want to find another message M′ which has the same hash value as M. This gives the adversary an opportunity to replace the original message with a fake message. It may even be enough for an attacker to find any pair of messages which have the same hash value (called finding a collision), in order to replace one message by the other. The effort required to do this is best understood through the mathematical problem called the birthday paradox. The probability that two messages have the same hash value depends on the length of the hash value. An important question therefore is: what should be the minimum length of the hash value? This is essentially determined by the birthday paradox. The name comes from the phenomenon that only a surprisingly small number of people in a room is required for the probability that two of them have the same birthday to be high (> 1/2). This is analogous to asking how many messages have to be hashed such that the probability for two of them to have the same hash value is greater than 1/2. For n input values, there are n·(n – 1)/2 possible pairs. When there are k output values, e.g., the number of possible birthdays or hash values, the probability that a given pair of inputs has the same output is 1/k. After taking about k/2 pairs, the probability of a match is roughly 50 %, i.e., if:
$$\frac{n}{2}(n-1) > \frac{k}{2} \tag{9.1}$$
then, for $n > \sqrt{k}$, the probability that the second input of a pair has the same output as the first input is high.
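In the following derivation, n denotes the number of possible output values and k the number of inputs (the roles of the two symbols are exchanged with respect to Eq. 9.1). The probability that there is no collision is given by:
$$\left(1-\frac{1}{n}\right)\cdot\left(1-\frac{2}{n}\right)\cdot\ldots\cdot\left(1-\frac{k-1}{n}\right) = \prod_{i=1}^{k-1}\left(1-\frac{i}{n}\right) \tag{9.2}$$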
When x is a small real number, then:
$$1 - x \approx e^{-x} \tag{9.3}$$
where this is obtained by choosing the first 2 terms of the series expansion:
$$e^{-x} = 1 - x + \frac{x^2}{2!} - \frac{x^3}{3!} + \ldots \tag{9.4}$$
Substituting in the above equation gives:
$$\prod_{i=1}^{k-1}\left(1-\frac{i}{n}\right) \approx \prod_{i=1}^{k-1} e^{-\frac{i}{n}} = e^{-\frac{k(k-1)}{2n}} \tag{9.5}$$
Thus the probability of at least one collision is $1 - e^{-\frac{k(k-1)}{2n}}$. Denoting this probability by p, it holds:
$$e^{-\frac{k(k-1)}{2n}} \approx 1 - p \tag{9.6}$$
which can be expressed as:
$$-\frac{k(k-1)}{2n} \approx \ln(1-p) \tag{9.7}$$
or:
$$k^2 - k \approx 2n \cdot \ln\frac{1}{1-p} \tag{9.8}$$
Ignoring the term – k gives:
$$k \approx \sqrt{2n \cdot \ln\frac{1}{1-p}} \tag{9.9}$$
and for p = 0.5:
$$k \approx \sqrt{2 \cdot \ln 2 \cdot n} \approx 1.17\sqrt{n} \tag{9.10}$$
In the case of n = 365, k is $1.17\sqrt{365} \approx 23$. In the case of n = 2^128 (length of the hash value = 128 bits), the complexity of a collision attack is only 1.17·2^64, which is too small. Therefore hash functions should generate hash values with at least 160 bits. Today hash value lengths of 192, 224 and 256 bits are used.
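The birthday bound can be checked numerically. The sketch below evaluates the exact product of Eq. 9.2 and the approximation of Eq. 9.9; the function names are chosen for this example only.

```python
import math

def collision_probability(k: int, n: int) -> float:
    """Exact probability of at least one collision among k inputs
    when there are n equally likely outputs (1 minus Eq. 9.2)."""
    p_no_collision = 1.0
    for i in range(1, k):
        p_no_collision *= 1 - i / n
    return 1 - p_no_collision

def inputs_needed(n: float, p: float = 0.5) -> float:
    """Approximate number of inputs for collision probability p (Eq. 9.9)."""
    return math.sqrt(2 * n * math.log(1 / (1 - p)))

p_22 = collision_probability(22, 365)    # just below 1/2
p_23 = collision_probability(23, 365)    # just above 1/2
k_birthday = inputs_needed(365)          # close to 22.5
k_128_bit = inputs_needed(2.0 ** 128)    # about 1.17 * 2**64
```

The exact calculation confirms that 23 people are the smallest group for which the collision probability exceeds 1/2.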
9.2.2.1 Random Oracle and Avalanche Effect Hash functions, like most cryptographic functions, should behave like a random function, i.e., for each input, the output appears to an observer like a random number. The output is independent of all previous inputs. For the same input, the same output is produced. For a new input, the output is totally new and uniformly and randomly distributed; this idealized behavior is known as the random oracle. In a random oracle, the output is independent of the input; however, it is still interesting to see how many bits in the output change when the input is changed. The probability Pd that d bits of the output change, if the input is changed, is given by:
$$P_d = \binom{n}{d} \cdot \frac{1}{2^n} \tag{9.11}$$
where n is the length of the output. With a high probability, about 50 % of the bits of the hash value are changed. With a very small probability, only a few bits are changed. The influence of the change of input bits on the output bits, even when a single bit is changed, is known as the Avalanche effect.
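The avalanche effect can be observed experimentally. The following sketch flips a single input bit and counts the changed output bits of SHA-256, and also evaluates Eq. 9.11. SHA-256, the sample message and the helper names are assumptions made for this example.

```python
import hashlib
import math

def flip_bit(data: bytes, index: int) -> bytes:
    """Return a copy of data with the bit at the given index inverted."""
    out = bytearray(data)
    out[index // 8] ^= 1 << (index % 8)
    return bytes(out)

def hamming_distance(a: bytes, b: bytes) -> int:
    """Number of bit positions in which a and b differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

def p_d(n: int, d: int) -> float:
    """Probability that exactly d of n output bits change (Eq. 9.11)."""
    return math.comb(n, d) / 2 ** n

message = b"Modern Communications Technology"
changed = flip_bit(message, 0)
d = hamming_distance(hashlib.sha256(message).digest(),
                     hashlib.sha256(changed).digest())

# Probability mass of Eq. 9.11 within 128 +/- 24 changed bits for n = 256.
mass_near_half = sum(p_d(256, i) for i in range(104, 153))
```

For a 256 bit output, `d` is expected to lie close to 128, and `mass_near_half` shows that almost all of the probability is concentrated around that value.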
9.2.3 Hash Functions in Practice Dedicated hash functions are specified in [ISO10118-3]. These include RIPEMD-160, SHA-1, SHA-256, SHA-384 and SHA-512, where SHA-1 has a hash code length of up to 160 bits, and SHA-256 and SHA-384 have hash code lengths of up to 256 and up to 384 bits respectively. SHA-512 has a fixed hash code length of 512 bits. WHIRLPOOL also has a hash code length of up to 512 bits.
Each of these dedicated hash functions makes use of a round function which is called iteratively. They also include a padding method and initialization values. In practice, the Merkle-Damgård construction [Mer79] is used to build collision resistant cryptographic hash functions. The hash function pads the message, breaks the message into blocks and processes the blocks one by one, applying a compression function to each input block and combining it with the output of the previous round. The Merkle-Damgård construction is shown in Figure 9.10. Let the one way compression function be denoted as f. The message is padded with 0s followed by the message length and split into blocks. In each round, the compression function takes a message block, compresses it and combines it with the output of the previous round (in the first round, the IV takes the place of the output of the previous round). Finally, a finalization function may be applied. This might be done to compress the output further or to achieve a better mixing of the output of the hash function. It is to be noted that the Merkle-Damgård construction is not resistant to the length extension attack (see Section 9.6.4).
Fig. 9.10: Merkle-Damgård construction (starting from the IV, the compression function f is applied to message blocks 1 to n and to the length padding, followed by the finalisation that outputs the hash).
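The iteration can be sketched as follows. This is a toy illustration only: the block size, the all-zero IV and the compression function (a truncated SHA-256 call standing in for f) are assumptions made for this example, and the optional finalisation step is omitted.

```python
import hashlib

BLOCK = 16                 # toy block size in bytes (assumption)
IV = bytes(16)             # toy initialization value (assumption)

def compress(state: bytes, block: bytes) -> bytes:
    """Stand-in for the one way compression function f."""
    return hashlib.sha256(state + block).digest()[:16]

def pad(message: bytes) -> bytes:
    """Pad with 0s followed by the message length (8 bytes)."""
    zeros = (-(len(message) + 8)) % BLOCK
    return message + bytes(zeros) + len(message).to_bytes(8, "big")

def md_hash(message: bytes) -> bytes:
    state = IV                                   # first "previous output"
    padded = pad(message)
    for i in range(0, len(padded), BLOCK):
        state = compress(state, padded[i:i + BLOCK])
    return state                                 # finalisation omitted

digest = md_hash(b"Modern Communications Technology")
```

Because the length is part of the padding, the messages `b"a"` and `b"a\x00"` are padded to different strings even though they differ only by a trailing zero byte.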
9.3 Block Cipher 9.3.1 Product Cipher A block is defined as a bit string of a defined length [ISO18033-1]. One speaks of a block cipher, when the encryption algorithm processes a block of plaintext, i.e., a bit string of a fixed length, to produce a block of cipher text. If the length of the plaintext is larger than one block, then it is split into multiple blocks and then the blocks of plaintext are encrypted to produce cipher text blocks (Fig. 9.11).
Fig. 9.11: Block encryption (plaintext and key enter the block cipher encryption, which outputs the ciphertext).
An n-bit block cipher produces a cipher text of n-bits from a plaintext of n-bits using a secret key. This means that if n > 1, the use of a block algorithm causes an inherent delay of collecting n bits before performing encryption or decryption.
9.3.2 Padding Sometimes it is necessary for the input string which is to be subjected to cryptographic operations to be divisible by a particular block length. If the length of the input string is not divisible by the block length, then padding is needed. Padding is defined as appending extra bits to a data string [ISO10118-1]. If the length of the message is not known to the receiver, padding can be done by appending a “1” followed by “0”s to the input string, or vice versa. It may happen that the message is extended by a whole block length. If the receiver knows the length of the message by other means, such as the header of the message, then it is enough to fill the message with “1”s or “0”s. In the ANSI standard X9.23, one can choose between bit oriented and octet oriented padding. The plaintext is extended to the desired length with any bits or octets. The last 8 bits are reserved and coded as follows:
– The most significant bit indicates whether it is bit or octet oriented padding.
– The remaining 7 bits indicate by how many bits or octets the plaintext has been extended.
Since the length information has to be there in any case, i.e., the last 8 bits are always used, it is possible that the message is extended by a complete block. In PKCS#7 padding, specified in RFC 5652, each padding byte carries the number of padding bytes. For example, if the plaintext is 14 bytes long and is split into blocks of 8 bytes, then 2 bytes are appended and each of them has the value 2. However, if the plaintext is 16 bytes long, then another block of 8 bytes is appended and the value of each byte in this block is set to 8. The padding bytes can be removed simply by looking at the value of the last byte and removing that many bytes from the data.
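The PKCS#7 rule described above fits in a few lines. The function names and the default block size of 8 bytes (taken from the example in the text) are choices made for this sketch.

```python
def pkcs7_pad(data: bytes, block_size: int = 8) -> bytes:
    """Append n bytes of value n, where 1 <= n <= block_size."""
    n = block_size - len(data) % block_size
    return data + bytes([n]) * n

def pkcs7_unpad(padded: bytes) -> bytes:
    """Read the last byte and strip that many bytes."""
    n = padded[-1]
    if n < 1 or n > len(padded) or padded[-n:] != bytes([n]) * n:
        raise ValueError("invalid padding")
    return padded[:-n]

padded_14 = pkcs7_pad(b"A" * 14)    # 2 bytes of value 2 are appended
padded_16 = pkcs7_pad(b"A" * 16)    # a whole block of value 8 is appended
```

The consistency check in `pkcs7_unpad` goes slightly beyond the description in the text, which only looks at the last byte.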
9.3.3 Block Ciphers in Practice 9.3.3.1 Advanced Encryption Standard Advanced Encryption Standard (AES) is a symmetric block cipher specified in [ISO18033-3]. AES processes data blocks of 128 bits and supports key lengths of 128, 192 and 256 bits. The respective algorithms are called AES-128, AES-192 and AES-256. The number of rounds Nr varies with the key length. Nr is 10 for key length 128, 12 for key length 192 and 14 for key length 256.
Fig. 9.12: AES encryption (initial round: AddRoundKey with RoundKey 0; Nr – 1 normal rounds: BytesSub, ShiftRows, MixColumn, AddRoundKey with RoundKeys 1, ..., Nr – 1; final round: BytesSub, ShiftRows, AddRoundKey with RoundKey Nr).
The data after each transformation is known as the state. Each normal round in AES (Fig. 9.12) has four byte oriented transformations: subBytes, shiftRows, mixColumns and addRoundKey. For convenience, the state in AES is represented as a two dimensional array of bytes. Since the data block size is 128 bits, the state is an array of 4×4 bytes. As the names suggest, subBytes performs byte substitution by a non-linear substitution function, implemented as S-boxes, shiftRows shifts the rows of the state array by different distances, and mixColumns combines the bytes of each column of the state array using an invertible linear transformation. mixColumns together with shiftRows provides diffusion. addRoundKey XORs the current state with the round key Ki, where 0 ≤ i ≤ Nr. The decryption process (Fig. 9.13) uses the inverse transformation corresponding to each transformation used in the encryption. These inverse transformations are denoted invsubBytes, invshiftRows and invmixColumns, and are applied in reverse order of the rounds.
Fig. 9.13: AES decryption (AddRoundKey combined with the inverse transformations Inv BytesSub, Inv ShiftRows and Inv MixColumn, arranged in an initial round, Nr – 1 normal rounds and a final round with RoundKeys 0, ..., Nr).
The new round keys K′i are derived from the keys Ki as follows: K′i = Ki for i = 0 or i = Nr, and K′i = invmixColumns(Ki) for 1 ≤ i ≤ Nr – 1.
9.3.3.2 Lightweight Cipher PRESENT For most practical applications, AES is an excellent and preferred choice. However, for certain constrained and embedded environments, such as RFID tags and sensor networks, AES is not suitable [Boetal07]. The PRESENT algorithm is a lightweight symmetric block cipher standardized in [ISO29192-2]. It processes 64 bit long data blocks and supports key lengths of 80 bits and 128 bits. The respective algorithms are called PRESENT-80 and PRESENT-128. The PRESENT cipher processes data in 31 rounds. A total of 32 round keys are derived from the given key. One key is used in each round and the 32nd key is used after the last round. The data after each transformation is known as the state in PRESENT. Each round has three steps: addRoundKey, sBoxLayer and pLayer. addRoundKey XORs the current round key with the current state. sBoxLayer is a non-linear substitution function which substitutes each 4-bit value with another 4-bit value; such 4-bit to 4-bit substitutions are applied 16 times in parallel in each round. The pLayer function permutes the bits of the state to scramble the output state of the current round. Finally, addRoundKey is performed on the state with the last (32nd) key after the last round to achieve a whitening effect. The update function produces the round key for use in the next round. Let Ki, 1 ≤ i ≤ 32, be the round keys. The scheme of PRESENT is shown in Figure 9.14.
Fig. 9.14: PRESENT encryption (plaintext; rounds of addRoundKey, sBoxLayer and pLayer, with the key register updated between rounds; final addRoundKey; ciphertext).
The decryption process of the PRESENT algorithm applies the inverse operations in the reverse order of the encryption. It uses addRoundKey, invSBoxLayer and invPLayer. Here addRoundKey is the same operation as in encryption. invSBoxLayer is the inverse of the substitution performed by the sBoxLayer, i.e., the 4-bit to 4-bit substitutions are reversed. invPLayer undoes the bit permutation performed by pLayer. The decryption process of the PRESENT algorithm is shown in Figure 9.15.
Fig. 9.15: PRESENT decryption (ciphertext; rounds of addRoundKey, invPLayer and invSBoxLayer, with the key register stepped back by invUpdate; final addRoundKey; plaintext).
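The round structure of PRESENT-80 can be sketched compactly. The S-box values, the bit permutation rule and the key update below are not given in the text; they are reproduced here from the published PRESENT specification as recalled for this sketch and should be checked against [ISO29192-2] before any real use. For simplicity, all 32 round keys are computed in advance instead of stepping the key register backwards with invUpdate during decryption.

```python
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]
INV_SBOX = [SBOX.index(x) for x in range(16)]
# pLayer: bit i of the state moves to bit position PERM[i].
PERM = [(16 * i) % 63 if i < 63 else 63 for i in range(64)]

def round_keys_80(key: int) -> list:
    """Derive K1..K32 from an 80-bit key held in a key register."""
    keys = []
    for counter in range(1, 33):
        keys.append(key >> 16)                               # leftmost 64 bits
        key = ((key & (2**19 - 1)) << 61) | (key >> 19)      # rotate left by 61
        key = (SBOX[key >> 76] << 76) | (key & (2**76 - 1))  # S-box on top 4 bits
        key ^= counter << 15                                 # add round counter
    return keys

def s_layer(state: int, box: list) -> int:
    """Apply a 4-bit substitution to all 16 nibbles in parallel."""
    return sum(box[(state >> (4 * i)) & 0xF] << (4 * i) for i in range(16))

def p_layer(state: int) -> int:
    return sum(((state >> i) & 1) << PERM[i] for i in range(64))

def inv_p_layer(state: int) -> int:
    return sum(((state >> PERM[i]) & 1) << i for i in range(64))

def encrypt(block: int, key: int) -> int:
    keys = round_keys_80(key)
    state = block
    for k in keys[:31]:                      # 31 rounds
        state = p_layer(s_layer(state ^ k, SBOX))
    return state ^ keys[31]                  # final whitening with K32

def decrypt(block: int, key: int) -> int:
    keys = round_keys_80(key)
    state = block ^ keys[31]
    for k in reversed(keys[:31]):            # inverse steps in reverse order
        state = s_layer(inv_p_layer(state), INV_SBOX) ^ k
    return state

ciphertext = encrypt(0x0123456789ABCDEF, 0x00112233445566778899)
plaintext = decrypt(ciphertext, 0x00112233445566778899)
```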
Other noteworthy lightweight symmetric block ciphers include HIGHT, DESXL, and CLEFIA. Out of these ciphers, CLEFIA is also standardized together with PRESENT cipher in [ISO29192-2].
9.4 Modes of Operations for Block Ciphers N-bit block oriented algorithms transform plaintext blocks of N bits into ciphertext blocks of N bits and vice versa. Five different modes of operation are defined in [ISO10116] for such algorithms. Some of these modes require the use of padding to fill the input string to a complete block length. The five modes of operation defined in [ISO10116] are:
– Electronic Codebook (ECB)
– Cipher Block Chaining (CBC)
– Cipher Feedback (CFB)
– Output Feedback (OFB)
– Counter (CTR)
9.4.1 Electronic Codebook (ECB) The ECB mode applies the encryption algorithm directly to each block. The plaintext is divided into multiple blocks (Fig. 9.16) and each block of N bits is encrypted independently of the other blocks. It is like looking into a dictionary and finding for each plaintext block a cipher text block and vice versa. The last part of the message may need to be padded to the block boundary. Due to this independence, the order of encryption of blocks can be changed as desired without affecting the decryption. In ECB mode, the same plaintext block under the same key always yields the same cipher text block. This property is critical, as repetitive code sequences often occur in databases and data communication protocols. It also offers the possibility of chosen plaintext attacks, where an attacker can gain pairs of plaintext and cipher text. Bit or burst errors in a cipher text block cause a corresponding erroneous plaintext block. This is based on the property of encryption algorithms that any change of the input block changes about 50 % of the bits in the output block. However, due to the independent encryption and decryption of blocks, the error is not propagated beyond the affected blocks. If the block boundaries are lost during transmission, e.g., by transmission errors, the synchronization between encryption and decryption is lost. All the subsequent blocks are no longer correctly decrypted until the correct block boundaries are restored.
Fig. 9.16: ECB mode (sender: each plaintext block M1, ..., Mn is encrypted by BA under key K into C1, ..., Cn; receiver: each block C1, ..., Cn is decrypted by BA-1 under key K).
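The codebook behaviour of ECB can be demonstrated with a toy block cipher. Python's standard library contains no block cipher, so the sketch below assumes a small 4-round Feistel network with 8-byte blocks as a stand-in; it is not AES or PRESENT and offers no security.

```python
import hashlib

HALF = 4
BLOCK = 2 * HALF          # toy block length in bytes (assumption)

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def round_fn(key: bytes, rnd: int, half: bytes) -> bytes:
    return hashlib.sha256(key + bytes([rnd]) + half).digest()[:HALF]

def encrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:HALF], block[HALF:]
    for rnd in range(4):
        left, right = right, xor(left, round_fn(key, rnd, right))
    return left + right

def decrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:HALF], block[HALF:]
    for rnd in reversed(range(4)):
        left, right = xor(right, round_fn(key, rnd, left)), left
    return left + right

def ecb(key: bytes, data: bytes, block_op) -> bytes:
    """Process every block independently of all other blocks."""
    return b"".join(block_op(key, data[i:i + BLOCK])
                    for i in range(0, len(data), BLOCK))

key = b"toy key"
plaintext = b"SAMEBLK!" + b"SAMEBLK!" + b"OTHERBLK"
ciphertext = ecb(key, plaintext, encrypt_block)
recovered = ecb(key, ciphertext, decrypt_block)
```

The first two cipher text blocks are identical because the first two plaintext blocks are identical, which is exactly the weakness described above.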
9.4.2 Cipher Block Chaining (CBC) In CBC mode the encryption of each block depends on the previous block. For the chaining mechanism to also work for the first block, an agreed upon starting value is required, called the Initialization Value or Initialization Vector (IV). This IV is used in the first step in place of the previous encrypted block. The IV does not need to be kept secret. Since the message is encrypted block wise, the last block may need to be padded to the block boundary (Fig. 9.17). Since chaining is used, each cipher text block depends on the IV and all the previous plaintext blocks. This means that the blocks cannot be rearranged or shuffled. In CBC mode, the same plaintext, under the same key and the same IV, always produces the same cipher text. This can be avoided by changing the key, the IV or the first block of the plaintext. The first block is normally changed by prepending a random value to the plaintext.
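Fig. 9.17: CBC mode (sender: M1, M2, M3 are chained, starting with the IV, and encrypted by BA under key K into C1, C2, C3; receiver: C1, C2, C3 are decrypted by BA-1 under key K and unchained, starting with the IV).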
If a bit or burst error occurs in a cipher text block, the decryption of that block and of the succeeding block is disrupted. If the ith cipher text block has errors, the ith plaintext block has approximately 50 % of its bits in error due to the Avalanche Effect [Meetal96]. In the (i + 1)th plaintext block, however, only those bits are disturbed that have been disturbed in the ith cipher text block. This means that a correct decryption of the last block does not guarantee that the entire message was decrypted correctly. If the block boundaries are lost during transmission, the synchronization between encryption and decryption is lost, i.e., all the subsequent blocks are no longer correctly decrypted until the correct block boundaries are restored.
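Chaining and the error propagation of CBC can be reproduced with a toy cipher. As Python's standard library has no block cipher, a small 4-round Feistel network with 8-byte blocks is assumed here as a stand-in; it is for illustration only.

```python
import hashlib

HALF = 4
BLOCK = 2 * HALF          # toy block length in bytes (assumption)

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def round_fn(key: bytes, rnd: int, half: bytes) -> bytes:
    return hashlib.sha256(key + bytes([rnd]) + half).digest()[:HALF]

def encrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:HALF], block[HALF:]
    for rnd in range(4):
        left, right = right, xor(left, round_fn(key, rnd, right))
    return left + right

def decrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:HALF], block[HALF:]
    for rnd in reversed(range(4)):
        left, right = xor(right, round_fn(key, rnd, left)), left
    return left + right

def cbc_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    prev, out = iv, []
    for i in range(0, len(plaintext), BLOCK):
        prev = encrypt_block(key, xor(plaintext[i:i + BLOCK], prev))
        out.append(prev)
    return b"".join(out)

def cbc_decrypt(key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    prev, out = iv, []
    for i in range(0, len(ciphertext), BLOCK):
        block = ciphertext[i:i + BLOCK]
        out.append(xor(decrypt_block(key, block), prev))
        prev = block
    return b"".join(out)

key, iv = b"toy key", bytes(range(8))
plaintext = b"SAMEBLK!" + b"SAMEBLK!" + b"OTHERBLK"
ciphertext = cbc_encrypt(key, iv, plaintext)

# Flip one bit in the first cipher text block and decrypt again.
damaged = bytes([ciphertext[0] ^ 0x01]) + ciphertext[1:]
garbled = cbc_decrypt(key, iv, damaged)
```

In `garbled`, the first block is unrecognisable, the second block differs from the plaintext in exactly the flipped bit position, and the third block is intact.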
9.4.3 Cipher Feedback (CFB) In CFB mode, j bits (1 ≤ j ≤ N) of plaintext are encrypted at a time. When j = 1, individual bits are encrypted. When j = N, it works as a chained block encryption, where each cipher text is a function of the current and all the preceding plaintexts. A typically used value for j is 8. A shift register of length N, initially loaded with the IV, is used in encryption and decryption. For every j bits to be encrypted, a complete block encryption is performed. Thus the method is not as efficient as the ECB or CBC mode. The encryption starts by encrypting the content of the shift register. The most significant j bits of the output of the encryption are XORed with the first j bits of the plaintext. The resulting j bits of the cipher text are transmitted. The shift register is then shifted k bits to the left and the rightmost j bits are filled with the transmitted cipher text. This process is repeated for all the j bit units of the plaintext. If k > j, the feedback is filled up with k – j “1” bits, which reduces the number of possible states of the shift register and reduces the effort of dictionary attacks. Therefore k = j is recommended. The decryption of a j bit unit of cipher text works in the same way as the encryption. The only difference is that the received j bit cipher text is XORed with the output of the encryption function to get the plaintext. It is to be noted that both the encryption and decryption sides use the encryption mode of the encryption algorithm (Fig. 9.18). Due to chaining, the cipher text is dependent on the whole of the preceding plaintext. Since the j bit encrypted units are concatenated together, their order cannot be changed. Using a different IV produces a different cipher text. The CFB mode can be used to convert a block cipher into a stream cipher.
If the bits of the j bit cipher text unit are disturbed in CFB mode, the decryption units will be incorrect until the faulty bits have been shifted out to the left at the receiver side.
Fig. 9.18: CFB mode.
If the boundaries of the j bit units are lost during transmission, the synchronization between encryption and decryption is lost, i.e., all the subsequent units are no longer correctly decrypted until the correct boundaries are reached. The error propagation continues for N bits afterwards. If an entire j bit unit is lost or inserted, the synchronization will automatically be re-established after N bits. CFB mode is self-synchronizing in the case j = 1: after receiving N error free bits following an error or bit slip, the decryption delivers the original plaintext.
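The shift register mechanism and the limited error propagation can be sketched for j = k = 8 bits. Since CFB only ever uses the encryption direction, a keyed hash (truncated SHA-256) is assumed below as a stand-in for the block encryption; the register length of 8 bytes is also an assumption of this example.

```python
import hashlib

N = 8    # shift register length in bytes (assumption); j = k = 8 bits

def block_encrypt(key: bytes, block: bytes) -> bytes:
    """Stand-in for the block encryption (only this direction is needed)."""
    return hashlib.sha256(key + block).digest()[:N]

def cfb_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    register, out = iv, bytearray()
    for p in plaintext:
        c = p ^ block_encrypt(key, register)[0]   # most significant 8 bits
        out.append(c)
        register = register[1:] + bytes([c])      # shift in the cipher text
    return bytes(out)

def cfb_decrypt(key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    register, out = iv, bytearray()
    for c in ciphertext:
        out.append(c ^ block_encrypt(key, register)[0])
        register = register[1:] + bytes([c])      # same feedback as sender
    return bytes(out)

key, iv = b"toy key", bytes(N)
plaintext = b"cipher feedback mode demo"
ciphertext = cfb_encrypt(key, iv, plaintext)

# Disturb one bit of the third cipher text byte.
damaged = ciphertext[:2] + bytes([ciphertext[2] ^ 0x01]) + ciphertext[3:]
received = cfb_decrypt(key, iv, damaged)
```

The disturbed unit is decrypted with the same bit error, the following N units are affected while the faulty byte travels through the shift register, and everything after that is decrypted correctly again.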
9.4.4 Output Feedback (OFB) The structure of the OFB mode is similar to that of the CFB mode. The main difference is that the output of the encryption function is fed back into the shift register instead of part of the cipher text. Another difference is that instead of the j bit parts, complete plaintext blocks of length N are used to produce N bit cipher text blocks. In OFB mode, the same plaintext, key and IV produce the same cipher text. Therefore a different IV should be used every time for a new plaintext. It is to be noted that both the encryption and decryption sides use the encryption mode of the encryption algorithm. In order to encrypt j bits, a whole block encryption is required; therefore, this mode is not as efficient as the ECB or the CBC mode if j < N. There is no error propagation when bits or groups of bits are disturbed in the cipher text. Any faulty bits in the cipher text give correspondingly erroneous plaintext bits, and this does not affect the following bits. The OFB mode is not self-synchronizing. If encryption and decryption are out of synchronization, the system must be reinitialized. Such a loss of synchronization can be caused by bit slips of the underlying transmission system. For the resynchronization another initialization vector should be used.
Fig. 9.19: OFB mode.
The leftmost j bits in Figure 9.19 can be used as the output of a pseudorandom number generator.
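A minimal sketch of OFB with full N-bit blocks follows. A keyed hash (truncated SHA-256) is assumed as a stand-in for the block encryption, which is sufficient here because OFB uses only the encryption direction; the block length of 8 bytes is likewise an assumption.

```python
import hashlib

N = 8    # block length in bytes (assumption)

def block_encrypt(key: bytes, block: bytes) -> bytes:
    """Stand-in for the block encryption (only this direction is needed)."""
    return hashlib.sha256(key + block).digest()[:N]

def key_stream(key: bytes, iv: bytes, length: int) -> bytes:
    """Feed the encryption output back into the register again and again."""
    register, stream = iv, b""
    while len(stream) < length:
        register = block_encrypt(key, register)
        stream += register
    return stream[:length]

def ofb_crypt(key: bytes, iv: bytes, data: bytes) -> bytes:
    """Encryption and decryption are the same XOR operation."""
    return bytes(d ^ s for d, s in zip(data, key_stream(key, iv, len(data))))

key, iv = b"toy key", b"\x01" * N
plaintext = b"output feedback mode demo"
ciphertext = ofb_crypt(key, iv, plaintext)

# A single disturbed cipher text bit disturbs only the same plaintext bit.
damaged = ciphertext[:5] + bytes([ciphertext[5] ^ 0x80]) + ciphertext[6:]
received = ofb_crypt(key, iv, damaged)
```

The key stream does not depend on the plaintext or cipher text, which is why errors do not propagate and also why a lost bit cannot be recovered without reinitialization.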
9.4.5 Counter Mode (CTR) CTR mode has gained a lot of attention recently, mainly in its application to asynchronous transfer mode (ATM) or disk encryption. A counter with a size equal to that of a plaintext block is used to encrypt each plaintext block. The counter value is encrypted and the output of the encryption function is XORed with the plaintext to produce the cipher text. The counter is normally initialized to a certain value known to both the sender and the receiver. The counter is incremented to encrypt or decrypt each block. There is no chaining in CTR mode. During decryption, the cipher text is XORed with the encrypted counter value (Fig. 9.20).
Fig. 9.20: CTR mode.
The starting counter value should be shared between the sender and the receiver to be able to perform encryption and decryption. There is no error propagation in CTR mode. Each cipher text is decrypted independently of any other block; therefore, error in one cipher text block does not impact the decryption of another cipher text block. If the block boundaries or complete blocks are lost during transmission, the synchronization between encryption and decryption is lost. All subsequent blocks are no longer correctly decrypted until the return to the correct block structure.
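The counter mechanism can be sketched as follows. A keyed hash (truncated SHA-256) is assumed as a stand-in for the block encryption, since only the encryption direction is used; the block length of 8 bytes and the starting counter value are assumptions of this example.

```python
import hashlib

N = 8    # block length in bytes (assumption)

def block_encrypt(key: bytes, block: bytes) -> bytes:
    """Stand-in for the block encryption (only this direction is needed)."""
    return hashlib.sha256(key + block).digest()[:N]

def ctr_crypt(key: bytes, start: int, data: bytes) -> bytes:
    """Encrypt or decrypt: XOR each block with the encrypted counter."""
    out = bytearray()
    for i in range(0, len(data), N):
        counter = (start + i // N) % 2 ** (8 * N)      # one value per block
        pad = block_encrypt(key, counter.to_bytes(N, "big"))
        out += bytes(d ^ p for d, p in zip(data[i:i + N], pad))
    return bytes(out)

key, start = b"toy key", 1000       # start value shared by both sides
plaintext = b"counter mode encrypts blocks independently"
ciphertext = ctr_crypt(key, start, plaintext)

# An error in the second block leaves all other blocks untouched.
damaged = ciphertext[:8] + bytes([ciphertext[8] ^ 0xFF]) + ciphertext[9:]
received = ctr_crypt(key, start, damaged)
```

Because each block depends only on its own counter value, blocks can be processed in any order or in parallel.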
9.4.6 Other Modes of Operation Other modes of operation exist aside from the five standard modes of operation discussed above. Amongst them, the most noteworthy is the cipher text stealing mode (XTS), which is used for disk storage and is standardized by IEEE Std 1619 [IEEE1619] and a NIST Special Publication [Dwo10]. Galois Counter Mode (GCM) and Galois Message Authentication Code (GMAC) are standardized by NIST in [Dwo07]. GCM supports authenticated encryption (i.e., authentication and confidentiality at the same time) and is widely adopted for its enhanced performance due to the ability of parallel processing. GMAC is an authentication only counterpart of GCM. Offset Codebook Mode (OCB) is another mode of operation that supports authenticated encryption. OCB has three different versions, OCB1 [IEEE802.11], OCB2 [ISO19772] and OCB3 [RFC7253]. Authenticated encryption is very interesting, because encryption itself does not prevent or recognize data modification or manipulation performed by third parties during transmission or storage. Cryptographically secure redundancy has to be added to recognize modifications or manipulations of plaintext or cipher text.
9.5 Bit Stream Ciphers Bit stream encryption is used when no delay should be caused by encryption and decryption. In this case only a single bit is encrypted at a time, just by an XOR operation (Fig. 9.21). Bit stream encryption is normally performed at the physical layer of the OSI reference model. One possible shortfall of bit stream encryption is that the decryption might return wrong data if the received bit stream is erroneous. Therefore the consideration of synchronization and error propagation properties is of particular importance in bit stream encryption. CFB mode and OFB mode are examples of bit stream encryption if j = 1 (see Section 9.4).
Fig. 9.21: Bit stream encryption (the algorithm derives a key stream from the key; the key stream is XORed with the plain text to give the cipher text).
In [ISO18033-4] two different types of stream ciphers are standardized: synchronous stream ciphers and self-synchronizing stream ciphers. Ideally the key stream used in the bit stream cipher is as long as the plaintext. This, however, creates problems with the sharing of the key stream. One solution is to use a pseudo random bit stream generator algorithm with a key, also called a seed, on both sides (Fig. 9.22).
Fig. 9.22: Stream cipher using bit stream generator (both sides run the same bit stream generation algorithm with key K; the resulting key stream is combined with the plain bit stream at the sender and with the cipher bit stream at the receiver).
OFB mode is an example of a synchronous stream cipher, CFB mode an example of a self-synchronizing stream cipher. Nevertheless, they are very inefficient for stream encryption, because only one bit of a cipher block is used for the XOR operation. In practice, stream ciphering uses algorithms that are based on linear feedback shift registers, supplemented by non-linear feedback or non-linear functions to increase the linear equivalent. Each generator which produces a periodic sequence can be replaced by a linear shift register, even if it is quite long. The length of such a replacing linear shift register is called the linear equivalent of the generator. If the linear equivalent is known, it is proven by Berlekamp-Massey [Meetal96] that an output sequence of twice the length of the linear equivalent is enough to predict the further output and also to calculate the preceding sequence. Some of the most widely used stream ciphers include RC4, SEAL, Enocoro and Trivium [ISO29192-3]. The generation of random and pseudo random numbers is explained in Section 9.8. The only encryption method which is provably secure, as proven by Claude Shannon, is stream or block encryption with a one-time key stream. One-time key stream means that the probability of the value of each bit is 0.5 and independent of all bits which have been generated before, and that the key stream or parts of it have not been used before. The practical problem is that this one-time key stream has to be shared between the authorized entities before encryption can happen. This type of encryption is called the one-time pad or Vernam cipher.
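Encryption with a one-time key stream reduces to a single XOR, as the following sketch shows. The key stream is drawn from the operating system's random source through Python's `secrets` module, which serves here as a practical approximation of the ideal one-time key stream; the function name is chosen for this example.

```python
import secrets

def xor_stream(data: bytes, key_stream: bytes) -> bytes:
    """XOR the data with a key stream that is at least as long."""
    if len(key_stream) < len(data):
        raise ValueError("key stream must be as long as the data")
    return bytes(d ^ k for d, k in zip(data, key_stream))

plaintext = b"one-time pad demo"
pad = secrets.token_bytes(len(plaintext))    # must never be reused
ciphertext = xor_stream(plaintext, pad)
recovered = xor_stream(ciphertext, pad)      # same operation decrypts
```

The sketch also makes the practical problem visible: `pad` is as long as the message and has to reach the receiver over a secure channel before decryption is possible.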
9.6 Message Authentication Codes
9.6.1 Generation
Message Authentication Code (MAC) algorithms are data integrity mechanisms that compute a short string (the MAC) as a complex function of every bit of the data and of a secret key. Their main security property is unforgeability: someone who does not know the secret key should not be able to predict the MAC on any new data string. A MAC is also used to verify data integrity in situations where there is a risk of intentional or unintentional changes to the data during storage or transmission. Data integrity is defined in [ISO7498-2] as the property that the data has not been altered or destroyed in an unauthorized manner. A MAC algorithm is defined as an algorithm for computing a function which maps a string of input bits and a secret key to another string of bits of a fixed length [ISO9797-1]. It is similar to a one-way collision resistant hash function (see Section 9.2), but it uses a shared key, so MAC functions belong to symmetric cryptography.
The MAC function must satisfy the following two properties:
– For any key and input string, the function can be efficiently computed.
– For any given key, with no prior knowledge of the key, it is computationally infeasible to find the function value for a given input string, even if a set of input strings with their corresponding function values is known, where the value of the ith input string might have been chosen after observing the function values for the first i – 1 input strings (where i is an integer such that i > 1).
ISO has three different standards for MAC generation mechanisms. [ISO9797-1] defines the mechanisms for MACs using block ciphers, [ISO9797-2] defines the mechanisms for MACs based on dedicated hash functions and [ISO9797-3] defines MAC mechanisms based on a universal hash function. MAC algorithms support authentication of data origin. Authentication of data origin means [ISO9797-2] that a MAC can provide assurance to the receiver that a message has been originated by an entity in possession of the shared secret key. In contrast, digital signatures can provide assurance to third parties that a message has been originated by an entity in possession of the private key. The MAC or MAC tag is calculated on the data using a secret symmetric key and appended to the data (Fig. 9.23).
Message Authentication Codes | 389
Fig. 9.23: MAC mechanism.
If one wants to verify the order and completeness of message sequences or to prevent replay attacks, a time varying parameter such as a sequence number or a timestamp is appended to the message before calculating a MAC tag. The receiver receives the possibly modified message M′ and a possibly modified MAC′. The receiver then calculates a MAC′′ on the received message M′ and compares it with the received MAC′. If MAC′′ is equal to MAC′, it is concluded that the message is received without any errors (Fig. 9.24). The receiver also assumes that the message originated from the sender with whom he shares a secret key.
Fig. 9.24: MAC verification at the receiver.
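The generation and verification steps can be sketched with Python's standard `hmac` module. The choice of HMAC with SHA-256, the 8-byte encoding of the sequence number and the helper names `make_tag` and `verify_tag` are assumptions made for illustration; the text itself does not prescribe them.

```python
import hashlib
import hmac


def make_tag(key: bytes, message: bytes, seq: int) -> bytes:
    """MAC over the message with the sequence number appended."""
    data = message + seq.to_bytes(8, "big")
    return hmac.new(key, data, hashlib.sha256).digest()


def verify_tag(key: bytes, message: bytes, seq: int, tag: bytes, last_seq: int) -> bool:
    """Recompute MAC'' on the received data and compare it with the received MAC'."""
    if seq <= last_seq:                 # stale sequence number: treat as replay
        return False
    expected = make_tag(key, message, seq)
    return hmac.compare_digest(expected, tag)   # constant-time comparison


key = b"shared secret key"              # illustrative key
tag = make_tag(key, b"transfer 100", seq=7)

ok = verify_tag(key, b"transfer 100", 7, tag, last_seq=6)
tampered = verify_tag(key, b"transfer 900", 7, tag, last_seq=6)
replayed = verify_tag(key, b"transfer 100", 7, tag, last_seq=7)
```

Here `ok` is true, while the modified message and the replayed sequence number are both rejected.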
9.6.2 MAC generation using symmetric block cipher
In [ISO9797-1], six algorithms are specified for the generation of an m-bit MAC using a symmetric block cipher with N-bit blocks, such as DES, 3DES or AES. This mechanism is also known as CBC-MAC, where CBC stands for the Cipher Block Chaining mode specified in [ISO10116]. The message D is divided into N-bit blocks D1, D2, …, Dk, where the last block might need to be padded. The padding is used only temporarily for calculating the MAC tag at the sender and receiver sides and does not need to be transmitted or stored along with the message. From the N-bit output, the leftmost m bits are chosen as the MAC tag. Here, m should be as large as possible, preferably equal to N. If N is not large enough to protect against collision attacks (see Section 9.6.4.4), the MAC should be extended, e.g. by concatenation with the MAC symmetrically encrypted with another key.
If the same algorithm is used for MAC computation and for encryption of the message, it is advisable to choose separate keys for each of the security mechanisms.
9.6.3 MAC Generation Using Dedicated Hash Function
In [ISO9797-2] three MAC algorithms are standardized that use a one-way collision-resistant hash function and a shared secret key. All three dedicated hash functions are chosen from [ISO10118-3]. The strength of the data integrity and message authentication mechanisms depends on the length and secrecy of the key, on the length and strength of the hash function, on the length of the MAC, and on the specific mechanism. The first of the three mechanisms specified in [ISO9797-2] is called MDx-MAC. It makes a minor adjustment to the round function by adding a key to the additive constants in the round function. The second mechanism is known as HMAC and, in contrast to the first mechanism, calls the complete hash function twice. The third mechanism is a variant of the first one that considers only short strings as input.
9.6.4 Security Aspects
9.6.4.1 Length Extension Attack
The length extension attack (Fig. 9.25) can be launched on MAC algorithms that use hash values of the type H(key || message), i.e., MAC constructions in which the key is prefixed to the message and the result is hashed with an iterative hash function, e.g., MD5, SHA-1 or SHA-256.
Fig. 9.25: Length extension attack.
Let the message M sent from legitimate user A to B be a sequence of blocks x = (x1, x2, x3, ..., xn). A computes the authentication tag on M as:
$$m = \mathrm{MAC}_k(x) = H(k \,\|\, x_1, x_2, x_3, \ldots, x_n) \tag{9.12}$$
The problem with this construction is that the MAC for the message x′ = (x1, x2, x3, ..., xn, xn+1) can be constructed from m by appending a block xn+1 to the message x, without knowledge of the secret key. Thus B accepts x′ as a valid message, even though only x was authenticated by A. The attack is possible because calculating the MAC over the additional message block xn+1 only requires the output of the previous hash, which is equal to A’s m, and xn+1 as input, but not the secret key. A hash based MAC resistant to the length extension attack is the hashed MAC (HMAC) construction, which uses two hashes, the inner and the outer hash. The key (K) is first XORed with an inner pad (ipad) and the result (K ⊕ ipad) is prepended as a block to the message blocks (x1, x2, ..., xn). The result is hashed as:
$$\mathit{hipadx} = H(K \oplus \mathit{ipad} \,\|\, x_1 \,\|\, x_2 \,\|\, \ldots \,\|\, x_n) \tag{9.13}$$
Now the key is XORed with the outer pad (opad) and pre-pended to the hipadx as calculated above. The result is hashed to produce the HMAC of message x as:
$$\mathrm{HMAC}(x) = H(K \oplus \mathit{opad} \,\|\, H(K \oplus \mathit{ipad} \,\|\, x_1 \,\|\, x_2 \,\|\, \ldots \,\|\, x_n)) \tag{9.14}$$
The ipad and opad are chosen as constant bit patterns: opad = 0x5c5c...5c and ipad = 0x3636...36 of lengths equal to one block length. The length extension attack can be also applied to hash functions (see Section 9.2.3).
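The inner and outer hash of (9.13) and (9.14) can be written out directly. The sketch below assumes SHA-256 as the hash function H, with a block length of 64 bytes. Keys longer than one block are hashed first and shorter keys are padded with zero bytes, which follows the HMAC specification rather than anything stated in the text above. The function name `hmac_sha256` is chosen for illustration.

```python
import hashlib
import hmac


def hmac_sha256(key: bytes, message: bytes) -> bytes:
    """HMAC as H(K xor opad || H(K xor ipad || message)), with H = SHA-256."""
    block_size = 64                                  # SHA-256 block length in bytes
    if len(key) > block_size:
        key = hashlib.sha256(key).digest()           # long keys are hashed first
    key = key.ljust(block_size, b"\x00")             # pad the key to one block
    k_ipad = bytes(b ^ 0x36 for b in key)            # K xor ipad
    k_opad = bytes(b ^ 0x5C for b in key)            # K xor opad
    inner = hashlib.sha256(k_ipad + message).digest()    # inner hash, cf. (9.13)
    return hashlib.sha256(k_opad + inner).digest()       # outer hash, cf. (9.14)


tag = hmac_sha256(b"secret key", b"message to authenticate")
reference = hmac.new(b"secret key", b"message to authenticate", hashlib.sha256).digest()
```

The hand-written tag agrees with the one produced by the standard library. Because the outer hash processes the key again, an attacker who only knows `tag` cannot continue the hash computation as in the length extension attack.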
9.6.4.2 Forgery Attack
In forgery attacks, the MAC for a message M is predicted by an attacker without knowledge of the key. If the attacker can predict the MAC for at least one message, without being able to choose which one, the attack is called an “existential forgery”. If, however, the prediction is correct for a specific message of the attacker's choice, the attack is called a “selective forgery”. Additionally, attacks are distinguished by whether the attacker can verify if the attack was successful or not (verifiable/non-verifiable attack).
9.6.4.3 Key Recovery Attack
In a key recovery attack, the attacker acquires knowledge of the key. The security of a MAC depends on the length k of the secret key, the length n of the MAC and the length j of the chaining variable, i.e., the output length of the iterative compression function. In a brute force attack, the attacker needs on average $2^{k-1}$ trials. Additionally, the attacker requires knowledge of about k/n message/MAC pairs in order to verify that he has found the correct key, as there are $2^k/2^n$ keys that yield the same MAC for a given message. In practice, j ≤ k ≤ n, or even j = k = n.
9.6.4.4 Collision Attack
Since the MAC generation function reduces the m-bit input to an n-bit MAC, different messages have the same MAC. Therefore the collision attacks described in Section 9.2 can be performed by an attacker who is able to calculate as many MACs as necessary to find a collision. For this reason, the length of the MAC should be at least twice the requested security level.
9.7 Digital Signatures
9.7.1 Digital Signatures with Appendix
For digital signatures with appendix, a hash value is computed on a message, and the hash value is signed and transmitted with the message (Fig. 9.26). For verification, the receiver performs the inverse signature operation on the received signed hash value using the sender’s public key. This yields the hash value calculated by the sender, provided that neither the message nor the signed hash value has been modified. The result is compared with the hash value of the received message calculated by the receiver. If both values are equal, the verification is successful. Digital signature schemes with appendix are standardized in ISO/IEC 14888 [ISO14888-1]. Digital signatures with appendix are the most common use of digital signatures.
Fig. 9.26: Signing a message: a) Generation; b) Verification.
Digital Signatures | 393
Fig. 9.27: Verifying a Signed Message: a) Generation; b) Verification.
9.7.2 Digital Signatures with Message Recovery
In the case of a digital signature with message recovery, the message or part of it is contained in the signature and not transmitted in plaintext. The verification process reveals all or part of the message (Fig. 9.27). The message is supplemented by redundancy and then transformed by the signature algorithm, signed using the private key and transmitted. At the receiver, the original message is not needed for verification because it can be obtained through the inverse transformation. It is recommended that at least 50 % of the input of the signature generation consists of redundancy, which is generated under cryptographic security aspects. Therefore digital signatures with message recovery can only be applied to very short messages, because the complete input of the signature algorithm has to be smaller than the maximum input length of the signature algorithm. Variants of digital signatures with recovery support one recoverable part of the message and one part protected by a hash value, which is also part of the digital signature. Digital signature schemes with message recovery are standardized in [ISO9796-2] and [ISO9796-3].
9.7.3 RSA
9.7.3.1 Introduction
The best known asymmetric encryption method is the RSA algorithm, named after its inventors Ronald Rivest, Adi Shamir and Leonard Adleman [Rietal78]. RSA based digital signature schemes with appendix are standardized in [ISO14888-2]. The security of RSA is based on the factorization problem, i.e., the difficulty of decomposing a large number n into its prime factors. The encryption (i.e., the mathematical transformation E) for conversion of the plaintext M into the ciphertext C is given as follows:
$$C = E(M) = M^e \bmod n \tag{9.15}$$
The inverse transformation D to get the plaintext M back from the cipher text C is given by:
$$M = D(C) = C^d \bmod n \tag{9.16}$$
The message M is represented by a positive integer between 0 and n – 1. Because of modulo calculation, C is also between 0 and n – 1. Messages that are larger than n – 1 in their numerical representation shall be divided into blocks. The RSA algorithm is therefore also a block algorithm. In practice, M is the hash value of a message. It is a commutative process, i.e. M = D(E(M)) and M = E(D(M)). It is therefore suitable both for ensuring the confidentiality and digital signatures. The digital signature S (i.e. mathematical transformation D for conversion of the plaintext M into the signature S) is given as follows (Fig. 9.28):
$$S = D(M) = M^d \bmod n \tag{9.17}$$
The inverse transformation E to get the plaintext M back from the signature is given by:
$$M = E(S) = S^e \bmod n \tag{9.18}$$
RSA can be used to generate digital signatures with appendix or with message recovery.
Fig. 9.28: RSA method.
The public key is (e, n) and the secret key is (d, n). n is the product of two large prime numbers p and q, such that:
$$n = p \cdot q \tag{9.19}$$
At the time of invention, the inventors of this method proposed to use hundred-digit primes for p and q. Nowadays, a length of 1,024–2,048 bits is mostly chosen for n.
Example of RSA based encryption:
Let’s choose p = 11, q = 19 and e = 17. Thus n = p·q = 209 and d = 53. If the message is M = '101' (binary), i.e. M = 5, then the corresponding ciphertext is $C = 5^{17} \bmod 209 = 80$. The inverse transformation gives the plaintext back, i.e. $M = 80^{53} \bmod 209 = 5$.
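The numbers of this example can be checked with a few lines of Python using the built-in three-argument `pow` for modular exponentiation. The signature part at the end applies (9.17) and (9.18) to the same toy key and is added here for illustration only; such small parameters offer no security.

```python
p, q, e, d = 11, 19, 17, 53
n = p * q                      # 209
phi = (p - 1) * (q - 1)        # 180

key_is_consistent = (e * d) % phi == 1   # d is the inverse of e modulo phi

M = 0b101                      # the message '101', i.e. 5
C = pow(M, e, n)               # encryption, cf. (9.15)
M_decrypted = pow(C, d, n)     # decryption, cf. (9.16)

S = pow(M, d, n)               # signature with the private key, cf. (9.17)
M_recovered = pow(S, e, n)     # verification with the public key, cf. (9.18)
```

Running this gives C = 80 and returns M = 5 both from the ciphertext and from the signature.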
9.7.3.2 Generation of RSA key system
An RSA key system consists of:
– e and n as the public key, and
– d and n as the secret key, where n is also a component of the public key.
The main task is to find a very large random number n, such that n = p⋅q (see (9.19)), where p and q are prime numbers, and p and q should be slightly different in length. This requirement ensures that p and q cannot be determined by simply testing all prime numbers around $n^{1/2}$. Generation and testing of prime numbers are specified in ISO/IEC 18032. After the generation of p and q, a number e is determined so that:
$$\gcd(e, (p-1)\cdot(q-1)) = 1 \tag{9.20}$$
Then d is determined so that:
$$e \cdot d \bmod ((p-1)\cdot(q-1)) = 1 \tag{9.21}$$
For the calculation of d, the Extended Euclidean algorithm is used. The length of the secret key d of an RSA key system must be at least N/3, where N is the length of n. Since the security of the process depends on d and on the difficulty of decomposing n, e is chosen so that the arithmetic complexity is minimized. e can even be chosen as a constant, provided that e is large enough that certain chosen-plaintext attacks are not possible. Of course, it must be ensured that e and (p – 1)⋅(q – 1) have no common divisor. Also d and n are relatively prime to each other. If e is a prime number, it is sufficient to check that e does not divide (p – 1)⋅(q – 1). In practice, $e = 2^{16} + 1$ is used (the 4th Fermat number), which is a prime number. This e has a low Hamming weight of 2, which allows very fast computation of the encryption operation E when the square-and-multiply algorithm is used [Meetal96]. There are many optimizations to compute the modular exponentiation of C and S.
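Both algorithms named above can be sketched briefly. The function names are chosen for illustration. The extended Euclidean algorithm is written in its common iterative form, and the square-and-multiply routine scans the exponent bits from left to right, so that an exponent with Hamming weight 2 costs only one extra multiplication beyond the squarings.

```python
def extended_gcd(a, b):
    """Return (g, x, y) such that a*x + b*y == g == gcd(a, b)."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        quotient = old_r // r
        old_r, r = r, old_r - quotient * r
        old_x, x = x, old_x - quotient * x
        old_y, y = y, old_y - quotient * y
    return old_r, old_x, old_y


def private_exponent(e, p, q):
    """Solve e*d mod (p-1)(q-1) = 1 for d, cf. (9.21)."""
    phi = (p - 1) * (q - 1)
    g, x, _ = extended_gcd(e, phi)
    if g != 1:
        raise ValueError("e and (p-1)(q-1) must have no common divisor")
    return x % phi


def square_and_multiply(base, exponent, modulus):
    """Left-to-right binary modular exponentiation."""
    result = 1
    for bit in bin(exponent)[2:]:
        result = (result * result) % modulus      # square for every exponent bit
        if bit == "1":
            result = (result * base) % modulus    # multiply only for 1-bits
    return result


d = private_exponent(17, 11, 19)            # toy key from the example: d = 53
c = square_and_multiply(5, 17, 209)         # 5^17 mod 209 = 80
```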
Mathematical Background
n, e and d are constructed such that:
$$e \cdot d \bmod ((p-1)\cdot(q-1)) = 1 \tag{9.22}$$
and:
$$\varphi(n) = (p-1)\cdot(q-1) \tag{9.23}$$
where p and q are prime numbers. φ(n) is the Euler φ-function, which counts how many numbers smaller than n are relatively prime to n.
Then:
$$e \cdot d = t \cdot \varphi(n) + 1, \quad t \ge 1 \tag{9.24}$$
and by Euler’s theorem:
$$M = C^d \bmod n = (M^e \bmod n)^d \bmod n = (M^e)^d \bmod n = M^{t \cdot \varphi(n) + 1} \bmod n = (M^{\varphi(n)})^t \cdot M \bmod n = 1^t \cdot M \bmod n = M \tag{9.25}$$
For encryption, it must be ensured that the message is large enough so that no attacks by repetition or division are possible. If the message is small, it should be padded before encryption.
9.7.4 El-Gamal
9.7.4.1 Introduction
This method was published in 1985 by El-Gamal. It is used for authentication of messages, but can also be used for key agreement. Its security is based on the discrete logarithm problem. Two public system parameters are selected: a large prime p and an element g, where $g^i \bmod p$, i = 0, ..., p – 2, generates all the numbers j = 1, ..., p – 1, and $g^0 = g^{p-1} = 1 \bmod p$; g is a so-called primitive element in the Galois field GF(p), which is spanned by p. The sender A chooses a random number r < p and calculates:
$$K = g^r \bmod p \tag{9.27}$$
The public key of A consists of K, p and g, where p and g may be the same in a large group of participants, and r is the private key of A. (9.27) describes the discrete logarithm problem: for given g, p and r it is easy to calculate K, but the discrete logarithm has to be calculated to get r if g, p and K are given. The calculation of the discrete logarithm is computationally infeasible if p is large enough. The required size of p for the discrete logarithm problem is similar to that of n in the RSA cryptosystem: today at least 1,024 bits, better 2,048 bits.
9.7.4.2 Authentication of Message
H is a one-way collision resistant hash function. As usual, the message itself is not signed, but rather the hash value H(M), which is calculated on the message.
If A wants to sign H(M):
1. It chooses a random number R (a one-time private message key), where R < p, and calculates:
$$X = g^R \bmod p \tag{9.28}$$
2. Then it solves the following equation to find Y (using the Euclidean algorithm):
$$H(M) = (r \cdot X + R \cdot Y) \bmod (p-1) \tag{9.29}$$
3. The pair (X, Y) is the signature on message M.
9.7.4.3 Verification of Message
The receiver B receives M', X', Y' and wants to verify that the message M' really came from A. The receiver (Fig. 9.29):
1. Calculates:
$$Z = K^{X'} \cdot X'^{\,Y'} \bmod p \tag{9.30}$$
2. Calculates:
$$Z' = g^{H(M')} \bmod p \tag{9.31}$$
3. and compares whether Z = Z'.
If yes, the message M' is authentic and came from A.
Fig. 9.29: El-Gamal method.
Mathematical Background
Since g is a primitive element of GF(p), it holds that $g^i = g^j$ iff i = j mod (p – 1). The receiver calculates:
$$Z' = g^{H(M')} \bmod p = g^{rX'} \cdot g^{RY'} \bmod p \tag{9.32}$$
If $K = g^r \bmod p$ and $X' = g^R \bmod p$, then (by application of Euler’s theorem):
$$Z' = K^{X'} \cdot X'^{\,Y'} \bmod p \tag{9.33}$$
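The signing and verification steps of the El-Gamal method can be traced with toy numbers. In the sketch below, the parameters p = 467 and g = 2, the private key r = 127 and the one-time key R = 213 are illustrative assumptions, and the hash value is simply SHA-256 reduced modulo p – 1. Equation (9.29) is solved as Y = (H(M) – r·X)·R⁻¹ mod (p – 1), which requires R to be relatively prime to p – 1; this condition is an addition needed for the inverse to exist. The call `pow(R, -1, m)` needs Python 3.8 or later.

```python
import hashlib
from math import gcd

p, g = 467, 2            # toy system parameters (illustrative, far too small)
r = 127                  # private key of A (illustrative)
K = pow(g, r, p)         # public key, cf. (9.27)


def H(message: bytes) -> int:
    """Toy hash value: SHA-256 reduced modulo p - 1."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % (p - 1)


def sign(message: bytes, R: int):
    """Return the signature (X, Y) for a one-time key R."""
    if gcd(R, p - 1) != 1:
        raise ValueError("R must be invertible modulo p - 1")
    X = pow(g, R, p)                                            # (9.28)
    Y = ((H(message) - r * X) * pow(R, -1, p - 1)) % (p - 1)    # solves (9.29)
    return X, Y


def verify(message: bytes, X: int, Y: int) -> bool:
    if not 0 < X < p:
        return False
    Z = (pow(K, X, p) * pow(X, Y, p)) % p                       # (9.30)
    Z_prime = pow(g, H(message), p)                             # (9.31)
    return Z == Z_prime


X, Y = sign(b"example message", R=213)
valid = verify(b"example message", X, Y)
```

A fresh R must be used for every signature; reusing it would allow the private key r to be computed from two signatures.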
9.7.5 Digital Signature Algorithm (DSA)
9.7.5.1 Introduction
A variant of the El-Gamal method, called the Digital Signature Algorithm (DSA) and proposed by NIST in a FIPS publication as the Digital Signature Standard (DSS), is also based on the computational intractability of the discrete logarithm problem. It works as follows:
– The sender of a message selects a large prime p, where $2^{1023} < p < 2^{1024}$, and p – 1 should have a prime factor q such that $2^{159} < q < 2^{160}$ (from year 2005 onwards, p should be at least 1280 bits, even better 2048 bits long).
– The sender also selects an element g with $g = h^{(p-1)/q} \bmod p$, where h is an integer with 0 < h < p – 1 and $h^{(p-1)/q} \bmod p > 1$.
– Then the sender chooses another random number x, 0 < x < q, and computes $y = g^x \bmod p$.
– M is the message that should be signed and transmitted.
The public system parameters p, q and g can be applied jointly for a group of users. The individual public key of the originator is y, the private key is x. These precalculations are independent of the message to be signed.
9.7.5.2 Authentication of Message
If A wants to sign the message M:
1. It chooses a random number k, 0 < k < q, and calculates:
$$r = (g^k \bmod p) \bmod q \tag{9.34}$$
2. Then, using the Euclidean algorithm, it calculates:
$$s = (k^{-1}(H(M) + x \cdot r)) \bmod q \tag{9.35}$$
where $k^{-1}$ is the multiplicative inverse of k mod q, such that $k^{-1} \cdot k \bmod q = 1$ and $0 < k^{-1} < q$.
3. The pair (r, s) is the signature on message M.
9.7.5.3 Verification of Message
The receiver B receives M′, r′, s′ and wants to verify that the message M′ actually came from A. From r′ and s′, the content of the message cannot be derived. The recipient can only verify whether r′ and s′ fit M′ (r′ and s′ are an authenticator for the message):
1. B verifies that 0 < r′ < q and 0 < s′ < q,
2. and then calculates, using the Euclidean algorithm:
$$w = (s')^{-1} \bmod q \tag{9.36}$$
3. Additionally, B calculates:
$$u_1 = (H(M') \cdot w) \bmod q, \quad u_2 = (r' \cdot w) \bmod q, \quad v = ((g^{u_1} \cdot y^{u_2}) \bmod p) \bmod q \tag{9.37}$$
4. and compares v and r′. In case of equality, the message is considered to be authentic.
The advantage of DSA compared to the original El-Gamal method is the smaller length of the digital signature, which can have the same size as the hash value. The disadvantage is that two types of arithmetic operations are needed, in GF(p) and GF(q).
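The DSA steps can be followed with very small numbers. The sketch below assumes the toy parameters p = 23 and q = 11 (q divides p – 1), h = 2 and the private key x = 7; all of them are illustrative and far below the sizes required in practice. SHA-256 stands in for H, and the random number k is redrawn whenever r or s would be zero.

```python
import hashlib
import secrets

p, q = 23, 11                      # toy primes with q | (p - 1)
h = 2
g = pow(h, (p - 1) // q, p)        # g = h^((p-1)/q) mod p = 4
x = 7                              # private key (illustrative)
y = pow(g, x, p)                   # public key


def H(message: bytes) -> int:
    return int.from_bytes(hashlib.sha256(message).digest(), "big")


def sign(message: bytes):
    while True:
        k = secrets.randbelow(q - 1) + 1                   # 0 < k < q
        r = pow(g, k, p) % q                               # (g^k mod p) mod q
        s = pow(k, -1, q) * (H(message) + x * r) % q       # cf. (9.35)
        if r != 0 and s != 0:                              # otherwise pick a new k
            return r, s


def verify(message: bytes, r: int, s: int) -> bool:
    if not (0 < r < q and 0 < s < q):
        return False
    w = pow(s, -1, q)                                      # cf. (9.36)
    u1 = (H(message) * w) % q
    u2 = (r * w) % q
    v = (pow(g, u1, p) * pow(y, u2, p)) % p % q            # cf. (9.37)
    return v == r


r, s = sign(b"example message")
valid = verify(b"example message", r, s)
```

Verification succeeds because g has order q, so that g^{u1}·y^{u2} = g^{(H(M) + x·r)·w} = g^k mod p.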
9.7.6 Elliptic curve digital signature algorithm (ECDSA)
9.7.6.1 Elliptic Curves
An elliptic curve is a set of points satisfying a mathematical equation of the form (called the Weierstrass form [Mil86]):
$$y^2 = x^3 + ax + b \tag{9.38}$$
where a and b are real numbers and:
$$4a^3 + 27b^2 \neq 0 \tag{9.39}$$
An elliptic curve is a plane algebraic curve (Fig. 9.30) that is non-singular, i.e., without any self-intersections, cusps or isolated points. By changing the constants a and b, different curves can be generated. For cryptographic use, elliptic curves (9.38) with condition (9.39) are defined over finite fields GF(p), p > 3. a, b and the coordinates x and y of the points (x, y) on the curve are elements of GF(p). Together with the point at infinity, the points of the curve that satisfy (9.38) and (9.39) are the elements of a finite group E(GF(p)) [Kob87].
Fig. 9.30: Elliptic curve.
Point Addition in E(GF(p))
The sum R = (xR, yR) of two points P = (xP, yP) and Q = (xQ, yQ) is defined as follows:
$$R = P + Q \tag{9.40}$$
Let:
$$s = \frac{y_P - y_Q}{x_P - x_Q} \tag{9.41}$$
then:
$$x_R = s^2 - x_P - x_Q \tag{9.42}$$
and:
$$y_R = -y_P + s(x_P - x_R) \tag{9.43}$$
This operation can also be interpreted geometrically: A straight line is drawn through P and Q, and the mirror image along the x-axis of the intersection point of the line with the curve is taken as the result (Fig. 9.31).
Fig. 9.31: Point addition on elliptic curve.
Point Doubling: Doubling of a point P is adding the point to itself to get the point R, i.e., R = 2P (Fig. 9.32). Let:
$$s = \frac{3x_P^2 + a}{2y_P} \tag{9.44}$$
then:
$$x_R = s^2 - 2x_P \tag{9.45}$$
and:
$$y_R = -y_P + s(x_P - x_R) \tag{9.46}$$
This can be also interpreted geometrically: A tangent is drawn through P and the mirror of the intersection point of the line with the curve along the x-axis is taken as the result.
Remark: The division symbol in the above formulas means the calculation of the multiplicative inverse element of the divisor in GF(p).
Fig. 9.32: Point doubling on elliptic curve.
Scalar Point Multiplication
Multiplication in E(GF(p)) is defined as a scalar multiplication Q = d·P = P + P + … + P (d – 1 additions), where P and Q are points on the curve and d is an element of GF(p). An efficient method is required for point multiplication on an elliptic curve. Normally this is done using the double-and-add algorithm, which uses point addition and point doubling, e.g., if the scalar d = 13 is multiplied with point P on the elliptic curve, then d·P = 13P = 2(2(2P + P)) + P.
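Point addition, point doubling and the double-and-add method can be sketched over a small field. The curve y² = x³ + 2x + 2 over GF(17) with the point (5, 1) is a common textbook toy example and an assumption here, as are the function names. The point at infinity is represented by `None`, and divisions are carried out as multiplications with the inverse in GF(p). For the y coordinate the code uses the standard chord-and-tangent form y_R = –y_P + s(x_P – x_R).

```python
P_MOD, A = 17, 2        # toy curve y^2 = x^3 + 2x + 2 over GF(17)
INF = None              # point at infinity (neutral element)


def inverse(value):
    """Multiplicative inverse in GF(p)."""
    return pow(value % P_MOD, -1, P_MOD)


def add(P, Q):
    """Point addition; falls back to point doubling when P == Q."""
    if P is INF:
        return Q
    if Q is INF:
        return P
    xP, yP = P
    xQ, yQ = Q
    if xP == xQ and (yP + yQ) % P_MOD == 0:
        return INF                                    # P + (-P)
    if P == Q:
        s = (3 * xP * xP + A) * inverse(2 * yP) % P_MOD    # tangent slope
    else:
        s = (yP - yQ) * inverse(xP - xQ) % P_MOD           # chord slope
    xR = (s * s - xP - xQ) % P_MOD
    yR = (-yP + s * (xP - xR)) % P_MOD
    return (xR, yR)


def multiply(d, P):
    """Double-and-add, scanning the bits of d from left to right."""
    result = INF
    for bit in bin(d)[2:]:
        result = add(result, result)      # doubling for every bit
        if bit == "1":
            result = add(result, P)       # addition only for 1-bits
    return result


P = (5, 1)
two_P = add(P, P)                         # (6, 3)
thirteen_P = multiply(13, P)
# 13P = 2(2(2P + P)) + P, written out step by step:
step = add(add(P, P), P)
step = add(step, step)
step = add(step, step)
step = add(step, P)
```

Both ways of computing 13P give the same point, and multiplying P by 19 yields the point at infinity, since P has order 19 on this curve.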
Discrete Logarithm Problem on Elliptic Curves
Let P and Q be two points on the elliptic curve and d an element of GF(p). Now the discrete logarithm problem on an elliptic curve can be formulated:
Q = d·P
If d and P are given, it is easy to calculate Q, but it is computationally infeasible to calculate d if P and Q are given. The point P is a system parameter, d is used as the private key and the point Q is the public key. To calculate d, the discrete logarithm problem on an elliptic curve has to be solved. Therefore the discrete logarithm problem on elliptic curves is of interest for use in cryptography. Cryptographic techniques based on elliptic curves are standardized in [ISO15946-1].
9.7.6.2 ECDSA
Elliptic Curve DSA (ECDSA) is an elliptic curve analogue of the DSA algorithm. ECDSA and its variants are standardized by ISO in [ISO14888-3]. ECDSA requires the generation of the public/private key pair for a user with respect to a set of elliptic curve domain parameters. These domain parameters may be common to a group of users and may be public. The ECDSA domain parameters are:
– p: the size of the underlying Galois field GF(p)
– a and b: two field elements defining the equation of the elliptic curve over GF(p), e.g. $y^2 = x^3 + ax + b$
– P = (x, y): a point on the curve, known as the base point, where x and y are elements of the field GF(p)
– #E(GF(p)) = n: the number of points on the curve
– ord(P) = q: the order of the base point, a large prime number
For ECDSA signature generation and verification, the ECDSA public/private key pair (d, Q) is generated first, where d is the private key and Q is the public key.
Key Pair Generation: The key pair of a user is generated by a participant as follows:
– d is generated by choosing a random number d, such that 1 ≤ d ≤ q – 1
– Q is dependent on d and is chosen as Q = d·P
Signature Generation: A signs a message M for B using the following sequence of steps:
– Choose a random number k, such that 1 ≤ k ≤ q – 1.
– Calculate the point k·P = (x1, y1).
– Calculate r = x1 mod q. If r = 0, choose a new k and repeat the previous steps to obtain a new point k·P and the corresponding r.
– Calculate $s = (k^{-1}(H(M) + r \cdot d_A)) \bmod q$, where dA is the private key of A and H(M) is the hash of message M.
– The signature on M is (r, s).
Signature Verification: B verifies the signature (r, s) on the message M using the public key QA of A and the following sequence of steps:
– Verify that r and s are integers in the interval [1, q – 1].
– Verify if r mod n = 0. If yes, the signature verification fails.
– Compute $w = s^{-1} \bmod q$.
– Calculate u = w · H(M) mod q and v = w · r mod q.
– Calculate the point (x1, y1) = u · P + v · QA.
– Signature verification succeeds when x1 mod q = r.
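Key pair generation, signing and verification can be combined into one small sketch. It assumes the toy curve y² = x³ + 2x + 2 over GF(17) with base point P = (5, 1) of prime order q = 19, which is an illustrative choice, and uses SHA-256 as H. Besides r = 0, the signing loop also redraws k when s = 0, a check taken from common ECDSA descriptions rather than from the steps above. The helper names are hypothetical.

```python
import hashlib
import secrets

P_MOD, A = 17, 2            # toy curve y^2 = x^3 + 2x + 2 over GF(17)
BASE = (5, 1)               # base point P
Q_ORDER = 19                # ord(P) = q, a prime
INF = None                  # point at infinity


def add(P, Q):
    if P is INF:
        return Q
    if Q is INF:
        return P
    (xP, yP), (xQ, yQ) = P, Q
    if xP == xQ and (yP + yQ) % P_MOD == 0:
        return INF
    if P == Q:
        s = (3 * xP * xP + A) * pow(2 * yP % P_MOD, -1, P_MOD) % P_MOD
    else:
        s = (yP - yQ) * pow((xP - xQ) % P_MOD, -1, P_MOD) % P_MOD
    xR = (s * s - xP - xQ) % P_MOD
    return (xR, (-yP + s * (xP - xR)) % P_MOD)


def multiply(d, P):
    result = INF
    for bit in bin(d)[2:]:              # double-and-add
        result = add(result, result)
        if bit == "1":
            result = add(result, P)
    return result


def H(message: bytes) -> int:
    return int.from_bytes(hashlib.sha256(message).digest(), "big")


def generate_key_pair():
    d = secrets.randbelow(Q_ORDER - 1) + 1          # 1 <= d <= q - 1
    return d, multiply(d, BASE)                     # (private d, public Q = d*P)


def sign(message: bytes, d: int):
    while True:
        k = secrets.randbelow(Q_ORDER - 1) + 1      # 1 <= k <= q - 1
        x1, _ = multiply(k, BASE)
        r = x1 % Q_ORDER
        s = pow(k, -1, Q_ORDER) * (H(message) + r * d) % Q_ORDER
        if r != 0 and s != 0:                       # otherwise choose a new k
            return r, s


def verify(message: bytes, r: int, s: int, Q) -> bool:
    if not (1 <= r < Q_ORDER and 1 <= s < Q_ORDER):
        return False
    w = pow(s, -1, Q_ORDER)
    u = w * H(message) % Q_ORDER
    v = w * r % Q_ORDER
    point = add(multiply(u, BASE), multiply(v, Q))
    return point is not INF and point[0] % Q_ORDER == r


d_A, Q_A = generate_key_pair()
r, s = sign(b"example message", d_A)
valid = verify(b"example message", r, s, Q_A)
```

Verification works because u·P + v·Q_A = (u + v·d_A)·P = k·P, so the x coordinate reproduces r.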
Advantages of Elliptic Curve Cryptography
To date, no attacks on elliptic curve cryptography are known with a complexity lower than exponential: for each additional bit of key length the effort for an attack doubles. The complexity of attacks on factorization based algorithms such as RSA, or on discrete logarithm problems in GF(p), is sub-exponential. The consequence is that the key length, modulus length and signature length are significantly shorter with elliptic curve cryptography.
Comparison of Security Level
Table 9.1 compares the security level provided by symmetric ciphers with that of Elliptic Curve Cryptosystems and RSA. The table shows the key length in bits required for the same security level by the different kinds of ciphers. The difference in key lengths for the same security level is quite remarkable and even more evident for higher security levels.
Tab. 9.1: Comparison of key sizes for the same security level in bits.

Security level (symmetric cipher key lengths)   ECC key lengths   RSA/ElGamal/DSA key lengths
80                                              160               1024
112                                             224               2048
128                                             256               3072
192                                             384               7680
256                                             512               15360
Remarks: All algorithms which are executed on GF(p) can also be executed on $GF(2^n)$, which are called binary fields. Implementations on $GF(2^n)$ are mainly used for hardware implementations, because of the bitwise operations.
9.8 Random Numbers
9.8.1 Randomness
A hardware device or an algorithm that produces a sequence of statistically independent and uniformly distributed numbers is called a random number generator. True random numbers occur in nature, such as the amount of thermal noise from a semiconductor diode at a particular instant of time. However, they are inefficient to produce and it is difficult to prove that they are not influenced by external events. In most practical cryptographic applications, random bits are required. The requirement can, however, be fulfilled using pseudo random bit sequences rather than truly random bit sequences. Random bit generation is standardized in [ISO18031]. Pseudo random bit generators (PRBG) are deterministic algorithms which produce a pseudo random bit sequence of length n, given a random input of length m, where n >> m. The input to a PRBG is called the “seed”, which should be truly random, but the output bit sequence is deterministic. Good PRBGs should pass certain statistical tests in order to demonstrate their randomness. An example of such tests can be found in NIST publications on random number generation tests [STS00]. Pseudo random bits are used in many cryptographic applications, some examples of which include:
– Stream ciphers
– Keys for encryption/decryption
– Keys for one time pads
Random Numbers | 407
– Initialization values
– Keys for MAC
– In the production of asymmetric key pairs
– Nonce values
– Random values in key establishment protocols
– Challenges in authentication protocols
9.8.2 Random Number Generation
9.8.2.1 True Random Number Generation
True random numbers can be obtained from natural sources. Such numbers are produced either in hardware or through software algorithms. In the case of hardware, they can be obtained from sources such as a laser, a sound signal from a microphone, disk read latency times etc. In this case, the hardware can be integrated with the device using the random numbers. However, since they rely on additional hardware, they are expensive to implement. Depending on the scenario, the additional expense might be acceptable. If true random numbers are to be obtained from software, they are typically based on events such as the content of user input or output buffers, the system clock, the elapsed time between user keystrokes, mouse movements, operating system load etc. The generated random numbers have to be monitored and analysed for randomness.
9.8.2.2 Pseudo Random Number Generation
Pseudo random number generators do not rely on natural phenomena to produce the number streams. They do, however, need an initial seed value, which should be truly random for good results. To an observer without knowledge of the seed, the pseudo random bit streams appear to be true random bit streams, and it is not possible to predict the next bit. However, if the seed is known, together with the algorithm to generate pseudo random bit streams, then anyone can (re)produce the pseudo random bit stream. The seed is therefore kept secret and should be sufficiently large for protection against brute force attacks. Examples of pseudo random bit stream generators include the linear congruential generator, the lagged Fibonacci generator, feedback shift registers, and the ANSI X9.17 [ANSI-X9.17] and FIPS 186 [FIPS186-2] generators.
9.8.2.3 Cryptographically Secure Pseudo Random Number Generation
A cryptographically secure pseudo random number generator (CSPRNG) is a pseudo random number generator which satisfies the following:
– It passes the next bit test. The next bit test is satisfied when, given the first k bits of a random sequence, there is no polynomial time algorithm which can predict the (k + 1)th bit with a probability of success greater than 0.5.
– It is resistant to state compromise extension attacks. This means that, even if the internal state of the random number generator is known at some time, an observer is still unable to predict the previous or future outputs of the random number generator.
Consequently, not every pseudo random bit generator is a cryptographically secure pseudo random bit generator. Examples of cryptographically secure pseudo random bit generators include the RSA pseudo random bit generator and the Blum-Blum-Shub pseudo random bit generator [Bletal86].
Algorithm: RSA Pseudo Random Bit Generator
1. Generate two large, distinct and secret primes p and q
2. Compute: n = p·q and Φ = (p – 1)(q – 1)
3. Choose a random integer e, such that 1 < e < Φ and gcd(e, Φ) = 1
4. Choose a random integer x0 (the seed) in the interval [1, n – 1]
   for i := 1 to l
      xi = (xi–1)^e mod n
      zi = LSB of xi
   end for
5. Output random bit sequence z1, z2, z3, …, zl of length l
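The steps of the RSA pseudo random bit generator translate almost line by line into Python. The sketch below takes the primes, the exponent and the seed as arguments instead of generating them. The call at the end uses the toy values p = 11, q = 19, e = 17 and the seed 5, chosen only so that the first step can be checked by hand (5^17 mod 209 = 80, so the first output bit is 0); the function name is hypothetical.

```python
from math import gcd


def rsa_prbg(p: int, q: int, e: int, seed: int, length: int):
    """Return `length` pseudo random bits z1..zl from the RSA generator."""
    n = p * q
    phi = (p - 1) * (q - 1)
    if not (1 < e < phi and gcd(e, phi) == 1):
        raise ValueError("e must satisfy 1 < e < phi and gcd(e, phi) = 1")
    if not 1 <= seed <= n - 1:
        raise ValueError("seed must lie in the interval [1, n - 1]")
    x = seed
    bits = []
    for _ in range(length):
        x = pow(x, e, n)          # x_i = (x_{i-1})^e mod n
        bits.append(x & 1)        # z_i = least significant bit of x_i
    return bits


bits = rsa_prbg(p=11, q=19, e=17, seed=5, length=16)
```

With primes this small the sequence repeats quickly; in practice p and q are large secret primes and the seed is truly random.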
Algorithm: Blum-Blum-Shub Pseudo Random Bit Generator
1. Generate two large, distinct and secret primes p and q, where both are congruent to 3 modulo 4
2. Compute: n = p·q
3. Choose a random integer s (the seed) in the interval [1, n – 1] such that gcd(s, n) = 1 and compute x0 = s^2 mod n
   for i := 1 to l
      xi = (xi–1)^2 mod n
      zi = LSB of xi
   end for
4. Output random bit sequence z1, z2, z3, ..., zl of length l
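The Blum-Blum-Shub steps can be sketched in the same way. The primes and the seed are passed in as arguments, and the example call uses the toy values p = 11, q = 19 (both congruent to 3 modulo 4) and s = 3, which are illustrative assumptions. For these values x0 = 9 and the following states are 81, 82, 36 and 42, so the first four output bits are 1, 0, 0, 0.

```python
from math import gcd


def blum_blum_shub(p: int, q: int, s: int, length: int):
    """Return `length` pseudo random bits from the Blum-Blum-Shub generator."""
    if p % 4 != 3 or q % 4 != 3:
        raise ValueError("p and q must both be congruent to 3 modulo 4")
    n = p * q
    if not (1 <= s <= n - 1 and gcd(s, n) == 1):
        raise ValueError("seed must lie in [1, n - 1] with gcd(s, n) = 1")
    x = (s * s) % n               # x0 = s^2 mod n
    bits = []
    for _ in range(length):
        x = (x * x) % n           # x_i = (x_{i-1})^2 mod n
        bits.append(x & 1)        # z_i = least significant bit of x_i
    return bits


bits = blum_blum_shub(p=11, q=19, s=3, length=4)
```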
One Way Collision Resistant Hash Functions as Pseudo Random Bit Generator
One way collision resistant hash functions are also used for pseudo random bit generation because of their excellent randomization properties. They are initialized with a seed, and their output is then fed back to generate the next bits. This method is used, for example, for the key derivation described in ISO 11770-6 [ISO11770-6], where working keys or subkeys have to be derived from a master key.
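The feedback idea can be sketched in Python, assuming SHA-256 from the standard hashlib module as the hash function (the text does not prescribe one). The function name hash_feedback_bytes and the choice to hash the secret seed together with the previous output block are assumptions of this sketch; it is not the key derivation specified in ISO 11770-6.

```python
import hashlib


def hash_feedback_bytes(seed: bytes, num_bytes: int) -> bytes:
    """Return `num_bytes` pseudo random bytes from a hash in feedback mode.

    Each output block is SHA-256(seed || previous block); the first block
    uses an empty previous block. The seed must be kept secret.
    """
    out = bytearray()
    block = b""
    while len(out) < num_bytes:
        # Feed the previous output block back into the hash with the seed
        block = hashlib.sha256(seed + block).digest()
        out += block
    return bytes(out[:num_bytes])


if __name__ == "__main__":
    # Example: derive 40 bytes from a hypothetical master secret.
    print(hash_feedback_bytes(b"hypothetical master key", 40).hex())
```

Including the seed in every hash call means that an observer who sees one output block, but not the seed, cannot simply hash that block to obtain the next one.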
References
[492AAAD] TIA-492AAAD Recommendation, “Detail Specification for 850-nm Laser-Optimized, 50-μm Core Diameter/125-μm Cladding Diameter Class Ia Graded-Index Multimode Optical Fibers Suitable for Manufacturing OM4 Cabled Optical Fiber”, Arlington, VA 22201, USA, 2009.
[492CAAB] TIA-492CAAB Recommendation, “Detail specification for class IVa dispersion-unshifted single-mode optical fibers with low water peak”, Arlington, VA 22201, USA, 2000.
[Abr63] N. Abramson, “Information Theory and Coding”, McGraw-Hill Book Company, New York, 1963.
[AcTs04] T. Acharya, P-S. Tsai, “JPEG2000 standard for Image Compression: Concepts, Algorithms and VLSI Architectures”, John Wiley & Sons, Inc., USA, 2004.
[AdVa04] M. Adrat, P. Vary, “Turbo Error Concealment of mutually independent Source Codec parameters”, 5th Int. ITG Conference on Source and Channel Coding (SCC), Erlangen, Germany, January 2004.
[AES15] “Technology trends in Audio Engineering”, report by the AES technical Council, J. Audio Eng. Soc, Vol. 63, No. 1/2, January/February 2015.
[AES41] AES41 (1–5): AES standard for digital audio, audio-embedded metadata.
[Agr97] G. P. Agrawal, “Fiber-Optic Communication Systems”, John Wiley & Sons, Inc., New York, 1997, pp. 32–35.
[Aketal08] I. Akyildiz, W. Y. Lee, M. C. Vuran, S. Mohanty, “A survey on spectrum management in cognitive radio networks”, IEEE Communications Magazine, Vol. 46, No. 4, pp. 40–48, April 2008.
[Aketal11] I. Akyildiz, B. F. Lo, R. Balakrishnan, “Cooperative spectrum sensing in cognitive radio networks – A survey”, Elsevier, Physical Communications, Vol. 4, pp. 40–62, Dec. 2011.
[Ale97] S. B. Alexander, “Wavelength division multiplexed optical communication systems employing uniform gain optical amplifiers”, United States Patent Office #5696615, 1997.
[AlLa87] S. C. Althoen, R. McLaughlin, “Gauss-Jordan reduction: A brief history”, The American Mathematical Monthly, 94 (2): 130–142, 1987, ISSN 0002-9890.
[AlLu96] N. Alon, M. Luby, “A linear time erasure-resilient code with nearly optimal recovery”, IEEE Trans. Information Theory, Vol. 47, no. 6, pp. 1732–1736, November, 1996.
[Anetal02] S. Andersen, W. Kleijn, R. Hagen, J. Linden, M. Murthi, J. Skoglund, “iLBC-a linear predictive coder with robustness to packet losses”, in Proc. IEEE Speech Coding Workshop, pp. 23–25, 2002.
[AnKi42] G. Antheil, M. H. Kiesler, “Secret Communication System”, US2292387 A, August 1942.
[ANSI-T1.105] ANSI T1.105: SONET – Basic Description including Multiplex Structure, Rates and Formats, 1996.
[ANSI-X9.17] ANSI X9.17-1985: American National Standard, Financial Institution Key Management (Wholesale), American Bankers Association, April 4, 1985, Section 7.2.
[AtFa98] E. Atsumi, N. Farvardin, “Lossy/Lossless Region-Of-Interest image Coding Based on Set Partitioning in Hierarchical Trees”, in Proc. IEEE ICIP, pp. 87–91, USA, 1998.
[Aww12] O. Awwad, “WDM Optical Network Design”, AV Akademikerverlag, 2012.
[Azetal96] S. Z. Azami, P. Duhamel, O. Rioul, “Joint Source-Channel Coding: Panorama of Methods”, CNES Workshop on Data Compression, Toulouse, France, November 1996.
[Baetal74] L. Bahl, J. Cocke, F. Jelinek, J. Raviv, “Optimal decoding of linear codes for minimizing symbol error rate”, IEEE Transactions on Information Theory, IT-20, pp. 284–287, March 1974.
[Bar02] S. A. Barbulescu, “What a wonderful turbo world”, Adelaide, 2002.
[Bea01] K. G. Beauchamp, “History of Telegraphy: Its Technology and Application”, Institution of Engineering and Technology, pp. 394–395, 2001, ISBN 0-85296-792-6.
[Beetal09] M. Bertolini, et al., “On the XPM-induced distortion in DQPSK-OOK and coherent QPSK-OOK hybrid systems”, in Optical Fiber Communication Conference, 2009, Vol. 2, no. 3, pp. 9–11.
[Beetal09-2] J. Berthold et al., “100G Ultra Long Haul DWDM Framework Document”, OIF, 2009, www.oiforum.com
[Beetal93] C. Berrou, A. Glavieux, P. Thitimajshima, “Near Shannon Limit Error Correcting Coding and Decoding: Turbo Codes”, Proc. IEEE International Conference on Communication, Vol. 2/3, pp. 1064–1070, Geneva, Switzerland, 1993.
[Bel81] A. G. Bell, “Telephone-circuit”, US Patent 244 426, 1881.
[Ben58] W. R. Bennet, “Statistics of Regenerative Digital Transmission”, Bell Syst. Tech. J., Vol. 37 (1958), pp. 1501–1542.
[Ber68] E. R. Berlekamp, “Algebraic Coding Theory”, McGraw-Hill, New York, 1968.
[Bha15] U. N. Bhat, “An Introduction to Queuing theory”, Springer, 2015.
[Bin00] E. Binder, “Distance Coder”, Usenet group: comp.compression, 2000.
[Bla03] R. E. Blahut, “Algebraic Codes for Data Transmission”, Cambridge University Press, 2003.
[BlDa09] T. Blumensath, M. Davies, “Iterative Hard Thresholding for Compressive Sensing”, Appl. Comput. Harmon. Anal., Vol. 27, No. 3, pp. 265–274, 2009.
[Bletal86] L. Blum, M. Blum, M. Shub, “A Simple Unpredictable Pseudo-Random Number Generator”, SIAM Journal on Computing, Vol. 15 (2), pp. 364–383, May 1986.
[Blo79] N. Bloembergen, “Recent Progress in Four-Wave Mixing Spectroscopy”, in Laser Spectroscopy IV, edited by H. Walther and K. W. Rothe, (Springer, Berlin, 1979), pp. 340–348.
[BlTu58] R. B. Blackman, J. W. Tukey, “The Measurement of Power Spectra from the Point of View of Communications Engineering – Part I”, Bell System Technical Journal 37: 185, 1958.
[Boetal07] A. Bogdanov, L. R. Knudsen, G. Leander, C. Paar, A. Poschmann, M. J. B. Robshaw, Y. Seurin, C. Vikkelsoe, “PRESENT: An Ultra-Lightweight Block Cipher”, P. Paillier and I. Verbauwhede (Eds.): CHES 2007, LNCS 4727, pp. 450–466, 2007, Springer Verlag Berlin Heidelberg.
[Boetal12] N. Bozinovic et al., ECEOC’12, Th.3.C.6 (2012).
[Boetal13] N. Bozinovic et al., “Terabit-Scale Orbital Angular Momentum Mode Division Multiplexing in Fibers”, Science, Vol. 340 no. 6140, pp. 1545–1548 (2013).
[BoSm74] W. Boyle, G. Smith, “Buried channel charge coupled devices”, US patent 3792322 A, 1974.
[BoSm74-2] W. Boyle, G. Smith, “Three dimensional charge coupled devices”, US patent 3796927 A, 1974.
[BoTe98] M. Bossert, “Channel coding”, B. G. Teubner, Stuttgart, 1998.
[Bra29] L. Braille, “Method of Writing Words, Music and Plain Songs by Means of Dots, for Use by the Blind and Arranged for Them”, 1829.
[Bra65] R. N. Bracewell, “The Fourier Transform and Its Applications”, p. 381, McGraw Hill, New York, 1965.
[Bri99] S. Brink, “Convergence of Iterative Decoding”, Electronics Letters, Vol. 35, no. 10, May 1999.
[BuBo96] E. L. Buckland, R. W. Boyd, “Electrostrictive contribution to the intensity-dependent refractive index of optical fiber”, Opt. Lett., Vol. 21, pp. 1117–1119, 1996.
[Buc06] M. Buchrer, “Code Division Multiple Access (CDMA)”, Morgan & Claypool, 2006.
[Buc97] W. Buchanan, “Advanced Data Communications and Networks”, Chapman & Hall, 1997.
[Bur04] R. W. Burns, “Communications: an international history of the formative years”, Chapter 2: Semaphore Signalling, ISBN 978-0-86341-327-8, 2004.
[Cam10] F. Camerer, “On the way to Loudness nirvana”, EBU Technical Review, 2010 Q3.
[Can08] E. Candès, “The Restricted Isometry Property and Its Implications for Compressed Sensing”, Comptes rendus de l’Académie des Sciences, Série I, Vol. 346 (9–10), pp. 589–592, 2008.
[CaTa05] E. Candès, T. Tao, “Decoding by Linear Programming”, IEEE Trans. Inform. Theory, Vol. 51, No. 12, pp. 4203–4215, 2005.
[CaWa08] E. J. Candès, M. B. Wakin, “An Introduction to Compressive Sampling”, IEEE Signal Processing Magazine, Vol. 25, No. 2, pp. 21–28, March 2008.
[ChCh05] A. Cheng, D. T. Cheng, “Heritage and early history of the boundary element method”, Engineering Analysis with Boundary Elements, Vol. 29, Issue 3, pp. 268–302, Elsevier Ltd., March 2005.
[Chetal00] C. Christopoulos, J. Askelof, M. Larsson, “Efficient Methods for Encoding Region of Interest in the Upcoming JPEG 2000 Still Image Coding Standard”, IEEE Signal Process. Lett., Vol. 7, No. 9, pp. 247–249, 2000.
[Chetal90] N. Cheung, K. Nosu, G. Winzer, “Dense Wavelength Division Multiplexing Techniques for High Capacity and Multiple Access Communication Systems”, IEEE Journal on Selected Areas in Communications, Vol. 8, No. 6, August 1990.
[Chi64] R. T. Chien, “Cyclic Decoding Procedures for the Bose-Chaudhuri-Hocquenghem Codes”, IEEE Transactions on Information Theory, Vol. 10, no. 4, pp. 357–363, 1964.
[Chu00] S. Y. Chung, “On Construction of some Capacity-Approaching Coding Schemes”, Doctoral Dissertation, MIT, Boston, USA, 2000.
[ChZa00] G. Cheung, A. Zakhor, “Bit allocation for joint source/channel coding of scalable video”, IEEE Trans. Image Processing, Vol. 9, pp. 340–356, March 2000.
[Cietal07] B. L. Cioffi, J. M. Jagannathan, S. M. Mohseni, “Gigabit DSL”, IEEE Transactions on Communications, 55(9), pp. 1689–1692, Sep. 2007.
[Cla07] T. C. Clancy, “Formalizing the interference temperature model”, Wireless Communications and Mobile Computing, Vol. 11, No. 9, pp. 1077–1086, Nov. 2007.
[Cla68] R. H. Clarke, “A Statistical Theory of Mobile Radio Reception”, Bell Systems Technical Journal 47 (6), pp. 957–1000, July–August 1968.
[Cos56] J. P. Costas, “Synchronous communications”, Proceedings of the IRE 44 (12), pp. 1713–1718, 1956.
[CoTu65] J. W. Cooley, J. W. Tukey, “An algorithm for the machine calculation of complex Fourier series”, Math. Comput. 19: pp. 297–301, 1965.
[Cretal12] D. Crivelli, et al., “A 40 nm CMOS single-Chip 50Gb/s DP-QPSK/BPSK transceiver with electronic dispersion compensation for coherent optical channels”, in ISSCC Dig. Tech. Papers, Feb. 2012.
[Daetal09] D. Datla, A. M. Wyglinski, G. J. Minden, “A spectrum surveying framework for dynamic spectrum access networks”, IEEE Trans. Vehicle Technology, Vol. 58, No. 8, pp. 4158–4168, Oct. 2009.
[Daetal12] M. A. Davenport, M. F. Duarte, G. Kutyniok, “Introduction to Compressed Sensing”, in Compressed Sensing: Theory and Applications, Cambridge, U. K., Cambridge Univ. Press, 2012.
[DaNe05] L. Day, I. McNeal, “Biographical Dictionary of the History of Technology”, Routledge, London, 2005.
[Dau90] I. Daubechies, “The Wavelet Transform, Time-Frequency Localization and Signal Analysis”, IEEE Trans. on Inform. Theory, Vol. 36, No. 5, pp. 961–1005, 1990.
[Dav10] M. Davenport, “Random Observations on Random Observations: Sparse Signal Acquisition and Processing”, Ph.D. thesis, Rice University, 2010.
[Dav96] K. David, T. Benker, “Digitale Mobilfunksysteme”, Teubner, 1996.
[Deetal87] E. Desurvire, J. Simpson, P. C. Becker, “High-gain erbium-doped traveling-wave fiber amplifier”, Optics Letters, Vol. 12, No. 11, 1987, pp. 888–890.
[Deetal93] J. R. Deller, J. G. Proakis, J. Hansen, “Discrete-Time Processing of Speech Signals”, Macmillan Publishing Company, New York, 1993.
[Deu96] L. P. Deutsch, “DEFLATE Compressed Data Format Specification version 1.3”, IETF, p. 1, sec. Abstract, RFC 1951, May 1996.
[Dietal94] A. G. Dickinson, E. I. Eid, D. A. Inglis, “Active pixel sensor and imaging system having differential mode”, US patent 5631704, 1994.
[DiHe76] W. Diffie, M. Hellman, “New Directions in Cryptography”, IEEE Transactions on Information Theory, Vol. 22, 1976, pp. 472–492.
[Djetal09] I. B. Djordjevic et al., IEEE/OSA J. Opt. Commun. Netw. 1 6 (2009), pp. 555–564.
[DOCSIS] CableLabs, “Data Over Cable Service Interface Specifications – DOCSIS 3.0”, 2006.
[DoEl03] D. Donoho, M. Elad, “Optimally Sparse Representation in General (nonorthogonal) Dictionary via l1 Minimization”, in Proc. Natl. Acad. Sci, Vol. 100, No. 2, pp. 2197–2202, 2003.
[Doetal04] C. P. Downing, C. H. Baher, “Bandwidth, Spectral Efficiency and Capacity Variation in Twisted-Pair Cable”, Irish Signals and Systems conference, Belfast, 2004.
[Dol46] C. L. Dolph, “A current distribution for broadside arrays which optimizes the relationship between beam width and side-lobe level”, Proceedings of the IRE, Vol. 34, pp. 335–348, 1946.
[Dra04] D. B. Drajic, “Introduction into Information Theory and Coding” (in Serb.), Academic mind, Belgrade, Serbia, 2004.
[Duk08] M. L. Dukic, “Principi telekomunikacija”, Akademska misao, Beograd, 2008.
[DuKi09] P. Duhamel, M. Kieffer, “Joint Source-Channel Coding, a Cross-Layer Perspective with Applications in Video Broadcasting”, Academic Press, 2009.
[DVB09] “Digital Video Broadcasting (DVB): Second generation framing structure, channel coding and modulation systems for Broadcasting, Interactive Services, News Gathering and other broadband satellite applications (DVB-S2)”, ETSI EN 302 307, April 2009.
[DVB97] “Digital Video Broadcasting (DVB); Framing structure, channel coding and modulation for 11/12 GHz satellite services”, ETSI EN 300 421, August 1997.
[Dwo07] M. Dworkin, “Recommendation for Block Cipher Modes of Operation: Galois/Counter Mode (GCM) and GMAC”, NIST Special Publication 800-38D, Nov. 2007.
[Dwo10] M. Dworkin, “Recommendation for Block Cipher Modes of Operation: The XTS-AES Mode for Confidentiality on Storage Devices”, NIST Special Publication 800-38E, Jan. 2010.
[ECOST231] European Cooperative in the Field of Science and Technical Research EURO-COST 231, “Urban transmission loss models for mobile radio in the 900 and 1800 MHz bands”, Revision 2, The Hague, Sept. 1991.
[Elm09] G. E. Elmore, “Surface wave transmission system over a single conductor having E-fields terminating along the conductor”, U.S. Patent 7 567 154, 2009.
[Elm09-2] G. E. Elmore, “Launching TM Mode onto a Single Conductor”, E-Line, Corridor Systems Inc., 2009.
[ElSw93] R. J. McEliece, L. Swanson, “Reed-Solomon codes and the exploration of the Solar system”, California Institute of Technology’s Jet Propulsion Laboratory, http://trsnew.jpl.nasa.gov/dspace/bitstream/2014/34531/1/94-0881.pdf, 1993.
[EN136101] ETSI EN 136 101 V10.3.0: LTE – Evolved Universal Terrestrial Radio Access (E-UTRA) – User Equipment (UE) radio transmission and reception (3GPP TS 36.101 version 10.3.0 Release 10).
[EN300401] EN 300401: Digital Audio Broadcasting for mobile, portable and fixed receivers, 2006.
[EN300744] ETSI EN 300 744 V1.6.1: Digital Video Broadcasting (DVB) – Framing structure, channel coding and modulation for digital terrestrial television, 2009.
[EsAf29] L. Espenschied, H. Affel, “Concentric conducting system”, US Patent 1 835 031 A, 1929.
[ETV36] Earlytelevision.org – Early Electronic Television – The 1936 Berlin Olympics.
[EUTRA16] 3GPP TS 36.201: Evolved Universal Terrestrial Radio Access (E-UTRA); Physical layer; General description, Release 13, 2016.
[Feetal15] H. C. Feng, M. W. Marcellin, A. Bilgin, “A Methodology for Visually Lossless JPEG2000 Compression Stereo Images”, IEEE Trans. Image Process., Vol. 24, No. 2, pp. 560–572, 2015.
[Fei61] A. Feinstein, “Foundations of Information Theory”, McGraw-Hill Book Company, Inc., New York, 1961.
[FIPS186-2] FIPS 186-2, American National Standards Institute and Federal Information Processing Standards: DIGITAL SIGNATURE STANDARD (DSS), Jan. 2007.
[For65] G. D. Jr. Forney, “Concatenated Codes”, Technical Report 440, Research Laboratory MIT, Cambridge, 1965.
[For65-2] G. Jr. Forney, “On Decoding BCH Codes”, IEEE Transactions on Information Theory, Vol. 11, no. 4, pp. 549–557, 1965.
[For66] G. D. Jr. Forney, “Concatenated Codes”, MIT Press, Cambridge, 1966.
[For66-2] G. D. Jr. Forney, “Generalized Minimum Distance Decoding”, IEEE Trans. Inform. Theory, Vol. 12, April 1966.
[FoRa13] S. Foucart and H. Rauhut, “A Mathematical Introduction to Compressive Sensing”, Springer, 2013.
[Fos96] G. J. Foschini, “Layered space-time architecture for wireless communication in a fading environment using multiple antennas”, Bell Labs Technical Journal, Vol. 1, No. 2, pp. 41–59, 1996.
[Fou08] J. Fourier, “Mémoire sur la propagation de la chaleur dans les corps solides”, Nouveau Bulletin des sciences par la Société philomatique de Paris, pp. 112–116, March 1808.
[Fuetal13] T. Fujimori, T. Koike-Akino, T. Sugihara, et al., “A Study on the Effectiveness of Turbo Equalization with FEC for Nonlinearity Compensation in Coherent WDM Transmissions”, OptoElectronics and Communications Conference and International Conference on Photonics in Switching (OECC/PS), June 2013.
[Ful08] M. Fuller, “Is DP-QPSK the endgame for 100 Gbits/sec”, Lightwave Technology, November 2008.
[G652] ITU-T Recommendation G.652, “Characteristics of a single-mode optical fibre cable”, Geneva, 2009.
[G653] ITU-T Recommendation G.653, “Characteristics of a dispersion-shifted, single-mode optical fibre and cable”, Geneva, 2010.
[G655] ITU-T Recommendation G.655, “Characteristics of a non-zero dispersion-shifted single-mode optical fibre and cable”, Geneva, 2009.
[G657] ITU-T Recommendation G.657, “Characteristics of a bending-loss insensitive single-mode optical fibre and cable for the access network”, Geneva, 2012.
[G694.1] ITU-T Recommendation G.694.1, “Spectral grids for WDM applications: DWDM frequency grid”, Geneva, 2012.
[G9700] ITU-T 2014-12-19 (Retrieved 2015-02-03), “G.9700: Fast access to subscriber terminals (G.fast) – Power spectral density specification”.
[G991.2] ITU-T Recommendation G.991.2, “Single-Pair High-Speed Digital Subscriber Line (SHDSL) transceivers”, Geneva, 2001.
[G992.1] ITU-T Recommendation G.992.1, “Asymmetric Digital Subscriber Line transceivers (ADSL)”, Geneva, 1999.
[G992.5] ITU-T Recommendation G.992.5, “Asymmetric Digital Subscriber Line 2 transceivers (ADSL2) – Extended bandwidth ADSL2 (ADSL2plus)”, Geneva, 2009.
[G993.2] ITU-T Recommendation G.993.2, “Very high speed digital subscriber line transceivers 2 (VDSL2)”, Geneva, 2011.
[GaHe87] S. Gade, H. Herlufsen, “Use of Weighting Functions in DFT/FFT analysis”, Brüel & Kjær Technical Reviews No. 3 & 4, 1987.
[GaJo03] J. Gastler, E. Jovanov, “Distributed Intelligent Sound Processing System”, Proc. of the 35th Southeastern Symposium on System Theory (SSST2003), Morgantown, West Virginia, pp. 409–412, March 2003.
[Gal62] R. G. Gallager, “Low-Density Parity-Check Codes”, IRE Transactions on Information Theory, 1962.
[Gal63] R. G. Gallager, “Low-Density Parity-Check Codes”, Cambridge, MA: MIT Press, 1963.
[GaMe05] A. Galtarossa, C. R. Menyuk, “Polarization Mode Dispersion”, Springer, New York, 2005.
[Gar59] W. W. Gartner, “Depletion Layer Photoeffects in Semiconductors”, Phys. Rev., Vol. 116, p. 84, 1959.
[Gar87] W. Gardner, “Introduction to Einstein’s Contribution to Time/Series Analysis”, IEEE ASSP Mag., Vol. 4, No. 4, pp. 44–45, 1987.
[Geetal03] D. Gesbert, M. Shafi, D.-S. Shiu, P. J. Smith, and A. Naguib, “From theory to practice: an overview of MIMO space-time coded wireless systems”, IEEE Journal on Selected Areas in Communications, Vol. 21, No. 4, pp. 281–282, 2003.
[Gha05] S. Ghahramani, “Fundamentals of probability with stochastic processes”, Pearson Prentice Hall, New Jersey, 2005.
[GhSo08] A. Ghasemi, E. S. Sousa, “Spectrum sensing in cognitive radio networks – Requirements, challenges and design trade-off”, IEEE Communications Magazine, Vol. 46, No. 4, pp. 32–39, May 2008.
[Gib16] J. Gibson, “Mobile Communications Handbook”, CRC Press, Boca Raton, 2016.
[Gib27] J. Gibbs, Letter in Nature 59, 1898–1899; also in Collected Works, Vol. II, pp. 258–259, Longmans, Green & Co., New York, 1927.
[Gib97] J. Gibson, “The Communications Handbook”, CRC Press, IEEE Press, USA, 1997.
[Gietal03] A. Giulietti, B. Bougard, L. Perre, “Turbo Codes: Desirable and Designable”, Kluwer Academic Publishers, 2003.
[Gietal90] R. D. Gitlin et al., “Method and apparatus for wideband transmission of digital signals between, for example, a telephone central office and customer premises”, US Patent 4 924 492, 1990.
[Goetal07] P. Golden, H. Dedieu, K. S. Jacobsen, “Implementation and Applications of DSL Technology”, Taylor & Francis, 2007, ISBN 978-1-4200-1307-8, p. 479.
[Goetal09] A. Goldsmith, S. Jafar, I. Maric, S. Srinivasa, “Breaking spectrum gridlock with cognitive radios – An information theoretic perspective”, Proc. IEEE, Vol. 97, No. 5, pp. 894–914, May 2009.
[Goi13] A. M. J. Goiser, “Handbuch der Spread-Spectrum Technik”, Springer, 2013.
[Gol05] A. Goldsmith, “Wireless Communications”, Cambridge University Press, 2005.
[Gol49] M. Golay, “Notes on Digital Coding”, Proc. IRE 37: 657, 1949.
[Gol67] R. Gold, “Optimal binary sequences for spread spectrum multiplexing”, IEEE Transactions on Information Theory, Vol. IT-13, No. 4, pp. 619–621, October 1967.
[Gou50] G. Goubau, “Surface waves and their Application to Transmission Lines”, Journal of Applied Physics, Volume 21, 1950.
[Gretal12] L. Gruner-Nielsen, Y. Sun, J. W. Nicholson, D. Jakobsen, R. Lingle, B. Palsdottir, “Few Mode Transmission Fiber with low DGD, low Mode Coupling and low Loss”, Proc. Optical Fiber Communication Conference (OFC), Paper PDP5A.1, 2012.
[Gro12] T. H. Gronwall, “Über die Gibbssche Erscheinung und die trigonometrischen Summen sin x + ½ sin 2x + ... + (1/n) sin nx”, Math. Ann. 72, pp. 228–243, 1912.
[Gui06] M. Guizani, “Wireless Communications Systems and Networks”, Springer, 2006.
[Hag94] J. Hagenauer, “Soft is Better than Hard – Communications, Coding and Cryptology”, Kluwer-Verlag, Leiden, January 1994.
[HaHo89] J. Hagenauer, P. Hoeher, “A Viterbi algorithm with soft decision outputs and its applications”, Proc. IEEE GLOBECOM, pp. 47.11–47.17, Dallas, 1989.
[Ham50] R. W. Hamming, “Error Detection and Error Correction Codes”, The Bell System Technical Journal, Vol. XXIX 2, 1950, pp. 147–160.
[HaNo06] J. Haupt, R. Nowak, “Signal reconstruction from noisy random projections”, IEEE Trans. Inform. Theory, Vol. 52, No. 9, pp. 4036–4048, 2006.
[Har28] R. V. L. Hartley, “Transmission of Information”, Bell Syst. Tech. Journal, Vol. VII, pp. 535–563, 1928.
[Har78] F. J. Harris, “On the use of windows for harmonic analysis with the discrete Fourier transform”, Proceedings of the IEEE 66 (1): pp. 51–83, 1978.
[Hat80] M. Hata, “Empirical Formula for Propagation Loss in Land Mobile Radio Services”, IEEE Transactions on Vehicular Technology, Vol. VT-29, No. 3, pp. 317–325, August 1980.
[Hay05] S. Haykin, “Cognitive radio – Brain-empowered wireless communications”, IEEE Journal on Selected Areas in Communications, Vol. 23, No. 2, pp. 201–220, Feb. 2005.
[Hay89] W. Hayt, “Engineering Electromagnetics”, McGraw-Hill, 1989, ISBN 0-07-027406-1.
[Hay94] S. Haykin, “Communication Systems”, 3rd Ed., John Wiley & Sons, Inc., New York, 1994.
[HDMI] HDMI.org, “HDMI – High Definition Multimedia Interface”, April 2002 (Retrieved June 2008).
[Hea25] O. Heaviside, Electrical Papers, Vol. 1, pp. 139–140, Boston, 1925.
[Hea87] O. Heaviside, “Electromagnetic Induction and its propagation”, “The Electrician”, June 1887.
[Hel87] G. Held, “Data Compression: Techniques and Applications, Hardware and Software Considerations”, John Wiley & Sons, 1987.
[HeMa96] G. Held, T. R. Marshall, “Data and image compression: tools and techniques”, John Wiley & Sons, 1996.
[Hil88] R. Hill, “A First Course In Coding Theory”, Oxford University Press, 1988, ISBN 0-19-853803-0.
[Hol96] G. l’Hopital, “Analyse des Infiniment Petits pour l’Intelligence des Lignes Courbes”, pp. 145–146, 1696.
[HSPA11] 3GPP release 7: High Speed Packet data Access (HSPA), 2011.
[Huf52] D. A. Huffman, “A Method for the Construction of Minimum-Redundancy Codes”, Proceedings of the IRE, Vol. 40 (9), pp. 1098–1101, 1952.
[IEEE1619] “IEEE Standard for Cryptographic Protection of Data on Block-Oriented Storage Devices”, IEEE Std 1619-2007, April 18, 2008, doi: 10.1109/IEEESTD.2008.4493450.
[IEEE802.1] 802.11n – IEEE Standard for Information technology – Local and metropolitan area networks – Specific requirements – Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications Amendment 5: Enhancements for Higher Throughput, 2009.
[IEEE802.11] IEEE 802.11: “Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications”, IEEE-SA, 5 April 2012, doi: 10.1109/IEEESTD.2012.6178212.
[IEEE802.11ac] 802.11ac – IEEE Standard for Information technology – Telecommunications and information exchange between systems – Local and metropolitan area networks – Specific requirements – Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications – Amendment 4: Enhancements for Very High Throughput for Operation in Bands below 6 GHz, 2013.
[IEEE802.16] 802.16 – IEEE Standard for Local and metropolitan area networks Part 16: Air Interface for Broadband Wireless Access Systems, 2009.
[IEEE802.3.1] IEEE 802.3.1: “IEEE Standard for Management Information Base (MIB) Definitions for Ethernet”, 2011.
[IEEE802.5] IEEE 802.5: “IEEE Standard for Local Area Networks – Token Ring Access Method and Physical Layer Specifications”, 1989.
[IEEE802.9] IEEE 802.9: “Local and Metropolitan Area Networks – IEEE Standard for Integrated Services (IS), LAN Interface at the Medium Access Control (MAC) and Physical (PHY) Layers”, 1994.
[IETF09] IETF (2009-07-06), SILK Speech Codec – draft-vos-silk-00.txt, (https://tools.ietf.org/html/draft-vos-silk-00#page-4)
[Isetal84] H. Ishio, J. Minowa, K. Nosu, “Review and status of wavelength-division-multiplexing technology and its application”, Journal of Lightwave Technology, Vol. 2, Issue 4, pp. 448–463, Aug 1984.
[ISO10116] ISO/IEC 10116:2006, Information technology – Security techniques – Modes of Operation for an n-bit block cipher algorithm.
[ISO10118-1] ISO/IEC 10118-1:2000, Information technology – Security techniques – Hash-functions – Part 1: General.
[ISO10118-3] ISO/IEC 10118-3:2004, Information technology – Security techniques – Hash-functions – Part 3: Dedicated hash-functions.
[ISO10918-1] ISO/IEC 10918-1:1994, Information technology – Digital compression and coding of continuous-tone still images: Requirements and guidelines.
[ISO11172] ISO/IEC Standard 11172-3, “Information technology – Coding of moving pictures and associated audio for digital storage media at up to about 1,5 Mbit/s – Part 3: Audio”, ISO, 1993.
[ISO11770-1] ISO/IEC 11770-1:2010, Information technology – Security techniques – Key management – Part 1: Framework.
[ISO11770-3] ISO/IEC 11770-3:2015, Information technology – Security techniques – Key management – Part 3: Mechanisms using asymmetric techniques.
[ISO11770-6] ISO/IEC FDIS 11770-6:2015, Information technology – Security techniques – Key management – Part 6: Key derivation.
[ISO11801] ISO/IEC Recommendation 11801, “Information technology – Generic cabling for customer premises”, Geneva, 2002.
[ISO14888-1] ISO/IEC 14888-1:2008, Information technology – Security techniques – Digital signatures with appendix – Part 1: General.
[ISO14888-2] ISO/IEC 14888-2:2008, Information technology – Security techniques – Digital signatures with appendix – Part 2: Integer factorization based mechanisms.
[ISO14888-3] ISO/IEC 14888-3:2006, Information technology – Security techniques – Digital signatures with appendix – Part 3: Discrete logarithm based mechanisms.
[ISO15444-1] ISO/IEC 15444-1:2000, Information technology – JPEG 2000 image coding system – Part 1: Core coding system.
[ISO15946-1] ISO/IEC 15946-1:2008, Information technology – Security techniques – Cryptographic techniques based on elliptic curves – Part 1: General.
[ISO18031] ISO/IEC 18031:2011, Information technology – Security techniques – Random bit generation.
[ISO18033-1] ISO/IEC 18033-1:2015, Information technology – Security techniques – Encryption algorithms – Part 1: General.
[ISO18033-3] ISO/IEC 18033-3:2010, Information technology – Security techniques – Encryption algorithms – Part 3: Block ciphers.
[ISO18033-4] ISO/IEC 18033-4:2011, Information technology – Security techniques – Encryption algorithms – Part 4: Stream ciphers.
[ISO19772] ISO/IEC 19772:2009, Information technology – Security techniques – Authenticated encryption.
[ISO29192-2] ISO/IEC 29192-2:2012, Information technology – Security techniques – Lightweight cryptography – Part 2: Block ciphers.
[ISO29192-3] ISO/IEC 29192-3:2012, Information technology – Security techniques – Lightweight cryptography – Part 3: Stream ciphers.
[ISO7498-2] ISO/IEC 7498-2:1989, Information processing systems – Open Systems Interconnection – Basic Reference Model – Part 2: Security Architecture.
[ISO9796-2] ISO/IEC 9796-2:2010, Information technology – Security techniques – Digital signature schemes giving message recovery – Part 2: Integer factorization based mechanisms.
[ISO9796-3] ISO/IEC 9796-3:2006, Information technology – Security techniques – Digital signature schemes giving message recovery – Part 3: Discrete logarithm based mechanisms.
[ISO9797-1] ISO/IEC 9797-1:2011, Information technology – Security techniques – Message Authentication Codes (MACs) – Part 1: Mechanisms using a block cipher.
[ISO9797-2] ISO/IEC 9797-2:2011, Information technology – Security techniques – Message Authentication Codes (MACs) – Part 2: Mechanisms using a dedicated hash-function.
[ISO9797-3] ISO/IEC 9797-3:2011, Information technology – Security techniques – Message Authentication Codes (MACs) – Part 3: Mechanisms using a universal hash-function.
[ISO9798-1] ISO/IEC 9798-1:2010, Information technology – Security techniques – Entity authentication – Part 1: General.
[IT15444-12] Information Technology – JPEG2000 Image Coding System: ISO Base Media File Format, ISO/IEC Final Draft International Standard 15444-12, July 2003.
[IT15444-13] Information Technology – JPEG2000 Image Coding System: An Entry Level JPEG 2000 Encoder, ISO/IEC Final Draft International Standard 15444-13, July 2008.
[IT15444-14] Information Technology – JPEG2000 Image Coding System: XML Representation and Reference, ISO/IEC Final Draft International Standard 15444-14, July 2013.
[IT15444-6] Information Technology – JPEG2000 Image Coding System: Compound Image File Format, ISO/IEC Final Draft International Standard 15444-6, April 2003.
[ITU-G703] ITU-T Recommendation G.703: “Physical/electrical characteristics of hierarchical digital interfaces”, 2016.
[ITU-G704] Recommendation ITU-T G.704: Synchronous Frame Structure at 1544, 6312, 2048, 8448 and 44 736 kbit/s hierarchical levels, 1998.
[ITU-G707] Recommendation ITU-T G.707: Network node interface for the synchronous digital hierarchy.
[ITU-G722] Recommendation ITU-T G.722: 7 kHz audio-coding within 64 kbit/s, September 2012.
[ITU-G722.1] Recommendation ITU-T G.722.1: Low-complexity coding at 24 and 32 kbit/s for hands-free operation in systems with low frame loss, May 2005.
[ITU-G722.2] Recommendation ITU-T G.722.2: Wideband coding of speech at around 16 kbit/s using Adaptive Multi-Rate Wideband (AMR-WB), July 2003.
[ITU-G723.1] Recommendation ITU-T G.723.1: Dual rate speech coder for multimedia communications transmitting at 5.3 and 6.3 kbit/s, May 2006.
[ITU-G726] Recommendation ITU-T G.726: 40, 32, 24, 16 kbit/s Adaptive Differential Pulse Code Modulation (ADPCM), December 1990.
[ITU-G729] Recommendation ITU-T G.729: Coding of speech at 8 kbit/s using conjugate-structure algebraic-code-excited linear prediction (CS-ACELP), June 2012.
[ITU-G742] ITU-T Recommendation G.742: “Second order digital multiplex equipment operating at 8448 kbit/s and using positive justification”, 1988.
[ITU-G783/784] Recommendations ITU-T G.783, G.784: Characteristics of synchronous digital hierarchy (SDH) equipment functional blocks, 1988.
[ITU-G783] Recommendation ITU-T G.783: Characteristics of synchronous digital hierarchy (SDH) equipment functional blocks, 1988.
[ITU-G803] Recommendation ITU-T G.803: Architecture of transport networks based on the synchronous digital hierarchy (SDH), 2000.
[ITU-G961] ITU-T Recommendation G.961: “Digital transmission system on metallic local lines for ISDN basic rate access”, 1993.
[ITU-I430] ITU-T Recommendation I.430: “Basic user-network interface – Layer 1 specification”, 1995.
[ITU-M1677] ITU-R M.1677-1, “International Morse code Recommendation”, ITU-T, October 2009, Retrieved 2011.
[ITU-Q763] Recommendation ITU-T Q.763: Signaling System No. 7 – ISDN User Part formats and codes, 1999.
[ITU-T4] ITU-T Recommendation T.4, “Standardization of Group 3 Facsimile Apparatus for Document Transmission”, Geneva, 1988.
References 421
[ITU-T81]
[ITU-V28]
[ITU-V29]
[ITU-V32]
[ITU-V34]
[ITU-V42]
[JaNo84] [Jeetal00] [Jeetal99]
[JoSe03] [JoZi99] [JPEG2000] [Kaetal05]
[KaHo66] [Kam96] [KaNe95] [Kar03] [Kas66]
[KaSc80]
[Kat79]
ITU-T Recommendation T.81, “Information technology - Digital compression and coding of continuous-tone still images - Requirements and guidelines”, Geneva, 1992. ITU-T Recommendation V.28: “Data Communication Over the Telephone Network— Electrical Characteristics for Unbalanced Double-Current Interchange Circuits”, 1993. ITU-T Recommendation V.29: “Data Communication Over the Telephone Network— 9600 bits per second modem standardized for use on point-to-point 4-wire leased telephone-type circuits”, 1988. ITU-T Recommendation V.32: “Data Communication Over the Telephone Network—A family of 2‑wire, duplex modems operating at data signalling rates of up to 9600 bit/s for use on the general switched telephone network and on leased telephone-type circuits”, 1993. ITU-T Recommendation V.34: “Data Communication Over the Telephone Network—A modem operating at data signalling rates of up to 33 600 bit/s for use on the general switched telephone network and on leased point-to-point 2-wire telephonetype circuits”, 1998. ITU-T Recommendation V.42: “Data Communication Over the Telephone Network— Error-correcting procedures for DCEs using asynchronous-to-synchronous conversion”, 2002. N. S. Jayant, P. Noll, “Digital Coding of Waveforms”, Prentice-Hall, 1984. M. C. Jeruchim, P. Balaban, K. S. Shanmugan, “Simmulation of Communication Systems”, 2nd Ed., Kluwer Academic – Plenum Publishers, New York, 2000. W. G. Jeon, K. H. Chang, Y. S. Cho, “An equalization technique for orthogonal frequency-division multiplexing systems in time-variant multipath channels”. IEEE Transactions on Communications 47 (1): 27–32, (1999). C. R. Johnson, W. A. Sethares, “Telecommunication Breakdown: Concepts of communication transmitted via software defined radio”, Prentice Hall, 2003. R. Johannesson, K. S. Zigangirov, “Fundamentals of Convolutional Coding”, IEEE Press, New York, 1999. “Overview of JPEG2000”, http://www.jpeg.org/jpeg2000/index.html, accessed on 21.11.2015., Accessed on 25.11.2015. G. 
Kabatiansky, E. Krouk, S. Semenov, “Error Correcting Coding and Security for Data Networks—Analysis of the Superchannel Concept”, John Wily and Sons, Ltd 2005. K. Kao, G. Hockham, “Dielectric-fibre surface waveguides for optical frequencies, Electrical Engineers”, Proceedings of IEE, 113(7), pp. 1151–1158, (1966). K. D. Kammeyer, “Nachrichtenuebertragung”, B.G. Teubner Stuttgart, 1996. D. MacKay, R. Neal, “Good codes based on very sparse matrices”, Cryptography and Coding, 5th IMA Conference, LNCS, pp. 100–111, Berlin, October 1995. S. V. Kartalopoulos, “DWDM: Networks, Devices and Technologies”, Wiley, 2003. T. Kasami, “Weight distribution formula for some class of cyclic codes”, Tech. Rep. R-285, Coordinated Science Laboratory, University of Illinois, Urbana, IL, April 1966. J. F. Kaiser, R. W. Schafer, „On the use of the I0-sinh window for spectrum analysis”, IEEE Transactions on Acoustics, Speech, and Signal Processing 28, pp. 105– 107, 1980. N. Katsuhiko, “InGaAsP heterostructure avalanche photodiodes with high avalanche gain”, Applied Physics Letters 35 (3), p. 251. (1979).
422 References
[Kay99] [Ker83] [Khi34] [Kiv93] [Kob87] [Koetal12]
[Koetal14]
[Kol33] [Kot33]
[Kut09] [Lau96] [Lec91] [Lee08] [Lee60] [Lee93] [Len66] [LiCo04] [Lietal07]
[Lietal11]
[LiJa13] [LT10] [Lue92] [LuTa02] [Mae91]
D. MacKay, “Good error-correcting codes based on very sparse matrices”, IEEE Trans. Inform. Theory, Vol. 45, no. 2, pp. 399–431, March, 1999. A. Kerckhoffs, “La cryptographie militaire”, Journal des sciences militaires, Vol. IX, pp. 5–38, “II. DESIDERATA DE LA CRYPTOGRAPHIE MILITAIRE.”, Jan. 1883. A. Khinchin, “Correlation Theory of Stationary Stochastic Processes”, Math. Ann., Vol. 109, pp. 604–615, 1934. Y. S. Kivshar, Phys. Rev. A 42, 1757 (1990); IEEE J. Quantum Electron. 29, 250 (1993). N. Koblitz, “Elliptic Curve Cryptosystems”, In: Mathematics of Computation, 48, Nr. 177, American Mathematical Society, 1987, pp. 203−209. A. P. Kouzov, N.I. Egorova, M. Chrysos, F. Rachet, “Non-linear optical channels of the polarizability induction in a pair of interacting molecules”, NANOSYSTEMS: PHYSICS, CHEMISTRY, MATHEMATICS, 2012, 3 (2), p. 55. T. Koike-Akino, D. S. Millar, K. Kojima, et al, “Turbo Demodulation for LDPC-coded High-order QAM in Presence of Transmitter Angular Skew”, European Conference on Optical Communication (ECOC), Paper Th.1.3.2, September 2014. A. N. Kolmogorov, “Grundbegriffe der Wahrscheinlichkeitsrechnung”, SpringerVerlag, Berlin-Heilderberg, 1933. V. A. Kotelnikov, “On the carrying capacity of the ether and wire in telecommunications”, Material for the First All-Union Conference on Questions of Communication, Izd. Red. Upr. Svyazi RKKA, Moscow, 1933. H. Kuttruff, “Room Acoustics”, 5th ed., Spon Press, New York, 2009. T. Lauterbach, “Digital audio Broadcasting. Franzis–Verlag”, Feldkirchen 1996. J. W. Lechleider, “High Bit Rate Digital Subscriber Lines: A Review of HDSL Progress”, IEEE Journal on Selected Areas in Communications 9 (6): pp. 769–784, 1991. C.Y. Lee, “Mobile Communications Engineering-Theory and Applications”, Second edition, Mc-Graw Hill, 2008. Y. W. Lee, “Statistical Theory of Communication”, John Wiley & Sons, Inc., New York, 1960. W.C.Y. Lee, “Mobile Communications Design Fundamentals”, John Wiley & Sons, New York, 1993. A. 
Lender, “Correlative Level Coding for Binary-Data Transmission”, IEEE Spectrum, Vol. 3, pp. 104–115, February 1966. S. Lin, D. J. Costello, “Error Control Coding”, Pearson Prentice Hall, USA. 2004. Z. Li, Q. Sun, Y. Lian, C. W. Chen, “Joint Source-Channel-Authentication Resource Allocation and Unequal Authenticity Protection for Multimedia over Wireless Networks”, IEEE Trans. On multimedia, Vol. 9, no. 4, Jun 2007. Y. C. Liang, K. W. Chen, G. Y. Li, P. Maehoenen, “Cognitive radio networking and communications—An overview”, IEEE Trans. Vehicle Technology, Vol. 60, No.7, pp. 3386–3407, Sep. 2011. S. G. Lingala, M. Jacob, “Blind Compressive Sensing Dynamic MIR”, in IEEE Trans. Medical Imaging, Vol. 32, No. 6, pp. 1132–1145, June, 2013. “Fujitsu introduces compact integrated DP-QPSK receiver for 100Gbps digital coherent receiving systems”, Lightwave Technology, March 2010. H. D. Lueke, “Correlationssignale”, Springer, 1992. R. Ludwig, J. Taylor, “Voyager Telecommunications Manual”, JPL DESCANSO (Design and Performance Summary Series), March 2002. R. Maeusl, “Digitale Modulationsverfahren”, Huethig Buch Verlag Heildelberg, 1991.
References 423
[Maetal62]
[Mai60] [Mai67] [MaIn11] [Man47] [Mar64] [Mar98] [Mas69] [MaSc72] [Maz93] [Meetal87] [Meetal96] [Mel04] [Mer79] [Mid96] [Mij11] [Mil86] [Mil89] [MiMa99] [Mit92] [Nah02] [NiCh99] [NICT15] [NiSu95]
N. I. Marshall, W. P. Dumke, G. Burns, F. H. Dill Jr., G. Lasher, “Stimulated Emission of Radiation from GaAs p-n Junctions”, Applied Physics Letters 1 (3), pp. 62–64, 1962. T. H. Maiman, Speech at a Press Conference at the Hotel Delmonico, New York. July 7, 1960 (Retrieved December 31, 2013). T. H. Maiman, “Ruby laser systems”, U.S. Patent 3 353 115, 1967. D. Manolakis, V. K. Ingle: Applied Digital Signal Processing, Cambridge University Press, 2011. L. Mandeno, “Rural Power Supply Especially in Back Country Areas”, Proceedings of the New Zealand Institute of Engineers, Vol. 33, p. 234, 1947. E. A. Marland, “Early Electrical Communication”, Abelard-Schuman Ltd, London 1964, no ISBN, Library of Congress 64-20875, pp. 17–19. A. Marincic, “Opticke Telekomunikacije”, Univerzitet u Beogradu, Beograd, 1998, pp. 37–38, 42–46. J. L. Massey: Shift-register synthesis and BCH decoding, IEEE Transactions on Information Theory, IT-15, pp. 122–127, 1969. R. D. Maurer, P. C. Schultz, “Fused Silica Optical Waveguide”, Corning Glass Works, Corning, N.Y., assignee. U.S. Patent 3 659 915, 1972. F. Mazda, “Telecommunications Engineer’s Reference Book”, Butterworth Heinemann Ltd., 1993. R.J. Mears, L. Reekie, I. M. Jauncey, D. Payne, “Low-noise Erbium-doped fibre amplifier at 1.54 μm”, Electron. Letters, 23 (19), pp. 1026–1028, February 1987. A. J. Menezes, P.C. van Oorschot, S.A. Vanstone, Handbook of applied cryptography, CRC Press, Boca Raton, 1996. W. L. Melvin: “A STAP Overview”, IEEE Aerospace and Electronics Systems Magazine, Vol. 19, Issue 1, Part 2, pp.19–35, January 2004. R.C. Merkle, “Secrecy, authentication, and public key systems”, Stanford Ph.D. thesis 1979, pp. 13–15. D. Middleton, “An Introduction to Statistical Communication Theory”, McGraw-Hill Book Company, New York, 1958 (2nd reprint Ed., IEEE Press, New York 1996). M.Mijic, “Audio Systems”, Academic mind, Belgrade 2011. V. S. 
Miller, “Use of Elliptic Curves in Cryptography”, Advances in Cryptology— CRYPTO' 85 Proceedings, LNCS 218, Springer, 1986, pp. 417−426. S. Milojkovic, “Teorija Elektricnih kola”, Svjetlost Sarajevo, 1989, ISBN 86-0102368-1, p. 367. J. Mitola, J. G. Q. Maguire, “Cognitive radio—Making software radios more personal”, IEEE Personal Communications, Vol. 6, No. 4, pp. 13–18, Aug. 1999. J. Mitola, “Software Radios: Survey, Critical Evaluation and Future Directions”, IEEE National Telesystems Conference, pp. 13–15, 1992. P. J. Nahin, “Oliver Heaviside—The Life, Work, and Times of an Electrical Genius of the Victorian Age”, Johns Hopkins University Press, 2002, ISBN 0-8018-6909-9. D. Nister, C. Christopoulos, “Lossless Region of Interest with Embedded Wavelet image Coding”, Signal Processing, Vol. 78, No. 1, pp. 1–17, 1999. NICT, “Going beyond the Limits of Optical Fibers”, Press release April 24, 2015. (www.nict.go.jp/en/press/2015/04/24-1.html) C. Nill, C. Sundberg, “List and soft symbol output Viterbi algorithms: extensions and comparisons”, IEEE Trans. on Communications , Vol. 43, Issue: 2/3/4, pp. 277–287, 1995.
424 References
[Nut81] [Nyq28] [Ode70] [Ohetal13] [OhLu02] [Oketal68]
[Paetal15] [Pap84] [Par06]
[PeBr61] [Peetal83] [PeHo78] [PeMi93] [PfHa94] [Pha99] [Poh05] [PoWa86]
[Pro00] [Pro93] [Pup00] [R128] [RaBo98] [RaCi96] [RaGo75] [Raj08]
A. H. Nuttall, “Some windows with very good sidelobe behaviour”, IEEE Transactions on Acoustics, Speech, Signal Processing, ASSP-29, pp. 84–91. February 1981. H. Nyquist, “Certain topics in telegraph transmission theory”, Transactions of the American Institute of Electrical Engineers, Vol. 47, Issue 2, pp. 617–644, April 1928. J. P. Odenwalder, “Optimal decoding of convolutional codes”, Doctor Thesis, UCLA, 1970. H. Oh, A. Bilgin, M. W. Marcellin, “Visually Lossless Encoding for JPEG2000”, IEEE Trans. Image Process., Vol. 22, no. 1, pp. 189–201, 2013. J. R. Ohm, H. D. Lueke: Signaluebertragung, Springer, 2002. Y. Okumura, E. Ohmori, T. Kawano, K. Fukuda, “Field Strength and Its Variability in VHF and UHF Land-Mobile Radio Service”, Review of the Electrical Communication Laboratory, Vol. 16, No. 9-10, pp. 825–873, Sep–Oct. 1968. G. Patrick; P. Kristensen, S. Ramachandran, “Conservation of orbital angular momentum in air-core optical fibers”, Optica 2 (3): 267–270, 2015. A. Papoulis, “Probability, Random Variables and Stochastic Processes”, 2nd Ed., McGraw-Hill Book Company, New York, 1984. M. A. Parseval, in Mémoires présentés à l’Institut des Sciences, ettres et Arts, par divers savans, et lus dans ses assemblées. Sciences, mathématiques et physiques (Savans étrangers.), Vol. 1, pp. 638–648, 1806. W. W. Peterson, D. T. Brown, “Cyclic Codes for Error Detection”, Proceedings of the IRE 49 (1): 228–235, January 1961. T. P. Pearsall, L. Eaves and J.C. Portal, “Photoluminescence and Impurity Concentration in GaInAsP Lattice-matched to InP”, J. Appl. Phys. 52 pp. 1037–1047 (1983). T. P. Pearsall, R. W. Hopson Jr, Electronic Materials Conference, Cornell University, 1977, published in J. Electron. Mat. 7, pp. 133–146, (1978). W. B. Pennebaker, J. L. Mitchell, “JPEG Still Image Data Compression Standard”, Chapman & Hall, New York, 1993. J. Pfanzagl, R. Hamböker, “Parametric statistical theory”, pp.207–208, de Gruyter, Berlin DE, 1994, isbn=3-11-013863-8. A. G. 
Phadke, “Handbook of Electrical Engineering Calculations”, Marcel Dekker Inc, New York, 1999. K. C. Pohlmann, “Principles of Digital Audio”, 5th ed., McGraw-Hill, New York, 2005. C. D. Poole, R. E. Wagner, “Phenomenological approach to polarization dispersion in long single-mode fibers”, Electron. Letters, Vol. 22, no. 19, pp. 1029–1030, Sep. 1986. J. G. Proakis, “Digital Communications”, McGraw Hill, New York, 2000. A. Prosser, “Standards in Rechnernetzen”, Springer Verlag, Wien, 1993. M. Pupin, “Art of Reducing Attenuation of Electrical Waves and Apparatus Therefore”, US patent 0 652 230, 1900. https://tech.ebu.ch/docs/r/r128.pdf R. M. Rao, A. S. Bopardikar, “Wavelet Transforms: Introduction to Theory and Applications”, Addison-Wesley, MA, 1998. G. Raleigh, J. M. Cioffi, “Spatio-temporal coding for wireless communications”, Global Telecommunications Conference, London, UK, 1996. L. R. Rabiner, B. Gold, “Theory and Application of Digital Signal Processing”, Prentice-Hall, 1975. K. Rajagopal, “Textbook on Engineering Physics”, PHI, New Delhi, 2008, part I, Ch. 3.
References 425
[RaJo02]
[Ram28] [Rap02] [RBS1770-4] [RBS1771-1] [RBS2088-0] [RBS646-1] [ReCh99] [ReSo60]
[RFC4867]
[RFC7253] [Rie62] [Rietal78] [Ris76] [Roetal62] [Ron23] [Ros07] [RoSt95]
[RS232]
[Rul93] [Ryetal12]
[Ryetal12-2]
M. Rabbani, R. Joshi, “An Overview of the JPEG 2000 Still Image Compression Standard”, Signal Processing: Image Communication, Vol. 17, No. 1, pp. 3–48, 2002. C. V. Raman, “A new radiation”, Indian J. Phys. 2: 387–398, 1928 (Retrieved April 2013). T. S. Rappaport, “Wireless Communications”, Prentice Hall, 2002. Recommendation ITU-R BS.1770-4, “Algorithms to measure audio programme loudness and true-peak audio level”, 10/2015. Recommendation ITU-R BS.1771-1, “Requirements for loudness and true-peak indicating meters”, 01/2012. Recommendation ITU-R BS.2088-0, “Long form file format for the international exchange of audio programme materials with metadata”, 10/2015. Recommendation ITU-R BS.646-1, “Source encoding for digital sound signals in broadcasting studios”, 1986–1992. I. S. Reed, X. Chen, “Error-Control Coding for Data Networks”, MA: Kluwer Academic Publishers, Boston, 1999, ISBN 0-7923-8528-4. I. S. Reed, G. Solomon, “Polynomial codes over certain finite fields”, Journal of the Society for Industrial and Applied Mathematics. [SIAM J.], Vol. 8, pp. 300–304, 1960. RFC 4867, “RTP Payload Format and File Storage Format for the Adaptive Multi-Rate (AMR) and Adaptive Multi-Rate Wideband (AMR-WB) Audio Codecs”, p. 35, IETF, 2007. RFC 7253, “The OCB Authenticated-Encryption Algorithm”, IETF, 2014. R. P. Riesz, “High speed semiconductor photodiodes”, Rev. Sci. lustrum., 33 (994), 1962. R. Rivest, A. Shamir, L. Adleman, “A Method for Obtaining Digital Signatures”, CACM 21, 1978, pp. 120–126. J. Rissanen, “Generalized Kraft Inequality and Arithmetic Coding”, IBM Journal of Research and Development, Vol. 20(3), pp. 198–203, 1976. H. N. Robert et al., “Coherent Light Emission from GaAs Junctions”, Physical Review Letters 9 (9), 1962. F. Ronalds, “Descriptions of an Electrical Telegraph”, and of some other Electrical Apparatus, London, 1823. T. D. Rossing, “Handbook of Acoustics”, Springer, New York, 2007. T. O’Rourke, R. 
Stevenson, “Human Visual System Based Wavelet Decomposition for Image Compression”, J. Visual Commun. Image Represent, Vol. 6, pp. 109–121, 1995. EIA standard RS-232-C, “Interface between Data Terminal Equipment and Data Communication Equipment Employing Serial Binary Data Interchange”, Washington: Electronic Industries Association, Engineering Dept., 1969 (OCLC 38637094). C. Ruland, “Informationssicherheit in Datennetzen”, Datacom Verlag, Bergheim, 1993. ISBN 3-89238-081-3 R. Ryf, R. Essiambre, A. H. Gnauck, et al., “Space-Division Multiplexed Transmission over 4200 km 3-Core Microstructured Fiber”, Proc. National Fiber Optic Engineers Conference (NFOEC), Paper PDP5C.2, March 2012. R. Ryf, S. Randel, A. H. Gnauck, et al., “Mode-Division Multiplexing Over 96 km of Few-Mode Fiber Using Coherent 6 × 6 MIMO Processing”, Journal of Lightwave Technology, Vol. 30, Issue 4, pp. 521–531, 2012.
426 References
[Saetal12]
[Saetal12-2]
[Saetal15]
[Sal00] [ScAt84]
[Scetal96] [Sch05] [See05] [SeNi08] [SeSu94] [Sha02] [Sha48] [Sha49] [ShCa16] [Shetal09] [Sietal15]
[Sin64] [Skl97] [Skl97] [Sletal12]
[Smi01] [Spetal07] [Sti02] [Str71]
J. Sakaguchi, B. J. Puttnam, W. Klaus, et al., “19-core fiber transmission of 19x100x172-Gb/s SDM-WDM-PDM-QPSK signals at 305Tb/s”, Proc. Optical Fiber Communication Conference (OFCC), Paper PDP5C.1, March 2012. J. Sakaguchi, Y. Awaji, N. Wada, et al., “Space Division Multiplexed Transmission of 109-Tb/s Data Signals Using Homogeneous Seven-Core Fiber”, Journal of Lightwave Technology, Vol. 30, Issue 4, pp. 658–665, 2012. J. Sakaguchi, W. Klaus, J. M. Mendinueta, et al., “Realizing a 36-core, 3-mode fiber with 108 spatial channels”, Optical Fiber Communication Conference (OFCC), Paper PDP Th5C.2, March 2015. D. Salomon, “Data Compression”, Springer-Verlag New York, Inc., 2nd Ed., 2000. M. Schroeder, B. Atal, “Code-excited linear prediction (CELP): High-quality speech at very low bit rates”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 937–940, 1984. M. Schwarz, W. R. Bennett, S. Stein, “Communication Systems and Techniques”, IEEE Press, Inc., New York, 1996. M. Schwarz, “Mobile Wireless Communications”, Cambridge University Press, 2005. P. Seebach, “Standards and specs: The ins and outs of USB”, IBM, 26 April 2005. M. W. Seeger, H. Nickish, “Compressed Sensing and Bayesian Experimental Design”, in Proc. Int. Conf. on Machine learning (ICML), Helsinki, Finland, Jul. 2008. N. Seshadri, C. Sundberg, “List Viterbi decoding algorithms with applications”, IEEE Trans. Commun., Vol. 42, pp. 313–323, 1994. P. M. Shankar, “Introduction to Wireless Systems”, John Wiley & Sons, 2002. C.E. Shannon, “A Mathematical Theory of Communication”, Bell System Technical Journal, Vol. 27, pp. 379–423, 623–656, 1948. C. E. Shannon, “Communication in the presence of noise”, Proc. Institute of Radio Engineers, vol. 37, No.1, pp. 10–21, Jan. 1949. A. She, F. Capasso, “Parallel Polarization State Generation”, Scientific Reports, Nature, 2016. N. Shental, A. Amir, O. 
Zuk, “Identification of Rare Alleles and Their Carriers Using Compressed Se(que)nsing”, Nucleic Acids Research, Vol.38, No. 19, p. 179, 2009. L. Sibomana, H. J. Zepernick, H. Tran, “On the Outage Capacity of an Underlay Cognitive Radio Network”, Proc. IEEE Conference on Signal Processing and Communication Systems, Cairns, Australia, Dec. 2015. R. C. Singleton, “Maximum distance q-nary codes”, IEEE Trans. Inf. Theory 10 (2): 116–118, 1964. B. Sklar, “Rayleigh Fading Channels in Mobile Digital Communication Systems— Part 1: Characterization”, pp. 90–100, IEEE Communication Magazine, July 1997. B. Sklar, “Rayleigh fading channels in mobile digital communication systems—Part I”, IEEE Communication Magazine, pp 102–109, 1997. V. Sleiffer, Y. Jung, V. Veljanovski, et al., “73.7 Tb/s (96x3x256-Gb/s) modedivision-multiplexed DP-16QAM transmission with inline MM-EDFA”, Optics Express, 20(26), pp. B428–B438, 2012. G. Smith, “Teletype Communication Codes”, 2001. A. Spanias, T. Painter, V. Atti, “Audio Signal Processing and Coding”, John Wiley & Sons, Inc., Hoboken, New Jersey, 2007. J. Stillwell, “Mathematics and Its History”, Springer, 2002. J. Strutt, “On the scattering of light by small particles”, Philosophical Magazine, Series 4, Vol. 41, (1871), pp. 447–454.
References 427
[STS00] [Suetal08]
[Suetal75]
[Swe12] [Swetal12]
[T800]
[T801]
[T802]
[T803]
[T804]
[T807]
[T808]
[T809]
[T810] [Taetal12]
[TaMa02] [Tan81] [Tau00]
STS00, “A Statistical Test Suite for Random and Pseudorandom Number Generators for Cryptographic Applications”, NIST Special Publication 800-22, 2000. P. D. Sutton, K. E. Nolan, L. E. Doyle, “Cycylostationary signatures in practical cognitive radio applications”, IEEE Journal on Selected Areas of Communications, Vol. 26, No. 1, pp. 13–24, January 2008. Y. Sugiyama, M. Kasahara, S. Hirasawa, T. Namekawa, “A Method for Solving Key Equation for Decoding Goppa Codes”, Information and Control, Vol. 27, no. 1, pp. 87–99, 1975. N. Swenson, “Combining 40G DP-QPSK with 10G OOK channels on metro/regional networks”, Lightwave Technology, December, 2012. N. Swenson, S. C. Wang, J. Cho, et al., “Use of High Gain FEC to Counteract XPM in Metro Networks Combining 40G Coherent DP-QPSK and 10G OOK Channels”, in IEEE Photonics Conference 2012, pp. 530–531, September 2012. Information technology – JPEG2000 Image Coding System: Core Coding System, ISO/IEC International Standard 15444-1, ITU Recommendation T.800, December 2000. Information Technology – JPEG2000 Image Coding System: Part II Extensions, ISO/IEC Final Draft International Standard 15444-2, ITU Recommendation T.801, August 2001. Information Technology – JPEG2000 Image Coding System: Motion JPEG2000, ISO/IEC Final Draft International Standard 15444-3, ITU Recommendation T.802, September 2001. Information Technology – JPEG2000 Image Coding System: Conformance Testing, ISO/IEC Final Draft International Standard 15444-4, ITU Recommendation T.803, May 2002. Information Technology – JPEG2000 Image Coding System: Reference Software, ISO/IEC Final Draft International Standard 15444-5, ITU Recommendation T.804, November 2001. Information Technology – JPEG2000 Image Coding System: Secure JPEG2000, ISO/IEC Final Draft International Standard 15444-8, ITU Recommendation T.807, July 2006. 
Information Technology – JPEG2000 Image Coding System: Interactivity Tools, APIs and Protocols, ISO/IEC Final Draft International Standard 15444-9, ITU Recommendation T.808, October 2004. Information Technology – JPEG2000 Image Coding System: Extensions for Threedimensional Data, ISO/IEC Final Draft International Standard 15444-10, ITU Recommendation T.809, December 2008. Information Technology – JPEG2000 Image Coding System: Wireless, ISO/IEC Final Draft International Standard 15444-11, ITU Recommendation T.810, June 2007. H. Takara, A. Sano, T. Kobayashi, et al., “1.01-Pb/s (12 SDM/222 WDM/456 Gb/s) Crosstalk-managed Transmission with 91.4-b/s/Hz Aggregate Spectral Efficiency”, European Conference and Exhibition on Optical Communication (ECOC), Paper Th.3.C.1, September 2012. D. Taubman, M. Marcellin, “JPEG2000: Standard for Interactive Imaging”, Proc. of the IEEE, Vol. 90, No. 8, pp. 1336–1357, 2002. R. M. Tanner, “A recursive approach to low complexity codes”, IEEE Trans. Information Theory, Vol. 27, no. 5, pp. 533–547, September 1981. D. Taubman, “High Performance Scalable Image Compression with EBCOT”, IEEE Trans. Image Process., Vol. 9, No. 7, pp. 1158–1170, 2000.
428 References
[Tes91]
[Tho69] [ToLi78] [Tom92] [Tor05] [TrGi07]
[Tyn70] [Umb10]
[VDSL99] [Veetal14] [Vic08] [Vit67]
[WaBe88] [Wea48] [Wei08] [Whi15]
[Wie30] [Wietal10] [Wil96] [Wil48] [WiSl77] [WuXi05]
N. Tesla, “Experiments with Alternate Currents of Very High Frequency and Their Application to Methods of Artificial Illumination”, American Institute of Electrical Engineers, Columbia College, New York, 1891. J. B. Thomas, “An Introduction to Statistical Communication Theory”, John Wiley & Sons, Inc., New York, 1969. W. J. Tomlinson, C. Lin, “Optical wavelength-division multiplexer for the 1–1.4micron spectral region”, Electronics Letters, Vol. 14, pp. 345–347, May 1978. W. J. Tomlinson et al., J. Opt. Soc. Am. B 9, 1134 (1992). D. Torrieri, “Principles of Spread-Spectrum Communication Systems”, Springer, 2005. J. Tropp, A. Gilbert, “Signal Recovery from Partial Information via Orthogonal Matching Pursuit”, IEEE Trans. Inform. Theory, Vol. 53, No. 12, pp. 4655–4666, 2007. J. Tyndall, “Total Reflexion, from Notes of a Course of Nine Lectures on Light”, Royal institution of Great Britain, London, 1870. S. E. Umbaugh, “Digital Image Processing and Analysis: Human and Computer Vision Applications with CVIPtools”, Second Edition, CRC Press, November 2010, ISBN 978-1-4398-0205-2. ETSI TS 101 270-1, V1.2.1, “Very high speed Digital Subscriber Line (VDSL); Part 1: Functional requirements”, 1999. M. Vetterli, J. Kovacevic, V. K. Goyal, “Foundations of Signal Processing”, Cambridge University Press, 2014. J. R. Victor, “Samuel Thomas von Sömmering's Space Multiplexed Electrochemical Telegraph (1808-1810)”, Harvard University website. A. J. Viterbi, “Error bounds for convolutional codes and an asymptotically optimum decoding algorithm”, IEEE Transactions on Information Theory, Vol.IT-13, no. 2, pp. 260–269, 1967. J. Walfisch, H.L. Bertoni, “A theoretical model of UHF propagation in urban environments”, IEEE Trans. Antennas and Propagation, pp. 1788–1796, Oct. 1988. W. Weawer, “Probability, rarity, interest and surprise”, Scientific Monthly, Vol. 67, pp. 390–392, 1948. P. 
Weinberger, “John Kerr and his Effects Found in 1877 and 1878”, Philosophical Magazine Letters 88 (12), pp. 897–907, 2008. E. T. Whittaker, “On the Functions Which are Represented by the Expansions of the Interpolation Theory”, Proc. Royal Soc. Edinburgh, Sec. A, Vol.35, pp. 181–194, 1915. N. Wiener, “Generalized Harmonic Analysis”, Acta Mathematica, Vol. 55, pp. 117– 258, 1930. M. P. Wilson, K. Narayanan, G. Caire, IEEE Trans. On Inform. Theory, Vol. 56, pp. 4922–4940, September 2010. S. G. Wilson, “Digital Modulation and Coding”, Prentice Hall, Upper Saddle River, NJ, 1996. H. Wilbraham, “On a certain periodic function”, Cambridge & Dublin Math. J. 3, pp. 198–201, 1848. F. J. MacWilliams, N. Sloane, “The Theory of Error-Correcting Codes”, NorthHolland, Amsterdam, 1977, ISBN 0-444-85193-3. X. Wu, H. Xiang, IEEE commun. lett. 9 8 (2005), pp. 735–737.
References 429
[Yaetal13]
[YuAs09]
[ZhBo90] [Zhetal09]
[Zhetal10] [ZiLe78] [Zit10]
J. Yang, J. Thompson, X. Huang, T. Jin, Z. Zhou, “Random-Frequency SAR Imaging Based on Compressed Sensing”, in IEEE Trans. Geoscience and Remote Sensing, Vol. 51, No. 2, pp. 983–994, February 2013. T. Yuecek, H. Aslan, “A survey of spectrum sensing algorithms for cognitive radio applications”, IEEE Communications Surveys & Tutorials, Vol. 11, No. 1, pp. 116– 130, First Quarter 2009. W. Zhao, E. Bourkoff, Opt. Lett. 14, 703 (1989); Opt. Lett. 14, 808 (1989); Opt. Lett. 15, 405 (1990). G. Zhao, G. Y. Li, C. Yang, “Proactive detection of spectrum opportunities in primary systems with power control”, IEEE Trans. Wireless Communications, Vol. 8, No. 9, pp. 4815–4823, September 2009. X. Zhou et al., Proc. OFC’10, PDPB9 (2010). J. Ziv, A. Lempel, “Compression of information sequences via variable-rate coding”, IEEE Transactions on Information Theory, 24 (5) (1978), pp. 530–536. G. Zitkovic, “Introduction to stochastic processes—Lecture notes”, The University of Texas at Austin, 2010.
List of Acronyms
2PSK
2 Phase Shift Keying
3R
Re-transmit, Re-time, Re-shape
4B3T
4 Binary 3 Ternary
4PSK
4 Phase Shift Keying
A/D
Analog/Digital
AAC
Advanced Audio Coding
AC
Alternating Current
ADPCM
Adaptive Differential Pulse Code Modulation
ADS
Audio Data Storage
ADSL
Asymmetric Digital Subscriber Line
ADSL-RE
Asymmetric Digital Subscriber Line – Reach Extended
AES¹
Advanced Encryption Standard
AES²
Audio-Embedded Standard
AIFF
Audio Interchange File Format
Al2O3
Aluminium oxide
Al-PE
Aluminium-Polyethylene
AM/FM
Amplitude Modulation / Frequency Modulation
AMI
Alternate Mark Inversion
AMR
Adaptive Multi Rate
ANSI
American National Standards Institute
APP
A Posteriori Probability
APSK
Amplitude (or Asymmetric) Phase Shift Keying
ASCII
American Standard Code for Information Interchange
ASK
Amplitude Shift Keying
ASK/PSK
Amplitude Shift-Keying / Phase Shift-Keying
ATM
Asynchronous Transfer Mode
ATSC
Advanced Television Systems Committee
ATSC-M/H
Advanced Television Systems Committee – Mobile/Handheld
AU
Administrative Unit
AWGN
Additive White Gaussian Noise
B-bit
Balancing bit
B2O3
Boron trioxide
BASE-T
Baseband (transmission over) twisted-pair cable
BASE-TX
Baseband (transmission over) two-pair Cat5 (or better) cable
BASK
Binary Amplitude Shift Keying
BCH
Bose Chaudhuri Hocquenghem
BCJR
Bahl Cocke Jelinek Raviv
BER
Bit Error Rate
BPSK
Binary Phase Shift Keying
BS
Base Station
BSC
Binary Symmetric Channel
BT
Bandwidth-Time
CA
Certification Authority
Cat
Category
CATV
Cable Television
CBC
Cipher Block Chaining
CBC MAC
Cipher Block Chaining Message Authentication Code
CCD
Charge Coupled Device
CCITT
Comité Consultatif International Téléphonique et Télégraphique
CCS
Common Channel Signaling
CCTV
Closed Circuit TeleVision (Video surveillance)
CD
Compact Disc
CDM
Code Division Multiplexing
CDMA
Code Division Multiple Access
CD-ROM
Compact Disc – Read-Only Memory
CELP
Code-Excited Linear Prediction
CFB
Cipher Feedback
CLT
Central Limit Theorem
C-MFSK
Coherent Multiple Frequency Shift Keying
CMOS
Complementary Metal Oxide Semiconductor
COFDM
Coded Orthogonal Frequency Division Multiplexing
COST
(European) Cooperation in Science and Technology
CPFSK
Continuous Phase Frequency Shift Keying
CR
Cognitive Radio
CRC
Cyclic Redundancy Check
CRC-16-CCITT
Cyclic Redundancy Check - 16 - Comité Consultatif International Téléphonique et Télégraphique
CS
Compressive Sensing
CSI
Channel-State Information
CSPRNG
Cryptographically Secure Pseudorandom Number Generator
CSS
Chirp Spread Spectrum
CTR
Counter
CWDM
Coarse Wavelength Division Multiplexing
D&SP
Decoding and Signal Processing
DAB
Digital Audio Broadcasting
DSA
Digital Signature Algorithm
DASH
Dynamic Adaptive Streaming over HTTP
DAT
Digital Audio Tape
DBPSK
Differential Binary Phase Shift Keying
DC
Direct Current
DCC
Digital Compact Cassette
DCE
Data Circuit Terminating Equipment, Data Communication Equipment, Data Carrier Equipment
DCT
Discrete Cosine Transform
DDC
Digital Downconverter
DEBPSK
Differential Encoded Binary Phase Shift Keying
DEQPSK
Differential Encoded Quadrature Phase Shift Keying
DEMUX
Demultiplexer
DES
Data Encryption Standard
DFT
Discrete Fourier Transform
DGD
Differential Group Delay
DIF
Decimation-In-Frequency
DIT
Decimation-In-Time
DM¹
Dieselhorst Martin
DM²
Delta Modulation
DMB
Digital Multimedia Broadcasting
DMT
Discrete Multi-Tone
DOCSIS
Data Over Cable Service Interface Specification
DPCM
Differential Pulse Code Modulation
DP-QPSK
Dual Polarization Quadrature Phase Shift Keying
DQPSK
Differential Quadrature Phase Shift Keying
DR
Dynamic Range
DRM
Digital Radio Mondiale
DS1
Digital Signal 1
DSD
Direct Stream Digital
DSF
Dispersion Shifted Fiber
DSL
Digital Subscriber Line
DSLAM
Digital Subscriber Line Access Multiplexer
DSP
Digital Signal Processing
DS
Direct Sequence
DSSS
Direct Sequence Spread Spectrum
DSS
Digital Signature Standard
DTE
Data Terminal Equipment
DTMB
Digital Terrestrial Multimedia Broadcast
DTS
Digital Theatre Systems
DTV
Digital Television
DUC
Digital Up Converter
DVB
Digital Video Broadcasting
DVB-C/C2
Digital Video Broadcasting – Cable
DVB-H/NGH
Digital Video Broadcasting – Handheld / Next Generation Handheld
DVB-S/S2/S2X/SH
Digital Video Broadcasting – Satellite / 2nd-generation Satellite / Satellite services to Handhelds
DVB-T/T2
Digital Video Broadcasting – Terrestrial / 2nd-generation Terrestrial
DVD
Digital Versatile Disc
DVD-A
Digital Versatile Disc – Audio
DWDM
Dense Wavelength Division Multiplexing
DWT
Discrete Wavelet Transform
E1
E-carrier, level 1
E/O
Electrical/Optical
EBCOT
Embedded Block Coding with Optimized Truncation
EBU
European Broadcasting Union
ECB
Electronic Codebook
ECDSA
Elliptic Curve Digital Signature Algorithm
EDFA
Erbium Doped Fiber Amplifier
EDTV
Enhanced Definition Television
EH
Hybrid electromagnetic wave with dominant electric field
EIA
Electronic Industries Alliance
EIRP
Effective Isotropic Radiated Power
EM
Electro-Magnetic
EOF
End-Of-File
erf
Error-function
ERP
Effective Radiated Power
ETSI
European Telecommunications Standards Institute
EURO COST
European Cooperation in Science and Technology
EVS
Enhanced Voice Services
EXIT
EXtrinsic Information Transfer
F/UTP
Foiled/Unshielded Twisted Pair
FDM
Frequency Division Multiplexing
FEC
Forward Error Correction
FEF
Future Extension Frame
FEXT
Far-End Crosstalk
FFH
Fast Frequency Hopping
FFT
Fast Fourier Transform
FH
Frequency Hopping
FHSS
Frequency Hopping Spread Spectrum
FIPS
Federal Information Processing Standard
FIR
Finite Impulse Response
FPGA
Field Programmable Gate Array
FR (1)
Frame Relay
FR (2)
Field Representation
FSK
Frequency Shift Keying
FTP
Foil-shielded Twisted Pair
FTTH
Fiber-To-The-Home
FTTx
Fiber-To-The-x
GCM
Galois Counter Mode
GDSL
Gigabit Digital Subscriber Line
GE
Gigabit Ethernet
GeO2
Germanium dioxide
GF
Galois Field
GI
Graded-Index
GMAC
Galois Message Authentication Code
GMSK
Gaussian Minimum Shift Keying
GSM
Global System for Mobile Communications
HDA
Hybrid Digital Analog
HDB
High Definition Broadcast
HDBn
High Density Bipolar n
HDB-T2
High Definition Broadcast – Terrestrial (2nd-generation)
HDMI
High Definition Multimedia Interface
HDSL
High Data Rate Digital Subscriber Line
HDTV
High Definition Television
HE
Hybrid electromagnetic wave with dominant magnetic field
HFC
Hybrid Fiber-Coaxial
HH
High-High
Hi-Fi
High Fidelity
HL
High-Low
HMAC
Hashed Message Authentication Code
HSPA
High-Speed Packet-Access
I&D
Integrate & Dump
IAMI
Inverted Alternate Mark Inversion
ICI
Inter-Carrier Interference
ICT
Irreversible Color Transformation
IDFT
Inverse Discrete Fourier Transform
IEC
International Electro-technical Commission
IEEE
Institute of Electrical and Electronics Engineers
IIR
Infinite Impulse Response
ILBC
Internet Low Bitrate Codec
IP (1)
Intellectual Property
IP (2)
Internet Protocol
IPTV
Internet Protocol Television
IR
Infrared
IS
International Standard
ISDB
Integrated Services Digital Broadcasting
ISDB-C
Integrated Services Digital Broadcasting – Cable
ISDB-S
Integrated Services Digital Broadcasting – Satellite
ISDB-T
Integrated Services Digital Broadcasting – Terrestrial
ISDN
Integrated Services Digital Network
ISDN BRA
Integrated Services Digital Network Basic Rate Access
ISDN PRI
Integrated Services Digital Network Primary Rate Interface
ISI
Inter Symbol Interference
ISO
International Organization for Standardization
ISO/IEC
International Organization for Standardization / International Electro-technical Commission
ISO/IEC IS
International Organization for Standardization / International Electro-technical Commission International Standard
ISP
Internet Service Provider
IT
Information Technology
ITU
International Telecommunication Union
ITU-R
International Telecommunication Union – Radio-communication Sector
ITU-T
International Telecommunication Union – Telecommunication Standardization Sector
JP3D
JPEG 2000 – Extensions for 3-dimensional data (standard ISO/IEC 15444-10 or ITU T.809)
JPEG
Joint Photographic Experts Group
JPIP
JPEG 2000 – Interactive Protocol (standard ISO/IEC 15444-9 or ITU T.808)
JPSEC
JPEG 2000 Secure
JPWL
JPEG 2000 – Wireless applications
JPXML
JPEG 2000 XML
LAN
Local Area Network
LASER
Light-Amplification by Stimulated Emission of Radiation
LDPC
Low-Density Parity-Check
LED
Light-Emitting Diode
LFSR
Linear Feedback Shift Register
LH
Low-High
LL
Low-Low
LLR
Log Likelihood Ratio
LMS
Least Mean Square
LoS
Line-of-Sight
LOVA
List Output Viterbi Algorithm
LP
Linear Polarization
LTAC
Lossless Transform Coding of Audio
LTE
Long Term Evolution
LTE-A
Long Term Evolution – Advanced
LTI
Linear Time Invariant
LUFS
Loudness Unit Full Scale
MAC
Message Authentication Code
MAP
Maximum A-Posteriori Probability
MCF
Multi-Core Fiber
MD
Message-Digest
MDM
Mode Division Multiplexing
MDS
Maximum Distance Separable
MDx MAC
Message Digest x Message Authentication Code
MFSK
Multiple Frequency Shift Keying
MIMO
Multiple Input Multiple Output
MIT
Massachusetts Institute of Technology
ML
Maximum Likelihood
MM
Multi-Mode
MMF
Multi-Mode Fiber
MMS43
Modified Monitored Sum-43
MMSE
Minimum Mean Squared Error
MP3
MPEG-1 or MPEG-2 Audio Layer III
MPEG
Moving Picture Experts Group
MPLP
Multiple Physical Layer Pipe
MPSK
M-Phase Shift Keying
M-QAM
M-Quadrature Amplitude Modulation
MRI
Magnetic Resonance Imaging
MS
Mobile Station
MSAN
Multiservice Access Node
MSE
Mean Square Error
MSK
Minimum Shift Keying
MSOH
Multiplexer Section OverHead
MTS
Multichannel Television Sound
MU-MIMO
Multi-User Multiple Input Multiple Output
MUX
Multiplexer
NA
Numerical Aperture
NC-MFSK
Non-Coherent Multiple Frequency-Shift Keying
NDSF
Non-Dispersion-Shifted Fiber
NEXT
Near-End Crosstalk
NIST
National Institute of Standards and Technology
NLoS
Non-Line-of-Sight
NP
Non-deterministic Polynomial-time
NRZ
Non-Return-to-Zero
NRZI
Non-Return-to-Zero Inverted
NSP
Null-Space Property
NT
Network Terminator
NZD
Non-Zero Dispersion (positive or negative)
NZ-DSF
Non-Zero Dispersion-Shifted Fiber
O/E
Optical/Electrical
OAM
Orbital Angular Momentum
OAMMDM
Orbital Angular Momentum Mode Division Multiplexing
OCB
Offset Codebook Mode
OFB
Output Feedback
OFDM
Orthogonal Frequency Division Multiplexing
OH
Hydroxide
OMx
Optical Mode-x
OPGW
Optical Ground-Wire
OQPSK
Offset Quadrature Phase-Shift Keying
PBC
Polarization Beam Combiner
PBS
Polarization Beam Splitter
PC
Personal Computer
PCM
Pulse Code Modulation
PCS
Plastic-Clad-Silica
pdf
Probability density function
PDH
Plesiochronous Digital Hierarchy
PDL
Polarization-Dependent Loss
PDM
Polarization Division Multiplexing
PE
Polyethylene
Pels
Picture elements
PG
Primary Group
PKCS
Public Key Cryptography Standards
PLL
Phase Locked Loop
PM
Polarization-Maintaining
PMD
Polarization Mode Dispersion
PM-DQPSK
Polarization-Multiplexed Differential Quaternary Phase Shift-Keying
PM-QPSK
Polarization-Multiplexed Quaternary Phase Shift-Keying
pmf
Probability mass function
PN
Pseudo-Noise
POF
Plastic Optical Fiber
POH
Path Overhead
POTS
Plain Old Telephone Service
PRBG
Pseudo Random Bit Generator
PS
Parametric Stereo
PSD
Power Spectral Density
PSK
Phase Shift Keying
PSTN
Public Switched Telephone Network
PU
Primary User
PVC
Polyvinylchloride
QAM
Quadrature Amplitude Modulation
QG
Quaternary Group
QoS
Quality of Service
QPSK
Quaternary Phase Shift Keying
RADSL
Rate Adaptive Digital Subscriber Line
RC4
Rivest Cipher-4
RCT
Reversible Color Transformation
RDS
Running Digital Sum
RF
Radio Frequency
RFC
Request-For-Comments
RFID
Radio Frequency IDentification
RIP
Restricted Isometry Property
RIPEMD
RACE Integrity Primitives Evaluation Message Digest
RJ-45
Registered Jack-45 (network interface / connector type)
RLE
Run Length Encoding
ROI
Region-Of-Interest
ROM
Read Only Memory
RS
Reed Solomon
RS-232
Recommended Standard-232 for serial data transmission (by EIA)
RS-485
Recommended Standard-485 for balanced serial data transmission (by ANSI/TIA/EIA)
RSA
Rivest-Shamir-Adleman
RSC
Recursive Systematic Convolutional (encoder)
RSOH
Regenerator Section OverHead
RZ
Return-To-Zero
RZI
Return-To-Zero Inverted
S/FTP
(overall-) Shielded / Foil-shielded Twisted Pair
S/UTP
(overall-) Shielded / Unshielded Twisted Pair
SACD
Super Audio Compact Disc
SAM
Spin Angular Momentum
SAN
Storage Area Network
SAOC
Spatial Audio Object Coding
SATA
Serial Advanced Technology Attachment
SCADA
Supervisory Control and Data Acquisition
ScTP
Screened Twisted Pair
SD
Secure Digital
SDF
Spectral Density Function
SDH
Synchronous Digital Hierarchy
SDM
Space Division Multiplexing
S-DMB
Satellite Digital Multimedia Broadcasting
SDR
Software Defined Radio
SDSL
Symmetric Digital Subscriber Line
SDTV
Standard Definition Television
SEAL
Software Optimized Encryption Algorithm
SER
Symbol Error Rate
SFH
Slow Frequency Hopping
SFN
Single Frequency Network
SF/UTP
Shielded Foil / Unshielded Twisted Pair
SG
Secondary Group
SHA
Secure Hash Algorithm
SHDSL (1)
Symmetric High-speed Digital Subscriber Line
SHDSL (2)
Single-Pair High-Speed Digital Subscriber Line
SI
Step-Index
SILK
Audio compression format/codec developed by Skype Ltd.
SiO2
Silicon dioxide, Silica, Quartz
SISO
Soft Input Soft Output
SLOVA
Serial List Output Viterbi Algorithm
SM
Single-Mode
SMF
Standard Mono-mode Fiber
SNR
Signal-to-Noise Ratio
SOH
Section Overhead
SONET
Synchronous Optical NETworking
SOVA
Soft-Output Viterbi Algorithm
SP&C
Signal Processing and Coding
SQPSK
Staggered Quadrature Phase Shift-Keying
SS
Spread Spectrum
STDM
Statistical Time Division Multiplexing
STM
Synchronous Transfer Module
STM-1
Synchronous Transfer Module, level-1
STP
Shielded Twisted Pair
STS
Synchronous Transport Signal
SU
Secondary User
SWER
Single-Wire Earth-Return
T1
Transmission System 1 (T-carrier)
TCP
Transmission Control Protocol
TCP/IP
Transmission Control Protocol/Internet Protocol
T-DMB
Terrestrial Digital Multimedia Broadcasting
TE
Transversal Electric (field)
TEM
Transversal Electromagnetic (field)
TG
Ternary Group
TH
Time Hopping
THSS
Time Hopping Spread Spectrum
TIA
Telecommunications Industry Association
TDM
Time Division Multiplexing
TM
Transversal Magnetic (field)
TU
Tributary Unit
TUG
Tributary Unit Group
TV
Television
UDP
User Datagram Protocol
U/FTP
(overall-) Unshielded / Foil-shielded Twisted Pair
UHDSL
Universal High-Bit-Rate Digital Subscriber Line
UHDTV
Ultra High Definition Television
USAC
Unified Speech and Audio Coding
USB
Universal Serial Bus
UTP
Unshielded Twisted Pair
UV
Ultraviolet
V-bit
Violation bit
VC
Virtual Container
VCO
Voltage Controlled Oscillator
VCSEL
Vertical-Cavity Surface-Emitting Laser
VDSL
Very high-speed Digital Subscriber Line
VoIP
Voice over Internet Protocol
WAV
Waveform audio file format
WDM
Wavelength Division Multiplexing
WiMAX
Worldwide interoperability for Microwave Access
WLAN
Wireless Local Area Network
xDSL
x-Digital Subscriber Line
XML
Extensible Markup Language
XTS
cipherteXT Stealing
ZFE
Zero Forcing Equalizer
ZrF4
Zirconium fluoride
Index
“0”-Suppression 137 1000BASE-T 288 100BASE-TX 288 10GBASE-T 288 4 Binary 3 Ternary 194 4G/LTE 96 4PSK 219 μ-law 65 A/D converter 65, 75, 83, 84, 129, 130, 355 absorption 306, 307 AC 88, 277 acoustic 60, 70 additivity 42 ADPCM 67, 68, 77 ADS 73 ADSL 285, 286, 289, 290, 292 ADSL-RE 289 AES 376, 377, 389 AIFF 79 A-law 65, 66 algebraic code 172 algebraic curve 402 algebraically 171 alphabet 138, 141, 142, 175, 177, 212, 217 AM/FM 354 AMI 191, 193, 194 amplitude quantization 65 amplitude spectrum 8, 10, 18, 19, 20, 21, 22, 34, 35, 200 ANSI 137, 151, 235, 375, 408 antenna array 349 Antheil 261 anti-symmetric 44, 45 a-posteriori information 173 a-posteriori probability 177 approximation theory 33 a-priori distribution 166 arithmetic coding 140, 142, 144 artificial intelligence 82 ASCII 137 ASK 208, 210, 211, 225 ASK/PSK 225 asymmetric cryptography 365 asymmetric encryption 363, 395 asymmetric key pair 365
asynchronous TDM 238 ATM 150, 151, 235, 236, 291, 292, 384 ATSC 96 attacker 361, 362, 363, 368, 371, 380, 391, 392 attenuation coefficient 280, 285, 305, 306, 309 attenuation constant 280, 281, 283 attenuation 329, 331, 332, 336, 354 AU 236 audibility 72, 74 audio embedded standard 79 audio processing 69 auditory 66, 70, 71, 72, 73 authenticated encryption 386 authentication 361, 366, 371, 386, 388, 390, 398, 408 authenticity 368 autoconvolution 23, 27, 30 autocorrelation 22, 25, 63, 67, 257, 258, 259, 260, 261, 266 avalanche effect 362, 373, 382 average power 62, 63, 330, 341 average SNR 344, 346 AWGN 130, 171, 177, 178, 179, 180, 342 background noise 329 balanced line 275, 276, 277, 278 balancing (B) bit 194 balun 275, 287 band-pass 31, 32 band-stop 31, 32 Bartlett window 49, 50 baseband 129, 175, 180, 186, 341, 355 BASK 208, 209 baud 231 Baudot 231 BCH codes 149, 250 BCJR 166 beamforming 349, 351 bearer (B) channel 289 Bell 69, 284 bend-insensitive fiber 315 BER 130, 145, 179, 204, 348 Berlekamp-Massey 154, 387 Bessel function 51, 302, 303, 304, 338 BFSK 211 binary cyclic code 149
bipolar 188, 191 birthday paradox 371 Blackman window 49, 50 blind de-convolution 86 block algorithm 375, 395 block cipher 374, 375, 376, 377, 382, 389 block coding 145, 146, 152, 172 block encryption 375 block length 147, 171 bluetooth 79, 151, 261 Blum-Blum-Shub pseudo random bit generator 409 Boltzmann constant 358 BPSK 177, 216, 217, 218 Braille alphabet 136 brightness 85 broadband 2, 3, 129, 247, 268, 284 broadcast 69, 96, broadcasting 80, 81, 95 brute force attack 361, 391, 408 BS 329, 330, 340, 343, 354 BSC 176 burst 150, 152, 172, 238, 380, 381 butterfly 53, 54 camera 83, 84, 86, 94 Candès 39 cardinal theorem of interpolation 33 cardinality 38 CATV 93, 96, 272, 289, 296, 320 CBC 379, 381, 382, 384, 389 CCTV 296 CDM 230, 257, 261 cellular 356 CELP 68 central limit theorem 136 certificate 368, 369 certification authority 368, 369 CFB 379, 382, 383, 386, 387 channel decoder 130, 165, 173 channel encoder 129, 130, 159, 173 channel matrix 175, 176 channel state information 250, 349, 351, 352 characteristic equation 302, 303, 304 charge-coupled device 94 check node 156, 157, 158, 159 checksum 150, 151, 152, 370, 371 Chien-search-algorithm 154 chip length 264, 265
chip sequence 264, 266 chip 261, 264, 265, 266, 268, 269 chirp spread spectrum 261, 268 chosen plaintext attack 380 chromatic dispersion 310, 311, 314, 317 cipher text 361, 362, 363, 374, 375, 379, 380, 381, 382, 383, 384, 385, 386, 395, 396 circuit switched network 239 circular-shifting interleaver 164 CLEFIA 379 cluster 350 C-MFSK 213 CMOS 94 coaxial cable 284, 293, 294, 295, 296 coaxial pair 272, 276 code efficiency 140 code list 138, 139 code rate 140, 144, 159, 160, 163, 179 code redundancy 140 code sequence 257, 268 code symbol 139 code table 139 code tree 142 code 230, 250, 257, 258, 261, 262, 264, 268 codebook excitation 68 codec 67, 79 coding gain 145 COFDM 97, 249, 250, 251, 252 cognitive radio 356, 357, 359 coherence time 341 collision attack 392 collision resistance 370 colour level 85 combinatorial algorithms 41 common channel signalization 233 compact disc 69, 76, 78, 83, 150 companding 65, 66 complex additions 52 complex coefficient 18 complex Fourier coefficient 15 complex Fourier series 15 complex multiplications 52 complex signal 9 complexity theory 365 compressing 67, 68, 93 compressive sensing 32, 36, 37, 38, 39, 40 computer vision 23 concatenated codes 171, 172 conditional biphase 193
conditional probability 175, 176 confidentiality 361, 386, 395 confusion 362 consonants 57, 58, 61, 65 constellation diagram 217, 218, 220, 223, 224, 225, 226, 227, 329 constellation points 217 constructive interference 331, 335, 351 contrast improvement 84 contrast reparation 83 contrast stretching 84 controlled ISI 203 conventional CD 76 convex optimization 41 convolution 22, 23, 26, 27, 29, 30, 42, 43, 48, 49, 58, 86, 159, 160, 163 convolutional codes 23, 159, 165, 250 conjugate complex 9, 23, 28 conjugate 205 Cooley-Tukey algorithm 53 cooperative spectrum sensing 358 correlation receiver 205, 209, 211 correlation 22, 24, 25, 28, 29, 341, 342, 343 correlative coding 203 Costas loop 221 Counter (CTR) 379, 384 coupling transformer 275 CPFSK 214, 215 CRC 150, 233, 234 CRC-16-CCITT 151 cross power spectrum 24, 28 cross-convolution 23 cross-correlation 257, 258, 259, 266 crosstalk cancellation 287 crosstalk 276, 285, 286, 287, 290, 293, 326 cryptanalysis 361 cryptographic algorithm 361, 363, 371 cryptographic transformation 363 cryptography 361, 364, 365, 368, 388, 405, 406 cryptology 361 cryptosystem 361, 363, 398 CSPRNG 409 current power 62 cut-off frequency 286 cut-off wavelength 304, 313, 314 CWDM 252 cyclic autocorrelation 357 cyclic code 148, 150
cyclic linear block codes 152 cyclic shifting 149 cyclostationary 60 D/A converter 75, 355 DAB 80, 81, 242, 249, 250 DAT 76 data (D) channel 289 data link layer 239 data origin authentication 366 DBPSK 218, 219, 222 DCE 1 DCT 87 DEBPSK 218 decimation-in-time 53 decipherment 363 decision depth 162 dedicated hash function 373 DEFLATE 142 delay 329, 350 delta function 201 delta modulation 66, 77 delta-pulse 33, 35 demultiplexer 229, 239, 253 DEQPSK 222 descrambler 196 destructive interference 249, 331, 335 DESXL 379 detection filter 213 deterministic signal 6 DFT 47, 48, 49, 51, 52, 53, 54, 55, 248 dictionary attack 361, 382 Dieselhorst-Martin (DM - quad) 273 DIF FFT 56 differential coding 134 differential demodulator 218 differential entropy 135 differential group delay 312 differential Manchester code 192 differential mode transmission 284 diffraction 329, 354 diffusion 362, 377 digital backbone 293 digital downconverter 355 digital image processing 82, 83 digital multimedia broadcasting 96 digital optical disc 83 digital signature with message recovery 394, 395
digital signature 366, 371, 388, 392, 395 digital up converter 355 digitalization 33, 69, 76, 83, 93 Dirac impulse 42, 43, 45 direct sequence 264, 265, 267 direct spectrum sensing 358 direct wave 277, 279, 281, 282 Dirichlet’s condition 14 discontinuity 276 discrete convolution 159 discrete logarithm problem 398, 400, 405 discretization 33 dispersion-shifted fiber 314 distance coding 137 diversity coding 352 diversity 342, 343, 345, 350, 352, 358 DMB-satellite 96 DMB-terrestrial 96 DMT 97, 292 DOCSIS 289, 295 Dolby 75, 77, 78, 80 Dolph-Chebyshev window 49 Doppler effect 339, 341, 342 Doppler shift 334, 339, 340, 341, 342 Doppler spectrum 341 Doppler spread 341 DPCM 66, 67, 77, 88 DP-QPSK 323 DQBPSK 222 DQPSK 222 DRM 81, 249 DS1, 232 DSA 400, 401, 405, 407 DSL 241, 272, 287, 289, 290, 291, 292, 293, 320 DSLAM 291 DSP 76, 323, 325, 246, 247, 248, 250, 251, 252, 255, 355 DSS 400 DSSS 261 DTE 1 DTMB 96 DTS 77, 155, 172 DVB 96 DVB-H/NGH 96 DVB-S/S2/S2X/SH 96 DVB-T/T2 81, 93, 96, 97, 242, 249 DVD-Audio 76 DWDM 252, 314, 321, 322, 323
DWT 88, 90, 91 dynamic range 183 dynamic spectrum allocation 358 E1 194, 232, 233, 234, 235, 236, 237 EBCOT 92 ECB 379, 380, 382, 384 ECDSA 402, 405 EDFA 253, 322 edge detection 83, 84, 85, 92 Edison 69 EH 303, 304, 305 EIA 288, 316 EIRP 353 elastic medium 70 electric permittivity 302 Electro-technical Commission 77 El Gamal 398, 399, 400, 401 elliptic curve cryptography 406 elliptic curve 402, 403, 404, 405, 406 encipherment 363 energy detection 357 energy signals 24, 27, 29, 30 Enocoro 387 entity 365, 366, 368, 388 entropy 132, 133, 134, 135, 136, 140, 172 envelope detector 209, 210 envelope 209, 210, 212, 334, 335, 336 EOF 142 equal gain combining 348 equalization 130, 206 equalizer 206, 207, 242 erfc-function 179 ERP 353 error correcting codes 147, 159, 172, 333 error function 11 error probability 170 Euclidean algorithm 154, 399, 401 Euclidean space 37 Euler function 397, 398, 400 Eureka-147 249 EURO COST model 354 EVS 79 EXIT 173 expander 75 extended Euclidean algorithm 397 extended source 139 external noise 177, 188, 275 eye diagram 181
F/UTP 288 fast fading 343 FDM 230, 240, 241, 242, 243, 252, 261 FEC 249, 292, 327 feedback 130, 163, 325, 382, 387, 408 FEF 81 Fermat number 397 few-mode multi-core fiber 349 FFT 48, 51, 52, 53, 54, 55, 246, 248 FHH 263 FHSS 261 field distribution 305 filter 22, 23, 34, 43, 44, 45, 51, 58, 59, 67, 76, 77, 85 fingerprint 370 FIR 23, 43, 44, 45, 46, 51, 355 first Nyquist criterion 199 flux 277, 295 forgery attack 391 formant 57, 58, 59 Forney 154, 171 Fourier coefficient 11, 20, 23 Fourier series 11, 14, 15, 17, 18, 20, 23, 24, 26 Fourier transform pair 48 Fourier transform 10, 14, 21, 24, 26, 28, 29, 33, 34, 45, 47, 48, 52, 59, 197 FPGA 355 frequency domain 20, 43, 47, 49, 59, 86 frequency function 8, 21, 22 frequency hopping 261 frequency orthogonality 243 frequency response 43, 44, 45, 69 FSK 208, 210, 211, 212, 213, 214 FTP 287 FTTH 289 FTTx 289, 320 fusion splice 308, 309 G.fast 287 Gallager codes 163 Gamma distribution 64 Gauss 178, 204 Gaussian distribution 136, 338, 350 Gaussian noise 182, 204 Gauss-Jordan elimination 156 GCM 386 generator matrix 146, 156 generator polynomial 148, 149, 150, 151, 152, 153, 154
geometric optics 297, 299, 302, 310 Gibbs phenomenon 14, 51 Gigabit DSL 287, 290 Gigabit Ethernet 272, 288 glass fiber 271, 297, 299, 305, 307, 313 GMAC 386 GMSK 215 Golay codes 149 Gold sequences 261 graded-index fibers 317 greedy algorithms 41 group delay 311, 312 GSM 150, 215, 221 guard band 240 guard interval 333 Hagenauer 169 half byte packing 137 Hamming bound 148, 149, 150, 169 Hamming code 147, 148, 150 Hamming distance 145, 147, 152, 162, 164 Hamming weight 145, 397 Hamming window 49, 50 Hanning (Hann) window 49, 50 hard decision 130, 165, 166, 169, 177 harmonic 7, 9, 10, 12, 14, 26, 47, 59, 283 Hartley 131, 182 hash code 371, 373 hash function 370, 371, 373, 374, 388, 390, 398 hash value 370, 371, 373, 392, 395, 398, 401 Hata model 354 HDA 174 HDBn 193 HDB-T2 96 HDMI 288 HDSL 290 HE 303, 304, 305 Heaviside 283, 284 Heaviside’s transmission line 283 Helmholtz equations 301 HH 90 hidden node 358 hierarchical access 359 high fidelity 284 high-pass 31, 291 HIGHT 379 histogram equalization 84, 85 HL 90
HMAC 390, 391 Hoeher 169 homogeneity 42 homogeneous line 276, 277, 281, 282 Huffman coding 140, 141, 142, 144 hybrid approach 359 hybrid coding 68 hybrid compression 68 hybrid fiber-coaxial 295 hypothesis 177 I-axis 217 ICT 90 ideal isotropic antenna 353 identity matrix 146, 156 IDFT 48, 246 IEEE 155 ILBC 68 image processing 23, 136 image restoration 83, 86 image sharpness 86 impulse response 42, 43, 45, 341, 350, 355 in phase 217 in quadrature 217 incident wave 279 Infinite Impulse Response (IIR) 43, 44, 45 information sequence 129, 130, 144, 157, 166 Information Technology (IT) 80 information theory 131, 134, 182 infrared 306 initialization vector 381, 384 ink jet 83 inner code 171, 172 instantaneous signal-to-noise ratio 74 instantaneous SNR 344, 347 integrate dump receiver 206, 209, 211 integrity 292, 361, 366, 388, 390 intellectual property (IP) 355 intensity 81, 85, 94 Intercarrier Interference 240 interleaver 164, 172, 173 intermediate frequency 355 intermodal dispersion 310, 314, 317 intermodulation distortion 75, 76 internal noise 177, 222 International Telegraph Alphabet No. 2 137 Internet 68, 77, 79, 83, 86, 88 interpolation function 35 interval 142, 143, 144
interweave approach 359 intrinsic impedance 296 inverse filtering 86 inverse Fourier transform 21, 29 inverse operation 365, 378 Inverted AMI 191 IP 291, 292 IPTV 93, 96, 284, 292, 320 ISDB 96 ISDB-C 96 ISDB-S 96 ISDB-T 96 ISDB-T International 96 ISDN 1, 151, 191, 194, 195, 289 ISDN BRA 289 ISI 180, 182, 196, 197, 199, 201, 202, 203, 206, 207, 332, 333 IT Networking 81 iterative decoding 165 iterative joint source and channel coding 174 jitter 182 joint source and channel coding 130, 172, 173 JPEG 87, 142 JPEG 2000 88, 89, 90, 91, 92 Kaiser window 48, 51 Kasami sequences 261 Kerckhoffs’s principle 363 Kerr effect 323, 327 key length 361, 376, 406 key pair 365, 405 key recovery attack 391 key space 361 Kotelnikov 33 l’Hôpital rule 185 Lamarr 261 LAN 272, 287, 288, 320, 321 Laplace distribution 64 Laplace operator 301 laser 271, 297, 310, 313, 315, 316, 322 LDPC 155, 156, 163, 172, 173, 186, 250 LED 307, 310, 315 Lee model 354 Lena image 85 length extension attack 374, 390, 391 LFSR 150, 258, 387 LH 90
licence 356 lightwave 255 lightweight cipher 377, 379 line coding 2, 186 line decoder 130, 165, 175 line encoder 129, 175 line propagation constant 280 linear phase 22, 43, 44 linear system 40, 42 LL 90 LLR 165, 166, 169 logarithmic decibel scale (dB) 70 longitudinal component 300, 305 longitudinal waves 70 LoS 329, 331, 336, 337, 338 lossless 136, 140, 142 lossy 77, 87, 89, 90, 136 loudness 72, 74, 75, 80 loudspeaker 69 LOVA 166 low water peak fiber 315 lower bound 133, 135, 136, 169 low-pass 31, 198, 223, 246, 266, 291, 332, 355 LP 305, 325 LTE 207, 242 LTE-A 242 LTI 42, 58, 59, 67 LUFS 74 L-value 165, 250 MAC 371, 388, 389, 390, 391, 392, 408 machine learning 37 macro-bending 308 magnetic permeability 302, 311 main lobe 49, 51 man in the middle attack 368 Manchester code 192, 193 MAP 165, 166, 167, 177 Markov chain 134 Markov process 166, 167 masking 72, 77 master key 410 matched filter 205, 206, 209, 212, 358 matched impedance 275 matrix channel 350 maximal ratio combining 343, 346, 347, 348 maximal sequences 258 maximum likelihood path 162 maximum likelihood principle 162
MAXSHIFT method 91 Maxwell equations 277, 299 MDM 324 MDS 152 measurement matrix 38, 39, 40 medium 229, 240, 257 memory length 159 memoryless 132, 145, 162, 175, 176 Merkle-Damgård 374 metadata 79, 80 MFSK 212, 213, 215 micro-bending 308, 318 micro-diversity 343 microphone 57, 69, 93, 94 microwaves 299 MIMO 230, 255, 325, 348, 349, 350, 351 MIMO-OFDM 250 minimum error criterion 11 MIT 155, 171 Mitola 355, 356 ML 165 MMS43, 195 MMSE 173 modem 2, 150, 208 modulator 175, 208, 216 monochrome 81 Morse alphabet 136 MP3 142 MPEG 77, 78, 79, 80, 151 MPEG-1/-2 layer III (MP3) 77 MPEG-3D 79 MPEG-4 78, 79 MPEG AAC 77 MPEG DASH 79 MPEG-D USAC 79 MPLP 81 MPSK 224, 225 M-QAM 226, 227 MRI 37 MS 329, 330, 340 MSAN 272, 320 MSE 92 MSK 215 MSOH 236 MU - MIMO 351 multi-core fiber 324 multi-dimensional signal processing 348 multi-pair cable 275, 287
multipath propagation 180, 249, 267, 268, 329, 331, 332, 333, 336, 340, 349, 350 multiple diversity combining 343 multi-pulse excitation 68 natural language processing 23 NC-MFSK 213 NDSF 314 near–far problem 264, 268 NEXT 286, 288 NIST 386, 400, 407 NLoS 329, 331 No7 signalization 233 noise cancelation 83 noise floor 74, 75 noise reduction 75, 85 noiseless coding theorem 140 noisy-channel coding theorem 182, 183 non-coherent demodulation 209, 211 nonlinear optimization 37 non-repudiation 366 non-symmetric signal 275 normalized correlation 24 normalized critical frequency 304 normalized difference 299 normed vector space 37, 38 NP-hard problem 40 NRZ 188, 189, 190, 192 NRZI 188, 189 NSP 38, 39, 40 NT 2 numerical aperture 298, 299 Nyquist band 200, 203 Nyquist 33, 182, 183, 199, 200, 201, 202, 244 Nyquist’s criterion 244 Nyquist’s formula 183 NZ-DSF 314 OAM 229, 230, 256, 257 object description 83 OCB 386 OFB 379, 383, 384, 386, 387 OFDM 97, 207, 242, 230, 242, 243, 245, 246, 247, 248, 249, 250, 255, 349 Okumura model 354 one time key stream 387 one time pad 387 one way collision resistant hash function 370, 371
one way hash function 370 one way trapdoor function 365 one way propagation 334 on-off keying 188, 322 optical fiber 297, 299, 304, 306, 309, 310, 312, 316, 318 optical mode 316 optical window 253, 307, 313, 314, 315, 322 optimal algorithm 142 optimal code 139, 140, 142, 152 optimal filter 204, 205 optimal length 139, 142, 144 optimum receiver 204, 205 OQPSK 223, 224 oscilloscope 180 outer code 171, 172 overlay approach 359 packet switched network 239 padding 374, 375, 379, 389 parity check matrix 156 parity-check 146, 147, 148, 150, 153, 154, 156 partial breaking 361 partial entropy 134 partial response signalling 203 Pascal 70 patch-cord 309, 318 patch-panel 309 path gain 352, 353 path loss 180, 334, 352, 353, 354 path metric 169 PCM 76, 232 PCM/PDH 235 PCM30 233 PDH 233, 234, 235, 236, 237 PDM 230, 254, 255, 322 peak factor 63 peak level 74 peak power 62 Pederson 210 pels 81 perceptual coding 73 perceptual response 72 perfect code 147 perfect signal reconstruction 33 phase characteristic 22, 43, 44 phase coefficient 300, 302, 303, 305 phase constant 280, 281 phase differential modulation 226
phase spectrum 8, 18, 19, 20 phase tree 214, 215, 216 phase velocity 281, 283, 310 phoneme 58, 59, 63, 64 photo diode 309, 322 pitch 57, 58, 59, 67, 68, 72, 73 pixel 81, 84, 89, 94 PKCS#7 padding 375 plastic clad silica 313 plastic optical fiber 313 playback 69 PLL 221 PM 315 PMD 312 PN 258, 259 point addition 402, 404 point doubling 403, 404 pointer 8, 9, 236 Poisson window 49 polar 188, 190, 214 polarization 229, 230, 254, 255, 256 polyethylene 273, 297, 318, 320 POTS 287 Poulsen 210 power consumption 76, 269 power spectral density 178, 205, 341 power-line communication 348 Poynting vector 277 PRBG 407 precoding 203, 204, 350, 351 prediction 67, 351, 354, 357 pre-image resistance 370 PRESENT 377, 378, 379 Prewitt, Canny and Sobel 85 PRI 194 Primary ISDN 290 primary parameters 277, 279, 281, 283, 285, 295 primary protection 297, 314, 318 primary receiver detection 358 primary user 356, 357, 358, 359 primitive element 153 private key 361, 364, 365, 366, 367, 368, 388, 394, 398, 400, 405, 406 private transformation 365 proactive detection 358 probability density function 5, 64, 135, 136, 177, 337 process gain 267
propagating mode 303, 304, 305, 310, 312 propagation coefficient 300, 303, 304, 311, 312 propagation constant 280, 283 proximity effect 295 pseudo random number generator 370, 384, 408, 409 pseudonoise 258 pseudorandom sequence 195 PSK 208, 216, 217, 219, 225 PSTN 93, 290, 291 public key certificate 369 public key cryptography 368 public key database 369 public key 365, 366, 368, 369, 396, 405 public transformation 365 PVC 273, 293, 319 QAM 97, 245, 246, 247, 248, 255, 329 Q-axis 217 QoS 348, 356 QPSK 97, 179, 219, 220, 221, 223, 224, 225, 245, 246, 247, 248, 255 quantization 65, 66, 67, 75, 76, 77, 87, 90, 91, quartz glass 306 quasi-periodic 60 radio link 254, 349 radix-2, 53 raised-cosine 201, 202 random number generator 407, 409 random oracle 362, 373 random output 362 random process 60 random sequence 266 Rate-Adaptive DSL 289 Rayleigh distribution 337 Rayleigh fading 336, 341 Rayleigh scattering 306 RC4 387 RCT 90 RDS 195 Rectangular window (Dirichlet) 50 recursive filters 45 redundancy 129, 140, 144, 150, 171 Reed-Solomon codes 149, 154, 172, 250 reflectance factor 282 reflected wave 279, 281 refractive index 297, 302, 316, 323, 326 repetitive code 380
replay attack 389 resolution 65, 73, 76, 79, 90, 94. resonator 57, 58 RFID 268, 377 Rician distribution 338 Rician factor 338 Rician fading 336, 338 RIP 39 RIPEMD-160 373 RJ-45 288 RLE 88, 137 robustness 76, 79, 187, 242, 252, 269, 350 ROI 90 roll-off-factor 200 RS-485 274, 288 RSA pseudo random bit generator 409 RSA 395, 406, 409 RSC 163 RSOH 236 run length limited 193 RZ 189 RZI 190 S/FTP 288 S/UTP 287 Salt & Pepper 85 SAM 256 sampling theorem 33, 36 SAN 321 SATA 151 scrambler 173, 196 scrambling 189, 195 ScTP 288 SD memory card 151 SDH/SONET 235, 237, 238, 253, 254 SDM 230, 324 SEAL 387 second Nyquist criterion 201 second pre-image resistance 370 secondary parameters 280 secondary user 356, 357 secret key 361, 363, 371, 375, 381, 388, 396, secure channel 364 security 361, 363, 370, 387, 390, 395, 397, seed 386, 407 segmentation 83, 92 selection combining 343 selective fading 250, 252 self-information 131, 133
semi-random interleaver 164 sensitivity 69, 71 separation theorem 172 SER 179 set-top box 93 SF/UTP 288 SFN 249 SHA-1 373, 390 SHA-256 373, 390 SHA-384 373 SHA-512 373 shadowing 329 Shannon-Hartley theorem 182 Shannon entropy 140 Shannon 33, 36, 131, 140, 163, 172, 182, 185, 387 Shannon’s boundary 186 Shannon’s source coding theorem 140 Shannon’s theorem 182 shared key 388 shortened Reed-Solomon codes 154 Shutter telegraph 136 side lobe 49, 51 si-function 17, 35, 196, 198 signal to quantization noise ratio 66 signalization 191, 194, 214, 219 signal to interference ratio 266 signature 361, 366, 392, 394, 399, 401, 405 single-mode 304, 310, 313, 315, 317, 322 Singleton bound 152 SISO 165, 173 skin effect 285, 295 SLOVA 166 slow fading 330 slow frequency hopping 263 SMF 314 SNR 145, 178, 183, 186, 204, 250, 266, 286, 323, 344, 347, 348 soft decision 130, 177, 250 software defined radio 354, 356 SOH 236 Sound Forge 59 sound intensity 71 source coding 129, 132, 136, 138, 159, 172 source encoder 129, 172 source information 129, 172 SOVA 165, 169 space coordinates 81 space diversity reception 343
space-time coding 352 sparsity 36, 40 spatial audio 69 spatial diversity 350 spatial multiplexing 349 spectral characteristic 59 spectral domain 21 spectral efficiency 184, 215, 224, 287, 295, 322 spectral function 17, 35 spectral power density 61, 178 spectrum management 356 spectrum property rights 358 spectrum sensing 357 spectrum sharing 356, 359 spectrum utilisation 356, 359 speech activity coefficient 65 speech interpolation 65 speech processing 62 splicing 308, 314, 318 splitter 287, 290 spread spectrum 257, 261, 265 square-and-multiply-algorithm 397 stability 39, 44 standard deviation 64, 337 state diagram 134, 160 state transition 167 stationary 59, 67, 84 statistical coding 137 statistical independent 134 statistical properties 36 statistical TDM 238 step-index fiber 317 stereo-audio 230 STM-1 235 storage 69, 76, 82, 85 STP 287 stream cipher 382, 386 stuffing 234, 237 substitution 362, 376, 378 super audio CD 76 super channel 172 super decoder 172 super encoder 172 surprisal 131 surprise index 132 survivor 162, 170 SWER 274 switched combining 343, 345 symbol rate 46, 231, 245, 341
symbol sequence 138 symmetric block cipher 376 symmetric cryptography 364, 368, 388 symmetric encryption 363, 366 Symmetric High-speed DSL 286 symmetric key 388 symmetric pair 274 synchronization 130, 188, 191, 193, 232, 233, 235, 380, 382 syndrome 148, 154 systematic code 146, 163 T1 232, 235 Tanner graph 155 Tao 39 Taylor series 280 TCP/IP 236 TDM 230, 238, 268 TE 303, 305 teleconferencing 79 telefax 141 telegrapher’s equations 278 telegraphy 186, 229, 271 telephone 61, 65, 69, 272, 284, 289, 295 telephony 57, 60, 65, 69, 232, 238, 241, 261, 272, 285, 320 TEM 294 thermal noise 183 threshold combining 346 threshold 70, 75, 77, 177, 266, 345, 357, 359 THSS 261 TIA 288, 315 Tier coding 91 time domain 21, 47, 48, 341 time hopping 268 TM 303, 305 Token Ring 192 total reflection 297, 299 transfer function 59, 198, 200, 204, 212, 285, 332, 351 transform compression 68 transition matrix 134 transverse component 300 transverse wave 305 trellis coded modulation 250 trellis diagram 162, 167 triangular window (Bartlett) 50 tributary unit 236 tributary unit group 236
trusted third party 368 turbo code 130, 155, 163, 164, 172, 250 turbo coding 164 TV 92 twin-leads 276 twisted pair 186, 230, 232, 273, 276, 284, 287 U/FTP 287 ultraviolet 306 unbalanced circuit 275 unbalanced line 275, 277 unbalanced transfer 287 underlay approach 359 underutilized spectrum 358 uniform distribution 340 unipolar 188, 190, 202 Universal High-Bit-Rate DSL 290 upper bound 133, 135, 182, 185 USB 189 U-shape function 341 UTP 287 variance 136, 171 VCSEL 317 VDSL 285, 287, 289, 292 verification 361, 371, 389, 392, 394, 405 Vernam-Chiffre 387 vertices 156 Very high-speed DSL 287 vibrations 57, 69 violation (V-) bit 193 virtual container 236 virtual pitch 72 Viterbi algorithm 162, 166, 169 Viterbi decoding 250 VoIP 68
Voyager 172 Walfisch/Bertoni model 354 walkie-talkie 296 Walsh functions 260 water peaks 307 WAV 79 wave 57, 70 waveform compression 68 waveform 67 waveguide 297, 299, 302 wavelet restoration 86 WDM 230, 252 Weaver 132 web jukebox 79 Weierstrass form 402 WHIRLPOOL 373 Whittaker 33 white noise 178 wideband speech 68 wideband WDM 252 Wiener filtering 86 WiFi 155 WiMAX 155, 242, 289 window function 48 wired channel 129, 163, 329 wireless channel 129, 163, 172, 329, 336, 359 word compression 137 word error probability 179 xDSL 284, 286, 289 zero differential code 193 zero-complement differential code 193 ZFE 206