Communication Engineering Principles 2nd Edition
Ifiok Otung University of South Wales, Pontypridd, UK
This edition first published 2021 © 2021 John Wiley & Sons Ltd Edition History 2001 (1e, Palgrave Macmillan) All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by law. Advice on how to obtain permission to reuse material from this title is available at http://www.wiley.com/go/permissions. The right of Ifiok Otung to be identified as the author of this work has been asserted in accordance with law. Registered Offices John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, UK Editorial Office The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, UK For details of our global editorial offices, customer services, and more information about Wiley products visit us at www.wiley.com. Wiley also publishes its books in a variety of electronic formats and by print-on-demand. Some content that appears in standard print versions of this book may not be available in other formats. Limit of Liability/Disclaimer of Warranty While the publisher and authors have used their best efforts in preparing this work, they make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives, written sales materials or promotional statements for this work. The fact that an organization, website, or product is referred to in this work as a citation and/or potential source of further information does not mean that the publisher and authors endorse the information or services the organization, website, or product may provide or recommendations it may make. 
This work is sold with the understanding that the publisher is not engaged in rendering professional services. The advice and strategies contained herein may not be suitable for your situation. You should consult with a specialist where appropriate. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages. Library of Congress Cataloging-in-Publication Data Names: Otung, Ifiok, author. Title: Communication engineering principles / Ifiok Otung, University of South Wales, Pontypridd, UK. Other titles: Digital communications Description: 2nd edition. | Hoboken, NJ : John Wiley & Sons, Inc., USA, [2021] | Revised edition of Digital communications : principles and systems / Ifiok Otung. 2014. | Includes bibliographical references and index. Identifiers: LCCN 2020020271 (print) | LCCN 2020020272 (ebook) | ISBN 9781119274025 (cloth) | ISBN 9781119273967 (adobe pdf) | ISBN 9781119274070 (epub) Subjects: LCSH: Digital communications. | Digital communications–Equipment and supplies–Design and construction. Classification: LCC TK5103.7 .O88 2021 (print) | LCC TK5103.7 (ebook) | DDC 621.382–dc23 LC record available at https://lccn.loc.gov/2020020271 LC ebook record available at https://lccn.loc.gov/2020020272 Cover Design: Wiley Cover Image: © greenbutterfly/Shutterstock Set in 9.5/12.5pt STIXTwoText by SPi Global, Chennai, India
In memory of the short long lives of my parents Charles and Sylvia and the long short lives of my sisters Alice, Theresa, and Lucy. Go back in time until you can go no further, For trillions of years if you must. What do you see at this beginning? You see a particle, or you see God. Neither is more scientific, neither more religious. Both involve a leap of faith beyond reason. This book is dedicated to God, Who In the beginning created the heavens and the earth.
Contents

Preface
Acknowledgements
About the Companion Website

1 Overview of Communication Systems
1.1 Introduction
1.2 Nonelectrical Telecommunication
1.2.1 Verbal Nonelectrical Telecommunication
1.2.2 Visual Nonelectrical Telecommunication
1.2.2.1 Flags, Smoke, and Bonfires
1.2.2.2 Heliography
1.2.2.3 Semaphore
1.2.2.4 Demerits of Visual Nonelectrical Telecommunication
1.3 Modern Telecommunication
1.3.1 Developments in Character Codes
1.3.1.1 Morse Code
1.3.1.2 Baudot Code
1.3.1.3 Hollerith Code
1.3.1.4 EBCDIC Code
1.3.1.5 ASCII Code
1.3.1.6 ISO 8859 Code
1.3.1.7 Unicode
1.3.2 Developments in Services
1.3.2.1 Telegram
1.3.2.2 Telex
1.3.2.3 Facsimile
1.3.2.4 The Digital Era
1.3.3 Developments in Transmission Media
1.3.3.1 Copper Cable
1.3.3.2 Radio
1.3.3.3 Optical Fibre
1.4 Communication System Elements
1.4.1 Information Source
1.4.1.1 Audio Input Devices
1.4.1.2 Video Input Devices
1.4.1.3 Data Input Devices
1.4.1.4 Sensors
1.4.2 Information Sink
1.4.2.1 Audio Output Device
1.4.2.2 Visual Display Devices
1.4.2.3 Storage Devices
1.4.3 Transmitter
1.4.4 Receiver
1.5 Classification of Communication Systems
1.5.1 Simplex Versus Duplex Communication Systems
1.5.2 Analogue Versus Digital Communication Systems
1.5.3 Baseband Versus Modulated Communication Systems
1.5.3.1 Analogue Baseband Communication System
1.5.3.2 Discrete Baseband Communication System
1.5.3.3 Digital Baseband Communication System
1.5.3.4 Modulated Communication Systems
1.5.4 Circuit Versus Packet Switching
1.5.4.1 Circuit Switching
1.5.4.2 Packet Switching
1.6 Epilogue
References
Review Questions
2 Introduction to Signals and Systems
2.1 Introduction
2.2 What Is a Signal?
2.3 Forms of Telecommunication Signals
2.4 Subjective Classification of Telecommunication Signals
2.4.1 Speech
2.4.2 Music
2.4.3 Video
2.4.4 Digital Data
2.4.5 Facsimile
2.4.6 Ancillary and Control Signals
2.5 Objective Classification of Telecommunication Signals
2.5.1 Analogue or Digital
2.5.2 Periodic or Nonperiodic
2.5.3 Deterministic or Random
2.5.4 Power or Energy
2.5.5 Even or Odd
2.6 Special Waveforms and Signals
2.6.1 Unit Step Function
2.6.2 Signum Function
2.6.3 Rectangular Pulse
2.6.4 Ramp Pulse
2.6.5 Triangular Pulse
2.6.6 Sawtooth and Trapezoidal Pulses
2.6.7 Unit Impulse Function
2.6.8 Sinc Function
2.7 Sinusoidal Signals
2.7.1 Qualitative Introduction
2.7.2 Parameters of a Sinusoidal Signal
2.7.2.1 Angle
2.7.2.2 Amplitude
2.7.2.3 Angular Frequency
2.7.2.4 Frequency
2.7.2.5 Period
2.7.2.6 Wavelength
2.7.2.7 Initial Phase
2.7.2.8 Phase Difference
2.7.3 Addition of Sinusoids
2.7.3.1 Same Frequency and Phase
2.7.3.2 Same Frequency but Different Phases
2.7.3.3 Multiple Sinusoids of Different Frequencies
2.7.3.4 Beats Involving Two Sinusoids
2.7.4 Multiplication of Sinusoids
2.8 Logarithmic Units
2.8.1 Logarithmic Units for System Gain
2.8.2 Logarithmic Units for Voltage, Power, and Other Quantities
2.8.3 Logarithmic Unit Dos and Don’ts
2.9 Calibration of a Signal Transmission Path
2.10 Systems and Their Properties
2.10.1 Memory
2.10.2 Stability
2.10.3 Causality
2.10.4 Linearity
2.10.5 Time Invariance
2.10.6 Invertibility
2.11 Summary
Questions
3 Time Domain Analysis of Signals and Systems
3.1 Introduction
3.2 Basic Signal Operations
3.2.1 Time Shifting (Signal Delay and Advance)
3.2.2 Time Reversal
3.2.3 Time Scaling
3.3 Random Signals
3.3.1 Random Processes
3.3.2 Random Signal Parameters
3.3.3 Stationarity and Ergodicity
3.4 Standard Distribution Functions
3.4.1 Gaussian or Normal Distribution
3.4.2 Rayleigh Distribution
3.4.3 Lognormal Distribution
3.4.4 Rician Distribution
3.4.5 Exponential and Poisson Distributions
3.5 Signal Characterisation
3.5.1 Mean
3.5.2 Power
3.5.3 Energy
3.5.4 Root-mean-square Value
3.5.5 Autocorrelation
3.5.6 Covariance and Correlation Coefficient
3.6 Linear Time Invariant System Analysis
3.6.1 LTI System Response
3.6.2 Evaluation of Convolution Integral
3.6.3 Evaluation of Convolution Sum
3.6.4 Autocorrelation and Convolution
3.7 Summary
References
Questions
4 Frequency Domain Analysis of Signals and Systems
4.1 Introduction
4.2 Fourier Series
4.2.1 Sinusoidal Form of Fourier Series
4.2.1.1 Time Shifting
4.2.1.2 Time Reversal
4.2.1.3 Even and Odd Functions
4.2.1.4 Piecewise Linear Functions
4.2.2 Complex Exponential Form of Fourier Series
4.2.3 Amplitude and Phase Spectra
4.2.3.1 Double-sided Spectrum
4.2.3.2 Single-sided Spectrum
4.2.4 Fourier Series Application to Selected Waveforms
4.2.4.1 Flat-top-sampled Signal
4.2.4.2 Binary ASK Signal and Sinusoidal Pulse Train
4.2.4.3 Trapezoidal Pulse Train
4.3 Fourier Transform
4.3.1 Properties of the Fourier Transform
4.3.1.1 Even and Odd Functions
4.3.1.2 Linearity
4.3.1.3 Time Shifting
4.3.1.4 Frequency Shifting
4.3.1.5 Time Scaling
4.3.1.6 Time Reversal
4.3.1.7 Complex Conjugation
4.3.1.8 Duality
4.3.1.9 Differentiation
4.3.1.10 Integration
4.3.1.11 Multiplication
4.3.1.12 Convolution
4.3.1.13 Areas
4.3.1.14 Energy
4.3.2 Table of Fourier Transforms
4.3.3 Fourier Transform of Periodic Signals
4.4 Discrete Fourier Transform
4.4.1 Properties of the Discrete Fourier Transform
4.4.1.1 Periodicity
4.4.1.2 Symmetry
4.4.2 Fast Fourier Transform
4.4.3 Practical Issues in DFT Implementation
4.4.3.1 Aliasing
4.4.3.2 Frequency Resolution
4.4.3.3 Spectral Leakage
4.4.3.4 Spectral Smearing
4.4.3.5 Spectral Density and Its Variance
4.5 Laplace and z-transforms
4.5.1 Laplace Transform
4.5.2 z-transform
4.6 Inverse Relationship Between Time and Frequency Domains
4.7 Frequency Domain Characterisation of LTI Systems
4.7.1 Transfer Function
4.7.2 Output Spectral Density of LTI Systems
4.7.3 Signal and System Bandwidths
4.7.3.1 Subjective Bandwidth
4.7.3.2 Null Bandwidth
4.7.3.3 3 dB Bandwidth
4.7.3.4 Fractional Power Containment Bandwidth
4.7.3.5 Noise Equivalent Bandwidth
4.7.4 Distortionless Transmission
4.7.5 Attenuation and Delay Distortions
4.7.6 Nonlinear Distortions
4.8 Summary
References
Questions
5 Transmission Media
5.1 Introduction
5.2 Metallic Line Systems
5.2.1 Wire Pairs
5.2.2 Coaxial Cable
5.2.3 Attenuation in Metallic Lines
5.3 Transmission Line Theory
5.3.1 Incident and Reflected Waves
5.3.2 Secondary Line Constants
5.3.3 Characteristic Impedance
5.3.4 Reflection and Transmission Coefficients
5.3.5 Standing Waves
5.3.6 Line Impedance and Admittance
5.3.7 Line Termination and Impedance Matching
5.3.8 Scattering Parameters
5.3.9 Smith Chart
5.4 Optical Fibre
5.4.1 Optical Fibre Types
5.4.1.1 Single-mode Step Index
5.4.1.2 Multimode Step Index Fibre
5.4.1.3 Multimode Graded Index
5.4.2 Coupling of Light into Fibre
5.4.3 Attenuation in Optical Fibre
5.4.3.1 Intrinsic Fibre Loss
5.4.3.2 Extrinsic Fibre Loss
5.4.4 Dispersion in Optical Fibre
5.4.4.1 Intermodal Dispersion
5.4.4.2 Intramodal Dispersion
5.5 Radio
5.5.1 Maxwell’s Equations
5.5.2 Radio Wave Propagation Modes
5.5.2.1 Ground Wave
5.5.2.2 Sky Wave
5.5.2.3 Line-of-sight (LOS)
5.5.2.4 Satellite Communications
5.5.2.5 Mobile Communications
5.5.2.6 Ionospheric Scatter
5.5.2.7 Tropospheric Scatter
5.5.3 Radio Wave Propagation Effects
5.5.3.1 Ionospheric Effects
5.5.3.2 Tropospheric Attenuation
5.5.3.3 Tropospheric Scintillation
5.5.3.4 Depolarisation
5.5.3.5 Tropospheric Refraction
5.5.4 Reflection and Refraction
5.5.5 Rough Surface Scattering
5.5.6 Diffraction
5.5.6.1 Diffraction Configuration and Terms
5.5.6.2 Fresnel Zones
5.5.6.3 Knife-edge Diffraction Loss
5.5.7 Path Loss
5.5.7.1 Free Space Path Loss
5.5.7.2 Plane Earth Propagation Path Loss
5.5.7.3 Terrestrial Cellular Radio Path Loss
5.5.8 Radio Frequency Allocation
5.6 Summary
References
Questions
6 Noise in Communication Systems
6.1 Introduction
6.2 Physical Sources of Random Noise
6.2.1 Thermal or Johnson Noise
6.2.2 Quantisation Noise
6.2.3 Radio or Sky Noise
6.2.4 Shot Noise
6.2.5 Partition Noise
6.2.6 Quantum Noise
6.2.7 Flicker or 1/f Noise
6.3 Additive White Gaussian Noise
6.3.1 Gaussian PDF of Noise
6.3.2 White Noise
6.3.3 Canonical and Envelope Representations of Noise
6.4 System Noise Calculations
6.4.1 Available Noise Power
6.4.2 Equivalent Noise Temperature
6.4.3 Noise Figure of a Single System
6.4.4 Noise Figure of Cascaded Systems
6.4.5 Overall System Noise Temperature
6.4.6 Signal-to-noise Ratio
6.5 Noise Effects in Communication Systems
6.5.1 SNR in Analogue Communication Systems
6.5.2 BER in Digital Communication Systems
6.6 Summary
References
Questions
7 Amplitude Modulation
7.1 Introduction
7.2 AM Signals: Time Domain Description
7.2.1 AM Waveform
7.2.2 Sketching AM Waveforms
7.2.3 Modulation Factor
7.3 Spectrum and Power of Amplitude Modulated Signals
7.3.1 Sinusoidal Modulating Signal
7.3.2 Arbitrary Message Signal
7.3.3 Power
7.4 AM Modulators
7.4.1 Generation of AM Signal
7.4.1.1 Linearly-varied-gain Modulator
7.4.1.2 Switching and Square-law Modulators
7.4.2 AM Transmitters
7.4.2.1 Low-level Transmitter
7.4.2.2 High-level Transmitter
7.5 AM Demodulators
7.5.1 Diode Demodulator
7.5.2 Coherent Demodulator
7.5.3 AM Receivers
7.5.3.1 Tuned Radio Frequency (RF) Receiver
7.5.3.2 Superheterodyne Receiver
7.6 Merits, Demerits, and Application of AM
7.7 Variants of AM
7.7.1 DSB
7.7.1.1 Waveform and Spectrum of DSB
7.7.1.2 DSB Modulator
7.7.1.3 DSB Demodulator
7.7.1.4 DSB Applications
7.7.2 SSB
7.7.2.1 Merits and Demerits of SSB
7.7.2.2 SSB Modulators
7.7.2.3 SSB Demodulator
7.7.2.4 Applications of SSB
7.7.3 ISB
7.7.3.1 ISB Modulator
7.7.3.2 ISB Demodulator
7.7.3.3 ISB Merits, Demerit, and Application
7.7.4 VSB
7.7.4.1 VSB Modulator
7.7.4.2 VSB Demodulator
7.8 Summary
Questions
8 Frequency and Phase Modulation
8.1 Introduction
8.2 Basic Concepts of FM and PM
8.2.1 Frequency Modulation Concepts
8.2.2 Phase Modulation Concepts
8.2.3 Relationship Between FM and PM
8.2.3.1 Frequency Variations in PM
8.2.3.2 Phase Variations in FM
8.3 FM and PM Waveforms
8.3.1 Sketching Simple Waveforms
8.3.2 General Waveform
8.4 Spectrum and Power of FM and PM
8.4.1 Narrowband FM and PM
8.4.1.1 Frequency Components
8.4.1.2 Comparing AM, NBFM, and NBPM
8.4.1.3 Amplitude Variations in NBFM and NBPM
8.4.2 Wideband FM and PM
8.4.2.1 Spectrum
8.4.2.2 Power
8.4.2.3 Bandwidth
8.4.2.4 FM or PM?
8.5 FM and PM Modulators
8.5.1 Narrowband Modulators
8.5.2 Indirect Wideband Modulators
8.5.3 Direct Wideband Modulators
8.5.3.1 LCO Modulator
8.5.3.2 VCO Modulator
8.6 FM and PM Demodulators
8.6.1 Direct Demodulator
8.6.1.1 Filter-based Demodulator
8.6.1.2 Digital Demodulator
8.6.2 Indirect Demodulator
8.6.2.1 PLL Demodulation Process
8.6.2.2 PLL States
8.6.2.3 PLL Features
8.6.3 Phase Demodulator
8.6.4 Frequency Discriminators
8.6.4.1 Differentiators
8.6.4.2 Tuned Circuits
8.7 FM Transmitter and Receiver
8.7.1 Transmitter
8.7.2 SNR and Bandwidth Trade-off
8.7.3 Pre-emphasis and De-emphasis
8.7.4 Receiver
8.8 Noise Effect in FM
8.9 Overview of FM and PM Features
8.9.1 Merits
8.9.2 Demerits
8.9.3 Applications
8.10 Summary
Questions
9 Sampling
9.1 Introduction
9.2 Sampling Theorem
9.3 Proof of Sampling Theorem
9.3.1 Lowpass Signals
9.3.2 Bandpass Signals
9.3.3 Sampling at Nyquist Rate
9.4 Aliasing
9.5 Anti-alias Filter
9.6 Non-instantaneous Sampling
9.6.1 Natural Sampling
9.6.2 Flat-top Sampling
9.6.3 Aperture Effect
9.7 Summary
Questions
Reference
10 Digital Baseband Coding
10.1 Introduction
10.2 Concept and Classes of Quantisation
10.3 Uniform Quantisation
10.3.1 Quantisation Noise
10.3.2 Dynamic Range of a Quantiser
10.3.3 Signal-to-quantisation-noise Ratio (SQNR)
10.3.4 Design Considerations
10.3.5 Demerits of Uniform Quantisation
10.4 Nonuniform Quantisation
10.4.1 Compressor Characteristic
10.4.2 A-law Companding
10.4.3 μ-law Companding
10.4.4 Companding Gain and Penalty
10.4.5 Practical Nonlinear PCM
10.4.6 SQNR of Practical Nonlinear PCM
10.5 Differential PCM (DPCM)
10.5.1 Adaptive Differential Pulse Code Modulation (ADPCM)
10.5.2 Delta Modulation
10.5.2.1 Quantisation Error
10.5.2.2 Prediction Filter
10.5.2.3 Design Parameters
10.5.2.4 Merits and Demerits of DM
10.5.2.5 Adaptive Delta Modulation (ADM)
10.5.2.6 Delta Sigma Modulation
10.6 Low Bit Rate Speech Coding
10.6.1 Waveform Coders
10.6.2 Vocoders
10.6.2.1 IMBE
10.6.2.2 LPC
10.6.2.3 MELP
10.6.3 Hybrid Coders
10.6.3.1 APC
10.6.3.2 MPE-LPC
10.6.3.3 CELP
10.7 Line Codes
10.7.1 NRZ Codes
10.7.2 RZ Codes
10.7.3 Biphase Codes
10.7.4 RLL Codes
10.7.5 Block Codes
10.8 Summary
Reference
Questions
11 Digital Modulated Transmission
11.1 Introduction
11.2 Orthogonality of Energy Signals
11.3 Signal Space
11.3.1 Interpretation of Signal-space Diagrams
11.3.2 Complex Notation for 2D Signal Space
11.3.3 Signal-space Worked Examples
11.4 Digital Transmission Model
11.5 Noise Effects
11.6 Symbol and Bit Error Ratios
11.6.1 Special Cases
11.6.2 Arbitrary Binary Transmission
11.7 Binary Modulation
11.7.1 ASK
11.7.2 PSK
11.7.3 FSK
11.7.3.1 Generation
11.7.3.2 Spectrum
11.7.3.3 Frequency Spacing and MSK
11.7.4 Minimum Transmission Bandwidth
11.8 Coherent Binary Detection
11.8.1 ASK Detector
11.8.2 PSK Detector
11.8.3 FSK Detector
11.9 Noncoherent Binary Detection
11.9.1 Noncoherent ASK Detector
11.9.2 Noncoherent FSK Detector
11.9.3 DPSK
11.10 M-ary Transmission
11.10.1 Bandwidth Efficiency
11.10.2 M-ary ASK
11.10.2.1 M-ary ASK Modulator
11.10.2.2 M-ary ASK Detector
11.10.2.3 BER of M-ary ASK
11.10.3 M-ary PSK
11.10.3.1 QPSK Modulator and Detector
11.10.3.2 M-ary PSK Modulator and Detector
11.10.3.3 BER of M-ary PSK
11.10.4 M-ary FSK
11.10.4.1 M-ary FSK Modulator and Detector
11.10.4.2 BER of M-ary FSK
11.10.4.3 Noise-bandwidth Trade-off in M-ary FSK
11.10.5 M-ary APSK
11.10.5.1 16-APSK
11.10.5.2 BER of Square M-ary APSK
11.11 Design Parameters
11.12 Summary
Reference
Questions
12 Pulse Shaping and Detection
12.1 Introduction
12.2 Anti-ISI Filtering
12.2.1 Nyquist Filtering
12.2.2 Raised Cosine Filtering
12.2.3 Square Root Raised Cosine Filtering
12.2.4 Duobinary Signalling
12.2.4.1 Cosine Filter
12.2.4.2 Signal Power Trade-off
12.2.4.3 Sine Filter
12.2.4.4 Polybinary Signalling
12.3 Information Capacity Law
12.4 The Digital Receiver
12.4.1 Adaptive Equalisation
12.4.2 Matched Filter
12.4.2.1 Specification of a Matched Filter
12.4.2.2 Matched Filter by Correlation
12.4.2.3 Matched Filter Worked Examples
12.4.3 Clock Extraction
12.4.4 Eye Diagrams
12.5 Summary
References
Questions
13 Multiplexing Strategies
13.1 Introduction
13.2 Frequency Division Multiplexing
13.2.1 General Concepts
13.2.2 Demerits of Flat-level FDM
13.2.3 Future of FDM Technology
13.2.4 FDM Hierarchies
13.2.4.1 UK System
13.2.4.2 European System
13.2.4.3 Bell System
13.2.4.4 Nonvoice Signals
13.2.5 Wavelength Division Multiplexing
13.3 Time Division Multiplexing
13.3.1 General Concepts
13.3.2 Plesiochronous Digital Hierarchy
13.3.2.1 E1 System
13.3.2.2 T1 and J1 Systems
13.3.2.3 PDH Problems
13.3.3 Synchronous Digital Hierarchy
13.3.3.1 SDH Rates
13.3.3.2 SDH Frame Structure
13.3.3.3 SONET
13.3.4 ATM
13.3.4.1 ATM Layered Architecture
13.3.4.2 ATM Network Components
13.3.4.3 ATM Cell Header
13.3.4.4 ATM Features Summary
13.3.4.5 ATM Versus IP
13.4 Code Division Multiplexing
13.4.1 Types of Spread Spectrum Modulation
13.4.2 CDM Transmitter
13.4.3 CDM Receiver
13.4.4 Crucial Features of CDM
13.4.4.1 Synchronisation
13.4.4.2 Cross-correlation of PN Codes
13.4.4.3 Power Control
13.4.4.4 Processing Gain
13.5 Multiple Access
13.5.1 FDMA
13.5.2 TDMA
13.5.3 CDMA
13.5.4 Hybrid Schemes
13.6 Summary
Questions
Appendix A Character Codes
Appendix B Trigonometric Identities
Appendix C Tables and Constants
C.1 Constants
C.2 SI Units
C.3 Complementary Error Function erfc(x) and Q function Q(x)
Index
Preface
If nature gives you a free lunch, you pay with your dinner.

This second edition of Communication Engineering Principles is a painstaking and comprehensive revision of the original publication, including several new chapters and a complete rewrite of some of the old chapters. I have remained faithful to the approach and philosophy that made the first edition so successful. It is an engineering-first approach inspired by an engineering-is-fun philosophy. I have left no stone unturned to ensure complete clarity and to break complex concepts into their simple bite-sized components for the benefit and enjoyment of every reader and tutor.

Communication Engineering Principles is aimed at undergraduate courses in communication engineering, digital communications, and signals and systems analysis. It is also suitable as preparatory material for MSc students and for researchers and practising engineers wishing to fully understand and apply the concepts and principles of the subject in their area of work. The book prioritises clarity and engineering insight above mathematical rigour, although maths is an essential tool that is used when necessary. Analogies, graphs, heuristic arguments, and numerous worked examples are used to deepen the reader’s insight and hone their skills in problem solving and the correct interpretation and application of key concepts and principles.

Chapter 1, Overview of Communication Systems, is a nonmathematical overview of communication systems that erects crucial knowledge pegs needed to hang a more detailed treatment of the subject in subsequent chapters. It also presents a carefully selected review of our journey from telegraphy in 1837 to 5G in 2020 and a discussion of the main communication system elements and processes. This is an extensive update of the first chapter of the first edition. In addition to a detailed update to reflect the state of the art of telecoms in 2020, new material has been added on circuit and packet switching, character coding, developments in transmission media, and the digital era. It is in this chapter that we discover that ATM is a slow-start sprinter, whereas IP is an instant-start jogger, and we learn the different attitudes of each technique towards sharing the community cake.

Chapter 2, Introduction to Signals and Systems, is a new chapter that retains some of the material in the old Chapter 2, which was titled Telecommunication Signals. It is a must-read for everyone, including those who already have some familiarity with some of the topics discussed. We lay a great foundation for dealing with signals and systems in engineering. This includes an exhaustive treatment of sinusoidal signals, the building blocks of all other signals, and an introduction to various system properties. It is also in this chapter that we learn 10 logarithmic dos and don’ts. For example, did you know that you should never add together two dBW values, although you may subtract them?

Chapter 3, Time Domain Analysis of Signals and Systems, is a new chapter that deals with various signal operations from time reversal and delay to convolution and autocorrelation. We use a graphical approach and various worked examples to make it easy to fully master these important operations. Random signals are also discussed and the statistical distributions that are most used for telecom systems and services analysis and modelling are fully covered. The last part of the chapter is then devoted to learning how to characterise and analyse linear systems in the time domain.

Chapter 4, Frequency Domain Analysis of Signals and Systems, is new. Using a mix of heuristic, graphical, and mathematical approaches, we explore the topics of Fourier series, Fourier transform, and discrete Fourier transform at a depth and breadth that are considered complete for the needs of modern engineering. We explore new applications of the tools and at all points emphasise the correct interpretation of results. The chapter ends with careful coaching on the use of a frequency domain approach in system characterisation and analysis.

Chapter 5, Transmission Media, is also new. A nonmathematical discussion of the characterisation, signal impairments, and applications of metallic lines, optical fibre, and radio is followed by a more in-depth analysis to develop the tools needed to calculate signal strength at various points in each medium. Transmission line theory is also covered in full.

Chapter 6, Noise in Communication Systems, is an update of the old Chapter 9 that went by the same title. The update includes new worked examples and improvements in presentation and discussion. We acquire a good grounding in the quantification of random noise and the assessment of its impact on digital and analogue communication systems. The design parameters that affect SNR in analogue systems and BER in digital systems are explored in detail.

Chapter 7, Amplitude Modulation, is an update of the old Chapter 3 that was similarly titled. It gives a comprehensive treatment of amplitude modulation and all its variants.

Chapter 8, Frequency and Phase Modulation, retains much of the material of the old Chapter 4 titled Angle Modulation. The treatment of noise effects using a phasor approach is improved. New worked examples are also included.

Chapter 9, Sampling, retains much of the old Chapter 5 that was similarly titled. The treatment of bandpass sampling is improved, and new graphical illustrations are employed.

Chapter 10, Digital Baseband Coding, is an extensive revision of the previous Chapter 6 titled Digital Baseband Transmission. The treatment of quantisation and PCM is improved.

Chapter 11, Digital Modulated Transmission, is an extensive revision of the old Chapter 7 that was similarly titled. New material is introduced on signal orthogonality, signal-space diagrams, bandwidth efficiency, design parameters, and bit error ratios. New worked examples are also introduced.

Chapter 12, Pulse Shaping and Detection, is new. We develop various filtering measures for mitigating intersymbol interference (ISI), evaluate the Shannon–Hartley information capacity law, and derive the matched filter for optimum detection of a signal in additive white Gaussian noise. Various worked examples are also presented.

Chapter 13, Multiplexing Strategies, is an extensive revision of the previous Chapter 8 that was similarly titled. A new section on multiple access is introduced and the treatment of all topics, including wavelength division multiplexing, is updated and improved. New worked examples are also added.

All of Chapter 1, and Chapter 2 up to Section 2.5, is nonmathematical. This is in keeping with our engineering-first approach. We wanted engineering, rather than maths, to be our gatekeeper to welcome you to the beauty and fun of telecoms as presented in this volume. Beyond Section 2.5 it is assumed that you have a knowledge of calculus, although a lot of explanation of mathematical manipulations is provided as deemed necessary. In all cases, however, we approach every concept and every problem by starting with engineering, bringing in maths if necessary, and then ending with engineering through a careful interpretation of any mathematical results.

I hope that you will enjoy using this book as much as I enjoyed writing it. I look forward to hearing how this book has helped your work, whether as student or tutor. Please visit my website at https://professorifiokotung.com/ for further support, including video clips and presentations that could help make your study easier and even more exciting.
Acknowledgements

The communication engineering principles, technologies, and standards covered in this book are the culmination of the efforts of many people and organisations over several generations. This book owes its very existence to these pillars of our subject and to the stellar work of the International Telecommunication Union (ITU) in developing many of the technical standards reflected within.

I am grateful to Simon Haykin, whose writings played a significant role in guiding my first steps into communication systems theory in the 1980s. Since then, my journey in the subject has been further shaped through the contributions of others too numerous to mention. However, this book brings a unique approach born out of many years of teaching the subject matter to international cohorts of students and engineers with diverse mathematical abilities. The book’s style and attention to detail are motivated by a strong belief in simplicity and the necessity of clarity, and an uncompromising dedication to training competent engineers with a complete understanding of the underlying principles of the subject as well as excellent skills in communication system analysis and design.

I am indebted to generations of my undergraduate and postgraduate students, short course participants, and users of the first edition of the book whose feedback and learning requirements helped in no small measure to shape the style and content of this second edition. I thank my colleagues Dr Ali Roula and Professor Jonathan Rodriguez for their support with some administrative and research responsibilities while the book was in preparation. I also thank my research student Ms Jinwara Surattanagul for her secretarial contributions to some parts of the later chapters.

This book could not have materialised without Sandra Grayson, the Wiley commissioning editor for electrical and computer engineering. I thank her for her faith in the project from conception to completion and for her professional support and patience during the many months it took to combine my university responsibilities with writing. I also thank Jayaprakash Unni and the production team at Wiley who worked diligently to take the book from manuscript through to final publication.

I thank my family for their patient support, wholehearted encouragement, and infectious faith in the completion of the project. Finally, I am grateful to God for granting me the privilege and ability to make this contribution to the education, training, career, and reading pleasure of many.

March 2020
Ifiok Otung
South Wales, United Kingdom
About the Companion Website

This book is accompanied by a companion website: www.wiley.com/go/otung

The website includes:
1. Solutions manual featuring solutions to selected end-of-chapter questions.
2. Telecoms software laboratory for a non-mathematical study of Fourier analysis and synthesis.
3. MATLAB code to support simulation-based exercises.
4. Full-colour diagrams/slides.
1 Overview of Communication Systems
Success is not the absence of failure but the triumph of improved attempts.

In this Chapter
✓ A quick overview of nonelectrical telecommunication techniques highlighting their gross inadequacies for today’s communication needs. You will find, however, that these ancient techniques are still indispensable in certain situations.
✓ A brief historical sketch of developments in modern (electrical) telecommunication from telegraphy to the Internet and 5G. Key developments in binary codes for data transmission, electronic components, transmission media, signal processing, and telecommunication services are summarised.
✓ A discussion of the elements of a communication system. Modern telecommunication systems may vary widely in complexity and applications, but they are all accurately represented by one block diagram. You will become conversant with the elements of this generic diagram and the roles and signal processing tasks that they perform. Information sources, information sinks, transmitters, and receivers are all briefly introduced.
✓ A detailed overview of the classification of communication systems. Every system is simplex or duplex, analogue or digital, baseband or modulated, circuit-switched or packet-switched. You will learn the features of each of these systems and find out just why we have undergone a transformation to digital and IP dominance.
1.1 Introduction

This first chapter provides a panoramic view of communication systems. It is intended to give a lucid and comprehensive introduction to the subject of communication engineering. We follow a nonmathematical approach and attempt to show you how each telecommunication concept fits into the overall picture. Armed with this knowledge, it is hoped that you will have enough inspiration and motivation to go on to the remaining chapters, where the principles and terminology are treated in more detail. If you do not have the time to study the entire book, it is advised that you work carefully through this chapter and the introductory sections of the remaining chapters. The material thus covered is suitable for a short course that presents a survey of modern telecommunications.

To drive home the fact that telecommunication generally means communication at a distance, we begin with a quick overview of various nonelectrical means of telecommunicating. The use of the word telecommunication is then narrowed to apply exclusively to the electrical techniques. After an important review of selected significant historical developments from telegraphy to 5G, we present a block diagram that adequately describes all communication systems. Each component and the processes that it contributes to the overall performance of the system are then briefly discussed in a nonmathematical way.

Different types and classifications of communication systems are discussed. Digital and analogue communication systems are compared, and analogue baseband systems are discussed in some detail. The features of digital baseband transmission are introduced. We show that modulated systems are essential to exploit the radio spectrum and the optical fibre medium and discuss the features of this class of systems. Our discussion includes brief references to some of the main modern applications of telecommunications, namely television systems, communication networks, telephony, satellite communications, optical fibre communication systems, and mobile communication systems. We also discuss and compare the different switching technologies of space switching, time switching, circuit switching, connection-oriented (CO) packet switching, and connectionless (CL) packet switching and explain why CL has become the dominant switching technology of the twenty-first century.

On completing this chapter, you will be well equipped to make informed decisions regarding the suitability of different types of system blocks (including transmission media) and various classes of communication systems for different applications. You will also have a clear overall picture of telecommunication and a good foundation on which you can build a more detailed knowledge of the subject. I hope that working through this chapter will inspire you towards a career in telecommunication or a very enjoyable further study of this exciting field.
1.2 Nonelectrical Telecommunication This book is concerned with modern telecommunication or communication over a distance, which relies principally on electrical means to convey information from one point S, the sending end, to another point R, the receiving end. After this section, we will use telecommunication, communication systems, and other similar terms to refer exclusively to these modern technologies. However, before the advent in 1837 of telegraphy, the forerunner of modern telecommunications, many forms of nonelectrical ‘telecommunications’ existed. We will briefly introduce these nonelectrical communication systems and discuss their demerits, before turning our attention to various types and classifications of modern communication systems.
1.2.1 Verbal Nonelectrical Telecommunication

The most basic nonelectrical telecommunication was verbal, in which, for example, a town crier used a combination of a gong and their voice to broadcast information in an African village. Trumpets were also blown to summon to war, call a truce, or announce victory, depending on the distinctive sounds made. The transmitter in this case is the human vocal system and other suitable instrumental sound sources such as the gong, drum, and trumpet. The signal is a pressure wave called an acoustic signal, the transmission medium is air, and the receiver is every human ear within range of hearing. This form of ‘telecommunication’ is fraught with problems, which include the following.

● Interference: the ear also receives other unwanted sound signals in the environment, which serves as a common transmission medium. These sounds, which may be artificial or natural background noise, or the intelligible communication of other people, interfere with and hence corrupt the wanted signal. In the town crier example, an African woman would have to hush her chattering children, bleating goats, and barking dogs, or move away from the noisy zone in order to minimise interference with the town crier’s message.
● Nuisance: the signal is received by those who do not want it. If you had to use this verbal means to ‘telecommunicate’ with a friend a few blocks away, your neighbours would not be pleased.
● Huge losses: the signal suffers a lot of attenuation, or reduction in the amplitude of the pressure wave as it is reflected at material boundaries and as it spreads out with distance over a wider wave front. For this reason, it is difficult to hear the town crier from within a closed room or from afar.
● Limited range: communication can only take place over small distances (i.e. small separations between S and R). To overcome the above losses, sound signals of high SPL (sound pressure level) must be used to extend communication range. However, this increases the nuisance effect and endangers the hearing of those close to the sound source. Besides, the human vocal system and other practical sound sources are severely limited in the SPL that they can generate.
● Masking effect: the wanted signal can be easily masked by louder signals in the medium. This makes hearing (i.e. detection) of the wanted signal impossible. The threshold of hearing increases with ambient noise. This means that the ear–brain system will be completely oblivious to the presence of one sound if there is another louder sound of about the same frequency.
● Lack of privacy: privacy is an important requirement that this means of communication is incapable of providing. Everyone (within range) hears your message and is potentially able to understand and make use of it. Therefore, this means of communication is only suitable for the broadcast of public information and would not be suitable for the communication of private, military, commercial, or classified information.
● Delay: even if we could overcome all the problems discussed above, propagation delay, the time it takes for the signal to travel from S to R, would still make this type of telecommunication unacceptable. Sound travels at a speed v = 330 m/s in standard air. Over very short distances, it seems as though we hear at the same instant as the sound is produced. However, imagine that S and R are separated by, say, d = 2376 km, a realistic international and in some cases national distance. The propagation delay is

$$t = \frac{d}{v} = \frac{2\,376\,000\ \text{m}}{330\ \text{m/s}} = 7200\ \text{s} = 2\ \text{h}$$
Thus, if you made an acoustic ‘phone call’ over this distance, the person would hear each utterance a staggering two hours after it was made. And if the person said, ‘Pardon?’ you would complete your call and hang up without knowing they wanted you to say it again. Real-time interactive communication would only be possible over short distances. This is in sharp contrast to the case of modern telecommunications using electromagnetic waves, which travel at the speed of light ($v \equiv c = 3 \times 10^8$ m/s). The propagation delay for the distance d = 2376 km is t ≈ 8 ms. Barring other delays, you would hear each other practically instantaneously.
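The two delay figures quoted above follow directly from t = d/v. The short Python sketch below repeats the arithmetic for both media. The function name `propagation_delay` and the constant names are illustrative choices, not taken from the text.

```python
def propagation_delay(distance_m, speed_m_per_s):
    """Return the time in seconds for a signal to travel distance_m."""
    return distance_m / speed_m_per_s


D = 2_376_000      # separation between S and R in metres (2376 km)
V_SOUND = 330.0    # speed of sound in standard air, m/s
C_LIGHT = 3.0e8    # speed of electromagnetic waves, m/s

t_sound = propagation_delay(D, V_SOUND)
t_light = propagation_delay(D, C_LIGHT)

print(f"acoustic link:        {t_sound:.0f} s = {t_sound / 3600:.1f} h")
print(f"electromagnetic link: {t_light * 1e3:.2f} ms")
```

The acoustic delay comes out at 7200 s (2 h) and the electromagnetic delay at 7.92 ms, which the text rounds to 8 ms.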
1.2.2 Visual Nonelectrical Telecommunication Various methods of nonverbal telecommunication were developed as a vast improvement on the basic technique discussed above. Here, the information to be conveyed is coded in the form of visually observable events. The transmitter generates these events. Light signals convey the information through a transmission medium that is ideally a vacuum but is in practice air. The receiver is a human eye within visibility range along an unobstructed line-of-sight path. The problems of interference, nuisance, propagation delay, and masking by other signals are negligible compared to the verbal technique. Furthermore, the communication range is extended, especially in clear non-foggy weather. However, the receiver must have an unobstructed view of the signalling event and the major problem of lack of privacy remains. Visual nonelectrical telecommunication systems were identical in transmission medium and receiver, differing only in the type of transmitter. The most common techniques are now discussed. 1.2.2.1 Flags, Smoke, and Bonfires
The practice of raising a highly visible object, or performing complex coded motions with the object, has had universal application in warfare and ordinary social interactions over the centuries. Hand signals remain an indispensable means of communication in modern society. For example, hand signals are used by traffic wardens to control traffic. Hoisting a red flag or red light is still used to warn of danger, and raising a white flag in warfare communicates a message of surrender over distances that would be difficult to reach by verbal means.
An interesting record of a military application of this method of visual nonelectrical telecommunication around 1405 BC is given in the Bible. Hebrew soldiers divide into two parties to attack a city. One party waits in ambush outside the city gates while the other attacks the city and fakes defeat in order to draw the city guards away in pursuit. At a good distance from the city, the fleeing party raises a javelin from a vantage point, thus signalling to the ambush party to take the now defenceless city. This party takes the city and quickly starts a bonfire, which ‘telecommunicates’ their success to the retreating party and signals the beginning of a bidirectional attack on the guards, who are in shock at the sight of their burning city. 1.2.2.2 Heliography
Heliography was a system of communication developed in the nineteenth century in which rays of the sun were directed using movable mirrors on to distant points. Some links for heliograph communication were set up and the technique was reliably used, for example in 1880 by General Stewart to give a battle account over some distance. Reading a heliogram (a message sent by heliography) was said to cause eye fatigue. A technique for transmitting information using variations in the pattern of light had been developed by the Greeks as far back as the second century BC. Different combinations and positions of torchlight signals were employed to represent the letters of the Greek alphabet. In this way messages could be coded and transmitted. 1.2.2.3 Semaphore
A semaphore consists of an upright post with one or more arms moving in a vertical plane. Beginning in the eighteenth century, different positions of the arms or flags were used to represent letters. Table 1.1 gives the conventional alphanumeric semaphore codes. The semaphore codes for letters A, Q, and V are illustrated in Figure 1.1. Note that some of the letters of the alphabet are shared with numerals 0–9. A numerical message is distinguished by being preceded with the code (0°, 45°). Code AR is used to indicate the end of signal and code R to acknowledge reception. In Table 1.1, the arm positions are given in degrees measured clockwise from vertical.

Table 1.1 Semaphore codes. Angle is measured clockwise from vertical.

Symbol        Positions of semaphore flags    Symbol    Positions of semaphore flags
A, 1          225°                            N         (135°, 225°)
B, 2          270°                            O         (270°, 315°)
C, 3          315°                            P         (270°, 0°)
D, 4          0°                              Q         (270°, 45°)
E, 5          45°                             R         (270°, 90°)
F, 6          90°                             S         (270°, 135°)
G, 7          135°                            T         (315°, 0°)
H, 8          (225°, 270°)                    U         (315°, 45°)
I, 9          (225°, 315°)                    V         (0°, 135°)
J             (0°, 90°)                       W         (45°, 90°)
K, 0 (zero)   (0°, 225°)                      X         (45°, 135°)
L             (45°, 225°)                     Y         (90°, 315°)
M             (90°, 225°)                     Z         (90°, 135°)
Figure 1.1 Semaphore codes for letters A, Q, and V.
Some governments built large semaphore towers on hilltops along strategic routes. For example, during the Napoleonic wars (1792–1815), a relay of semaphore towers was built in England from London to Portsmouth. By the year 1886, semaphore had been almost universally adopted for signalling on railways. Semaphore is still used today at sea by warships preferring radio silence and for mechanical railway signalling.

1.2.2.4 Demerits of Visual Nonelectrical Telecommunication

The major disadvantages of visual nonelectrical telecommunication include the following.

● Low signalling speed: the rates of signal generation at the transmitter and detection at the receiver are extremely low by modern standards. For example, if the torchlight or semaphore flag patterns are manipulated quickly enough to complete the representation of one letter every second, we have a maximum data rate of one character (or byte) per second. In modern systems, a modem that operates at 10 000 times this speed is considered slow. Note, however, that propagation delay is negligible, and the signal is received almost at the same instant as it is generated at the transmitter.
● Manual operation: a trained signaller is required at the transmitter to code the information and at the distant receiver to view and decode the signal patterns as they are transmitted. Automation of both transmission and reception is impractical.
● Limited range: the maximum separation that allows visibility between transmission and reception points may range from less than 50 m in thick fog to a few miles in obstruction-free terrain under very clear weather. However, this range is very low compared to distances involved in modern communications.
● Lack of privacy: even with the use of encrypted codes, the communication process is carried out in full view of all within the communication range.
● Limited application: a combination of the above factors means that visual nonelectrical telecommunication remains a system that can only accommodate a very few users (low capacity) for very limited applications. This is grossly inadequate to meet modern requirements of broadband communication.
1.3 Modern Telecommunication

Modern telecommunication began in 1837 with the invention of telegraphy. Sir Charles Wheatstone (1802–1875) and Sir William Fothergill Cooke (1806–1879) built the world’s first commercial telegraph system in England in 1837. They laid the telegraph lines, consisting of six iron wires, along a 2.4 km railway track from Euston to Camden Town stations. Their invention was first publicly demonstrated on 24th July 1837 and described as a ‘most extraordinary apparatus’ in an advertisement. The receiver at the receiving station consisted of five galvanometer needles arranged in a row across the face of a grid of letters, as shown in Figure 1.2.

Figure 1.2 Wheatstone-Cooke 5-needle electric telegraph display grid.

Each needle (G1, G2, …, G5) could be deflected to the left or right by current flowing in the corresponding wire (W1, W2, …, W5), not shown in the diagram. A sixth wire, W6, was needed to provide a common return path for each of the five circuits. To
transmit a letter, current flow was sent from the transmitter down two wires causing two of the needles to deflect in opposite directions and point to the right letter. For example, to transmit the letter A, current flow is established in wires W1 and W5 so that G1 deflects clockwise and G5 counterclockwise from vertical. But to transmit the letter Y, current flow is established in the opposite direction in the two wires so that G1 deflects counterclockwise and G5 clockwise from vertical. This five-needle telegraph system was simple enough to allow unskilled operation, but it required six wires to make the connection between transmitting and receiving stations and only 20 letters could be transmitted, which made it necessary to transmit some words with a modified spelling. It was soon replaced in 1838 by a cheaper two-needle telegraph, which was also eventually superseded in 1845 by a single-needle system. The single-needle telegraph coded each letter using a unique combination of movements of one needle to the left and right. By arranging for the needle to strike small metal pipes that emitted sounds of different pitches, each letter was heard as a unique melody at the receiver. At about the same time that Wheatstone and Cooke developed their needle-deflecting telegraph, two Americans, Samuel Finley Breese Morse (1791–1872) and Alfred Vail (1807–1859), devised a more efficient telegraph system, which was eventually adopted all over the world. Morse and Vail demonstrated their system on 24th May 1844 by sending the message ‘WHAT HATH GOD WROUGHT’ over the first telegraph link in America from the Capitol at Washington to Mount Clare Depot in Baltimore, a distance of 64.5 km. The Morse–Vail telegraph consisted of a key or switch that was pressed to complete a circuit containing a battery connected by a single pair of wires to a distant sounder. The key and battery constituted the transmitter, the pair of wires the transmission medium, and the sounder the receiver. 
Transmission of textual information was achieved using 39 characters (26 uppercase letters of the alphabet, 10 numerals, and 3 punctuation marks: comma, full stop, and question mark). In what follows, we introduce in chronological order the various coding schemes which were developed to represent characters and then we discuss some of the most significant developments in telecommunication since the advent of telegraphy.
1.3.1 Developments in Character Codes

1.3.1.1 Morse Code
Each character in the Morse–Vail telegraph was represented by a pattern of dots (•) and dashes (—) called Morse code, invented in 1838. The innovative and enduring principle behind the Morse code, which has sustained its application into the twenty-first century (e.g. in amateur radio), is that characters that occur frequently are represented using shorter codes, whereas less frequent characters are assigned longer codes. For example, the Morse code for letter E (the most frequent in the English alphabet) is a single dot •; the code for T (the next most frequent letter) is a single dash —; but the code for letter Q (one of the least frequent letters) is — — • —. Morse code has been extended over the years to include other characters and control signals. An international version is shown in Table A.1 of Appendix A. In actual transmission using Morse code, a dot is a voltage pulse of duration one unit (say T) generated by pressing a key to close the transmitter circuit for this duration, a dash is a voltage pulse of duration 3T, and dots and dashes within a Morse code for a character are separated by a silence interval of duration T. Characters are separated by a silence of duration 3T, and words by 7T. At the receiver the incoming message is identified by ear, using the sounds produced. For example, the letter E sounds like ‘dit’, the letter T sounds like ‘dah’, and the letter Q sounds like ‘dahdahdidah’. The value of T used depends on the transmission speed (words per minute) at which the receiving operator can copy accurately. The spacing between characters and the spacing between words can be increased above 3T and 7T, respectively, to accommodate a slow operator. In the original Morse–Vail telegraph system the incoming message was recorded on a moving strip of paper by using an electromagnet to move a marker which sketched the code waveform comprising a sequence of short (dot) and long (dash) pulses. 
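The timing rules above (dot = T, dash = 3T, gap between elements = T, gap between characters = 3T, gap between words = 7T) are enough to work out how long a message occupies the line. The Python sketch below does this for a small subset of International Morse code. The dictionary holds only the letters needed here, and the function names `char_units` and `message_units` are illustrative rather than part of any standard.

```python
# Subset of International Morse code: just the letters used in the examples.
MORSE = {
    "A": ".-", "D": "-..", "E": ".", "G": "--.", "H": "....",
    "K": "-.-", "N": "-.", "O": "---", "Q": "--.-", "T": "-",
}

DOT, DASH = 1, 3                            # pulse durations in units of T
ELEMENT_GAP, CHAR_GAP, WORD_GAP = 1, 3, 7   # silence durations in units of T


def char_units(ch):
    """Duration of one character: its pulses plus the gaps between them."""
    code = MORSE[ch]
    pulses = sum(DOT if symbol == "." else DASH for symbol in code)
    return pulses + ELEMENT_GAP * (len(code) - 1)


def message_units(message):
    """Duration of a whole message, from first pulse to last pulse."""
    words = message.split()
    total = WORD_GAP * (len(words) - 1)
    for word in words:
        total += sum(char_units(ch) for ch in word)
        total += CHAR_GAP * (len(word) - 1)
    return total


print(char_units("E"), char_units("T"), char_units("Q"))  # 1 3 13
print(message_units("THANK GOD"))                         # 81
```

The frequent letters E and T take 1T and 3T, whereas the rare letter Q takes 13T, which is the variable-length principle at work. Under these rules the message THANK GOD lasts 81T from the start of the first pulse to the end of the last.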
As an example, the statement (excluding quotes) ‘THANK GOD’ would be conveyed by the voltage pulse sequence shown in Figure 1.3. The system, however, developed into receiving by ear from the clicks of the receiver. This was more convenient and faster for a trained operator who could handle 40–50 words per minute. By 1914, automatic operation involving tape-reader machines and multiplexing (transmitting more than one message at a time over a single wire) allowed telegraph transmission speeds of 400 words per minute to be achieved. The development of automated teleprinter communication technology in the 1920s marked the beginning of the end of manual landline (i.e. wire-connected) Morse–Vail telegraphy. Its commercial use was discontinued in 1932 by the British Post Office, and in the 1960s on railroads in America and Australia. Landline application has, however, continued on railroads in some developing countries and amongst a few die-hard enthusiasts in the US. Furthermore, Morse code continues to be used in radio telegraphy by radio amateurs worldwide and by a few ships at sea as a low-cost alternative to satellite communications.

Figure 1.3 Morse code’s sequence of voltage pulses for THANK GOD.

1.3.1.2 Baudot Code

Efforts to improve the telegraph system led to the invention in 1871 of the first device for time division multiplexing (TDM) by the French engineer Jean-Maurice-Émile Baudot (1845–1903). The multiplexer consisted of a copper
ring divided into several equal sectors (say N). Each sector had five contacts which could be opened or closed, giving 32 (= $2^5$) possible combinations for coding characters. A brush arm rotated in a circle around the copper ring and sequentially picked up the code combinations from each sector. Connecting this arm to a transmission line allowed N messages to be simultaneously sent. The existing variable-width Morse code was not suitable for the automatic transmission and reception operation of Baudot’s multiplexer, so he devised a new fixed-length code which was patented in 1874. The Baudot code was modified in 1901 by Donald Murray (1865–1945) – a New Zealand born farmer, journalist, telegraph engineer, and entrepreneur. After further modification in 1932 by the International Telegraph and Telephone Consultative Committee (CCITT) – now the International Telecommunication Union Telecommunication Standardization Sector (ITU-T) – the Baudot–Murray code became known as the International Telegraph Alphabet No. 2 (ITA-2). Table A.2 of Appendix A lists the ITA-2 code. Each character is represented using five symbols drawn from a binary alphabet, namely a mark (voltage pulse) and a space (no voltage pulse), corresponding to binary digits or bits 1 and 0, respectively. With fixed-width codes like this, there is no need for the short pause (between characters) and long pause (between words), as in the Morse code. Rather a codeword ‘space–space–mark–space–space’ or 00100 is used to separate words. Characters need no separation between them, except that before actual transmission each character is framed by preceding the 5-bit character code with a start bit (usually a space) and terminating with a stop bit (usually a mark), giving seven transmitted bits per character. Using 5-bit codes, only $2^5 = 32$ different characters can be represented. This is insufficient to cover 52 uppercase and lowercase letters of the alphabet, 10 numbers, and various punctuation and control characters.
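The framing rule and the capacity limit just described can be checked with a few lines of Python. This is a minimal sketch assuming the usual convention stated above (start bit = space = 0, stop bit = mark = 1). The function name `frame` and the constant names are hypothetical, and the only codeword used is the word-separator 00100 quoted in the text.

```python
BITS_PER_CODE = 5
START_BIT = "0"   # space (no voltage pulse), the usual start bit
STOP_BIT = "1"    # mark (voltage pulse), the usual stop bit


def frame(code):
    """Wrap a 5-bit character code with a start bit and a stop bit."""
    if len(code) != BITS_PER_CODE or set(code) - {"0", "1"}:
        raise ValueError("expected a 5-bit string of 0s and 1s")
    return START_BIT + code + STOP_BIT


WORD_SPACE = "00100"              # space-space-mark-space-space
framed = frame(WORD_SPACE)
print(framed, len(framed))        # 0001001 7

capacity = 2 ** BITS_PER_CODE     # number of distinct 5-bit codewords
needed = 52 + 10                  # letters and numbers, before any punctuation
print(capacity, needed, capacity < needed)   # 32 62 True
```

Each framed character therefore occupies seven bit intervals on the line, and 32 codewords fall well short of the 62 letters and numbers alone.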
Raising the number of bits above five was not an option, owing to hardware constraints in the electromechanical technology employed at the time. To work around this 5-bit constraint, lowercase letters were not represented and two ‘shift codes’ were used. More specifically, a letter shift code 11111 is transmitted to indicate that all subsequent characters are uppercase letters until a figure shift code 11011 is encountered, which causes all following characters to be interpreted as numbers or punctuation marks. Baudot can therefore be credited with inventing the use of an escape code – still essential in modern computer systems – which is a code that changes the way the system interprets subsequent codes. Thus, 26 of the 32 codes were used to represent uppercase letters; 4 were used to designate blank, space, carriage return and line feed; and the remaining 2 were used for the letter and figure shift codes. Issuing a figure shift code allowed the 26 codes to be reused for representing numbers and punctuation marks, as shown in Table A.2. The ITA-2 code was adopted in teleprinters and the telex (telegraph exchange) network operating at a transmission speed of 50 symbols per second. By the middle of the twentieth century it had replaced Morse as the most widely used telegraph code and was still used in telex networks operating at speeds ≤300 symbols/second until the turn of the century. In other data communication systems Baudot code was replaced by American Standard Code for Information Interchange (ASCII) and extended binary coded decimal interchange code (EBCDIC) codes, which could represent a larger number of characters. The memory of Baudot, however, lives on today, transmission speed in symbols/second being called baud (Bd) in his honour. 1.3.1.3 Hollerith Code
In 1881, Herman Hollerith (1860–1929), a young American statistician working for the US Census Bureau, devised a means of recording census data on punched card for automatic machine reading. The punched card was later standardised to 186 mm × 82 mm in size, with a clipped top left-hand corner and 12 rows and 80 columns. A column was used to represent one character by punching holes in a selection of the 12 row positions available. One card could hold 80 characters. Recorded data were read electronically by moving the punched card between brass pins in a ‘tabulating machine’. At positions with holes, the pins on opposite sides of the card made contact, thereby completing an electrical circuit and registering a value. The pattern of holes for each character was defined by the Hollerith code. With 12 binary digits – i.e. 12 hole-or-no-hole positions – a total of $2^{12} = 4096$ characters can be represented. However, only 69 characters (52
upper and lowercase letters, 10 numerals, and 7 punctuation marks and symbols) were defined. The use of 12 rows (rather than, say, 7) allowed each character to be uniquely represented using only a few holes. Bear in mind that the holes were initially manually punched. Besides, many holes in one row would have made the card more liable to tear. Thus, all 10 numerals and 2 letters were represented with just one hole, the remaining letters were represented using two holes, and two or more holes were used to code the less frequent punctuation marks and symbols. Hollerith’s punch card technology was first employed in 1887 for calculating mortality statistics, but it gained popularity when it was used for the 1890 US census. Thanks to the new technology, it took just six weeks to analyse the 1890 census data more thoroughly than the previous 1880 census, which took seven years of toil by hand. To market his highly successful invention Hollerith founded the Tabulating Machine Company in 1896, which in 1911 merged with two other machine manufacturers to form the Computing-Tabulating-Recording Company. In 1924, the company was renamed International Business Machines Corporation (IBM) and it dominated the world of computing until the 1970s. The Hollerith code continued to be used for data representation until the 1960s, when IBM developed a new character code, EBCDIC, for its mainframe computers. But punched cards remained the primary means of data input to computers until the early 1970s, and the technology continued to be used in some establishments up to the early 1980s. Many today regard Herman Hollerith as the father of information processing.

1.3.1.4 EBCDIC Code
The extended binary coded decimal interchange code (EBCDIC – pronounced ebb-sea-dick) is a proprietary 8-bit character code developed in the 1960s by IBM for use on its computing machines. It provides $2^8 = 256$ codewords for the representation of 256 different characters including a wide range of control characters. Table A.3 gives a list of EBCDIC codewords. Note the following features.

● The first 64 codewords (hexadecimal 00 → 3F) specify control characters.
● The codewords for uppercase and corresponding lowercase letters differ in only one bit position (bit 7). For example, the codewords for letters ‘m’ and ‘M’ are 10010100 and 11010100, respectively.
● The 62 alpha-numeric characters (52 uppercase and lowercase letters and 10 numbers) occupy 7 noncontiguous blocks in the code table. Their codewords are restricted to those whose lowest four bits (i.e. first hexadecimal digit) have decimal values in the range 0–9. It is for this reason that the term binary-coded-decimal (BCD) features in the name of the coding scheme. A drawback of this noncontiguous arrangement of letters is that it makes computer programs for manipulating textual information more complex.
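These features can be inspected using Python’s built-in `cp037` codec, one of several EBCDIC code pages. Treating `cp037` as representative of Table A.3 is an assumption made for this sketch, although the letter and digit positions are common across EBCDIC variants. The helper name `ebcdic` is illustrative.

```python
import string

EBCDIC = "cp037"   # assumption: one common EBCDIC code page shipped with Python


def ebcdic(ch):
    """Return the EBCDIC codeword of a single character as an integer."""
    return ch.encode(EBCDIC)[0]


m_lower, m_upper = ebcdic("m"), ebcdic("M")
print(f"{m_lower:08b} {m_upper:08b}")      # 10010100 11010100

# Upper and lowercase differ in a single bit (bit 7, counting the LSB as bit 1).
case_difference = m_lower ^ m_upper
print(f"{case_difference:08b}")            # 01000000

# Every letter and digit has a low nibble whose decimal value lies in 0-9.
alphanumerics = string.ascii_letters + string.digits
low_nibbles_ok = all((ebcdic(ch) & 0x0F) <= 9 for ch in alphanumerics)

# The alphabet is not contiguous: 'i' and 'j' are 8 codewords apart in EBCDIC
# but adjacent in ASCII.
gap_ebcdic = ebcdic("j") - ebcdic("i")
gap_ascii = ord("j") - ord("i")
print(low_nibbles_ok, gap_ebcdic, gap_ascii)   # True 8 1
```

The gap between ‘i’ and ‘j’ shows why a program that steps through the alphabet by adding 1 to a codeword works in ASCII but fails on an EBCDIC system.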
The EBCDIC character code has always been perceived as an IBM ploy for locking in its customers. Other computer manufacturers and the rest of the world adopted the ASCII code in the 1960s. The rapid growth in personal computers (PCs) starting in the late 1970s brought IBM’s domination of the computer industry to an end. EBCDIC’s use has therefore since become confined to a vanishing minority of historical computer systems. It is worth emphasising that EBCDIC is inherently incompatible with ASCII. A PC requires code conversion to be able to communicate with an IBM mainframe computer that uses EBCDIC, and a PC software program that manipulates letters of the alphabet by the relative decimal values of their ASCII codewords would not run properly on an EBCDIC system. 1.3.1.5 ASCII Code
The ASCII code was first developed in 1963, and updated in 1968 as ANSI X3.4, by the American National Standards Institute (ANSI) – formerly the American Standards Association. ASCII uses seven bits, which allows representation of 2⁷ = 128 characters covering 26 lowercase letters, 26 uppercase letters, 10 numbers, punctuation marks and mathematical symbols, and a wide range of control characters. Table A.4 of Appendix A gives a complete listing of 7-bit ASCII codes. To use this table, note that the number in the top left-hand corner of each character entry is the decimal value of the code. The first and second columns of the table give, respectively, the first hexadecimal (hex) digit and the least significant four bits b₄b₃b₂b₁ of the ASCII code for the character in the corresponding row. Similarly, the first and second rows give the second hex digit and the most significant three bits b₇b₆b₅ of the code. Thus, the ASCII code for the character M is decimal 77, hex 4D, or binary 1001101. We may write these, respectively, as 77₁₀, 4D₁₆ or 1001101₂. The following features of the ASCII coding scheme may be observed in Table A.4.
● The three most significant bits indicate the type of character. For example, the C0 control characters are in the first two columns (b₇b₆b₅ = 000 and 001); uppercase letters of the alphabet are in the fifth and sixth columns (b₇b₆b₅ = 100 and 101); and lowercase letters are in the seventh and eighth columns (b₇b₆b₅ = 110 and 111).
● The allocation of codewords to numbers (0–9) and letters of the alphabet (A–Z and a–z) follows a binary progression. For example, the codewords for the numbers 0, 1, and 2 are respectively 0110000, 0110001, and 0110010, the last four bits being the binary equivalent of each decimal number. The codewords for the letters R, S, and T are 1010010, 1010011, and 1010100, respectively. Mathematical operations can therefore be performed on the codewords that represent numbers and alphabetisation can be achieved through binary mathematical operations on the codewords for letters. For this reason, ASCII codes are described as computable codes.
● For ease of generation on a keyboard, lowercase and uppercase letters differ only at the sixth bit position (b₆). For example, the ASCII codes for letter ‘A’ and ‘a’ are 1000001 and 1100001, respectively. This bit position (b₆) is changed on the keyboard by holding down the shift key.
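The three features listed above can be illustrated with a minimal Python sketch built on the standard ord() and chr() functions. The helper names are ours, chosen only for illustration.

```python
# Sketch: the "computable" properties of 7-bit ASCII, using ord() and chr().

code_M = ord("M")
print(code_M, f"{code_M:X}", f"{code_M:07b}")   # decimal, hex, 7-bit binary

def char_type_bits(ch: str) -> int:
    """Return the three most significant bits b7 b6 b5 of a 7-bit code."""
    return ord(ch) >> 4

def digit_value(ch: str) -> int:
    """Numeric value of an ASCII digit: its least significant four bits."""
    return ord(ch) & 0x0F

def toggle_case(ch: str) -> str:
    """Flip bit b6 (weight 32) to switch a letter between cases."""
    return chr(ord(ch) ^ 0x20)

# Alphabetical order follows the numerical order of the codewords.
sorted_letters = sorted("TSR", key=ord)

print(f"{char_type_bits('A'):03b}", f"{char_type_bits('a'):03b}")
print(digit_value("7"), toggle_case("A"), toggle_case("a"), sorted_letters)
```

A program written this way relies on letters having consecutive code values, which is why it would not carry over unchanged to an EBCDIC system.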
ASCII was an instant success and became a de facto international standard. However, as you can see from Table A.4, the code was inadequate even to cover international characters such as é, ü, ô, etc. based on the Latin alphabet. It was therefore necessary to internationalise ASCII. This task was undertaken by the International Organization for Standardization (ISO) leading to an international standard ISO 646 in 1972 and a subsequent revision in 1991. The ASCII code table shown in Table A.4 was designated the international reference version (IRV) and identified as ISO-646-IRV, which is synonymous with the US ASCII version ISO-646-US. Provision was then made for national versions to be created by substituting other graphic characters in place of the least needed 10 characters, namely @ [ \ ] ^ ` { | } ~. For example, the German version (ISO-646-DE) is identical to Table A.4 except that these 10 characters are replaced by the characters § Ä Ö Ü ^ ` ä ö ü ß, respectively. Similarly, the French version (ISO-646-FR) has the characters à ° ç § ^ ` é ù è ¨, respectively; the Italian version (ISO-646-IT) has the characters § ° ç é ^ ù à ò è ì, respectively; and so on. Furthermore, allowance was made for versions to be created in which the IRV was altered in two other character entries by substituting the pound sign £ for the number sign # and/or substituting the currency sign ¤ for the dollar sign $. The process of internationalisation of the 7-bit ASCII code was extended to include languages that are not based on the Latin alphabet, such as Arabic, Greek, and Hebrew. As a result, up to 180 ASCII-based character code versions were eventually registered. The main drawback of a multiplicity of versions is that the character represented by a given code number (say decimal 91) is not unique but depends on the ASCII version of the source system. This poses a nuisance in information interchange between computers using dissimilar versions.
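The ambiguity of a code number such as decimal 91 can be made concrete with a small lookup sketch. The substitution strings are based on the German, French, and Italian lists in the text; the names IRV_POSITIONS, NATIONAL_VERSIONS, and decode_646 are hypothetical, and the optional # and $ substitutions are ignored for simplicity.

```python
# Sketch: one 7-bit code number, different characters per ISO 646 version.
# Assumption: only the 10 replaceable positions differ between versions.

IRV_POSITIONS = "@[\\]^`{|}~"   # the 10 replaceable IRV characters

NATIONAL_VERSIONS = {
    "ISO-646-IRV": IRV_POSITIONS,
    "ISO-646-DE": "§ÄÖÜ^`äöüß",
    "ISO-646-FR": "à°ç§^`éùè¨",
    "ISO-646-IT": "§°çé^ùàòèì",
}

def decode_646(code: int, version: str) -> str:
    """Return the character that a 7-bit code denotes in a given version."""
    irv_char = chr(code)
    idx = IRV_POSITIONS.find(irv_char)
    if idx < 0:
        return irv_char              # same character in every version
    return NATIONAL_VERSIONS[version][idx]

for name in NATIONAL_VERSIONS:
    print(name, decode_646(91, name), decode_646(65, name))
```

Code 65 yields the same letter everywhere, whereas code 91 depends on which version the sending system used.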
National variants of ISO-646 are now obsolete, having been replaced by less problematic schemes.
1.3.1.6 ISO 8859 Code
A more interoperable character code scheme was developed by the European Computer Manufacturers Association (ECMA) in the mid-1980s by extending the codeword from seven to eight bits. This allowed up to 256 characters to be represented. The new scheme was endorsed by ISO as ISO-8859. To maintain compatibility with the well-established ISO-646-IRV the 8-bit code table was designed as shown in Figure 1.4, with the left half (most significant bit, or MSB, = 0) identical to the ISO-646-IRV table and the right half (MSB = 1) providing an extra 32 control characters and 96 graphic characters. A US-ASCII character with code b₇b₆b₅b₄b₃b₂b₁ was therefore assigned an 8-bit code 0b₇b₆b₅b₄b₃b₂b₁ under the new scheme.
[Figure 1.4 (16 × 16 code table, not reproduced): columns 0–1 hold the ISO-646-IRV control (C0) characters; columns 2–7 hold the ISO-646-IRV graphic characters (Table A.4), with SP at hex 20 and DEL at hex 7F; columns 8–9 hold the control (C1) characters; columns A–F hold the regional graphic characters.]
Figure 1.4 Layout of ISO 8859 (8-bit) character code table. The column label is the second hex digit (or most significant 4 bits) and the row label is the first hex digit.
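The layout of Figure 1.4 can be explored with Python's built-in ISO 8859 codecs. The following is only a sketch: the choice of parts 1, 5, and 7 (Western European, Cyrillic, and Greek) and of the sample bytes is ours.

```python
# Sketch: the same byte under different ISO 8859 parts (stdlib codecs only).

ascii_byte = bytes([0x4D])     # left half of the table (MSB = 0)
high_byte = bytes([0xE6])      # right half of the table (MSB = 1)

parts = ["iso8859_1", "iso8859_5", "iso8859_7"]

# Left half: identical to ISO-646-IRV in every part.
left = {p: ascii_byte.decode(p) for p in parts}

# Right half: the character depends on the regional set in use.
right = {p: high_byte.decode(p) for p in parts}

# Code of the pound sign in ISO-8859-1.
pound_code = "£".encode("iso8859_1")[0]

print(left)
print(right)
print(pound_code, f"{pound_code:08b}")
```

The byte 4D decodes to the same letter in all three parts, while the byte E6 yields a different graphic character in each, which is why the receiver must know which regional set the sender used.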
Since there are more than 96 foreign graphic characters, it was necessary to develop multiple character sets, one set for a specified geographical region. This gave rise to more than 10 sets of 8-bit codes, which differed only in the area labelled Regional Graphic Characters in Figure 1.4. Escape sequences were defined for switching between character sets – analogous to changing typeheads on a typewriter. Thus, there was ISO-8859-1 or Latin Alphabet No. 1 for the needs of Western Europe. This is listed in Table A.5 of Appendix A, where the number in the top left-hand corner of each cell is the hex code of the character. So, for example, the 8-bit code for the ligature æ is hex E6 or binary 11100110 and the pound sign £ has code 10100011 (equivalent to decimal 163). Other character sets in the scheme included ISO-8859-2, which covered Eastern European languages (Albanian, Hungarian, Romanian, and Slavic), ISO-8859-3 to -8, which respectively covered Southern Europe, Northern Europe, Cyrillic (i.e. Bulgarian, Macedonian, Russian, Serbian, Ukrainian), Arabic, Greek, and Hebrew, and so on.
The 8-bit ISO-8859 scheme was an improvement on the 7-bit ASCII. It reduced the number of character sets involved in global information exchange to a smaller number of regional variants. However, it still involved switching between character sets and did not cover all the world’s languages, most notably the East Asian languages.
1.3.1.7 Unicode
In 1987, three engineers from Apple Computer and Xerox – Joe Becker, Lee Collins, and Mark Davis – started a project to develop the ultimate character code: one that would be universal, uniform, and unique. Becker coined the term Unicode to describe this scheme, which aimed to cover all of the world’s languages, assign a unique character to each code number or bit sequence, employ fixed-width codes (i.e. same number of bits in every codeword), and wherever possible maintain some compatibility with existing standards (e.g. ISO-646-IRV and ISO-8859-1). This was a mammoth task the success of which required the support of influential global computer companies such as Microsoft Corporation and Sun Microsystems.
The group did gain support and grew into an incorporated consortium in 1991. And in 1992 Unicode merged with and subsumed ISO/IEC 10646, a multilingual character code independently developed by ISO and at that time incompatible with Unicode. Version 13.0 of Unicode was released on 10th March 2020 as ISO/IEC 10646:2020, containing a staggering 143 859 characters covering almost all modern and historical writing systems in the world. Unicode has ample room for all graphic characters, format characters, control characters, and user-defined characters as well as future growth. The graphic characters include a wide range of symbols (technical, optical character recognition, Braille pattern, dingbats, geometric shapes, emoji, etc.) as well as ideographic characters used in China, Japan, Korea (CJK), Taiwan, Vietnam, and Singapore. Duplicate encoding is avoided by assigning a single code to equivalent characters irrespective of usage or language of occurrence. There are ongoing efforts to identify and define additional characters for inclusion in future versions of the standard, subject to a policy of not altering the codes already assigned to characters in previous versions, and not encoding logos, graphics, and font variants, nor ‘characters’ deemed idiosyncratic, novel, private-use, or rarely exchanged.
To fully cater for all the world’s written languages, Unicode makes provision for 1 114 112 characters using integers or code points in the range 0 → 10FFFF₁₆. A code point is referred to by its numeric hex value prefixed by U+. For compatibility with US-ASCII and ISO-8859-1, the first 256 code points (i.e. 0000 → 00FF) are assigned as in Tables A.4 and A.5 of Appendix A. Each encoded character has a unique official Unicode name. For example, the character G is assigned code point U+0047 and named LATIN CAPITAL LETTER G, and the character Ç, assigned code point U+00C7, is named LATIN CAPITAL LETTER C WITH CEDILLA.
It is convenient to think of the 1 114 112 Unicode code points as divided into a total of 17 planes, each containing 2¹⁶ code points. Almost all common-use characters of all the world’s languages are contained in the 65 536 code points (0000 → FFFF) available in the basic multilingual plane (BMP) or Plane 0. These Unicode characters can be represented in computer systems using fixed-width 16-bit codewords, unlike the seven bits of ASCII and the eight bits of ISO-8859. However, to cover all the characters in all 17 planes using unique (non-overlapping) codewords, the Unicode Standard specifies the following three distinct encoding forms, called Unicode Transformation Formats (UTF):
● UTF-32: each character is expressed using one 32-bit code unit which has the same value as the code point for the character. This is a fixed-width 32-bit character format, which quadruples the size of text files compared to the 8-bit ISO-8859. But, importantly, all characters of the world are uniquely covered using a single code unit access. UTF-32 is the preferred encoding form on some Unix platforms.
● UTF-16: most common-use characters of all the modern scripts of the world lie in the BMP (U+0000 → U+FFFF). These characters can be represented by a single 16-bit code unit. This was the original design of the Unicode Standard, to be a fixed-width 16-bit character code. However, as the Standard evolved it became clear that not all characters could be fitted into code points ≤ U+FFFF. Thus, the code points’ range was extended to include the supplementary planes (1 → 16). An area of the BMP that was unallocated up until the extension was then set aside for surrogates, and a pair of 16-bit code units (called surrogate pair) in this area was used to represent (i.e. point to) a code point in a supplementary plane. Note that neither of the code units of a surrogate pair is used to represent a character in the BMP. Thus UTF-16 is a variable-width encoding form that uniquely codes characters by using one 16-bit code unit for each of the more frequent BMP characters, and two 16-bit code units for each of the much less frequent characters of the supplementary planes. A code unit conversion is required to convert a surrogate pair to the Unicode code point (> U+FFFF) and hence the supplementary character represented by the pair. UTF-16 typically requires half the memory size of UTF-32 and is the preferred encoding form in most implementations of Unicode other than on Unix platforms.
● UTF-8: this is the preferred encoding form for the Internet. It was prescribed to provide transparency of Unicode implementation in ASCII-based systems. Each Unicode code point in the range U+0000 → U+007F is represented as a single byte in the range 00 → 7F having the same value as the corresponding ASCII character. Then, using 8-bit code units or bytes of value >7F, the remaining code points are represented by two bytes (for U+0080 → U+07FF), three bytes (for U+0800 → U+FFFF), and four bytes (for U+10000 → U+10FFFF). Thus UTF-8 is also a variable-width encoding form that uses codewords ranging in width from one byte for the first 128 (ASCII) characters to four bytes for the supplementary Unicode characters. The UTF-8 form works automatically as follows: if a byte – let’s call it Byte 1 – starts with bit 0 then it represents a character; or else if Byte 1 starts with bits 110 then the character is represented by the concatenation of Byte 1 and the next byte (i.e. Byte 2); or else if Byte 1 starts with bits 1110 then the character is represented by the concatenation of Byte 1 and the next two bytes (i.e. Byte 2 and Byte 3); or else if Byte 1 starts with bits 11110 then the character is represented by the concatenation of Byte 1 and the next three bytes (i.e. Byte 2, Byte 3, and Byte 4). Thus, a 1-byte codeword has prefix 0; a 2-byte codeword has prefix 110; a 3-byte codeword has prefix 1110; and a 4-byte codeword has prefix 11110. Furthermore, all Bytes 2, 3, and 4 have prefix 10, and this distinctively identifies them as not Byte 1, which prevents the error that would otherwise have been possible if one, for example, started reading a multibyte codeword from other than its first byte. Therefore, picking any byte at random inside a UTF-8 file, one can correctly read the character as follows: if the byte’s prefix is 0 then that’s the entire 1-byte character; or else the byte is part of a multibyte codeword formed by concatenating byte(s) to its left and/or right until the longest possible multibyte (≤4 bytes) codeword is formed that has a first byte starting with 11 and a last byte starting with 10.
Unicode has been described as the ultimate character code for the twenty-first century. It has gained worldwide acceptance, including in East Asia, where it has the capacity to fully cater for all the ideographic characters in JIS (Japanese Industrial Standards) and other East Asian standards (e.g. Chinese GB 2312-1980, Korean KS C 5601-1992, and Taiwanese Big-5 and CNS 11643-1992). Unicode may not be as efficient in handling a language as its regional character code set (e.g. ISO-8859-8 for Hebrew), but Unicode’s multilingual capability is a compelling advantage in an era of globalisation.
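The plane structure and the UTF-16 and UTF-8 encoding forms described in this section can be sketched in a few lines of Python. This is an illustrative sketch rather than a conforming implementation: the function names are ours, no validation is performed (e.g. surrogate code points are not rejected), and the UTF-8 result is checked against Python's own encoder.

```python
# Sketch: Unicode planes, UTF-16 surrogate pairs, and UTF-8 byte prefixes.

def plane(cp: int) -> int:
    """Plane number (0..16) of a code point: 2**16 code points per plane."""
    return cp >> 16

def utf16_units(cp: int) -> list:
    """One 16-bit code unit for the BMP, a surrogate pair otherwise."""
    if cp <= 0xFFFF:
        return [cp]
    v = cp - 0x10000                       # 20-bit offset into planes 1..16
    return [0xD800 | (v >> 10), 0xDC00 | (v & 0x3FF)]

def utf8_bytes(cp: int) -> bytes:
    """Encode one code point using the 0 / 110 / 1110 / 11110 prefixes."""
    if cp <= 0x7F:
        return bytes([cp])
    if cp <= 0x7FF:
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp <= 0xFFFF:
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    return bytes([0xF0 | (cp >> 18),
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])

def char_start(data: bytes, i: int) -> int:
    """From byte i, step left past continuation bytes (prefix 10)."""
    while i > 0 and (data[i] & 0xC0) == 0x80:
        i -= 1
    return i

for ch in ["G", "Ç", "€", "😀"]:
    cp = ord(ch)
    assert utf8_bytes(cp) == ch.encode("utf-8")
    units = " ".join(f"{u:04X}" for u in utf16_units(cp))
    print(f"U+{cp:04X} plane {plane(cp)} UTF-16 {units} "
          f"UTF-8 {utf8_bytes(cp).hex(' ')}")
```

The char_start helper shows the self-synchronising property: starting from an arbitrary byte, skipping leftwards over bytes with prefix 10 always lands on the first byte of a codeword.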
1.3.2 Developments in Services
Data transmission by telegraphy was soon followed by voice transmission with the invention of the telephone by Alexander Graham Bell in 1876. Initially, voice communication was only possible over short distances between two people using permanently linked telephone sets. The novelty of switching made it possible for two subscriber terminals connected to the same switching office or local exchange to be manually linked by an operator. In 1897, an undertaker, A. B. Strowger, developed a step-by-step electromechanical switch that made automatic switching possible. The technology was thus in place for the communication of voice and data over wire line connections, although the speed and quality of transmission left much to be desired by today’s standards.
A wide range of digital communication services, too many to discuss here, have been developed since the 1830s and especially during the last five decades. Here we briefly introduce a few of the most significant services in roughly chronological order. Some services, such as the telex and facsimile, were largely limited to business applications, whereas others, such as the telegram and email, made significant penetration into both home and business.
1.3.2.1 Telegram
Digital communication services began with the telegram in the late 1830s when members of the public sent messages on the Wheatstone–Cooke telegraph from Paddington railway station to Slough at one shilling per message – a large sum in those days. With the widespread use of the Morse–Vail telegraph all over the world, the telegram became a well-established communication service starting from 1845. The military used the telegram to communicate intelligence, deploy troops, and disseminate battle news; railway companies used it to regulate trains, and this significantly extended railway capacity; industry used it for a wide range of business communication (e.g. transmitting racing results, stock prices, urgent memos, etc.); and the police quickly adopted it in their fight against crime. A much-publicised early application was the arrest of John Tawell
for the murder of his mistress on 1st January 1845. Tawell fled the crime scene and boarded a train from Slough to London, where he hoped to make good his escape. However, his failure to reckon with the power of the new telecommunication invention proved his undoing. A telegram giving a description of the murder suspect beat him to the London station and the police were waiting when his train pulled in. Tawell was arrested, later tried, found guilty of murder, and hanged.
The telegram was also a great success with the general public. Messages – charged according to length – were transmitted via telegraph, received and printed on paper at the destination telegraph station, and delivered by hand to the recipient’s address. By 1874, the United Kingdom had over 650 000 miles of telegraph wire, and more than 20 000 towns and villages were part of the UK network. However, by the second half of the twentieth century, the telegram service was forced into steady decline due to developments in other communication services – both analogue (e.g. telephony) and digital. The UK inland telegram service was eventually discontinued in 1982.
1.3.2.2 Telex
The telex, started in 1932 by the British Post Office, was another major digital communication service to be introduced. A teleprinter (or teletypewriter) in the premises of one subscriber was connected to another subscriber’s teleprinter via the public switched telephone network (PSTN) that had evolved since the advent of analogue telephony in 1876. A text message, typed in at the keyboard of the sending teleprinter, was represented in 5-bit Baudot (ITA-2) code, and transmitted down the line using on–off keying (OOK), i.e. on (mark) and off (space), of a 1500 Hz carrier voltage at a speed of 50 Bd. The message was received and automatically printed on paper at the destination teleprinter. Telex quickly grew into a global service particularly popular with government departments and a wide range of institutions and business organisations. Documents sent by telex had legal status. However, although the technology behind telex was steadily improved over the years – from manual to automatic switching, from 50 Bd to 300 Bd transmission speed, and from mechanical teleprinters to PCs – demand for the service began to decline steadily in favour of facsimile in the 1980s and email in the 1990s.
1.3.2.3 Facsimile
Facsimile service, or fax for short, allows paper documents – handwritten, printed, drawings, photographs, etc. – to be electronically duplicated over a distance. In 1980, the CCITT specified the first digital transmission standard for fax, called Group 3 (G3) standard. The connection between the sending and receiving fax machines is in most cases via the PSTN, and sometimes by radio. The G3 fax standard became the most widely used and could transmit an A4 page in less than one minute, typically 15–30 seconds. A later G4 standard required a digital-grade telephone line and could transmit an A4 page in less than five seconds with a resolution of 400 lines per inch (lpi).
The G3 standard has a resolution of 200 lpi, which means that the paper is divided into a rectangular grid of picture elements (pixels), with 200 pixels per inch along the vertical and horizontal directions, amounting to 62 pixels per square millimetre. Starting from the top left-hand corner, the (A4) paper is scanned from left to right, one grid line at a time until the bottom right-hand corner of the paper is reached. For black-and-white-only reproduction, each pixel is coded as either white or black using one bit (0 or 1). For better-quality reproduction in G3 Fax, up to five bits per pixel may be used to represent up to 2⁵ = 32 shades of grey. The resulting bit stream is compressed at the transmitter and decompressed at the receiver in order to increase the effective transmission bit rate. At the receiver, each pixel on a blank (A4) paper is printed black, white, or a shade of grey according to the value of the bit(s) for the pixel location. In this way the transmitted picture pattern is reproduced. Early fax machines used a drum scanner to scan the paper directly, but later machines formed an image of the paper onto a matrix of charge-coupled devices (CCDs), which build up a charge proportional to incident light intensity.
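The pixel density quoted above, and the reason compression matters, can be verified with a short calculation. The sketch below assumes an A4 page of 210 mm × 297 mm scanned edge to edge and, purely for illustration, a line rate of 9600 bit/s; neither figure is taken from the text.

```python
# Sketch: raw (uncompressed) size of a G3 fax page at 200 lpi.
# Assumptions: full A4 page (210 mm x 297 mm), 9600 bit/s line rate.

MM_PER_INCH = 25.4
A4_WIDTH_MM, A4_HEIGHT_MM = 210.0, 297.0
ASSUMED_LINE_RATE = 9600            # bit/s, illustrative only

def pixels_per_mm2(lpi: float) -> float:
    """Pixel density for equal horizontal and vertical resolution."""
    return (lpi / MM_PER_INCH) ** 2

def raw_page_bits(lpi: float, bits_per_pixel: int = 1) -> float:
    """Number of bits needed for one A4 page before compression."""
    area_mm2 = A4_WIDTH_MM * A4_HEIGHT_MM
    return area_mm2 * pixels_per_mm2(lpi) * bits_per_pixel

density = pixels_per_mm2(200)        # about 62 pixels per square millimetre
bw_bits = raw_page_bits(200)         # black and white, 1 bit per pixel
grey_bits = raw_page_bits(200, 5)    # 2**5 = 32 shades of grey
seconds_uncompressed = bw_bits / ASSUMED_LINE_RATE

print(f"density: {density:.1f} pixels/mm^2")
print(f"black-and-white page: {bw_bits / 1e6:.2f} Mbit")
print(f"32-level grey page: {grey_bits / 1e6:.2f} Mbit")
print(f"uncompressed send time: {seconds_uncompressed / 60:.1f} min")
```

Under these assumptions an uncompressed black-and-white page would take several minutes to send, which indicates how much the compression step contributes to the sub-minute page times quoted for G3.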
Fax had its origin in chemical telegraphy invented in 1842 by Alexander Bain (1810–1877) and involved damp electrolytic paper. Fax did not, however, come into any serious use until 1902 when a German, Dr Arthur Korn (1870–1945), developed a suitably sensitive photoelectric cell, which allowed a mechanism to be devised for converting a photographic negative of the picture into an analogue electrical signal. In 1924, AT&T (American
Telephone and Telegraph Corporation) demonstrated telephotography, as fax was then known, by transmitting pictures over a telephone line from Cleveland to New York. By the end of the 1920s, pictures of distant events for publication in newspapers were being routinely sent by fax. In 1935, the service began to come into more widespread use amongst businesses when the fax machine became more affordable. This came about through eliminating the need for a photographic negative and producing the electrical signal using light reflected directly off the picture being scanned.
At the turn of the century, the popularity of fax for business communication began to be challenged by email, which is particularly convenient for sending documents that are in electronic (word-processed) form as it can be done without the sender leaving their desk or feeding any paper into a machine. And by first scanning the documents into electronic form and sending them as file attachments, email may even be used for transmitting hard copies and documents that must be supported by a signature and/or headed paper.
1.3.2.4 The Digital Era
Digital communication services have grown at an astounding rate during the last five decades. Computer networking started in 1971 with the Advanced Research Projects Agency Network (ARPANET). By the end of the 1990s, this simple project had evolved into an Internet revolution having enormous global impact on every aspect of business and social life, and spawning completely new jobs, business methods, work patterns, social pastimes, vocabulary, crimes, etc. Social media, search engines, e-commerce, video/music streaming, e-learning, and online transactions are just a few of the countless services made possible by the Internet, which is now a global network of computers, devices, and (in short) things.
A cellular telephone concept, which was first demonstrated by Motorola in 1972, quickly caught on and led to a global mobile communication revolution, driven by steady improvements in successive generations of non-backward-compatible terrestrial wireless communication technologies from 1G in 1979 to 5G in 2019. The major areas of improvements included changing (i) from analogue technology in 1G (launched in 1979) to digital technology in 2G (launched in 1991); (ii) from circuit switching to packet-switched high-speed data network of data rate ≥ 144 kb/s in 3G (launched in 2001); (iii) to an all-IP (Internet Protocol) mobile broadband network of peak data rate ≥ 100 Mb/s in 4G (launched in 2011); and (iv) to a high-mobility (≤ 500 km/h), low-latency (≤ 1 ms), ultra-broadband network of peak data rate ≥ 10 Gb/s in 5G (launched in 2019).
At the same time as mobile communication technology was evolving, there was enormous progress in the development of the transceiver device or user terminal (enabling access to the network), culminating by the year 2000 in the smartphone, which is unarguably the most versatile and consequential pocket-sized device or gadget in humankind’s entire history. Think of it!
The smartphone is a telephone handset, data terminal, video terminal, video camera, still camera, calculator, audio recorder and player, clock and alarm, route finder (i.e. navigational aid), personal digital assistant (e.g. notebook, diary, calendar, reminder), gateway to the Internet, and much more; yet it is not just portable (i.e. untethered to one location) but pocket-sized. The smartphone has become a must-have item for work, leisure, and travel, and its global penetration has even surpassed the all-time great success stories of the car and washing machine. For example, on 29th February 2020, the website http://livecounter.com [1] estimated that there were 1.318 billion cars in the world. This figure should be compared to an estimate of 3.5 billion smartphone users worldwide in 2020 by http://statista.com [2] and a report in http://gartner.com [3] of total global smartphone sales of 1.524838 billion units in the year 2019 alone.
In 1973, the US Department of Defense began the development of a global positioning system (GPS) comprising a space segment of 24 satellites, a control segment of several earth stations, and a user segment. The system eventually became operational in 1995. GPS was designed primarily for the US military to provide estimates of position, velocity, and time of a GPS receiver and hence of the person or unit bearing the receiver. However, civil applications of the service have since grown to the extent that satellite navigation (enabled primarily by the US GPS) is now indispensable to aeronautical and maritime transportation, mining, surveying, and countless other activities; and is also an important tool for route determination (called satnav) by drivers and the general population around
the world. In addition to the US GPS, other less dominant global navigation satellite systems (GNSS) have also been launched, including Europe’s Galileo (which went live in 2016), Russia’s GLONASS (the development of which started in 1976 but full global coverage was only attained in 2011), and China’s BeiDou (which started limited regional coverage in 2000 and attained full global coverage in 2018).
By the early 1970s, advances in integrated circuit technology had made digital transmission a cost-effective way of providing telephony, which was hitherto an exclusively analogue communication service. Pulse code modulation (PCM), a signal processing technique devised by British engineer Alec Reeves (1902–1971) back in 1937, was employed to digitise the analogue speech signal by converting it into a sequence of numbers. Digital exchanges and TDM began effectively to replace analogue technology – i.e. analogue exchanges and frequency division multiplexing (FDM) – in the PSTN during the 1970s, although the first TDM telephony system had been installed as far back as 1962 by Bell Laboratories in the USA. The momentum of digitalisation of the entire transmission system increased in the 1990s, until by the turn of the century the telecommunication networks of many countries had become nearly 100% digital. Globally, in 2020, the only remaining analogue portion of the telecommunication network is the local loop connecting, via copper wire pair, a subscriber’s landline telephone handset to the local exchange, or a street cabinet beyond which transmission is entirely digital.
Digital audio broadcasting began in September 1995 with some field trials by the British Broadcasting Corporation (BBC). Although analogue amplitude modulation (AM) and frequency modulation (FM) radio broadcast will continue into the foreseeable future, digital radio now provides the best in broadcast sound quality, especially in the presence of multipath distortion.
Terrestrial digital television broadcast was first launched in the UK in 1998 and provides a more efficient utilisation of the radio spectrum and a wider service choice than its analogue counterpart. Transition to terrestrial digital television broadcasting and a total shutdown of analogue television broadcasting (a process known as digital switchover) has been completed in many countries around the world. For example, analogue TV broadcasting was fully terminated in the UK on 28th November 2013 and in China on 14th May 2016. Also, TV broadcasts via satellite have been migrated from analogue transmission using FM to digital transmission using a selection of bandwidth-efficient digital modulation techniques. It is therefore likely that analogue television broadcasting, first started by the BBC in 1936, will cease to exist within this decade. By the turn of the century, supported by a variety of reliable mass data storage media, the digital era had fully arrived, and the convergence of communication, broadcasting, computing, and entertainment had become a reality. This convergence is best epitomised at the device level by the smartphone and at the network level by the Internet. Both support multimedia streaming, audio and video (IP) telephony, mobile computing, and gaming. In view of the rapid pace of development and diverse range of services, a detailed study of all telecommunication applications would be an extremely challenging and inadvisable task. Fortunately, the underlying principles do not change and the best way to acquire the necessary expertise in this exciting field is to first become grounded in its principles. It is for this reason that this book will focus on a thorough yet accessible treatment of the principles of communication engineering, drawing on modern examples and applications as necessary to ensure a complete understanding of the concepts presented as well as a better appreciation of the current state of the art.
1.3.3 Developments in Transmission Media
Much of the growth in telecommunications would have been impossible without significant developments in transmission media which provide suitable paths for signals to travel from source to destination. The first telegraph circuit built by Wheatstone and Cooke between Euston and Camden stations consisted of iron wires insulated with cotton and laid in iron pipes buried beside the railway line. The insulation and hence the line failed whenever the wires became wet. This forced Wheatstone and Cooke to abandon the idea of buried cables and opt rather for suspending the wire from telegraph posts using glass insulators. From this very humble beginning, transmission medium technology developed to encompass, in chronological order, copper cables, radio, and optical fibre,
all of which have been used for short- to long-distance links. We will briefly review the historical development of these three media. But note that other transmission media suitable only for very short links ranging from a few centimetres to a few metres have also been developed over the years. These include microstrip lines in printed circuit boards, metallic waveguides used, for example, to connect an outdoor antenna to an indoor receiver unit, and infrared radiation used, for example, in the remote control of gadgets.
1.3.3.1 Copper Cable
The telegraph poles that crisscrossed America and Europe in the early days of telegraphy carried iron or steel wires, one wire for each circuit with the ground providing the return path. It was known back then that copper was a much better electrical conductor than iron. In fact, the first ever undersea telegraph cable, laid between Dover (UK) and Calais (France) in 1851, consisted of four copper conductors insulated with a coating of gutta-percha (the natural latex of the native Malaysian Palaquium tree brought to the UK in 1843) and protected by iron sheathing. Copper was also the conducting material for the first successful trans-Atlantic telegraph cable laid in 1866 (after three previous failed attempts in 1857, 1858, and 1865) between Valentia Island (Ireland) and Newfoundland (Canada), a distance of 1852 miles. Copper, however, is a soft metal and the annealing technique necessary to make it strong enough to support its own weight from pole to pole had not yet been discovered. This situation changed in 1877, when Thomas Doolittle (1839–1921) invented a suitable copper wire manufacturing process. As a result, in the 1880s copper wire replaced iron as the standard transmission medium, and the single wire circuit (with ground return path) was replaced by a two-wire (i.e. wire pair) circuit, which significantly reduced noise. Further improvement was obtained through the introduction of twisted-pair cables in which the two wires comprising one circuit are independently insulated and twisted around each other. Multipair cable was introduced, consisting of a lead tube containing up to 100 copper wires, each insulated with paraffin-impregnated cotton or gutta-percha. By 1900, to avoid the unsightly congestion of aerial wires, short-distance cables began to be laid underground in conduits made initially of creosoted wood and then of vitrified clay.
The first long-distance underground cable was laid in 1912 and carried analogue telephone signals between Philadelphia and Washington DC. Long-distance underground wire pairs were replaced by coaxial cables in the 1940s. Underground cable cores bearing the twisted wire pairs of the PSTN local loop are a significant global investment of the twentieth century which could not be simply discarded in the digital era. In the 1990s, technological advances (especially in very large scale integration (VLSI) circuits to perform the required complex signal processing) allowed the copper lines of the local loop to be used as a vital medium for digital transmission known as the digital subscriber line (DSL). From 2003, there was a rapid global uptake of DSL-enabled broadband connection to the home. AT&T invented the coaxial cable and installed the first experimental coaxial cable system between New York and Philadelphia in 1936, followed in 1941 during World War II by the first commercial installation between Minneapolis and Stevens Point. A coaxial cable (or coax for short) consists of an insulation-covered, tube-shaped outer conductor (sometimes of braided design) with a second conductor fixed into the centre of the tube and separated from it by other insulation. It provides better protection from interference and larger bandwidths than twisted-pair configurations. The early L1 coaxial cable installations of the 1940s could carry 480 telephone calls, whereas the L5 systems of the 1970s were vastly superior and could carry 132 000 simultaneous calls. The laying of the first transatlantic coaxial cable (TAT 1) spanning 2240 miles between Oban in Scotland and Clarenville in Newfoundland was completed in 1956. With an initial capacity of 36 telephone circuits, TAT 1 remained in service for 22 years, although several other TAT coax cables were laid in the ensuing years to provide more capacity.
The last transatlantic coaxial cable to be laid was TAT 7 in 1978 with an initial capacity of 4000 telephone circuits. In the 1980s, coaxial cables were superseded by optical fibre and most long-distance coaxial cable installations have now been retired. As the above evolution of the transmission medium from iron wire to copper coax took place, improvements were also made in insulation material, from cotton through gutta-percha and paper to finally (synthetic) plastic.
1.3.3.2 Radio
James Clerk Maxwell (1831–1879), an outstanding university professor by age 25 and widely regarded as the most brilliant British mathematician and physicist of the nineteenth century, made a theoretical prediction of the existence of radio waves in 1864. His theory is embodied in four equations, known as Maxwell’s equations, which when combined yield two wave equations – one for a magnetic field and the other for an electric field – propagating at the speed of light. Simply put, Maxwell’s theory stipulates that a changing electric field generates a changing magnetic field in the surrounding region, which in turn generates a changing electric field in the surrounding region, and so on. In this way a coupled electric and magnetic field travels out in space at the speed of light (≈ 3 × 10⁸ m/s), and the resulting wave is known as an electromagnetic wave. The significance of Maxwell’s theory for telecommunications is that if we can somehow generate an electromagnetic wave having one of its parameters – amplitude, frequency, or phase – varied in sync with the variations of an information-bearing voltage signal then we can transmit information from one point to another at the speed of light and without the need for a cable connection. This is radio or wireless communication. For example, to implement wireless telegraphy we can switch the wave on or off to signal a mark or space, respectively, and at some distance recover the sequence of marks and spaces by detecting the presence and absence of the wave during each signalling interval. A German physicist called Heinrich Hertz (1857–1894) provided an experimental verification of Maxwell’s wave theory in 1888. Using a wire connected to an induction coil to generate the electromagnetic waves and a small loop of wire with a spark gap to detect them, he measured their wavelength and showed that the waves were reflected and refracted in a similar way to light.
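The on-off signalling scheme described above for wireless telegraphy can be sketched in a few lines of Python. This is only an illustration under assumed values: the carrier frequency, sampling rate, signalling interval, and detection threshold are hypothetical choices, not figures from the text, and the detector simply measures the average power received in each signalling interval.

```python
import math

def ook_modulate(bits, carrier_hz=1000.0, sample_rate=8000, samples_per_bit=80):
    """Switch a sinusoidal carrier on for a mark (1) and off for a space (0)."""
    wave = []
    for i, bit in enumerate(bits):
        for n in range(samples_per_bit):
            t = (i * samples_per_bit + n) / sample_rate
            wave.append(bit * math.cos(2 * math.pi * carrier_hz * t))
    return wave

def ook_detect(wave, samples_per_bit=80, threshold=0.25):
    """Decide mark or space from the average power in each signalling interval."""
    bits = []
    for start in range(0, len(wave), samples_per_bit):
        chunk = wave[start:start + samples_per_bit]
        power = sum(x * x for x in chunk) / len(chunk)
        # A unit-amplitude carrier has average power 0.5; an absent carrier has 0.
        bits.append(1 if power > threshold else 0)
    return bits

message = [1, 0, 1, 1, 0, 0, 1]  # marks and spaces to send
received = ook_detect(ook_modulate(message))
print(received == message)
```

The threshold is placed midway between the two expected power levels (0 and 0.5), which is a natural choice when no noise model is assumed.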
The most significant contribution to the establishment of radio communications at the turn of the twentieth century was made by Guglielmo Marconi (1874–1937), who was born and grew up in Bologna, Italy but moved to Britain in 1896 in search of financial support for his radio experiments. In 1897, he sent signals by radio over eight miles from Lavernock Point to Brean Down, made the first radio transmission from ship to shore while on a visit to Italy, and formed the Wireless Telegraph & Signal Company. In 1899, he put his invention to work for the international yacht races being held off New York harbour at that time. He followed the yachts around the course and reported the races for two newspapers using wireless telegraph from yacht to shore. This event helped bring wireless telegraphy to prominence in the US. In December 1901, Marconi achieved the first transatlantic radio transmission covering 2800 km from Poldhu in Cornwall (UK) to Signal Hill in Newfoundland (Canada); and by 1907 he had established a transatlantic wireless telegraph service. Radio communication started in 1897 with point-to-fixed-point wireless telegraphy using Morse code – a digital transmission that was simple enough to implement using the technology of the time by switching the radio signal on and off. It was not long, however, before the broadcast and mobility potential of radio became obvious. In 1904, Marconi started a service of Morse-coded news broadcasts to subscribing transoceanic ships, although any Morse code literate person with the right equipment could receive the news for free. The exploitation of radio communication grew very rapidly during the twentieth century to a point today where modern society is inextricably dependent on radio communication. Some of the most notable early developments that made this possible are summarised below. 
Reginald Fessenden (1866–1932) invented amplitude modulation (AM) in 1906, which for the first time allowed the amplitude of a radio carrier signal to be continuously varied according to variations of an analogue (voice) signal. This paved the way for audio broadcasts and radio telephony once the design of vacuum tube transmitters and receivers was perfected in the mid-1910s. The first AM radio broadcast was the transmission of the US presidential election returns from the Westinghouse station at Pittsburgh in November 1920. Commercial radio broadcasts started in 1922 and stations proliferated in the US after it was finally figured out that the service could be financed through advertising. In the UK, the BBC was established and started its broadcasts in the autumn of that year, paid for through licence fees – a practice which has continued till today. The first transatlantic radio telephone service
was established in 1927 between London and New York using a 60 kHz radio carrier. But with a capacity of only one call at a time the service was extremely expensive, attracting a tariff of £15 for a three-minute call. Edwin Howard Armstrong (1890–1954) invented the frequency modulation (FM) technique in 1934 and FM radio broadcasts started in 1939. The FM technique varies the frequency of the radio carrier in proportion to the variations of the voice signal and yields a superior received signal quality compared to AM. In 1940, the Connecticut State police put into operation the first two-way FM mobile radio system designed by Daniel Noble (1901–1980), a professor of electrical engineering in the University of Connecticut. In 1936, radio broadcast of television was started by the BBC in the UK. The first (black-and-white) television sets cost as much as a small car. Television had been invented in the 1920s, with significant contribution from John Logie Baird (1888–1946) – who invented mechanical television and successfully demonstrated a working prototype on the 26th of January 1926 to members of the Royal Institution in London. However, it was Philo Farnsworth (1906–1971) who demonstrated the first all-electronic television system in 1928. In the 1940s, radio began to be used to provide high-capacity, long-distance transmission links for telephone and television signals. Called microwave radio relay, the system consisted of a relay of line-of-sight radio transceivers arranged on towers and had the advantage of lower construction and maintenance costs compared to coaxial cable transmission systems. It provided an alternative to cable and by the 1970s carried most of the voice and TV traffic; but it was surpassed in the 1980s by optical fibre. In 1945, Arthur C. 
Clarke, famous as a science fiction writer, suggested in an article in the British radio magazine Wireless World that a long-distance radio link could be established using a satellite located in an equatorial orbit with a period of 24 hours. In this orbit known as geostationary orbit (GEO), the satellite would appear stationary when viewed from any point on the earth’s surface. A signal transmitted from an earth station to the satellite could be retransmitted by the satellite back to earth and received at all points on the visible earth. This would make possible broad-area coverage and long-distance radio communication that was more reliable than the only method available at the time via HF radio transmission. This vision of satellite communications became a reality 20 years later when the first commercial satellite, Early Bird (later renamed INTELSAT 1), was launched into GEO and commenced operations on the 28th of June 1965. The first spacecraft (Sputnik I) had been launched by the USSR (now Russia) in 1957, but it was the satellite Telstar I, built by the Bell Laboratories and launched in 1962 into a 158-minute low earth orbit (LEO), that relayed the first television signals between USA and Europe. Demand for satellite communication has grown steadily since its inception. In the 1970s and 1980s, satellite systems were used for international and domestic telephony and television distribution. The arrival of superior optical fibre transmission systems in the 1980s shifted telephone traffic from satellite to optical fibre links. A bold attempt in the 1990s to provide global mobile communication using a constellation of satellites in LEO or medium earth orbit (MEO) ended in commercial failure due in part to the unforeseen runaway success of terrestrial cellular mobile communications. 
But satellite communication is now well established as the leading means of television programme distribution directly to homes, and of communication at sea and in remote, undeveloped or disaster areas. The GPS service is also a satellite communication application. Satellite communication is also used to provide broadband Internet access for subscribers in remote areas or in regions without adequate terrestrial communication infrastructure.
1.3.3.3 Optical Fibre
The use of optical fibre as a transmission medium was first proposed in 1966 by Charles K. Kao of Standard Telecommunications Laboratories, in the UK. An optical fibre is a dielectric waveguide about 125 μm in diameter made from high-purity silica glass. It consists of a glass core surrounded by a glass cladding of lower refractive index. A plastic sheath covers the cladding to give mechanical protection. An optical cable contains several fibres used in pairs – one for each transmission direction. Light waves propagate within the inner core over long distances
through total internal reflection at the core-cladding interface. This light can be switched on and off to signal bits 1 and 0, respectively, and thereby convey information. At the time of Kao’s proposal the best available fibre material had a loss of about 1000 dB/km, but Kao was convinced that the high loss was due to absorption by impurities in the glass, and that losses ≤20 dB/km could be achieved to make long-distance optical fibre transmission links possible. The first breakthrough in reducing fibre attenuation came in 1970, when Robert Maurer of Corning Glass Works used fused silica to make a fibre having an attenuation of 16 dB/km at a wavelength of 633 nm. Since then, improved fabrication technology and a move to higher wavelength regions – called windows – offering lower optical losses have resulted in dramatically reduced attenuation. There are four such windows at 850, 1310, 1550, and 1625 nm. The first-generation fibre systems were installed in 1977 and operated at 850 nm with an attenuation of 3 dB/km. Most operational systems today use the second window at 1310 nm with attenuation 0.5 dB/km and the third window at 1550 nm with attenuation 0.2 dB/km. The technique of Raman amplification enables use of the fourth window at 1625 nm (of attenuation around 0.3 dB/km) especially in ultra-dense wavelength division multiplexing (WDM) systems. The attenuation figures quoted here refer to intrinsic losses only. This is the loss inherent in the fibre material due to Rayleigh scattering and absorption by impurities. Optical fibre is also subject to external sources of attenuation known as extrinsic losses (e.g. due to sharp bending, coupling, and splicing), which may sometimes account for the bulk of the total attenuation. Optical fibre has numerous advantages over radio and copper lines as a transmission medium, but the most significant are its low attenuation (stated above) and large bandwidth. 
This small attenuation allows a wider spacing of repeaters in optical fibre transmission links than is possible in copper lines, leading to reduced costs. For example, the last (and best) transatlantic copper cable TAT-7 had 662 repeaters compared to only 109 repeaters in the first (and poorest) transatlantic optical fibre cable TAT-8. Optical fibre conveys an optical carrier of very high frequency – about 200 000 GHz. So, assuming a transmission bandwidth of about 5% of the supported carrier frequency, we see that optical fibre offers a potential bandwidth of about 10 000 GHz. The technology exists today – using soliton laser, erbium-doped fibre amplifier (EDFA) and dense wavelength division multiplexing (DWDM) – to allow long-distance transmission at several terabits per second on optical fibre without regeneration. Starting from the 1980s, optical fibre has replaced coaxial cables in all long-distance and undersea transmission lines, and in the inter-exchange and metropolitan area networks. The first international submarine optical fibre system was installed in 1986 between the UK and Belgium. In 1988, the first transatlantic optical fibre system, TAT-8, was installed between Europe and the USA with a capacity of 560 Mb/s. Several more transatlantic optical fibre cables have been installed since then, up to the latest TAT-14, which became operational in 2001 with a capacity of 3.15 Tb/s. Two transmission media frontiers have withstood the optical fibre revolution started in the final years of the twentieth century. First, the local loop – the so-called last mile of the PSTN – continues under the dominance of copper wire pairs. The reason for this is largely economic. Fibre in the local loop makes ultra-broadband connection to the home possible, but it involves significant extra costs for cable installation and new terminal equipment, costs which (unlike those of the long-distance segment) cannot be spread over several subscribers. 
Furthermore, the power supply required to operate the subscriber unit is currently sent from the local exchange along the same copper wire pair that carries the message signal, whereas fibre being an insulator imposes the burden of an alternative power supply arrangement. But these problems are not insurmountable and optical fibre is now beginning to replace copper in the local loop and a twenty-first century broadband network is emerging as a natural successor to the PSTN legacy of the last century. The other frontier, the only one that is impregnable to optical fibre, is the connection or access provided by radio for mobile communication units on land, sea, and air, and in remote, undeveloped or disaster-stricken areas. The mobility and rapid deployment capabilities of radio in such applications are unique qualities that cannot be delivered by optical fibre.
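The attenuation and bandwidth figures quoted in this section can be checked with a short calculation. The sketch below is illustrative only: the 30 dB loss budget is a hypothetical value chosen for the example, the function names are our own, and extrinsic losses (bending, coupling, splicing) are ignored.

```python
C_M_PER_S = 299_792_458.0  # speed of light in vacuum

def carrier_frequency_ghz(wavelength_nm):
    """Optical carrier frequency f = c / wavelength, returned in GHz."""
    return C_M_PER_S / (wavelength_nm * 1e-9) / 1e9

def potential_bandwidth_ghz(carrier_ghz, fraction=0.05):
    """Bandwidth taken as a fixed fraction (5% in the text) of the carrier."""
    return fraction * carrier_ghz

def power_fraction_remaining(attenuation_db_per_km, length_km):
    """Fraction of launched power left after length_km of fibre."""
    loss_db = attenuation_db_per_km * length_km
    return 10 ** (-loss_db / 10)

def max_span_km(loss_budget_db, attenuation_db_per_km):
    """Longest unrepeated span for a given loss budget (intrinsic loss only)."""
    return loss_budget_db / attenuation_db_per_km

LOSS_BUDGET_DB = 30.0  # hypothetical budget, not a value from the text
for window_nm, atten in ((850, 3.0), (1310, 0.5), (1550, 0.2)):
    f_ghz = carrier_frequency_ghz(window_nm)
    print(f"{window_nm} nm: carrier {f_ghz:,.0f} GHz, "
          f"5% bandwidth {potential_bandwidth_ghz(f_ghz):,.0f} GHz, "
          f"span for {LOSS_BUDGET_DB:.0f} dB budget "
          f"{max_span_km(LOSS_BUDGET_DB, atten):.0f} km")
```

At 1550 nm the carrier works out at roughly 193 000 GHz, consistent with the figure of about 200 000 GHz used above, and the same loss budget stretches fifteen times further at 0.2 dB/km than at 3 dB/km, which is why lower attenuation permits wider repeater spacing.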
1.4 Communication System Elements
Modern communication systems vary widely in their applications and complexity, but they are all accurately represented by the block diagram shown in Figure 1.5. A variety of information sources feed a transmitter with the message signal to be transmitted. The transmitter transforms the message signal into a form that is compatible with the type of communication system and that is suitable for passage through the transmission medium with an acceptably small distortion. The output of the transmitter, known as the transmitted signal, is placed into the transmission medium, which conveys it to a receiver located at the intended destination. The received signal at the output of the transmission medium and input of the receiver is a distorted version of the transmitted signal. Noise, distortion, and reduction in strength have been introduced by the medium. The receiver has the task of removing (as far as possible) the transmission impairments and undoing each operation performed by the transmitter. It then delivers an exact or close copy of the original message to a user or information sink. The receiver is also selected to match the characteristics of the transmission medium and to be compatible with the type of communication system. In what follows, we give a brief overview of information sources and sinks. A nonmathematical introduction to the processing tasks of the transmitter and receiver is also presented, which lays a good foundation for a more detailed treatment in the rest of the book. However, an introduction to transmission media is postponed until the first section of Chapter 5, which is devoted to a detailed treatment of the topic.
1.4.1 Information Source
The information source or input device acts as an interface between the communication system and the outside world and provides the message signal that is processed by the transmitter. There are four main classes, namely audio, video, and data input devices, and sensors.
Figure 1.5 Block diagram of a communication system showing its major elements. [Diagram: information source (audio, data, and video input devices) → message signal → transmitter → transmitted signal → transmission medium → received signal → receiver → estimate of message signal → information sink (audio output, visual display, and storage devices).]
1.4.1.1 Audio Input Devices
The microphone is the most basic audio input device. It converts an acoustic (sound) pressure wave into an electrical signal (current or voltage) of similar variations, which serves as the message signal input of the communication system. A voltage supply is often required for the operation of the microphone, or for the amplification of the weak electrical signal it generates. Other audio input devices may include a music keyboard, musical instrument digital interface (MIDI), audio players, and other digital music instruments. There are different types of microphones based on the principle of their operation. Dynamic microphone: the dynamic or electromagnetic microphone is based on Faraday’s law of induction, which stipulates that a changing magnetic field will induce voltage in a conductor lying within the field. In the moving coil dynamic microphone shown in Figure 1.6, a wire coil enclosing a fixed permanent magnet is physically attached to a diaphragm. Pressure waves from a sound source fall on the diaphragm, causing it to vibrate along with the coil. The motion of the coil in the field of the permanent magnet induces an electromotive force (electrical output) in the coil, which varies in sync with the sound pressure waves. A moving magnet dynamic microphone is also possible in which the coil is fixed and the magnet is attached to the diaphragm. In this case, sound waves moving the diaphragm also move the magnet, causing a change in the magnetic field in which the coil lies and hence inducing electromotive force in the coil. A type of dynamic microphone with much better fidelity at high frequencies is the ribbon microphone. Here a light (low-inertia) aluminium ribbon is suspended edgewise between the poles of a magnet. Electromotive force is induced in the ribbon when it is moved in the magnetic field by air flowing past it from sound waves.
Piezoelectric microphone: when a piezoelectric material is subjected to mechanical strain, voltage is induced between its ends. The operation of a piezoelectric microphone is based on this important property. If one end of such a material is fixed and the other end is attached to a diaphragm, sound waves falling on the diaphragm will cause a strain in the material. This induces a voltage that varies according to the strain and hence according to the sound pressure. Carbon microphone: in a variable-resistance or carbon microphone, carbon granules are packed in a chamber formed by one fixed and one movable electrode. The movable electrode is attached to a diaphragm. Sound waves falling on the diaphragm vary the packing density of the granules and hence the resistance of the chamber. The variable resistance R is converted to a varying current I by connecting a constant voltage V across the chamber since, by Ohm’s law, I = V/R. Electret microphone: an electret microphone is of variable capacitance design. A capacitor is formed using two metallic plates, one fixed and the other movable. The movable plate is a thin metallic layer diaphragm. The fixed plate is a metal plate covered over with a film of Teflon-like plastic that has a permanent electric charge Q, hence the name electret. Air pockets are trapped between the electret and the fixed electrode due to irregularities on the surface of the electret. When sound waves fall on the movable diaphragm, the size of the air pockets and hence the capacitance C of the capacitor is varied. This produces a variable voltage V according to the relation V = Q/C. An alternative variable capacitance design is the condenser microphone in which the charge Q is maintained by an external voltage source.
Figure 1.6 Moving coil dynamic microphone. [Diagram: sound wave input → diaphragm with physically attached coil around a permanent magnet (N, S poles) → electrical output v(t).]
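The two relations quoted for the carbon and electret microphones, I = V/R and V = Q/C, can be explored numerically. The sketch below uses hypothetical component values (a 6 V supply, a chamber resistance near 100 Ω, a 50 pC electret charge, and a capacitance near 10 pF) purely for illustration; none of these figures comes from the text.

```python
def carbon_mic_current(voltage_v, resistance_ohm):
    """Ohm's law, I = V / R: current through the carbon chamber."""
    return voltage_v / resistance_ohm

def electret_mic_voltage(charge_c, capacitance_f):
    """V = Q / C: voltage across the capacitor for a fixed charge Q."""
    return charge_c / capacitance_f

V_SUPPLY = 6.0      # hypothetical constant supply voltage, volts
Q_ELECTRET = 50e-12  # hypothetical permanent charge, coulombs

# Sound pressure compresses or loosens the granules, changing R about 100 ohm.
for r in (90.0, 100.0, 110.0):
    print(f"R = {r:5.1f} ohm -> I = {carbon_mic_current(V_SUPPLY, r) * 1e3:.2f} mA")

# Sound pressure moves the diaphragm, changing C about 10 pF.
for c in (9e-12, 10e-12, 11e-12):
    print(f"C = {c * 1e12:4.1f} pF -> V = {electret_mic_voltage(Q_ELECTRET, c):.2f} V")
```

In both cases the electrical output moves inversely with the quantity that the sound wave modulates: a lower resistance gives a larger current, and a lower capacitance gives a larger voltage.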
1.4.1.2 Video Input Devices
A video input device presents a video message signal to the communication system. The video signal is a variable voltage that originates from a visual signal, which consists of light intensity (or luminance) and colour (or chrominance) information. The most basic video input devices are the video camera (e.g. digital camera, analogue camcorder, and webcam) for dynamic three-dimensional (3D) images and the scanner (e.g. image scanner, 3D scanner, barcode reader, and fingerprint scanner) for still images. Secondary input devices include those that read pre-recorded video signals such as the video cassette recorder (VCR), which reads analogue video signals stored on magnetic tape in two-reel cassettes, and digital video recorders.
1.4.1.3 Data Input Devices
There is a wide variety of data input devices, which generate data serving as the message signal input of a communication system. Many of these devices work in conjunction with a computer system or other suitable data terminal equipment (DTE). A few examples of data input devices are given below.
● A keyboard generates the Unicode or ASCII code corresponding to the key pressed.
● Several devices are used to convert finger and hand movement into computer commands, which may involve data generation under the control of a computer program. Examples of such devices include the mouse, trackball (an upside-down mouse), joystick, touchscreen, touchpad, pointing stick, and virtual reality systems.
● There are different types of devices that read stored data and present these to the communication system for processing and/or transmission. A barcode reader or laser scanner scans a barcode, which encodes information (e.g. the price of an item) in black lines of varying width. A magnetic-ink character reader reads information written on a document using magnetic ink characters. A magnetic strip reader reads data stored magnetically on magnetic strips, such as those found on credit cards, security access cards, etc. Optical character readers and optical mark readers are used (in conjunction with software) to read printed characters and to detect and decode marks on paper. Punched card readers read information stored as an array of small holes and spaces on paper. Other data-reading input devices include the optical disk drive for reading data from an optical disk, the disk drive for reading data from a floppy or hard disk, and the tape reader for reading data from a magnetic tape.
● A digitising tablet is used to convert graphics information (drawn on the tablet) into computer data. The drawing is displayed on a computer screen and the associated data may be processed by a communication system.
● A smart pen is used to capture handwritten or other graphics information, which may then be converted into data for processing in a communication system.
1.4.1.4 Sensors
Sensors measure physical quantities such as temperature, pressure, mass, etc., and convert the measurement into an electrical signal that serves as the message signal input for the communication system. Sensors are used in telemetry systems to obtain information from remote or inaccessible locations, e.g. monitoring the status of equipment on a spacecraft or detecting the accumulation of ice on the wings of an aircraft. They are also used in numerous systems including automatic data-logging systems, security systems, safety systems, traffic control systems, manufacturing systems, and process control systems. For example, a tipping-bucket rain gauge generates a voltage pulse each time a standard-size cup tips after filling with rainwater. The irregular sequence of voltage pulses serves as the message signal, which may be transmitted and analysed to obtain the rainfall rate. A security system may use a sensor to detect movement. An alarm (or message) signal is then generated and transmitted to a console and used to initiate appropriate action, such as the ringing of a bell. In the most general sense, every input device can be described as a sensor of some sort. The camera senses reflected light, the microphone senses sound pressure, the keyboard senses key press, the mouse senses hand movements, etc. However, the classification of input devices presented here is useful in identifying the type of information provided by the device to the communication system, whether audio, video, data, or a more general measurement of a physical quantity.
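The tipping-bucket example shows how an irregular pulse train can carry a measurement. A minimal sketch of the analysis step is given below, assuming that each tip corresponds to a fixed depth of rain (0.2 mm per tip is a hypothetical calibration, not a value from the text) and that the receiver has recorded the time of each pulse in seconds.

```python
def rainfall_rates_mm_per_hr(tip_times_s, mm_per_tip=0.2):
    """Rainfall rate between consecutive tips, in mm/h.

    Each tip marks mm_per_tip of rain collected since the previous tip,
    so the rate over that interval is mm_per_tip divided by the elapsed time.
    """
    rates = []
    for earlier, later in zip(tip_times_s, tip_times_s[1:]):
        interval_s = later - earlier
        rates.append(mm_per_tip * 3600.0 / interval_s)
    return rates

def total_rainfall_mm(tip_times_s, mm_per_tip=0.2):
    """Accumulated rainfall represented by all recorded tips."""
    return mm_per_tip * len(tip_times_s)

# Hypothetical pulse times: the tips arrive closer together as the rain intensifies.
tips = [0.0, 60.0, 90.0, 100.0]
print(rainfall_rates_mm_per_hr(tips))
print(total_rainfall_mm(tips))
```

Shorter gaps between pulses translate directly into higher rainfall rates, which is why the timing of the pulses, rather than their amplitude, is the information-bearing feature of this message signal.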
1.4.2 Information Sink
An information sink is the ultimate destination of a transmitted signal. It may serve as an interface between human users and the communication system, making the transmitted information understandable through sight or sound. It may also serve as a repository, storing the transmitted information for later processing, re-transmission, or display. There are therefore three types of information sink, namely audio output devices, visual display devices, and storage devices.
1.4.2.1 Audio Output Device
Generation of sound: the loudspeaker is an audio output device that converts the electrical signal output of a communication system into mechanical vibrations at the same frequencies. Figure 1.7 shows the basic operation of a moving coil loudspeaker. A cone is physically attached to a wire coil, which encloses a permanent magnet lying along its axis. The electrical signal is fed as current into the coil. This current sets up a magnetic field that varies proportionately with the current and has maximum strength along the coil axis. The interaction of the field of this electromagnet with the field of the permanent magnet results in a force that varies in sync with the field strength of the electromagnet and hence with the input current. This varying force causes the coil along with the attached cone to vibrate, thereby producing sound waves that follow the input current variations. The vibrations also generate sound waves travelling in the backward direction, which are reflected by the loudspeaker casing, and may interfere in a destructive manner with the forward waves. To prevent this, the casing is padded with sound-absorbent material. It is usual to have separate loudspeakers, a woofer for the low frequencies and a tweeter (with a smaller cone) for the high frequencies. Sound pressure level (SPL): a sound signal or sound wave is an oscillation in pressure, stress, particle displacement, or particle velocity in an elastic medium such as air. When this sound wave is received and processed by the ear–brain mechanism, the sensation of hearing is produced, provided two conditions are met: (i) the oscillation or vibration frequency must be within the audible range; and (ii) the SPL produced by the vibrations must be high enough, i.e. above the threshold of hearing. The ear only perceives sound signals of frequency in the range 20 Hz to 20 kHz. This is the theoretical audible range of frequencies. 
However, in practice most people have a narrower hearing range, with the top end decreasing with age after about age 30 years. Sound pressure may be expressed in dyne/cm², microbar, or newton/m², where
$$1\ \text{dyne/cm}^2 = 1\ \text{microbar} = 0.1\ \text{N/m}^2 \tag{1.1}$$
Figure 1.7 Audio output device: a loudspeaker. (Diagram labels: electrical input v(t); coil; cone, physically attached to coil; permanent magnet with N and S poles; sound output.)

The threshold of hearing for an average person below age 30 when the sound signal is at a frequency of 1 kHz is 0.0002 dyne/cm². That is, sound of 1 kHz in frequency will be heard (by this standard listener) only if the vibrations produce (in the ears) a sound pressure of at least 0.0002 dyne/cm². This value has been adopted as a reference P_REF for expressing SPL in decibel (dB). Thus, the SPL of a sound of pressure P dyne/cm² is expressed in dB above P_REF as

$$\text{SPL} = 20\log_{10}\left(\frac{P}{P_{\text{REF}}}\right)\ \text{dB}, \quad \text{where } P_{\text{REF}} = 0.0002\ \text{dyne/cm}^2 \tag{1.2}$$
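As a quick numerical check of Eq. (1.2), the short Python sketch below converts a sound pressure into SPL and back again. The function and constant names (`spl_db`, `pressure_from_spl`, `P_REF_DYNE_PER_CM2`, `dyne_per_cm2_to_pascal`) are illustrative choices rather than anything defined in the text; only the reference pressure of 0.0002 dyne/cm² and the unit relation of Eq. (1.1) are taken from the discussion above.

```python
import math

# Reference pressure adopted in Eq. (1.2), in dyne/cm^2.
P_REF_DYNE_PER_CM2 = 0.0002


def dyne_per_cm2_to_pascal(pressure: float) -> float:
    """Convert dyne/cm^2 to N/m^2 using Eq. (1.1): 1 dyne/cm^2 = 0.1 N/m^2."""
    return 0.1 * pressure


def spl_db(pressure_dyne_per_cm2: float) -> float:
    """Sound pressure level in dB relative to P_REF, as in Eq. (1.2)."""
    if pressure_dyne_per_cm2 <= 0:
        raise ValueError("sound pressure must be positive")
    return 20.0 * math.log10(pressure_dyne_per_cm2 / P_REF_DYNE_PER_CM2)


def pressure_from_spl(spl: float) -> float:
    """Invert Eq. (1.2): sound pressure in dyne/cm^2 for a given SPL in dB."""
    return P_REF_DYNE_PER_CM2 * 10.0 ** (spl / 20.0)


if __name__ == "__main__":
    # A pressure equal to the reference sits at the threshold of hearing (0 dB).
    print(round(spl_db(0.0002), 1))
    # Every tenfold increase in pressure adds 20 dB.
    print(round(spl_db(0.002), 1))
    # Pressure at the 120 dB threshold of pain, in dyne/cm^2 and in N/m^2.
    p_pain = pressure_from_spl(120.0)
    print(round(p_pain, 3), round(dyne_per_cm2_to_pascal(p_pain), 3))
```

Because the SPL scale is logarithmic, the 120 dB dynamic range of the ear corresponds to a pressure ratio of one million between the threshold of pain and the threshold of hearing.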
The SPL of ambient noise in a quiet home is about 26 dB, that of formal conversation is about 62 dB, whereas that of jet planes at take-off is about 195 dB. The threshold of hearing increases irreversibly with age above 30 and is raised to a higher level in the presence of ambient noise. The second effect is transient and is referred to as noise masking. The dynamic range of the ear is typically 120 dB, bounded at the bottom by the threshold of hearing and at the top by the threshold of pain. Sounds of SPL < 0 dB are inaudible, whereas sounds of SPL ≥ 120 dB will cause pain and may result in an immediate and irreversible loss of hearing.

Frequency response of the ear: whereas SPL is an objective measure of the amplitude of sound vibrations, another closely related term, loudness, is not. Loudness is the response of the human ear to the amplitude of sound waves. It is a subjective attribute that allows us to place a given sound at a point on a scale from soft to loud. The loudness level of a sound is expressed in phon, which is numerically equal to the SPL (in dB) of a 1 kHz reference tone judged to be equally loud. The sensitivity or response of the human ear to sound varies markedly with frequency. A 1 kHz sound at an SPL of 130 dB is deafeningly loud, but a 22 kHz sound at the same SPL is not loud at all. In fact, you would not hear the latter (because it is outside your audible range), although both vibrations produce the same high level of sound pressure. The ear acts like a band pass filter, shutting out all frequencies outside the audible band. Within the audible band itself, the response of the ear is far from uniform. Figure 1.8 shows the ISO 226 standard curves of the SPL required to maintain constant loudness at various frequencies. It shows, for example, that to produce sound of loudness 40 phon, one needs an SPL of 40 dB at 1 kHz and a higher SPL of 77 dB at 30 Hz.
We observe also that the ear’s frequency response is more variable when dealing with softer sound, but reasonably flat with loud sounds (SPL > 100 dB).
Figure 1.8 Contours of equal loudness (phon) as a function of frequency and sound pressure level. (Axes: frequency, Hz, from 20 Hz to 12.5 kHz; sound pressure level (SPL), dB.)
1 Overview of Communication Systems
1.4.2.2 Visual Display Devices
Three broad classes of visual display devices are common as information sinks in communication systems. Cathode-ray tube (CRT) and flat panel displays present what is usually called a soft copy of the received message signal, whereas printers produce a hard copy of the message. Cathode-ray tube (CRT): cathode ray tubes display information (pictures, text, graphics, etc.) by using an accelerated beam of electrons to excite electrons in phosphor atoms, which emit visible energy as they return to their unexcited state. The basic parts of a CRT are shown in Figure 1.9. The CRT consists of an electron gun, a focusing system, a deflection system, and a phosphor-coated screen. Electrons are boiled off a hot cathode surface, a process called thermionic emission. The number of electrons emitted by the gun and hence the brightness of the display is controlled by a negative voltage on the control grid. A focusing system is needed to prevent beam spreading due to mutual repulsion among the electrons and to focus the electron beam to a small point on the screen. The focusing system may be an electrostatic lens formed by the electric field of a positively charged metal cylinder, or a magnetic lens formed by the magnetic field of a current-carrying coil mounted around the outside of the CRT envelope. The deflection system is used to direct the beam to any spot on the screen. Figure 1.9 shows an electrostatic deflection system. Magnetic deflection may be obtained using current in a pair of coils mounted on the top and bottom of the neck of the CRT envelope for horizontal deflection and another pair of current-carrying coils mounted left and right for vertical deflection. The phosphor atoms on the screen at the spot impacted by the accelerated electron beam will glow for a short time. The duration or persistence of the glow and its colour depends on the type of phosphor. 
To maintain an image without flicker, the entire screen is refreshed at a regular rate by repeatedly moving the electron beam along rows of the screen from top to bottom, while simultaneously varying the beam intensity to reproduce the desired image. For a colour display, each spot has three phosphor dots that glow red, green, and blue, respectively. Three electron guns are used, one for each colour dot. Any desired colour can be produced at each spot by additive mixing of these three primary colours in the right intensities. For example, a yellow colour is obtained by exciting the green and red phosphor dots equally without exciting the blue phosphor dot – that is, the electron gun aligned with the blue phosphor dot is switched off. Despite its excellent display quality, the weight, bulk, and power consumption of a CRT make it unsuitable for portable devices (e.g. pocket calculators) and the large displays required for example in high-definition television (HDTV). After the turn of the century, CRTs began to be superseded by flat panel displays.

Figure 1.9 Basic components of a CRT. (Diagram labels: heating filament, cathode, control grid, electron gun, focusing anode, accelerating anode, focusing system, horizontal and vertical deflection plates, deflection system, electron beam, phosphor-coated screen.)

Flat panel displays: flat panel displays have smaller volume, weight, and power consumption than a CRT. They are now the preferred technology for use in a wide range of applications requiring monitors, e.g. TV, computers, mobile telephone units, etc. There are two main classes of flat panel displays. One class, called emissive display, works by converting electrical energy into light. The other class, called non-emissive display, uses an arrangement that modulates (i.e. blocks or passes to varying degrees) light from an external or internal source, thereby creating a graphics pattern. Examples of an emissive flat-panel display include a light-emitting diode (LED) display, plasma-panel (or gas discharge) display, and thin-film electroluminescent display. In an LED display, a matrix of diodes is arranged to form the screen spots or picture elements (called pixels) and a picture is displayed by applying a voltage to light the diode at each required location. Recent developments have led to widespread use of organic light-emitting diode (OLED) displays in which the electroluminescent substance is an organic compound that emits light in response to electric current. Sony produced the world’s first TV set using an OLED display screen in 2007. In addition to TV and computer screens, OLED displays are now used in a variety of portable devices, including the smartphone. In plasma-panel and electroluminescent displays, the volume between two glass panels is filled with a suitable substance – a mixture of neon and some other gases for the former, and a manganese-doped phosphor for the latter. One panel carries a series of vertical conducting ribbons, and the other carries a series of horizontal ribbons. The intersection between a vertical ribbon and a horizontal ribbon defines a display pixel. To light a pixel, firing voltages are applied to the pair of ribbons that intersect at that point. This breaks down the gas at the pixel to form glowing plasma, or, in the case of the electroluminescent display, causes the manganese atoms at the pixel to absorb electrical energy, which makes them glow. The firing is refreshed at a regular rate (e.g.
60 Hz) using the picture definition stored in a refresh buffer. A liquid crystal display is non-emissive. In this case, the volume between two glass plates is filled with a liquid crystal compound (LCC), a substance that has a crystalline arrangement of molecules but flows like a liquid. One of the glass plates contains rows of transparent conductors and a light polariser. The other plate contains columns of transparent conductors and a light polariser at right angles to the polariser in the other plate. The intersection of a row conductor with a column conductor defines a pixel. A pixel is turned on when the LCC molecules at its location are unaligned and is turned off when these molecules are forced to align by the application of a voltage. This happens because the unaligned molecules at the pixel twist the polarised light, causing it to pass through both polarisers. On the other hand, if a voltage is applied to two conductors it aligns the LCC molecules at their intersection and prevents the twisting of the light so that it cannot pass through both polarisers, thereby becoming blocked. Printers: impact printers produce hard copies by pressing the face of formed characters against an inked ribbon onto the paper. The characters are usually formed using a dot-matrix print head, which has a rectangular array of protruding pins, by retracting appropriate pins. The quality of the print depends on the dot size and on the number of dots per inch (or lines per inch) that can be displayed. Various nonimpact printing techniques have been devised. Laser printers create a charge distribution on a rotating drum using a laser beam. Ink-like toner particles are then emitted from a toner cartridge onto the drum. These particles adhere only to the charged parts of the drum. The particles are then attracted from the drum to the paper and pressed into the paper by heated rollers. 
Ink-jet printers use an electric field to deflect an electrically charged ink stream to produce dot-matrix patterns on paper. Electrostatic printers charge the paper negatively at the right spots. These spots then attract a positively charged toner to create the required picture or text. Electrothermal printers apply heat in a dot-matrix print head to create the required pattern on a heat-sensitive paper. Colour printing is obtained by mixing the right quantity of the colour pigments cyan, magenta and yellow. For example, equal mixture of magenta and cyan gives blue and equal mixture of all three pigments gives black. Laser colour printers deposit the three pigments on separate passes to produce a range of colour patterns, while ink-jet colour printers shoot the three colours simultaneously. Finally, we should mention that 3D printing has become commonplace in recent years. If viewed as an additive manufacturing process then a 3D printer is not a display device. However, if seen as a process to produce a physical
3D object from information stored on a computer then a 3D printer qualifies as a display device. A 3D printer produces a physical object through layer upon layer deposition of material. Information stored on the computer specifies the model to be produced, including the material to be deposited at each voxel (i.e. 3D pixel).
1.4.2.3 Storage Devices
Many communication services are designed to store the received message signal for visual or audio display later. In email, the text message is stored at the recipient’s email server until the recipient accesses the server to obtain a soft copy (screen display) or hard copy (printout) of the message. Voicemail service provides for the storage of digitised speech message, which is played back later to the authorised recipient at their request. In these services, the information sink (in Figure 1.5) is an appropriate storage medium. The desirable characteristics of a storage medium include large storage capacity, high access and data-reading speeds, portability, durability, low cost, data integrity, and reusability. Four types of storage media are most used as information sinks in communication systems. Magnetic tape: this is made of a thin coating of a film of magnetic metal or ferromagnetic particles (embedded in a binding material) on a plastic strip of width 6–50 mm and length up to 1.5 km. The overall thickness of the tape (magnetic coating and substrate strip) is about 15–60 μm. The tape is contained in a two-reel cassette and is read from and written to in a tape drive or deck. The drive winds the tape from one reel to the other, causing it to move past a read/write head. Magnetic tapes offer large storage capacity and come in a variety of incompatible formats. The leading brand, known as Linear Tape-Open (LTO) Ultrium, offers a capacity of up to 30 terabytes (TB) when data are compressed. Other advantageous features of the magnetic tape include that they are reusable and easy to edit, compact, and portable. However, they have several demerits, which limit their use mostly to data archiving and hard disk backup. They are easily damaged through breaking or deterioration if exposed to heat or humidity. Recorded information degrades with continuous use, due to tape wear. Information can also be ruined by strong magnetic interference. 
Furthermore, access to data on the tape is slow and sequential.

Magnetic disk: magnetic disks may be fixed or removable. A flat-disk substrate is coated with a thin film of magnetic material. The disk is either a floppy disk, with a flexible substrate, or a hard disk, also known as a hard disk drive (HDD), with a rigid substrate. The surface of the disk is divided into several evenly spaced concentric tracks and each track is divided into sectors. Data are stored in these sectors. During a read/write operation, the disk is rotated at constant speed and the read/write head is moved in a radial direction to position it at the desired track. This radial movement is referred to as seeking. Floppy disks are magnetically coated on both sides and contained within a plastic case with a spring-loaded door that is forced open when the disk is inserted in its drive. An HDD, including the disk and read/write head, is hermetically sealed to prevent contamination by dust particles and the head glides on a thin film of air without making actual contact with the disk. A PC hard drive usually consists of several disks (with a head for each disk) mounted on the same rotating axle. In this case, the set of tracks at the same radius on all the disks is called a cylinder. HDDs have two standardised form factors. The 3.5 in. disk has dimensions (length × width × height) = 146.99 mm × 101.85 mm × 26.11 mm, whereas the 2.5 in. disk has a standard length of 100 mm and width of 69.85 mm but has been produced in a variety of thicknesses ranging from 5 to 19 mm. HDD storage capacities have grown significantly over the years and are currently as high as 14 TB and 5 TB, respectively, for the 3.5 in. and 2.5 in. drives. Magnetic disks have several advantages. Access time is less than 28 ms in high-speed drives. Random data access is possible using an index or directory (stored at a specific location on the disk) that lists the physical location of all data objects.
Magnetic disks are reusable, cheap, and easy to write to, read from, or edit. The removable hard disks and floppy disks are also portable. However, the magnetic disks can fail, data on the disk becoming unreadable if certain critical sectors are damaged or otherwise corrupted. Floppy disk drives are much slower than hard drives and have very limited storage. Floppy disks came in two standard sizes of 5.25 in. (with an initial storage capacity of 360 kB, later improved to 1.2 MB) and 3.5 in. (with a storage capacity of 1.44 MB). They were extremely popular and ubiquitous as a portable storage medium for
computer files in the 1980s and 1990s, but at the turn of the century they were superseded by more rugged and higher-capacity portable storage media such as the compact disk and the USB memory stick, and from around 2006 computers were no longer manufactured with floppy disk drives. To find a higher-capacity and compatible replacement for the floppy disk, efforts were made in the 1990s to introduce the 3.5 in. floptical disk, which used an optical tracking mechanism to improve head positioning and with a data storage capacity of up to 120 MB. These efforts ultimately failed in the market due primarily to the success of the optical disk.

Optical disk: optical disks include CD-R, magneto-optical, DVD-R, and DVD-RAM discussed below. The compact disk (CD) is based on a 12 cm plastic disk that stores data in the variation of the reflectivity of densely packed spots on the disk surface. The disk is read by bouncing a laser beam off the surface. Both the compact disk recordable (CD-R) and the digital versatile disk recordable (DVD-R) are write-once, read-many. On the other hand, the magneto-optical disk and the DVD random-access memory (DVD-RAM) can be erased and rewritten. CDs and DVDs have numerous advantages. They provide a very large storage capacity, 650 MB for a CD and up to 17 GB for a double-sided, dual-layer DVD. Data stored on a CD-R cannot be altered later, which ensures the integrity of archived information. Optical disks are very durable as the reading process is without physical contact with the disk and so does not cause any mechanical wear. Stored information can be read countless times with practically no degradation. There is ample provision for error correction and the error rate is below 10⁻¹⁵. The disk is cheap, portable, compact, and not easily damaged, and the recording technique is immune to electric and magnetic fields. The main disadvantages of optical disks are that most of them, except magneto-optical disks and DVD-RAM, are not reusable.
Once data have been written, they cannot be changed. Secondly, unlike magnetic disks that use the same head for reading and writing, a special disk writer is required to write information onto an optical disk. Thirdly, CD access times are about 10–20 times longer than those of magnetic hard disks.

Solid state disk (SSD): an SSD, also called a flash drive, stores data in non-volatile semiconductor memory cells made from metal-oxide-semiconductor field-effect transistors (MOSFETs) and fabricated as integrated circuit (IC) chips. Unlike magnetic hard disks, SSDs have no moving parts, run silently, and have much quicker access times. The 2.5 in. SSD has a storage capacity that may be as high as 4 TB. Since around 2006, SSDs have grown in popularity as the preferred storage medium in high-end notebooks, laptops, and PCs.
1.4.3 Transmitter
The transmitter is essentially a signal processor. Its primary role is to transform the message signal into a form that best suits the transmission medium, complies with any regulations governing the communication service, and meets system design objectives. How the transmitter does this depends on the type of information source, transmission medium, and communication system. For example, if the transmission medium is radio then the signal processing tasks of the transmitter would necessarily include modulation and radiation. The transmitter would process the signal in such a way as to ensure that it is transmitted in the authorised frequency band and at the permitted transmitted power level. If the main design objective is to minimise the cost of receivers then simple amplitude modulation (AM) would be preferred to a more complex digital signal processing technology. Subject to the above considerations, the signal processing performed by the transmitter may include one or more of the following processes. A detailed discussion of most of these processes can be found in later chapters of the book.

Source coding: also called source encoding, this deals with the efficient representation of the message signal. Several processes may be involved, depending on the type of communication system. The following are examples of source coding.
● Lowpass filtering to limit the bandwidth of the message signal.
● Multiplexing to combine several message signals for simultaneous transmission.
● Message formatting to represent the message in digital format. This includes character coding of text and graphics using a suitable character code, such as Unicode, and analogue to digital conversion (ADC) of an analogue message signal, such as voice and video.
● Data compaction to minimise the number of symbols or bits that represents the message. Examples of data compaction include variable-length codes (e.g. Morse code and Huffman code), which assign shorter codewords to more frequent source symbols, and the Lempel–Ziv algorithm, which uses a codebook built from the message.
● Lossy data compression to reduce the number of bits used for representing the message by removing some details considered insignificant because a human observer will not perceive their absence. Examples of lossy data compression include predictive coding (e.g. differential PCM and delta modulation), low bit rate speech coding, transform coding, motion compensation, subsampling, colour table, and truncation.
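To make the idea of data compaction by a variable-length code concrete, the following Python sketch builds a Huffman code for a short message, so that more frequent symbols receive shorter codewords. It is a minimal illustration under simplifying assumptions: symbol statistics are taken from the message itself, and the names `huffman_code`, `encode`, and `decode` are hypothetical helpers chosen for this example, not part of any system described in the text.

```python
import heapq
from collections import Counter
from typing import Dict


def huffman_code(message: str) -> Dict[str, str]:
    """Build a prefix-free code in which frequent symbols get shorter codewords."""
    if not message:
        raise ValueError("message must not be empty")
    freq = Counter(message)
    if len(freq) == 1:
        # Degenerate case: a single distinct symbol still needs one bit.
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, tie_breaker, {symbol: partial codeword}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    tie_breaker = len(heap)
    while len(heap) > 1:
        # Merge the two least probable groups, prefixing 0 and 1 respectively.
        w0, _, group0 = heapq.heappop(heap)
        w1, _, group1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in group0.items()}
        merged.update({s: "1" + c for s, c in group1.items()})
        heapq.heappush(heap, (w0 + w1, tie_breaker, merged))
        tie_breaker += 1
    return heap[0][2]


def encode(message: str, code: Dict[str, str]) -> str:
    """Concatenate the codewords of the message symbols."""
    return "".join(code[symbol] for symbol in message)


def decode(bits: str, code: Dict[str, str]) -> str:
    """Recover the message; this works because no codeword prefixes another."""
    inverse = {codeword: symbol for symbol, codeword in code.items()}
    symbols, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            symbols.append(inverse[current])
            current = ""
    return "".join(symbols)


if __name__ == "__main__":
    text = "aaaabbc"
    code = huffman_code(text)
    bits = encode(text, code)
    # A fixed-length code for three symbols needs 2 bits per symbol (14 bits).
    print(code, len(bits), decode(bits, code) == text)
```

For the message `aaaabbc` the most frequent symbol `a` receives a one-bit codeword and the total falls from 14 bits (fixed-length, 2 bits per symbol) to 10 bits, with the original message recovered exactly, as required of a lossless compaction scheme.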
Encryption: the message – in this context referred to as plaintext – is processed at the transmitter and converted into a ciphertext, which is disguised in some way to ensure security. The process may be accomplished by scrambling the bits of the plaintext using an encryption key. Only authorised receivers will have the correct decryption key with which to decipher or decrypt the ciphertext back to its original plaintext. This process is illustrated in Figure 1.10.

Figure 1.10 Encryption at the transmitter and decryption at the receiver. (Diagram labels: plaintext, encryption, encryption key, ciphertext, decryption, decryption key.)

Channel coding: also called channel encoding, this process is necessary to ensure that the message signal is compatible with the transmission medium (or channel). It attempts to protect the message signal against channel distortion and transmission errors. It may add some redundancy to the transmitted symbols. Examples of channel coding include pre-emphasis/de-emphasis in analogue systems (e.g. FM), line coding in digital baseband systems, and provision for error detection and correction at the receiver. In fact, the process of carrier modulation introduced next is also a form of channel coding, which is required for some transmission media such as radio. Channel coding is, however, often defined very narrowly as just error control coding, a simple example of which is single parity check in which the message bits are taken k bits at a time and one extra bit (called parity bit) is added to produce a codeword of k + 1 bits having an even number of binary 1s (for even parity check) or an odd number of binary 1s (for odd parity check). With this scheme, the receiver can detect the occurrence of a single bit error or any odd number of bit errors in a codeword. However, an even number of bit errors (two, four, and so on) in a codeword will go undetected. Nevertheless, this simple scheme does lead to a significant reduction in the number of undetected errors. For example, if a system incurs on average one character error per transmitted page of text, the introduction of single parity check with k = 7 leads to the number of undetected errors being reduced to one character in about 2500 pages of text on average. A large variety of error control coding schemes have been devised over the years with capabilities that far exceed those of the single parity check, allowing not only the detection of a wide pattern of bit errors but also the reliable correction of one or more bit errors in a codeword.

Carrier modulation: this process translates the message signal frequencies from baseband into a frequency band that is suitable for and allows efficient exploitation of the transmission medium.

Spread spectrum modulation: this involves the transmitted signal being deliberately spread out over a wide frequency band. This technique was originally designed for military communications to protect against frequency-selective fading, interference, and intentional jamming; but it is now also employed as a multiplexing and multiple access technique in non-military communications.

Radiation: in a wireless communication system, the message signal must be processed and radiated as electromagnetic waves in the desired direction. The radiating element required depends on the type of wireless system. In general, antennas of various designs are used for radio communication systems and light-emitting diodes or laser diodes are employed in infrared and optical communication systems.

Figure 1.11 Communication systems: (a) Simplex; (b) Half-duplex; (c) Full duplex; (d) Point-to-point; (e) Point-to-multipoint; (f) Multipoint-to-multipoint.
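The single parity check described under channel coding can be illustrated with a few lines of Python. The sketch below uses even parity with k = 7 message bits; the function names `add_even_parity` and `parity_ok` are illustrative assumptions for this example only. It shows that a single bit error is detected, whereas a double bit error passes the check unnoticed.

```python
from typing import List


def add_even_parity(message_bits: List[int]) -> List[int]:
    """Append one parity bit so that the codeword has an even number of 1s."""
    return message_bits + [sum(message_bits) % 2]


def parity_ok(codeword: List[int]) -> bool:
    """Even parity check at the receiver: True if no error is detected."""
    return sum(codeword) % 2 == 0


def flip(codeword: List[int], positions: List[int]) -> List[int]:
    """Return a copy of the codeword with the bits at the given positions inverted."""
    corrupted = list(codeword)
    for position in positions:
        corrupted[position] ^= 1
    return corrupted


if __name__ == "__main__":
    message = [1, 0, 1, 1, 0, 0, 1]           # k = 7 message bits
    codeword = add_even_parity(message)       # k + 1 = 8 transmitted bits
    print(parity_ok(codeword))                # error-free reception passes
    print(parity_ok(flip(codeword, [2])))     # single bit error is detected
    print(parity_ok(flip(codeword, [2, 5])))  # double bit error goes undetected
```

The check can only flag that something is wrong; it cannot say which bit is in error, which is why more capable error control codes are needed when correction is required.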
1.4.4 Receiver
A receiver is required at the other end of the transmission medium (i.e. destination of the transmitted signal) to process the received signal to obtain a close estimate of the message signal and deliver this to the information sink. Note that the receiver’s input comes from the original message signal after it has been processed by two blocks of the communication system, purposely in the transmitter and undesirably in the transmission medium. The receiver undoes the processing of the transmitter in a series of steps performed in reverse order to that of the transmitter. This may include one or more of the following tasks.
● Radio reception using a receive-antenna to convert the received electromagnetic waves back to a voltage signal.
● Spread spectrum demodulation to remove the carrier spreading.
● Carrier demodulation to translate the signal back to baseband frequencies.
● Clock extraction and synchronisation to recover the clock signal (if any) used at the transmitter in order to use the same timing intervals for operations at the transmitter and the receiver.
● Channel decoding to correct errors, remove redundancy, etc.
● Decryption to recover the plain and undisguised symbol sequence.
● Source decoding to recover the original message signal from the symbol sequence. This may involve demultiplexing – which breaks up a composite (multiplexed) signal into the components belonging to each of the multiple users – digital-to-analogue conversion (DAC), lowpass filtering, etc.
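The statement that the receiver undoes the transmitter's processing in reverse order can be sketched with a toy baseband chain in Python. The stages chosen here are deliberately simplistic stand-ins and are assumptions made for illustration: source coding is plain UTF-8 bit formatting, encryption is a repeating-key XOR (not a secure cipher), and channel coding is a three-fold repetition code decoded by majority vote. Modulation, radiation, and synchronisation are omitted.

```python
from typing import List


def source_encode(text: str) -> List[int]:
    """Message formatting: text to a bit sequence (UTF-8, most significant bit first)."""
    return [(byte >> i) & 1 for byte in text.encode("utf-8") for i in range(7, -1, -1)]


def source_decode(bits: List[int]) -> str:
    """Inverse of source_encode: regroup the bits into bytes and decode."""
    data = bytearray()
    for start in range(0, len(bits), 8):
        byte = 0
        for bit in bits[start:start + 8]:
            byte = (byte << 1) | bit
        data.append(byte)
    return data.decode("utf-8")


def xor_with_key(bits: List[int], key: List[int]) -> List[int]:
    """Toy scrambling with a repeating key; applying it twice restores the input."""
    return [bit ^ key[i % len(key)] for i, bit in enumerate(bits)]


def channel_encode(bits: List[int]) -> List[int]:
    """Add redundancy by sending every bit three times."""
    return [bit for bit in bits for _ in range(3)]


def channel_decode(symbols: List[int]) -> List[int]:
    """Majority vote over each group of three removes the redundancy."""
    return [1 if sum(symbols[i:i + 3]) >= 2 else 0 for i in range(0, len(symbols), 3)]


def transmit(text: str, key: List[int]) -> List[int]:
    """Transmitter order: source coding, then encryption, then channel coding."""
    return channel_encode(xor_with_key(source_encode(text), key))


def receive(symbols: List[int], key: List[int]) -> str:
    """Receiver order is reversed: channel decoding, decryption, source decoding."""
    return source_decode(xor_with_key(channel_decode(symbols), key))


if __name__ == "__main__":
    key = [1, 0, 1, 1, 0]
    sent = transmit("Hi", key)
    sent[4] ^= 1  # one bit error introduced by the transmission medium
    print(receive(sent, key))
```

The isolated channel error is corrected by the majority vote before decryption is attempted, which is why channel decoding precedes decryption and source decoding in the list above.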
1.5 Classification of Communication Systems
There are four major ways of classifying the wide variety of communication systems in operation today. If we consider the direction of transmission then the communication system will be either simplex or duplex. If we consider the type of signal transmitted by the system then we have either an analogue communication system or a digital communication system. Comparing the frequency content of the transmitted signal to that of the original signal leads to a classification of the system as either a baseband communication system or a modulated (or bandpass) communication system. Finally, in terms of how a transmission path is established to convey the signal from source to destination, we may have either a circuit-switched or packet-switched system.
1.5.1 Simplex Versus Duplex Communication Systems
If information flows in only one direction, the communication system is referred to as simplex (SX). Typically, information originates from one transmitter and has one or more receivers as its destination. These receivers do not have the capability of responding through the same communication link. By the above explanation, interactive digital television broadcast is a simplex communication system. Although a customer can respond (via a telephone link), information transmissions from the television house to the customer and from the customer to the television house are carried out on two different links. Radar, a system that measures electromagnetic signals reflected off objects (such as ships, aircraft, missiles, storms, hydrometeors, etc.) in order to determine their range and velocity, is also a simplex system. In this case, although the signal may be reflected by the object back towards the transmitter S, giving an appearance of bi-directional information flow, there is strictly speaking only one transmitter, which is co-located with one receiver. The reflecting object is neither a transmitter nor a receiver. Other examples of simplex systems include audio broadcast (AM radio, FM radio, music services, and digital radio), television broadcast (satellite, cable, and terrestrial), paging services, telemetry, and remote control.

When information can flow in both directions on the same link, the communication system is referred to as duplex. The communication equipment at both locations S and R is equipped with transmission and reception capabilities and is therefore referred to as a transceiver – a name that is a portmanteau of the words transmitter and receiver. The system may be designed in such a way that simultaneous communication in both directions is possible. This is a full duplex (FDX) system and requires a separate channel (e.g. a different band of frequencies, a different time slot, or a different wire pair) to be allocated for each direction of communication. If, on the other hand, information can only flow in one direction at a time then the system is referred to as half-duplex (HDX).
The most common example of an FDX system is public telephony (both fixed and mobile). Broadband communication via DSL in the PSTN is also FDX, as is computer interconnection in local area networks (LANs). An example of an HDX system is the walkie-talkie used for wireless voice communication between two locations. Transmission in either direction uses the same radio frequency band. The handset at each location can be switched between transmit and receive modes, so that at any given time one location transmits while the other receives.

In both simplex and duplex systems, if communication takes place between only two transceivers then the system may be further described as a point-to-point communication system. If there is one transmitter or transceiver communicating with several receivers or transceivers, we have a point-to-multipoint system. If there are many intercommunicating transceivers (as in a LAN, or a video conference system linking more than two locations) then we have what is called a multipoint-to-multipoint communication system. In this last case, information flow between two transceivers or nodes is essentially bi-directional and therefore a simplex multipoint-to-multipoint system is not possible. The Internet is a multipoint-to-multipoint communication system. A radio or television broadcast system is a good example of a point-to-multipoint simplex system. See Figure 1.11.

An important limitation of simplex systems is that there is no return path for the receiver to automatically request re-transmission in the event of an error. Thus, there are two options to deal with the problem of errors: (i) ignore them in noncritical systems or (ii) include sufficient redundancy in the transmitted signal so that, in the event of an error, the receiver can make a reliable guess of the transmitted information. The guess may occasionally be wrong. This system of error correction is called forward error correction (FEC). Duplex systems are not under this limitation and can also use the more reliable technique of error correction called automatic repeat request (ARQ), in which the receiver automatically requests retransmission once an error is detected.
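The idea of FEC described above can be illustrated with the simplest possible scheme, a repetition code with majority voting. This is only a sketch: the text does not name a particular code, and the function names and the choice of three repetitions are assumptions made for illustration.

```python
def fec_encode(bits, n=3):
    """Add redundancy by repeating every bit n times (n should be odd)."""
    return [b for b in bits for _ in range(n)]


def fec_decode(received, n=3):
    """Guess each transmitted bit by majority vote over its n received copies."""
    groups = [received[i:i + n] for i in range(0, len(received), n)]
    return [1 if sum(g) > n // 2 else 0 for g in groups]


message = [1, 0, 1, 1]
sent = fec_encode(message)

# One error per group of three: the receiver's guess is still correct.
corrupted = sent.copy()
corrupted[1] ^= 1
corrupted[9] ^= 1
decoded = fec_decode(corrupted)

# Two errors in the same group: the guess is wrong, and the receiver
# of a simplex link has no return path for requesting a repeat.
badly_corrupted = sent.copy()
badly_corrupted[0] ^= 1
badly_corrupted[1] ^= 1
wrong_guess = fec_decode(badly_corrupted)

print(decoded, wrong_guess)
```

The second case shows why the guess may occasionally be wrong: with ARQ the receiver would instead detect the error and ask for the block to be sent again.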
1.5.2 Analogue Versus Digital Communication Systems

An important classification of a communication system, from the point of view of the technology involved in its implementation, is based on the type of information signals conveyed, whether analogue or digital. Analogue communication systems convey analogue signals, while digital systems convey digital signals. However, note that a digital communication system may employ an analogue waveform called a carrier signal to convey the digital signal, as in the use of modems for data transmission over public telephone networks, which were originally designed for analogue speech signals.

We will defer an introduction to analogue, digital, and other types of signals until the next chapter and focus here on discussing the advantages of digital over analogue communication, which are so great that even analogue signals such as speech and video are transmitted digitally by first converting them to a digital format at the transmitter. This analogue-to-digital conversion (ADC) at the transmitter is followed at the receiver by the reverse process of digital-to-analogue conversion (DAC). The advantages of digital communication over its analogue counterpart include:

● Low cost: inexpensive digital circuits may be used to implement the signal processing tasks required in digital communication. With continuing advances in semiconductor technology, the cost of VLSI circuits will drop even further, making digital communication systems cheaper than their analogue counterparts despite the simplicity of the latter. Cost is an important factor that determines whether a new technology is successfully assimilated into society. The digital revolution of the 1990s, which we referred to earlier, was driven to a large extent by the falling costs of digital circuits of increasing computational power.

● Privacy and security: increased reliance on telecommunication systems for private, business, and military communications and the sale of entertainment and information services call for secrecy, authenticity, and integrity. The first requirement ensures that the information is received only by an authorised user, whereas the last two requirements assure the receiver that there has not been any impersonation of the sender and that the information has not been deliberately or accidentally altered in transit. Digital communication permits data encryption to be easily implemented on the information bit stream in order to satisfy these requirements.

● Dynamic range: the dynamic range of a communication system refers to the amplitude ratio between the strongest and weakest signals that the system can process with an acceptably low level of impairment. Signals of a wider range of values (from very small to very large) than is possible with analogue systems can be accurately represented and transmitted with negligible distortion in digital communication systems. The dynamic range may be increased as much as desired by increasing the number of bits used to represent each sample of the analogue signal during the ADC process. The penalty, of course, is increased bandwidth requirements.

● Noise immunity: the number of errors in the received data, or the bit error ratio (BER), may be very small even in the presence of a significant amount of noise in the received signal. Although the precise value of the received signal will be changed by additive noise, the change will only rarely be large enough to force the signal value beyond the range that represents the transmitted bits.

● Regeneration: digital communication systems allow the possibility of regenerating (at sufficiently closely spaced regenerative repeaters) clean new symbols or pulses, which are free from all impairment effects and are (ideally) an exact replica of the original transmission. Thus, unlike analogue systems, noise does not accumulate from repeater to repeater and no further signal distortion occurs beyond that which was introduced at the analogue-to-digital conversion stage. Digital signals may also be stored in various storage media (e.g. optical or magnetic disks) and processed or re-transmitted later without loss of fidelity.

● Error correction: it is possible to detect and even correct errors in the received data using various coding techniques, which generally insert some redundancy in the transmitted data.

● Flexibility: the signal processing tasks of a digital communication system may be readily reconfigured simply by changing the software program, without any need to change the hardware. Modification of system functions can therefore be implemented more cheaply and speedily.

● Integrated services: voice, video, and data can all be represented in a common bit stream format and transmitted simultaneously in a common communication system. Multimedia communication, the Internet, and a host of other modern communication services are only feasible through digital technology.
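The noise immunity and regeneration advantages listed above can be made concrete with a small simulation. It is a sketch under stated assumptions: the bits are sent as levels of +1 and −1 with a decision threshold at 0, and the noise added on each hop is bounded (uniform within ±0.4), so a regenerative repeater never makes a wrong decision. Real channel noise is not bounded in this way, which is why errors do occur occasionally in practice. All names are hypothetical.

```python
import random


def analogue_chain(levels, hops, noise_peak, rng):
    """Analogue repeaters pass on signal and noise alike, so noise accumulates."""
    signal = list(levels)
    for _ in range(hops):
        signal = [s + rng.uniform(-noise_peak, noise_peak) for s in signal]
    return signal


def regenerative_chain(levels, hops, noise_peak, rng):
    """Each repeater decides +1 or -1 and retransmits a clean new pulse."""
    signal = list(levels)
    for _ in range(hops):
        noisy = [s + rng.uniform(-noise_peak, noise_peak) for s in signal]
        signal = [1.0 if x >= 0 else -1.0 for x in noisy]
    return signal


rng = random.Random(1)  # fixed seed so that repeated runs behave identically
bits = [1, 0, 0, 1, 1, 0, 1, 0]
tx = [1.0 if b else -1.0 for b in bits]

analogue_out = analogue_chain(tx, hops=20, noise_peak=0.4, rng=rng)
digital_out = regenerative_chain(tx, hops=20, noise_peak=0.4, rng=rng)

# Largest departure of the analogue output from what was transmitted.
worst_analogue_error = max(abs(a - t) for a, t in zip(analogue_out, tx))
print(worst_analogue_error, digital_out == tx)
```

After 20 hops the regenerated pulses are identical to those transmitted, whereas the analogue output carries the sum of the noise picked up on every hop.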
Digital communication also has several disadvantages. However, in most cases, these disadvantages are under the system designer’s control, and their effects may be reduced as much as desired by making a suitable trade-off. It should therefore be noted that the following disadvantages are far outweighed by the advantages discussed above, and this accounts for the transformation of global telecommunication into an all-digital network at the turn of the century.

● Large bandwidth: digital communication systems may require more bandwidth than analogue systems. For example, a 4 kHz bandwidth is adequate for analogue speech transmission, whereas digital speech transmission using standard (64 kb/s) PCM requires a minimum bandwidth of 32 kHz. The spectrum of all transmission media, especially radio, is limited. Transmission techniques that minimise the required bandwidth are therefore preferred in order to increase the number of services, users, and bit rate per user that can be accommodated. Various low bit rate speech coding and data compression techniques have been devised to reduce the bandwidth requirements of digital audio and video transmission and storage at the price of a somewhat reduced signal quality.

● Complexity: digital communication systems generally perform more complex processing operations on the input signal and require more sophisticated circuitry. Synchronisation usually must be maintained between receiver and transmitter. However, advances in semiconductor technology make circuit complexity a less significant disadvantage. Most of the signal processing tasks may be performed in a single highly reliable and affordable VLSI unit, which can be easily replaced in the unlikely event of a malfunction. Furthermore, some digital transmission techniques, such as the asynchronous transfer mode (ATM), discussed in Chapter 13, make synchronisation a less challenging issue.

● Quantisation distortion: analogue signals such as speech must be converted to digital form prior to transmission or processing in a digital communication system. This conversion introduces an irreversible quantisation distortion. However, this distortion may be made as small as the system designer wishes by increasing the number of quantisation levels. The price for this improvement is increased bandwidth requirements.
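The 32 kHz figure quoted above for 64 kb/s PCM can be checked with a one-line calculation. This is a sketch under two assumptions that the passage does not spell out: binary signalling (one bit per symbol) and an ideal lowpass channel, which can carry at most two symbols per second for every hertz of bandwidth. The symbols $R_b$ (bit rate) and $B_{\min}$ (minimum bandwidth) are introduced here for illustration.

```latex
B_{\min} = \frac{R_b}{2} = \frac{64\ \text{kb/s}}{2} = 32\ \text{kHz}
```

This is eight times the 4 kHz that suffices for analogue speech.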
1.5.3 Baseband Versus Modulated Communication Systems

Communication systems may also be classified as either baseband or modulated, depending on how the information signal is prepared for the transmission medium. When the transmission medium can pass signals of frequency around 0 Hz and upwards, it is described as a lowpass medium. The original information signal (called the baseband signal since it usually contains frequencies near 0 Hz) may be conveyed to the receiver by being placed directly into the lowpass medium without any frequency translation. Such a communication system is referred to as a baseband communication system. The only practical lowpass media are wire pair and coaxial cable. Thus, baseband systems are always wireline connected systems. Although the radio spectrum stretches all the way down to frequencies near 0 Hz and it is theoretically possible to convey a baseband signal (e.g. speech signal) by radio, this is not feasible in practice. Because it contains low-frequency components, the baseband signal cannot be efficiently placed into the radio medium at the transmitter or extracted from it at the receiver. We will have more to say on this later.

1.5.3.1 Analogue Baseband Communication System
The simplest example of baseband transmission is the voice intercom system. The baseband voice output signal of a microphone is conducted along a wire pair to a loudspeaker at some distance. Other examples of analogue baseband communication systems include the fixed (i.e. nonmobile) telephone connection between two local subscriber terminals in the PSTN and a closed-circuit television (CCTV) system. It could be argued that the communication system block diagram of Figure 1.5 does not apply in this case, since there is neither transmitter nor receiver and the transmission medium links the message signal directly from information source to information sink. However, in most cases a transmitter is incorporated that performs very basic signal processing such as frequency-dependent amplification to compensate for attenuation and distortion in the transmission medium. The identifying feature of a baseband system is that any type of frequency translation is specifically excluded. The system usually includes a receiver that performs lowpass filtering to minimise the effect of noise, and amplification to provide for output volume control. A block diagram of a CCTV system is shown in Figure 1.12. A lens in the video camera forms an image of the scene on a light-sensitive camera sensor, which converts the light intensity into an electrical signal called the video signal. This has a peak-to-peak value of 1 V and contains frequencies in the range 0–10 MHz, depending on the horizontal resolution of the camera. It is obvious that the video signal is a baseband signal. For a large distance between camera and console sites, a video amplifier is inserted to boost the video signal level. If there is only one camera then its output is connected directly via coaxial cable to a monitor. Otherwise, the outputs of several cameras are connected to a video switcher. This is an electronic device that selects different cameras automatically or manually for display on one or more monitors. 
The monitor is like a television set but has no tuning/demodulation circuits and therefore cannot receive a television broadcast off air. The monitor is simply a CRT or other type of visual display device that converts the baseband video signal into a visible display on its screen. A VCR may also be used for a permanent record of the video signal, and a printer to obtain a hard copy of a selected scene. There is no frequency translation in this CCTV system, and therefore it is a baseband system. There are some implementations of a CCTV system that use radio, infrared, or optical fibre connection between camera and monitor. Such systems must necessarily implement some form of carrier modulation and therefore are not baseband systems. One of the main difficulties with analogue baseband transmission is that a separate transmission medium (wire pair or coaxial cable) is required for each signal. One medium cannot be shared simultaneously by multiple users without interference, since each user’s signal is continuous in time and occupies (or at least partly overlaps) the same frequency band. The cost would be prohibitive to provide a separate wire pair for all anticipated simultaneous telephone calls between two exchanges in the PSTN. If the analogue signal is transformed into a discrete-time function then an analogue baseband transmission system can be implemented that has the capability to accommodate multiple users on one link.
Figure 1.12 Closed-circuit television system (CCTV). Example of analogue baseband transmission system. [Block diagram: video cameras at locations 1 to N on the camera sites, each followed by a video amplifier, feed a video switcher at the console site, which serves monitor 1, monitor 2, a video printer, and a VCR.]
1.5.3.2 Discrete Baseband Communication System
If we take regular samples of an analogue signal $v(t)$ at intervals of $T_s$ to obtain a sequence of continuous-value samples $v(nT_s)$ then we can perfectly reconstruct the original signal $v(t)$ from the samples $v(nT_s)$ by passing these samples through a suitable lowpass filter (LPF), provided that $T_s$ is no more than half the period of the highest frequency component of $v(t)$. This is a statement of the sampling theorem. A discrete baseband communication system transmits $v(t)$ by sending one voltage pulse at the instant of each sample $v(nT_s)$. The value of each sample may be conveyed in the amplitude, duration, or position of this pulse.

PAM, PWM, and PPM: if the pulses are sent with equal width at regular intervals $T_s$ but the height of the nth pulse (occurring at time $t = nT_s$) is varied in proportion to the sample value $v(nT_s)$ then we have what is referred to as pulse amplitude modulation (PAM). If the pulses are sent with equal height at regular intervals $T_s$ but the duration or width of the nth pulse is varied in proportion to $v(nT_s)$, we refer to this as pulse duration modulation (PDM), also called pulse width modulation (PWM). Finally, sending the pulses with equal height and width but at irregular intervals such that the time of occurrence or position of the nth pulse is delayed relative to the nth sampling instant by an amount that is proportional to $v(nT_s)$ gives rise to pulse position modulation (PPM). Figure 1.13 illustrates these waveforms. In Figure 1.13d, the sampling instants are indicated by thin vertical lines. Note from this figure that the longest delay $\tau_{max}$ occurs at the third pulse where the sample value is maximum. The first pulse corresponds to the smallest sample value and has the shortest delay $\tau_{min}$. The remaining pulses have delays ranging between $\tau_{min}$ and $\tau_{max}$. In this way, information regarding the sample values is correctly conveyed by the positions of transmitted pulses.
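The three pulse schemes described above differ only in which pulse parameter carries the sample value. The sketch below makes this explicit by computing, for each sample, the height a PAM pulse would have, the width a PDM pulse would have, and the delay a PPM pulse would have. The function name, the 1 kHz test tone, and the width and delay limits (given as fractions of the sampling interval) are all assumptions made for illustration. A fixed offset is added so that the smallest sample still produces a pulse of nonzero width and a nonzero delay.

```python
import math


def pulse_parameters(samples, ts, v_min, v_max,
                     width_range=(0.1, 0.6), delay_range=(0.05, 0.5)):
    """Map each sample to a PAM height, a PDM width, and a PPM delay.

    width_range and delay_range are hypothetical limits, as fractions of ts.
    """
    w_lo, w_hi = width_range
    d_lo, d_hi = delay_range
    pulses = []
    for n, v in enumerate(samples):
        x = (v - v_min) / (v_max - v_min)  # sample normalised to the range 0..1
        pulses.append({
            "n": n,
            "pam_height": v,                               # PAM: height varies
            "pdm_width": (w_lo + (w_hi - w_lo) * x) * ts,  # PDM: width varies
            "ppm_delay": (d_lo + (d_hi - d_lo) * x) * ts,  # PPM: delay varies
        })
    return pulses


f_max = 1000.0   # highest frequency in the test signal, Hz (assumed)
ts = 1 / 8000.0  # sampling interval, shorter than 1 / (2 * f_max)
samples = [math.sin(2 * math.pi * f_max * n * ts) for n in range(8)]
pulses = pulse_parameters(samples, ts, v_min=-1.0, v_max=1.0)

latest = max(pulses, key=lambda p: p["ppm_delay"])["n"]
earliest = min(pulses, key=lambda p: p["ppm_delay"])["n"]
print(latest, earliest)
```

The pulse with the longest delay belongs to the largest sample and the pulse with the shortest delay to the smallest sample, which is the behaviour the text attributes to PPM.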
The block diagram of a PAM generator and receiver is shown in Figure 1.14 and a suitable arrangement for generating PDM and PPM waveforms is given in Figure 1.15. Signal waveforms have been sketched at various points of these block diagrams in order to clarify the function of each element. The waveform of $v(t)$ is as earlier shown in Figure 1.13a and has been omitted. The effect of the transmission medium is also not shown. Note in Figure 1.14 the simplicity of the discrete baseband system, especially the PAM receiver, which is just an LPF. This filter, often referred to as a reconstruction filter, may have a frequency response that is shaped in such a way as to correct for a small distortion due to the action of the sample-and-hold circuit. PWM and PPM signals can be received (i.e. converted back to the original analogue signal) by using an integrator to convert pulse width or pulse position to voltage level. This process essentially converts PWM and PPM to a PAM waveform, which is then processed in an LPF to recover the original signal.

Figure 1.13 Analogue signal and its discrete representation as PAM, PDM, & PPM. [Panels: (a) $v(t)$; (b) $v_{PAM}(t)$, with pulses spaced $T_s$ apart; (c) $v_{PDM}(t)$; (d) $v_{PPM}(t)$ with $v(t)$ superimposed, showing the delays $\tau_{min}$ and $\tau_{max}$.]

Figure 1.14 Block diagram of a PAM generator and receiver. [Signal chain: $v(t)$ into a sample-and-hold circuit driven by a clock of period $T_s$, giving $v_{PAM}(t)$; transmission medium; lowpass filter recovering $v(t)$.]

Figure 1.15 Block diagram of PDM and PPM generator. [Blocks: sample and hold with clock ($T_s$), triangle waveform generator, summing device, comparator with reference $V_r$, and monostable multivibrator. Signals: $v(t)$, $v_{PAM}(t)$, $v_{trn}(t)$, $v_{tPAM}(t)$, $v_{PDM}(t)$, and $v_{PPM}(t)$.]

The following salient features of a discrete baseband system should be noted.
● Although it is sometimes erroneously viewed as a form of modulation employing a pulse train as the carrier, a discrete baseband communication system is actually a baseband transmission because the spectra of PAM, PDM, and PPM signals contain frequencies down to 0 Hz. In fact, the spectrum of an instantaneously sampled PAM signal is the spectrum of the original (baseband) signal plus exact duplications at regular intervals along the frequency axis.

● A discrete baseband system is an analogue communication system since the parameter of the pulse train that is varied may take on a continuum of values in a specified range. For example, the precise value of the pulse height is significant in a PAM system and any variation due to noise will distort the received signal.

● The bandwidth required to transmit a discrete baseband signal (PAM, PDM, or PPM) far exceeds the bandwidth of the original analogue signal. An instantaneously sampled PAM signal has infinite bandwidth. A discrete baseband signal fills up all the bandwidth available on the transmission medium. To share the medium among multiple signals, each must be sent in a separate time slot.

● PAM is very susceptible to noise. PDM and PPM have a better noise performance, like that of FM systems, but are inferior in this respect to digital baseband transmission systems. If the pulses were perfectly rectangular then PDM and PPM would be completely immune to additive noise, as this would only alter the unused height parameter, without affecting the zero crossings of the pulse which determine pulse width or pulse location. Unfortunately, perfectly rectangular pulses, with zero rise time, are not only impossible to generate, they are impossible to maintain in a lowpass transmission medium.

● PDM is wasteful of power compared to PPM. Long pulses in PDM expend more power but carry no additional information. PPM on the other hand transmits pulses of equal energy.

● The main advantage of a discrete baseband system is that transmission medium sharing by multiple users, known as multiplexing, is possible. The intervals between the samples of one signal are used to transmit samples of other signals. This type of multiplexing is known as TDM and is further discussed below.
TDM: Figure 1.16 shows the block diagram of an N-channel TDM system that allows simultaneous transmission of several independent PAM signals over a single transmission medium. The commutator is shown as a rotating arm simply for the purpose of an easier illustration of the sampling and interleaving process. It is an electronic switching circuit that samples each input at a rate $f_s = 1/T_s$ and interleaves the N samples inside the sampling interval $T_s$. LPFs remove insignificant high-frequency components of the input signals and limit their bandwidth to at most $f_s/2$. These bandlimited analogue waveforms can then be correctly reconstructed at the receiver by passing their respective sequence of samples through an LPF, as shown in Figure 1.16. The waveforms referred to in Figure 1.16 are sketched in Figure 1.17 for N = 3.

Figure 1.16 Block diagram of an analogue TDM system. [Message inputs $v_1(t)$ to $v_N(t)$ pass through band-limiting LPFs to the commutator; the signal $v_{TDM}(nT_s/N)$ crosses the transmission medium to the decommutator, which is synchronised with the commutator and delivers $v_1(nT_s)$, $v_2((n + 1/N)T_s)$, ..., $v_N((n + (N-1)/N)T_s)$ to reconstruction LPFs that give the message outputs $v_1(t)$ to $v_N(t)$.]

Figure 1.17 3-Channel TDM. [Panels (a) to (e) show the frame structure (one frame of duration $T_s$ holding $v_1(0)$, $v_2(T_s/3)$, and $v_3(2T_s/3)$), the sampled signals $v_1(nT_s)$, $v_2((n + 1/3)T_s)$, and $v_3((n + 2/3)T_s)$, and the interleaved signal $v_{TDM}(nT_s/3)$, on a time axis running from 0 to $7T_s$.]

It must be emphasised that this is an analogue system. Its main advantage of simplicity is outweighed by its susceptibility to noise and distortion and therefore the analogue TDM system shown in Figures 1.16 and 1.17 is rarely used in practice. However, this system forms the basis for the TDM of digital signals, which has now become a ubiquitous technique in telecommunication systems. Some of the features of TDM, whether of PAM or digital signals, include:
● The bandwidth requirement of an N-channel TDM signal expands by a factor N, the number of multiplexed signals or channels. This happens because N samples are squeezed into one sampling interval, reducing the sampling pulse period by a factor N and hence increasing its frequency by the same factor. In practice, the bandwidth increase will be slightly larger than the factor N since some time slots must be reserved for system management and synchronisation.

● The transmitter and receiver must be synchronised in order that the interleaved samples in the TDM signal are correctly distributed by the decommutator to their respective channels.

● TDM is sensitive to transmission medium dispersion, which arises because the transmission medium differently attenuates or delays various frequency components of the transmitted pulse. As a result, the pulse may broaden out sufficiently to overlap adjacent time slots, an undesirable situation known as intersymbol interference (ISI).

● TDM is, however, immune to system nonlinearity as a source of crosstalk between independent signals, since at any given time instant only one signal is present. This feature is an important advantage that allows amplifiers to be operated near their maximum rating, a typically nonlinear region.
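The interleaving performed by the commutator, and the need for synchronisation at the decommutator, can be sketched in a few lines. The function names and sample values are hypothetical, and each channel is represented simply by a list of its samples.

```python
def commutate(channels):
    """Interleave one sample from each channel into every frame."""
    n_frames = min(len(c) for c in channels)
    return [channels[ch][n]
            for n in range(n_frames)
            for ch in range(len(channels))]


def decommutate(tdm, n_channels):
    """Hand every n_channels-th sample back to the same output channel."""
    return [tdm[ch::n_channels] for ch in range(n_channels)]


v1 = [0.1, 0.2, 0.3]
v2 = [1.1, 1.2, 1.3]
v3 = [2.1, 2.2, 2.3]

tdm = commutate([v1, v2, v3])
recovered = decommutate(tdm, 3)

# A receiver that starts one time slot late is out of synchronism:
# output channel 1 now collects the samples that belong to channel 2.
misaligned = decommutate(tdm[1:], 3)
print(tdm)
print(recovered[0], misaligned[0])
```

Three samples now occupy each sampling interval, which is the origin of the bandwidth expansion by the factor N noted in the first feature above.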
1.5.3.3 Digital Baseband Communication System
Digital signals, originating from coded textual information or from digitised analogue signals, may also be transmitted at baseband. This is a more common type of baseband transmission as it has all the advantages of the digital communication systems discussed earlier. The baseband transmitter performs line coding on the input bit stream, and the baseband receiver has the task of decoding the received (and channel-distorted) coded waveforms back into the original bit stream with minimum error. Before looking at these two operations, let us first consider how an analogue signal is converted into a digital bit stream.

Analogue-to-digital conversion (ADC): various techniques have been devised to convert analogue signals to digital. The earliest and still widespread technique is pulse code modulation (PCM). Figure 1.18 shows four signal processing steps involved in converting an analogue signal to a PCM signal. The LPF removes nonessential frequency components in the message signal and limits the highest frequency component to $f_{max}$. This serves to reduce transmission bandwidth and to ensure that the sampling theorem is satisfied when the message signal is sampled (in the sample-and-hold circuit block) at a manageable rate $f_s \geq 2f_{max}$. Next, the continuous-value sample sequence (PAM signal) is converted to a discrete-value sequence. This process is known as quantisation. It introduces irrecoverable errors as each sample is approximated to the nearest of a set of quantisation levels. In a practical system with many quantisation levels, the errors are referred to as quantisation noise since their effect is like that of white noise. If quantisation levels are uniformly spaced, we refer to this as uniform or linear quantisation and to the signal conversion process simply as uniform ADC. The term PCM is usually applied to the case of nonuniform quantisation, which spaces the quantiser levels nonuniformly in such a way as to make the signal-to-quantisation-noise ratio (SQNR) approximately constant over all analogue signal values. Finally, the encoder converts each quantised signal level to a unique binary word of length $k$ bits. Note that $k$ bits can be used to uniquely represent $N = 2^k$ quantisation levels. Taking $f_s$ samples of the signal per second and representing each sample with $k$ bits, the number of bits generated each second, or the bit rate, is given by

$$\text{Bit rate} = k f_s \ \text{bits per second (b/s)} \tag{1.3}$$
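The quantisation and encoding steps, together with Eq. (1.3), can be sketched as follows. This illustrates uniform ADC only, not the nonuniform quantisation normally meant by PCM. The function names, the 3-bit word length used in the example, and the choice of placing each quantisation level at the centre of its interval are assumptions made for illustration.

```python
def quantise(sample, k, v_max):
    """Return the index (0 .. 2**k - 1) of the uniform interval holding sample.

    The input range -v_max .. +v_max is divided into 2**k equal intervals;
    samples outside the range are clipped to the nearest end interval.
    """
    levels = 2 ** k
    step = 2 * v_max / levels
    index = int((sample + v_max) / step)
    return max(0, min(levels - 1, index))


def encode(index, k):
    """Represent a quantiser index as a binary word of length k bits."""
    return format(index, f"0{k}b")


def reconstruct(index, k, v_max):
    """Return the quantisation level at the centre of interval number index."""
    step = 2 * v_max / 2 ** k
    return -v_max + (index + 0.5) * step


def bit_rate(k, fs):
    """Eq. (1.3): k bits per sample at fs samples per second."""
    return k * fs


k, v_max = 3, 1.0
index = quantise(0.3, k, v_max)
word = encode(index, k)
level = reconstruct(index, k, v_max)
print(index, word, level, bit_rate(8, 8000))
```

With 3 bits there are 8 levels spaced 0.25 apart, so the sample 0.3 is approximated by the level 0.375, an error of 0.075 that cannot be undone at the receiver. Using 8 bits per sample at 8000 samples per second, the values of PCM telephony, gives a bit rate of 64 000 b/s.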
Figure 1.18 Digitisation of an analogue signal: PCM signal generation. [Signal chain: message signal (analogue), anti-alias LPF, sample and hold (giving the PAM signal), quantisation, encoding, PCM signal (digital bit stream).]

In PCM telephony, $f_s = 8000$ samples per second (or Hz) and $k = 8$ bits/sample, yielding a bit rate of 64 kb/s. In certain bandwidth-limited communication services such as mobile communications, this PCM bit rate requirement is excessive. Reducing the redundancy inherent in a PCM signal allows the bit rate and hence bandwidth and storage requirements to be significantly reduced. In a technique called differential pulse code modulation (DPCM), it is the difference $e(nT_s)$ between the actual sample and a predicted value that is quantised and encoded, rather than the sample itself. If the predictor is properly designed and an adequate sampling rate ($f_s = 1/T_s$) is used then the range of $e(nT_s)$ will be very small, allowing fewer quantisation levels and hence a smaller $k$ (bits/sample) to be used to achieve a SQNR comparable to that of a PCM system. Assuming the same sampling rate as in PCM, we see from Eq. (1.3) that the bit rate of a DPCM system will be lower than that of a PCM system of comparable SQNR. The ITU-T has adopted for voice telephony a 32 kb/s DPCM system, obtained by using only $k = 4$ bits to code each sample taken at the rate $f_s = 8$ kHz. A 64 kb/s DPCM system has also been adopted for wideband audio (of 7 kHz bandwidth). This uses $k = 4$ and $f_s = 16$ kHz.

Further bit rate reduction can be achieved by using sophisticated data compression algorithms and, for the digitisation of speech, a range of methods known as low bit rate speech coding. These techniques are inherently lossy. They exploit the features of the message signal, and the characteristics of human hearing and vision, to eliminate redundant as well as insignificant information and produce a modified signal of greatly reduced bit rate. After this signal has been decompressed or otherwise processed at the receiver, a human observer finds it acceptably close to the original in quality and information content.

Line coding: whatever its origin, whether from Unicode-coded textual information or from a digitised analogue signal, we have a bit stream to be transmitted. A digital baseband transmitter chooses suitable voltage symbols (e.g. rectangular or shaped pulses) to represent the string of 1’s and 0’s, a process known as line coding, and places these symbols directly into a transmission line system. Figure 1.19 shows the line codes used in Europe for connections between equipment (often within one exchange building) in the interfaces of the digital transmission hierarchy.
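As an illustration of line coding, the sketch below implements alternate mark inversion (AMI), one of the codes shown in Figure 1.19. AMI sends binary 0 as 0 V and binary 1 as a pulse whose polarity alternates between +V and −V. The function names and the choice of +V for the first mark are assumptions, and the bit stream is the one that appears in Figure 1.19.

```python
def ami_encode(bits, v=1):
    """Encode bits as AMI symbols: 0 -> 0, 1 -> +v and -v alternately."""
    symbols = []
    polarity = 1  # assumed: the first mark is sent as +v
    for b in bits:
        if b:
            symbols.append(polarity * v)
            polarity = -polarity
        else:
            symbols.append(0)
    return symbols


def ami_decode(symbols):
    """Any nonzero symbol is a 1; a zero symbol is a 0."""
    return [1 if s != 0 else 0 for s in symbols]


def has_bipolar_violation(symbols):
    """True if two successive marks have the same polarity."""
    last_mark = 0
    for s in symbols:
        if s != 0:
            if last_mark != 0 and (s > 0) == (last_mark > 0):
                return True
            last_mark = s
    return False


bits = [1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0]
symbols = ami_encode(bits)

# Simulate a transmission error that turns the second symbol (a 0) into +V.
received = symbols.copy()
received[1] = 1
print(symbols)
print(sum(symbols), has_bipolar_violation(received))
```

Because the marks alternate in polarity, the symbols of this stream sum to zero, and an error that produces two successive marks of the same polarity breaks the coding rule and can be detected. The long runs of zeros in this stream give AMI no transitions from which to recover timing, which is the weakness that HDB3 addresses by limiting runs to three zeros.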
Figure 1.19 Examples of line codes: alternate mark inversion (AMI), coded mark inversion (CMI), and high density bipolar with 3 zero maximum (HDB3). [Waveforms of the three codes, on axes marked +V, 0, and −V, for the bit stream 1 0 1 0 0 0 0 0 1 1 0 0 0 0 1 1 0 0 0 0.]

Figure 1.20 Droop and baseline wander in a binary line code. [Plot of g(t) in volts, between +A and −A, for a bit stream containing long runs of 1s, comparing the ideal waveform with the drooping waveform and the ideal baseline with the wandering baseline.]

Line codes are designed to have certain desirable characteristics and to fulfil several important functions.

● The spectral characteristics of coded data must be matched to the characteristics or frequency response of the transmission medium. A mismatch may result in significant distortion of the transmitted voltage pulses. In particular, the code should have no DC offset. Line transmission systems are easier to design when different parts of the system are capacitor or transformer coupled to separate their DC bias voltage levels. These coupling elements pass higher-frequency (AC) voltages but block zero-frequency (DC) voltages. The coded data must therefore be void of DC content to prevent droop and baseline wander, whereby the received waveform drifts significantly relative to the decision threshold, which is 0 V for the case of a bipolar binary code. See Figure 1.20.

● The line code must combine data and timing information in one signal. It would be very expensive if a separate wire pair or coaxial cable had to be employed to carry the timing information needed at the receiver for setting decision or sampling instants. Furthermore, the line code must have a reasonable amount of clock content. The timing content of a code is the maximum number of symbols that can occur together without a level transition, a small number indicating a high timing content. Ideally, there should be at least one transition in every symbol, but the major penalty is an increased bandwidth requirement.

● Vulnerability of the data to noise and ISI must be minimised. Sudden changes in a signal imply high frequencies in its spectrum. A rectangular pulse (with sharp transitions) transmitted through a lowpass transmission medium will spread out, with a potential for ISI. Thus, pulse shaping is frequently employed to reduce high-frequency components, which also reduces crosstalk since higher frequencies are more readily radiated. Pulse shaping also reduces the bandwidth necessary to correctly transmit the coded waveforms. The larger this bandwidth, the larger will be the amount of noise power that the receiver inevitably ‘admits’ in the process of receiving the waveforms.

● The line code should allow some amount of error detection. This usually involves the use of redundancy in which some codewords or symbol patterns are forbidden. A received codeword that violates the coding rule in force would then indicate some error.

● The line code should maximise code efficiency to allow a lower symbol rate to be used for a given bit rate. In long-distance cable systems, a lower symbol rate allows increased repeater spacing and reduces overall system cost. It turns out, however, that codes of high efficiency may lack certain other desirable characteristics. The code selected in practice will involve some compromise and will depend on the priorities of the system. Code efficiency is the ratio of actual information content (or bits) per code symbol to potential information content per code symbol. The potential information content per code symbol is given by $\log_2(\text{code radix})$, where code radix is the number of signalling levels or voltage levels used by the code symbols.
For example, the potential information content per code symbol of a binary code (radix = 2) is $\log_2 2 = 1$ bit, that of a ternary code (radix = 3) is $\log_2 3 = 1.585$ bits, and that of a quaternary code (radix = 4) is $\log_2 4 = 2$ bits, etc. Codes with higher radix can therefore convey more information per symbol, but there is increased codec complexity and a higher probability of error. Although multilevel codes (of radix ≥ 4) are very common in modulated communication systems to cope with restricted bandwidth, only codes with radix ≤ 4 are employed in baseband systems. One example of a quaternary code is the 2B1Q line code, which was adopted by ANSI in 1986 for use on basic ISDN lines. It is also the line code used on DSL local loops. As the name suggests, the 2B1Q code represents two binary digits using one quaternary symbol, i.e. one of four voltage levels. More specifically, the dibits 00, 01, 11, and 10 are represented by the voltage levels $-3V$, $-V$, $+V$, and $+3V$, respectively, where $V$ is a reference voltage.
● Finally, the complexity of the encoder and decoder circuits (codec) should be kept to a minimum in order to reduce costs. In general, line codes that can be implemented by simple codecs are used for short links, whereas more efficient but complex and costly codecs are used for long-distance links because they can work with fewer repeaters and hence reduce overall system cost.

Figure 1.21 Digital baseband transmission system. [Signal chain: information source; data in; transmitter (line coder); transmission medium (transmission line, with noise and distortion added); receiver (equaliser and LPF, pulse detector, clock recovery); data out; information sink.]
Line decoding: the transmission medium distorts the transmitted pulses by adding noise and by differently attenuating and delaying various frequency components of the pulses. A baseband receiver or repeater takes the distorted pulses as input and produces the original clean pulses at its output. It does this through three ‘R’ operations. First, a reshaping circuit comprising an equaliser and an LPF is used to reshape the pulse and ensure that its spectrum has a raised cosine shape. This operation is important to minimise ISI. Next, a retiming circuit recovers the clock signal from the stream of reshaped pulses. Level transitions within the pulse stream carry the clock information. Finally, a regenerating circuit detects the pulses at the sampling instants provided by the recovered clock signal. Occasionally, an error will occur when a noise voltage causes the pulse amplitude to cross the detection threshold. The frequency of occurrence of such errors, or BER, can be maintained at an acceptable level by ensuring that the noisy pulses are detected before the ratio of pulse energy to noise power density falls below a specified threshold. Figure 1.21 is a block diagram of a digital baseband system showing the basic operations discussed here.

1.5.3.4 Modulated Communication Systems
In very many situations it is necessary to translate the baseband signal (without distorting its information content) to a frequency band centred at a frequency fc that is well above 0 Hz. A communication system in which the information signal undergoes this modulation process before being placed into the transmission medium is referred to as a modulated communication system. There are numerous examples of modulated systems. All satellite communication systems, mobile communication systems, radio and TV broadcast systems, radio relay systems, and optical communication systems are modulated systems.

Formally, we define modulation as the process of imposing the variations (or information) in a lower-frequency electrical signal (called the modulating or baseband or message signal) onto a higher-frequency signal (called the carrier). The carrier signal is usually a sinusoidal signal of frequency fc. It effectively gives the message signal a ‘ride’ through the transmission medium because, for several reasons, it is impossible or undesirable for the message signal to make the ‘journey’ on its own.

Role of modulation: there are several reasons why modulation is extensively used in modern communication:

● Modulation is used to obtain a more efficient exploitation of the transmission medium by accommodating more than one user in the same medium at the same time. In most cases, the bandwidth that is available in the medium is much larger than what is required by one user or message signal. For example, the bandwidth available on a coaxial cable is more than 10 000 times the bandwidth of one telephone speech channel. The bandwidth of an optical fibre medium exceeds that of an analogue TV signal by a factor of up to one million; and the radio spectrum is much wider than the bandwidth required by one radio station for its broadcasts. Modulation allows the implementation of FDM in which each user’s signal is placed in a separate frequency band by modulating an appropriate carrier. If the carrier frequencies are sufficiently far apart, the different signals do not interfere with each other. A signal can be recovered at the receiver by filtering (to exclude the unwanted channels) and demodulation (to detect the message signal in the carrier signal). Providers of radio services can transmit and receive within the bands allocated to them by using a suitable modulation technique.

● Modulation allows us to select a frequency that is high enough to be efficiently radiated by an antenna in radio systems. The power radiated by an antenna may be expressed as $P = I_{\mathrm{rms}}^{2} R_r$, where $I_{\mathrm{rms}}$ is the root mean square (rms) value of the current signal fed into the antenna and $R_r$ is the antenna’s radiation resistance. It turns out that $R_r$ depends on the size of the antenna measured in wavelength units. In general, the size of the antenna must be at least one-tenth of the signal wavelength if the antenna is to radiate an appreciable amount of power. Consider the minimum size of antennas required to radiate signals at three different frequencies, 3 kHz, 3 MHz, and 3 GHz. The wavelengths of these signals (given by the ratio between the speed of light, $3 \times 10^{8}$ m/s, and the signal’s frequency) are 100 km, 100 m, and 10 cm, respectively. Thus, if we attempted to radiate a 3 kHz speech signal, we would need an antenna that is at least 10 km long. Not only is such an antenna prohibitively expensive, it is hardly suited to portable applications such as in handheld mobile telephone units. If, on the other hand, we use our 3 kHz speech signal to modulate a 3 GHz carrier signal then it can be efficiently radiated using very small and hence affordable antennas of minimum size 1 cm.
● The use of modulation to transmit at higher frequencies also provides a further important advantage. It allows us to exploit the higher bandwidths available at the top end of the radio spectrum in order to accommodate more users or to transmit signals of large bandwidth. For example, an AM radio signal has a bandwidth of 10 kHz. Thus, the maximum number of AM radio stations that can be operated at low frequency (LF ≡ 30–300 kHz) in one locality is given by

$$\text{Maximum number of AM stations at LF} = \frac{300 - 30}{10} = 27$$

Observe that there is a tenfold increase in the number of AM radio stations when we move up by just one band to medium frequency (MF ≡ 300−3000 kHz)

$$\text{Maximum number of AM stations at MF} = \frac{3000 - 300}{10} = 270$$

The NTSC television signal requires a radio frequency bandwidth of 6 MHz. Frequency bands at MF and below can therefore not be used for TV transmission because they do not have enough bandwidth to accommodate the signal. The high frequency (HF ≡ 3−30 MHz) band can accommodate a maximum of only four such TV channels. However, as we move to higher bands, we can accommodate 45 TV channels at very high frequency (VHF ≡ 30−300 MHz), 450 TV channels at ultra high frequency (UHF ≡ 300−3000 MHz), 4500 TV channels at super high frequency (SHF ≡ 3−30 GHz), 45000 TV channels at extra high frequency (EHF ≡ 30−300 GHz); and the optical frequency band (300 GHz − 860 THz) can accommodate literally millions of TV channels.

● Another important function of modulation is that it allows us to transmit at a frequency that is best suited to the transmission medium. The behaviour of all practical transmission media is frequency dependent. Some frequency bands are passed with minimum distortion, some are heavily distorted, and some are blocked altogether. Modulation provides us with the means of placing the signal within a band of frequencies where noise, signal distortion, and attenuation are at an acceptable level within the transmission medium.
Satellite communication was pioneered in C-band inside the 1–10 GHz window where both noise (celestial and atmospheric) and propagation impairments are minimum. Ionospheric reflection and absorption become increasingly significant the lower you go below this band, until at about 12 MHz the signal is completely blocked by the ionosphere. Furthermore, attenuation by tropospheric constituents such as rain, atmospheric gases, fog, cloud water droplets, etc., becomes significant and eventually very severe at higher-frequency bands. For this reason, modulation must be used in satellite communication to translate the baseband signal to a congenial higher-frequency band.
● Another example of a bandpass transmission medium is the optical fibre medium. This blocks signals at radio wave frequencies, but passes signals in the near-infrared band, particularly the frequencies around 194 and 231 THz. Thus, the only way to use this valuable medium for information transmission is to modulate an optical carrier signal with the baseband information signal.
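The antenna-size and channel-count arithmetic used in the reasons above can be reproduced with a minimal sketch. The function names are illustrative, the one-tenth-wavelength rule and the speed of light value are taken from the text, and the channel count simply ignores guard bands.

```python
C = 3e8  # speed of light in m/s, the value used in the text


def wavelength_m(freq_hz: float) -> float:
    """Wavelength = speed of light / frequency."""
    return C / freq_hz


def min_antenna_size_m(freq_hz: float) -> float:
    """Rule of thumb from the text: at least one-tenth of a wavelength."""
    return wavelength_m(freq_hz) / 10


def max_channels(band_low_hz: int, band_high_hz: int, channel_bw_hz: int) -> int:
    """Whole channels that fit in a band, ignoring guard bands."""
    return (band_high_hz - band_low_hz) // channel_bw_hz


if __name__ == "__main__":
    for f in (3e3, 3e6, 3e9):
        print(f, wavelength_m(f), min_antenna_size_m(f))
    print(max_channels(30_000, 300_000, 10_000))             # AM stations at LF
    print(max_channels(300_000, 3_000_000, 10_000))          # AM stations at MF
    print(max_channels(3_000_000, 30_000_000, 6_000_000))    # 6 MHz TV at HF
    print(max_channels(30_000_000, 300_000_000, 6_000_000))  # 6 MHz TV at VHF
```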
Types of modulation: there are three basic methods of modulation depending on which parameter of the carrier signal is varied (or modulated) by the message signal. Consider the general expression for a sinusoidal carrier signal

$$v_c(t) = A_c \cos(2\pi f_c t + \phi) \tag{1.4}$$
There are three parameters of the carrier, which may be varied in step with the message signal. The (unmodulated) carrier signal vc (t) has a constant amplitude Ac , a constant frequency f c , and a constant initial phase 𝜙. Varying the amplitude according to the variations of the message signal, while maintaining the other parameters constant, gives what is known as amplitude modulation (AM). Frequency modulation is obtained by varying the frequency f c of a constant-amplitude carrier, and phase modulation is the result of varying only the phase 𝜙 of the carrier signal. An analogue modulating signal will cause a continuous variation of the carrier parameter, the precise value of the varied parameter being significant always. It is obvious that this is then an analogue modulation and the resulting system is an analogue modulated communication system. A digital modulating signal (consisting of a string of binary 1’s and 0’s) will, on the other hand, cause the carrier parameter to change (or shift) in discrete steps. Information is then conveyed, not in the continuous precise value of the parameter but rather in the interval of the parameter value at discrete decision (or sampling) instants. This is therefore digital modulation and the resulting system is a digital modulated communication system having all the advantages of digital communication. In this case, the three modulation methods are given the special names amplitude shift keying (ASK), frequency shift keying (FSK), and phase shift keying (PSK) to emphasise that the parameters are varied in discrete steps. The number of steps of the parameter generally determines the complexity of the digital modulation scheme. The simplest situation, and the one most robust to noise, is binary modulation (or binary shift keying) where the carrier parameter can take on one of two values (or steps). This is shown in Figure 1.22. 
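As a rough illustration of binary shift keying, the sketch below builds one carrier burst per bit by switching the amplitude, frequency, or phase of the carrier in Eq. (1.4). All numerical values (carrier frequencies, symbol duration, sampling rate) and the function names are assumptions made for the example; the ASK case uses a zero amplitude for bit 0, which is on-off keying.

```python
import math


def carrier_samples(amplitude, freq_hz, phase_rad, duration_s, sample_rate_hz):
    """Samples of A*cos(2*pi*f*t + phase) over one symbol duration."""
    n = int(round(duration_s * sample_rate_hz))
    return [
        amplitude * math.cos(2 * math.pi * freq_hz * k / sample_rate_hz + phase_rad)
        for k in range(n)
    ]


def modulate(bits, scheme, symbol_s=1e-3, fs=100e3):
    """Concatenate one carrier burst per bit for 'ask', 'fsk', or 'psk'.

    The parameter values below are illustrative assumptions only.
    """
    params = {
        # scheme: {bit: (amplitude, frequency in Hz, phase in rad)}
        "ask": {"1": (1.0, 4e3, 0.0), "0": (0.0, 4e3, 0.0)},      # OOK: zero amplitude for 0
        "fsk": {"1": (1.0, 8e3, 0.0), "0": (1.0, 4e3, 0.0)},      # two frequencies
        "psk": {"1": (1.0, 4e3, 0.0), "0": (1.0, 4e3, math.pi)},  # phases 180 degrees apart
    }[scheme]
    wave = []
    for b in bits:
        a, f, ph = params[b]
        wave.extend(carrier_samples(a, f, ph, symbol_s, fs))
    return wave


if __name__ == "__main__":
    for scheme in ("ask", "fsk", "psk"):
        print(scheme, len(modulate("0110", scheme)))
```

With these assumed values each bit occupies 100 samples, so the bit stream 0110 produces 400 samples under every scheme; only the carrier parameter that changes from bit to bit differs.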
To transmit a bit stream using binary ASK, we take one bit at each clock instant and transmit a sinusoidal carrier of frequency f c for the duration of the bit. The carrier frequency f c is chosen to suit the transmission medium. The carrier amplitude is set to A1 for bit ‘1’ and to A0 for bit ‘0’. If either A1 or A0 is zero, we have a special type of binary ASK known as on–off keying (OOK), which is the digital modulation scheme used in optical fibre communication. In binary FSK, same-amplitude and same-phase carriers of frequencies f 1 and f 0 are transmitted for bit 1 and bit 0, respectively. In binary PSK, the carrier amplitude and frequency are fixed, but the carrier is transmitted with a phase 𝜙1 for bit 1 and 𝜙0 for bit 0. Usually, the difference between the two phases is 180∘ in order to use the maximum possible spacing between them. A useful way to look at digital modulation is to treat each transmitted carrier state as a code symbol. Thus, OOK represents bits 1 and 0 with the symbols SOOK1 and SOOK0 shown in Figure 1.22d; FSK uses the symbols SFSK1 and SFSK0 shown in Figure 1.22e; and PSK uses the symbols SPSK1 and SPSK0 shown in Figure 1.22f. The symbol rate (or signalling rate) is limited by the transmission bandwidth available. The theoretical maximum symbol rate is twice the transmission bandwidth. In a telephone channel with a bandwidth of about 3.1 kHz, the maximum possible symbol rate is therefore 6200 Bd. The main drawback of binary shift keying is that bit rate and symbol rate are equal since each symbol represents one bit. In many applications, a much higher bit rate is desired than can be obtained by increasing the symbol rate, which is limited by the available bandwidth. A higher bit rate can be achieved by using multilevel (also called M-ary, pronounced em-aary) digital modulation, in which M distinct carrier states are transmitted. Binary
modulation is the special case M = 2. To achieve a higher bit rate, we take a group of k bits at a time and represent them with a unique carrier state or code symbol, where $k = \log_2 M$, or $M = 2^k$. For example, taking k = 3 bits at a time, we have $M = 2^3 = 8$ possible states, namely 000, 001, 010, 011, 100, 101, 110, and 111, which must each be represented by a unique carrier state. Each symbol now carries three bits of information and the bit rate is therefore three times the symbol rate. In general, M-ary modulation increases the bit rate according to the relation

$$\text{Bit rate} = \log_2 M \times (\text{Symbol rate}) \tag{1.5}$$

Figure 1.22 Binary digital modulation schemes: (a) On-Off Keying (OOK), a special type of ASK; (b) Frequency Shift Keying (FSK); (c) Phase Shift Keying (PSK); (d) OOK symbols; (e) FSK symbols; (f) PSK symbols. [Panels (a) to (c) plot voltage against time t for the bit stream 0 1 1 0; panels (d) to (f) show the symbol pairs SOOK1/SOOK0, SFSK1/SFSK0, and SPSK1/SPSK0, each lasting one symbol duration.]
However, the symbol states are closer together than in binary modulation, making it easier for noise effects to shift one state sufficiently close to an adjacent state to cause a symbol error. For example, the phase difference between adjacent states in an M-ary PSK is (360/M)∘ , which for M = 16 is only 22.5∘ . Bearing in mind that these PSK states all have the same energy, carrier amplitude being constant, we see that there is great potential for error if the transmission medium is prone to phase distortion. A combination of amplitude and phase shift keying (APSK), also called quadrature amplitude modulation (QAM), is often used, which increases the phase difference between symbol states. For example, it is possible to design for the minimum phase difference between states of equal energy in 16-APSK to be 90∘ , compared to only 22.5∘ in 16-PSK.
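The trade-off just described can be made concrete with a small sketch based on Eq. (1.5), the stated limit of two symbols per second per hertz of bandwidth, and the (360/M)° spacing of M-ary PSK states. The function names are illustrative assumptions.

```python
import math


def max_symbol_rate_bd(bandwidth_hz: float) -> float:
    """Theoretical maximum symbol rate: twice the transmission bandwidth."""
    return 2 * bandwidth_hz


def bit_rate_bps(m: int, symbol_rate_bd: float) -> float:
    """Eq. (1.5): bit rate = log2(M) x symbol rate."""
    return math.log2(m) * symbol_rate_bd


def psk_phase_spacing_deg(m: int) -> float:
    """Phase difference between adjacent states in M-ary PSK."""
    return 360 / m


if __name__ == "__main__":
    rs = max_symbol_rate_bd(3100)  # telephone channel of about 3.1 kHz
    for m in (2, 4, 8, 16):
        print(m, bit_rate_bps(m, rs), psk_phase_spacing_deg(m))
```

Each doubling of M adds one bit per symbol at the same symbol rate, while halving the phase spacing between adjacent PSK states.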
1.5.4 Circuit Versus Packet Switching

A transmission path is required for information to flow between two users (i.e. between source and destination) in a communication system. It is only in a tiny minority of simple communication systems that this path is a permanent setup. For the rest, the communication system is a network of transceivers or nodes, and a path is a connected sequence of links which must be found as needed to convey information from the sending node through one or more intervening nodes to the destination node. We may classify communication systems either as circuit switched or packet switched according to how such a path is established. In both cases, switching is required at various points in the network (e.g. at a local exchange in a PSTN) to establish the concatenation of links that constitute the transmission path. This switching may be implemented in space or time or as a combination of the two. In space switching, a message arriving on one input port of the switch is routed or connected in its entirety to a designated output port or line of the switch without buffering. In time
switching, the input and output of the switch are multi-slot data frames. The input data frame is buffered, and the contents of its time slots are switched to designated respective time slots in the output frame. The specification of input and output pairings is usually given in a routing table. Furthermore, space switching may be either concentrator switching, in which the switch has a larger number of input than output lines, or it may be route switching, where the switch has the same number of input and output lines. A concentrator switch is typically used to switch subscriber lines at a local exchange. It improves line occupancy but introduces the possibility of lost or blocked calls if an input message arrives at the switch while all output lines are engaged. Route switching on the other hand is typically used when switching trunk lines between exchanges. Figure 1.23 illustrates space and time switching in (a) and (b), respectively, under the control of the same switching table given in (c). The slots labelled H and T in the data frames of (b) are the header and trailer bits of each frame. Note, for example, that the contents of time slot 1 in the input frame are identified with the letter ‘a’ and these contents are switched to time slot 2 in the output frame according to the specification of the routing table. Concentrator and route switching are also illustrated in Figure 1.23d and e, respectively.

Figure 1.23 (a) Space switching; (b) Time switching; (c) Switching table; (d) Concentrator switching; (e) Route switching. [Panel (a): bufferless switch connecting input lines 1 to 4 to output lines 1 to 4 under a controller. Panel (b): buffered switch under a controller; the input frame carries H, a, b, c, d, T with a to d in time slots 1 to 4, and the output frame carries H, d, a, b, c, T. Panel (c): switching table 1→2, 2→3, 3→4, 4→1.]

1.5.4.1 Circuit Switching
In a circuit switched communication system, before communication begins between two users, a dedicated channel or trunk or transmission path is set up end-to-end for the exclusive use of the two users for the entirety of their connection. The dedicated channel may be a separate frequency band or time slot or code sequence or combinations of these, depending on the channel access plan in operation. If a free path or circuit cannot be set up for a new connection between a pair of users then the requesting user is either denied access (i.e. blocked) or is placed on a queue until a free circuit becomes available.
Figure 1.23 (Continued) [Panel (d), concentrator switching: many input lines, fewer output lines. Panel (e), route switching: N input lines, N output lines.]
The circuit switching approach described above has several advantages.

● Once a path has been established, i.e. once the call setup phase has been completed, a message will transit quickly through the system. There are no extra delays at intermediate nodes arising from queuing and the time taken to absorb portions of the message or its packets into a buffer at each node.

● The transmission system or network can manage congestion arising from excessive demand for simultaneous connections. New connection requests are simply denied or placed on a queue to protect the quality of active connections.

● The message has a smooth transit through the network from source to destination. There is no jitter arising from variations in the transit time of portions of the message, which would be the case if sections of the path were shared with other active connections. Such jitter is not desirable in real-time voice and video communications.

● There is no need for extra bits (called overhead bits) to be added to each portion of the message to convey source and destination addresses and sequencing information (for use in reassembling the message in the right order at the destination). As a result, the number of bits conveyed through the network is significantly reduced.

● The overall computational and processing demand on the network is low since the message will transit seamlessly through the network without the need for any further processing at intermediate nodes once a path has been established.

However, circuit switching has several significant shortcomings.

● It can lead to a highly inefficient utilisation of network resources. The dedicated channel cannot be used for another call during idle intervals of the active call. An interactive phone conversation between two people will be punctuated by periods of silence and an Internet browsing session may contain long intervals of inactivity, for example while examining downloaded content. The circuit switching approach locks away and wastes channel capacity during such idle intervals.
● The call setup phase, when a free channel is being arranged prior to commencement of the call, may be responsible for a significant portion of the delay from the time a request is made for a channel to the time the message is fully delivered at the destination. In fact, for short messages the call setup phase will account for the bulk of the delay, which may lead to an unacceptably low data throughput – this being the number of bits successfully delivered divided by the total time taken.

● The dedicated connection provides for a constant data rate between the two users, and this makes it difficult to interconnect users with a variety of data rates.

● The impact of a technical failure or breakdown in any portion of the path or route from source to destination is catastrophic and results in the termination of the active call. Once a call has been set up, circuit switching has no mechanism to modify the route mid-call. The probability of call termination is therefore increased on (e.g. long-distance) connections involving the concatenation of many links since the failure of any one link will lead to call termination.
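The effect of call setup delay on throughput, as defined in the list above (bits successfully delivered divided by total time taken), can be illustrated with a short sketch. The function name and all numerical values are hypothetical, chosen only to show the trend for short versus long messages.

```python
def throughput_bps(message_bits: float, data_rate_bps: float,
                   setup_s: float, propagation_s: float = 0.0) -> float:
    """Bits delivered divided by total time (setup + transmission + propagation)."""
    total_time_s = setup_s + message_bits / data_rate_bps + propagation_s
    return message_bits / total_time_s


if __name__ == "__main__":
    rate = 64_000    # assumed channel rate in bit/s
    setup = 1.0      # assumed call setup time in seconds
    for bits in (6_400, 64_000, 6_400_000):
        print(bits, round(throughput_bps(bits, rate, setup)))
```

Under these assumptions a short message achieves only a small fraction of the channel rate, whereas a long message approaches it, which is consistent with the observation that circuit switching suits sufficiently long messages.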
The features itemised above make circuit switching the ideal approach to ensure the best service quality for real-time voice and video communication through a network with reliable constituents. The approach is also suitable for transmitting sufficiently long messages of whatever content or type through such networks. However, a circuit switched network could be likened to a community that generously divides its cake among the few. Packet switching offers the possibility of a different approach in which everyone can have a smaller piece of the community cake.

1.5.4.2 Packet Switching
In a packet-switched communication system or network, each message (which must be in digital form) is first broken into one or more packets of bits prior to transmission. Overhead bits in the form of a packet header are added to each packet to fulfil a variety of purposes, such as identification (e.g. source and/or destination address and/or route ID), error control, and transmission and network management functions. A data packet therefore consists of two sections, namely a header made up of overhead bits and a payload made up of source bits. Note that the source bits will include message bits, any redundant bits inserted by the channel encoder for error control, and overhead bits from any previous round of data packetisation or framing.

If the packets are of fixed length (usually specified in bytes, where 1 byte ≡ 8 bits) they are referred to as cells. An example of this fixed-length packet approach is ATM switching. An ATM cell is of length 53 bytes, which consists of a 5-byte header and a 48-byte payload. In other cases, such as IP (Internet protocol) switching, the packets are of variable length. IPv4 (version 4) has a header length that is typically 20 bytes but may extend up to 60 bytes depending on header options. A path may be a concatenation of different link types, each with its own maximum transmission unit (MTU), which is the maximum number of bytes in a packet that the link can support. Since the lowest MTU is typically 576 bytes across all link types, the recommended maximum IPv4 packet length is 576 bytes to avoid packet fragmentation in transit. The latest IP version is IPv6, which has a fixed 40-byte header and a variable payload length constrained by a lowest MTU spec of 1280 bytes. IPv4 makes a 32-bit provision for source address (and similarly for destination address), which is enough to uniquely identify $2^{32} \approx 4.3$ billion items such as nodes.
In the early days of computer networking this provision was thought to be more than enough, but with the explosion of the Internet, including IoT (the Internet of Things), that early thinking proved extremely short-sighted. IPv6 therefore makes a 128-bit address provision, which is enough to uniquely identify 340 undecillion items—one undecillion being one trillion trillion trillion, i.e. one followed by 36 zeroes. IPv6 is thus considered the ultimate IP version with enough capacity for all future needs, but its uptake has been slow due to the need to invest in new Internet routers. There are two modes of packet switching, namely connection-oriented (CO) packet switching, also called virtual circuit, and connectionless (CL) packet switching, sometimes called datagram. Connection-oriented (CO) packet switching: in CO packet switching, a free but nonexclusive path is first set up through the network from source to destination and all user packets follow this path and arrive at the destination strictly in the same order as they left the source. It is important to observe that, unlike circuit
switching, the path set up prior to the commencement of transmission in this virtual-circuit approach is not a dedicated path. Other packets from other users may simultaneously use portions of the same path to reach separate destinations. Therefore, packets may need to be queued at each node until their designated outgoing link is free. Each packet header carries a logical connection identifier or virtual circuit identifier (VCI), which identifies a predefined route to be followed by the packets from source to destination. Intervening nodes do not therefore make any routing decisions. Examples of connection-oriented packet switching include X.25 – the first public data network launched in the 1970s, Frame Relay – a wide area network (WAN) protocol adopted in the 1980s that largely replaced X.25, and ATM – launched in the 1990s with a lot of hype but now, due to the runaway global success of IP, mainly confined to use by telephone carriers for high-speed internal data transport.

CO packet switching addresses some of the shortcomings of the circuit-switching approach and delivers the following improvements (when compared to circuit switching):

● CO packet switching allows node-to-node links to be dynamically shared by multiple users over time so that network resources are used much more efficiently.

● It supports the exchange of packets between two users operating at different data rates.

● It allows new calls to be accepted during heavy traffic, although with an increased packet transfer delay, and hence reduced data throughput, due to a higher incidence of queuing at intermediate nodes. Circuit switching would simply block new calls during heavy traffic if a free path cannot be established.

● A prioritisation scheme may be efficiently implemented whereby important packets (e.g. those conveying network status information) are catapulted to the front of the queue at each node so that they have a much faster transit through the network. The only way to implement such prioritisation in a circuit-switched approach would be to dedicate some paths for exclusive use by priority packets, but this would be inefficient and wasteful because such packets are few and far between. Think of it this way: you wouldn’t build exclusive roads for ambulance vehicles and fire engines because there are too few of them for this to be cost-effective. However, the very fact that there are only a few of them means they can be prioritised and even allowed to interrupt other road users’ green lights at road junctions. In this way, their journey time is massively reduced, yet the added delay to other road users is negligible.
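The prioritisation idea in the last item can be sketched as a priority queue at a node's output port. This is only an illustration under assumed conventions (two priority classes, lower number served first, first-in first-out within a class); the class and method names are hypothetical and do not describe any particular protocol.

```python
import heapq
import itertools


class NodeQueue:
    """Output queue of a packet-switching node with priority classes."""

    def __init__(self):
        self._heap = []
        self._arrival = itertools.count()  # tie-breaker keeps FIFO order per class

    def enqueue(self, packet, priority=1):
        """Queue a packet; a lower priority number is served first."""
        heapq.heappush(self._heap, (priority, next(self._arrival), packet))

    def dequeue(self):
        """Remove and return the next packet to be forwarded."""
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)


if __name__ == "__main__":
    q = NodeQueue()
    q.enqueue("user packet 1")
    q.enqueue("user packet 2")
    q.enqueue("network status", priority=0)  # jumps ahead of queued user packets
    while len(q):
        print(q.dequeue())
```

Because priority packets are rare, ordinary packets are delayed by at most a few extra service times, which mirrors the ambulance analogy above.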
However, in addition to retaining the drawbacks of circuit switching associated with requiring an initial call setup phase and the impact of failure of any portion of the path after call commencement, CO packet switching introduces the following disadvantages (when compared to circuit switching):

● CO packet switching increases the transit time of messages due to queuing delays at network nodes. As a minimum, a node delays each packet by the time taken to fully receive the packet into the node’s buffer before routing to the designated output port can commence.

● It introduces a variation in overall packet delay (called jitter), which is not desirable for real-time voice and video. Although all packets of one message connection follow the same route, the transit time of each of these packets will be a function of time as the queue state and workload of shared nodes change due to varying network conditions.

● It increases the number of bits transported in the network because a message is broken into packets, and to each of these packets a header is added. For example, in ATM, around 10% of all transported bits are overhead bits from the header.

● There is increased processing at network nodes. Although nodes do not perform independent routing, each node must at least read the VCI of each packet in order to correctly forward the packet according to the route established when the circuit was set up.
Nevertheless, because CO packet switching solved what was the biggest shortcoming of circuit switching, namely the inefficient utilisation of network resources, it was an important step towards building a global high-capacity integrated digital network.
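The header overhead and address-space figures quoted in this section can be checked with a brief sketch, which also shows a message being cut into fixed-length cells. The 5-byte header built here simply stores a VCI number and is a hypothetical stand-in, not the real ATM header layout; zero-padding of the last payload is likewise an assumption made for the example.

```python
def overhead_fraction(header_bytes: int, payload_bytes: int) -> float:
    """Fraction of transported bytes that are header overhead."""
    return header_bytes / (header_bytes + payload_bytes)


def address_space(address_bits: int) -> int:
    """Number of distinct addresses for a given address length."""
    return 2 ** address_bits


def make_cells(message: bytes, vci: int,
               payload_size: int = 48, header_size: int = 5) -> list:
    """Split a message into fixed-length cells of header + payload.

    The header is just the VCI as a big-endian integer (illustrative only);
    the final payload is zero-padded to the fixed length.
    """
    header = vci.to_bytes(header_size, "big")
    cells = []
    for start in range(0, len(message), payload_size):
        payload = message[start:start + payload_size].ljust(payload_size, b"\x00")
        cells.append(header + payload)
    return cells


if __name__ == "__main__":
    print(round(100 * overhead_fraction(5, 48), 1))  # ATM overhead in percent
    print(address_space(32))                         # IPv4 addresses
    print(address_space(128))                        # IPv6 addresses
    cells = make_cells(b"a" * 100, vci=7)
    print(len(cells), [len(c) for c in cells])
```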
Connectionless (CL) packet switching: in CL packet switching, there is no initial call or connection setup phase. Packets are sent as individual datagrams, each containing source and destination IDs in its header. The packets are independently routed through the network at each router, which makes its own decisions on a packet-by-packet basis according to a dynamic routing table that is regularly updated as network conditions change. IP is a CL packet switching implementation which has become the single most dominant networking technology and has left its closest rivals, such as ATM, in relative obscurity. As the name implies, IP is the routing protocol that powers the entire Internet and, since 4G in 2011, it has also been adopted for mobile broadband connectivity. In the 1990s, phrases like voice over Internet protocol (VoIP) and television over Internet protocol (TVoIP) were coined, but such phrases have become superfluous because it is now everything over IP.

CL packet switching addresses some of the shortcomings of the virtual circuit packet switching approach, including the following improvements (when compared to virtual circuit switching):

● CL packet switching has no call setup phase, so message transmission through the network is an instant-start process. Therefore, the delay associated with circuit setup is eliminated and transmission is much quicker for users sending only a few packets, which is often the case in social media interactions and SMS (short messaging service).

● It is much more flexible and agile in reacting to network conditions and, unlike virtual circuits, can do so mid-call or mid-connection. If congestion develops at a given node, subsequent packets from the source can be routed away from that node, whereas in virtual circuits all packets of a call must follow a predefined route, which may include the congested node.

● It is also inherently more reliable. If a node fails then only packets that are lost in the crash are affected. A route may be easily found through the network for subsequent packets to bypass the failed node. In virtual circuits, all connections set up prior to the node failure and that include the failed node in their paths would continue to attempt to pass through the node after its failure and would therefore be terminated.
However, CL packet switching does introduce the following disadvantages (when compared to virtual circuit switching):

● CL packet switching causes packets to transit the network more slowly since routing decisions must be made at every node. It is therefore a slower mechanism for users sending many packets. It could be said that CL is an instant-start jogger, whereas CO is a slow-start sprinter.

● Packets may follow different routes to the same destination and therefore may arrive out of order. This necessitates extra provision in the CL packet header and processing at the destination to allow received packets to be sorted into the right order to form the correct message.

● It is inherently difficult in CL packet switching to implement measures that guarantee a desired quality of service (QoS) and achieve congestion control. Because CL is an instant-start process, any source can begin sending packets through the network at any time. Each such new entrant can contribute towards both network congestion and degrading the QoS of existing users. In contrast, QoS and congestion control can be more readily managed in virtual circuit switching by negotiating the allocation of enough resources in advance for each connection during the call setup phase.
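The reordering burden mentioned in the second item can be sketched as follows. The sketch assumes each datagram carries a sequence number starting at 0, which is a hypothetical header field introduced for illustration, and it treats a gap or duplicate as an error rather than attempting recovery.

```python
def reassemble(datagrams):
    """Rebuild a message from (sequence_number, payload) pairs.

    The datagrams may be supplied in any arrival order. The sequence number
    is an assumed header field used only to restore the original order.
    """
    ordered = sorted(datagrams, key=lambda d: d[0])
    sequence = [seq for seq, _ in ordered]
    if sequence != list(range(len(ordered))):
        raise ValueError("missing or duplicated datagram")
    return b"".join(payload for _, payload in ordered)


if __name__ == "__main__":
    # Datagrams that took different routes and arrived out of order.
    arrived = [(2, b"lo "), (0, b"he"), (3, b"world"), (1, b"l")]
    print(reassemble(arrived))
```

In a virtual circuit no such sorting is needed, since all packets follow the same route and arrive in the order in which they were sent.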
Despite the above shortcomings, CL’s unmatched routing flexibility and elimination of the significant delay overhead in call setup have made it the undisputed technology of choice for the ubiquitous broadband communication networks of the twenty-first century, including the Internet and 4G mobile networks and beyond. The significant weaknesses of CL packet switching in relation to congestion control and QoS have been cleverly mitigated by overlaying it with smart oversight and control algorithms such as transmission control protocol (TCP), real-time transport protocol (RTP), and user datagram protocol (UDP). For example, TCP provides the congestion and flow control mechanisms with which IP is supervised to ensure a smooth and reliable data transfer between sender and receiver.
1.6 Epilogue
Our detailed overview of communication systems is now complete. We had a twofold purpose in this chapter. First, we wanted to carefully and nonmathematically lay a foundation of communication engineering and erect important knowledge pegs on which you can hang the more detailed study presented in subsequent chapters. Second, we wanted to make a brief and highly selective historical sketch of the journey of telecom from telegraphy in 1837 to ubiquitous broadband Internet and 5G mobile communication in 2020. We believe that your study of communication engineering should begin with allowing yourself to be informed and inspired in equal measure by this history. We hope that seeing such potential for seemingly endless progress will stir in you a genuine hunger for competence in the subject and for a complete mastery of its principles and concepts. We hope that you will go on to become one of the architects of a yet-unseen broadband communication future.
Back in the 1940s during the heyday of copper line communication, who saw the rise of optical fibre communication using glass as a transmission medium? In the 1950s when the launch of Sputnik 1 set off the space race, who saw today’s satnav ubiquity? When computer networking began with ARPANET in 1971, who imagined broadband Internet and its social media offspring? And today, who sees the future of telecom? Is there anything beyond today’s smartphone? Will it be an artificial intelligence device (AID)? Is there anything beyond today’s multimedia broadband Internet with its predominantly audio-visual capability? Will it be a multisensory Internet that adds tactile and olfactory functionality so that after 150 years of telecoms it can finally directly cater to our senses of touch and smell as much as to our sight and hearing? We hope that this chapter has inspired you to freely imagine our telecom future and that subsequent chapters will equip you with the skills to help shape it.
We will now embark on the latter task and begin a more in-depth study of the subject in the next chapter with a detailed introduction to signals and systems in the context of telecommunications.
Review Questions
1.1
(a) Discuss the drawbacks of verbal and visual nonelectrical telecommunications. (b) In spite of its many drawbacks, nonelectrical telecommunication remains indispensable in society. Discuss various situations in which nonelectrical telecommunication has some important advantage over (modern) electrical telecommunications. In your discussion, identify the type of nonelectrical telecommunication and the advantage it provides in each situation.
1.2
Sketch the flags that represent the following messages in semaphore code. Remember to include an end of signal code in each case. (a) NO (b) TAKE 5.
1.3
Sketch the voltage pulse sequence for the Morse code representation of each of the following complete messages. (a) WE WON 2-0 (b) I LOVE YOU.
1.4
Baudot code (Appendix, Table A.2) requires seven transmitted bits per character, which includes a start bit (binary 0) and a stop bit (binary 1). Write out the complete bit sequence for the transmission (least significant bit (LSB) first) of the following messages using Baudot code. (a) DON’T GIVE UP (b) 7E;Q8.
1.5
In asynchronous transmission of ASCII-coded data, characters are transmitted one frame at a time starting with the LSB of the character. A frame is formed as follows: (i) take the 7-bit ASCII code for the character; (ii) insert one odd parity check bit in the MSB position to form a byte; (iii) write the byte from LSB to MSB, so that transmission starts from the LSB, then insert bit 0 as a frame header (i.e. start bit) and bits 11 as a frame trailer (i.e. stop bits) to complete the frame. Repeat Question 1.4 for asynchronous transmission of the same messages using ASCII code. Compare the number of bits required by Baudot and ASCII codes in each case and comment on your results. Calculate the transmission efficiency of both coding schemes in (a) and (b).
1.6
The last century witnessed revolutionary developments in telecommunications far beyond what could have been contemplated by the pioneers of telegraphy. Discuss four key areas of these developments.
1.7
Draw a clearly labelled block diagram that is representative of all modern communication systems. List 12 different devices that could serve as the information source of a communication system. Identify a suitable information sink that may be used at the receiver in conjunction with each of these sources.
1.8
With the aid of a suitable diagram where possible, discuss the operation of the following information sources or sinks. (a) Dynamic microphone (b) Loudspeaker (c) CRT (d) Plasma-panel display (e) Liquid crystal display.
1.9
Figure 1.8 shows the ISO standard 226 contours of equal loudness as a function of frequency and SPL. (a) What is the loudness of sound of frequency 200 Hz and SPL 40 dB? (b) If a tone of SPL 55 dB is perceived at a loudness 50 phon, what are the possible frequencies of the tone? (c) Determine the amount by which the vibration (i.e. SPL in dB) of a 50 Hz tone would have to be increased above that of a 1 kHz tone in order that both tones have equal loudness 10 phon.
1.10
Discuss the signal processing tasks performed by the transmitter in a communication system. Indicate why each process is required and how it is reversed at the receiver to recover the original message signal.
1.11
Audio and television broadcasting have both gone digital and there has been a virtually complete digitalisation of the telecommunication networks of most countries. Examine the reasons for this trend of digitalisation by presenting a detailed discussion, with examples where possible, of the advantages and disadvantages of digital communication. Indicate in your discussion how the impact of each of the disadvantages can be minimised in practical systems.
1.12
Give two examples of each of the following types of communication systems: (a) Simplex system (b) Half-duplex system (c) Duplex system (d) Analogue baseband (e) Analogue modulated (f) Digital baseband (g) Digital modulated.
1.13
(a) With the aid of suitable block diagrams, discuss the operation of any two examples of an analogue baseband communication system. (b) What are the most significant disadvantages of an analogue baseband communication system?
1.14
With the aid of suitable block diagrams, discuss the generation of the following discrete baseband signals starting from an analogue message signal. (a) PAM (b) PDM (c) PPM.
1.15
Compare the discrete baseband systems of Question 1.14 in terms of noise performance, bandwidth requirement, power consumption, and circuit complexity.
1.16
Explain how three independent user signals can be simultaneously carried on one transmission link using discrete baseband techniques. What are the major drawbacks of this system?
1.17
Using a suitable block diagram, discuss the steps involved in the ADC process. Identify the major parameters that must be specified by the ADC system designer and explain the considerations involved.
1.18
Give a detailed discussion of the desirable characteristics of line codes, which are used to electrically represent bit streams in digital baseband communication systems.
1.19
Discuss the three ‘R’ operations of a digital baseband receiver.
1.20
Discuss the roles of modulation in communication systems. Hence, identify the transmission media that can be used for voice communication in the absence of modulation.
1.21
Sketch the resulting waveform when the bit stream 10111001 modulates a suitable carrier using: (a) OOK (b) Binary FSK (c) Binary PSK. Assume that the carrier frequency is always an integer multiple of $1/T_b$, where $T_b$ is the bit interval.
1.22
If the following digital communication systems operate at the same symbol rate of 4 kBd, determine the bit rate of each system. (a) OOK (b) 2B1Q (c) Binary FSK (d) 16-APSK.
1.23
Discuss the advantages of circuit switching over packet switching and identify the scenarios where circuit switching would be the preferred technique.
1.24
Discuss the mechanisms used by TCP to ensure a reliable service even though the IP network underneath may be unreliable.
2 Introduction to Signals and Systems
Complexity is merely a multiplicity of simplicity. The art of solving a complex problem is therefore to break it into its simple parts.
In this Chapter
✓ What is a signal?
✓ Forms of telecommunication signals.
✓ Objective and subjective classification of telecommunication signals.
✓ Specification of various special and standard waveforms.
✓ Qualitative introduction to sinusoidal signals.
✓ Quantitative characterisation and parameters of sinusoidal signals.
✓ Logarithmic units and transmission path calibration.
✓ Basic system properties and their applications.
✓ Worked examples and end-of-chapter questions.
2.1 Introduction
The field of telecommunication deals with the transfer or movement of information from one point to another by electronic or electromagnetic means. The information to be transferred has first to be represented as a telecommunication signal. In approaching this important field of study, one must therefore have a thorough understanding of telecommunication signals and how they are manipulated in telecommunication systems. The presentation in the next three chapters has been carefully designed to help you acquire this crucial grounding using an approach that emphasises applications and a graphical appreciation of concepts. It will pay you rich dividends to work diligently through these three chapters, even if you have prior familiarity with some of the concepts discussed. You will gain important insights into crucial fundamental principles and acquire analysis skills which will serve you very well in your subsequent studies and future career.
We start with understanding what constitutes a signal in telecommunications, the various forms in which signals may exist or are produced, how signals are classified both subjectively and objectively, and the mathematical specification of several special waveforms that serve as building blocks or analysis tools for general or arbitrary signals. Our treatment of special waveforms gives deserved emphasis to the sinusoidal signal using first a qualitative introduction followed by various quantitative characterisations and manipulations. We devote some time to the subject of logarithmic units, establishing its concepts and highlighting short cuts and common pitfalls, in order to give you complete mastery of this fundamental engineering tool. Basic systems properties are also introduced with examples drawn from both continuous-time and discrete-time operations.
Communication Engineering Principles, Second Edition. Ifiok Otung. © 2021 John Wiley & Sons Ltd. Published 2021 by John Wiley & Sons Ltd. Companion Website: www.wiley.com/go/otung
2.2 What Is a Signal?
A signal is a variable quantity, expressible as a function of one or more parameters, that conveys information about a physical phenomenon, activity, or event. Telecommunication signals may convey a range of audio-visual, olfactory, tactile, and sensor measurement information, or they may serve to fulfil system control roles. An important feature of an information-bearing signal is that its strength or level or value, howsoever defined, must contain a variable component. Although a constant signal may be useful in conveying energy, such a signal carries no information. For example, a green traffic light conveys an ‘okay’ message only if it can change to, say, red, which conveys ‘not okay’, or amber, which conveys ‘perhaps okay or soon okay’. Conversely, a permanently green light that has constant brightness and intensity may be decorative or illuminating but has no information content. A signal is described as one-dimensional (1D) if it is a function of only one variable such as time and is said to be multidimensional if it depends on two or more variables. There are numerous examples of signals: when you speak into a microphone, your vocal tract activity produces acoustic pressure waves which are converted by the microphone into a voltage waveform whose value varies with time in synchrony with the impinging sound pressure. This is a speech signal and is 1D. Other 1D signal examples include audio (which is the voltage output of a microphone in response to mechanical vibrations in general, covering a broader frequency range than speech), and ambient temperature, humidity, and atmospheric pressure at a fixed point (all being the voltage or current signal produced by a suitable sensor placed at the point), etc. A still image, on the other hand, is a two-dimensional (2D) signal since the value of the picture element (or pixel for short) depends on two variables, namely the x and y coordinates that specify location within the image.
A video signal is three-dimensional (3D), being a time sequence of still images, and hence a function of three variables (x, y, t); although it is usually rendered as a 1D function of time through a left-to-right, top-to-bottom sequential scanning of the scene. Finally, radar can be employed to obtain detailed information about the variation with time of a physical quantity such as rain rate within a volume in 3D space (x, y, z), which therefore yields a four-dimensional signal. Can you think of a five-dimensional signal? In this book we will only consider signals that are 1D single-valued functions of time. This is not as restrictive as first seems, because the variations (and hence information) contained within any multidimensional signal can be fully captured and represented as a 1D function of time through sequential scanning.
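The sequential scanning idea can be sketched in a few lines. This is a minimal illustration with a hypothetical 2 × 3 image and hypothetical function names. It flattens the image left to right, top to bottom into a 1D sequence and then rebuilds it, assuming the receiver knows the line width.

```python
def raster_scan(image):
    """Flatten a 2D image (list of rows) into a 1D sequence, left to right, top to bottom."""
    return [pixel for row in image for pixel in row]


def rebuild(samples, width):
    """Reverse the scan: cut the 1D sequence back into rows of the given width."""
    return [samples[i:i + width] for i in range(0, len(samples), width)]


# Hypothetical image: 2 rows of 3 pixel values each.
image = [
    [10, 20, 30],
    [40, 50, 60],
]

signal_1d = raster_scan(image)
print(signal_1d)                        # [10, 20, 30, 40, 50, 60]
print(rebuild(signal_1d, 3) == image)   # True
```

No information is lost in the scan, but the 1D sequence alone does not say where each line ends, so the width has to be conveyed or agreed separately.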
2.3 Forms of Telecommunication Signals
Table 2.1 lists five different forms of signals, the media in which they can exist, and the signal parameter that varies to convey information. Where a signal exists in acoustic or visual form, the first operation of a telecommunication system is to convert the signal into electrical form (using a suitable transducer) for ease of processing and transmission or further conversions before transmission. Figure 2.1 shows various examples of transducers often employed to convert a telecommunication signal from one form to another. The microphone converts an acoustic (or sound) signal such as music and speech into an electrical signal, whereas the loudspeaker performs a reverse process to that of the microphone and converts electrical signal into sound.
Table 2.1 Forms of signals.
Form | Medium | Varying Parameter
Electrical | Wire (e.g. twisted wire pair in local subscriber loop and coaxial cable in CATV) | Current or voltage level
Electromagnetic | Space (e.g. broadcast TV and radio) | Electric and magnetic fields
Acoustic | Air (e.g. interpersonal communication) | Air pressure
Light | Optical fibre and free space | On–off switching of light from injection laser diode (ILD) or light-emitting diode (LED)
Visual | Electronic or mechanical display device (e.g. paper for print image) | Reflected light intensity
Figure 2.1 Signal conversions in telecommunication: acoustic to electrical (microphone) and back (loudspeaker); visual to electrical (TV camera or scanner) and back (display screen or printer); light to electrical (photodetector, e.g. PIN or APD) and back (LED or LD); electromagnetic to electrical (receive antenna) and back (transmit antenna).
The scanner or television camera converts a visual signal into an electrical signal. The scanner works only on printed (2D and still) images, whereas the television camera can handle physical scenes (3D and movable) as well. The reverse process of converting from an electrical to a visual signal is performed using a suitable display device such as the screen of a smartphone or a printer connected to a computer. A light detector or photodetector converts light energy into electric current. Examples include the PIN diode and the avalanche photodiode (APD). In both cases, the diode consists of an intrinsic (I) semiconductor layer sandwiched between heavily doped layers of p-type and n-type semiconductors, hence the name PIN. The diode is reverse-biased, and this creates a charge-depleted layer in the intrinsic region. Light falling on this layer creates electron–hole pairs that drift in opposite directions to the diode terminals (electrons towards the positive-voltage terminal and holes towards the negative), where they register as current flowing in the same direction. The APD uses a large reverse-bias voltage so that the photo-induced electrons acquire enough kinetic energy to ionise other atoms, leading to an avalanche effect. A light-emitting diode (LED) and a laser diode (LD) perform the reverse process of converting electric current to light. The optical radiation results from the recombination of electron–hole pairs in a forward-biased diode. In the laser diode, there is a threshold current above which the stimulated emission of light of very narrow spectral width commences. An antenna may be regarded as a type of transducer. Used as a transmitter, it converts electrical signals to electromagnetic waves launched out into space in the desired direction. When used as a receiver, it converts an incoming electromagnetic radiation into a current signal. 
Signals may be classified subjectively according to the type of information they convey, or objectively depending on their waveform structure. The waveform or wave shape of a signal is a plot of the values of the signal as a function of time. Note that the following discussion employs terms such as frequency and bandwidth, which are explained fully in subsequent chapters.
2.4 Subjective Classification of Telecommunication Signals
2.4.1 Speech
Speech sound is the response of the human ear–brain system to the sound pressure wave emitted through the lips or nose of a speaker. The elements involved in speech production are illustrated in Figure 2.2. Air is forced from the lungs by a muscular action that is equivalent to pushing a piston. The air stream passes through the glottis, the opening between the vocal cords or folds. For voiced sounds, the vocal cords are set into vibration as the air stream flows by, and this provides a pulse-like and periodic excitation to the vocal tract, which is the air passage from the vocal cords to the openings of the mouth and nose. For unvoiced (or voiceless) sounds, there is no vibration of the vocal cords. The vocal tract, comprising the oral and nasal cavities, is a tube of nonuniform cross-section beginning at the glottis and ending at the lips and nose. The nasal cavity is shut off from the vocal tract by raising the soft palate (also called the velum) and coupled by lowering the velum. The vocal tract acts as a sound modifier. Its shape is changed to determine the type of sound that is produced. Different vowel sounds are generated by the resonance of the vocal tract under different shapes. Vowels have strong periodic structures and higher amplitudes. On the other hand, different consonant sounds are produced by constriction of the vocal tract at different points. The air stream from the lungs flows through this point of constriction at a high velocity, giving rise to turbulence. Consonants have weaker amplitude and a noise-like spectrum. Some sounds, such as the non-vowel part of zee, are generated by mixed excitation. In this case the turbulent airflow at a point of constriction is switched on and off by the closing and opening of the glottis due to the vibration of the vocal cords.
Figure 2.2 Elements involved in human speech production (lungs, vocal folds, glottis, velum, mouth, and nose).
A microphone converts the acoustic signal into an electrical signal, referred to as the speech signal. Knowledge of the time domain and frequency domain characteristics of speech signals is very useful in the design of speech transmission systems. Note that the exact details of these characteristics will vary significantly depending on the speaker’s sex, age, emotion, accent, etc. The following summary is therefore intended as a rough guide. A typical speech waveform is the sum of a noise-like part and a periodic part. Figure 2.3 shows 100 ms segments of the voiced sound ‘o’ in over, the unvoiced sound ‘k’ in kid and the mixed sound ‘z’ in zee. Voiced sound is strongly periodic and of relatively large amplitudes compared to unvoiced sound which is noise-like and of small amplitudes. For example, the voiced sound in Figure 2.3a has a strong periodic component of about 356 Hz. Even for a single speaker, speech signals tend to have a large dynamic range of about 55 dB. The dynamic range is given by the ratio between the largest amplitude (which may occur during intervals of syllabic stress) and the smallest amplitude in soft intervals measured over a period of 10 minutes or more. The ratio of the peak value to the root-mean-square (rms) value, known as the peak factor, is about 12 dB. Compared to a sinusoidal signal that has a peak factor of 3 dB, we see therefore that speech signals have a preponderance of small values. It is evident from Figure 2.3 that most of these small values would represent consonants and they must be faithfully transmitted to safeguard intelligibility. Over a long period of time the large amplitudes of a speech signal follow what is close to an exponential distribution, whereas the small amplitudes follow a roughly Gaussian distribution.
Figure 2.3 Speech waveforms of (a) voiced sound ‘o’ in over, (b) unvoiced sound ‘k’ in kid, and (c) mixed sound ‘z’ in zee. (Relative value versus time, 0 to 100 ms.)
The short-term spectrum of speech is highly variable, but a typical long-term spectrum is shown in Figure 2.4. This spectrum has a lowpass filter shape with about 80% of the energy below 800 Hz. The low-frequency components (50–200 Hz) enhance speaker recognition and naturalness, whereas the high-frequency components (3.5–7 kHz) enhance intelligibility such as being able to differentiate between the sounds of ‘s’ and ‘f’. Good subjective speech quality is, however, obtained in telephone systems with the baseband spectrum limited to the range 300–3400 Hz. This bandwidth is the ITU-T standard for telephony, although a smaller bandwidth (500–2600 Hz) has been used in the past on some international networks to increase capacity.
Figure 2.4 Typical power spectrum of speech signal. (Relative power in dB versus frequency in kHz.)
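The peak factor and dynamic range figures quoted for speech are amplitude ratios expressed in dB. As a rough numerical check, the sketch below (function names are hypothetical) computes the peak factor of a sampled sinusoid, which comes out at about 3 dB, and converts the quoted 12 dB and 55 dB figures back into plain amplitude ratios using the 20 log10 convention for amplitudes.

```python
import math


def peak_factor_db(samples):
    """Ratio of peak magnitude to rms value, expressed in dB."""
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(peak / rms)


def db_to_amplitude_ratio(db):
    """Convert a dB figure to an amplitude ratio (20 log10 convention)."""
    return 10 ** (db / 20)


# One full cycle of a unit-amplitude sinusoid, sampled at 1000 points.
N = 1000
sinusoid = [math.sin(2 * math.pi * n / N) for n in range(N)]

print(round(peak_factor_db(sinusoid), 2))    # 3.01
print(round(db_to_amplitude_ratio(12), 1))   # 4.0, i.e. speech peaks are about 4x the rms value
print(round(db_to_amplitude_ratio(55)))      # 562, largest-to-smallest amplitude ratio
```

A 12 dB peak factor therefore means the signal spends most of its time far below its peaks, which is the preponderance of small values noted in the text.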
2.4.2 Music
Music is a pleasant sound resulting from an appropriate combination of notes. A note is sound at a specific frequency, which is usually referred to as the pitch of the note. Pitch may be expressed in hertz (Hz) or using a notation based on a musical scale. The Western musical scale consists of notes spaced apart in frequency at the ratio of $2^{1/12}$ (≈ 1.059463). For example, if the middle A of a piano is of frequency 440 Hz then subsequent notes up to the next A, one octave higher, are 466, 494, 523, 554, 587, 622, 659, 698, 740, 784, 831, and 880 Hz. You will observe, as illustrated in Figure 2.5, that there is a doubling of frequency in one octave, the space of 12 notes.
Figure 2.5 Fundamental frequency (in Hz) of various notes on a piano keyboard, from A (440 Hz) to the next A (880 Hz), one octave higher.
Sounds of the same note played on different instruments are distinguishable because they differ in a characteristic known as timbre. The note of each musical instrument comes from a peculiar combination of a fundamental frequency, certain harmonics and some otherwise related frequencies, and perhaps amplitude and frequency variations of some of the components. Although sounds from different musical instruments can be differentiated over a small bandwidth of, say, 5 kHz, a much larger bandwidth is required to reproduce music that faithfully portrays the timbre. High-fidelity music systems must provide for the transmission of all frequencies in the audible range from 20 Hz to 20 kHz. Note therefore the significant difference in bandwidth requirements for speech and music transmission. The maximum bandwidth required for speech transmission is 7 kHz for audio conference and loud-speaking telephones. This is referred to as wideband audio to distinguish it from the normal speech transmission using the baseband frequencies 300–3400 Hz.
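The note frequencies quoted for the octave above 440 Hz follow directly from the twelfth-root-of-two spacing. The short sketch below (constant names are hypothetical) regenerates them by repeated multiplication and rounding to the nearest hertz.

```python
MIDDLE_A_HZ = 440.0          # reference pitch used in the text
SEMITONE = 2 ** (1 / 12)     # frequency ratio between adjacent notes

# Thirteen notes: the starting A, the 11 notes in between, and the A one octave up.
notes = [round(MIDDLE_A_HZ * SEMITONE ** k) for k in range(13)]
print(notes)
# [440, 466, 494, 523, 554, 587, 622, 659, 698, 740, 784, 831, 880]
```

Twelve successive multiplications by the same ratio give an overall factor of two, which is why the last value is exactly double the first.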
2.4.3 Video
The video signal is in general the electrical representation of movable 3D scenes as transmitted in television systems. An image signal is a special case that arises when the scene is a still 2D picture. Information conveyed by video signals includes the following.
● Motion: video signals must contain the information required for the display system to be able to create the illusion of continuous motion (if any) when pictures of the scene are displayed. To do this the camera must take snapshots of the scene at a high enough rate. Each snapshot is called a frame. Observations show that a frame rate of about 30 (or a little less) per second is adequate. In motion pictures (or movies) a frame rate of 24 frames/s is used, but at the display each frame is projected twice to avoid flicker. This means a refresh rate of 48 Hz. Increasing the frame rate increases the amount of information and hence the required transmission bandwidth.
● Luminance: information about the brightness of the scene is contained in a luminance signal, which is a weighted combination of the red, green, and blue colour contents of the scene. The weighting emphasises green, followed by red, and lastly blue, in a way that reflects the variation of perceived brightness with the colour of light. For example, given red, green, and blue electric bulbs of the same wattage, the green light appears brightest followed by the red, and the blue light appears dullest.
● Chrominance: the colour content of the scene must be conveyed. There is an infinite range of shades of colours, just as there is an infinite range of shades of grey, or an infinite set of real numbers between, say, zero and one. However, (almost) any colour can be produced by adding suitable proportions of the three primary additive colours: red, green, and blue. Colour information is conveyed by chrominance signals from which the display device extracts the proportion of each of the primary colours contained in the scene. Because the eye is not very sensitive to changes in colour, the fine details of colour changes are usually omitted to allow the use of a smaller bandwidth for the chrominance signals.
● Audio: the sound content of the recorded scene or other superimposed sound information is conveyed in a bandwidth of 15 kHz. This allows high-fidelity sound reproduction.
● Control signals: the television receiver or display device needs information to correctly construct a 2D picture of the scene from the 1D (time function) video signal. Thus, synchronisation pulses are included that control the rate at which the display device draws the scene line by line in step with the image scanning operation of the camera.
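The weighted combination described under luminance can be illustrated numerically. The text does not give the weights, so the sketch below assumes the widely used ITU-R BT.601 luma coefficients. Both these values and the function name are assumptions made for illustration, not the book's own definition.

```python
# Assumed weights: ITU-R BT.601 luma coefficients (not stated in the text).
W_RED, W_GREEN, W_BLUE = 0.299, 0.587, 0.114


def luminance(r, g, b):
    """Weighted sum of red, green, and blue components, each in the range 0 to 1."""
    return W_RED * r + W_GREEN * g + W_BLUE * b


# Equal-strength pure colours: green contributes most, then red, then blue.
print(luminance(1, 0, 0), luminance(0, 1, 0), luminance(0, 0, 1))
# Full white: the three weights sum to one.
print(luminance(1, 1, 1))
```

The ordering of the three outputs matches the bulb example in the text, where green appears brightest and blue dullest.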
How the above signals are combined to form the composite baseband video signal depends on the type of television system. However, it is obvious that video signals contain much more information than music or speech signals and will therefore require a larger transmission bandwidth. The now obsolete 625/50 PAL analogue TV standard required a bandwidth of 8 MHz. The digital TV standard DVB-T2 (Digital Video Broadcasting – 2nd Generation Terrestrial), adopted in 2009, supports compressed information bit rates ranging from 7.44 to 50.32 Mb/s and uses bandwidths of 1.7, 5, 6, 7, 8, or 10 MHz, depending on a number of factors, including image resolution (e.g. standard definition television, SDTV, or high definition television, HDTV), and signal processing mode (e.g. modulation scheme, such as QPSK, 16APSK, 64APSK, or 256APSK, and code rate, such as 1/2, 3/5, 2/3, 3/4, 4/5, or 5/6).
2.4.4 Digital Data
Digital data may originate from textual information generated by PCs or other data terminal equipment and encoded using the universal encoding standard Unicode, which assigns a unique integer number (called a code point) to each character in the world’s writing systems, unlike the American Standard Code for Information Interchange (ASCII) which covered Latin characters only. The first 128 code points of Unicode correspond to ASCII in Table A.4 (Appendix A). A good introduction to ASCII and Unicode is given in Section 1.3.1. Coding transforms the message text into a stream of binary digits. For example, the message ‘Take $1’ is coded in 8-bit ASCII (written here with the most significant bit of each character first) as
01010100 01100001 01101011 01100101 00100000 00100100 00110001
An important consideration in the transmission of textual information is that there are no insignificant details that may be ignored or modified. For example, changing even just one bit in a stream of digital data can have far-reaching consequences. To verify this, check the effect of changing the 53rd bit in the above 56-bit digital data. Elaborate schemes have been devised to detect and, in some cases, correct transmission errors in digital data by adding extra bits to the message bits. Any data compression techniques adopted to reduce the size of the message bit stream must be lossless. That is, the receiver must be able to expand the compressed data back to the original bit stream. Speech, music, video, and myriads of sensor signals are usually converted to a digital representation to exploit the numerous advantages of digital transmission and storage. This analogue-to-digital conversion (ADC) at the transmitter is followed at the receiver by the reverse process of digital-to-analogue conversion (DAC). Although such an ADC output signal is a bit stream like the digital representation of textual information, there is an added flexibility that both lossless and lossy compressions can be applied to reduce the number of bits required to represent the analogue signal. Lossy compression eliminates signal details that are subjectively insignificant in order to save bandwidth. For example, barring other distortions, speech can be accurately transmitted (with intelligibility and speaker recognition) by using nearly lossless representation at 128 kbits/s, or using lossy compression to reduce the bit rate to 4 kbits/s or even lower. There is a loss of subjective quality in the latter, but the consequences are not catastrophic, as would be the case if lossy compression were applied to digital data conveying textual information.
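The single-bit check suggested above can be tried directly. The sketch below (helper names are hypothetical) encodes ‘Take $1’ as 8-bit ASCII, flips the 53rd bit, and decodes the result. It assumes the bits of each character are written most significant bit first and that bit positions are counted from 1.

```python
def to_bits(text):
    """8-bit ASCII, most significant bit of each character first (an assumption)."""
    return "".join(format(b, "08b") for b in text.encode("ascii"))


def from_bits(bits):
    """Decode a string of 0s and 1s, eight bits per character."""
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8)).decode("ascii")


def flip_bit(bits, position):
    """Invert the bit at a 1-based position."""
    i = position - 1
    return bits[:i] + ("1" if bits[i] == "0" else "0") + bits[i + 1:]


bits = to_bits("Take $1")
print(len(bits))                       # 56
print(from_bits(flip_bit(bits, 53)))   # Take $9
```

Under these assumptions a single bit error turns a request for $1 into a request for $9, which is the kind of far-reaching consequence the text has in mind.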
2.4.5 Facsimile
A facsimile signal, usually abbreviated fax, conveys the visual information recorded on paper, including printed or handwritten documents, drawings, and photographs. The transmission medium is in most cases a telephone line and sometimes radio. There are several fax standards such as the Group 3 (G3) standard, which has a standard resolution of 200 lines per inch (lpi), meaning that the paper is divided into a rectangular grid of picture elements (pixels), with 200 pixels per inch along the vertical and horizontal directions. That is, there are 62 pixels per square millimetre. The Group 4 (G4) standard has an ultrafine resolution of 400 lpi. The fax signal is generated by scanning the paper from left to right, one grid line at a time, starting from the top left-hand corner until the bottom right-hand corner of the paper is reached. For black-and-white only reproduction, each pixel is coded as either white or black using one binary digit (0 or 1). For better-quality reproduction in G3 Fax, up to five bits per pixel may be used to represent up to 2^5 = 32 shades of grey. The resulting bit stream is compressed at the transmitter and decompressed at the receiver in order to increase the effective transmission bit rate. At the receiver, each pixel on a blank paper is printed black, white, or a shade of grey according to the value of the bit(s) for the pixel location. In this way the transmitted pattern is reproduced. The popularity of fax around the world reached its peak in the 1980s and 1990s. Since then fax has been in decline, although many large businesses and organisations around the world still maintain a fax machine and
associated dedicated phone number. The emergence of smartphone devices, broadband wireless communication, and the Internet has completely obsoleted fax technology in all social applications. The reason is simple: if you can scan the document or picture using your smartphone camera and transmit it in high quality over a wireless network to its destination using a myriad of apps on your device at no extra cost to your mobile subscription, why would you need to invest in a separate and comparatively more cumbersome fax machine? Fax is still being used for business communication, but even here the technology is rapidly approaching extinction since legally binding documents may now be electronically completed online and digitally signed. Also, manually produced and signed documents may be scanned and exchanged using email or other Internet-based applications.
2.4.6 Ancillary and Control Signals Ancillary and control signals perform system control functions and carry no user-consumable information. There are numerous examples. Synchronising pulses are transmitted along with a video signal to control the start of each line and field scan. Pilot signals are transmitted along copper line systems to monitor variation in attenuation. Extra symbols or bits are inserted in digital signals for error detection/correction, synchronisation, and other system management functions.
2.5 Objective Classification of Telecommunication Signals 2.5.1 Analogue or Digital A signal is said to be analogue if it can take on a continuum of values and is defined at a continuum of time instants within a specified duration. Thus, an analogue signal g(t) is a continuous function of a continuous independent variable t, drawn from a set of real numbers. We therefore describe an analogue signal as a continuous-value, continuous-time signal. Most naturally occurring signals are analogue, e.g. speech, ambient temperature, etc. Digital signals on the other hand can only take on a set of values (e.g. 0 and 1, or V and −V for binary signals) at a set of uniformly spaced time instants (e.g. t = 0, ±T s , ±2T s , ±3T s , …, where T s is called the sampling interval). A digital signal is thus a discrete function of a discrete independent variable (drawn from a set of integers), and hence a discrete-value, discrete-time signal. Note that a continuous-value signal that is defined only at discrete time instants is not a digital signal; rather such a signal is a sampled signal or a discrete-time signal or simply a discrete signal. Also, a discrete-value signal that is defined throughout a continuum of time instants within a specified duration is neither digital nor discrete but is a quantised signal, often referred to as a staircase signal in view of its shape, which does simplify certain computations on the signal – as will become obvious in the next chapter. Figure 2.6 illustrates the four signal types introduced above. The analogue signal g(t) is defined for the duration from tmin to tmax , with a range of values from Amin to Amax . Note that there is an infinite number of time instants between tmin and tmax and g(t) is defined at every one of them. Also, g(t) can have any of the infinite number of values between Amin and Amax . 
The sampled or discrete signal g(nT s ) is obtained by taking the values of g(t) only at discrete time instants that are integer multiples of the sampling interval T s . This process generates a time series or sequence of values g[n], which is usually depicted diagrammatically, as shown in Figure 2.6b using a stem plot in which a vertical line of height g(nT s ) is drawn at t = nT s to represent the value of the discrete signal at that time instant. A clarification of notation is in order here. We will use square brackets to denote the sequence of a discrete signal as g[n], and round brackets to denote the value of the sequence at the nth sampling instant as g(n). Thus g[n] = {g(n), n ∈ ℤ};
g(n) ≡ g(nT s )
where ℤ = {…, −3, −2, −1, 0, 1, 2, 3, …} is the set of integer numbers and T s is the sampling interval.
Figure 2.6 (a) Analogue signal is continuous-value, continuous-time; (b) Sampled or discrete signal is continuous-value, discrete-time; (c) Quantized signal is discrete-value, continuous-time; (d) Digital signal is discrete-value, discrete-time.
We will assume that it is understood that the sampling interval – though not written explicitly in g(n) – is T s so that, for example, g(−2) is the value g(−2T s ), i.e. the value of g(t) at time t = −2T s . Clearly the independent variable of g[n] is discrete time nT s , which is for convenience written simply as n (without the factor T s ), but you must always remember that the value n = 3, for example, corresponds to the time instant t = 3T s . Note therefore that a sampled signal or sequence g[n] is a continuous function of a discrete independent variable, hence we describe it as a continuous-value, discrete-time signal. Figure 2.6c shows the quantised or staircase signal gq (t), where the subscript q indicates that the values of the signal are quantised (i.e. restricted to a discrete set of values). This signal is obtained from g(t) by rounding g(t) at each of the continuum of time instants to the nearest allowed level. In this example there are four allowed levels or values or states A1 , A2 , A3 , and A4 . A staircase signal is therefore a discrete function of a continuous independent variable; hence we describe it as a discrete-value, continuous-time signal. Finally, Figure 2.6d shows a digital signal gq [n] obtained from the sampled signal g[n] by approximating (i.e. rounding) each value to the nearest allowed level. This process is known as quantisation. It introduces an irreversible rounding or quantisation error in each quantised sample. The example shown in Figure 2.6d is a quaternary digital signal since gq [n] has four possible values or states. If there are only two possible states, the signal is referred to as a binary digital signal, which is the most common type of digital signal used in digital communications. Ternary digital signals, with three possible states, are also used for line coding to represent data as voltage levels in a baseband transmission system. 
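The sampling and quantisation steps described above can be sketched in a few lines of Python. This is only an illustration: the signal g(t), the sampling interval, and the four allowed levels are hypothetical values chosen for the example, not taken from Figure 2.6.

```python
import math


def g(t: float) -> float:
    """Hypothetical analogue signal (continuous-value, continuous-time)."""
    return 2.0 * math.sin(2 * math.pi * 5 * t + math.pi / 8)


def sample(signal, ts: float, n_values):
    """Sampled signal g[n]: values of g(t) at t = n*Ts only."""
    return [signal(n * ts) for n in n_values]


def quantise(value: float, levels):
    """Round a value to the nearest allowed level."""
    return min(levels, key=lambda level: abs(level - value))


TS = 0.025                        # assumed sampling interval in seconds
LEVELS = [-1.5, -0.5, 0.5, 1.5]   # assumed allowed levels A1..A4

g_n = sample(g, TS, range(8))                 # continuous-value, discrete-time
gq_n = [quantise(v, LEVELS) for v in g_n]     # discrete-value, discrete-time
errors = [v - q for v, q in zip(g_n, gq_n)]   # irreversible quantisation error

print([round(v, 3) for v in g_n])
print(gq_n)
```

Applying `quantise` to g(t) at every instant of a continuum, rather than only to the samples, would give the staircase signal gq(t); the code above takes the second route of Figure 2.6, from the sampled signal to the digital (here quaternary) signal.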
In general, a digital signal may have M possible states (where M is usually an integer power of 2) and is described as an M-ary digital signal. Two features of digital signals make them more suitable for representing information in practical communication systems. First, unlike an analogue signal, the precise value of a digital signal is not important. Rather, what matters is the range within which the signal value lies. For example, if a binary digital signal is transmitted using +12 V to represent binary 1 and −12 V to represent binary 0 then any received signal value above 0 V would be interpreted as binary 1 and any below zero as binary 0. It would then take an impairment effect exceeding 12 V to cause an error in the detection or interpretation of the received signal. Second, unlike analogue signals, only the
sampling instants are significant. This means that the detected value of a digital signal is not impaired by any distortions or noise outside the decision instants. Thus, a careful choice of sampling instants allows the digital signal to be detected at the instants of minimum distortion to its values. More significantly, the gaps between samples can be used to transmit other user signals, allowing multiple users to be simultaneously accommodated in the transmission system. This popular technique is known as time division multiplexing and is only possible if the user signals are discrete or digital. Subsequently, and throughout this book, a continuous-value, continuous-time signal will be referred to simply as a continuous signal or analogue signal and denoted such as g(t), x(t), y(t), etc., whereas a continuous-value, discrete-time signal will be referred to simply as a discrete signal and denoted such as g[n], x[n], y[n], etc. with its nth sample denoted g(n), x(n), y(n), etc. respectively. Furthermore, it will be assumed that a signal is continuous unless otherwise specified.
2.5.2 Periodic or Nonperiodic
A periodic signal consists of a sequence or pattern of values that repeats over and over in time. The waveform of a periodic signal will therefore have a repetitive fundamental shape. The duration of the shortest fundamental shape is called the period T of the waveform. Thus, if g(t) is a periodic signal of period T, then
$$g(t) = g(t \pm nT); \quad n = 0, 1, 2, 3, \ldots \tag{2.1}$$
The smallest duration of time T for which Eq. (2.1) is satisfied is the period of g(t). It means, for example, that the signal has the same value at the time instants t = 0, T, 2T, 3T, …, which we may write as
$$g(0) = g(T) = g(2T) = g(3T) = \cdots$$
Strictly speaking, a periodic signal is eternal, having neither a beginning nor an end. It exists from t = −∞ to +∞. In this respect, it would seem impossible to generate a truly periodic signal. However, in practical systems, the duration of interest is always limited. Thus, if a signal’s waveform has a fundamental shape that repeats over and over within the finite interval of interest then the signal is said to be periodic. A nonperiodic (or aperiodic) signal has no repetitive pattern and hence does not satisfy Eq. (2.1) for any T > 0. Some signals (e.g. some speech waveforms) may have a slowly changing repetitive pattern. They are referred to as quasiperiodic signals. Figure 2.7 shows examples of periodic, nonperiodic, and quasiperiodic waveforms. The fundamental shape of the periodic waveform is emphasised.
The fundamental frequency $f_o$ of a periodic signal is the rate at which the fundamental shape repeats itself. This is the number of cycles per second, which is usually expressed in Hz. Since one cycle or fundamental shape has a duration T (the period), it follows that the fundamental frequency is given by
$$f_o = \frac{1}{T} \tag{2.2}$$
In Eq. (2.2), if the period T is in seconds, the frequency is in Hz; when T is in milliseconds (ms), the frequency is in kHz; when T is in microseconds (μs), the frequency is in MHz; and when T is in nanoseconds (ns), the frequency is in gigahertz (GHz), etc. By analogy between the cycles of circular motion and the cycles of periodic signal variation, we note that there are 2π radians or 360° in one cycle. Thus, the number of radians per second, called angular frequency and denoted ω, of a periodic signal of fundamental frequency $f_o$ is
$$\omega = 2\pi f_o = \frac{2\pi}{T} \quad \text{(in radians per second)} \tag{2.3}$$
In a similar manner, we consider a discrete signal g[n] to be periodic if
$$g(n) = g(n \pm N) \tag{2.4}$$
Figure 2.7 Waveforms (value versus time): (a) Nonperiodic, with no repetitive pattern; (b) Periodic with period T, fundamental shape emphasised; (c) Quasiperiodic, with a slowly changing repetitive pattern.
The smallest positive integer N for which Eq. (2.4) is satisfied is the period of the discrete signal, indicating that the signal completes one cycle in N samples. In line with Eq. (2.3) for a periodic continuous-time signal, such a periodic discrete signal has a fundamental angular frequency (expressed in radians per sample and denoted Ω) defined by
$$\Omega = \frac{2\pi}{N} \tag{2.5}$$
Ω is often referred to simply as fundamental frequency. It is important to note that the unit of (fundamental) frequency for a discrete signal is radian/sample, whereas that of the corresponding parameter of a continuous-time signal is rad/s. In addition to the fundamental frequency, every periodic signal (except sine waves, which we discuss later) contains other frequencies, called harmonic frequencies, at integer multiples of the fundamental frequency. The nth harmonic frequency is
$$f_n = n f_o; \quad n = 1, 2, 3, \ldots \tag{2.6}$$
Note that the first harmonic frequency, f 1 = 1 × f o = f o , is the same as the fundamental frequency f o . This is different from conventional usage in music where the first harmonic is twice the fundamental frequency.
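Equations (2.2) to (2.6) can be exercised with a short Python sketch. The function names are hypothetical, and the period search assumes that the finite record of samples contains at least two full cycles, which is an assumption of this illustration rather than a requirement stated in the text.

```python
import math


def fundamental_frequency(period_s: float) -> float:
    """Eq. (2.2): f_o = 1/T, in Hz when T is in seconds."""
    return 1.0 / period_s


def angular_frequency(period_s: float) -> float:
    """Eq. (2.3): omega = 2*pi/T, in rad/s."""
    return 2.0 * math.pi / period_s


def harmonics(f_o: float, count: int):
    """Eq. (2.6): f_n = n*f_o for n = 1..count."""
    return [n * f_o for n in range(1, count + 1)]


def discrete_period(samples):
    """Smallest N with g(n) = g(n + N) over the record, per Eq. (2.4).

    Only N up to half the record length is tried, so that at least one
    full repetition is actually observed. Returns None if none is found.
    """
    for N in range(1, len(samples) // 2 + 1):
        if all(samples[n] == samples[n + N] for n in range(len(samples) - N)):
            return N
    return None


f_o = fundamental_frequency(2e-6)      # T = 2 microseconds
print(f_o, harmonics(0.5e6, 3))

sequence = [0, 1, 2, 1] * 3            # hypothetical periodic sequence
N = discrete_period(sequence)
big_omega = 2.0 * math.pi / N          # Eq. (2.5), in rad/sample
print(N, big_omega)
```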
2.5.3 Deterministic or Random A deterministic signal is exactly predictable and can therefore be expressed as a completely specified function of time or other variable such as location in space. For example, a signal described by the functional relationship g(t) = 20cos(100𝜋t) is deterministic. A random signal on the other hand always has some element of uncertainty in its value and can therefore only be described in terms of the probability that its instantaneous value will lie in some specified range. Thus, whereas there is always some uncertainty about the value of a random signal before
it occurs, there is no uncertainty whatsoever about the value of a deterministic signal at any point or time past, present, and future. Examples of a random signal include noise voltage in an electrical conductor and the signal r(t)cos(100𝜋t + 𝜓(t)) in which the envelope r(t) and phase 𝜓(t) vary randomly with time t.
2.5.4 Power or Energy
If a signal has a finite nonzero energy E (i.e. 0 < E < ∞), it is said to be an energy signal. If, on the other hand, a signal has a finite nonzero power P (i.e. 0 < P < ∞), it is described as a power signal. Power and energy are discussed in more detail under signal characterisation in the next chapter, but for now it will suffice to note that normalised average power P of a signal is the mean square value of the signal and is also the energy E of the signal per unit time. Thus, for a signal of duration 𝕋 we may write
$$P = \lim_{\mathbb{T} \to \infty} \left[ \frac{E}{\mathbb{T}} \right]; \qquad E = \lim_{\mathbb{T} \to \infty} [P \cdot \mathbb{T}] \tag{2.7}$$
We see that if E is finite then P = 0; and if P is nonzero and finite then E → ∞. That is, an energy signal has finite energy but zero average power, whereas a power signal has finite power but infinite energy. A signal cannot be both an energy signal and a power signal, and the signals of practical interest are one or the other (an idealised signal that grows without bound, such as a ramp, is neither). A realisable finite-duration signal (called a pulse or symbol) is always an energy signal, whereas periodic signals and random signals are power signals.
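The distinction can be checked numerically with a minimal Python sketch. The 5 V, 1 ms pulse is a hypothetical example, the cosine is the deterministic signal 20cos(100πt) mentioned in Section 2.5.3, and the integration uses a simple midpoint rule chosen here for illustration.

```python
import math


def energy(signal, t_start, t_end, steps=20_000):
    """Approximate the integral of signal(t)**2 using the midpoint rule."""
    dt = (t_end - t_start) / steps
    return sum(signal(t_start + (k + 0.5) * dt) ** 2 for k in range(steps)) * dt


def average_power(signal, t_start, t_end, steps=20_000):
    """Normalised average power: energy per unit time over the window."""
    return energy(signal, t_start, t_end, steps) / (t_end - t_start)


def pulse(t):
    """Hypothetical finite-duration pulse: 5 V for 1 ms."""
    return 5.0 if 0.0 <= t < 1e-3 else 0.0


def cosine(t):
    """Periodic signal 20cos(100*pi*t), period 0.02 s."""
    return 20.0 * math.cos(100 * math.pi * t)


windows = (1.0, 10.0, 100.0)   # observation durations in seconds

pulse_energy = energy(pulse, -0.01, 0.01)             # finite energy
pulse_powers = [pulse_energy / w for w in windows]    # tends to zero

cosine_power = average_power(cosine, 0.0, 0.02)       # one full period
cosine_energies = [cosine_power * w for w in windows] # grows without bound

print(pulse_energy, pulse_powers)
print(cosine_power, cosine_energies)
```

The pulse keeps the same energy however long it is observed, so its average power shrinks as the window grows, whereas the cosine keeps the same power (the mean square value, 20²/2) and accumulates energy in proportion to the window.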
2.5.5 Even or Odd
A real-valued function g(t) of a real variable t is said to be even if and only if its value is unchanged when t changes sign. On the other hand, the function is said to be odd if its value changes sign without a change in magnitude when t changes sign. Thus, even signals are symmetric about the vertical axis, whereas odd signals are antisymmetric. Expressed mathematically
$$g(t) = \begin{cases} g(-t) & \text{for all } t \Rightarrow \text{Even signal} \\ -g(-t) & \text{for all } t \Rightarrow \text{Odd signal} \end{cases} \tag{2.8}$$
Similarly, a discrete signal g[n] is even if g(n) = g(−n) for all n, and is odd if g(n) = −g(−n) for all n. Some signals, such as g(t) in Figure 2.6a, are neither odd nor even, but they can always be expressed as a sum of even and odd signals. That is, given an arbitrary signal g(t), we can write
$$g(t) = g_e(t) + g_o(t)$$
where $g_e(t)$ is even and $g_o(t)$ is odd. To derive expressions for these even and odd components of g(t), substitute −t for t in the above equation, and make use of the definition of even and odd functions given in Eq. (2.8) to obtain
$$g(-t) = g_e(-t) + g_o(-t) = g_e(t) - g_o(t)$$
Thus, we have two equations: $g(t) = g_e(t) + g_o(t)$, and $g(-t) = g_e(t) - g_o(t)$, which when added together yield
$$g_e(t) = \frac{1}{2}[g(t) + g(-t)] \tag{2.9}$$
and when subtracted yield
$$g_o(t) = \frac{1}{2}[g(t) - g(-t)] \tag{2.10}$$
As an example, consider the deterministic signal
$$g(t) = t^2(1 - t) - 10$$
Figure 2.8 (a) Arbitrary signal $g(t) = g_e(t) + g_o(t)$; (b) Even component $g_e(t)$; (c) Odd component $g_o(t)$. Each panel is plotted over −4 ≤ t ≤ 4.
which is shown in Figure 2.8a and is clearly neither even nor odd. However, we may express this signal as the sum of an even signal $g_e(t)$ and an odd signal $g_o(t)$ obtained using Eqs. (2.9) and (2.10) as follows
$$\begin{aligned} g_e(t) &= \frac{1}{2}[g(t) + g(-t)] \\ &= \frac{1}{2}[t^2(1 - t) - 10 + (-t)^2(1 - (-t)) - 10] \\ &= \frac{1}{2}[t^2(1 - t) - 10 + t^2(1 + t) - 10] \\ &= t^2 - 10 \end{aligned}$$
$$\begin{aligned} g_o(t) &= \frac{1}{2}[g(t) - g(-t)] \\ &= \frac{1}{2}[t^2(1 - t) - 10 - \{(-t)^2(1 - (-t)) - 10\}] \\ &= \frac{1}{2}[t^2(1 - t) - 10 - t^2(1 + t) + 10] \\ &= -t^3 \end{aligned}$$
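The decomposition of Eqs. (2.9) and (2.10) can also be applied numerically to any signal. The following Python sketch is an illustration with hypothetical function names; it reproduces the result of the example signal at a few integer time instants.

```python
def even_part(g):
    """Return the even component of g, per Eq. (2.9)."""
    return lambda t: 0.5 * (g(t) + g(-t))


def odd_part(g):
    """Return the odd component of g, per Eq. (2.10)."""
    return lambda t: 0.5 * (g(t) - g(-t))


def g(t):
    """The example signal g(t) = t^2 (1 - t) - 10."""
    return t ** 2 * (1 - t) - 10


ge = even_part(g)
go = odd_part(g)

for t in (-4, -1, 0, 2, 4):
    # Columns: t, g(t), ge(t), go(t); the last two always sum to g(t).
    print(t, g(t), ge(t), go(t))
```

At every instant the printed even and odd parts agree with the closed forms t² − 10 and −t³ obtained in the derivation.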
These two component signals $g_e(t)$ and $g_o(t)$ are shown in Figure 2.8b and c. Even and odd functions possess several useful properties which may be exploited to simplify signal analysis. Using their definition (Eq. (2.8)), it is a straightforward matter to show that the following properties hold for even and odd signals:
1. The product or quotient of two even signals is even, and the product or quotient of two odd signals is also even.
2. The sum of two even signals is even, and the sum of two odd signals is odd.
3. The quotient or product of an even and an odd signal is odd.
4. The derivative of an even signal is odd, whereas the derivative of an odd signal is even.
5. The integral of an odd signal g(t) in any range that is symmetrical about the vertical axis (i.e. from t = −T to t = T) is zero. It therefore follows that odd signals always have zero average value.
6. The integral of an even signal from t = −T to t = T is twice the integral from t = 0 to t = T.
7. The Maclaurin series of an even signal comprises only even powers, whereas that of an odd signal comprises only odd powers.
8. The Fourier series of a periodic even signal comprises only cosine terms, whereas the Fourier series of a periodic odd signal comprises only sine terms. We discuss the Fourier series in Chapter 4.
To illustrate how the knowledge of some of the above properties can greatly simplify computations on a signal, consider the integration
$$\int_{-\pi/3}^{\pi/3} t^2 \sin(5t)\,dt$$
We know that $t^2$ is even and sin(5t) is odd. Thus $t^2 \sin(5t)$ is an odd signal (by property 3 above), and hence the above integration is zero (by property 5). Notice how we have obtained a solution without a lengthy calculation that would involve integration by parts.
Discussion Example
Classify each of the following signals in terms of its form, type of variation, and number of dimensions: (a) The number of customers in a supermarket queue. (b) The singing of birds in a forest.
(a) The number of customers in the supermarket queue is a variable quantity that is capable of changing continuously with time, is observed at one location, takes on values that are restricted to positive integers, and in its raw form (i.e. before being captured and processed by a suitable sensor such as a digital camera) is visual. It is therefore a quantised 1D visual signal.
(b) The singing of birds in a forest is obviously an acoustic signal that can be captured using a microphone sensor and converted into an electrical voltage signal representing music. The signal will vary continuously both in time t and in value (specified in terms of loudness of the entire sound or of a specific pitch). Furthermore, the observed value at any given time t will vary with location (x, y, z) within the forest volume. Thus, this is an analogue 4D acoustic signal.
2.6 Special Waveforms and Signals
Figure 2.9 shows some common waveforms in telecommunication systems. The trapezoidal pulse is made up of rising, flat, and falling portions, and is often used to closely approximate realisable laboratory pulses. It reduces to rectangular, triangular, sawtooth, and ramp pulses as special cases. A periodic pulse train may be characterised by its duty cycle (sometimes expressed as a percentage) defined by
$$\text{Duty cycle} = \frac{\text{Duration of pulse}}{\text{Period of waveform}} = \frac{\tau}{T} \tag{2.11}$$
A rectangular pulse train having a 50% duty cycle is usually simply described as a square wave. Note that the triangular and sawtooth waveforms shown in Figure 2.9 are pulse trains having 100% duty cycle. A triangular pulse has equal rise and fall times, whereas a sawtooth pulse has unequal rise and fall times. Sawtooth signals are used, for example, in oscilloscopes to sweep an electron beam across the face of the cathode-ray tube (CRT) during the rising portion of the waveform and to provide a quick flyback of the beam during the falling portion, which has a much shorter duration. Also shown in Figure 2.9 are the sinusoidal waveform, which constitutes the building block of all other signals, and the random waveform, which (in the form of noise) is an ever-present unwanted addition or companion to all other signals in practical transmission systems. These two signals are so crucial that we dedicate separate sections (in this chapter and the next) to their treatment. Figure 2.10 shows three examples of a pulse train with different pulse shapes and duty cycles determined using Eq. (2.11). Other fundamental waveforms and their manipulations to build other signals are introduced after these worked examples.
Figure 2.9 Common waveform types in telecommunications: trapezoidal pulse train, rectangular pulse train, triangular pulse train, sawtooth pulse train, sinusoidal waveform, rectangular pulse, and random waveform.
Worked Example 2.1
Sketch a rectangular waveform of amplitude 5 V that has a duty cycle of 20% and a 3rd harmonic frequency of 6 kHz.
We are given $f_3$ = 6 kHz and duty cycle = 0.2. To sketch the waveform, the period T and pulse duration τ must first be determined. Equation (2.6) gives the fundamental frequency
$$f_o = \frac{f_3}{3} = \frac{6\ \text{kHz}}{3} = 2\ \text{kHz}$$
Figure 2.10 Duty cycle (d) of various pulse trains: $g_1(t)$ with d = 0.2 (τ = 25 μs, T = 125 μs); $g_2(t)$ with d = 0.8 (time axis in ms); $g_3(t)$ with d = 0.4 (time axis in ns).
The period is then obtained using Eq. (2.2)
$$T = 1/f_o = 1/(2\ \text{kHz}) = 0.5\ \text{ms}$$
Equation (2.11) then gives the pulse duration
$$\tau = T \times \text{Duty cycle} = 0.5 \times 0.2 = 0.1\ \text{ms}$$
A sketch of the rectangular waveform is shown in Figure 2.11.
Figure 2.11 Two periods of the rectangular waveform of Example 2.1 (v(t) in volts, amplitude 5 V, versus t in ms; the pulses occupy 0 to 0.1 ms and 0.5 to 0.6 ms).
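The arithmetic of Worked Example 2.1 can be reproduced with a small Python sketch. The function names are hypothetical, and the waveform is assumed to start a pulse at t = 0, as in the sketch of Figure 2.11.

```python
def pulse_train_parameters(harmonic_hz, harmonic_number, duty_cycle):
    """Return (f_o, T, tau) from the nth harmonic frequency and the duty cycle."""
    f_o = harmonic_hz / harmonic_number   # Eq. (2.6) rearranged
    period = 1.0 / f_o                    # Eq. (2.2) rearranged
    tau = duty_cycle * period             # Eq. (2.11) rearranged
    return f_o, period, tau


f_o, T, tau = pulse_train_parameters(6e3, 3, 0.2)


def rect_wave(t, amplitude=5.0):
    """Rectangular waveform value at time t (seconds), pulse starting at t = 0."""
    return amplitude if (t % T) < tau else 0.0


print(f_o, T, tau)   # about 2 kHz, 0.5 ms, 0.1 ms
print(rect_wave(0.05e-3), rect_wave(0.3e-3), rect_wave(0.55e-3))
```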
Worked Example 2.2
Determine the first three harmonic frequencies of the periodic waveform shown in Figure 2.12a.
The crucial first step here is to identify the fundamental shape or cycle of the waveform. This is done in Figure 2.12b and the fundamental shape is emphasised using a bold line. We see that there are three repetitions of the shape in a time of 6 μs, giving a period T = 2 μs. Equation (2.2) yields the fundamental frequency of the waveform $f_o = 1/T = 1/(2\ \mu\text{s}) = 0.5\ \text{MHz}$. Therefore
First harmonic frequency $f_1 = 1 \times f_o$ = 0.5 MHz
Second harmonic frequency $f_2 = 2 \times f_o$ = 1 MHz
Third harmonic frequency $f_3 = 3 \times f_o$ = 1.5 MHz
Figure 2.12 Periodic waveform of Example 2.2: (a) waveform over 0 to 6 μs; (b) the same waveform with one cycle emphasised.
2.6.1 Unit Step Function
The continuous and discrete versions of the unit step function, u(t) and u[n], respectively, are shown in Figure 2.13a and are defined by
$$u(t) = \begin{cases} 1, & t \ge 0 \\ 0, & t < 0 \end{cases}; \qquad u(n) = \begin{cases} 1, & n \ge 0 \\ 0, & n < 0 \end{cases} \tag{2.12}$$
Staircase (i.e. discrete value) waveforms can be constructed as a sum of unit step signals of appropriate amplitudes and delays, as illustrated in Worked Example 2.4 for a rectangular pulse. Also, we can learn a lot about the characteristic of a system by observing the system’s response to a unit step signal applied at its input.
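The idea of building a staircase waveform from delayed unit steps can be sketched in Python as follows. The pulse amplitude, start time, and width below are hypothetical values chosen for illustration and are not those of Worked Example 2.4.

```python
def u(t):
    """Unit step of Eq. (2.12): 1 for t >= 0, otherwise 0."""
    return 1.0 if t >= 0 else 0.0


def rect_from_steps(t, amplitude, start, width):
    """Rectangular pulse as a step up at `start` minus a step at `start + width`."""
    return amplitude * (u(t - start) - u(t - (start + width)))


# A 3 V pulse occupying 0 <= t < 1 (arbitrary time units).
for t in (-0.1, 0.0, 0.5, 1.0, 1.5):
    print(t, rect_from_steps(t, amplitude=3.0, start=0.0, width=1.0))
```

With the convention u(0) = 1 of Eq. (2.12), the pulse built this way includes its leading edge and excludes its trailing edge.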
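2.6.2 Signum Function
The signum function sgn(t) is shown in Figure 2.13b and is defined by
$$\operatorname{sgn}(t) = \begin{cases} 1, & t > 0 \\ 0, & t = 0 \\ -1, & t < 0 \end{cases}; \qquad \operatorname{sgn}(n) = \begin{cases} 1, & n > 0 \\ 0, & n = 0 \\ -1, & n < 0 \end{cases} \tag{2.13}$$
The phase ϕ of the resultant follows from the signs of the in-phase and quadrature components $A_I$ and $A_q$:
$$\phi = \begin{cases} 90^\circ, & A_q > 0,\ A_I = 0 \\ -90^\circ, & A_q < 0,\ A_I = 0 \\ \alpha, & A_q \ge 0,\ A_I > 0 \\ 180^\circ - \alpha, & A_q \ge 0,\ A_I < 0 \\ \alpha - 180^\circ, & A_q < 0,\ A_I < 0 \\ -\alpha, & A_q < 0,\ A_I > 0 \end{cases}$$
where
$$\alpha = \tan^{-1}(|A_q / A_I|) \tag{2.52}$$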
As demonstrated in Worked Example 2.8, this method is quicker and more straightforward than its equations might suggest at first glance. Worked Example 2.7 Obtain the sum of the sinusoidal voltages v1 (t) = 3sin(𝜔t) and v2 (t) = 4sin(𝜔t − 𝜋/3) volts The sum voltage is v(t) = 3 sin(𝜔t) + 4 sin(𝜔t − 𝜋∕3) = A sin(𝜔t ± 𝛼). To determine A and 𝛼, we first represent v1 (t) as a phasor A1 ∠𝜙1 – a straight line of length equal (or scaled) to the amplitude A1 of v1 (t) and of orientation equal to the phase 𝜙1 of v1 (t). Since v1 (t) = 3sin(𝜔t) = 3cos(𝜔t − 90∘ ), it follows that A1 = 3 V and 𝜙1 = −90∘ . Similarly, v2 (t) = 4sin(𝜔t − 𝜋/3) = 4cos(𝜔t − 60∘ − 90∘ ) = 4cos(𝜔t − 150∘ ), so that v2 (t) is represented by phasor A2 ∠𝜙2 of amplitude A2 = 4 V and phase 𝜙2 = −150∘ . Next, we draw the phasor A2 ∠𝜙2 starting at the end of A1 ∠𝜙1 . The phasor A∠𝜙 obtained by joining the starting point of A1 ∠𝜙1 to the endpoint of A2 ∠𝜙2 gives the resultant voltage v(t) in both amplitude A and phase 𝜙. This phasor diagram is shown in Figure 2.25a. Note from the diagram that the desired phase is 𝜙 = −(90 + 𝛿)∘ .
Figure 2.24 (a) Resolving a phasor $A_k \angle \phi_k$ into its in-phase component $A_{kI} = A_k \cos(\phi_k)$ and quadrature component $A_{kq} = A_k \sin(\phi_k)$; (b) Determining resultant amplitude A and phase ϕ. 1st quadrant: both $A_I$ and $A_q$ positive; 2nd quadrant: $A_I$ negative, $A_q$ positive; 3rd quadrant: both $A_I$ and $A_q$ negative; 4th quadrant: $A_I$ positive, $A_q$ negative.
Figure 2.25 (a) Phasor diagram for v(t) = v1(t) + v2(t) in Worked Example 2.7, in which the included angle is β = 360 − (150 + 90) = 120°; (b) Naming convention for angles and sides in Eq. (2.53); (c) Phasor diagram for v1(t) − v2(t).
To solve the triangle for A and ϕ, we first use the cosine rule to obtain A and then the sine rule to obtain ϕ. The cosine rule is used to solve a triangle (as in this case) where two sides and an included angle are known. The sine rule is simpler and is applicable in all other situations where up to three parameters of the triangle are known, including at least one side and at least one angle. In terms of the naming convention shown in Figure 2.25b, we have
$$\begin{aligned} \text{Cosine rule:} \quad & d^2 = b^2 + c^2 - 2bc\cos(D) \\ \text{Sine rule:} \quad & \frac{\sin(B)}{c} = \frac{\sin(C)}{b} = \frac{\sin(D)}{d} \end{aligned} \tag{2.53}$$
Thus
$$A^2 = 3^2 + 4^2 - 2 \times 3 \times 4 \times \cos(120^\circ) = 37$$
$$A = 6.08\ \text{volts}$$
And
$$\frac{\sin\delta}{4} = \frac{\sin(120^\circ)}{6.08}$$
which evaluates to δ = 34.73° and therefore ϕ = −90 − 34.73 = −124.73°. Hence, the resultant voltage is the sinusoid
$$v(t) = 6.08\cos(\omega t - 124.73^\circ) = 6.08\sin(\omega t - 34.73^\circ)\ \text{volts}$$
It is worth noting that this graphical approach may be used just as easily for the subtraction of sinusoids by reversing the direction (but maintaining the same line) of the phasor of the sinusoid to be subtracted. To illustrate, Figure 2.25c shows the phasor diagram for subtracting v2(t) from v1(t) to obtain a resultant amplitude A′ and phase ϕ′. Notice that in this case we move from the end of phasor $A_1 \angle \phi_1$ in the opposite direction to $A_2 \angle \phi_2$, which amounts to subtracting $A_2 \angle \phi_2$.
Worked Example 2.8
Using the analytic approach of Eqs. (2.51) and (2.52), obtain a sinusoidal expression for g(t) = 10sin(20t + 45°) − 20cos(20t) + 12cos(20t + 120°).
To solve this problem, we simply follow the steps of the approach, as discussed above and identified below.
$$\begin{aligned} g(t) &= 10\sin(20t + 45^\circ) - 20\cos(20t) + 12\cos(20t + 120^\circ) \\ &= 10\cos(20t - 45^\circ) - 20\cos(20t) + 12\cos(20t + 120^\circ) && \text{Step (i)} \\ &= 10\cos(20t - 45^\circ) + 20\cos(20t + 180^\circ) + 12\cos(20t + 120^\circ) && \text{Step (ii)} \end{aligned}$$
Steps (iii) & (iv):
$$\begin{aligned} A_I &= 10\cos(-45^\circ) + 20\cos 180^\circ + 12\cos 120^\circ = 7.0711 - 20 - 6 = -18.929 \\ A_q &= 10\sin(-45^\circ) + 20\sin 180^\circ + 12\sin 120^\circ = -7.0711 + 0 + 10.3923 = 3.321 \end{aligned}$$
Step (v):
$$\begin{aligned} A &= \sqrt{A_I^2 + A_q^2} = 19.22 \\ \alpha &= \tan^{-1}(|A_q / A_I|) = 9.95^\circ \\ \phi &= 180^\circ - \alpha = 170.05^\circ, \quad \text{since } A_I < 0,\ A_q > 0 \end{aligned}$$
Thus, g(t) = 19.22cos(20t + 170.05°)
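The analytic approach used in Worked Examples 2.7 and 2.8 is easy to check by computer. The Python sketch below is an illustration with a hypothetical function name; each sinusoid is first written as A·cos(ωt + ϕ), and the library function `math.atan2` is used in place of the quadrant-by-quadrant rule, since it resolves the quadrant from the signs of the two components directly.

```python
import math


def add_sinusoids(components):
    """Sum same-frequency sinusoids given as (amplitude, phase_deg) pairs.

    Each pair describes amplitude*cos(wt + phase). Returns the resultant
    (amplitude, phase_deg) with the phase in the range -180 to 180 degrees.
    """
    a_i = sum(a * math.cos(math.radians(p)) for a, p in components)  # in-phase
    a_q = sum(a * math.sin(math.radians(p)) for a, p in components)  # quadrature
    amplitude = math.hypot(a_i, a_q)
    phase_deg = math.degrees(math.atan2(a_q, a_i))
    return amplitude, phase_deg


# Worked Example 2.7: 3sin(wt) = 3cos(wt - 90), 4sin(wt - 60) = 4cos(wt - 150)
amp_27, phase_27 = add_sinusoids([(3, -90), (4, -150)])

# Worked Example 2.8 after Steps (i) and (ii)
amp_28, phase_28 = add_sinusoids([(10, -45), (20, 180), (12, 120)])

print(round(amp_27, 2), round(phase_27, 2))
print(round(amp_28, 2), round(phase_28, 2))
```

The small difference in the last decimal place of the Example 2.7 phase, compared with the graphical solution, comes from the rounding of A to 6.08 before the sine rule is applied there.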
Figure 2.26 The sum of harmonically related sinusoids is a periodic signal: $v_1(t) = 3\sin(2\pi f_1 t)$ and $v_2(t) = \sin(6\pi f_1 t)$ give $v(t) = v_1(t) + v_2(t)$, which is periodic but not sinusoidal.
2.7.3.3 Multiple Sinusoids of Different Frequencies
When two or more sinusoids of different frequencies are added together, the resulting waveform is no longer a single sinusoid, as in the previous cases. There is no general analytic method for obtaining the resultant waveform in this case. The sinusoids may be added graphically by plotting them over the desired time interval and adding the instantaneous values of the sinusoids at each point to obtain the value of the resultant waveform at that point. However, if the sinusoidal signals are harmonically related (i.e. the frequency of each of the sinusoids is an integer multiple of one fundamental frequency) then the resultant waveform is a periodic signal. Figure 2.26 shows an example of adding two sinusoids at frequencies f1 and 3f1 and respective amplitudes A1 > A2. The case of A1 < A2 would also produce a periodic waveform, albeit of a different shape.
2.7.3.4 Beats Involving Two Sinusoids
The addition of two sinusoids at different frequencies f1 and f2 is a very special case that produces a phenomenon known as beats, which has extensive applications in medicine, music, law enforcement, telecommunications, etc. In law enforcement, for example, the police may determine the speed of a moving vehicle by observing the beats between a sinusoidal radio signal transmitted by their radar gun and the sinusoidal signal received back by reflection from the target. The difference in frequency is in general referred to as beat frequency, but in scenarios involving relative motion between a signal source and a signal observer, it is usually called Doppler frequency in honour of the Austrian physicist Christian Andreas Doppler (1803–1853) who, in 1842, explained the impact of relative speed on observed frequency. To fully understand the phenomenon of beats it is helpful to derive a general expression for the sum of two sinusoids having equal amplitude but different frequencies. Consider the trigonometric identity in Eq. (B.6) of Appendix B
$$\cos A \cos B = \frac{1}{2}[\cos(A + B) + \cos(A - B)] \tag{i}$$
Let us make the following substitutions in the right-hand side
$$A + B = \theta_1 \tag{ii}$$
$$A - B = \theta_2 \tag{iii}$$
Adding together equations (ii) and (iii) yields
$$A = \frac{\theta_1 + \theta_2}{2} \tag{iv}$$
whereas subtracting (iii) from (ii) yields
$$B = \frac{\theta_1 - \theta_2}{2} \tag{v}$$
Substituting Eq. (ii) to (v) into (i) yields a trigonometric identity for the sum of two sinusoids as follows
$$\cos(\theta_1) + \cos(\theta_2) = 2\cos\left(\frac{\theta_1 + \theta_2}{2}\right)\cos\left(\frac{\theta_1 - \theta_2}{2}\right)$$
Now replacing $\theta_1$ with $2\pi f_1 t$ and $\theta_2$ with $2\pi f_2 t$ we obtain an expression for the sum of two sinusoidal signals of frequencies $f_1$ and $f_2$

$$\cos(2\pi f_1 t) + \cos(2\pi f_2 t) = \left[2\cos\left(2\pi\frac{f_1 - f_2}{2}t\right)\right]\cos\left(2\pi\frac{f_1 + f_2}{2}t\right) \tag{2.54}$$

If $f_1$ and $f_2$ are close in value then the right-hand side will appear as a sinusoid of frequency equal to the average value $(f_1 + f_2)/2$ having a slowly varying amplitude given by the term in square brackets. This term is therefore referred to as the envelope of the sum signal. Variation in the envelope occurs due to a constantly changing phase difference between the two sinusoids. The higher-frequency sinusoid oscillates faster, 'overtaking' the lower-frequency sinusoid exactly $f_1 - f_2$ times each second. At the point of 'overtaking', their two phasors are momentarily coincident so that the two signals directly add to create a strong resultant signal. On the other hand, when the two phasors are half a cycle apart, the two signals cancel through direct subtraction, giving rise to a weak resultant signal. This interaction produces a periodic variation in resultant amplitude or intensity (or perceived loudness), known as beat, at a frequency equal to $f_1 - f_2$. Figure 2.27 shows the beats produced when two sinusoids of frequencies 50 Hz and 46 Hz and equal amplitude are summed. Notice how the envelope varies periodically and the peak points are spaced apart by 0.25 s, which is therefore the beat period, so that the beat frequency is 4 Hz, i.e. 50 − 46 Hz as expected.

To comment further on applications, the phenomenon of beat frequency is relied upon in music for tuning instruments. To tune an instrument producing a tone at frequency $f_1$ to a slightly different tone at frequency $f_2$, you play the two sounds together and listen to the beats while tuning. You know that the two sounds have equal frequency when the beats disappear. However, for the variation in sound volume (i.e. beats) to be audibly discernible the frequency difference must be small, for example no more than about 20–30 Hz; otherwise, only a tone at the average frequency will be perceived.
Figure 2.27 Summing two sinusoids at frequencies f 1 = 50 Hz, f 2 = 46 Hz produces a beat frequency f 1 − f 2 = 4 Hz due to the amplitude of the sum signal varying periodically between high (H) and low (L). [Plot: sum signal and beat envelope versus time t (s) from 0 to 1 s, with envelope highs (H) and lows (L) marked and a beat period of 0.25 s.]
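The beat relationship of Eq. (2.54) can be checked numerically. The following is a minimal sketch, assuming the 50 Hz and 46 Hz values of Figure 2.27; the function names are illustrative and not taken from the text.

```python
import math

# Frequencies used in Figure 2.27 (Hz)
f1, f2 = 50.0, 46.0

def sum_signal(t):
    """Left-hand side of Eq. (2.54): direct sum of the two sinusoids."""
    return math.cos(2 * math.pi * f1 * t) + math.cos(2 * math.pi * f2 * t)

def envelope(t):
    """Slowly varying amplitude term (the bracketed factor in Eq. 2.54)."""
    return 2 * math.cos(2 * math.pi * (f1 - f2) / 2 * t)

def carrier(t):
    """Sinusoid at the average frequency (f1 + f2)/2."""
    return math.cos(2 * math.pi * (f1 + f2) / 2 * t)

# Compare both sides of the identity on a 0..1 s grid (0.1 ms steps)
ts = [k / 10000 for k in range(10001)]
max_err = max(abs(sum_signal(t) - envelope(t) * carrier(t)) for t in ts)

# The envelope magnitude peaks f1 - f2 times per second
beat_period = 1 / abs(f1 - f2)

print(f"max identity error = {max_err:.2e}")
print(f"beat period = {beat_period} s")
```

The envelope magnitude is largest at t = 0, 0.25 s, 0.5 s, and so on, and passes through zero midway between these instants, which is consistent with the H and L markings in the figure.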
2.7.4 Multiplication of Sinusoids

Multiplication of sinusoids finds numerous applications in telecommunication and takes place, for example, in a mixer circuit that translates a signal from one radio frequency to another. It is important to become thoroughly familiar with the trigonometric identities presented in Appendix B, since they will feature prominently in our subsequent study. Note the identities (B.5) to (B.7) for the product of two sinusoids. Therefore, consider two sinusoidal signals at frequencies $f_1$ and $f_2$

$$v_1(t) = A_1\cos(2\pi f_1 t)$$
$$v_2(t) = A_2\cos(2\pi f_2 t)$$

Their product follows from Eq. (B.6) with $A = 2\pi f_1 t$ and $B = 2\pi f_2 t$

$$v(t) = v_1(t)v_2(t) = \frac{A_1 A_2}{2}\{\cos[2\pi(f_2 - f_1)t] + \cos[2\pi(f_2 + f_1)t]\}$$

This result shows that multiplying two sinusoids generates two new frequencies at the sum and difference of the original frequencies. A study of some of the applications of this principle is covered in later chapters.
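As a quick numerical check of the product result, the sketch below compares the direct product of two sinusoids with the sum-and-difference form. The amplitudes and frequencies are arbitrary illustrative values, and the function names are hypothetical.

```python
import math

def product(t, a1, f1, a2, f2):
    """Direct product v1(t) * v2(t) of the two sinusoids."""
    return a1 * math.cos(2 * math.pi * f1 * t) * a2 * math.cos(2 * math.pi * f2 * t)

def sum_and_difference(t, a1, f1, a2, f2):
    """Equivalent form: components at f2 - f1 and f2 + f1, each of amplitude a1*a2/2."""
    return 0.5 * a1 * a2 * (
        math.cos(2 * math.pi * (f2 - f1) * t) + math.cos(2 * math.pi * (f2 + f1) * t)
    )

# Illustrative values only (not from the text)
a1, f1, a2, f2 = 2.0, 100.0, 3.0, 130.0

ts = [k / 10000 for k in range(1001)]  # 0 to 0.1 s
max_err = max(
    abs(product(t, a1, f1, a2, f2) - sum_and_difference(t, a1, f1, a2, f2)) for t in ts
)
print(f"max difference between the two forms = {max_err:.2e}")
```

With these values the product contains components at 30 Hz and 230 Hz only; neither of the original frequencies (100 Hz, 130 Hz) is present.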
2.8 Logarithmic Units

A signal is subject to gains and losses at various stages of transmission through a communication system. To determine the signal power at a given point in the system, one multiplies the input signal power by all the power gains and divides by all the power losses experienced by the signal up to that point. This is a linear-units procedure involving multiplication and division. Signal power is in watt (W) and gains and losses are dimensionless numbers greater than 1. In Figure 2.28, an input power $P_{\text{in}}$ = 3.16 mW enters the transmission system. The power at various points in the system is

$$P_A = 3.16\ \text{mW} \times 10 = 31.6\ \text{mW}$$
$$P_B = 3.16\ \text{mW} \times 10 \div 63.1 = 0.50\ \text{mW}$$
$$P_{\text{out}} = 3.16\ \text{mW} \times 10 \div 63.1 \times 10 = 5.01\ \text{mW}$$

Figure 2.28 Gains and losses in transmission system: Linear-units procedure. [Block diagram: Pin = 3.16 mW → Gain × 10 → PA → Loss ÷ 63.1 → PB → Gain × 10 → Pout.]

We can simplify the process of signal power computation in a transmission system by adopting units of measure of power, gains, and losses that transform a complex series of multiplication and division into additions and subtractions. Logarithmic units furnish this transformation. The logarithm of a number (to base 10), denoted $\log_{10}$, is the power to which 10 is raised to obtain that number. For example, because $10^2 = 100$, we say that the logarithm of 100 is 2 and because $10^0 = 1$, we say that the logarithm of 1 is zero, which we write as $\log_{10} 100 = 2$, and $\log_{10} 1 = 0$. Consider two positive numbers A and B whose logarithms (to base 10) are x and y, respectively. We may write

$$\begin{aligned} \log_{10}(A) &= x \quad \text{(a)} \\ \log_{10}(B) &= y \quad \text{(b)} \end{aligned} \tag{2.55}$$

It follows that

$$\begin{aligned} 10^x &= A \quad \text{(a)} \\ 10^y &= B \quad \text{(b)} \end{aligned} \tag{2.56}$$
Multiplying Eq. (2.56) (a) and (b) together

$$10^x 10^y = 10^{x+y} = AB$$

And it follows by the above definition that

$$\log_{10}(AB) = x + y = \log_{10}(A) + \log_{10}(B)$$

Note that we have made use of Eq. (2.55). Now dividing Eq. (2.56) (a) by (b)

$$\frac{10^x}{10^y} = 10^{x-y} = \frac{A}{B}$$

It similarly follows that

$$\log_{10}\left(\frac{A}{B}\right) = x - y = \log_{10}(A) - \log_{10}(B)$$

To summarise, the following relations hold for logarithms to any base

$$\begin{aligned} \log(AB) &= \log(A) + \log(B) \\ \log\left(\frac{A}{B}\right) &= \log(A) - \log(B) \end{aligned} \tag{2.57}$$

Putting A = 1

$$\log\left(\frac{1}{B}\right) = \log(1) - \log(B) = 0 - \log(B) = -\log(B) \tag{2.58}$$
Observe that multiplication (A × B) is replaced by the addition of logarithms, division (A/B) by subtraction of logarithms, and inversion (1/B) by changing the sign of the logarithm. For example

$$\log_{10}(10^3) = \log_{10}(10 \times 10 \times 10) = \log_{10}(10) + \log_{10}(10) + \log_{10}(10) = 3 \times \log_{10}(10) = 3$$

In general

$$\log(A^b) = b\log(A) \tag{2.59}$$
where b is any real number. In logarithmic units therefore, the output signal power of a transmission system is obtained by adding the system gains to the input power and subtracting the system losses. Of course, every quantity, including both the input and output powers, must be expressed in logarithmic units. The most used logarithmic unit in the field of engineering is the decibel (dB). The decibel unit is one-tenth of a bel (B), a logarithmic unit that is no longer in use. It is common to use gain as a generic term for both a boost in signal strength (which is actual gain) and a reduction in signal strength (which is actual loss). In logarithmic units such as the decibel, a positive gain then indicates an increase in signal strength, whereas a negative value for gain indicates a loss or reduction in signal strength. So, for example, we may refer to a gain of 18 dB or a gain of −10 dB. The latter means a loss of 10 dB, but we would not normally refer to the former as a loss of −18 dB even though such a description would be mathematically correct.
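Figure 2.29 System gain. [Block diagram: a system of gain G with input current I1, input voltage V1, and input power P1 at input resistance Z1, and output current I2, output voltage V2, and output power P2 at output resistance Z2.]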
2.8.1 Logarithmic Units for System Gain

Figure 2.29 represents an arbitrary system of total gain G and input power $P_1$, input current $I_1$, input voltage $V_1$, output power $P_2$, output current $I_2$, and output voltage $V_2$. The input and output resistances of the system are $Z_1$ and $Z_2$, respectively. Various gains of the system, expressed as dimensionless ratios, are

$$\begin{aligned} \text{Power gain} &= \frac{P_2}{P_1} \\ \text{Current gain} &= \frac{I_2}{I_1} \\ \text{Voltage gain} &= \frac{V_2}{V_1} \end{aligned} \tag{2.60}$$

The power gain of the system in dB is defined as

$$G = 10\log_{10}\left(\frac{P_2}{P_1}\right)\ \text{dB} \tag{2.61}$$

If dealing with normalised power, or if the input and output resistances of the system are equal (i.e. $Z_1 = Z_2$), then

$$\frac{P_2}{P_1} = \frac{V_2^2}{V_1^2} = \frac{I_2^2}{I_1^2}$$

Substituting in Eq. (2.61), we obtain

$$\begin{aligned} G &= 10\log_{10}\left(\frac{V_2^2}{V_1^2}\right) = 10\log_{10}\left(\frac{V_2}{V_1}\right)^2 \\ &= 20\log_{10}\left(\frac{V_2}{V_1}\right) \\ &= 20\log_{10}\left(\frac{I_2}{I_1}\right) \end{aligned} \tag{2.62}$$
Equations (2.61) and (2.62) show that, for power gain in dB, the constant of multiplication is 10, whereas for current and voltage gains in dB, the constant of multiplication is 20. This difference is extremely important and must always be remembered to avoid errors. It is worth emphasising that Eq. (2.61) for power gain does not depend on the values of system resistances, whereas Eq. (2.62) for voltage and current gains holds only if the system's input and output resistances are equal. A less commonly used logarithmic unit of gain is the neper (Np), defined as the natural logarithm of the ratio of output to input. This is logarithm to base e = 2.718281828459···, denoted ln. It follows that for the system in Figure 2.29

$$\text{Gain in neper (Np)} = \ln\left(\frac{V_2}{V_1}\right) = \ln\left(\frac{I_2}{I_1}\right) = \frac{1}{2}\ln\left(\frac{P_2}{P_1}\right) \tag{2.63}$$
To obtain the relationship between Np and dB, note in Eq. (2.63) that a current or voltage gain of 1 Np implies that

$$\log_e\left(\frac{V_2}{V_1}\right) = 1; \quad \text{or} \quad \frac{V_2}{V_1} = e^1 = e$$

From Eq. (2.62), the corresponding dB gain is

$$20\log_{10}\left(\frac{V_2}{V_1}\right) = 20\log_{10}(e) = 8.686\ \text{dB}$$

Similarly, a power gain of 1 Np means that

$$\frac{1}{2}\log_e\left(\frac{P_2}{P_1}\right) = 1, \quad \text{or} \quad \frac{P_2}{P_1} = e^2$$

The corresponding dB gain follows from Eq. (2.61)

$$10\log_{10}\left(\frac{P_2}{P_1}\right) = 10\log_{10}(e^2) = 20\log_{10}(e) = 8.686\ \text{dB}$$

The logarithmic units neper and decibel are therefore related as follows

$$\text{Voltage, current, or power gain of 1 Np} = 8.686\ \text{dB} \tag{2.64}$$
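The gain definitions of Eqs. (2.61) to (2.64) translate directly into a few helper functions. This is a minimal sketch with hypothetical function names; the voltage-gain helpers assume equal input and output resistances, as the text requires.

```python
import math

def power_gain_db(p_out, p_in):
    """Eq. (2.61): power gain in dB (constant of multiplication is 10)."""
    return 10 * math.log10(p_out / p_in)

def voltage_gain_db(v_out, v_in):
    """Eq. (2.62): voltage gain in dB (constant is 20); assumes Z1 == Z2."""
    return 20 * math.log10(v_out / v_in)

def voltage_gain_np(v_out, v_in):
    """Eq. (2.63): voltage gain in neper (natural logarithm of the ratio)."""
    return math.log(v_out / v_in)

def power_gain_np(p_out, p_in):
    """Eq. (2.63): power gain in neper (half the natural log of the power ratio)."""
    return 0.5 * math.log(p_out / p_in)

# Eq. (2.64): 1 Np expressed in dB
NP_TO_DB = 20 * math.log10(math.e)

print(f"1 Np = {NP_TO_DB:.3f} dB")
print(f"power ratio 100 -> {power_gain_db(100, 1):.1f} dB")
print(f"voltage ratio 10 -> {voltage_gain_db(10, 1):.1f} dB")
```

A voltage ratio of 10 and a power ratio of 100 describe the same system when the resistances are equal, and both evaluate to 20 dB.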
2.8.2 Logarithmic Units for Voltage, Power, and Other Quantities

The decibel and neper are units of relative measure. For example, in Eq. (2.61) the decibel value gives a measure of $P_2$ relative to $P_1$. By selecting a universally agreed reference level, we can express any signal power or voltage in dB relative to the agreed reference. In this way, absolute power level can be measured in dB and Np. Standard power reference levels are 1 W and 1 mW. Power expressed in dB relative to 1 W is said to be measured in dBW, whereas power expressed in dB relative to 1 mW is said to be measured in dBm. Logarithmic units for voltage measurement are designated dBV for a 1 V reference and dBu for a 775 mV reference. The 775 mV reference, often used in telephony, is the voltage that gives a power dissipation of 1 mW across a 600 Ω resistance. Thus

$$\begin{aligned} P\ (\text{watt}) &= 10\log_{10}\left(\frac{P}{1\ \text{W}}\right)\ \text{dBW} = 10\log_{10}P\ \text{dBW} \\ &= 10\log_{10}\left(\frac{P}{1 \times 10^{-3}\ \text{W}}\right)\ \text{dBm} = 30 + 10\log_{10}P\ \text{dBm} \end{aligned} \tag{2.65}$$

And

$$\begin{aligned} V\ (\text{volt}) &= 20\log_{10}(V)\ \text{dBV} \\ &= 20\log_{10}\left(\frac{V}{775 \times 10^{-3}}\right)\ \text{dBu} \\ &= 2.214 + 20\log_{10}V\ \text{dBu} \end{aligned} \tag{2.66}$$
Note that to convert power expressed in dBW to dBm, you simply add 30 to the dBW value; and to convert from dBV to dBu, you add 2.214 to the dBV value. Many other quantities may be expressed in logarithmic units in order to simplify computations. This practice is particularly common in communication link analysis and design where signal and noise powers are products of contributory parameters. By expressing every parameter in logarithmic units, these powers are more conveniently computed through summation; and the computation process may be laid out in a tabular format known as a link power budget. For example, we learn in Chapter 6 that noise power is given by

$$P_n = kTB \tag{2.67}$$
where $k = 1.38 \times 10^{-23}$ J/K or $1.38 \times 10^{-23}$ W/Hz/K ≡ $1.38 \times 10^{-20}$ mW/Hz/K is the Boltzmann constant, T is equivalent noise temperature in kelvin (K), and B is bandwidth in Hz. Noise power per unit bandwidth, denoted $N_o$, is given by

$$N_o = \frac{P_n}{B} = kT\ (\text{W/Hz}) \tag{2.68}$$
Expressing k in dB relative to 1 W/Hz/K gives it a value $k = 10\log_{10}(1.38 \times 10^{-23}) = -228.6$ dBW/Hz/K. If expressed in dB relative to 1 mW/Hz/K, then its value is k = −198.6 dBm/Hz/K. Similarly, T may be expressed in a unit of dB relative to 1 K, called dBK, and bandwidth in dB relative to 1 Hz, known as dBHz. For example, if T = 500 K and B = 2 MHz, then in logarithmic units

$$T = 10\log_{10}(500) = 27\ \text{dBK}$$
$$B = 10\log_{10}(2 \times 10^6) = 63\ \text{dBHz}$$
$$P_n = k + T + B = -228.6 + 27 + 63 = -138.6\ \text{dBW}$$

Note that if the Boltzmann constant is expressed in dBm/Hz/K, then the resulting noise power is in dBm, which in this example would be −108.6 dBm. As a further application, the energy per bit $E_b$ at the output of a receive-antenna in a free space line-of-sight communication link scenario is given by the expression

$$E_b = \frac{P_t G_t G_r}{L_s L_a R_b}\ \text{W/(bit/s)} \tag{2.69}$$
We may express all the above parameters in logarithmic units, namely transmit signal power $P_t$ in dBW, transmit-antenna gain $G_t$ in dB, receive-antenna gain $G_r$ in dB, free space path loss $L_s$ in dB, additional losses $L_a$ in dB, and link bit rate $R_b$ in dBbit/s to obtain $E_b$ in dBW/(bit/s) and hence, with $N_o = kT$ from Eq. (2.68), the important ratio $E_b/N_o$ in dB as

$$\frac{E_b}{N_o} = P_t + G_t + G_r - L_s - L_a - R_b - k - T\ (\text{dB}) \tag{2.70}$$
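The noise-power example above (T = 500 K, B = 2 MHz) can be reproduced in a few lines. The sketch below is illustrative only; it computes $P_n$ = kTB once by summing logarithmic values and once by multiplying linear values, and confirms that the two routes agree. The variable and function names are assumptions.

```python
import math

def to_db(x):
    """Express a positive quantity in dB relative to one unit of that quantity."""
    return 10 * math.log10(x)

k = 1.38e-23   # Boltzmann constant, W/Hz/K (value used in the text)
T = 500.0      # equivalent noise temperature, K
B = 2e6        # bandwidth, Hz

# Logarithmic route: add the dB values of k, T, and B
k_dbw = to_db(k)        # dBW/Hz/K
t_dbk = to_db(T)        # dBK
b_dbhz = to_db(B)       # dBHz
pn_dbw = k_dbw + t_dbk + b_dbhz
pn_dbm = pn_dbw + 30    # same power expressed relative to 1 mW

# Linear route: multiply, then convert once
pn_watts = k * T * B

print(f"k = {k_dbw:.1f} dBW/Hz/K, T = {t_dbk:.1f} dBK, B = {b_dbhz:.1f} dBHz")
print(f"Pn = {pn_dbw:.1f} dBW = {pn_dbm:.1f} dBm")
print(f"check: 10*log10(kTB) = {to_db(pn_watts):.1f} dBW")
```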
It is important to note that we are here following the standard practice of employing the same notation for a parameter whether it is expressed in linear or logarithmic units. For example, transmit signal power is denoted Pt whether it is in watts (W) as in Eq. (2.69) or in dBW as in Eq. (2.70). You must therefore be careful to recognise, based on each usage context, which unit (linear or logarithmic) is in play. In general, if an expression involves the product of multiplicative parameters, as in Eq. (2.69) then the units are linear, but if it is the sum of such parameters, as in Eq. (2.70) then logarithmic units are involved. A psophometer is often used in speech telephony to measure the amount of power in noise and crosstalk. Although the noise spectrum spans the entire receiver bandwidth, the human ear is less sensitive to some of the spectral components, which will therefore have a less annoying effect. Human ear sensitivity is greatest between about 500 and 2000 Hz. The psophometer weights the noise spectrum to take account of the non-flat frequency response of the ear and the receiving equipment. It reduces noise power at each spectral frequency point in proportion to the reduced sensitivity of the ear at that point. The weighting has a peak at 800 Hz and gives a noise measurement that is smaller than would be the case in the absence of weighting, but which gives a better indication of how a human recipient perceives the noise. When the psophometer is used, a suffix ‘p’ is added to whatever unit is employed. Thus, we may have dBmp, pWp, etc. for psophometrically weighted dBm and picowatt, respectively. Psophometrically weighted noise power is less than unweighted white noise power by 2.5 dB over a 3.1 kHz bandwidth and by 3.6 dB over a 4 kHz bandwidth. White noise is discussed later in the book.
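2.8.3 Logarithmic Unit Dos and Don'ts

The following relations apply to the logarithmic units for voltage and power

$$\begin{aligned} P_{\text{dBm}} &= P_{\text{dBW}} + 30 \\ V_{\text{dBu}} &= V_{\text{dBV}} + 2.214 \\ P_{\text{dBmp}} &= P_{\text{dBm}} - 2.5, \quad \text{over 3.1 kHz} \\ P_{\text{dBmp}} &= P_{\text{dBm}} - 3.6, \quad \text{over 4 kHz} \end{aligned} \tag{2.71}$$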
The way to interpret the above equations is, for example, that you add 30 to the dBW value of power to obtain its value expressed in dBm, and so on. Based on Eqs. (2.65) and (2.66), we can convert a power value $P_{\text{dBW}}$ (in dBW) to $P_{\text{W}}$ (in watts); a power value $P_{\text{dBm}}$ (in dBm) to $P_{\text{mW}}$ (in mW), and a voltage value $V_{\text{dBV}}$ (in dBV) to $V_{\text{v}}$ (in volts), using the following relations

$$\begin{aligned} P_{\text{W}} &= 10^{(P_{\text{dBW}}/10)} \\ P_{\text{mW}} &= 10^{(P_{\text{dBm}}/10)} \\ V_{\text{v}} &= 10^{(V_{\text{dBV}}/20)} \end{aligned} \tag{2.72}$$
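The conversions in Eqs. (2.65), (2.66), (2.71), and (2.72) can be collected into small helper functions. The following is a sketch with hypothetical function names, not a reference implementation.

```python
import math

def watts_to_dbw(p_w):
    """Eq. (2.65): power in watts -> dBW."""
    return 10 * math.log10(p_w)

def watts_to_dbm(p_w):
    """Eq. (2.65): power in watts -> dBm (dBW value plus 30)."""
    return watts_to_dbw(p_w) + 30

def dbw_to_watts(p_dbw):
    """Eq. (2.72): dBW -> watts."""
    return 10 ** (p_dbw / 10)

def dbm_to_mw(p_dbm):
    """Eq. (2.72): dBm -> milliwatts."""
    return 10 ** (p_dbm / 10)

def volts_to_dbv(v):
    """Eq. (2.66): voltage in volts -> dBV."""
    return 20 * math.log10(v)

def dbv_to_dbu(v_dbv):
    """Eq. (2.71): dBV -> dBu, i.e. add 20*log10(1/0.775), about 2.214."""
    return v_dbv + 20 * math.log10(1 / 0.775)

def dbv_to_volts(v_dbv):
    """Eq. (2.72): dBV -> volts."""
    return 10 ** (v_dbv / 20)

print(f"1 mW   = {watts_to_dbm(1e-3):.1f} dBm")
print(f"-3 dBm = {dbm_to_mw(-3):.3f} mW")
print(f"0 dBV  = {dbv_to_dbu(0.0):.3f} dBu")
```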
It is worth taking the time to familiarise yourself with the following checklist of the dos and don'ts of logarithmic units to avoid common pitfalls:

● Do not add together two dBW (or dBm) values. Doing so would amount to multiplying two powers to obtain a quantity in square watts, which does not exist. If you wish to obtain the sum or difference of two powers, you must first express each power in linear units (e.g. W or mW) before combining them.
● Do not multiply together (or divide) two logarithmic values. Doing so would amount to raising one quantity to the power of the logarithmic value (or to the power of the reciprocal of the logarithmic value) of the other quantity, which likely is not your intended operation.
● Do not multiply the logarithmic value of a quantity by a constant, say n, unless your intention is to raise that quantity to power n. Similarly, do not divide by n unless you wish to raise the quantity to power 1/n. However, if, for example, your actual intention is to share a power $P_{\text{dBW}}$ in dBW into n equal parts then perform this operation as $P_{\text{dBW}} - 10\log_{10} n$ to obtain the desired 1/n portion of the power in dBW. Alternatively, convert $P_{\text{dBW}}$ to $P_{\text{W}}$ in watts before dividing by n.
● Do not subtract two powers that are expressed in dissimilar logarithmic units. For example, do not subtract a dBm value from a dBW value.
● Do not pluralise the dB unit. In line with the convention of using logarithmic units to express absolute values, the notation dBs means dB relative to one second. The unit of decibel should therefore always be singular to avoid this confusion. Thus, it is a gain of 20 dB, not 20 dBs.
You may add together or subtract various logarithmic values if your intention is to obtain a product or division of the quantities represented by those values. For example, noise power, given by the formula Pn = kTB, may be calculated by adding together the logarithmic values of k, T, and B. The logarithmic value of Eb /N o may be calculated by subtracting the logarithmic value of N o from that of Eb . And the figure of merit G/T of a wireless receiver system may be found by subtracting the receiver’s system noise temperature in dBK from the receive-antenna gain in dB. You may subtract or add dB values to a dBW (or dBm, etc.) value to obtain a result in dBW (or dBm, etc.). This operation amounts to scaling the quantity by loss or gain factors. You may subtract two dBW (or two dBm, etc.) values to obtain a result in dB. This operation amounts to obtaining the ratio between the two quantities. For example, subtracting input power in dBW from output power in dBW gives the gain of the system in dB. And subtracting noise power in dBW from signal power in dBW yields a signal-to-noise ratio in dB.
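Two of the rules above lend themselves to a short illustration: combining powers must be done in linear units, and sharing a power into n equal parts is a subtraction of $10\log_{10} n$. The sketch below uses hypothetical function names and arbitrary example values.

```python
import math

def dbm_to_mw(p_dbm):
    """dBm -> mW, as in Eq. (2.72)."""
    return 10 ** (p_dbm / 10)

def mw_to_dbm(p_mw):
    """mW -> dBm, as in Eq. (2.65)."""
    return 10 * math.log10(p_mw)

def sum_powers_dbm(*levels_dbm):
    """Total power of several signals: convert to mW, add, convert back to dBm."""
    return mw_to_dbm(sum(dbm_to_mw(p) for p in levels_dbm))

def split_power_dbw(p_dbw, n):
    """Power in each of n equal shares of a power given in dBW."""
    return p_dbw - 10 * math.log10(n)

# Two 10 dBm signals together carry about 13 dBm, not 20 dBm
total = sum_powers_dbm(10, 10)
print(f"10 dBm + 10 dBm (power sum) = {total:.2f} dBm")

# Sharing 20 dBW into 4 equal parts gives about 14 dBW each, not 5 dBW
share = split_power_dbw(20, 4)
print(f"20 dBW shared 4 ways = {share:.2f} dBW each")
```

Naively adding the two dBm values would give 20 dBm (100 mW), five times the true total of 20 mW.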
Always use the correct logarithmic unit for each quantity. Specifically, note that the unit of dB is applicable only to gains, losses, and the ratio between two quantities having the same linear unit. The logarithmic units for bandwidth, equivalent noise temperature, power, wireless receiver system's figure of merit, etc. are not dB but rather dBHz, dBK, dBW, dB/K, etc., respectively.

You may calculate the logarithm of a positive number X to any arbitrary base b (where b > 0 and b ≠ 1) in terms of logarithm to base 10 as follows

$$\log_b X = \frac{\log_{10} X}{\log_{10} b} \tag{2.73}$$

This equation comes in handy in situations where your calculator does not have a function to perform logarithm to your desired base. For example

$$\log_3 81 = \frac{\log_{10} 81}{\log_{10} 3} = \frac{1.9085}{0.4771} = 4$$
$$\log_2 2048 = \frac{\log_{10} 2048}{\log_{10} 2} = \frac{3.3113}{0.3010} = 11$$

Worked Example 2.9

Let us now apply logarithmic units to the transmission system of Figure 2.28 to determine power levels at various points in the system.

$$P_{\text{in}} = 3.16\ \text{mW} = 10\log_{10}\left(\frac{3.16\ \text{mW}}{1\ \text{mW}}\right) = 5.0\ \text{dBm}$$

Gain of 1st element = 10 (ratio) = $10\log_{10}(10)$ = 10 dB
Loss of 2nd element = 63.1 (ratio) = $10\log_{10}(63.1)$ = 18 dB
Gain of 3rd element = 10 (ratio) = $10\log_{10}(10)$ = 10 dB

The power levels $P_A$, $P_B$, and $P_{\text{out}}$ now follow by simple addition (subtraction) of the gains (losses) of the relevant elements. See Figure 2.30.

$$P_A = 5\ \text{dBm} + 10 = 15\ \text{dBm}$$
$$P_B = 5\ \text{dBm} + 10 - 18 = -3\ \text{dBm}$$
$$P_{\text{out}} = 5\ \text{dBm} + 10 - 18 + 10 = 7\ \text{dBm}$$

Figure 2.30 Gains and losses in transmission system: Logarithmic-units procedure. [Block diagram: Pin = 5 dBm → Gain 10 dB → PA → Loss 18 dB → PB → Gain 10 dB → Pout.]
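The agreement between the linear-units and logarithmic-units procedures for the system of Figure 2.28 can be checked with a short script. This is a sketch under the figure's stated values (3.16 mW input; ×10, ÷63.1, ×10 stages); the variable names are illustrative.

```python
import math

p_in_mw = 3.16                      # input power, mW
stage_ratios = [10, 1 / 63.1, 10]   # linear power ratio of each stage (loss as ratio < 1)

# Linear-units procedure: multiply through the chain
p_linear_mw = [p_in_mw]
for ratio in stage_ratios:
    p_linear_mw.append(p_linear_mw[-1] * ratio)

# Logarithmic-units procedure: add the dB value of each stage
p_log_dbm = [10 * math.log10(p_in_mw)]
for ratio in stage_ratios:
    p_log_dbm.append(p_log_dbm[-1] + 10 * math.log10(ratio))

for name, mw, dbm in zip(["Pin", "PA", "PB", "Pout"], p_linear_mw, p_log_dbm):
    print(f"{name}: {mw:.2f} mW = {dbm:.1f} dBm")
```

Rounded to the nearest decibel, the four levels are 5, 15, −3, and 7 dBm, matching the worked example.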
You may verify that these results agree with those obtained earlier using linear units. For example, using Eq. (2.72), $P_B = -3\ \text{dBm} = 10^{(-3/10)}\ \text{mW} = 0.5\ \text{mW}$, as obtained earlier. Note that we could have converted −3 dBm to mW without using a calculator by observing that −3 dBm means 3 dB below 1 mW, which means a factor of 2 below 1 mW, which means 0.5 mW.

Worked Example 2.10 Decibel Without Calculators

We wish to learn three simple steps which may be employed to obtain quick conversions between logarithmic and linear values in a wide range of cases without the need for a calculator. It will serve you well in future if you take a moment to learn these tricks.
Table 2.3 Conversion of ratio to dB.

Ratio or Number | Expressed as Factors | Converted to dB
8 | 2 × 2 × 2 or $2^3$ | 3 + 3 + 3 or 3 × 3 = 9 dB
$2^n$ | | 3n dB
200 | 100 × 2 | 20 + 3 = 23 dB
60 | 10 × 2 × 3 | 10 + 3 + 4.77 = 17.77 dB
1/2 | $2^{-1}$ | 3 × (−1) = −3 dB
500 | 1000 ÷ 2 | 30 − 3 = 27 dB
$5 \times 10^{-23}$ | | 7 + (−230) = −223 dB
1/800 | 1 ÷ (100 × 8) | 0 − (20 + 9) = −29 dB
Step 1: Note that the dB value of any ratio (or number) that is a power of 10 does not require a calculator since it is simply given by

$$10^n\ (\text{ratio}) = 10n\ (\text{dB}) \tag{2.74}$$

For example

$$1\ (\text{ratio}) \equiv 10^0\ (\text{ratio}) = 10 \times 0\ (\text{dB}) = 0\ \text{dB}$$
$$100\ (\text{ratio}) \equiv 10^2\ (\text{ratio}) = 10 \times 2\ (\text{dB}) = 20\ \text{dB}$$
$$1/10000\ (\text{ratio}) \equiv 10^{-4}\ (\text{ratio}) = 10 \times (-4)\ (\text{dB}) = -40\ \text{dB}$$

Step 2: Know by heart the dB values of a few prime factors. You do not need to memorise them. Just interact with them for a little while and you will know them just as you know your own name without having to claim that you memorised it. Here are the few you need to know by heart

$$\begin{aligned} 2\ (\text{ratio}) &= 3\ \text{dB} \\ 3\ (\text{ratio}) &= 4.77\ \text{dB} \\ 5\ (\text{ratio}) &= 7\ \text{dB} \\ 7\ (\text{ratio}) &= 8.45\ \text{dB} \end{aligned} \tag{2.75}$$
Step 3: You may then convert any number to dB (without needing a calculator) if you can express that number as the product of factors with known dB values. You then simply add up the dB values of those factors to obtain the desired dB value of the number. Note that there will sometimes be more than one way of factorising. For example, 500 may be written as 100 × 5 or as 1000 × 1/2, but the dB result will be the same. These steps may be applied in reverse to convert a dB value to its ratio equivalent as follows: write the dB value as the sum of two or more terms each of whose ratio equivalent is known. The desired result is simply the product of these component ratio equivalents. A few examples are tabulated below. Table 2.3 gives ratio-to-dB conversions, whereas Table 2.4 shows dB-to-ratio conversions. The character ÷ in the tables denotes division operation.
Table 2.4 Conversion of dB values to ratio.

dB Value | Expressed as a Sum | Converted to Ratio
14 | 7 + 7 | 5 × 5 = 25
4 | 7 − 3 | 5 ÷ 2 = 2.5
44 | 40 + 7 − 3 | $10^4 \times 5/2 = 2.5 \times 10^4$
1 | 10 − 9 | 10 ÷ 8 = 1.25
−53 | 7 − 60 | $5 \times 10^{-6}$
−111.55 | 8.45 − 120 | $7 \times 10^{-12}$
87 | 7 + 80 | $5 \times 10^8$
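The three steps can be mimicked in code by summing the memorised dB values of the factors of a number. The sketch below is illustrative only: the lookup table holds the rounded values of Eqs. (2.74) and (2.75), and the function name and the (base, exponent) input format are assumptions made for this example.

```python
import math

# Rounded dB values from Steps 1 and 2 (10 -> 10 dB; 2, 3, 5, 7 as in Eq. 2.75)
KNOWN_DB = {2: 3.0, 3: 4.77, 5: 7.0, 7: 8.45, 10: 10.0}

def approx_db(factors):
    """Step 3: add up the known dB values of the factors.

    `factors` is a list of (base, exponent) pairs whose product
    base**exponent * ... equals the number being converted.
    """
    return sum(KNOWN_DB[base] * exponent for base, exponent in factors)

examples = {
    "60 = 10 x 2 x 3": ([(10, 1), (2, 1), (3, 1)], 60),
    "500 = 10^3 / 2": ([(10, 3), (2, -1)], 500),
    "1/800 = 1 / (10^2 x 2^3)": ([(10, -2), (2, -3)], 1 / 800),
    "5e-23 = 5 x 10^-23": ([(5, 1), (10, -23)], 5e-23),
}

for label, (factors, value) in examples.items():
    quick = approx_db(factors)
    exact = 10 * math.log10(value)
    print(f"{label}: quick = {quick:.2f} dB, exact = {exact:.2f} dB")
```

The quick values reproduce the entries of Table 2.3 and stay within about 0.05 dB of the exact figures for these examples, since the memorised values are themselves rounded.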
2.9 Calibration of a Signal Transmission Path

A transmission path imposes continual attenuation on a signal, which necessitates the placement of amplifiers (called repeaters in analogue systems) at regular intervals along the path to boost the signal strength. To trace the power level of a signal from its source through various points along a transmission path, we may calibrate the transmission path relative to a chosen reference point in order to exploit the computational advantage offered by logarithmic units. This point is called the zero-level reference point (ZRP) and is usually set around the input to the link. For a four-wire circuit, the ZRP is usually the two-wire input to the hybrid transformer. The transmission path is calibrated by assigning to every point on the link a transmission level (in dB) relative to the ZRP, hence the unit dBr. As illustrated in Figure 2.31, the assigned dBr value is the algebraic sum of the dB gains (a loss being accounted as negative gain) from the ZRP to that point. This applies to points lying beyond the ZRP in the forward path direction. The ZRP is usually chosen at the input, and therefore all points would normally fall into this category. However, if for reasons of accessibility the ZRP is located other than at the input then the dBr of each point lying before the ZRP is obtained by negating the algebraic sum of all gains from that point to the ZRP.

For example, a point lying beyond the ZRP and separated from it by a 10 dB gain amplifier is assigned 10 dBr. A point lying beyond the ZRP and separated from it by 18 dB loss (G = −18 dB) and 15 dB gain (G = +15 dB) is marked −18 + 15 = −3 dBr. A point located before the ZRP and separated from it by 10 dB loss (G = −10 dB) and 25 dB gain is marked −(−10 + 25) = −15 dBr. The ZRP itself is marked 0 dBr since it is the reference. Absolute signal power measured in dBm or dBW at the ZRP is expressed as dBm0 or dBW0, respectively.
Figure 2.31 Transmission levels and the zero-level reference point (ZRP). [Block diagram: ZRP at 0 dBr (signal +5 dBm0) → G = 10 dB → +10 dBr (+15 dBm) → L = 18 dB or G = −18 dB → −8 dBr (−3 dBm) → G = 10 dB → +2 dBr (+7 dBm).]
The absolute power level $P_{\text{dBm}}$ of a signal (in dBm) at any point on the link is determined by adding the link's dBr mark at that point to the signal's dBm0 value, i.e. the power level of the signal at ZRP, denoted $P_{\text{dBm0}}$. That is

$$P_{\text{dBm}} = P_{\text{dBm0}} + \text{dBr} \tag{2.76}$$
where
$P_{\text{dBm}}$ = Signal power at the given point
$P_{\text{dBm0}}$ = Signal power at the ZRP
dBr = Relative transmission level of the given point.

Equation (2.76) may also be used to determine the power of a signal at the entry point into the link (i.e. $P_{\text{dBm0}}$). This is given by the difference between the signal power at an arbitrary point along the link (i.e. $P_{\text{dBm}}$) and the dBr value of the point. For example, in Figure 2.31, if we measure the signal level at the −8 dBr point and find that it is −3 dBm, then we know that the level of the signal at the entry (i.e. ZRP) point is −3 − (−8) = 5 dBm0.

Worked Example 2.11

A transmission system consists of the following gain and loss components in the listed order:

(1) Loss = 30 dB
(2) Gain = 50 dB
(3) Loss = 8 dB
(4) Loss = 12 dB
(5) Gain = 25 dB

Draw a block diagram of the transmission system and calibrate it in dBr with the ZRP located at

(a) The input of the first component
(b) The input of the fourth component

A block diagram of the system is shown in Figure 2.32. Note that we entered the loss as negative gain in order to simplify the algebraic summation involved in the calibration. The calibration for ZRP at the input of the first component is shown on the upper part of the block diagram and that for ZRP at the input of the fourth component is shown on the lower part. The procedure used is as earlier described. Points lying beyond the ZRP have a dBr value equal to the algebraic sum of the gains up to that point. Points lying before the ZRP (in this case, the first three components in (b)) have a dBr value equal to the negated algebraic sum of the gains from the ZRP to the point.

Figure 2.32 Worked Example 2.11. [Block diagram: G = −30 dB → G = 50 dB → G = −8 dB → G = −12 dB → G = 25 dB. (a) Upper calibration, from input to output: 0 dBr (ZRP), −30 dBr, +20 dBr, +12 dBr, 0 dBr, +25 dBr. (b) Lower calibration, from input to output: −12 dBr, −42 dBr, +8 dBr, 0 dBr (ZRP), −12 dBr, +13 dBr.]
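The calibration procedure of Worked Example 2.11 amounts to a running sum of gains referred to the chosen ZRP. The sketch below is one possible way to express it; the function name and the point-indexing convention (point 0 is the link input, point i is the output of component i) are assumptions made for illustration.

```python
from itertools import accumulate

def calibrate_dbr(gains_db, zrp_index=0):
    """Return the dBr mark of every point on the link.

    gains_db  : gain of each component in dB, in order (a loss is a negative gain)
    zrp_index : index of the point chosen as ZRP (0 = input of the first component)
    """
    # Cumulative gain from the link input to each point
    levels = [0] + list(accumulate(gains_db))
    # Referring every level to the ZRP handles points before and beyond it alike
    ref = levels[zrp_index]
    return [level - ref for level in levels]

gains = [-30, 50, -8, -12, 25]   # components (1) to (5) of Worked Example 2.11

upper = calibrate_dbr(gains, zrp_index=0)   # (a) ZRP at input of first component
lower = calibrate_dbr(gains, zrp_index=3)   # (b) ZRP at input of fourth component
print("(a)", upper)
print("(b)", lower)

# Eq. (2.76): absolute level at each point for a signal of +5 dBm0, calibration (a)
p_dbm0 = 5
print("levels in dBm:", [p_dbm0 + dbr for dbr in upper])
```

Subtracting the cumulative gain at the ZRP is equivalent to the rule stated in the text: for a point before the ZRP it yields the negated sum of the gains between that point and the ZRP.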
2.10 Systems and Their Properties

A system may be defined very broadly as a functional group of interacting entities that is demarcated by a boundary from its external environment. Systems may be physical artefacts such as the human body or a household boiler, they may be concepts such as cultural or economic systems, or they may be processes such as software algorithms. This book is concerned with a very narrow functional view of a system as an arrangement or mathematical operation that maps or transforms an input signal into an output signal. And within this narrow view, we focus mainly on transmission systems, an overview of which is presented in Chapter 1, which comprise three parts, namely transmitter, channel, and receiver. Each of these parts is a system as well as a subsystem of the overall communication system. A system (in this narrow context) therefore manipulates an input signal or excitation x(t) to produce an output signal or response y(t). This operation is illustrated in Figure 2.33 and will be denoted as

$$x(t) \xrightarrow{R} y(t) \tag{2.77}$$
which is read ‘x(t) yields response y(t)’. The system may be a continuous-time system which processes a continuous-time (CT) input signal to produce a CT response; or, as also illustrated in Figure 2.33, it could be a discrete-time system which processes a discrete-time (DT) input sequence x[n] to produce a DT response y[n]. An error control encoder is an example of a DT system which transforms a discrete sequence of message bits into a discrete sequence of coded bits that includes some redundant bits to aid error correction at the receiver. An AM (amplitude modulation) transmitter is a CT system which transforms an analogue message signal into a transmitted analogue signal at higher frequency and power. Note, however, that a system may feature both CT and DT signals. For example, an ADC system transforms a CT input signal into a digital output signal representing a discrete sequence; and a digital modulator may process a discrete input sequence of bits to produce analogue output pulse signals. It will be useful for our tasks of transmission system analysis and design in later chapters to identify some of the basic properties of the system in Figure 2.33.
2.10.1 Memory

A system is classed as memoryless if its output at any instant t depends at most on the input values at the same instant. A system whose output at time t depends exclusively on the input at a past instant t − Δt is also regarded as memoryless. To distinguish between these two, the first is described as memoryless instantaneous-response, whereas the latter is referred to as memoryless delayed-response. If the output, however, depends on the current input as well as one or more past or future input values then the system is said to have memory. The system also has memory if the present output does not depend on the present input but depends on two or more past or future inputs.

Figure 2.33 System operation. [Block diagrams: continuous-time system with input x(t) and output y(t); discrete-time system with input x[n] and output y[n].]

Figure 2.34 shows three single-component CT systems, namely (a) a resistor, (b) a capacitor, and (c) an inductor. In each case the system excitation is the input current i(t) and the system response is the voltage drop produced across the circuit element. From basic circuit theory, we have

$$\begin{aligned} v_R(t) &= Ri(t) \\ v_C(t) &= \frac{1}{C}\int_{-\infty}^{t} i(t)\,dt \equiv \frac{1}{C}\lim_{\Delta t \to 0}\sum_{k=-\infty}^{k=t/\Delta t} i(k\Delta t)\Delta t \\ v_L(t) &= L\frac{di(t)}{dt} \equiv L\lim_{\Delta t \to 0}\left[\frac{i(t) - i(t - \Delta t)}{\Delta t}\right] \end{aligned} \tag{2.78}$$

Figure 2.34 Systems (a) without memory and (b) and (c) with memory. [Panels: (a) resistor R with excitation i(t) and response $v_R(t) = Ri(t)$; (b) capacitor C with excitation i(t) and response $v_C(t) = \frac{1}{C}\int_{-\infty}^{t} i(t)\,dt$; (c) inductor L with excitation i(t) and response $v_L(t) = L\,di(t)/dt$.]
Thus, a resistor is a memoryless instantaneous-response system, whereas a capacitor, with its response voltage obtained as a scaled accumulation of all past excitation currents, is a system with memory. The inductor also has memory since its response (voltage) is based on the difference between its present and immediate past excitation (current). All the discrete-time systems that perform the operations specified below, where p > 0 is an integer, have memory

$$\begin{aligned} y(n) &= \frac{1}{p+1}\sum_{k=0}^{p} x(n-k) && \text{(Moving average)} \\ y(n) &= \sum_{k=0}^{p} a_k x(n-k) && \text{(Weighted sum)} \\ y(n) &= x(n) - x(n-1) && \text{(Backward difference)} \\ y(n) &= x(n) - x(n+1) && \text{(Forward difference)} \end{aligned} \tag{2.79}$$
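Two of the operations in Eq. (2.79) are sketched below to show concretely why they have memory: each output sample is computed from more than one input sample. This is an illustrative sketch only; the function names are hypothetical, and samples outside the recorded sequence are taken as zero, which is an assumption not stated in the text.

```python
def sample(x, n):
    """Return x(n), taking samples outside the recorded sequence as zero (assumption)."""
    return x[n] if 0 <= n < len(x) else 0.0

def moving_average(x, p):
    """y(n) = (1/(p+1)) * sum of x(n-k) for k = 0..p (line 1 of Eq. 2.79)."""
    return [sum(sample(x, n - k) for k in range(p + 1)) / (p + 1) for n in range(len(x))]

def backward_difference(x):
    """y(n) = x(n) - x(n-1) (line 3 of Eq. 2.79)."""
    return [sample(x, n) - sample(x, n - 1) for n in range(len(x))]

x = [3, 6, 9, 12]
print(moving_average(x, p=2))
print(backward_difference(x))
```

In both cases the output at index n cannot be computed from x(n) alone; the system must retain p past samples (or one past sample for the backward difference).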
Note, however, that the CT and DT systems described by the input–output relations

$$\begin{aligned} y(t) &= Kx^2(t - t_0) \\ y(n) &= ax(n - n_0) \\ \text{where } & t_0, n_0 > 0; \quad K, a \neq 0 \end{aligned} \tag{2.80}$$

are memoryless delayed-response systems. But the systems described by

$$\begin{aligned} y(t) &= ax(t) + bx^2(t) \\ y(n) &= cx^2(n) + dx^3(n) \\ \text{where } & |a| + |b| > 0, \quad |c| + |d| > 0 \end{aligned} \tag{2.81}$$
are memoryless instantaneous-response systems. In a memoryless instantaneous-response system, the output signal is always in step with the input signal and an excitation at the input is felt instantaneously (i.e. without delay) at the system output. Thus, such a system does not cause any phase shift, signal delay, or phase distortion. A memoryless delayed-response system also does not cause any phase distortion, but it introduces a fixed delay between excitation and system response which is akin to the time it takes for a signal to propagate through the system, every frequency component of the signal experiencing an equal amount of delay. It should be noted, however, that memoryless systems are an idealisation. Practical transmission systems will have memory to varying extents since there will always be some amount of residual capacitive and inductive effects in any conductors or semiconductors within the system, which will introduce energy storage and hence some memory capacity. Therefore, when we treat a practical system as memoryless, we simply mean that its memory capacity is small enough to be ignored in the context of the application.
2.10.2 Stability

A system is described as bounded-input, bounded-output (BIBO) stable if and only if, for every absolutely bounded input $|x(t)| < K_i < \infty$, the output is also absolutely bounded so that $|y(t)| < K_o < \infty$. The CT system

$$x(t) \xrightarrow{R} x^2(t)$$

is BIBO stable since, if $|x(t)| < K_i < \infty$ then $|y(t)| < K_i^2 < \infty$. That is, any finite input will produce a finite output. However, the CT system

$$x(t) \xrightarrow{R} \log(x(t))$$

is unstable since if $|x(t)| = 0$ then $|y(t)| = \infty$, and thus a bounded input produces an unbounded output. All the DT systems represented in Eq. (2.79) are BIBO stable, but the DT system with input–output relation (or difference equation) given by

$$y(n) = x(n-1) + \alpha y(n-1) \tag{2.82}$$
is BIBO stable only if |𝛼| < 1, and is unstable otherwise. Notice that this system derives the current output by adding a portion 𝛼 of the previous output to the previous input. If this portion is at least equal to the previous output (i.e. |𝛼| ≥ 1) then the magnitude of the output will increase indefinitely with time, which gives rise to instability. Stability is a critical system property which must always be carefully assessed and assured during design. Instability in an electrical system such as an amplifier or a digital filter might cause excessive current and power to be delivered to sensitive circuit components and devices, causing malfunction or serious damage. And if a mechanical system such as a suspension bridge is unstable, the combined excitation by strong winds and usage loading might create excessive vibrational response in the bridge structure which could be enough to cause catastrophic collapse. The Angers Bridge collapse of 16th April 1850, which happened when a battalion of French soldiers was
marching across the bridge in strong winds, killing over 200 of the soldiers, is a well-known example of the danger posed by system instability.
2.10.3 Causality
A causal system is one in which the response at any given time depends only on the present or past excitations. If the system’s response is influenced in any way by a future excitation then the system is said to be non-causal. You may wonder how it can ever be possible for a system to use future inputs (which have not yet occurred) in order to produce the present output. Well, this is not possible in real-time, so non-causal real-time systems cannot be designed. However, a non-causal system can be implemented to operate in non-real-time on recorded data where all ‘future’ input data are already available in storage and can be accessed as required to produce the ‘present’ output. The moving average filter, with difference equation given in line 1 of Eq. (2.79), calculates the present output as the average of the present input and p immediate past inputs. This operation is causal. The backward difference system in Eq. (2.79) is also causal. However, the centred moving average (MA) filter with difference equation given by
$$y(n) = \frac{1}{2p+1} \sum_{k=-p}^{p} x(n-k) \qquad \text{(centred MA)} \tag{2.83}$$
obtains the present output by averaging the present input $x(n)$ along with $p$ immediate past inputs $\{x(n-1), x(n-2), \ldots, x(n-p)\}$ and $p$ immediate future inputs $\{x(n+1), x(n+2), \ldots, x(n+p)\}$. This operation is therefore non-causal. The forward difference system in Eq. (2.79) is also non-causal. Figure 2.35 shows plots of the responses of two CT systems to an impulse function excitation $\delta(t)$. The impulse responses of the systems are specified by
$$\begin{aligned}
\delta(t) &\xrightarrow{R} h_1(t) = \mathrm{sinc}(t) && \text{(Fig. 2.35b)}\\
\delta(t) &\xrightarrow{R} h_2(t) = e^{-t}u(t) && \text{(Fig. 2.35c)}
\end{aligned} \tag{2.84}$$
Figure 2.35 Example of system excitation and responses: (a) impulse excitation $\delta(t)$; (b) response of non-causal system, $h_1(t) = \mathrm{sinc}(t)$; (c) response of causal system, $h_2(t) = e^{-t}u(t)$.
Notice how the system in Figure 2.35b has an output at time t < 0 in response to an input which is only applied at time t = 0. In contrast, the response of the system in Figure 2.35c begins only after the input is applied at t = 0. Thus, the first system is non-causal whereas the second is causal. We revisit the concept of a system’s impulse response in Chapter 3 and apply it to the analysis of a certain class of systems.
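The same distinction can be seen in discrete time by applying a unit impulse to a causal moving average filter and to the centred MA filter of Eq. (2.83). The sketch below makes two assumptions that are not stated in the text: the causal filter is taken to be $y(n) = \frac{1}{p+1}\sum_{k=0}^{p} x(n-k)$, inferred from its verbal description above, and samples outside the recorded sequence are treated as zero. The function names are hypothetical.

```python
def causal_ma(x, p):
    """Average of x(n), x(n-1), ..., x(n-p); samples before n = 0 are taken as zero."""
    return [sum(x[n - k] for k in range(p + 1) if n - k >= 0) / (p + 1)
            for n in range(len(x))]

def centred_ma(x, p):
    """Eq. (2.83): average of x(n-p), ..., x(n+p); samples outside the record are zero."""
    N = len(x)
    return [sum(x[n - k] for k in range(-p, p + 1) if 0 <= n - k < N) / (2 * p + 1)
            for n in range(N)]

impulse = [0, 0, 0, 0, 1, 0, 0, 0, 0]   # recorded data: unit impulse applied at n = 4

print(causal_ma(impulse, 1))    # response is zero for all n < 4
print(centred_ma(impulse, 1))   # response is already nonzero at n = 3
```

The centred filter responds one sample before the impulse is applied, which is only possible here because the whole input record is available in storage.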
2.10.4 Linearity
A linear system is one that obeys the principle of superposition, which states that if excitation $x_1(t)$ produces response $y_1(t)$ and excitation $x_2(t)$ produces response $y_2(t)$ then excitation $a_1 x_1(t) + a_2 x_2(t)$ will produce response $a_1 y_1(t) + a_2 y_2(t)$, where $a_1$ and $a_2$ are arbitrary constants. A linear system is therefore both additive and homogeneous. A system is said to exhibit the property of additivity if the response of the system to a sum of two or more excitations is simply the sum of the responses of the system to each excitation applied alone. And a system is said to be homogeneous if when an excitation is scaled by a constant factor the response is scaled by the same factor. Employing the notation of Eq. (2.77) allows us to state the above definitions compactly as follows. Given that
$$x_1(t) \xrightarrow{R} y_1(t) \quad \text{and} \quad x_2(t) \xrightarrow{R} y_2(t)$$
then
$$\begin{aligned}
x_1(t) + x_2(t) &\xrightarrow{R} y_1(t) + y_2(t) && \text{(additivity)}\\
a_1 x_1(t) &\xrightarrow{R} a_1 y_1(t) && \text{(homogeneity)}\\
a_1 x_1(t) + a_2 x_2(t) &\xrightarrow{R} a_1 y_1(t) + a_2 y_2(t) && \text{(linearity)}
\end{aligned} \tag{2.85}$$
If a system violates either additivity or homogeneity then the system is said to be nonlinear. The property of linearity is widely applied in system analysis to obtain the response of a system to an arbitrary excitation if that excitation can be expressed as a linear combination (i.e. amplitude scaling and summation) of signals whose responses are already known. Letting $\mathcal{T}$ denote an operator that encapsulates the transformation performed by a system on its input $x(t)$ to obtain its output $y(t)$ such that we can write
$$y(t) = \mathcal{T}\{x(t)\} \tag{2.86}$$
it follows that if $\mathcal{T}$ (and hence the system) is linear then for a group of $N$ amplitude-scaled inputs the output $y(t)$ is given by
$$\begin{aligned}
y(t) &= \mathcal{T}\left\{\sum_{k=1}^{N} a_k x_k(t)\right\}\\
&= \sum_{k=1}^{N} a_k y_k(t)\\
&= \sum_{k=1}^{N} a_k \mathcal{T}\{x_k(t)\}
\end{aligned} \tag{2.87}$$
Notice that the first line represents the system operating on a group of N amplitude-scaled inputs, whereas the last line represents the system first transforming each input xk (t), k = 1, 2, 3, …, N, to yield its individual response yk (t) and then scaling those responses and adding the scaled responses together to obtain the group response. That is, the order of manipulation in the first line is (1) scaling, (2) summation, and (3) operator action, whereas
in the last line the order is (1) operator action, (2) scaling, and (3) summation. This means that if an operator is linear then the order of scaling and operation, or summation (which includes integration) and operation, can be interchanged as convenient. We expand upon this reasoning later to develop an extremely useful tool for system analysis in the time domain (Chapter 3) and in the frequency domain (Chapter 4).

Worked Example 2.12
Determine which of the properties of additivity, homogeneity, and linearity is obeyed by the systems with the following input–output relations:
(i) $y(t) = \beta x(t) + A$, where $\beta$ and $A$ are constants.
(ii) $y(n) = \beta x(n-1) + \gamma x^2(n)$, where $\beta$ and $\gamma$ are constants.
(iii) The centred moving average filter of Eq. (2.83).

For convenience, let us adopt the following convention for naming the response of a system to various excitations
$$\begin{aligned}
x_k(t) &\xrightarrow{R} y_{x_k}(t)\\
a x_k(t) &\xrightarrow{R} y_{a x_k}(t)\\
x_1(t) + x_2(t) &\xrightarrow{R} y_{x_1 + x_2}(t)
\end{aligned}$$
With this convention it follows that a system is additive if and only if
$$y_{x_1 + x_2}(t) = y_{x_1}(t) + y_{x_2}(t) \tag{2.88}$$
A system is homogeneous if and only if
$$y_{ax}(t) = a y_x(t) \tag{2.89}$$
And a system is linear if it is both additive and homogeneous so that
$$y_{a_1 x_1 + a_2 x_2}(t) = a_1 y_{x_1}(t) + a_2 y_{x_2}(t) \tag{2.90}$$
We are now ready to tackle the problems at hand.

(i) For this system
$$\begin{aligned}
x_1(t) &\xrightarrow{R} \beta x_1(t) + A \equiv y_{x_1}(t)\\
x_2(t) &\xrightarrow{R} \beta x_2(t) + A \equiv y_{x_2}(t)\\
x_1(t) + x_2(t) &\xrightarrow{R} \beta[x_1(t) + x_2(t)] + A \equiv y_{x_1 + x_2}(t)
\end{aligned}$$
We see that the sum of individual responses to $x_1(t)$ and $x_2(t)$ is
$$y_{x_1}(t) + y_{x_2}(t) = \beta x_1(t) + \beta x_2(t) + 2A$$
whereas the response to the sum of $x_1(t)$ and $x_2(t)$ is
$$y_{x_1 + x_2}(t) = \beta x_1(t) + \beta x_2(t) + A$$
Since $y_{x_1 + x_2}(t) \ne y_{x_1}(t) + y_{x_2}(t)$, we conclude (in view of Eq. (2.88)) that the system is not additive. In general, a system whose operation includes adding a constant term to the excitation (producing a nonzero response under conditions of zero excitation) will not be additive.
To assess the system’s homogeneity, we note that
$$\begin{aligned}
x(t) &\xrightarrow{R} \beta x(t) + A \equiv y_x(t)\\
a x(t) &\xrightarrow{R} \beta[a x(t)] + A \equiv y_{ax}(t)
\end{aligned}$$
We see that $y_{ax}(t) = \beta a x(t) + A$, whereas $a y_x(t) = \beta a x(t) + aA$ and these are not equal. That is, applying an amplitude-scaling factor to the input does not yield the same result as applying this factor to the system output. In view of Eq. (2.89), the system is therefore not homogeneous. The lack of additivity or homogeneity demonstrated above is enough for us to conclude that the system is nonlinear.

(ii) This DT system specification leads to
$$\begin{aligned}
x_1[n] &\xrightarrow{R} \beta x_1(n-1) + \gamma x_1^2(n) \equiv y_{x_1}(n)\\
x_2[n] &\xrightarrow{R} \beta x_2(n-1) + \gamma x_2^2(n) \equiv y_{x_2}(n)\\
x_1[n] + x_2[n] &\xrightarrow{R} \beta[x_1(n-1) + x_2(n-1)] + \gamma[x_1(n) + x_2(n)]^2 \equiv y_{x_1 + x_2}(n)
\end{aligned}$$
We see that the sum of individual responses to $x_1[n]$ and $x_2[n]$ is
$$y_{x_1}(n) + y_{x_2}(n) = \beta x_1(n-1) + \beta x_2(n-1) + \gamma x_1^2(n) + \gamma x_2^2(n)$$
and is not equal to the response to the sum of $x_1[n]$ and $x_2[n]$ which is
$$y_{x_1 + x_2}(n) = \beta x_1(n-1) + \beta x_2(n-1) + \gamma x_1^2(n) + \gamma x_2^2(n) + 2\gamma x_1(n) x_2(n)$$
Thus, the system is not additive. To assess the system’s homogeneity, observe that
$$\begin{aligned}
x[n] &\xrightarrow{R} \beta x(n-1) + \gamma x^2(n) \equiv y_x(n)\\
a x[n] &\xrightarrow{R} \beta[a x(n-1)] + \gamma[a x(n)]^2 \equiv y_{ax}(n)
\end{aligned}$$
Since
$$y_{ax}(n) = a[\beta x(n-1) + a\gamma x^2(n)] \ne a y_x(n)$$
we conclude that the system is not homogeneous. The system is also nonlinear in view of its lack of additivity and homogeneity. In general, any system whose input–output relation is a polynomial of order 2 or higher is nonlinear.

(iii) For the centred MA system, let there be a second input sequence denoted $v[n]$. It follows from Eq. (2.83) that
$$\begin{aligned}
x[n] &\xrightarrow{R} \frac{1}{2p+1} \sum_{k=-p}^{p} x(n-k) \equiv y_x(n)\\
v[n] &\xrightarrow{R} \frac{1}{2p+1} \sum_{k=-p}^{p} v(n-k) \equiv y_v(n)\\
x[n] + v[n] &\xrightarrow{R} \frac{1}{2p+1} \sum_{k=-p}^{p} [x(n-k) + v(n-k)] \equiv y_{x+v}(n)
\end{aligned}$$
We see that
$$\begin{aligned}
y_{x+v}(n) &= \frac{1}{2p+1} \sum_{k=-p}^{p} [x(n-k) + v(n-k)]\\
&= \frac{1}{2p+1} \sum_{k=-p}^{p} x(n-k) + \frac{1}{2p+1} \sum_{k=-p}^{p} v(n-k)\\
&= y_x(n) + y_v(n)
\end{aligned}$$
which means that the system is additive. To assess homogeneity, we observe that
$$a x[n] \xrightarrow{R} \frac{1}{2p+1} \sum_{k=-p}^{p} [a x(n-k)] \equiv y_{ax}(n)$$
Since
$$\begin{aligned}
y_{ax}(n) &= \frac{1}{2p+1} \sum_{k=-p}^{p} [a x(n-k)]\\
&= a\left[\frac{1}{2p+1} \sum_{k=-p}^{p} x(n-k)\right]\\
&= a y_x(n)
\end{aligned}$$
it follows that the system is homogeneous. And since the system is both additive and homogeneous, we can conclude that it is linear.

Here are questions for you to ponder: can an additive system ever fail to be homogeneous, or a homogeneous system fail to be additive? Are additivity and homogeneity different statements of the same principle? Is additivity a necessary and sufficient condition for linearity?
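The conclusions of Worked Example 2.12 can be spot-checked numerically. The sketch below is illustrative only: the parameter values ($\beta = 2$, $A = 1$, $\gamma = 0.5$, $p = 2$), the test sequences, and all function names are assumptions, system (i) is sampled so that it can share the same test harness, and samples outside the record are treated as zero. A failed check disproves a property, whereas a passed check on particular inputs merely fails to contradict it.

```python
def system_i(x, beta=2.0, A=1.0):
    """y = beta*x + A, applied sample by sample."""
    return [beta * v + A for v in x]

def system_ii(x, beta=2.0, gamma=0.5):
    """y(n) = beta*x(n-1) + gamma*x(n)^2, with x(-1) taken as zero."""
    return [beta * (x[n - 1] if n > 0 else 0.0) + gamma * x[n] ** 2
            for n in range(len(x))]

def system_iii(x, p=2):
    """Centred MA of Eq. (2.83); samples outside the record are taken as zero."""
    N = len(x)
    return [sum(x[n - k] for k in range(-p, p + 1) if 0 <= n - k < N) / (2 * p + 1)
            for n in range(N)]

def close(u, v, tol=1e-9):
    return all(abs(a - b) <= tol for a, b in zip(u, v))

def is_additive(system, x1, x2):
    """Eq. (2.88): compare response to the sum with the sum of responses."""
    summed = [a + b for a, b in zip(x1, x2)]
    return close(system(summed), [a + b for a, b in zip(system(x1), system(x2))])

def is_homogeneous(system, x, a=3.0):
    """Eq. (2.89): compare response to a*x with a times the response to x."""
    return close(system([a * v for v in x]), [a * v for v in system(x)])

x1 = [1.0, -2.0, 0.5, 3.0, -1.5, 2.0]
x2 = [0.5, 1.0, -1.0, 2.0, 0.0, -3.0]

for name, system in (("(i)", system_i), ("(ii)", system_ii), ("(iii)", system_iii)):
    print(name, is_additive(system, x1, x2), is_homogeneous(system, x1))
```

For these inputs the checks fail for systems (i) and (ii) and pass for system (iii), in agreement with the analysis above.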
2.10.5 Time Invariance
A system is said to be time invariant if its operations do not depend on the time of application of the excitation. Thus, the only effect (on system response) of a time shift in the excitation is simply a corresponding time shift in the response. By analogy with the homogeneity property in which amplitude scaling has the same impact whether applied at the input or at the output, we see that time-invariance means that a time shift has the same impact on the system whether applied at the input or at the output. If this is not the case, the system is said to be time variant. A formal statement of the time invariance property is that a system is time invariant if given that
$$x(t) \xrightarrow{R} y(t) \quad \text{(CT system)}$$
or
$$x[n] \xrightarrow{R} y(n) \quad \text{(DT system)}$$
then
$$\begin{aligned}
x(t - t_o) &\xrightarrow{R} y(t - t_o) && \text{(time invariant CT)}\\
x[n - n_o] &\xrightarrow{R} y(n - n_o) && \text{(time invariant DT)}
\end{aligned} \tag{2.91}$$
To present a simple check for time invariance, let us introduce the delay operator $\mathbb{D}$ such that $\mathbb{D}[x(t), t_o] \equiv x(t - t_o)$ denotes a delay of $x(t)$ by $t_o$. A system is therefore time invariant if and only if
$$y_{\mathbb{D}[x(t),\, t_o]}(t) = \mathbb{D}[y(t), t_o] \tag{2.92}$$
The left-hand side of this equation is the response to a delayed version of the excitation, whereas the right-hand side is a delayed version of the response to a non-delayed excitation. Letting $\mathcal{T}\{\bullet\}$ denote the system operation, we may write Eq. (2.92) more completely as
$$\mathcal{T}\{\mathbb{D}[x(t), t_o]\} = \mathbb{D}[\mathcal{T}\{x(t)\}, t_o] \tag{2.93}$$
This makes it explicitly clear that the order of system operation and time-shift may be interchanged in a time-invariant system. That is, you will obtain the same output if you apply a time-shift to the signal before transmitting it through the system or you transmit the signal through the system before applying the time shift to the system’s response.

Worked Example 2.13
Assess each of the following systems for time invariance:
(i) The system of Figure 2.34a if the system resistance R is a function of time, denoted $R(t)$.
(ii) A DT system with input–output relation $y(n) = Kx(Mn)$, where $K \ne 0$ is a constant and $M > 0$ is an integer.
(iii) The backward difference system of Eq. (2.79).

(i) This is a simple system with output (voltage) $R(t)x(t)$ in response to an input (current) $x(t)$. Employing Eq. (2.93), we write
$$\begin{aligned}
\mathcal{T}\{x(t)\} &= R(t)x(t)\\
\mathbb{D}[\mathcal{T}\{x(t)\}, t_o] &= \mathbb{D}[R(t)x(t), t_o] = R(t - t_o)x(t - t_o)
\end{aligned}$$
The above is the result of first passing the signal through the system at time $t$ and then delaying the response. If, on the other hand, we pass a delayed version $x(t - t_o)$ of the signal through the system at time $t$, we obtain
$$\mathcal{T}\{\mathbb{D}[x(t), t_o]\} = \mathcal{T}\{x(t - t_o)\} = R(t)x(t - t_o)$$
Since $R(t - t_o) \ne R(t)$, it follows that $\mathcal{T}\{\mathbb{D}[x(t), t_o]\} \ne \mathbb{D}[\mathcal{T}\{x(t)\}, t_o]$. Therefore, the system is time variant. In general, a system with time-dependent components will be time-variant.

(ii) Passing the signal through this system at time interval count $n$ and then delaying the response by $n_o$ yields
$$\mathbb{D}[\mathcal{T}\{x[n]\}, n_o] = \mathbb{D}[Kx(Mn), n_o] = Kx(M(n - n_o)) = Kx(Mn - Mn_o)$$
If, on the other hand, we pass a delayed version $x(n - n_o)$ of the signal through the system at interval $n$, we obtain
$$\mathcal{T}\{\mathbb{D}[x[n], n_o]\} = \mathcal{T}\{x[n - n_o]\} = Kx(Mn - n_o)$$
Notice that interchanging the order of time shift and operator action produces different results $Kx(Mn - Mn_o)$ and $Kx(Mn - n_o)$. Thus, the system is time variant.

Figure 2.36 presents a graphical illustration of this solution for input signal
$$x(n) = \begin{cases} n, & n = -10, -9, \cdots, 0, 1, 2, \cdots, 10\\ 0, & \text{Otherwise} \end{cases}$$
and parameters $K = 1$, $M = 2$, and $n_o = 3$. Figure 2.36a–c show the implementation of the right-hand side of Eq. (2.93) in which the excitation signal $x[n]$ is first passed through the system and then the delay is applied to the system’s response. Figure 2.36d–f, where a plot of the excitation signal $x[n]$ is repeated in (d) for convenience of comparison, show the execution of the left-hand side of Eq. (2.93) in which the excitation is first delayed and the delayed version is then passed through the system. The outcomes of these two operations are shown in parts (c) and (f) of the figure. The resulting sequences are clearly different.

Figure 2.36 Worked Example 2.13(ii): (a) Excitation $x(n)$; (b) System’s response $y_x(n) = x(2n)$ to $x(n)$; (c) Delay of system’s response, $\mathbb{D}[x(2n), 3] \equiv y_x(n - 3)$; (d) Excitation $x(n)$; (e) Delayed excitation $v(n) \equiv x(n - 3)$; (f) System’s response to $x(n - 3)$.

(iii) First passing the signal through the system and then delaying the response yields
$$\mathbb{D}[\mathcal{T}\{x[n]\}, n_o] = \mathbb{D}[x(n) - x(n-1), n_o] = x(n - n_o) - x(n - n_o - 1)$$
If, on the other hand, the signal is first delayed before this delayed version is passed through the system, we obtain
$$\mathcal{T}\{\mathbb{D}[x[n], n_o]\} = \mathcal{T}\{x[n - n_o]\} = x(n - n_o) - x(n - n_o - 1)$$
We see that the two results are identical. Interchanging the order of time shift and operator action does not alter the outcome. The system is therefore time invariant.
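Parts (ii) and (iii) of Worked Example 2.13 can be reproduced with a few lines of code by comparing the two sides of Eq. (2.93) sample by sample. This is a minimal sketch in which signals are modelled as Python functions of the integer index $n$; the helper names (`delay`, `downsampler`, `backward_difference`, `is_time_invariant`) are hypothetical, and a check over a finite index range can reveal time variance but cannot by itself prove time invariance.

```python
def x(n):
    """Input used in Figure 2.36: x(n) = n for -10 <= n <= 10, and 0 otherwise."""
    return n if -10 <= n <= 10 else 0

def delay(signal, n0):
    """Delay operator D[signal, n0]: returns the signal n -> signal(n - n0)."""
    return lambda n: signal(n - n0)

def downsampler(signal, K=1, M=2):
    """System of part (ii): y(n) = K x(M n)."""
    return lambda n: K * signal(M * n)

def backward_difference(signal):
    """System of part (iii): y(n) = x(n) - x(n - 1)."""
    return lambda n: signal(n) - signal(n - 1)

def is_time_invariant(system, signal, n0, indices):
    delayed_response = delay(system(signal), n0)      # right-hand side of Eq. (2.93)
    response_to_delayed = system(delay(signal, n0))   # left-hand side of Eq. (2.93)
    return all(delayed_response(n) == response_to_delayed(n) for n in indices)

indices = range(-15, 16)
print(is_time_invariant(downsampler, x, 3, indices))          # False
print(is_time_invariant(backward_difference, x, 3, indices))  # True
```

For the system $y(n) = x(2n)$ with $n_o = 3$, the two sides already disagree at $n = 3$, where delaying the response gives $x(0) = 0$ but passing the delayed excitation through the system gives $x(3) = 3$.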
2.10.6 Invertibility
A system is said to be invertible if the excitation signal can be recovered without error or ambiguity from the response signal. If there is a one-to-one mapping between input and output then the system is invertible. If, on the other hand, two or more different inputs can produce the same output then the system is not invertible. An example of a noninvertible system is the square operation $y(t) = x^2(t)$. In this system, given, for example, $y(t) = 4$, it is not possible to recover the input $x(t)$, since one cannot be sure whether $x(t) = 2$ or $x(t) = -2$. Another
example is $y(t) = \cos(x(t))$, since given, for example, $y(t) = 1/\sqrt{2}$ one cannot know if $x(t) = (-45 + 360n)^\circ$ or $x(t) = (45 + 360n)^\circ$, where $n$ is an integer. In the process of analogue-to-digital conversion (ADC), a quantiser is employed to convert a continuous-value input sample into a discrete-value output sample by mapping all input values falling within one quantisation interval to a single output value. Once this has been done, information about the precise value of the input is lost and it is not possible to go from quantiser output back to the original input. The quantiser is therefore a noninvertible system. If a system whose operation is represented by the operator $\mathcal{T}$ is invertible then we can define an inverse system with operator $\mathcal{T}_{\mathrm{inv}}$ such that
$$x(t) = \mathcal{T}_{\mathrm{inv}}\{y(t)\} = \mathcal{T}_{\mathrm{inv}}\{\mathcal{T}\{x(t)\}\} = \mathcal{T}_{\mathrm{inv}}\mathcal{T}\{x(t)\} = \mathcal{I}\{x(t)\} = x(t)$$
The operators of a system and its inverse therefore satisfy the relationship
$$\mathcal{T}_{\mathrm{inv}}\mathcal{T} = \mathcal{I} \tag{2.94}$$
where $\mathcal{I}$ is an identity operator that performs the identity mapping $y(t) = x(t)$. Consider a memoryless delayed-response system having linear input–output relation
$$y(n) = Kx(n - n_o) \tag{2.95}$$
where $K \ne 0$ is a constant and $n_o > 0$ is an integer. This system is invertible, and we may derive an inverse system function as follows. First, rearrange to make $x(\cdot)$ the subject of the equation
$$x(n - n_o) = \frac{1}{K} y(n)$$
Next, substitute $m = n - n_o$ (and hence $n = m + n_o$)
$$x(m) = \frac{1}{K} y(m + n_o)$$
Finally, replace $m$ by $n$ to return to our standard use of $n$ to denote sampling interval count
$$x(n) = \frac{1}{K} y(n + n_o) \tag{2.96}$$
Eq. (2.96) is the inverse input–output relation by which the excitation $x(n)$ may be recovered from the response sequence $y[n]$ of the system specified by Eq. (2.95). Notice that this inverse system is non-causal, whereas the original system is causal. The system operator $\mathcal{T}$ and its inverse $\mathcal{T}_{\mathrm{inv}}$ may be written as follows
$$\begin{aligned}
\mathcal{T} &= K(\mathbb{D}[x(n), n_o])\\
\mathcal{T}_{\mathrm{inv}} &= \frac{1}{K}(\mathbb{D}[y(n), -n_o])
\end{aligned} \tag{2.97}$$
It should be obvious from our opening comments on invertibility that the system
$$y(n) = Kx^2(n - n_o)$$
is noninvertible since its inverse is
$$x(n) = \pm\sqrt{\frac{1}{K} y(n + n_o)}$$
and the uncertainty in sign makes unambiguous recovery impossible. If the system is a transmission channel then the excitation $x(t)$ is the transmitted signal, the response $y(t)$ is the received signal, and the effects of the channel are represented by the operator $\mathcal{T}$. We usually need to recover or closely estimate $x(t)$ from $y(t)$, and this requires connecting in cascade with the channel an inverse system known as an equaliser having an operator $\mathcal{T}_{\mathrm{inv}}$ that satisfies Eq. (2.94). This process is illustrated in Figure 2.37. We explore the subject of equalisers in more detail in Chapter 4 (Section 4.7).

Figure 2.37 Invertible system.

Worked Example 2.14
We wish to determine the inverse of a backward difference system that operates on a causal input signal. Note that, strictly speaking, causality applies only to systems. However, the term is sometimes also applied to signals, so that a causal signal means a signal that has zero value at time $t < 0$ (or $n < 0$ for discrete signals). A backward difference system has input–output relation given earlier in Eq. (2.79) and repeated below for convenience.
$$y(n) = x(n) - x(n-1) \tag{2.98}$$
Figure 2.38 shows a block diagram of this system. Let us write out the output sequence starting at $n = 0$, for which $y(0) = x(0) - x(-1) = x(0)$ (since the signal is causal and therefore $x(-1) = 0$)
$$\begin{aligned}
y(0) &= x(0)\\
y(1) &= x(1) - x(0)\\
y(2) &= x(2) - x(1)\\
&\;\;\vdots\\
y(n-1) &= x(n-1) - x(n-2)\\
y(n) &= x(n) - x(n-1)
\end{aligned}$$
Since the left and right-hand side of each of the above equations are equal, the sum of the left-hand side of all these equations will also be equal to the sum of the right-hand side. Observe that, in summing the right-hand side, all terms cancel out except $x(n)$. Thus
$$x(n) = y(0) + y(1) + y(2) + \cdots + y(n) = \sum_{k=0}^{n} y(k) \tag{2.99}$$
The inverse of the backward difference system of Eq. (2.98) is therefore the accumulator operation of Eq. (2.99). Backward difference operation is common in data compression systems in which the encoder only transmits the difference between the present sample and the immediate previous sample. This worked example therefore shows that the decoder will be able to recover the original samples through the accumulation operation specified by Eq. (2.99), in which the present sample x(n) is the sum of all received differential samples up to the present.
Figure 2.38 Backward difference system: the delayed input $x(n-1) = \mathbb{D}[x(n), 1]$ is subtracted from $x(n)$ in a summing device to give $y(n)$.
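The encoder and decoder pair described in Worked Example 2.14 can be sketched as follows. The sample values and function names are illustrative assumptions; the input is treated as causal, so that $x(-1) = 0$.

```python
from itertools import accumulate

def backward_difference(x):
    """Encoder, Eq. (2.98): y(n) = x(n) - x(n-1), with x(-1) = 0 for a causal input."""
    return [x[n] - (x[n - 1] if n > 0 else 0) for n in range(len(x))]

def accumulator(y):
    """Decoder, Eq. (2.99): x(n) = y(0) + y(1) + ... + y(n)."""
    return list(accumulate(y))

samples = [5, 7, 6, 6, 9, 4]
differences = backward_difference(samples)
recovered = accumulator(differences)

print(differences)  # [5, 2, -1, 0, 3, -5]
print(recovered)    # [5, 7, 6, 6, 9, 4]
```

The running sum of the differential samples returns the original sequence exactly, which is the accumulation of Eq. (2.99) applied sample by sample.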
2.11 Summary
This chapter presented a detailed introduction to telecommunication signals and systems. As in the rest of the book, the topics covered, and the approach and depth of treatment, have been carefully selected to strike a delicate balance between comprehensive rigour and succinct simplicity. The aim is to minimise needless mathematical hurdles and present material that is fresh and accessible for newcomers, insightful and informative for everyone, and free of knowledge gaps in supporting further and more advanced studies.
We started with a nonmathematical discussion of signals and their various features and types, using both subjective and objective classifications. Recognising and understanding such signal features will inform the methods selected for analysis and design and can greatly simplify various computations on the signal. We then introduced and characterised a range of special waveforms which can be used as building blocks to model other signals. Doing so allows, for example, results obtained for such special signals to be quickly extended and applied to practical or arbitrary signals without the need to start from first principles. To give a simple example, if you already know the response of a linear transmission system to a unit step function then by expressing a rectangular pulse in terms of unit step functions you can quickly obtain the response of the system to a rectangular pulse without having to undertake lengthy calculations. We discussed the sinusoidal signal in exhaustive detail because of its unique centrality to telecommunication and most of the basic concepts of the subject area. We also discussed logarithmic units and their applications and emphasised some of the short cuts to exploit and pitfalls to avoid, which you will do well to review again as required until you have complete mastery of what is an indispensable engineering tool. Basic system properties were also introduced, and the discussion was supported with worked examples covering both continuous-time and discrete-time systems. You are now well equipped to be able to assess the properties of a system given its input–output relation, an essential step in selecting the correct tools and approach for analysis and design. In the next chapter we build on this introduction and delve a little deeper into more general and arbitrary signals, including random signals, exploring various signal operations and characterisations, as well as signals and systems analysis tools in the time domain.
Questions
2.1
Determine which of the following signals is an energy signal and which is a power signal: (a) Impulse function 𝛿(t) (b) Sinc function sinc(t) (c) Unit step function u(t) (d) Signum function sgn(t) (e) Complex exponential function exp(j𝜔t).
2.2
By obtaining values at three time instants, namely the start, mid-point, and end of each pulse, verify the correctness of the respective expressions on the right-hand side of Eqs. (2.19) and (2.20) for the ramp pulse and inverse ramp pulse.
2.3
Figure Q2.3 shows a rectangular pulse $g(t)$, bipolar pulse $g_1(t)$, and triangular pulse $g_2(t)$.
(a) Express $g_1(t)$ in terms of rectangular pulses.
(b) Express $g_2(t)$ (i) in terms of $g_1(t)$, and hence (ii) in terms of rectangular pulses. (Hint: consider the role of derivatives and integrals in solving (b).)
Figure Q2.3 Question 2.3: rectangular pulse $g(t) = A\,\mathrm{rect}(t/\tau)$, bipolar pulse $g_1(t)$ with levels $A$ and $-A$, and triangular pulse $g_2(t) = A\,\mathrm{trian}(t/\tau)$, each plotted against $t$ from $-\tau/2$ to $\tau/2$.
2.4
Show that the trapezoidal pulse of Figure 2.13g can be expressed as given by Eq. (2.26) in terms of ramp and rectangular pulses.
2.5
Determine the average value of each of the following signals: (a) −10u(t) volts. (b) Rectangular pulse train of 30 V amplitude and 25% duty cycle. (c) Triangular pulse train of amplitude 50 V, duty cycle 20%.
2.6
Determine the average value of a trapezoidal pulse train that has the following parameters: (a) Amplitude $A = 100$ V. (b) Duration of rising edge of pulse $\tau_r = 10$ ms. (c) Duration of flat or constant portion of pulse $\tau_c = 20$ ms. (d) Duration of falling edge of pulse $\tau_f = 15$ ms. (e) Duration of no pulse in each cycle $\tau_0 = 55$ ms.
2.7
Determine the period, angular frequency, peak-to-peak, and rms values of the signal g(t) = −sin(t).
2.8
Determine (a) The phase difference; and (b) Delay between the signals $g_1(t) = -25\cos(200\pi t + 70^\circ)$; and $g_2(t) = 5\sin(200\pi t - 30^\circ)$, where $t$ is time in ms. (NOTE: you must specify which signal leads the other.) (c) The wavelength (in air) of the resulting sound if $g_1(t)$ in (b) were applied as input to a loudspeaker.
2.9
Figure Q2.9 shows oscilloscope displays of sinusoidal waveforms g1 (t), g2 (t), g3 (t), and g4 (t). Write down the sinusoidal expression of each waveform in the form Acos(2𝜋ft + 𝜙).
2.10
Given the voltage signals
$$\begin{aligned}
v_1(t) &= 20\sin(t)\\
v_2(t) &= -5\cos(t - 60^\circ)\\
v_3(t) &= \sin(2\pi \times 10^3 t)\\
v_4(t) &= 2\sin(4\pi \times 10^3 t - 60^\circ)
\end{aligned}$$
(a) Determine the period of each signal. (b) Determine the phase difference between $v_1(t)$ and $v_2(t)$. (c) Make clearly labelled sketches of $v_3(t)$ and $v_4(t)$ over three cycles starting at $t = 0$ and hence explain why there is no defined phase difference between sinusoids of unequal frequencies. (d) If $v_3(t)$ is applied as input to a loudspeaker, what is the wavelength (in air) of the resulting audible sound?
Figure Q2.9 Question 2.9: oscilloscope displays of $g_1(t)$, $g_2(t)$, $g_3(t)$, and $g_4(t)$, each at 10 V/div (time base settings shown in the figure: 50 s/div, 62.5 ms/div, 500 μs/div, 25 ms/div).
2.11
Given the signals
$$\begin{aligned}
g_1(t) &= 30\cos(200\pi t)\\
g_2(t) &= 40\sin(200\pi t)\\
g_3(t) &= -50\sin(200\pi t + 50^\circ)
\end{aligned}$$
determine the signal $g(t) = g_1(t) - g_2(t) + g_3(t)$ expressed in the form
$$g(t) = A\cos(200\pi t + \phi)$$
The amplitude $A$ and phase $\phi$ of $g(t)$ must be specified.
2.12
Determine the output signal v(t) of each of the summing devices shown in Figure Q2.12a and b, expressed in the form v(t) = A cos(𝜔t + 𝜙) The amplitude A and phase 𝜙 of v(t) must be specified.
2.13
The solid curve of Figure Q2.13 is a sketch of a voltage waveform $v(t)$, which consists of a DC component and two sinusoids. Write out the full expression for $v(t)$ in the form
$$v(t) = A_0 + A_1\cos(2\pi f_1 t + \phi_1) + A_2\cos(2\pi f_2 t + \phi_2)$$
explaining clearly how you arrive at the values of $A_0$, $A_1$, $A_2$, $f_1$, $f_2$, $\phi_1$, and $\phi_2$.
Figure Q2.12 Question 2.12: (a) summing device with inputs $v_1(t) = 4\cos(100t - \pi/3)$ and $v_2(t) = 3\sin(100t)$, and output $v(t)$; (b) summing device with inputs $v_1(t) = 30\sin(\omega t)$, $v_2(t) = 40\sin(\omega t - \pi/4)$, and $v_3(t) = 50\sin(\omega t + \pi/6)$, and output $v(t)$.
Figure Q2.13 Question 2.13: plot of $v(t)$ (volts), on a scale from −9 to 18, against time $t$ (ms) from 0 to 3.
Table 2.5 Question 2.15.
Column headings: Power (W); Power (dBm); Power (dBW); Power (dBmp, 3.1 kHz); Volts (dBV); Volts (dBu).
Given entries (as printed): 100, 20, 30, 10, 10.
2.14
Given two voltage signals $v_1(t) = 10 + 4\sin(2\pi \times 10^3 t)$ volts and $v_2(t) = 25\sin(4\pi \times 10^4 t)$ volts, sketch the waveform of the product signal $g(t) = v_1(t)v_2(t)$.
2.15
Fill in the blank cells in Table 2.5. Assume that the input and output resistances of the system are equal to 1 Ω, which also means that we are dealing with normalised power.
2.16
A transmission system consists of the following gain and loss components in the listed order: (1) Gain = 20 dB (2) Gain = 50 dB (3) Loss = 95 dB (4) Gain = 30 dB (5) Loss = 12 dB. (a) Draw a block diagram of the transmission system and calibrate it in dBr with the ZRP located at the input of the second component. (b) A signal monitored at the ZRP has a level of 60 dBm0. What will be the absolute power level of this signal at the output of the transmission system?
2.17
A discrete-time system is described by the following difference equation y(n) = x(n − 1) + 2y(n − 1) (a) By preparing a tabular layout of the response h(n) of this system to a unit impulse excitation 𝛿[n] = {1, 0, 0, 0, …} for values of n from 0 to 6 show that h(n) grows indefinitely with n and hence that the system is unstable. (b) It is easy to erroneously conclude that this is a memoryless delayed-response system by thinking that the current output depends exclusively on the input at a single past instant. Show that the output of this system depends on multiple past inputs and that in fact every previous input has influence on the current output, which means that the system has memory.
2.18
A system performs the accumulation operation specified by the input–output relation
$$y(n) = \sum_{k=-\infty}^{n} x(k)$$
(a) Assess the properties of this system in relation to memory, stability, causality, linearity, time invariance, and invertibility. (b) Derive the difference equation (or input–output relation) of an inverse system which can be employed to recover the excitation x[n] of this system from its response y[n].
3 Time Domain Analysis of Signals and Systems
Impossible is a lie peddled on the back of a wrong method.

In this Chapter
✓ Basic signal operations, including a detailed treatment of time shifting, time reversal, and time scaling.
✓ Random signals analysis.
✓ Standard statistical distribution functions and their applications in telecommunications.
✓ Signal characterisation in the time domain.
✓ System characterisation and analysis in the time domain.
✓ Autocorrelation and convolution operations.
✓ Worked examples involving a mix of heuristic, graphical, and mathematical approaches to demonstrate the interpretation and application of concepts and to deepen your insight and hone your skills in engineering problem solving.
✓ End-of-chapter questions to test your understanding and (in some cases) extend your knowledge of the material covered.
3.1 Introduction
Signals and systems can be characterised both in time and frequency domains. This chapter focuses on a time domain description and analysis of signals and systems. For signals, we may specify the waveform structure of the signal – through a waveform plot as might be displayed on an oscilloscope, or a mathematical expression that gives values of the signal as a function of time, or a statistical description in the case of random signals. From these, various parameters of the signal may be determined, such as amplitude, peak-to-peak value, average value, root-mean-square (rms) value, period, duty cycle, instantaneous values and their rates of change, power, energy, autocorrelation, etc. We may specify the input–output relation or transfer characteristic of any system. However, for a special class of systems which are linear and time invariant, the response of the system to a unit impulse input (known as the system’s impulse response) is the most versatile characterisation tool. The response of such a system to any arbitrary input may be obtained by convolving the input signal with the system’s impulse response.
We start with a discussion of basic signal operations on which are built other more complex signal manipulations in the time domain, e.g. autocorrelation, convolution, etc. Basic signal operations include time shifting, time reversal, time scaling, addition, subtraction, multiplication, division, differentiation, and integration. However, we discuss only the first three operations. These modify the signal through a direct manipulation of the independent variable (which in this chapter is time t). It is assumed that the reader has a good grounding in the other basic operations. These modify the signal through a direct manipulation of the dependent variable (which is the value of the signal). We then turn our attention to random signals and develop the basic time domain tools necessary to deal with this very important class of signals. This is followed by a presentation of various standard distribution functions with an emphasis on how the random variations modelled by each function arise in practical communication systems. A detailed treatment of the most common time domain signal analysis and characterisation processes applicable to both random and deterministic signals is then presented. This includes discussion of signal mean, signal power, and rms value, signal energy, signal autocorrelation, and the covariance and correlation coefficient between two signals. The rest of the chapter is then devoted to learning the craft of time domain analysis of linear time invariant (LTI) systems with an emphasis on gaining a mastery of the convolution operation as applied to continuous-time and discrete-time systems.
Communication Engineering Principles, Second Edition. Ifiok Otung. © 2021 John Wiley & Sons Ltd. Published 2021 by John Wiley & Sons Ltd. Companion Website: www.wiley.com/go/otung
3.2 Basic Signal Operations
For convenience and clarity, but without any loss of generality, we discuss the operations of time shifting, time reversal, and time scaling using the signal g(t) whose waveform is shown in Figure 3.1a. This signal has a value of 1 in the interval from t = −2𝜏 to 0, decreases linearly from a value of 1 to a value of 0 in the interval t = 0 to 𝜏, and is zero everywhere else. Thus
$$
g(t) = \begin{cases} 1, & -2\tau \le t \le 0 \\ 1 - t/\tau, & 0 \le t \le \tau \\ 0, & \text{elsewhere} \end{cases} \tag{3.1}
$$
3.2.1 Time Shifting (Signal Delay and Advance)
To uncover the waveform corresponding to the signal g(t − 𝜏), we visit the definition of g(t) in Eq. (3.1) and replace t by t − 𝜏 wherever it occurs in that equation. Thus
$$
g(t-\tau) = \begin{cases} 1, & -2\tau \le t-\tau \le 0 \\ 1 - \dfrac{t-\tau}{\tau}, & 0 \le t-\tau \le \tau \\ 0, & \text{elsewhere} \end{cases}
$$
Adding 𝜏 to each section of the inequalities on the right-hand side and simplifying 1 − (t − 𝜏)∕𝜏 to 1 − (t∕𝜏 − 1) = 2 − t∕𝜏 yields
$$
g(t-\tau) = \begin{cases} 1, & -\tau \le t \le \tau \\ 2 - t/\tau, & \tau \le t \le 2\tau \\ 0, & \text{elsewhere} \end{cases} \tag{3.2}
$$
This means that g(t − 𝜏) has a value of 1 in the interval t = (−𝜏, 𝜏), decreases linearly from a value of 1 to a value of 0 in the interval t = (𝜏, 2𝜏), and is 0 everywhere else, which corresponds to the waveform shown in Figure 3.1b. It is clear that g(t − 𝜏) is the original waveform g(t) shifted to the right through 𝜏 so that it starts (𝜏 units of time) later. For positive 𝜏 (as implied in Figure 3.1), g(t − 𝜏) is the signal g(t) manipulated only by delaying it by 𝜏 without altering its waveform size (i.e. amplitude), shape, or span (i.e. duration). The effect of delaying an arbitrary signal g(t) by 𝜏 is thus to produce the signal g(t − 𝜏) in which every event (e.g. peak, trough, transition, etc.) that occurs in g(t) at some time $t = t_o$ also occurs within g(t − 𝜏) but at the later time $t = t_o + \tau$.
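The delay operation can be tried out numerically. The following is a minimal Python sketch, assuming 𝜏 = 1 for illustration; the function names `g` and `delay` are hypothetical helpers introduced here, not part of the text.

```python
def g(t, tau=1.0):
    """Signal of Eq. (3.1): 1 on [-2*tau, 0], linear ramp from 1 to 0 on [0, tau], else 0."""
    if -2 * tau <= t <= 0:
        return 1.0
    if 0 < t <= tau:
        return 1.0 - t / tau
    return 0.0


def delay(x, d):
    """Return the signal t -> x(t - d); d > 0 delays the signal, d < 0 advances it."""
    return lambda t: x(t - d)


tau = 1.0
g_delayed = delay(g, tau)  # this is g(t - tau) of Eq. (3.2)

# Every event at time t_o in g(t) appears at t_o + tau in g(t - tau).
event_times = (-2.0, -1.0, 0.0, 0.5, 1.0)
events_match = all(abs(g(t_o) - g_delayed(t_o + tau)) < 1e-12 for t_o in event_times)
```

Representing a signal as a function of t makes the substitution of t − 𝜏 for t explicit, which mirrors the algebraic steps leading to Eq. (3.2).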
Figure 3.1 Time-shift operations on signal g(t) in (a), including time-delay by 𝜏 in (b) and time-advance by 𝜏 in (c).
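Similarly, we obtain the expression for the signal g(t + 𝜏) as
$$
g(t+\tau) = \begin{cases} 1, & -2\tau \le t+\tau \le 0 \\ 1 - \dfrac{t+\tau}{\tau}, & 0 \le t+\tau \le \tau \\ 0, & \text{elsewhere} \end{cases} = \begin{cases} 1, & -3\tau \le t \le -\tau \\ -t/\tau, & -\tau \le t \le 0 \\ 0, & \text{elsewhere} \end{cases} \tag{3.3}
$$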
The waveform of g(t + 𝜏) specified in the above equation is plotted in Figure 3.1c, which shows that g(t + 𝜏) is the original waveform g(t) shifted to the left through 𝜏, so it starts earlier. In general, regardless of the type of functional dependence on t, the signal g(t + 𝜏) is simply the signal g(t) advanced in time by 𝜏 so that every event in g(t) also occurs within g(t + 𝜏) but at 𝜏 units of time earlier, and with all waveform features of shape, size, and span remaining unchanged.
The basic signal operations of time delay and time advance are very common and in fact vital in communication systems. For example, sophisticated (digital) signal filtering is achieved by combining variously weighted and time-delayed versions of a signal to obtain a suitably filtered output. And if the system is non-real-time then the combined components may even involve time-advanced versions of the signal since these ‘future’ samples are in storage and therefore are available for use to produce an output at the present time.
Signal delay arises naturally in signal transmission through a medium. In this context it may be formally defined as the time separation between an event in one signal and a corresponding event in a reference signal. The signal in question y(t) may be the observation at one point of a transmission system (e.g. the received or output signal), whereas the reference x(t) may be the same signal observed at an earlier point (e.g. input) of the system. The delay then specifies the time taken for the signal to propagate between the two points. If the transmission system is distortionless, there is a one-to-one correspondence between events in y(t) and x(t); each pair of events being separated by a constant time 𝜏, which is the group delay. In this case
$$
y(t) = K x(t - \tau) \tag{3.4}
$$
where K is a positive real constant scaling factor. That is, y(t) is a scaled (i.e. boosted or attenuated) and delayed version of x(t). Equivalently, albeit less conventionally, by replacing t with t + 𝜏 wherever it occurs in the above equation, we may write
$$
x(t) = \frac{1}{K}\, y(t + \tau)
$$
which states that x(t) is the signal y(t) advanced by time 𝜏 and scaled by factor 1/K.
The angle of a sinusoidal signal increases at a rate of 2𝜋f rad/s and hence by 2𝜋f𝜏 rad in a time interval 𝜏. Thus, there is a relationship between phase difference Δ𝜙 (in radian) and delay 𝜏 (in seconds). If x(t) and y(t) are sinusoidal signals having the same frequency f (and period T = 1/f) the two parameters are related by
$$
\tau = \frac{\Delta\phi}{2\pi f} = \left(\frac{\Delta\phi}{2\pi}\right) T \tag{3.5}
$$
In a multipath transmission medium, such as terrestrial microcellular wireless environments, the signal reaches the reception point via N paths – comprising one primary path (usually the direct and hence shortest path) and (N − 1) secondary paths. In this case, delay is referenced to the primary signal. The delay of the ith path is referred to as excess delay $\tau_i$. Clearly, the excess delay of the primary path is zero. In such a transmission medium, one transmitted narrow pulse becomes a closely spaced sequence of N narrow pulses – effectively one broadened pulse – at the receiver. This pulse broadening, or dispersion, places a limit on the symbol rate that can be used for transmission in the medium if an overlap of adjacent symbols, a problem known as intersymbol interference (ISI), is to be avoided. The duration of each pulse is extended by an amount equal to the medium’s total excess delay, which is simply the excess delay $\tau_{i\max}$ of the last received significant path. The delay profile of a multipath transmission medium is often characterised by two parameters, namely average or mean delay $\tau_{\text{avg}}$ and rms delay spread $\tau_{\text{rms}}$
$$
\tau_{\text{avg}} = \sum_{i=1}^{N} \alpha_i \tau_i, \qquad \tau_{\text{rms}} = \sqrt{\sum_{i=1}^{N} \left(\alpha_i \tau_i^2\right) - \tau_{\text{avg}}^2} \tag{3.6}
$$
where $\alpha_i$ is a ratio giving the power received via the ith path as a fraction of the total received power. Note that the count of paths in Eq. (3.6) includes the primary or direct path for which $\tau_i = 0$. ISI is negligible, and the medium is treated as narrowband if symbol duration $T_s \gg \tau_{\text{rms}}$; otherwise, the medium is wideband and will be a source of frequency-selective fading.
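As an illustration of Eq. (3.6), the short Python sketch below computes the mean delay and rms delay spread of a made-up three-path profile. The function name `delay_profile_stats` and the path powers and delays are hypothetical values chosen only for the example.

```python
import math


def delay_profile_stats(powers, excess_delays):
    """Mean delay and rms delay spread as defined in Eq. (3.6).

    powers: received power per path (any consistent linear unit).
    excess_delays: excess delay per path, with 0 for the primary path.
    """
    total = sum(powers)
    alphas = [p / total for p in powers]  # fraction of total received power per path
    tau_avg = sum(a * t for a, t in zip(alphas, excess_delays))
    mean_sq = sum(a * t * t for a, t in zip(alphas, excess_delays))
    # max(..., 0.0) guards against a tiny negative value caused by rounding
    tau_rms = math.sqrt(max(mean_sq - tau_avg ** 2, 0.0))
    return tau_avg, tau_rms


# Hypothetical profile: powers in mW, excess delays in microseconds
tau_avg, tau_rms = delay_profile_stats([1.0, 0.5, 0.5], [0.0, 1.0, 2.0])
```

Comparing the resulting `tau_rms` with the symbol duration then indicates, under the criterion stated above, whether the medium may be treated as narrowband.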
3.2.2 Time Reversal
The operation of time reversal is executed mathematically on a signal g(t) by replacing the independent variable t by −t wherever it occurs in the defining equation for g(t). Doing this for the signal g(t) defined earlier in Eq. (3.1), whose waveform is plotted again in Figure 3.2a for convenience, we obtain
$$
g(-t) = \begin{cases} 1, & -2\tau \le -t \le 0 \\ 1 - \dfrac{-t}{\tau}, & 0 \le -t \le \tau \\ 0, & \text{elsewhere} \end{cases}
$$
Multiplying each section of the inequalities on the right-hand side by −1 flips each inequality and yields
$$
g(-t) = \begin{cases} 1, & 0 \le t \le 2\tau \\ 1 + t/\tau, & -\tau \le t \le 0 \\ 0, & \text{elsewhere} \end{cases} \tag{3.7}
$$
Figure 3.2 Various time-shift and time-reversal operations on signal g(t): (a) g(t); (b) g(t − 𝜏); (c) g(t + 𝜏); (d) g(−t); (e) g(−t − 𝜏); (f) g(−t + 𝜏).
This equation states that g(−t) has a value of 1 in the interval t = (0, 2𝜏), decreases linearly from a value of 1 at t = 0 to a value of 0 at t = −𝜏 in the interval t = (−𝜏, 0), and is zero everywhere else. This variation corresponds to the waveform shown in Figure 3.2d, from which a graphical process for time reversal is readily seen: the time-reversed signal g(−t) has a waveform that is simply a mirror image of g(t), with the double faced mirror lying along the y axis. That is, the waveform of g(−t) is the result of flipping the waveform of g(t) about the y axis. It follows from our definition of even and odd signals (see Eq. (2.8)) that an even signal is time-reversal invariant (i.e. it is unchanged by the operation of time reversal), whereas an odd signal is changed only by a factor of −1 when subjected to time reversal.
Time reversal and time shift are often combined in more advanced signal operations (e.g. convolution) whereby a time-reversed signal is subsequently time-shifted, or a time-shifted signal is subsequently time-reversed. It is therefore of practical interest to explore the forms of the signals g(−t − 𝜏) and g(−t + 𝜏). Substituting −t − 𝜏 for t wherever it occurs in the definition of g(t) in Eq. (3.1), we obtain
$$
g(-t-\tau) = \begin{cases} 1, & -2\tau \le -t-\tau \le 0 \\ 1 - \dfrac{-t-\tau}{\tau}, & 0 \le -t-\tau \le \tau \\ 0, & \text{elsewhere} \end{cases}
$$
Adding 𝜏 to each section of the inequalities on the right-hand side before multiplying through by −1 yields
$$
g(-t-\tau) = \begin{cases} 1, & -\tau \le t \le \tau \\ 2 + t/\tau, & -2\tau \le t \le -\tau \\ 0, & \text{elsewhere} \end{cases} \tag{3.8}
$$
This waveform is sketched in Figure 3.2e from which, by comparing Figure 3.2d and e, it can be seen that g(−t − 𝜏) is the signal g(−t) advanced by 𝜏. That is, g(−t − 𝜏) is the result of time-reversing g(t) followed by a time advance through 𝜏. Alternatively, by comparing Figure 3.2b and e, we see that g(−t − 𝜏) is the time reversal of g(t − 𝜏). Thus, g(−t − 𝜏) may also be obtained from g(t) by first delaying g(t) by 𝜏 followed by a time reversal. In general, time reversal of any signal followed by a time advance of 𝜏 is equivalent to time-delaying the signal by 𝜏 followed by time reversal.
There is yet one other helpful way of viewing the combined operations of time reversal and delay necessary to obtain g(−t − 𝜏) from g(t). Noting that the −t axis increases to the left, and by analogy with g(t − 𝜏) which represents a shift of g(t) by 𝜏 to the right (i.e. in the direction of increasing t, which implies introducing a delay relative to t), we can state that g(−t − 𝜏) is the result of delaying g(−t) by 𝜏 relative to −t (i.e. shifting g(−t) leftwards through 𝜏 in the direction of increasing −t). It should now be a straightforward matter for you to proceed as above to obtain
$$
g(-t+\tau) = \begin{cases} 1, & \tau \le t \le 3\tau \\ t/\tau, & 0 \le t \le \tau \\ 0, & \text{elsewhere} \end{cases} \tag{3.9}
$$
This waveform is sketched in Figure 3.2f from which we see, by comparison with Figure 3.2d, that g(−t + 𝜏) is the signal g(−t) delayed by 𝜏; or (by comparison with Figure 3.2c) that g(−t + 𝜏) is the time reversal of g(t + 𝜏). Thus, the signal g(−t + 𝜏) may be obtained from g(t) either by first applying a time reversal followed by a delay of 𝜏, or by first applying a time advance of 𝜏 followed by time reversal. In general, the time advance of any signal through 𝜏 followed by time reversal is equivalent to time reversal of the signal followed by a delay of 𝜏. Note again, since the −t axis decreases to the right, it follows that g(−t + 𝜏) is an advance of g(−t) through 𝜏 relative to −t achieved by shifting g(−t) rightward (in the direction of decreasing −t).
Letting 𝕋ℝ[x] denote time reversal of signal x, 𝔻[x, 𝜏] denote time delay of x by 𝜏, and 𝔻[x, −𝜏] denote the time advance of x by 𝜏, we may summarise the interrelationships and effects of various combinations of time shift and time reversal operations as follows
$$
\begin{aligned}
g(t-\tau) &= \mathbb{D}[g(t), \tau] = \mathbb{TR}[g(-t-\tau)] \\
g(t+\tau) &= \mathbb{D}[g(t), -\tau] = \mathbb{TR}[g(-t+\tau)] \\
g(t) &= \mathbb{D}[g(t+\tau), \tau] = \mathbb{D}[g(t-\tau), -\tau] = \mathbb{TR}[\mathbb{TR}[g(t)]] \\
g(-t) &= \mathbb{TR}[g(t)] = \mathbb{D}[g(-t+\tau), -\tau] = \mathbb{D}[g(-t-\tau), \tau] \\
g(-t+\tau) &= \mathbb{D}[g(-t), \tau] = \mathbb{TR}[g(t+\tau)] = \mathbb{D}[\mathbb{TR}[g(t)], \tau] = \mathbb{TR}[\mathbb{D}[g(t), -\tau]] \\
g(-t-\tau) &= \mathbb{D}[g(-t), -\tau] = \mathbb{TR}[g(t-\tau)] = \mathbb{D}[\mathbb{TR}[g(t)], -\tau] = \mathbb{TR}[\mathbb{D}[g(t), \tau]]
\end{aligned} \tag{3.10}
$$
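The last two rows of Eq. (3.10) can be spot-checked numerically. The sketch below is a minimal Python illustration assuming 𝜏 = 1; the helpers `TR`, `D`, and `same` are hypothetical names that mimic the operators 𝕋ℝ and 𝔻 and compare two signals on a grid of time points.

```python
def g(t, tau=1.0):
    """Signal of Eq. (3.1) with tau = 1 by default."""
    if -2 * tau <= t <= 0:
        return 1.0
    if 0 < t <= tau:
        return 1.0 - t / tau
    return 0.0


def TR(x):
    """Time reversal: returns the signal t -> x(-t)."""
    return lambda t: x(-t)


def D(x, d):
    """Time delay by d (an advance when d < 0): returns t -> x(t - d)."""
    return lambda t: x(t - d)


def same(x, y, n=601, lo=-6.0, hi=6.0):
    """Compare two signals at n uniformly spaced points in [lo, hi]."""
    step = (hi - lo) / (n - 1)
    return all(abs(x(lo + k * step) - y(lo + k * step)) < 1e-12 for k in range(n))


tau = 1.0
# Reverse then delay equals advance then reverse: both give g(-t + tau)
row5_ok = same(D(TR(g), tau), TR(D(g, -tau)))
# Reverse then advance equals delay then reverse: both give g(-t - tau)
row6_ok = same(D(TR(g), -tau), TR(D(g, tau)))
```

The check is only a sampled comparison, so it illustrates the identities rather than proving them.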
3.2.3 Time Scaling
Time scaling is performed on a signal g(t) by multiplying the independent variable t, wherever it occurs in the defining equation for g(t), by a real scale factor 𝛽 > 0. Carrying out this mathematical process on the signal g(t) defined in Eq. (3.1), whose waveform is repeated in Figure 3.3a for ease of comparison, we obtain
$$
g(\beta t) = \begin{cases} 1, & -2\tau/\beta \le t \le 0 \\ 1 - \beta t/\tau, & 0 \le t \le \tau/\beta \\ 0, & \text{elsewhere} \end{cases}, \quad \beta > 0 \tag{3.11}
$$
Figure 3.3 Time scaling of signal g(t): (a) g(t); (b) g(𝛽t), 𝛽 = 1/3; (c) g(𝛽t), 𝛽 = 2; (d) g(𝛽t), 𝛽 = −1/3; (e) g(𝛽t), 𝛽 = −2.
The waveform of g(𝛽t) is sketched in Figure 3.3b for 𝛽 = 1/3, and in Figure 3.3c for 𝛽 = 2. Notice that the effect of time scaling is to change the duration of the signal and the spacing (and hence rate of occurrence) of events within the signal. For example, an event such as the transition from a value of 1 to a value of 0 that takes 𝜏 units of time to complete in g(t) now takes 𝜏/𝛽 units of time to complete in the time-scaled signal g(𝛽t). In general, and for any arbitrary signal g(t), when 𝛽 < 1 (as illustrated in Figure 3.3b) then g(𝛽t) is expanded in time by the factor 1/𝛽 when compared to g(t). This means that the duration of corresponding events (e.g. pulse width, cycle period, etc.) is longer in g(𝛽t) than in g(t) by this factor (1/𝛽). Equivalently, we may state that the rates of occurrence of events (and hence the frequencies and bandwidth) of g(𝛽t) are those of g(t) reduced by a factor of 1/𝛽. Conversely, if 𝛽 > 1 (as illustrated in Figure 3.3c) then g(𝛽t) is compressed in time, durations of corresponding events are shorter in g(𝛽t) than in g(t) by a factor of 𝛽, events occur faster, and the bandwidth of g(𝛽t) is 𝛽 times the bandwidth of g(t). For a quick example, if a sinusoidal signal x(t) = A cos(2𝜋ft + 𝜙) is time-scaled by a factor 𝛽, we obtain another sinusoidal signal y(t) given by y(t) = x(𝛽t) = A cos(2𝜋𝛽ft + 𝜙) Notice that the frequency of y(t) is 𝛽 times that of x(t), whereas the period of y(t) is 1/𝛽f , which is 1/𝛽 times the period of x(t). This observation is consistent with the above discussions on the impact of time scaling on frequency and event durations.
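The sinusoid example can be checked with a few lines of Python. This is a minimal sketch in which the amplitude, frequency (50 Hz), and scale factor 𝛽 = 2 are arbitrary illustrative choices, and `time_scale` is a hypothetical helper.

```python
import math


def x(t, A=1.0, f=50.0, phi=0.0):
    """Sinusoid x(t) = A cos(2*pi*f*t + phi); parameter values are arbitrary."""
    return A * math.cos(2 * math.pi * f * t + phi)


def time_scale(sig, beta):
    """Return the time-scaled signal t -> sig(beta * t)."""
    return lambda t: sig(beta * t)


beta = 2.0
y = time_scale(x, beta)        # y(t) = x(beta * t)

period_x = 1 / 50.0            # period of x(t)
period_y = period_x / beta     # expected period of y(t), i.e. 1/(beta * f)

# y(t) should repeat after period_y at any starting instant
repeats = all(abs(y(t0 + period_y) - y(t0)) < 1e-9 for t0 in (0.0, 0.001, 0.0037))
```

With 𝛽 = 2 the period halves and the frequency doubles, in line with the compression in time described above.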
There is one important exception to the impact of time scaling, and that is when it is applied to a zero-duration signal such as the impulse function 𝛿(t). To discover how 𝛿(𝛽t) is related to 𝛿(t), recall Eq. (2.29), which defines the impulse function as a limiting case of the rectangular pulse. Substituting 𝛽t for t in this equation yields
$$
\delta(\beta t) = \lim_{\tau \to 0} \left[ \frac{1}{\tau}\, \text{rect}\!\left(\frac{\beta t}{\tau}\right) \right] = \lim_{\tau \to 0} \left[ \frac{1}{\tau}\, \text{rect}\!\left(\frac{t}{\tau/\beta}\right) \right]
$$
The term in square brackets is a rectangular pulse of height 1/𝜏 and duration 𝜏/𝛽, which therefore has a constant area of
$$
\frac{1}{\tau} \times \frac{\tau}{\beta} = \frac{1}{\beta} \quad \text{as} \quad \tau \to 0
$$
Since 𝛿(t) has a unit area and is an even function, it follows that
$$
\delta(\beta t) = \frac{1}{|\beta|}\, \delta(t), \quad \beta \ne 0 \tag{3.12}
$$
Thus, time scaling of 𝛿(t) by 𝛽 uniquely translates into amplitude scaling by 1/|𝛽|.
As earlier noted, the scale factor 𝛽 will be a nonzero and positive real number. But what if it is negative? To explore this scenario, note that if 𝛽 < 0, then −𝛽 = |𝛽| so that Eq. (3.11) becomes
$$
g(\beta t) = \begin{cases} 1, & 0 \le t \le 2\tau/|\beta| \\ 1 + |\beta| t/\tau, & -\tau/|\beta| \le t \le 0 \\ 0, & \text{elsewhere} \end{cases}
$$
𝛽 x1 , is therefore Pr(x1 ≤ X ≤ x2 ) = PX (x2 ) − PX (x1 )
(3.14)
Considering an infinitesimal range (x, x + dx) and dividing Pr(x ≤ X ≤ x + dx) by the size dx of the range yields the derivative of $P_X(x)$, which we denote as $p_X(x)$
$$
\frac{\Pr(x \le X \le x + dx)}{dx} = \frac{P_X(x + dx) - P_X(x)}{dx} \equiv \frac{dP_X(x)}{dx} \equiv p_X(x) \tag{3.15}
$$
The parameter $p_X(x)$ is known as the probability density function or probability distribution function (PDF) of the random variable X. The PDF of X is in general not constant but varies with x in a way that reflects the relative likelihood of each x value. Equation (3.15) shows that $p_X(x)\,dx$ is the probability that X will have a value around x in an infinitesimal range of size dx. That is
$$
\Pr(x \le X \le x + dx) = p_X(x)\,dx
$$
which is the area of a rectangle of height $p_X(x)$ and width dx. Thus, if we plot a graph of $p_X(x)$ versus x, then the probability that X lies in a sizeable range $(x_1, x_2)$ will be the sum of the areas of all such rectangles from $p_X(x_1)\,dx$ at $x_1$ to $p_X(x_2)\,dx$ at $x_2$, which is simply the area under the PDF curve in that range. In other words
$$
\Pr(x_1 \le X \le x_2) = \int_{x_1}^{x_2} p_X(x)\,dx \tag{3.16}
$$
Thus
$$
P_X(x_1) = \Pr(-\infty < X \le x_1) = \int_{-\infty}^{x_1} p_X(x)\,dx \tag{3.17}
$$
The CDF of a random variable is therefore the integral of its PDF, which is equivalent to the statement in Eq. (3.15) that $p_X(x)$ is the derivative of $P_X(x)$. Since, by definition, probability is a positive number between 0 and 1, and we are certain that X has a value in the range (−∞, ∞), we may make the following characterising statements about $p_X(x)$ and $P_X(x)$
$$
\begin{aligned}
&\text{(i)} \quad p_X(x) \ge 0 \\
&\text{(ii)} \quad \int_{-\infty}^{\infty} p_X(x)\,dx = 1 \\
&\text{(iii)} \quad 0 \le P_X(x) \le 1 \\
&\text{(iv)} \quad P_X(\infty) = 1 \\
&\text{(v)} \quad P_X(-\infty) = 0 \\
&\text{(vi)} \quad P_X(x_2) \ge P_X(x_1) \ \text{for} \ x_2 \ge x_1
\end{aligned} \tag{3.18}
$$
Equation (3.18)(ii) indicates that the total area under a PDF curve is always 1, and Eq. (3.18)(vi) means that $P_X(x)$ is monotonically non-decreasing.
Another distribution function of interest is the complementary cumulative distribution function (CCDF) or exceedance distribution function. Denoted $\bar{P}_X(x)$, this is the probability Pr(X > x) that the random variable X takes on some value that exceeds x. Long-term statistical data on rainfall rate R (in mm/hr) and rain-induced path loss L (in decibel, dB) are often analysed to obtain their CCDF, which gives the probability (usually expressed as a percentage of time in an average year) that the random variable exceeds a specified level. The CCDF is often presented as a graphical plot from which the value of the random variable exceeded with a given probability may also be read. For example, the level of rain attenuation exceeded for 0.1% of time in an average year is needed to design a super high frequency (SHF) link that can withstand rain impairments for 99.9% of the time in an average year. Using Eq. (3.18)(ii) we obtain the following relationship between the CCDF and CDF of a random variable
$$
\text{CCDF} \equiv \bar{P}_X(x_1) = \int_{x_1}^{\infty} p_X(x)\,dx = \int_{-\infty}^{\infty} p_X(x)\,dx - \int_{-\infty}^{x_1} p_X(x)\,dx = 1 - P_X(x_1) \equiv 1 - \text{CDF} \tag{3.19}
$$
We may characterise a random variable X using various aggregate quantities, called moments of the random variable. The nth moment of X is the expected value or expectation of $X^n$, denoted $E[X^n]$, which is the average of all the realisations of $X^n$ in an infinite number of observations of X (or trials of the chance experiment). It is worth emphasising the difference between sample mean of a quantity and the expectation of that quantity. Sample mean is an estimate of the true mean obtained by averaging the quantity over a random subset or sample of the population. Expectation is the true mean of the quantity obtained by averaging over the entire population. Thus $E[X^n]$ is obtained by adding all the possible values of X raised to power n, each addition being weighted by the relative likelihood of occurrence of the value. That is
$$
E[X^n] = \int_{-\infty}^{\infty} x^n p_X(x)\,dx \tag{3.20}
$$
The first two moments are the most useful and respectively give the mean $\mu_X$ and mean-square value (i.e. total power) of the random variable. The power of signals is discussed further in Section 3.5.
$$
\begin{aligned}
E[X] &\equiv \mu_X = \int_{-\infty}^{\infty} x\, p_X(x)\,dx \\
E[X^2] &\equiv \text{Total power} = \int_{-\infty}^{\infty} x^2 p_X(x)\,dx
\end{aligned} \tag{3.21}
$$
Another important characterising parameter is the second central moment of the random variable X. This is the expected value of the squared deviation of X from its mean, i.e. the expected value of $(X - \mu_X)^2$, which is more commonly called variance and denoted $\sigma_X^2$ or Var[X]. Since the expectation E[⋅] operator is linear, we note that
$$
\begin{aligned}
\sigma_X^2 &= E[(X - \mu_X)^2] = E[X^2 - 2X\mu_X + \mu_X^2] \\
&= E[X^2] - 2\mu_X E[X] + \mu_X^2 = E[X^2] - 2\mu_X^2 + \mu_X^2 \\
&= E[X^2] - \mu_X^2 = E[X^2] - (E[X])^2 \\
&\equiv \text{Total power} - \text{DC power}
\end{aligned} \tag{3.22}
$$
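The relation in Eq. (3.22) can be illustrated with sample averages. The Python sketch below uses a hypothetical noisy voltage (a 2 V DC level plus zero-mean Gaussian noise of standard deviation 0.5 V); note that the averages computed are sample estimates of the expectations, not the expectations themselves.

```python
import random

rng = random.Random(1)  # fixed seed so that the run is repeatable

# Hypothetical signal samples: 2 V DC level plus Gaussian noise with sigma = 0.5 V
samples = [2.0 + rng.gauss(0.0, 0.5) for _ in range(50_000)]

n = len(samples)
mean = sum(samples) / n                                # estimate of E[X]
total_power = sum(v * v for v in samples) / n          # estimate of E[X^2]
dc_power = mean ** 2                                   # (E[X])^2
ac_power = sum((v - mean) ** 2 for v in samples) / n   # estimate of E[(X - mu_X)^2]

# Eq. (3.22): variance (AC power) = total power - DC power
residual = ac_power - (total_power - dc_power)
```

Here `residual` differs from zero only by rounding error, while `ac_power` should be close to 0.25, the square of the assumed noise standard deviation.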
Noting that total power comprises AC power and DC power components, it follows that the second moment is the total power of the random variable, whereas the variance specifies AC power and the square of the mean gives the DC power. The square root of variance is called the standard deviation $\sigma_X$ of the random variable X. To further emphasise the physical significance of the variance parameter, note that if $\sigma_X^2$ is zero it means that X has a constant value $\mu_X$, and its PDF is therefore a unit impulse function located at $\mu_X$. A nonzero value of $\sigma_X^2$ is an indication that X does take on values other than $\mu_X$. And the larger the value of $\sigma_X^2$, the more is the spread of the values of X around its mean, and hence the broader is the PDF curve. Also, it follows from Eq. (3.22) that when a random variable has zero mean then its variance equals the total power of the random variable.
In general, we can specify the expectation of any function of X, and not just its powers $X^n$. Thus, denoting a function of X as g(X), we have
$$
E[g(X)] = \int_{-\infty}^{\infty} g(x)\, p_X(x)\,dx \tag{3.23}
$$
In the special case where g(X) is the exponential function exp(j𝜔X), this expectation is known as the characteristic function of X, denoted $\psi_X(\omega)$. Thus
$$
E[\exp(j\omega X)] \equiv \psi_X(\omega) = \int_{-\infty}^{\infty} p_X(x)\, e^{j\omega x}\,dx \tag{3.24}
$$
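The characteristic function has many interesting properties and applications. To see one such application, let us take the nth derivative of the above equation with respect to 𝜔, noting that it is only $e^{j\omega x}$ that is a function of 𝜔 on the right-hand side
$$
\begin{aligned}
\frac{d^n}{d\omega^n} \psi_X(\omega) &= \frac{d^n}{d\omega^n} \int_{-\infty}^{\infty} p_X(x)\, e^{j\omega x}\,dx = \int_{-\infty}^{\infty} p_X(x) \frac{d^n}{d\omega^n}\left(e^{j\omega x}\right) dx \\
&= \int_{-\infty}^{\infty} j^n x^n p_X(x)\, e^{j\omega x}\,dx \\
&= j^n \int_{-\infty}^{\infty} x^n p_X(x)\,dx \equiv j^n E[X^n] \quad \text{at } \omega = 0
\end{aligned}
$$
Thus
$$
E[X^n] = \frac{1}{j^n} \left. \left( \frac{d^n \psi_X(\omega)}{d\omega^n} \right) \right|_{\omega = 0} \tag{3.25}
$$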
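Equation (3.25) can be tried numerically by differentiating a known characteristic function with finite differences. The Python sketch below assumes a Gaussian random variable (treated in Section 3.4.1), whose characteristic function exp(j𝜇𝜔 − 𝜎²𝜔²/2) is a standard result not derived in this section; the values of `mu`, `sigma`, and the step `h` are arbitrary choices.

```python
import cmath

mu, sigma = 1.5, 2.0  # arbitrary illustrative mean and standard deviation


def psi(w):
    """Characteristic function of a Gaussian random variable (standard result)."""
    return cmath.exp(1j * mu * w - 0.5 * (sigma * w) ** 2)


h = 1e-3  # step size for central finite differences

d1 = (psi(h) - psi(-h)) / (2 * h)                 # approximates first derivative at w = 0
d2 = (psi(h) - 2 * psi(0.0) + psi(-h)) / h ** 2   # approximates second derivative at w = 0

first_moment = (d1 / 1j).real         # Eq. (3.25) with n = 1, should be close to mu
second_moment = (d2 / 1j ** 2).real   # Eq. (3.25) with n = 2, close to sigma^2 + mu^2
```

The second moment obtained this way equals the variance plus the square of the mean, in agreement with Eq. (3.22).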
It will become obvious in the next chapter that if ℙX (𝜔) denotes the Fourier transform (FT) of pX (x), then the right-hand side of Eq. (3.24) and hence 𝜓 X (𝜔) is simply ℙX (−𝜔). Thus, given its PDF pX (x), the characteristic function of the random variable X can be read from widely available FT tables. The nth moment of X can then be obtained by differentiation of 𝜓 X (𝜔) as in Eq. (3.25), which avoids the more difficult integration involved in Eq. (3.20). A real-valued random variable X that can take on values in the infinite interval (−∞, ∞) does not have a finite peak value or amplitude since any value up to infinity is theoretically possible. However, we may define the peak value Ap of such a random variable as the point at which its CDF is less than, say, 0.99. That is, PX (Ap ) ≤ 0.99, which means that X takes on values less than this ‘peak’ with a probability of 0.99. Thus the ‘peak value’ so defined will be exceeded 1% of the time on average.
3.3.3 Stationarity and Ergodicity
The task of characterising a random process is greatly simplified if the process is stationary. A random process is said to be strict-sense stationary if its distribution functions do not change with time. However, if only its first two moments are time-independent then the process is said to be wide-sense stationary (WSS). Defining the autocorrelation function of the random process as the expected value of the product of two random variables of the process, e.g. $X_1$ and $X_2$ in Figure 3.4, a WSS random process therefore has a constant mean and an autocorrelation function $R_X(\tau)$ that depends only on the time shift $\tau = t_2 - t_1$ between the samples but not on time $t_1$. Thus, if the random process of Figure 3.4 is WSS, then
$$
\begin{aligned}
E[X_1] &= E[X_2] = \text{Mean of process} \\
E[X_1 X_2] &= R_X(\tau)
\end{aligned} \tag{3.26}
$$
A strict-sense stationary process may have a further attribute whereby, although its sample realisations cannot be identical in waveform, it may turn out that they have identical statistical characterisations, and we can obtain the statistical properties of the random process by observing any one of its sample realisations. Such a random process is said to be ergodic. In this case we can replace the more difficult task of ensemble averaging (to obtain, say, the mean of the process) with time averaging over any one of the observed sample realisations. For example, if the random process of Figure 3.4 is ergodic then the mean $\mu_X$ and autocorrelation function $R_X(\tau)$ – discussed in Section 3.5.5 – can be obtained as follows
$$
\begin{aligned}
\mu_X &= \lim_{\mathbb{T} \to \infty} \frac{1}{\mathbb{T}} \int_{-\mathbb{T}/2}^{\mathbb{T}/2} x_k(t)\,dt \\
R_X(\tau) &= \lim_{\mathbb{T} \to \infty} \frac{1}{\mathbb{T}} \int_{-\mathbb{T}/2}^{\mathbb{T}/2} x_k(t)\, x_k(t + \tau)\,dt
\end{aligned} \qquad k = 1, 2, 3, \cdots, \text{or } N \tag{3.27}
$$
Notice that the random process can be treated simply as a nonperiodic power signal, allowing us to apply the analysis tools that are developed in Section 3.5. We will henceforth assume that all random processes we deal with are ergodic. This assumption applies to additive white Gaussian noise (AWGN), the type of random noise signal usually assumed in communication systems.
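A discrete-time counterpart of the time averages in Eq. (3.27) is sketched below in Python. It assumes a single sample realisation of zero-mean, unit-variance white Gaussian noise generated with a fixed seed, and replaces the integrals by sums over a finite number of samples, so the results are estimates only. The function names are hypothetical.

```python
import random

rng = random.Random(7)  # fixed seed for repeatability
N = 40_000

# One sample realisation of the process: white Gaussian noise, zero mean, unit variance
x = [rng.gauss(0.0, 1.0) for _ in range(N)]


def time_avg_mean(x):
    """Time-average estimate of the process mean."""
    return sum(x) / len(x)


def time_avg_autocorr(x, lag):
    """Time-average estimate of the autocorrelation at an integer sample lag >= 0."""
    n = len(x) - lag
    return sum(x[i] * x[i + lag] for i in range(n)) / n


mu_hat = time_avg_mean(x)
r0_hat = time_avg_autocorr(x, 0)  # expected to be close to the total power, 1
r5_hat = time_avg_autocorr(x, 5)  # expected to be close to 0 for white noise
```

Under the ergodicity assumption, repeating the calculation on a different realisation (a different seed) would give estimates that agree to within the statistical uncertainty of a finite record.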
3.4 Standard Distribution Functions The performance of telecommunication systems and radio wave propagation is subject to randomness in the transmission medium, intractability of a wide variety of contributing factors, and the nondeterministic behaviour of a population of users. A statistical approach is therefore often necessary in the analysis and design of communication systems. In this section we discuss six of the most common standard distribution functions and how they are employed to model and characterise random noise, signals, and events in various telecommunication system scenarios.
3.4.1 Gaussian or Normal Distribution
When a random variable X is the result of a sufficiently large number of statistically independent and random additive contributions, the central limit theorem stipulates that X will follow a Gaussian distribution. Also called the normal distribution (because it occurs very frequently in real life), this distribution is fully characterised by the mean $\mu_X$ and variance $\sigma_X^2$ of the random variable and is often denoted as $\mathcal{N}(\mu_X, \sigma_X^2)$.
Before delving into the details of the normal distribution, it would be helpful to understand two example scenarios where it might arise. Consider a random variable X defined as the count of the number of heads obtained when a fair coin is tossed n times. Clearly, the value of X is random and can be any integer number from 0 to n. It is also the result of n statistically independent and random contributions since the outcome of each toss is both random and unaffected by other tosses. When n is very large, all the conditions of the central limit theorem are satisfied, and X will thus follow a Gaussian distribution with mean $\mu_X = n/2$.
However, by far the most relevant and important example to our study is that of noise voltage as a random variable. Thermal noise, for example, results from the additive contribution of many electrons, each moving randomly about an equilibrium position under thermal agitation. There is no general drift of electrons in one direction in the conductor, this requiring some externally applied voltage. If we observe the conductor over a significant period, we will find that at some instants the random motions lead to a slight depletion of electrons in the upper half of the conductor, which gives a positive noise voltage. At other instants there is a slight surplus of electrons in the upper half of the conductor, giving a negative noise voltage. The average or mean of the noise voltage samples measured in the observation interval is zero since there is no reason for either a surplus or depletion of electrons to be preferred. All the conditions of the central limit theorem are satisfied, and the observed noise voltage will be a random variable X having a normal distribution with mean $\mu_X = 0$ and variance $\sigma_X^2$ equal to the noise power.
The PDF of a Gaussian random variable $\mathcal{N}(\mu_X, \sigma_X^2)$ is given by
$$
p_X(x) = \frac{1}{\sqrt{2\pi\sigma_X^2}} \exp\left(-\frac{(x - \mu_X)^2}{2\sigma_X^2}\right) \tag{3.28}
$$
This PDF is plotted in Figure 3.5 for various values of mean $\mu_X$ and variance $\sigma_X^2$. Notice that it has a maximum value of $1/\sqrt{2\pi\sigma_X^2}$ at $x = \mu_X$ (i.e. the mean is the most likely value or mode of X). Furthermore, it decreases symmetrically away from $\mu_X$ in a log-quadratic or bell-shaped fashion, the rate of decrease being tempered by the variance such that the PDF goes from being quite flat at $\sigma_X^2 \to \infty$ to being a unit impulse function $\delta(x - \mu_X)$ at $\sigma_X^2 \to 0$. Note that the value of $p_X(x)$ does not depend on the absolute value of x but on the deviation $|x - \mu_X|$ of x from the mean. Note also that, as the PDF broadens with increasing variance, its height or peak value reduces since its total area is always equal to 1.
A standard Gaussian random variable Z is one that has zero mean and unit variance. Thus, substituting $\mu_X = 0$ and $\sigma_X^2 = 1$ in Eq. (3.28) gives the PDF of $\mathcal{N}(0, 1)$ as
$$
p_Z(z) = \frac{1}{\sqrt{2\pi}} \exp(-z^2/2) \tag{3.29}
$$
Figure 3.5 PDF $p_X(x)$ of Gaussian random variable $\mathcal{N}(\mu_X, \sigma_X^2)$ for various values of mean $\mu_X$ and variance $\sigma_X^2$.
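The CDF and CCDF of a Gaussian random variable cannot be obtained in closed-form, but we may express them in terms of the Q-function Q(x) as follows by using Eq. (3.17) in line 1 below, the substitution $z \equiv (x - \mu_X)/\sigma_X$ in line 2, and property (ii) of Eq. (3.18) in the last line
$$
\begin{aligned}
P_X(x_1) &= \int_{-\infty}^{x_1} \frac{1}{\sqrt{2\pi\sigma_X^2}} \exp\left(-\frac{(x - \mu_X)^2}{2\sigma_X^2}\right) dx \\
&= \int_{-\infty}^{(x_1 - \mu_X)/\sigma_X} \frac{1}{\sqrt{2\pi}} \exp(-z^2/2)\,dz \\
&= \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} \exp(-z^2/2)\,dz - \int_{(x_1 - \mu_X)/\sigma_X}^{\infty} \frac{1}{\sqrt{2\pi}} \exp(-z^2/2)\,dz \\
&= 1 - Q\left(\frac{x_1 - \mu_X}{\sigma_X}\right)
\end{aligned} \tag{3.30}
$$
Thus, $\mathcal{N}(\mu_X, \sigma_X^2)$ has CDF and (from Eq. (3.19)) CCDF given by
$$
\begin{aligned}
\text{CDF} &\equiv \Pr(X \le x) \equiv P_X(x) = 1 - Q\left(\frac{x - \mu_X}{\sigma_X}\right) \\
\text{CCDF} &\equiv \Pr(X > x) \equiv \bar{P}_X(x) = Q\left(\frac{x - \mu_X}{\sigma_X}\right)
\end{aligned} \tag{3.31}
$$
where
$$
Q(z_1) = \int_{z_1}^{\infty} \frac{1}{\sqrt{2\pi}} \exp(-z^2/2)\,dz \tag{3.32}
$$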
is the probability that a zero-mean, unit-variance Gaussian random variable has a value that exceeds $z_1$. Values of Q(x) may be read from widely available tables such as provided in Appendix C.3 for positive values of x. To obtain values of Q(x) for negative arguments, use the relation
$$
Q(x) = 1 - Q(-x) \tag{3.33}
$$
For example, Q(−2) = 1 − Q(2) = 1 − 0.0228 = 0.9772. We know that the most likely value of X is the mean $\mu_X$. So, what is the probability that X will take on a value that exceeds the mean by, say, m standard deviations, where m is any real number? This is $\Pr(X > \mu_X + m\sigma_X)$, which is obtained using Eq. (3.31) as
$$
\Pr(X > \mu_X + m\sigma_X) = Q\left(\frac{\mu_X + m\sigma_X - \mu_X}{\sigma_X}\right) = Q(m)
$$
The table of Q-function values (in Appendix C) gives Q(1) = 0.15866, Q(2) = 0.02275, Q(3) = 0.00135. Considering Figure 3.6 and the even symmetry of the normal distribution about its mean, it follows that, for m > 0, X lies within m standard deviations of the mean with probability 1 − 2Q(m). Thus, 68.27%, 95.45%, and 99.73% of the samples of X respectively lie within one, two, and three standard deviations of the mean.
Figure 3.6 Even symmetry of Gaussian PDF $p_X(x)$ about its mean $\mu_X$: the tail areas $\Pr(X \le \mu_X - x_1)$ and $\Pr(X > \mu_X + x_1)$ are each equal to a, and the central area $\Pr(\mu_X - x_1 \le X \le \mu_X + x_1)$ is 1 − 2a.
In addition to the Q-function, the Gaussian CDF and CCDF are also often expressed in terms of the complementary error function erfc(x), which is defined by the integral
$$
\text{erfc}(x) = \frac{2}{\sqrt{\pi}} \int_{x}^{\infty} \exp(-y^2)\,dy \tag{3.34}
$$
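To obtain these equivalent expressions, consider the following extract from the third and fourth lines of Eq. (3.30)
$$
Q\left(\frac{x_1 - \mu_X}{\sigma_X}\right) = \int_{(x_1 - \mu_X)/\sigma_X}^{\infty} \frac{1}{\sqrt{2\pi}} \exp(-z^2/2)\,dz
$$
Making the substitution $y = z/\sqrt{2}$ in the right-hand side yields
$$
\begin{aligned}
Q\left(\frac{x_1 - \mu_X}{\sigma_X}\right) &= \int_{\frac{x_1 - \mu_X}{\sigma_X \sqrt{2}}}^{\infty} \frac{1}{\sqrt{2\pi}} \exp(-y^2)\,\sqrt{2}\,dy = \frac{1}{2} \cdot \frac{2}{\sqrt{\pi}} \int_{\frac{x_1 - \mu_X}{\sigma_X \sqrt{2}}}^{\infty} \exp(-y^2)\,dy \\
&= \frac{1}{2}\, \text{erfc}\left(\frac{x_1 - \mu_X}{\sqrt{2}\,\sigma_X}\right)
\end{aligned}
$$
Thus
$$
Q(x) = \frac{1}{2}\, \text{erfc}\left(\frac{x}{\sqrt{2}}\right); \qquad \text{erfc}(x) = 2Q(\sqrt{2}\,x) \tag{3.35}
$$
It follows from Eqs. (3.31) and (3.35) that
$$
\begin{aligned}
\text{CDF} &\equiv \Pr(X \le x) \equiv P_X(x) = 1 - \frac{1}{2}\, \text{erfc}\left(\frac{x - \mu_X}{\sqrt{2}\,\sigma_X}\right) \\
\text{CCDF} &\equiv \Pr(X > x) \equiv \bar{P}_X(x) = \frac{1}{2}\, \text{erfc}\left(\frac{x - \mu_X}{\sqrt{2}\,\sigma_X}\right)
\end{aligned} \tag{3.36}
$$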
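Where a table is not to hand, Eqs. (3.35) and (3.36) can be evaluated with the complementary error function in Python's standard `math` module. The sketch below is one possible arrangement; the function names `Q`, `gaussian_cdf`, and `gaussian_ccdf` are hypothetical.

```python
import math


def Q(x):
    """Gaussian tail probability via Eq. (3.35): Q(x) = 0.5 * erfc(x / sqrt(2))."""
    return 0.5 * math.erfc(x / math.sqrt(2))


def gaussian_cdf(x, mu=0.0, sigma=1.0):
    """CDF of a Gaussian random variable with mean mu and standard deviation sigma."""
    return 1.0 - Q((x - mu) / sigma)


def gaussian_ccdf(x, mu=0.0, sigma=1.0):
    """CCDF of a Gaussian random variable with mean mu and standard deviation sigma."""
    return Q((x - mu) / sigma)


# Probability that X lies within m standard deviations of its mean: 1 - 2*Q(m)
within = {m: 1.0 - 2.0 * Q(m) for m in (1, 2, 3)}
```

Because `math.erfc` accepts negative arguments, `Q` also handles negative x directly, which is consistent with the relation Q(x) = 1 − Q(−x).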
The complementary error function is also extensively tabulated in Appendix C and will be our preferred function when discussing bit error probability in digital transmission systems. For negative arguments (not covered in the complementary error function table), use the relation erfc(x) = 2 − erfc(−x)
(3.37)
The tabulated values of erfc(x) and Q(x) provided in Appendix C are accurate. However, if you prefer a direct calculation or if your x value is not covered in the table then you may use the following formulas, which give results that are remarkably accurate to within 0.275%. For Q(x) the recommended formula is [1]

$$Q(x) = \frac{\exp(-x^2/2)}{\sqrt{2\pi}\left(0.661x + 0.339\sqrt{x^2 + 5.51}\right)}, \quad x \ge 0 \tag{3.38}$$

which may be manipulated, using Eq. (3.35), into the following formula for erfc(x)

$$\operatorname{erfc}(x) = \frac{\exp(-x^2)}{\sqrt{\pi}\left(0.661x + 0.339\sqrt{x^2 + 2.755}\right)}, \quad x \ge 0 \tag{3.39}$$
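The accuracy of Eqs. (3.38) and (3.39) can be checked numerically. The following is a minimal sketch, assuming Python's standard `math.erfc` as the exact reference; the function names and the test grid (0 ≤ x ≤ 6 in steps of 0.1) are illustrative choices and not part of the text.

```python
import math

def q_exact(x):
    """Reference Q(x) computed from erfc via Eq. (3.35)."""
    return 0.5 * math.erfc(x / math.sqrt(2))

def q_approx(x):
    """Closed-form approximation of Eq. (3.38), intended for x >= 0."""
    denom = math.sqrt(2 * math.pi) * (0.661 * x + 0.339 * math.sqrt(x * x + 5.51))
    return math.exp(-x * x / 2) / denom

def erfc_approx(x):
    """Closed-form approximation of Eq. (3.39), intended for x >= 0."""
    denom = math.sqrt(math.pi) * (0.661 * x + 0.339 * math.sqrt(x * x + 2.755))
    return math.exp(-x * x) / denom

# Illustrative grid of non-negative arguments (an assumption of this sketch)
xs = [0.1 * k for k in range(61)]

# Largest relative error of each approximation over the grid
worst_q = max(abs(q_approx(x) / q_exact(x) - 1) for x in xs)
worst_erfc = max(abs(erfc_approx(x) / math.erfc(x) - 1) for x in xs)

print(f"worst relative error, Q(x):    {100 * worst_q:.3f}%")
print(f"worst relative error, erfc(x): {100 * worst_erfc:.3f}%")
```

For negative arguments the approximations do not apply directly; Eq. (3.37) can be used first to map the argument to a positive one.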
For x ≫ 1, these formulas simplify to the following approximations

$$Q(x) \cong \frac{\exp(-x^2/2)}{\sqrt{2\pi}\,x}; \qquad \operatorname{erfc}(x) \cong \frac{\exp(-x^2)}{\sqrt{\pi}\,x}, \qquad x \gg 1 \tag{3.40}$$

Worked Example 3.1 Gaussian Noise
Determine the following for a Gaussian noise voltage $v_n(t)$ of standard deviation σ = 2.5 mV:

(a) Noise power $P_n$ in dBm.
(b) The probability that a sample of $v_n(t)$ lies between σ and 2σ.
(c) The probability that a sample of $v_n(t)$ exceeds −3.0 mV.

(a)

$$P_n = \sigma^2 = (2.5\times 10^{-3})^2 = 6.25\ \mu\text{W} = 10\log_{10}\left(\frac{6.25\times 10^{-6}}{1\times 10^{-3}}\right)\ \text{dBm} = -22\ \text{dBm}$$

(b) Denoting $v_n(t) \equiv X$ and $\sigma \equiv \sigma_X$ for convenience, the required probability is $\Pr(\sigma_X \le X \le 2\sigma_X)$. We know from Eqs. (3.14) and (3.30) that

$$\begin{aligned}
\Pr(x_1 \le X \le x_2) &= P_X(x_2) - P_X(x_1) = \left[1 - Q\left(\frac{x_2-\mu_X}{\sigma_X}\right)\right] - \left[1 - Q\left(\frac{x_1-\mu_X}{\sigma_X}\right)\right] \\
&= Q\left(\frac{x_1-\mu_X}{\sigma_X}\right) - Q\left(\frac{x_2-\mu_X}{\sigma_X}\right)
\end{aligned}$$

Substituting $\mu_X = 0$, $x_1 = \sigma_X$, and $x_2 = 2\sigma_X$ yields the required probability as

$$\Pr(\sigma_X \le X \le 2\sigma_X) = Q\left(\frac{\sigma_X}{\sigma_X}\right) - Q\left(\frac{2\sigma_X}{\sigma_X}\right) = Q(1) - Q(2) = 0.1587 - 0.0228 = 0.1359$$

where we have read values of Q(1) and Q(2) from the Q-function table.

(c) What we are to determine here is the CCDF Pr(X > −3.0 mV), which, from Eqs. (3.31) and (3.33), with $\mu_X = 0$, x = −3 mV, and $\sigma_X$ = 2.5 mV, is

$$\begin{aligned}
\Pr(X > x) &= Q\left(\frac{x-\mu_X}{\sigma_X}\right) \\
\Pr(X > -3\ \text{mV}) &= Q\left(\frac{-3\ \text{mV}}{2.5\ \text{mV}}\right) = Q(-1.2) = 1 - Q(1.2) = 1 - 0.1151 = 0.8849
\end{aligned}$$
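The three answers of Worked Example 3.1 can be reproduced by direct calculation instead of table look-up. The sketch below is illustrative only: it assumes Q(x) is evaluated through Python's `math.erfc` using Eq. (3.35), and that power is normalised (σ² in watts for a voltage in volts), as in the worked solution. The variable names are hypothetical.

```python
import math

def q_func(x):
    """Q(x) = 0.5 * erfc(x / sqrt(2)), from Eq. (3.35)."""
    return 0.5 * math.erfc(x / math.sqrt(2))

sigma = 2.5e-3  # standard deviation of the noise voltage, in volts

# (a) Normalised noise power sigma^2 expressed in dBm (relative to 1 mW)
p_n_dbm = 10 * math.log10(sigma ** 2 / 1e-3)

# (b) Pr(sigma <= X <= 2*sigma) for a zero-mean Gaussian: Q(1) - Q(2)
p_between = q_func(1) - q_func(2)

# (c) Pr(X > -3 mV) = Q(-3 mV / sigma) = Q(-1.2)
p_exceeds = q_func(-3e-3 / sigma)

print(f"(a) Pn = {p_n_dbm:.2f} dBm")
print(f"(b) Pr = {p_between:.4f}")
print(f"(c) Pr = {p_exceeds:.4f}")
```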
3.4.2 Rayleigh Distribution

The Rayleigh distribution arises in terrestrial wireless transmission media when the receiver does not have a direct line-of-sight to the transmitter. In this scenario, the received signal R will be the resultant of n multipath signals or rays, each of which is produced by a distinguishable group of scatterers in the medium and may be resolved into an in-phase component as well as a quadrature component. Because of random variations in the magnitude and phase of each ray due to small fluctuations in attenuation and delay, these n in-phase and quadrature components add to produce zero-mean random variables X and Y, respectively, which have equal power and are the real and imaginary parts of R. For a sufficiently large number of scatterers n, the central limit theorem is satisfied so that X and Y are independent zero-mean equal-variance Gaussian random variables, and the envelope of the received signal will follow a Rayleigh distribution.

It is worth emphasising here that, in practice (and in the absence of a dominant direct ray), the received signal envelope will be acceptably well modelled by a Rayleigh distribution even for a relatively small number n of multipath signals or scattered rays. To demonstrate this, Figure 3.7 shows the results of a simulation experiment in which n sinusoids having equal amplitude and a uniformly distributed random phase were combined. The PDF of the resultant magnitude or envelope R is plotted along with a Rayleigh PDF for comparison. Notice that for n ≥ 5, the Rayleigh PDF gives a reasonably good fit to the observed distribution of the received signal envelope. Therefore, this model is usually assumed for the received signal envelope in situations (commonly encountered in terrestrial macrocellular wireless communication systems) where there is no line-of-sight between a transmitter and the receiver. As a useful exercise, by counting the rectangular grids estimate the area under each PDF in Figure 3.7. Is your result as expected?

One scenario in which the occurrence of the Rayleigh distribution is absolutely assured is in the envelope of thermal noise voltage in conductors. Noise in communication systems is actually a complex random variable R = X + jY, where X is an in-phase zero-mean Gaussian random variable of variance $\sigma_X^2$ that gives rise to a real or active noise power $\sigma_X^2$, and Y is a quadrature-phase zero-mean Gaussian random variable of variance $\sigma_Y^2$ that produces a reactive or imaginary noise power $\sigma_Y^2$. This is simply as a result of thermal agitation (for example) producing a random noise voltage that is (to randomly varying degrees) out of step with the cycles of a reference wanted signal and may therefore be resolved into an in-phase component which is either in-step or exactly half-cycle out of step, and a quadrature component that is exactly one-quarter cycle (i.e. ±90°) out of step. Clearly, there is no reason for either phase shift to be preferred in the random agitation of countless electrons, so X and Y are of equal power and we adopt $\sigma_X^2 = \sigma_Y^2 \equiv \sigma^2$. The relationship between R, X, and Y is illustrated in Figure 3.8, where r is the magnitude of a sample of the complex noise, commonly referred to as the natural envelope (or simply envelope) of the noise, ψ is the phase of that sample and is the value of another random variable Ψ which specifies the phase of the complex noise, and x and y are the in-phase and quadrature components of R and therefore the respective values of X and Y. Thus

$$r = \sqrt{x^2 + y^2}; \qquad \psi = \tan^{-1}(y/x) \tag{3.41}$$

We are interested in determining the PDFs $p_R(r)$ and $p_\Psi(\psi)$ of the envelope and phase of the complex noise. Since the random variables X and Y are independent, it is intuitively obvious that R and Ψ are also independent.
In Figure 3.8, the probability that a sample of the complex noise R will lie in the shaded elemental area $dA = dx\,dy = r\,d\psi\,dr$ is therefore

$$\begin{aligned}
\Pr(x \le X \le x+dx,\ y \le Y \le y+dy) &= \Pr(x \le X \le x+dx)\cdot\Pr(y \le Y \le y+dy) \\
&= p_X(x)\,dx \cdot p_Y(y)\,dy \\
&= \Pr(r \le R \le r+dr,\ \psi \le \Psi \le \psi+d\psi) \\
&= \Pr(r \le R \le r+dr)\cdot\Pr(\psi \le \Psi \le \psi+d\psi) \\
&= p_R(r)\,dr \cdot p_\Psi(\psi)\,d\psi
\end{aligned}$$
Figure 3.7 PDF $p_R(r)$ of the resultant amplitude or envelope when n sinusoids of equal amplitude and uniformly distributed random phase are combined. A Rayleigh PDF is shown in dotted line for comparison. (Panels: n = 3, 4, 5, 6, 10, 20; horizontal axis r, vertical axis $p_R(r)$.)
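A simulation in the spirit of Figure 3.7 can be sketched as follows. This is not the author's experiment: it assumes unit-amplitude sinusoids represented as phasors, and it assumes the comparison Rayleigh distribution is matched in mean square value, so that 2σ² = n. The helper names, trial count, and seed are hypothetical. The check counts the fraction of simulated envelopes that fall below the Rayleigh median σ√(2 ln 2), which would be 0.5 for an exactly Rayleigh envelope.

```python
import cmath
import math
import random

def envelope_samples(n, trials, rng):
    """Magnitude of the sum of n unit-amplitude phasors with independent uniform phases."""
    samples = []
    for _ in range(trials):
        total = sum(cmath.exp(1j * rng.uniform(-math.pi, math.pi)) for _ in range(n))
        samples.append(abs(total))
    return samples

def fraction_below_rayleigh_median(n, trials=20000, seed=1):
    """Fraction of envelopes not exceeding the median of the matched Rayleigh distribution."""
    rng = random.Random(seed)
    sigma = math.sqrt(n / 2)                      # assumption: 2*sigma^2 = n (equal mean square)
    median = sigma * math.sqrt(2 * math.log(2))   # Rayleigh median
    samples = envelope_samples(n, trials, rng)
    return sum(1 for r in samples if r <= median) / trials

for n in (3, 5, 10, 20):
    # An exactly Rayleigh envelope would give 0.5
    print(n, round(fraction_below_rayleigh_median(n), 3))
```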
Figure 3.8 Envelope r, phase ψ, in-phase component x, and quadrature component y of complex noise. (Axes: real and imaginary; the shaded elemental area is $dA = dx\,dy = r\,d\psi\,dr$.)
Substituting the Gaussian PDFs of X and Y as given in Eq. (3.28) into the second line above, bearing in mind that $\mu_X = \mu_Y = 0$ and $\sigma_X^2 = \sigma_Y^2 \equiv \sigma^2$, we obtain

$$\begin{aligned}
p_R(r)\,dr \cdot p_\Psi(\psi)\,d\psi &= \frac{1}{\sqrt{2\pi\sigma^2}}\exp\left(-\frac{x^2}{2\sigma^2}\right)dx \cdot \frac{1}{\sqrt{2\pi\sigma^2}}\exp\left(-\frac{y^2}{2\sigma^2}\right)dy \\
&= \frac{1}{2\pi\sigma^2}\exp\left(-\frac{x^2+y^2}{2\sigma^2}\right)dx\,dy \\
&= \frac{1}{2\pi\sigma^2}\exp\left(-\frac{r^2}{2\sigma^2}\right)r\,dr\,d\psi \\
&= \frac{r}{\sigma^2}\exp\left(-\frac{r^2}{2\sigma^2}\right)dr \cdot \frac{1}{2\pi}\,d\psi
\end{aligned}$$

Comparing corresponding terms on the left-hand side and the last line of the right-hand side yields

$$\begin{aligned}
p_R(r) &= \frac{r}{\sigma^2}\exp\left(-\frac{r^2}{2\sigma^2}\right), \quad r \ge 0 \\
p_\Psi(\psi) &= \frac{1}{2\pi}, \quad -\pi \le \psi \le \pi
\end{aligned} \tag{3.42}$$

$p_R(r)$ specifies the PDF of a Rayleigh distribution, whereas $p_\Psi(\psi)$ is a uniform distribution. The conditions r ≥ 0 and −π ≤ ψ ≤ π are written into Eq. (3.42) as an important reminder that the Rayleigh distribution is only applicable to a positive real random variable and that ψ is an angle in the range −180° to 180°. Alternatively, we may invoke the unit step function (Eq. (2.12)) to cover this constraint on the Rayleigh distribution by writing

$$p_R(r) = \frac{r}{\sigma^2}\exp\left(-\frac{r^2}{2\sigma^2}\right)u(r) \tag{3.43}$$

The Rayleigh distribution is completely characterised by a single scale parameter σ, which is the standard deviation of the underlying constituent Gaussian distributed in-phase and quadrature random variables. Regarding the uniform distribution, it is worth pointing out that, since the area under a PDF curve is 1, the PDF of a uniformly distributed random variable U that takes on continuous values in the range $(u_1, u_2)$ is given by

$$p_U(u) = \frac{1}{u_2 - u_1}, \quad u_1 \le u \le u_2 \tag{3.44}$$

Integrating the expression for $p_R(r)$ yields the Rayleigh CDF as

$$P_R(r) = \int_0^r p_R(z)\,dz = \frac{1}{\sigma^2}\int_0^r z\exp\left(-\frac{z^2}{2\sigma^2}\right)dz$$

To evaluate this integral, recall that

$$\frac{d}{dr}e^{-ar^2} = -2ar\,e^{-ar^2}$$
which when both sides are integrated and $a = 1/2\sigma^2$ is substituted, gives the useful result

$$\int r\,e^{-r^2/2\sigma^2}\,dr = -\sigma^2 e^{-r^2/2\sigma^2} \tag{3.45}$$
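that we employ to evaluate the integral for $P_R(r)$ to obtain

$$P_R(r) = \frac{1}{\sigma^2}\left[-\sigma^2\exp\left(-\frac{z^2}{2\sigma^2}\right)\right]_0^r = \left.\exp\left(-\frac{z^2}{2\sigma^2}\right)\right|_r^0$$

Thus

$$\begin{aligned}
\text{CDF} &\equiv \Pr(R \le r) \equiv P_R(r) = 1 - \exp\left(-\frac{r^2}{2\sigma^2}\right) \\
\text{CCDF} &\equiv \Pr(R > r) = 1 - \text{CDF} = \exp\left(-\frac{r^2}{2\sigma^2}\right)
\end{aligned} \tag{3.46}$$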
The characteristic values of the Rayleigh distribution (derived below in Worked Example 3.2) are as follows

$$\text{Median} = \sigma\sqrt{2\ln 2} = 1.17741\sigma \tag{3.47}$$

$$\text{Mean} = \sigma\sqrt{\pi/2} \tag{3.48}$$

$$\text{Mean square value} \equiv \text{Power} = 2\sigma^2 \tag{3.49}$$

$$\text{Rms value} = \sqrt{2}\,\sigma \tag{3.50}$$

$$\text{Variance} = \left(2 - \frac{\pi}{2}\right)\sigma^2 \tag{3.51}$$

$$\text{Mode} = \sigma \tag{3.52}$$

$$\text{Max PDF value} = \frac{1}{\sigma}e^{-1/2} = \frac{0.60653}{\sigma} \tag{3.53}$$

Note that the median of a random variable is the value at which its CDF = 0.5, which means that half of the population of the samples of the random variable will be less than its median and the other half will be larger. Figure 3.9a and b show the Rayleigh distributions for σ = 1 and σ = 2, including the PDF in (a) and the cumulative and exceedance distribution functions in (b). You may again wish to use a count of the tiny rectangular grids in Figure 3.9a to estimate the area under each PDF and to compare your finding with the expected result.

Worked Example 3.2

Determine the median, mean, mean square value, variance, mode, rms, and maximum PDF value for a Rayleigh distributed random variable R in terms of the scale parameter σ of the distribution. This is an optional worked example.

Denoting the median value of R as $\tilde{m}_R$, and noting that CDF = 0.5 at the median, it means that $P_R(\tilde{m}_R) = 0.5$, so that Eq. (3.46) gives

$$1 - \exp\left(-\frac{\tilde{m}_R^2}{2\sigma^2}\right) = 0.5; \quad\Rightarrow\quad \exp\left(-\frac{\tilde{m}_R^2}{2\sigma^2}\right) = \frac{1}{2}; \quad\Rightarrow\quad \exp\left(\frac{\tilde{m}_R^2}{2\sigma^2}\right) = 2$$

Taking the natural logarithm of both sides gives $\tilde{m}_R^2/2\sigma^2 = \ln 2$, from which we obtain the desired expression for median as: $\tilde{m}_R = \sigma\sqrt{2\ln 2}$

The mean $\mu_R$ is given by the equation

$$\mu_R = \int_{-\infty}^{\infty} r\cdot p_R(r)\,dr = \int_0^{\infty} r\cdot\frac{r}{\sigma^2}e^{-r^2/2\sigma^2}\,dr$$

Using integration by parts $\int u\,dv = uv - \int v\,du$ with $u \equiv r$, $dv \equiv \frac{r}{\sigma^2}e^{-r^2/2\sigma^2}\,dr$

$$\mu_R = \frac{1}{\sigma^2}\left[\left.\left(r\cdot\int r\,e^{-r^2/2\sigma^2}\,dr\right)\right|_0^{\infty} - \int_0^{\infty}\left(\int r\,e^{-r^2/2\sigma^2}\,dr\right)dr\right]$$
Figure 3.9 (a): Probability distribution function (PDF) $p_R(r)$ of Rayleigh distribution for σ = 1, 2; (b): Cumulative distribution function (CDF) and exceedance distribution function (CCDF) of Rayleigh distribution for σ = 1, 2. (Panel (a) marks the mode, median, and mean on each curve; horizontal axis r in both panels.)
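Using Eq. (3.45) to evaluate the indefinite integrals yields

$$\mu_R = \frac{1}{\sigma^2}\left[\left.\left(-r\sigma^2 e^{-r^2/2\sigma^2}\right)\right|_0^{\infty} + \sigma^2\int_0^{\infty}e^{-r^2/2\sigma^2}\,dr\right] = 0 + \int_0^{\infty}e^{-r^2/2\sigma^2}\,dr$$

But $\int_{-\infty}^{\infty}\frac{1}{\sigma\sqrt{2\pi}}\exp\left(-\frac{r^2}{2\sigma^2}\right)dr = 1$, being the area of a zero-mean Gaussian PDF. Also, considering the even symmetry of this PDF, it follows that

$$\begin{aligned}
\int_{-\infty}^{\infty}\frac{1}{\sigma\sqrt{2\pi}}\exp\left(-\frac{r^2}{2\sigma^2}\right)dr &= 2\int_0^{\infty}\frac{1}{\sigma\sqrt{2\pi}}\exp\left(-\frac{r^2}{2\sigma^2}\right)dr = 1 \\
\Rightarrow\ \int_0^{\infty}\exp\left(-\frac{r^2}{2\sigma^2}\right)dr &= \frac{\sigma}{2}\sqrt{2\pi}
\end{aligned}$$

Hence, mean $\mu_R = \sigma\sqrt{\pi/2}$

Mean square value is the expectation of $R^2$ and is obtained as

$$E[R^2] = \int_{-\infty}^{\infty} r^2\cdot p_R(r)\,dr = \int_0^{\infty} r^2\cdot\frac{r}{\sigma^2}e^{-r^2/2\sigma^2}\,dr$$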
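Using integration by parts (as previously) and then Eq. (3.45), we obtain

$$\begin{aligned}
E[R^2] &= \frac{1}{\sigma^2}\left[\left.\left(r^2\cdot\int r\,e^{-r^2/2\sigma^2}\,dr\right)\right|_0^{\infty} - \int_0^{\infty}2r\cdot\left(\int r\,e^{-r^2/2\sigma^2}\,dr\right)dr\right] \\
&= \left.-\left(r^2 e^{-r^2/2\sigma^2}\right)\right|_0^{\infty} + 2\int_0^{\infty} r\,e^{-r^2/2\sigma^2}\,dr = 0 + \left.\left(2\sigma^2 e^{-r^2/2\sigma^2}\right)\right|_{\infty}^{0} \\
&= 2\sigma^2
\end{aligned}$$

Variance is given by Eq. (3.22) as $\sigma_R^2 = E[R^2] - \mu_R^2$. Therefore, using the previous two results, we obtain

$$\sigma_R^2 = 2\sigma^2 - \frac{\pi}{2}\sigma^2 = \sigma^2\left(2 - \frac{\pi}{2}\right)$$

Mode $\hat{m}_R$ is the most likely value of the random variable. This is therefore the value of r at which $p_R(r)$ is maximum and may be determined by setting the derivative of $p_R(r)$ to zero and solving for r. That is

$$\frac{d}{dr}p_R(r) = \frac{d}{dr}\left[\frac{r}{\sigma^2}e^{-r^2/2\sigma^2}\right] = \frac{r}{\sigma^2}e^{-r^2/2\sigma^2}\left(-\frac{2r}{2\sigma^2}\right) + \frac{1}{\sigma^2}e^{-r^2/2\sigma^2}$$

At $r = \hat{m}_R$, this derivative is zero. Thus

$$\begin{aligned}
&\frac{\hat{m}_R}{\sigma^2}e^{-\hat{m}_R^2/2\sigma^2}\left(-\frac{2\hat{m}_R}{2\sigma^2}\right) + \frac{1}{\sigma^2}e^{-\hat{m}_R^2/2\sigma^2} = 0 \\
&\Rightarrow\ e^{-\hat{m}_R^2/2\sigma^2}\left(-\frac{\hat{m}_R^2}{\sigma^4} + \frac{1}{\sigma^2}\right) = 0 \\
&\Rightarrow\ \frac{\hat{m}_R^2}{\sigma^4} = \frac{1}{\sigma^2}; \quad\Rightarrow\quad \hat{m}_R^2 = \frac{\sigma^4}{\sigma^2} = \sigma^2
\end{aligned}$$

Hence, mode $\hat{m}_R = \sigma$

Rms value (denoted $A_{\text{rms}}$) is simply the square root of the mean square value obtained above. Thus, $A_{\text{rms}} = \sqrt{2}\,\sigma$

By substituting $r = \hat{m}_R$ into the PDF expression, we obtain the maximum value of the Rayleigh PDF as

$$p_R(r)_{\max} = p_R(\hat{m}_R) = p_R(\sigma) = \frac{\sigma}{\sigma^2}\exp\left(-\frac{\sigma^2}{2\sigma^2}\right) = \frac{1}{\sigma}\exp\left(-\frac{1}{2}\right) = \frac{0.60653}{\sigma}$$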
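The closed-form results of Worked Example 3.2 can be cross-checked by simulation. The sketch below is illustrative: it generates Rayleigh samples as the envelope of two independent zero-mean Gaussian samples, as in Eq. (3.41), and compares sample statistics with Eqs. (3.47) to (3.51). The value σ = 2, the sample count, and the seed are arbitrary assumptions, as are the function and dictionary names.

```python
import math
import random

def rayleigh_samples(sigma, count, rng):
    """Envelope sqrt(x^2 + y^2) of two independent zero-mean Gaussians of equal variance."""
    return [math.hypot(rng.gauss(0, sigma), rng.gauss(0, sigma)) for _ in range(count)]

def rayleigh_theory(sigma):
    """Characteristic values from Eqs. (3.47)-(3.51)."""
    return {
        "median": sigma * math.sqrt(2 * math.log(2)),
        "mean": sigma * math.sqrt(math.pi / 2),
        "mean_square": 2 * sigma ** 2,
        "variance": (2 - math.pi / 2) * sigma ** 2,
    }

def rayleigh_measured(samples):
    """Sample estimates of the same quantities."""
    ordered = sorted(samples)
    mean = sum(ordered) / len(ordered)
    mean_square = sum(r * r for r in ordered) / len(ordered)
    return {
        "median": ordered[len(ordered) // 2],
        "mean": mean,
        "mean_square": mean_square,
        "variance": mean_square - mean ** 2,
    }

sigma = 2.0                      # illustrative scale parameter
rng = random.Random(2021)        # fixed seed so the run is repeatable
theory = rayleigh_theory(sigma)
measured = rayleigh_measured(rayleigh_samples(sigma, 100_000, rng))

for name in theory:
    print(f"{name:12s} theory = {theory[name]:.4f}  simulated = {measured[name]:.4f}")
```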
3.4.3 Lognormal Distribution

In the absence of terrain features and scatterers, the average power $P_{av}$ of a received radio signal decreases inversely with distance d far from the transmitter according to $d^{-m}$, where m = 2 in free space and m = 4 in a plane-earth-only scenario. This would mean that all locations at the same radial distance d from the transmitter receive the same power, which is not the case in practice. The presence of terrain features, including hills, trees, buildings, and other obstacles will, one after the other, interfere with the signal or expend a portion of its energy through absorption, diffraction, or scattering. This introduces several levels of variation of the received signal in a terrestrial wireless environment. The received signal power averaged at a small locality at distance d from the transmitter, called the local-mean power, will vary between locations that have the same distance d from an isotropic transmit-antenna since the environmental interventions experienced by the signal will be location dependent and no longer just distance dependent. If we define $P_{av}$ as the long-term mean power averaged over all outdoor locations at a distance d from an isotropic transmit-antenna, we find m > 4 so that $P_{av}$ decreases more rapidly with distance than in the free space and plane-earth-only environments.
The local-mean power will exhibit a long-term random variation about the above-average power, an effect known as shadow fading. This is because of several contributory randomly varying multiplicative loss factors imposed by multiple objects in the signal’s transmission path. Reflections and scattering from these objects will give rise to multiple received rays, which adds an extra dimension of variation in received signal power on a small-scale of a few wavelengths (called multipath fading or fast fading) due to random short-term fluctuations in the instantaneous received signal envelope, which may be modelled by a Rayleigh distribution if there is no dominant direct ray, as discussed in the previous subsection.

The impact of shadow fading on local-mean power may be accounted for by expressing this power (in watts) as the product of the average ($d^{-m}$-dependent) power and n multiplicative independent random loss factors $l_1 l_2 l_3 \ldots l_n$. When expressed in logarithmic units, we see that the dBm power is the result of additive contributions from n independent random variables $L_1, L_2, L_3, \ldots, L_n$, where $L_k = 10\log_{10} l_k$. For large n, all conditions of the central limit theorem are satisfied and the dBm power values will have a Gaussian distribution. In general, if when we take the logarithm (to whatever base b) of the values x of a random variable X we find that $\log_b(X)$ follows a Gaussian distribution then X is said to have a lognormal distribution. In addition to the lognormal variation of local-mean power highlighted above, any natural phenomenon in which the observed quantity is the result of a build-up of numerous and independent small percentage changes (such as gain or loss factors) may be modelled by the lognormal distribution.
This distribution has therefore been widely used to model variations in signal power, time duration of complex events, rates, length, and size in many fields, including biology, chemistry, medicine, hydrology, social sciences, and engineering.

The PDF $p_X(x)$ of a lognormal random variable X may be derived straightforwardly as follows. If a random variable X has a lognormal distribution, it means that the random variable Y ≡ ln(X), where ln denotes logarithm to the base of the constant e = 2.718281828459…, has a Gaussian distribution $\mathcal{N}(\mu_Y, \sigma_Y^2)$ which we denote as $\mathcal{N}(\mu_{\ln X}, \sigma_{\ln X}^2)$ to emphasise that the mean and variance are computed on log values of X. Thus

$$\begin{aligned}
p_X(x)\,dx \equiv \Pr(x \le X \le x+dx) &= \Pr(\ln x \le Y \le \ln(x+dx)) \\
&= \Pr[\ln x \le Y \le \ln(x(1+dx/x))] \\
&= \Pr(\ln x \le Y \le \ln x + \ln(1+dx/x)) = \Pr(\ln x \le Y \le \ln x + dx/x) \\
&= \int_{\ln x}^{\ln x + dx/x} p_Y(y)\,dy \\
&= \frac{1}{x}\,p_Y(\ln x)\,dx
\end{aligned}$$

where, in line 3 above, we have used the relation ln(1 + a) ≈ a for a ≪ 1, and in the penultimate line we have evaluated the integral by noting that the integration range of size dx/x is so small that the integral is simply the area of a rectangle of height $p_Y(\ln x)$ and width dx/x. Comparing the left-hand side and the last line of the right-hand side

$$p_X(x) = \frac{1}{x}\,p_Y(\ln x)$$

Now replacing $p_Y(\ln x)$ using its PDF $p_Y(y)$ as given in Eq. (3.28), we obtain

$$p_X(x) = \frac{1}{x\,\sigma_{\ln X}\sqrt{2\pi}}\exp\left(-\frac{1}{2}\left(\frac{\ln x - \mu_{\ln X}}{\sigma_{\ln X}}\right)^2\right)u(x) \tag{3.54}$$
This is the PDF of a lognormal random variable X, where the unit step function u(x) has been introduced above to explicitly state that $p_X(x) = 0$ for x < 0. A lognormal random variable X can only have positive values in the range (0, ∞) and its distribution is fully characterised by two parameters $\mu_{\ln X}$ and $\sigma_{\ln X}$, which are, respectively, the mean and standard deviation of the log-values of X. The CDF and CCDF of X are obtained as follows

$$\text{CDF} \equiv P_X(x) = \frac{1}{\sigma_{\ln X}\sqrt{2\pi}}\int_0^x \frac{1}{y}\exp\left(-\frac{1}{2}\left(\frac{\ln y - \mu_{\ln X}}{\sigma_{\ln X}}\right)^2\right)dy$$
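Substituting $z \equiv (\ln y - \mu_{\ln X})/\sigma_{\ln X}$, we have $dz/dy = 1/(y\,\sigma_{\ln X})$ or $dy = y\,\sigma_{\ln X}\,dz$, and the limits of integration y = (0, x) become $z = (-\infty, (\ln x - \mu_{\ln X})/\sigma_{\ln X})$, since ln 0 = −∞. The above integration therefore simplifies to

$$\begin{aligned}
\text{CDF} \equiv P_X(x) &= \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{(\ln x - \mu_{\ln X})/\sigma_{\ln X}}\exp\left(-\frac{z^2}{2}\right)dz \\
&= \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}\exp\left(-\frac{z^2}{2}\right)dz - \frac{1}{\sqrt{2\pi}}\int_{(\ln x - \mu_{\ln X})/\sigma_{\ln X}}^{\infty}\exp\left(-\frac{z^2}{2}\right)dz \\
&= 1 - Q\left(\frac{\ln x - \mu_{\ln X}}{\sigma_{\ln X}}\right)
\end{aligned}$$

Hence

$$\begin{aligned}
\text{CDF} &= 1 - Q\left(\frac{\ln x - \mu_{\ln X}}{\sigma_{\ln X}}\right) = 1 - \frac{1}{2}\operatorname{erfc}\left(\frac{\ln x - \mu_{\ln X}}{\sigma_{\ln X}\sqrt{2}}\right) = \frac{1}{2}\operatorname{erfc}\left(-\frac{\ln x - \mu_{\ln X}}{\sigma_{\ln X}\sqrt{2}}\right) \\
\text{CCDF} &= 1 - \text{CDF} = Q\left(\frac{\ln x - \mu_{\ln X}}{\sigma_{\ln X}}\right) = \frac{1}{2}\operatorname{erfc}\left(\frac{\ln x - \mu_{\ln X}}{\sigma_{\ln X}\sqrt{2}}\right)
\end{aligned} \tag{3.55}$$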
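Some of the characteristic values of a lognormal random variable X are derived in the next worked example. The results are summarised below for convenience

$$\text{Median} = \exp(\mu_{\ln X}) \tag{3.56}$$

$$\text{Mean} = \exp\left(\mu_{\ln X} + \frac{\sigma_{\ln X}^2}{2}\right) \tag{3.57}$$

$$\text{Mean square value} = \exp[2(\mu_{\ln X} + \sigma_{\ln X}^2)] \tag{3.58}$$

$$\text{Variance} = \exp(2\mu_{\ln X} + \sigma_{\ln X}^2)[\exp(\sigma_{\ln X}^2) - 1] \tag{3.59}$$

$$\text{Mode} = \exp(\mu_{\ln X} - \sigma_{\ln X}^2) \tag{3.60}$$

$$\text{Rms value} = \exp(\mu_{\ln X} + \sigma_{\ln X}^2) \tag{3.61}$$

$$\text{Max PDF value} = \frac{\exp(-\mu_{\ln X} + \sigma_{\ln X}^2/2)}{\sigma_{\ln X}\sqrt{2\pi}} \tag{3.62}$$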
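Equations (3.54) to (3.62) translate directly into a few short functions. The sketch below is a minimal illustration with hypothetical function names; it assumes Q(x) is computed from Python's `math.erfc` via Eq. (3.35). As a usage example it evaluates Pr(X > 8) for the three parameter sets of the top row of Figure 3.10, which the figure annotates as 0.0188, 0.3192, and 0.5883.

```python
import math

def q_func(x):
    """Q(x) = 0.5 * erfc(x / sqrt(2)), from Eq. (3.35)."""
    return 0.5 * math.erfc(x / math.sqrt(2))

def lognormal_pdf(x, mu, sigma):
    """Lognormal PDF of Eq. (3.54); returns 0 for x <= 0."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2 * math.pi))

def lognormal_ccdf(x, mu, sigma):
    """Pr(X > x) from Eq. (3.55), for x > 0."""
    return q_func((math.log(x) - mu) / sigma)

def lognormal_summary(mu, sigma):
    """Characteristic values of Eqs. (3.56)-(3.62)."""
    var_ln = sigma ** 2
    return {
        "median": math.exp(mu),
        "mean": math.exp(mu + var_ln / 2),
        "mean_square": math.exp(2 * (mu + var_ln)),
        "variance": math.exp(2 * mu + var_ln) * (math.exp(var_ln) - 1),
        "mode": math.exp(mu - var_ln),
        "rms": math.exp(mu + var_ln),
        "max_pdf": math.exp(-mu + var_ln / 2) / (sigma * math.sqrt(2 * math.pi)),
    }

# Parameter sets of the top row of Figure 3.10 (sigma_lnX fixed at 1)
for mu in (0.0, 1.6094, 2.3026):
    print(f"mu_lnX = {mu}: Pr(X > 8) = {lognormal_ccdf(8, mu, 1.0):.4f}")
```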
Figure 3.10 shows various plots of the lognormal PDF to illustrate the impact of its two model parameters 𝜇lnX and 𝜎 lnX . The skewness of the distribution is controlled entirely by 𝜎 lnX . As 𝜎 lnX decreases (left to right of bottom row of Figure 3.10), we notice the following trends: (i) the lognormal PDF becomes increasingly symmetrical about the mode; (ii) the mean and median coalesce towards the mode; (iii) the spread of values reduces and becomes concentrated around the mean, which means that the PDF decreases more sharply away from the mode and its peak value increases to maintain a unit area. If 𝜎 lnX is fixed then skewness remains unchanged, and as 𝜇 lnX increases (left to right of top row of Figure 3.10) the lognormal PDF broadens, its peak value reducing as it expands horizontally away from the origin, making larger values of X relatively more likely to occur. For example, as indicated in the top row of Figure 3.10, where 𝜎 lnX is fixed at 𝜎 lnX = 1, the probability that the lognormal random variable X will take on a value larger than eight increases from just under 2% to nearly 60% as 𝜇lnX increases from 𝜇 lnX = 0 to 2.3026. Thus, we may conclude that 𝜎 lnX is a shape parameter, whereas 𝜇 lnX is a location parameter of the lognormal probability distribution. Worked Example 3.3 Determine the median, mean, mean square value, variance, mode, rms, and maximum PDF value for a lognormal random variable X in terms of its distribution parameters 𝜇lnX and 𝜎 lnX . This is an optional worked example.
Figure 3.10 Probability distribution function PDF $p_X(x)$ of lognormal distribution for various values of distribution parameters $\mu_{\ln X}$ and $\sigma_{\ln X}$. (Top row, left to right: $\mu_{\ln X}$ = 0, 1.6094, 2.3026 with $\sigma_{\ln X}$ = 1, annotated Pr(X > 8) = 0.0188, 0.3192, 0.5883. Bottom row, left to right: $\mu_{\ln X}$ = 1.6094 with $\sigma_{\ln X}$ = 0.5, 0.1, 0.02. The mode, median, and mean are marked in each panel; horizontal axis x, vertical axis $p_X(x)$.)
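The median $\tilde{m}_X$ is the value of X at which the CDF in Eq. (3.55) equals 0.5. Thus, replacing x in the right-hand side of Eq. (3.55) by $\tilde{m}_X$ and equating to 0.5 allows us to solve for $\tilde{m}_X$

$$1 - Q\left(\frac{\ln\tilde{m}_X - \mu_{\ln X}}{\sigma_{\ln X}}\right) = 0.5; \quad\Rightarrow\quad Q\left(\frac{\ln\tilde{m}_X - \mu_{\ln X}}{\sigma_{\ln X}}\right) = \frac{1}{2}; \quad\Rightarrow\quad \frac{\ln\tilde{m}_X - \mu_{\ln X}}{\sigma_{\ln X}} = 0$$

since Q(0) = 1/2. Thus, $\ln\tilde{m}_X - \mu_{\ln X} = 0$, and hence

$$\tilde{m}_X = \exp(\mu_{\ln X})$$

The mean $E[X] \equiv \mu_X$ and mean square value $E[X^2]$ may be determined by noting that Y = ln(X) is a Gaussian random variable, and $X = e^Y$, so that the kth moment of X is given by

$$\begin{aligned}
E[X^k] = E[(e^Y)^k] = E[e^{kY}] &= \int_{-\infty}^{\infty} e^{ky}\,p_Y(y)\,dy = \frac{1}{\sigma_{\ln X}\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{ky}\,e^{-\frac{(y-\mu_{\ln X})^2}{2\sigma_{\ln X}^2}}\,dy \\
&= \frac{1}{\sigma_{\ln X}\sqrt{2\pi}}\int_{-\infty}^{\infty}\exp\left[-\frac{y^2 - 2(\sigma_{\ln X}^2 k + \mu_{\ln X})y + \mu_{\ln X}^2}{2\sigma_{\ln X}^2}\right]dy
\end{aligned}$$

The numerator of the integrand is in the form $y^2 - 2by + c$, which we may express in the form $(y-b)^2 + c - b^2$. Doing this and simplifying leads to

$$\begin{aligned}
E[X^k] &= \frac{1}{\sigma_{\ln X}\sqrt{2\pi}}\int_{-\infty}^{\infty}\exp\left[-\frac{(y - (\mu_{\ln X} + \sigma_{\ln X}^2 k))^2 + \mu_{\ln X}^2 - (\mu_{\ln X} + \sigma_{\ln X}^2 k)^2}{2\sigma_{\ln X}^2}\right]dy \\
&= \frac{1}{\sigma_{\ln X}\sqrt{2\pi}}\int_{-\infty}^{\infty}\exp\left[-\frac{(y - (\mu_{\ln X} + \sigma_{\ln X}^2 k))^2}{2\sigma_{\ln X}^2} + \frac{2k\mu_{\ln X} + k^2\sigma_{\ln X}^2}{2}\right]dy \\
&= \exp\left(\frac{2k\mu_{\ln X} + k^2\sigma_{\ln X}^2}{2}\right)\left\{\frac{1}{\sigma_{\ln X}\sqrt{2\pi}}\int_{-\infty}^{\infty}\exp\left[-\frac{(y - (\mu_{\ln X} + k\sigma_{\ln X}^2))^2}{2\sigma_{\ln X}^2}\right]dy\right\}
\end{aligned}$$

But the term inside the curly brackets is the total area under a Gaussian PDF of variance $\sigma_{\ln X}^2$ and mean $\mu_{\ln X} + k\sigma_{\ln X}^2$, so this term equals 1. Hence

$$E[X^k] = \exp\left(\frac{2k\mu_{\ln X} + k^2\sigma_{\ln X}^2}{2}\right)$$

Substituting k = 1 for mean and k = 2 for mean square value yields the desired results

$$\begin{aligned}
\text{Mean} &\equiv E[X] \equiv \mu_X = \exp\left(\mu_{\ln X} + \frac{\sigma_{\ln X}^2}{2}\right) \\
\text{Mean square value} &\equiv E[X^2] = \exp[2(\mu_{\ln X} + \sigma_{\ln X}^2)]
\end{aligned}$$

The variance follows straightforwardly from the above two results

$$\begin{aligned}
\text{Variance} \equiv \sigma_X^2 &= E[X^2] - (E[X])^2 \\
&= \exp(2\mu_{\ln X} + 2\sigma_{\ln X}^2) - \exp(2\mu_{\ln X} + \sigma_{\ln X}^2) \\
&= \exp(2\mu_{\ln X} + \sigma_{\ln X}^2)[\exp(\sigma_{\ln X}^2) - 1]
\end{aligned}$$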
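Mode $\hat{m}_X$ is the value of x at which $p_X(x)$ is maximum, which corresponds to the point at which $dp_X(x)/dx = 0$. Taking the derivative of $p_X(x)$

$$\begin{aligned}
\frac{d}{dx}p_X(x) &= \frac{d}{dx}\left(\frac{1}{x\,\sigma_{\ln X}\sqrt{2\pi}}\exp\left[-\frac{1}{2}\left(\frac{\ln x - \mu_{\ln X}}{\sigma_{\ln X}}\right)^2\right]\right) \\
&= -\frac{1}{x^2\sigma_{\ln X}\sqrt{2\pi}}\exp\left[-\frac{1}{2}\left(\frac{\ln x - \mu_{\ln X}}{\sigma_{\ln X}}\right)^2\right] \\
&\quad + \frac{1}{x\,\sigma_{\ln X}\sqrt{2\pi}}\exp\left[-\frac{1}{2}\left(\frac{\ln x - \mu_{\ln X}}{\sigma_{\ln X}}\right)^2\right]\left(-\frac{\ln x - \mu_{\ln X}}{\sigma_{\ln X}}\right)\left(\frac{1}{x\,\sigma_{\ln X}}\right) \\
&= \left\{\frac{1}{x^2\sigma_{\ln X}\sqrt{2\pi}}\exp\left[-\frac{1}{2}\left(\frac{\ln x - \mu_{\ln X}}{\sigma_{\ln X}}\right)^2\right]\right\}\left(\frac{\mu_{\ln X} - \ln x}{\sigma_{\ln X}^2} - 1\right)
\end{aligned}$$

At $x = \hat{m}_X$, this derivative is zero. Notice that there are two factors on the right-hand side of the above equation. The first factor (in curly brackets) can only be zero at x = ∞ which is not an acceptable solution for the mode. Therefore, it is the second factor that will be zero at $x = \hat{m}_X$. Thus

$$\frac{\mu_{\ln X} - \ln\hat{m}_X}{\sigma_{\ln X}^2} - 1 = 0 \quad\Rightarrow\quad \ln\hat{m}_X = \mu_{\ln X} - \sigma_{\ln X}^2$$

Hence, (upon taking the exponential of both sides)

$$\text{Mode} \equiv \hat{m}_X = \exp(\mu_{\ln X} - \sigma_{\ln X}^2)$$

Rms value (denoted $A_{\text{rms}}$) is simply the square root of the mean square value, the formula of which was derived above. Thus

$$A_{\text{rms}} = \exp(\mu_{\ln X} + \sigma_{\ln X}^2)$$

By substituting $x = \hat{m}_X$ into the PDF expression, we obtain the maximum value of the lognormal PDF as

$$\begin{aligned}
p_X(x)_{\max} = p_X(\hat{m}_X) &= p_X(\exp(\mu_{\ln X} - \sigma_{\ln X}^2)) \\
&= \left.\frac{1}{x\,\sigma_{\ln X}\sqrt{2\pi}}\exp\left[-\frac{1}{2}\left(\frac{\ln x - \mu_{\ln X}}{\sigma_{\ln X}}\right)^2\right]\right|_{x=\exp(\mu_{\ln X} - \sigma_{\ln X}^2)} \\
&= \frac{1}{\sigma_{\ln X}\sqrt{2\pi}}\exp(\sigma_{\ln X}^2 - \mu_{\ln X})\exp\left[-\frac{1}{2}\left(\frac{\mu_{\ln X} - \sigma_{\ln X}^2 - \mu_{\ln X}}{\sigma_{\ln X}}\right)^2\right] \\
&= \frac{\exp(-\mu_{\ln X} + \sigma_{\ln X}^2/2)}{\sigma_{\ln X}\sqrt{2\pi}}
\end{aligned}$$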
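The moment formula $E[X^k] = \exp((2k\mu_{\ln X} + k^2\sigma_{\ln X}^2)/2)$ obtained in this worked example can be checked by simulation, using the fact that $X = e^Y$ with Y Gaussian. The following is a rough Monte Carlo sketch; the parameter values ($\mu_{\ln X}$ = 1, $\sigma_{\ln X}$ = 0.5), sample count, seed, and function names are arbitrary assumptions made for illustration.

```python
import math
import random

def lognormal_moment(k, mu, sigma):
    """Closed-form k-th moment E[X^k] of a lognormal variable."""
    return math.exp((2 * k * mu + k * k * sigma * sigma) / 2)

def sample_moment(k, mu, sigma, count, rng):
    """Monte Carlo estimate of E[X^k], with X = exp(Y) and Y ~ Gaussian(mu, sigma)."""
    return sum(math.exp(k * rng.gauss(mu, sigma)) for _ in range(count)) / count

mu, sigma = 1.0, 0.5          # illustrative log-domain mean and standard deviation
rng = random.Random(3)        # fixed seed so the run is repeatable

for k in (1, 2):
    estimate = sample_moment(k, mu, sigma, 200_000, rng)
    print(f"k = {k}: formula = {lognormal_moment(k, mu, sigma):.4f}, simulated = {estimate:.4f}")
```

Setting k = 1 and k = 2 corresponds to the mean and mean square value, respectively.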
3.4.4 Rician Distribution

It is often the case, especially in microcellular wireless transmission environments, that the received signal R′ is the resultant of a dominant direct signal of magnitude A combined with n significantly weaker multipath signals. Following our earlier discussion (see Section 3.4.2), and using the direct signal as reference, the situation is now that of combining a constant in-phase component A with random in-phase and quadrature components x and y, which are respective samples of zero-mean Gaussian random variables X and Y having equal variance $\sigma^2$. Figure 3.11 shows that the envelope or resultant magnitude of this received signal is given by

$$r' = \sqrt{(A + x)^2 + y^2} \tag{3.63}$$

which will obviously have some random variation. We are interested in determining its PDF $p_{R'}(r')$. Notice that we are now dealing with two independent random variables, namely Y, which is Gaussian with mean $\mu_Y = 0$ and variance $\sigma_Y^2 = \sigma^2$, and X′ = A + X, which is Gaussian with mean $\mu_{X'} = A$ and variance $\sigma_{X'}^2 = \sigma^2$. Thus

$$\begin{aligned}
p_Y(y) &= \frac{1}{\sqrt{2\pi\sigma^2}}\exp\left(-\frac{y^2}{2\sigma^2}\right); \\
p_{X'}(x') &= \frac{1}{\sqrt{2\pi\sigma^2}}\exp\left(-\frac{(x'-A)^2}{2\sigma^2}\right)
\end{aligned} \tag{3.64}$$

Figure 3.11 Combination of line-of-sight signal of amplitude A with random multipath-induced in-phase, and quadrature components x and y to produce a resultant signal having magnitude r′ and phase φ.

A sample of the received signal is characterised by its magnitude r′ given in Eq. (3.63) and phase φ shown in Figure 3.11, which are respective samples of random variables R′ and Φ. However, these random variables are not independent since the amount φ by which the phase of R′ deviates from the reference 0° direction of the dominant direct signal (due to the addition of the random quadrature component y) depends on the value of A and hence of R′. This observation has an important implication in that we therefore cannot express the joint probability distribution $p_{R',\Phi}(r', \phi)$ of R′ and Φ as the product of their respective individual PDFs $p_{R'}(r')$ and $p_\Phi(\phi)$. The following relationships are obvious from Figure 3.11 and are used in the derivation below.

$$\begin{aligned}
&x' = A + x; \qquad dx' = dx \\
&dA = dx'\,dy = dx\,dy = r'\,dr'\,d\phi \\
&x' = r'\cos\phi; \qquad y = r'\sin\phi \\
&x'^2 + y^2 = (r'\cos\phi)^2 + (r'\sin\phi)^2 = r'^2
\end{aligned}$$
The probability that a sample of R′ will lie in the shaded elemental area dA is given in terms of the joint probability distribution of R′ and Φ as

$$\begin{aligned}
p_{R',\Phi}(r',\phi)\,dr'\,d\phi &= \Pr(r' \le R' \le r'+dr',\ \phi \le \Phi \le \phi+d\phi) \\
&= \Pr(x' \le X' \le x'+dx',\ y \le Y \le y+dy) \\
&= \Pr(x' \le X' \le x'+dx')\cdot\Pr(y \le Y \le y+dy) \\
&= p_{X'}(x')\,dx' \cdot p_Y(y)\,dy = p_{X'}(x')\,p_Y(y)\,dx\,dy \\
&= p_{X'}(x')\,p_Y(y)\,r'\,dr'\,d\phi
\end{aligned}$$

Using the expressions for $p_{X'}(x')$ and $p_Y(y)$ given in Eq. (3.64) yields

$$\begin{aligned}
p_{R',\Phi}(r',\phi)\,dr'\,d\phi &= \frac{1}{2\pi\sigma^2}\exp\left(-\frac{(x'-A)^2 + y^2}{2\sigma^2}\right)r'\,dr'\,d\phi \\
&= \frac{1}{2\pi\sigma^2}\exp\left(-\frac{r'^2 + A^2 - 2Ar'\cos\phi}{2\sigma^2}\right)r'\,dr'\,d\phi
\end{aligned}$$

Thus

$$p_{R',\Phi}(r',\phi) = \frac{r'}{2\pi\sigma^2}\exp\left(-\frac{r'^2 + A^2 - 2Ar'\cos\phi}{2\sigma^2}\right) \tag{3.65}$$
This is the joint PDF of R′ and Φ, which, as earlier noted, cannot be equated to $p_{R'}(r')\,p_\Phi(\phi)$ because the two random variables are not independent. To obtain our desired PDF of R′, observe that $p_{R'}(r')\,dr'$ is by definition the probability that R′ will have a value in the infinitesimal interval (r′, r′ + dr′). This is the probability that a sample of the received signal will have a magnitude lying anywhere within the annular area demarcated by the two dotted circles in Figure 3.11. We therefore obtain $p_{R'}(r')\,dr'$ by summing the probability $p_{R',\Phi}(r',\phi)\,dr'\,d\phi$ that R′ lies within all the elemental areas dA as φ goes full circle from 0 to 2π to cover the entire annular area. That is

$$\begin{aligned}
p_{R'}(r')\,dr' &= \int_0^{2\pi} p_{R',\Phi}(r',\phi)\,dr'\,d\phi \\
&= \frac{r'\,dr'}{2\pi\sigma^2}\exp\left(-\frac{r'^2 + A^2}{2\sigma^2}\right)\int_0^{2\pi}\exp\left(\frac{Ar'}{\sigma^2}\cos\phi\right)d\phi
\end{aligned}$$

The integral in the above equation cannot be evaluated in closed form but may be expressed in terms of the well-known modified Bessel function of the first kind of zero order $I_0(x)$ which is given by

$$I_0(x) = \frac{1}{2\pi}\int_0^{2\pi}\exp(x\cos\phi)\,d\phi = \sum_{m=0}^{\infty}\frac{(x/2)^{2m}}{(m!)^2} \tag{3.66}$$

Thus, we finally obtain

$$p_{R'}(r') = \frac{r'}{\sigma^2}\exp\left(-\frac{r'^2 + A^2}{2\sigma^2}\right)I_0\left(\frac{Ar'}{\sigma^2}\right) \tag{3.67}$$
This is the Rician distribution, named in honour of Stephen Oswald Rice (1907–1986), an American scientist and author of the classic papers on mathematical analysis of noise [2, 3]. As expected, this distribution reduces to the Rayleigh distribution of Eq. (3.42) when A = 0, since $I_0(0) = 1$. Furthermore, in the limit $A^2/2\sigma^2 \gg 1$, the random components are small compared to A, and the quadrature component y may be ignored so that r′ ≈ A + x, which is a Gaussian random variable of mean A. Thus, the Rician distribution tends towards Rayleigh in the limit $A^2/2\sigma^2 \ll 1$ when there is no distinct line-of-sight signal, and towards Gaussian in the opposite limit $A^2/2\sigma^2 \gg 1$ when the line-of-sight signal is overwhelmingly dominant.

The Rician distribution depends on two parameters, namely the magnitude A of the reference constant component of power $A^2$ (see Eq. (3.108)) and the power $2\sigma^2$ in the random multipath contributions. The ratio between
power in the line-of-sight component and power in the multipath components is known as the Rician K-factor, often expressed in dB (but converted to its non-dB ratio when used in formulas)

$$K = \frac{A^2}{2\sigma^2}; \qquad K = 10\log_{10}\left(\frac{A^2}{2\sigma^2}\right)\ \text{dB} \tag{3.68}$$

The K-factor is often employed as a parameter to describe and characterise the Rician distribution. It has been reported [4] that measurements in microcellular environments show K in the range 6–30 dB. We may further simplify the Rician PDF by expressing it as $p_V(v)$, where v is a dimensionless quantity specifying the factor by which the sample r′ exceeds σ. That is, introducing

$$v = \frac{r'}{\sigma}; \quad a = \frac{A}{\sigma}; \quad\Rightarrow\quad r' = \sigma v; \quad A = \sigma a \tag{3.69}$$

it follows that

$$\begin{aligned}
p_V(v)\,dv &= \Pr(v \le V \le v+dv) = \Pr(\sigma v \le R' \le \sigma(v+dv)) \\
&= \Pr(\sigma v \le R' \le \sigma v + \sigma\,dv) = p_{R'}(\sigma v)\,\sigma\,dv \\
&= \sigma\,p_{R'}(r')\,dv
\end{aligned}$$

And hence

$$p_V(v) = \sigma\,p_{R'}(r') \tag{3.70}$$

Using these substitutions in Eq. (3.67) gives the normalised Rician PDF

$$p_V(v) = v\exp\left(-\frac{v^2 + a^2}{2}\right)I_0(av) \tag{3.71}$$

In terms of the K-factor this becomes

$$p_V(v) = v\exp\left[-\left(K + \frac{v^2}{2}\right)\right]I_0(v\sqrt{2K}) \tag{3.72}$$

The CDF and CCDF of a Rician random variable are given by

$$\begin{aligned}
\text{CDF} \equiv \Pr(R' \le \sigma v) = \Pr(V \le v) &= \int_0^v p_V(z)\,dz \\
&= \int_0^v z\exp\left[-\left(K + \frac{z^2}{2}\right)\right]I_0(z\sqrt{2K})\,dz = 1 - Q_1(\sqrt{2K}, v)
\end{aligned} \tag{3.73}$$

$$\text{CCDF} \equiv \Pr(R' > \sigma v) = 1 - \text{CDF} = Q_1(\sqrt{2K}, v) \tag{3.74}$$

where $Q_1(c, d)$ is the Marcum Q-function given by

$$\begin{aligned}
Q_1(c, d) &= \int_d^{\infty} z\exp\left(-\frac{z^2 + c^2}{2}\right)I_0(cz)\,dz \\
&= \exp\left(-\frac{c^2 + d^2}{2}\right)\sum_{n=0}^{\infty}(c/d)^n I_n(cd)
\end{aligned} \tag{3.75}$$

and $I_n$ is the modified Bessel function of order n given by

$$I_n(x) = \sum_{m=0}^{\infty}\frac{(x/2)^{2m+n}}{m!\,(m+n)!} \tag{3.76}$$
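Equations (3.72) to (3.76) can be evaluated numerically without special-function libraries. The sketch below is illustrative, with hypothetical function names: it sums the series of Eq. (3.76) for $I_n(x)$ until the terms become negligible, evaluates the Marcum Q-function by the series of Eq. (3.75), and cross-checks it against a Simpson's-rule integration of the normalised PDF of Eq. (3.72). The truncation rule, the integration limit, and the step count are assumptions of this sketch, suitable only for moderate K.

```python
import math

def bessel_i(n, x):
    """Modified Bessel function I_n(x), summed from the series of Eq. (3.76)."""
    term = (x / 2) ** n / math.factorial(n)   # m = 0 term
    total = 0.0
    m = 0
    while m == 0 or term > 1e-17 * total:
        total += term
        m += 1
        term *= (x / 2) ** 2 / (m * (m + n))  # ratio of successive series terms
    return total

def rician_pdf(v, k):
    """Normalised Rician PDF of Eq. (3.72); k is the K-factor as a ratio, not in dB."""
    return v * math.exp(-(k + v * v / 2)) * bessel_i(0, v * math.sqrt(2 * k))

def marcum_q1_series(c, d, max_terms=150):
    """Marcum Q-function from the series of Eq. (3.75); requires d > 0."""
    total = 0.0
    for n in range(max_terms):
        term = (c / d) ** n * bessel_i(n, c * d)
        total += term
        if n > 0 and term < 1e-16 * total:
            break
    return math.exp(-(c * c + d * d) / 2) * total

def rician_ccdf_numeric(v, k, steps=4000):
    """Pr(V > v) by Simpson's rule applied to the PDF of Eq. (3.72)."""
    upper = max(v, math.sqrt(2 * k)) + 12.0   # assumed cut-off; the tail beyond is negligible
    h = (upper - v) / steps
    total = rician_pdf(v, k) + rician_pdf(upper, k)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * rician_pdf(v + i * h, k)
    return total * h / 3

k = 10 ** (6 / 10)   # K = 6 dB converted to a ratio
for v in (1.0, 3.0, 5.0):
    series = marcum_q1_series(math.sqrt(2 * k), v)
    numeric = rician_ccdf_numeric(v, k)
    print(f"v = {v}: series = {series:.6f}, numerical integration = {numeric:.6f}")
```

With K = 0 (that is, c = 0) the series collapses to exp(−d²/2), which is the Rayleigh CCDF of Eq. (3.46) with r = σd.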
Figure 3.12 Rician PDF $p_{R'}(r')$ and CCDF Pr(V > v) at various values of K-factor. (Curves for K = −∞ dB, 0 dB, 6 dB, and 10 dB; horizontal axis: normalised signal level v = r′/σ.)
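The mean square value of a Rician distributed random variable is its total power

$$\text{Mean square value} \equiv E[R'^2] = A^2 + 2\sigma^2 = 2\sigma^2(K + 1) \tag{3.77}$$

And the mean is given by

$$E[R'] \equiv \mu_{R'} = \sigma\sqrt{\frac{\pi}{2}}\exp\left(-\frac{K}{2}\right)\left[(K+1)\,I_0\left(\frac{K}{2}\right) + K\,I_1\left(\frac{K}{2}\right)\right] \tag{3.78}$$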
where I 0 and I 1 are as given in Eq. (3.76) for n = 0, 1, respectively. Figure 3.12 shows the PDF and CCDF (exceedance distribution function) of a Rician random variable at various K-factors. The value r ′ of the random variable R′ is normalised by 𝜎 as stated in Eq. (3.69). Notice how the PDF gradually morphs from Rayleigh at K = −∞ dB (and hence A = 0 as per Eq. (3.68)) towards bell-shaped (i.e. Gaussian) as K increases.
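Equations (3.77) and (3.78) can be verified by numerically integrating the normalised PDF of Eq. (3.72). The following is a minimal sketch with σ = 1, so that v = r′; the function names, the Simpson's-rule step count, and the integration cut-off are assumptions of this illustration rather than anything prescribed by the text.

```python
import math

def bessel_i(n, x):
    """Modified Bessel function I_n(x), summed from the series of Eq. (3.76)."""
    term = (x / 2) ** n / math.factorial(n)
    total = 0.0
    m = 0
    while m == 0 or term > 1e-17 * total:
        total += term
        m += 1
        term *= (x / 2) ** 2 / (m * (m + n))
    return total

def rician_pdf(v, k):
    """Normalised Rician PDF of Eq. (3.72); k is the K-factor as a ratio."""
    return v * math.exp(-(k + v * v / 2)) * bessel_i(0, v * math.sqrt(2 * k))

def rician_moment(order, k, steps=4000):
    """Integral of v**order * p_V(v) over v >= 0, by Simpson's rule."""
    upper = math.sqrt(2 * k) + 12.0           # assumed cut-off; the tail beyond is negligible
    h = upper / steps
    def f(v):
        return v ** order * rician_pdf(v, k)
    total = f(0.0) + f(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(i * h)
    return total * h / 3

def rician_mean_formula(k):
    """Mean from Eq. (3.78) with sigma = 1."""
    return math.sqrt(math.pi / 2) * math.exp(-k / 2) * (
        (k + 1) * bessel_i(0, k / 2) + k * bessel_i(1, k / 2))

for k_db in (0, 6, 10):
    k = 10 ** (k_db / 10)
    print(f"K = {k_db} dB: mean {rician_moment(1, k):.5f} vs {rician_mean_formula(k):.5f}; "
          f"mean square {rician_moment(2, k):.5f} vs {2 * (k + 1):.5f}")
```

Setting K = 0 in `rician_mean_formula` returns √(π/2), the Rayleigh mean of Eq. (3.48) for σ = 1.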
3.4.5 Exponential and Poisson Distributions

Consider again a Rayleigh random variable R representing the magnitude of the resultant received signal in a multipath environment in the absence of a dominant direct ray. The instantaneous power of R is defined as the square of each sample r of the random variable and will itself be a random variable. Since half of this power is reactive, let us define a new random variable $W = R^2/2$ representing the active instantaneous power of R. We wish to determine the distribution function of W, and will make use of the following relationships between w and r, respectively the samples of W and R

$$w = r^2/2; \qquad r^2 = 2w; \qquad r = (2w)^{1/2}$$
3 Time Domain Analysis of Signals and Systems
The PDF $p_W(w)$ of W is, by definition, given by

$$
\begin{aligned}
p_W(w)\,dw &\equiv \Pr(w \le W \le w + dw) = \Pr\left((2w)^{1/2} \le R \le [2(w + dw)]^{1/2}\right) \\
&= \Pr\left(\sqrt{2w} \le R \le (2w)^{1/2}(1 + dw/w)^{1/2}\right) \\
&= \Pr\left(\sqrt{2w} \le R \le \sqrt{2w} + dw/\sqrt{2w}\right) \\
&= \int_{\sqrt{2w}}^{\sqrt{2w} + dw/\sqrt{2w}} p_R(r)\,dr \\
&= p_R\left(\sqrt{2w}\right) dw/\sqrt{2w}
\end{aligned}
$$

where we have simplified the third line above by noting that $(1 + a)^n \approx 1 + na$ for a ≪ 1, and thus $(1 + dw/w)^{1/2} = 1 + dw/(2w)$ since dw is infinitesimally small. Comparing the left-hand side with the last step of the right-hand side, we see that

$$
p_W(w) = \frac{p_R\left(\sqrt{2w}\right)}{\sqrt{2w}}
$$

which means that $p_W(w)$ is obtained by evaluating $p_R(r)$ (in Eq. (3.43)) at $r = \sqrt{2w}$ and dividing by $\sqrt{2w}$. Thus

$$
\begin{aligned}
p_W(w) &= \frac{\sqrt{2w}}{\sigma^2} \exp\left(-\frac{2w}{2\sigma^2}\right) \cdot \frac{1}{\sqrt{2w}} \\
&= \frac{1}{\sigma^2} \exp\left(-\frac{w}{\sigma^2}\right) u(w)
\end{aligned}
\tag{3.79}
$$

This is the exponential distribution, applicable to a positive random variable that takes on continuous values in the range (0, ∞). The unit step function u(w) is introduced above for the sole purpose of emphasising that $p_W(w) = 0$ for w < 0 and may be omitted, its presence being implied by the nature of the distribution and more specifically the fact that the random variable W is always positive. The exponential distribution is often written in the form

$$
p_W(w) = \lambda e^{-\lambda w}
\tag{3.80}
$$
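and is characterised by a single parameter λ (≡ 1/σ² in Eq. (3.79)). In terms of λ, its CDF and CCDF are obtained as

$$
\begin{aligned}
\mathrm{CDF} \equiv \Pr(W \le w) &= \int_0^w p_W(z)\,dz \\
&= \int_0^w \lambda e^{-\lambda z}\,dz = \left. -e^{-\lambda z} \right|_0^w \\
&= 1 - e^{-\lambda w}
\end{aligned}
\tag{3.81}
$$

$$
\mathrm{CCDF} \equiv \Pr(W > w) = 1 - \mathrm{CDF} = e^{-\lambda w}
\tag{3.82}
$$

And it has the following characteristic values

$$
\text{Mean} \equiv E[W] = 1/\lambda
\tag{3.83}
$$

$$
\text{Mean square value} \equiv E[W^2] = 2/\lambda^2
\tag{3.84}
$$

$$
\text{Rms value} \equiv \sqrt{E[W^2]} = \sqrt{2}/\lambda
\tag{3.85}
$$

$$
\text{Variance} \equiv \sigma_W^2 = E[W^2] - (E[W])^2 = 1/\lambda^2
\tag{3.86}
$$

$$
\text{Median} = \frac{\ln 2}{\lambda}
\tag{3.87}
$$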
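The result that W = R²/2 is exponentially distributed with λ = 1/σ² can be checked by simulation. The sketch below assumes the usual construction of a Rayleigh envelope as the magnitude of two independent zero-mean Gaussian components of standard deviation σ; the function names, seed, and sample size are arbitrary choices for this illustration.

```python
import math
import random

def simulate_active_power(sigma, n_samples, seed=1):
    """Draw samples of W = R^2 / 2, with R the Rayleigh envelope of two Gaussians."""
    rng = random.Random(seed)
    powers = []
    for _ in range(n_samples):
        x = rng.gauss(0.0, sigma)  # in-phase component (assumed model)
        y = rng.gauss(0.0, sigma)  # quadrature component (assumed model)
        r = math.hypot(x, y)       # Rayleigh-distributed magnitude
        powers.append(r * r / 2)
    return powers

def summarise(powers):
    """Return the sample mean, mean square value, and median."""
    ordered = sorted(powers)
    n = len(ordered)
    mean = sum(ordered) / n
    mean_square = sum(w * w for w in ordered) / n
    median = ordered[n // 2]
    return mean, mean_square, median

if __name__ == "__main__":
    sigma = 1.5
    lam = 1 / sigma ** 2
    mean, mean_square, median = summarise(simulate_active_power(sigma, 100_000))
    print("mean       :", mean, "theory", 1 / lam)
    print("mean square:", mean_square, "theory", 2 / lam ** 2)
    print("median     :", median, "theory", math.log(2) / lam)
```

The simulated statistics should approach the values of Eqs. (3.83), (3.84), and (3.87) as the number of samples grows.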
We see from Eqs. (3.79), (3.80), and (3.83) that the received instantaneous active power in a multipath environment (with no dominant direct ray) is exponentially distributed with average value σ². Although we have derived it in connection with the distribution of received signal power in a multipath environment, the exponential distribution arises more naturally in the distribution of the time interval τ between successive events or arrivals (also known as inter-arrival time) in a Poisson process, in which case the distribution parameter λ represents the average arrival rate (which must be constant over the observation interval of interest). A Poisson process is one that satisfies the following fundamental properties:

1. The probability of one event or arrival in an infinitesimally small interval Δt is λΔt, where λ is a specified constant and λΔt ≪ 1.
2. The probability of zero arrival during the interval Δt is 1 − λΔt.
3. Arrivals are memoryless. This means that an arrival or event in any given time interval is independent of events in previous or future intervals.
4. There cannot be more than one arrival in the interval Δt. This means that two or more arrivals cannot occur at the same time instant; so that in an infinitesimally small interval Δt there are only two possibilities, namely either there is no arrival or there is exactly one arrival.

The Poisson process is thus a special case of a Markov process, where the probability of an event at time t + Δt depends only on the probability at time t.

A discrete random variable X is said to have a Poisson distribution if it takes on integer values 0, 1, 2, 3, … in accordance with a Poisson process. The probability mass function (PMF) of X, denoted Pr(X = k), gives the probability that X is exactly equal to some value k. We wish to determine the Poisson PMF, and specifically the probability that in a Poisson arrival process there will be exactly k arrivals in some finite interval of duration D.
Dividing the interval D into n infinitesimally small slots Δt, so that D = nΔt and Δt = D/n, then Δt → 0 in the limit n → ∞. In this limit, there will be either one arrival in each slot Δt with probability p = λΔt = λD/n or zero arrival with probability 1 − p. Figure 3.13 shows the subsets of observations having exactly k arrivals in D. There are $^nC_k$ (pronounced ‘n-choose-k’ or ‘n-combination-k’) such subsets, corresponding to the number of ways to select k slots out of n available. Noting that in selecting the k slots, the first pick may be any of the n slots, followed by the second pick, which may be any of the remaining n − 1 slots, and so on up to the kth pick which may be any of the remaining n − (k − 1) slots, we see that there are n(n − 1)(n − 2)…(n − k + 1) possibilities. However, every k! = k(k − 1)(k − 2)…1 of these are identical selections, being merely repeats of the same set of k slots in a different order. For example, if k = 3 then every 3! = 3 × 2 × 1 = 6 sets, such as slots {2, 3, 4}, {2, 4, 3}, {3, 2, 4}, {3, 4, 2}, {4, 2, 3}, {4, 3, 2}, will occur as repeats of the same selection of slots 2, 3, and 4. The number of ways to select k slots, regardless of order, from n available is therefore

$$
{}^nC_k \equiv \binom{n}{k} = \frac{n(n-1)(n-2)\cdots(n-k+1)}{k(k-1)(k-2)\cdots 1} = \frac{n!}{k!\,(n-k)!}
\tag{3.88}
$$

Each of the observation subsets shown in Figure 3.13 occurs as a result of there being one arrival (with probability p) in some first slot AND one arrival (with probability p) in some second slot … AND one arrival (with probability p) in some kth slot AND zero arrival (with probability 1 − p) in some (k + 1)th slot … AND zero arrival (with probability 1 − p) in the nth slot. Since events in the slots are independent, the probability of each subset is therefore the product of these probabilities, i.e. $p^k(1-p)^{n-k}$. And because these subsets are mutually exclusive, their probabilities add to give the following probability of there being exactly k arrivals in interval D

$$
\begin{aligned}
\Pr(X = k) &= \lim_{n\to\infty} \binom{n}{k} p^k (1-p)^{n-k} = \lim_{n\to\infty} \binom{n}{k} \left(\frac{\lambda D}{n}\right)^k \left(1 - \frac{\lambda D}{n}\right)^{n-k} \\
&= \lim_{n\to\infty} \left[\frac{n(n-1)(n-2)\cdots(n-(k-1))}{n^k} \left(1 - \frac{\lambda D}{n}\right)^{-k} \left(1 - \frac{\lambda D}{n}\right)^{n}\right] \frac{(\lambda D)^k}{k!} \\
&= \lim_{n\to\infty} \left[\left(1 - \frac{1}{n}\right)\left(1 - \frac{2}{n}\right)\cdots\left(1 - \frac{k-1}{n}\right) \left(1 - \frac{\lambda D}{n}\right)^{-k} \left(1 - \frac{\lambda D}{n}\right)^{n}\right] \frac{(\lambda D)^k}{k!} \\
&= \lim_{n\to\infty} \left[\left(1 - \frac{\lambda D}{n}\right)^{n}\right] \frac{(\lambda D)^k}{k!}
\end{aligned}
$$

Figure 3.13 Subsets of observations with exactly k arrivals in a finite interval D. Each subset comprises k slots of width Δt with one arrival (probability p) and n − k slots with no arrival (probability 1 − p); there are $^nC_k$ such subsets, each of probability $p^k(1-p)^{n-k}$.

To evaluate the first term on the right-hand side of the above equation, consider the binomial expansion of $(1 + x/n)^n$ in the limit n → ∞

$$
\begin{aligned}
\lim_{n\to\infty} (1 + x/n)^n &= \lim_{n\to\infty} \left[1 + nx/n + n(n-1)(x/n)^2/2! + n(n-1)(n-2)(x/n)^3/3! + \cdots\right] \\
&= 1 + x + x^2/2! + x^3/3! + x^4/4! + \cdots = e^x
\end{aligned}
$$

In view of this definition for the exponential function, it follows that the term in question is exp(−λD) and hence

$$
\Pr(X = k) = \frac{(\lambda D)^k}{k!} \exp(-\lambda D)
\tag{3.89}
$$

This is the PMF of the Poisson distribution. The mean E[X] (i.e. expected number of arrivals in interval D), mean square value E[X²], and variance $\sigma_X^2$ of the distribution are given by (see Question 3.4)

$$
\begin{aligned}
E[X] &= \lambda D \\
E[X^2] &= (\lambda D)^2 + \lambda D \\
\sigma_X^2 &= E[X^2] - (E[X])^2 = \lambda D
\end{aligned}
\tag{3.90}
$$
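The limiting argument behind Eq. (3.89) can be illustrated numerically by comparing the binomial probability for a large but finite number of slots n with the Poisson PMF, and the moments of Eq. (3.90) can be checked by direct summation. This is only a sketch: the function names are chosen for this example, the Poisson PMF is evaluated through logarithms to avoid large factorials, and the moment sums are truncated at `k_max` terms.

```python
import math

def binomial_pmf(k, n, p):
    """Probability of exactly k arrivals in n independent slots, each with probability p."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

def poisson_pmf(k, mean_arrivals):
    """Eq. (3.89), with mean_arrivals = lambda * D > 0."""
    log_pmf = k * math.log(mean_arrivals) - mean_arrivals - math.lgamma(k + 1)
    return math.exp(log_pmf)

def poisson_moments(mean_arrivals, k_max=100):
    """Mean, mean square value and variance by summing over k = 0 .. k_max - 1."""
    pmf = [poisson_pmf(k, mean_arrivals) for k in range(k_max)]
    mean = sum(k * pk for k, pk in enumerate(pmf))
    mean_square = sum(k * k * pk for k, pk in enumerate(pmf))
    return mean, mean_square, mean_square - mean ** 2

if __name__ == "__main__":
    lam_d = 3.0  # hypothetical value of lambda * D
    for n in (10, 100, 10_000, 1_000_000):
        print(n, binomial_pmf(2, n, lam_d / n), poisson_pmf(2, lam_d))
    print(poisson_moments(lam_d))
```

As n increases, the binomial value should converge on the Poisson value, mirroring the limit taken in the derivation.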
And since the average number of arrivals in interval D is 𝜆D, it follows that the distribution parameter 𝜆 is the average rate of arrivals, which may be estimated as N/D by observing the number of arrivals N in a sufficiently large
interval D, where the value of D is chosen such that λD ≫ 1. In practice, λ may exhibit some recurring variation and so D must be chosen to be just large enough to satisfy this condition while fully spanning only the interval of interest in the cycle. For example, the number of customers arriving at a supermarket checkout each minute will vary with time of day and even day of week, and the number of telephone calls arriving at a telephone exchange will also vary with time of day, so λ would need to be estimated during peak intervals to allow the service system to be designed to cope with busy hour traffic. Note that the quantity λD is a dimensionless number and therefore it may be added to its square as done in Eq. (3.90) for E[X²] without any dimensional inconsistency.

The Poisson process is extensively employed in telecommunications network design to model random behaviour or events from a population of users, such as the arrival of telephone calls at a local exchange or data packets at a queue. It is shown below that a Poisson arrival process gives rise to an exponential distribution for the time between successive arrivals. Furthermore, a Poisson departure process, such as telephone call completion or packet departure from a queue, is produced by an exponential distribution of service duration (e.g. telephone call duration, known as call holding time, and how long it takes for a packet to be served at a queue (excluding the time spent waiting in the queue for service to commence)). In this case the exponential distribution parameter represents the service rate, i.e. the average number of customers (e.g. telephone calls or packets) served per unit time.

Let us examine the times τ between successive arrivals in a Poisson process that generates a random variable X having the Poisson distribution given in Eq. (3.89). In Figure 3.14, we start at time t = 0 to observe these Poisson arrivals, the first of which occurs at time t = T.
Consider then some arbitrary time t = τ into our observation. If τ < T, then there is no arrival within the interval (0, τ) and thus

$$
\Pr(\tau < T) = \Pr(X = 0) = \exp(-\lambda\tau)
$$

where we have used Eq. (3.89) with observation interval D ≡ τ and number of arrivals k = 0. Regarding the value of the resulting 0!, recall that n! is, by definition, the number of ways to order a set with n elements, and there is only one way to order a set with no elements, hence 0! = 1. Since Pr(T ≤ τ) = 1 − Pr(τ < T), it follows that

$$
\Pr(T \le \tau) \equiv P_T(\tau) = 1 - \exp(-\lambda\tau)
\tag{3.91}
$$

This is the CDF of the random variable T, which upon differentiation according to Eq. (3.15) yields the corresponding PDF $p_T(\tau)$ as

$$
p_T(\tau) = \frac{d}{d\tau} P_T(\tau) = \lambda \exp(-\lambda\tau)
\tag{3.92}
$$
which is an exponential distribution. Therefore, as earlier stated, a Poisson arrival process gives rise to an exponential distribution for the time τ between successive arrivals. Similarly, it is easy to show that a Poisson departure process arises out of an exponential service time distribution. Equation (3.83) gives the mean of T as 1/λ, which is the average inter-arrival time. An exponential PDF for the distribution of inter-arrival times of a Poisson arrival process is shown in Figure 3.15 along with tabulated cumulative and exceedance probabilities. We see that on average 63% of next arrivals will be within the mean inter-arrival duration and only 5% of next arrivals will exceed three times the mean inter-arrival duration.

Figure 3.14 Observation of Poisson arrivals: observation starts at t = 0, an arbitrary time t = τ is considered, and the first arrival occurs at t = T.

Figure 3.15 PDF of an exponential distribution, $p_T(\tau) = \lambda\exp(-\lambda\tau)$, along with tabulated cumulative and exceedance probabilities at selected values of inter-arrival time τ. The mean inter-arrival time is 1/λ.

τ       CDF ≡ Pr(T ≤ τ)    CCDF ≡ Pr(T > τ)
1/λ     0.6321             0.3679
2/λ     0.8647             0.1353
3/λ     0.9502             0.0498
4/λ     0.9817             0.0183
5/λ     0.9933             0.0067
6/λ     0.9975             0.0025
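The link between exponential inter-arrival times and Poisson counts can also be explored by simulation. The sketch below, offered only as an illustration, generates arrivals with exponentially distributed gaps, counts how many fall in consecutive windows of duration D, and collects the gaps themselves. The function name, rate, window length, and seed are all hypothetical choices.

```python
import random

def simulate_arrivals(rate, window, n_windows, seed=7):
    """Generate arrivals with exponential gaps; return per-window counts and the gaps."""
    rng = random.Random(seed)
    counts = [0] * n_windows
    gaps = []
    horizon = window * n_windows
    t = 0.0
    while True:
        gap = rng.expovariate(rate)  # exponential inter-arrival time, mean 1/rate
        t += gap
        if t >= horizon:
            break
        gaps.append(gap)
        index = min(int(t / window), n_windows - 1)  # window containing this arrival
        counts[index] += 1
    return counts, gaps

def mean_and_variance(values):
    """Plain (population) mean and variance of a list of numbers."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    return mean, variance

if __name__ == "__main__":
    rate, window = 2.0, 5.0  # lambda and D, so lambda * D = 10
    counts, gaps = simulate_arrivals(rate, window, 20_000)
    print("count mean, variance:", mean_and_variance(counts))
    within_mean = sum(g <= 1 / rate for g in gaps) / len(gaps)
    beyond_three = sum(g > 3 / rate for g in gaps) / len(gaps)
    print("Pr(T <= 1/lambda) ~", within_mean, " Pr(T > 3/lambda) ~", beyond_three)
```

The count mean and variance should both be close to λD, as in Eq. (3.90), and the two gap fractions should be close to the 0.6321 and 0.0498 entries tabulated with Figure 3.15.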
3.5 Signal Characterisation

Since signals are necessarily time-varying, it is often useful to extract from the signal a single feature or parameter that quantifies or summarises the level, effectiveness, strength, or variability of the signal. Some of the basic characterisation parameters include the mean level, root-mean-square (rms) value, power, energy, and autocorrelation of the signal. The similarity or otherwise between two signals may also be specified using measures of their correlation and covariance.

3.5.1 Mean

The mean or average value of a signal is the level around which the instantaneous values of the signal vary with time. The signal may be an electric current or voltage, in which case the mean value is referred to as the DC (for direct current) value. We know that the mean $A_0$ of a set of sample values such as {4, 4, 6, 8, 10, 10, 10, 12} is the sum of the samples divided by the number of samples. Rearranging this expression for $A_0$ leads to

$$
\begin{aligned}
A_0 &= \frac{4 + 4 + 6 + 8 + 10 + 10 + 10 + 12}{8} \\
&= 4 \times \frac{2}{8} + 6 \times \frac{1}{8} + 8 \times \frac{1}{8} + 10 \times \frac{3}{8} + 12 \times \frac{1}{8}
\end{aligned}
$$

which indicates that the mean is obtained by summing the product of each sample value and the fraction of the set composition corresponding to that value. This fraction represents the probability of occurrence of each value within the set. In general, the mean $A_0$ of a signal that takes on N possible values $\{g_0, g_1, g_2, \ldots, g_{N-1}\}$ with respective probabilities $\{p_0, p_1, p_2, \ldots, p_{N-1}\}$ is

$$
A_0 = g_0 p_0 + g_1 p_1 + g_2 p_2 + \cdots + g_{N-1} p_{N-1} \equiv \sum_{n=0}^{N-1} g_n p_n
\tag{3.93}
$$
This equation is applicable to a wide range of signal types. For example, if the signal is a sequence of samples $\{g_0, g_1, g_2, \ldots, g_{N-1}\}$ taken in respective nonuniform intervals $\{\tau_0, \tau_1, \tau_2, \ldots, \tau_{N-1}\}$, then the probability of the nth sample $g_n$ is

$$
p_n = \tau_n/\mathbb{T}, \quad \text{where} \quad \mathbb{T} = \sum_{n=0}^{N-1} \tau_n
$$

is the total duration of the signal. If it is a discrete signal $g(nT_s)$ consisting of a sequence of N samples $g(0), g(T_s), g(2T_s), \ldots, g([N-1]T_s)$ taken at a uniform interval $T_s$, then the probability $p_n$ of each sample is simply the fraction of the total signal duration $NT_s$ that is occupied by each of the samples, so $p_n = T_s/(NT_s) = 1/N$, and $g_n \equiv g(nT_s) \equiv g(n)$. For an infinitesimally small sampling interval $T_s$, denoted dt, the discrete time instants $nT_s$ become continuous time t, the samples merge into a continuum of values, and the signal $g_n \equiv g(nT_s)$ becomes a continuous function of time g(t) having duration $NT_s \equiv \mathbb{T}$ and probabilities $p_n = T_s/(NT_s) = dt/\mathbb{T}$. The summation in Eq. (3.93) thereby reduces to integration over the entire signal duration.

If g(t) is periodic, with period T, then there is no need to carry out the averaging process beyond an interval of one cycle since the signal’s repetitive structure ensures the same result within each interval (b, b + T), where b is any real number. For a periodic staircase (or quantised) signal that takes on flat levels or steps $g_0, g_1, g_2, \ldots, g_{N-1}$ having respective widths $\tau_0, \tau_1, \tau_2, \ldots, \tau_{N-1}$ that sum up to one period T of the signal, we do the averaging over one cycle of the signal. The probability of the nth step $g_n$ is $p_n = \tau_n/T = d_n$. We may describe this signal as an N-step staircase waveform, and $d_n$ as the duty cycle of the nth step. Thus

$$
A_0 =
\begin{cases}
\displaystyle \lim_{N\to\infty} \left(1 \Big/ \sum_{n=0}^{N-1} \tau_n\right) \sum_{n=0}^{N-1} g_n \tau_n, & \text{(non-uniformly sampled signal)} \\[2ex]
\displaystyle \lim_{N\to\infty} \frac{1}{N} \sum_{n=0}^{N-1} g(n), & \text{(discrete or uniformly sampled signal)} \\[2ex]
\displaystyle \lim_{\mathbb{T}\to\infty} \frac{1}{\mathbb{T}} \int_{-\mathbb{T}/2}^{\mathbb{T}/2} g(t)\,dt, & \text{(continuous-time (CT) signal)} \\[2ex]
\displaystyle \frac{1}{T} \int_{b}^{b+T} g(t)\,dt = \frac{1}{T} \int_{-T/2}^{T/2} g(t)\,dt, & \text{(periodic CT signal)} \\[2ex]
\displaystyle \frac{1}{T} \sum_{n=0}^{N-1} g_n \tau_n = \sum_{n=0}^{N-1} g_n d_n, & \text{(periodic staircase signal)}
\end{cases}
\tag{3.94}
$$

The last expression of Eq. (3.94) applies to an N-step staircase periodic waveform, an example of which is shown in Figure 3.16 for N = 4. Note that the mean value is computed within one cycle of the signal and that $d_n = \tau_n/T$ is the fraction of time within each cycle during which the signal has value or level $g_n$.
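The probability-weighted view of the mean in Eq. (3.93) translates directly into code. The sketch below is illustrative only; the helper names are chosen for this example, and the non-uniformly sampled data in the usage lines are hypothetical values rather than anything taken from the text.

```python
from collections import Counter
from fractions import Fraction

def mean_from_probabilities(values, probabilities):
    """Eq. (3.93): sum of each value weighted by its probability of occurrence."""
    return sum(g * p for g, p in zip(values, probabilities))

def mean_of_sample_set(samples):
    """Mean of a finite set, using the fraction of the set taken up by each distinct value."""
    counts = Counter(samples)
    n = len(samples)
    values = list(counts)
    probabilities = [Fraction(c, n) for c in counts.values()]
    return mean_from_probabilities(values, probabilities)

def mean_nonuniform(samples, intervals):
    """Mean of samples held for unequal intervals, with p_n = tau_n / (total duration)."""
    total = sum(intervals)
    return mean_from_probabilities(samples, [tau / total for tau in intervals])

if __name__ == "__main__":
    print(mean_of_sample_set([4, 4, 6, 8, 10, 10, 10, 12]))  # sample set used in the text
    print(mean_nonuniform([2.0, 4.0, 6.0], [1.0, 1.0, 2.0]))  # hypothetical data
```

For uniformly spaced samples every interval is equal, so `mean_nonuniform` reduces to the ordinary arithmetic mean, in line with the second case of Eq. (3.94).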
Figure 3.16 A 4-step staircase periodic waveform of period T, with levels $g_0, g_1, g_2, g_3$ of respective widths $\tau_0, \tau_1, \tau_2, \tau_3$.

3.5.2 Power

The normalised average power P of a signal, often referred to simply as the power of the signal, is the mean square value of the signal. In what follows we assume a real-valued signal; however, the expressions developed here for power (and in the next subsection for energy) may be applied to complex-valued signals by replacing the signal value g(t) wherever it occurs by its absolute value |g(t)|. We obtain P simply by averaging the square of the signal values or samples. It follows from the previous discussion, by replacing the signal value in Eq. (3.94) with its square, that

$$
P =
\begin{cases}
\displaystyle \sum_{n=0}^{N-1} g_n^2 p_n, & \text{(finite discrete set)} \\[2ex]
\displaystyle \lim_{N\to\infty} \left(1 \Big/ \sum_{n=0}^{N-1} \tau_n\right) \sum_{n=0}^{N-1} g_n^2 \tau_n, & \text{(nonuniformly sampled signal)} \\[2ex]
\displaystyle \lim_{N\to\infty} \frac{1}{N} \sum_{n=0}^{N-1} g^2[n], & \text{(discrete or uniformly sampled signal)} \\[2ex]
\displaystyle \lim_{\mathbb{T}\to\infty} \frac{1}{\mathbb{T}} \int_{-\mathbb{T}/2}^{\mathbb{T}/2} g^2(t)\,dt, & \text{(continuous-time (CT) signal)} \\[2ex]
\displaystyle \frac{1}{T} \int_{b}^{b+T} g^2(t)\,dt = \frac{1}{T} \int_{-T/2}^{T/2} g^2(t)\,dt, & \text{(periodic CT signal)} \\[2ex]
\displaystyle \frac{1}{T} \sum_{n=0}^{N-1} g_n^2 \tau_n = \sum_{n=0}^{N-1} g_n^2 d_n, & \text{(periodic staircase signal)}
\end{cases}
\tag{3.95}
$$
where b is any real number and $p_n$, n = 0, 1, 2, …, N−1, is the probability that a signal that takes on values drawn only from the discrete set $\{g_0, g_1, g_2, \ldots, g_{N-1}\}$ will take on the value $g_n$. The normalised average power defined above is the heat that would be dissipated in a pure resistor of resistance 1 ohm (Ω) if the signal were an electric current flowing through the resistor or a voltage drop across the resistor. Unless otherwise specified, we will work exclusively with normalised average power P. Whenever desired, P may be scaled to obtain the average power $P_R$ dissipated in a pure resistor of resistance R as follows

$$
P_R =
\begin{cases}
P/R, & g(t) \text{ is an electric voltage signal} \\
PR, & g(t) \text{ is an electric current signal}
\end{cases}
\tag{3.96}
$$

The SI unit for dissipated power is joule per second (J/s) which is called watt (W) in honour of the British inventor James Watt (1736–1819) for his contribution to the design of the steam engine.

It is worth noting that there are other power definitions in electrical engineering. For example, in an electric circuit, instantaneous power P(t) at a given point in the circuit is defined as the product of instantaneous voltage v(t) and instantaneous current i(t) at that point. For a sinusoidal voltage $v(t) = V_m\cos(\omega t)$ and current $i(t) = I_m\cos(\omega t - \varphi_{vi})$, where $\varphi_{vi}$ is the phase difference by which the voltage leads the current signal, it follows that

$$
\begin{aligned}
P(t) &= V_m \cos(\omega t)\, I_m \cos(\omega t - \varphi_{vi}) \\
&= \frac{1}{2} V_m I_m \left[\cos(\varphi_{vi}) + \cos(2\omega t - \varphi_{vi})\right] \\
&= \frac{1}{2} V_m I_m \left[\cos(\varphi_{vi}) + \cos(2\omega t)\cos(\varphi_{vi}) + \sin(2\omega t)\sin(\varphi_{vi})\right]
\end{aligned}
\tag{3.97}
$$
In the above manipulation, we employed trigonometric identities in line 2 and line 3, namely Eqs. (B.6) and (B.4), respectively, from Appendix B. Thus, we may write

$$
\begin{aligned}
P(t) &= P_a + P_a \cos(2\omega t) + Q \sin(2\omega t) \\
\text{where,} \quad P_a &= \frac{1}{2} V_m I_m \cos(\varphi_{vi}) \\
Q &= \frac{1}{2} V_m I_m \sin(\varphi_{vi})
\end{aligned}
\tag{3.98}
$$

Let us take a moment to appreciate this important result. Since the sinusoidal function has zero mean, it follows from line 1 of Eq. (3.98) that

$$
P_a = \frac{1}{2} V_m I_m \cos(\varphi_{vi})
$$

is the mean of the instantaneous power P(t). $P_a$ is the (average) real power or active power. Measured in watts, it represents the total power that is dissipated as heat in the resistive element R of the circuit or load and depends on a factor $\cos(\varphi_{vi})$ called the power factor, where $\varphi_{vi}$ (also known as the power factor angle) is the phase difference between current and voltage. The term $P_a + P_a\cos(2\omega t)$ is the instantaneous active power which varies between 0 and $2P_a$, completing each cycle at twice the signal frequency. The term

$$
Q = \frac{1}{2} V_m I_m \sin(\varphi_{vi})
$$

is the reactive power, whereas $Q\sin(2\omega t)$ is the instantaneous reactive power. Reactive power is sometimes called imaginary power and its unit of measurement is volt-ampere reactive (var) to differentiate it from real power in watts. Reactive power accounts for the rate of energy storage in the capacitive and inductive elements of the load. Energy alternates between being stored in the load during one-quarter of the signal cycle and being returned unchanged to the signal source during the next quarter cycle. That is, there is no absorption (or dissipation) of reactive power by the load. There will, however, inevitably be some loss in the resistance of the line connecting the load and the source during this alternating exchange or flow of reactive power between load and source. Finally, the quantity

$$
S = \sqrt{P_a^2 + Q^2} = \frac{1}{2} V_m I_m
\tag{3.99}
$$

is the apparent power (the magnitude of the complex power), measured in volt-ampere (VA). The relationship between apparent power (VA), active power (W), and reactive power (var) is illustrated in Figure 3.17. If the load is a pure resistance then voltage and current will be in phase and the power factor angle $\varphi_{vi} = 0$. This leads to reactive power Q = 0 so that there is no stored energy and the power is entirely active and dissipative. If the load has zero resistive element then $\varphi_{vi} = \pm 90^\circ$ and active power $P_a = 0$ so that there is no power dissipation in the load but only energy storage and release back to source in alternate quarter cycles as reactive power. If the load is a complex impedance having a real resistance part and an imaginary reactance part then there are two possible scenarios. If the reactance is inductive then voltage leads current by an acute angle ($0^\circ < \varphi_{vi} < 90^\circ$) and reactive power Q is positive. However, if the reactance is capacitive then voltage lags current by an acute angle so that $-90^\circ < \varphi_{vi} < 0^\circ$, which leads to a negative reactive power. That Q is positive when the load is inductive and negative when the load is capacitive is merely indicative of the fact that energy flow in a capacitor and an inductor is always in opposition, with one storing while the other supplies; and the flow is conventionally referenced to the inductor.

Figure 3.17 Apparent power (S), active power ($P_a$), and reactive power (Q). A source of voltage $v(t) = V_m\cos(\omega t)$ drives a current $i(t) = I_m\cos(\omega t - \varphi_{vi})$ through a line into a load; apparent power $S = \frac{1}{2}V_m I_m$, active power $P_a = \frac{1}{2}V_m I_m\cos\varphi_{vi}$, and reactive power $Q = \frac{1}{2}V_m I_m\sin\varphi_{vi}$ are related through the angle $\varphi_{vi}$.

Figure 3.18 Worked Example 3.4: (a) periodic staircase waveform g(t), in volts, against time t in ms; (b) the same waveform with one cycle highlighted.

In general, a signal comprises a DC component (which is the average value $A_0$ of the signal) and several harmonics (which are the sinusoidal components of the signal at frequencies f > 0). It is therefore common practice to define DC power $P_{dc}$ as the power in the DC component and AC power $P_{ac}$ as the power in the harmonics. Thus, with total power denoted P

$$
\begin{aligned}
P_{dc} &= A_0^2 \\
P_{ac} &= P - A_0^2
\end{aligned}
\tag{3.100}
$$
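The sinusoidal power relations of Eqs. (3.97)–(3.99) lend themselves to a quick numerical check. The sketch below uses hypothetical amplitudes, phase angle, and frequency, and function names chosen for this example; it compares the product v(t)i(t) with the decomposition into active and reactive terms and averages the instantaneous power over one cycle.

```python
import math

def power_terms(v_m, i_m, phi):
    """Active power P_a, reactive power Q and apparent power S, per Eqs. (3.98) and (3.99)."""
    p_a = 0.5 * v_m * i_m * math.cos(phi)
    q = 0.5 * v_m * i_m * math.sin(phi)
    s = math.hypot(p_a, q)
    return p_a, q, s

def instantaneous_power(v_m, i_m, phi, omega, t):
    """Product of v(t) = V_m cos(wt) and i(t) = I_m cos(wt - phi), first line of Eq. (3.97)."""
    return v_m * math.cos(omega * t) * i_m * math.cos(omega * t - phi)

def mean_power_over_cycle(v_m, i_m, phi, omega, steps=1000):
    """Average of the instantaneous power over one period, by uniform sampling."""
    period = 2 * math.pi / omega
    dt = period / steps
    total = sum(instantaneous_power(v_m, i_m, phi, omega, k * dt) for k in range(steps))
    return total / steps

if __name__ == "__main__":
    v_m, i_m, phi = 10.0, 2.0, math.pi / 3  # hypothetical values
    omega = 2 * math.pi * 50.0
    p_a, q, s = power_terms(v_m, i_m, phi)
    print("P_a =", p_a, "W, Q =", q, "var, S =", s, "VA")
    print("cycle average of P(t) =", mean_power_over_cycle(v_m, i_m, phi, omega))
```

The cycle average should agree with the active power, and S should equal half the product of the two amplitudes regardless of the phase angle.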
Worked Example 3.4

The periodic staircase voltage waveform g(t) shown in Figure 3.18a is measured across a 50-ohm (Ω) resistor. Determine:

(a) The average value of the signal.
(b) The normalised average power of the signal.
(c) The average power dissipated by the voltage signal in the resistor.

We first identify one cycle of g(t) and its period T, as done in Figure 3.18b using the thicker line. Hence T = 6 ms. Next, we identify the discrete values $g_n$ of g(t) and their respective probabilities $p_n$. These are $g_n$ = {10, −15, 5, 0} and $p_n$ = {1/6, 2/6, 1/6, 2/6}. We are now ready to calculate the required parameters.

(a) The average value of g(t) is obtained using Eq. (3.93) as

$$
\begin{aligned}
A_0 &= \sum_{n=0}^{N-1} g_n p_n \\
&= 10 \times \frac{1}{6} + (-15) \times \frac{2}{6} + 5 \times \frac{1}{6} + 0 \times \frac{2}{6} \\
&= -2.5\ \text{V}
\end{aligned}
$$

(b) The normalised average power of g(t) is obtained using the last line of Eq. (3.95) as

$$
\begin{aligned}
P &= \sum_{n=0}^{N-1} g_n^2 p_n \\
&= 10^2 \times \frac{1}{6} + (-15)^2 \times \frac{2}{6} + 5^2 \times \frac{1}{6} + 0^2 \times \frac{2}{6} \\
&= 95.833\ \text{W}
\end{aligned}
$$

(c) The average power dissipated in the resistor follows from Eq. (3.96) for a voltage signal as

$$
P_{av} = \frac{P}{R} = \frac{95.833}{50} = 1.917\ \text{W}
$$
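The arithmetic of Worked Example 3.4 can be reproduced with a short script. This is a sketch only: the step widths of 1, 2, 1, and 2 ms are inferred from the stated probabilities and the 6 ms period, and the function names are chosen for this example.

```python
from fractions import Fraction

def staircase_mean(levels, widths):
    """Mean of a periodic staircase signal: sum of g_n * d_n with d_n = tau_n / T."""
    period = sum(widths)
    return sum(Fraction(g) * Fraction(tau, period) for g, tau in zip(levels, widths))

def staircase_power(levels, widths):
    """Normalised average power: sum of g_n^2 * d_n (last line of Eq. (3.95))."""
    period = sum(widths)
    return sum(Fraction(g) ** 2 * Fraction(tau, period) for g, tau in zip(levels, widths))

def dissipated_power(p_norm, resistance, signal="voltage"):
    """Eq. (3.96): scale normalised power for a voltage or a current signal."""
    return p_norm / resistance if signal == "voltage" else p_norm * resistance

if __name__ == "__main__":
    levels = [10, -15, 5, 0]  # step levels in volts
    widths_ms = [1, 2, 1, 2]  # step widths in ms (inferred from p_n and T = 6 ms)
    a0 = staircase_mean(levels, widths_ms)
    p = staircase_power(levels, widths_ms)
    print("A0 =", float(a0), "V")
    print("P  =", round(float(p), 3), "W (normalised)")
    print("P_R =", round(float(dissipated_power(p, 50)), 3), "W in 50 ohm")
```

Exact fractions are used so that the intermediate value 575/6 is not rounded before the division by R.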
3.5.3 Energy

Since power is defined as energy per unit time, it follows that the energy E of a signal is its average power multiplied by the duration of the signal. Thus multiplying the expressions for P in Eq. (3.95) by $NT_s$ in the case of a discrete signal, and by $\mathbb{T}$ for a continuous signal yields

$$
E =
\begin{cases}
\displaystyle \lim_{N\to\infty} \sum_{n=0}^{N-1} g_n^2 \tau_n, & \text{(non-uniformly sampled signal)} \\[2ex]
\displaystyle \lim_{N\to\infty} T_s \sum_{n=0}^{N-1} g^2[n], & \text{(discrete or uniformly sampled signal)} \\[2ex]
\displaystyle \lim_{\mathbb{T}\to\infty} \int_{-\mathbb{T}/2}^{\mathbb{T}/2} g^2(t)\,dt, & \text{(continuous-time (CT) signal)}
\end{cases}
\tag{3.101}
$$

A more general definition of energy that applies to both real-valued and complex-valued signals is

$$
E = \int_{-\infty}^{\infty} |g(t)|^2\,dt = \int_{-\infty}^{\infty} |G(f)|^2\,df
\tag{3.102}
$$

where G(f) is the FT of g(t), which is discussed in the next chapter. Equation (3.102) is a statement of Parseval’s theorem (also known as Rayleigh’s energy theorem), which provides an alternative method of determining the energy of a signal using a frequency domain representation of the signal. This flexibility is very common in signal analysis and characterisation: either a time domain or a frequency domain approach may be followed. Note that the energy E of a signal g(t) is the total area under the squared waveform g²(t) of the signal, whereas power P is the average of g²(t).
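Parseval's theorem in Eq. (3.102) has a discrete counterpart that is easy to verify numerically. The sketch below is an illustration under stated assumptions: it uses the discrete Fourier transform (DFT) as a stand-in for the FT, for which the corresponding identity is Σ|x[n]|² = (1/N)Σ|X[k]|², and the function names and test sequence are hypothetical.

```python
import cmath

def dft(x):
    """Direct O(N^2) discrete Fourier transform of a sequence x."""
    n = len(x)
    return [
        sum(x[k] * cmath.exp(-2j * cmath.pi * m * k / n) for k in range(n))
        for m in range(n)
    ]

def energy_time(x, ts=1.0):
    """Energy from time-domain samples: T_s times the sum of |x[n]|^2, as in Eq. (3.101)."""
    return ts * sum(abs(v) ** 2 for v in x)

def energy_freq(x, ts=1.0):
    """The same energy from DFT coefficients (discrete form of Parseval's theorem)."""
    n = len(x)
    return ts * sum(abs(v) ** 2 for v in dft(x)) / n

if __name__ == "__main__":
    samples = [1.0, -2.0, 3.5, 0.0, 2j, -1.0]  # hypothetical complex-valued samples
    print(energy_time(samples), energy_freq(samples))
    # Rectangular pulse: amplitude 2, duration 3 ms, sampled every 0.1 ms (30 samples)
    pulse = [2.0] * 30
    print(energy_time(pulse, ts=1e-4))  # area under g^2(t) = 2^2 * 3e-3
```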
3.5.4 Root-mean-square Value

The mean value $A_0$ of a signal is not a very useful parameter for characterising the effectiveness or strength of a signal in those situations when the signal can take on both positive and negative values. This is because, when positive and negative samples are aggregated, they cancel out each other’s contribution which therefore gives rise to a reduced mean value. For example, an alternating electric current signal will be first positive and then negative in alternate half-cycles as the direction of current flow is reversed in each half-cycle. And although the capacity of the signal to do work (such as heat dissipation) is unaffected by this cyclical reversal in direction of current flow, the resulting change in sign of the signal will directly reduce its mean value. In fact, in the case of a sinusoidal signal, the mean will be zero regardless of signal amplitude.

We may define an average parameter that more correctly represents the effective value of the signal and its capacity to do work by squaring each sample before averaging. In this way positive and negative samples will both add to the accumulation process rather than cancelling each other’s contribution. Once the squares have been averaged, we take the square root of the result to produce a parameter that has the same unit as the signal it represents. The parameter so obtained is very appropriately called the root-mean-square (rms) value of the signal, which we will denote $A_{rms}$. Recalling that signal power was defined as the mean-square value of the signal, it follows that $A_{rms}$ is the square root of normalised average power P. In fact, the rms parameter is defined only for power signals and is given by

$$
A_{rms} = \sqrt{P}
\tag{3.103}
$$

where P is as given in Eq. (3.95) for various categories of signals, including discrete signals, sampled signals, general continuous signals, and periodic continuous and N-step staircase signals.
The rms value of a signal is an extremely useful parameter which may be computed using Eq. (3.103) if the signal is deterministic, or may be measured directly in the laboratory using an rms ammeter or voltmeter if the signal can be produced or modelled in electrical form. Once $A_{rms}$ is known, it is a trivial matter to obtain normalised power P as

$$
P = A_{rms}^2
\tag{3.104}
$$

and hence some of the other power definitions such as in Eq. (3.96). The following expressions for the rms values of special waveforms are derived in the next worked example.

A rectangular pulse train of amplitude A and duty cycle d (Figure 3.19a) has rms value

$$
A_{rms} = A\sqrt{d}
\tag{3.105}
$$

A triangular pulse train of amplitude A and duty cycle d (Figure 3.19b)

$$
A_{rms} = A\sqrt{d/3}
\tag{3.106}
$$

A sinusoidal signal of amplitude A and any frequency and phase (Figure 2.18)

$$
A_{rms} = A/\sqrt{2}
\tag{3.107}
$$

A DC signal of constant value A

$$
A_{rms} = A
\tag{3.108}
$$

A complex exponential function $A\exp[j(2\pi ft + \phi)]$

$$
A_{rms} = A
\tag{3.109}
$$

A sinusoidal pulse train of amplitude A and duty cycle d containing an integer number n of half-cycles within the pulse duration τ (Figure 3.19c)

$$
A_{rms} = A\sqrt{d/2}
\tag{3.110}
$$
Figure 3.19 Selected periodic waveforms g(t) of amplitude A, pulse duration τ, period T, and duty cycle d = τ/T: (a) rectangular pulse train; (b) triangular pulse train; (c) sinusoidal pulse train, with τ = n/F, n = 1, 2, 3, …; (d) trapezoidal pulse train, where $\tau_r$ ≡ rise time of pulse; $\tau_f$ ≡ fall time of pulse; $\tau_c$ ≡ flat time of pulse; $d_r = \tau_r/T$ ≡ rise time duty cycle; $d_f = \tau_f/T$ ≡ fall time duty cycle; $d_c = \tau_c/T$ ≡ flat time duty cycle; $\tau = \tau_r + \tau_f + \tau_c$ ≡ pulse duration; $d = d_r + d_f + d_c$ ≡ duty cycle of pulse train.
A trapezoidal pulse train of amplitude A and duty cycles $d_r$ (rising portion), $d_c$ (constant portion), and $d_f$ (falling portion) (Figure 3.19d)

$$
A_{rms} = A\sqrt{d_c + d_r/3 + d_f/3}
\tag{3.111}
$$

Worked Example 3.5

Derive expressions for the rms values of the following periodic signals:

(a) The trapezoidal pulse train shown in Figure 3.19d.
(b) The sinusoidal pulse train shown in Figure 3.19c.
(c) A complex exponential function $A\exp[j(2\pi ft + \phi)]$.

Since all the signals are periodic continuous-time signals, we employ the fifth line of Eq. (3.95) with b = 0 and Eq. (3.104) to obtain the required rms expressions by evaluating

$$
A_{rms}^2 = \frac{1}{T} \int_0^T g^2(t)\,dt
\tag{3.112}
$$

(a) The trapezoidal pulse comprises three portions, namely rising portion of duration $\tau_r$, flat or constant portion of duration $\tau_c$, and falling portion of duration $\tau_f$. Introducing amplitude A into the expressions for these portions of the pulse given in Eq. (2.25) and integrating yields

$$
A_{rms}^2 = \frac{A^2}{T} \left\{ \int_{-(\tau_r + \tau_c/2)}^{-\tau_c/2} \left[\left(1 + \frac{\tau_c}{2\tau_r}\right) + \frac{t}{\tau_r}\right]^2 dt + \int_{-\tau_c/2}^{\tau_c/2} 1 \cdot dt + \int_{\tau_c/2}^{\tau_f + \tau_c/2} \left[\left(1 + \frac{\tau_c}{2\tau_f}\right) - \frac{t}{\tau_f}\right]^2 dt \right\}
$$

Making the substitutions $t' = t + \tau_r + \tau_c/2$ in the first integral (which shifts the rising portion to start at t′ = 0 along the t′ axis) and for a similar reason $t'' = t - \tau_f - \tau_c/2$ in the third integral, and evaluating the second integral straightforwardly to $\tau_c$, we obtain

$$
\begin{aligned}
A_{rms}^2 &= \frac{A^2}{T} \left\{ \tau_c + \frac{1}{\tau_r^2} \int_0^{\tau_r} t'^2\,dt' + \frac{1}{\tau_f^2} \int_{-\tau_f}^{0} t''^2\,dt'' \right\} \\
&= \frac{A^2}{T} \left\{ \tau_c + \frac{1}{\tau_r^2} \left(\frac{t'^3}{3}\right)\Bigg|_{t'=0}^{t'=\tau_r} + \frac{1}{\tau_f^2} \left(\frac{t''^3}{3}\right)\Bigg|_{t''=-\tau_f}^{t''=0} \right\} \\
&= \frac{A^2}{T} \left\{ \tau_c + \frac{\tau_r}{3} + \frac{\tau_f}{3} \right\} = A^2 \left( \frac{\tau_c}{T} + \frac{\tau_r}{3T} + \frac{\tau_f}{3T} \right) \\
&= A^2 (d_c + d_r/3 + d_f/3)
\end{aligned}
$$

Thus, $A_{rms} = A\sqrt{d_c + d_r/3 + d_f/3}$ for a trapezoidal pulse train. The rectangular pulse train (Figure 3.19a) is a special case of the trapezoidal pulse train with $d_r = d_f = 0$, $\tau_c = \tau$, and $d_c = d$, so that $A_{rms} = A\sqrt{d}$, as given in Eq. (3.105). Also, the triangular pulse train (Figure 3.19b) is a special case of trapezoidal with $d_c = 0$, $\tau_r = \tau_f$, and $\tau_r + \tau_f = \tau$ so that

$$
\frac{\tau_r}{3T} + \frac{\tau_f}{3T} = \frac{\tau}{3T} = \frac{d}{3}
$$

and hence, $A_{rms} = A\sqrt{d/3}$, as given in Eq. (3.106).
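The closed-form rms expressions above can be cross-checked by numerically averaging the squared waveform over one period, as in Eq. (3.112). The sketch below uses a midpoint rule and hypothetical pulse dimensions; the function names are chosen for this example, and the trapezoidal pulse is placed at the start of the cycle, which does not affect the rms value.

```python
import math

def rms_periodic(g, period, steps=100_000):
    """Square root of the mean of g(t)^2 over one period (midpoint rule)."""
    dt = period / steps
    mean_square = sum(g((k + 0.5) * dt) ** 2 for k in range(steps)) / steps
    return math.sqrt(mean_square)

def trapezoid_pulse(t, amp, t_rise, t_flat, t_fall):
    """One cycle of a trapezoidal pulse that starts rising at t = 0 and is zero after the pulse."""
    if t < t_rise:
        return amp * t / t_rise
    if t < t_rise + t_flat:
        return amp
    if t < t_rise + t_flat + t_fall:
        return amp * (t_rise + t_flat + t_fall - t) / t_fall
    return 0.0

def trapezoid_rms_formula(amp, d_rise, d_flat, d_fall):
    """Eq. (3.111)."""
    return amp * math.sqrt(d_flat + d_rise / 3 + d_fall / 3)

if __name__ == "__main__":
    amp, period = 5.0, 10.0
    t_rise, t_flat, t_fall = 1.0, 3.0, 2.0  # hypothetical pulse dimensions
    numeric = rms_periodic(lambda t: trapezoid_pulse(t, amp, t_rise, t_flat, t_fall), period)
    formula = trapezoid_rms_formula(amp, t_rise / period, t_flat / period, t_fall / period)
    print(numeric, formula)
```

Setting the rise and fall times to zero reproduces the rectangular pulse train result of Eq. (3.105), and setting the flat time to zero with equal rise and fall times reproduces Eq. (3.106).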
(b) One cycle of the sinusoidal pulse train is defined by

$$
g(t) =
\begin{cases}
A\cos(2\pi ft + \phi), & 0 \le t \le \tau \\
0, & \tau\ \ldots
\end{cases}
$$

[…] $T_b$ will only create overlaps between noncorresponding bits $b_k$ in g(t − τ) and $b_{k+n}$ in g(t), where n = 1, 2, 3, …, and this also makes zero total contribution to the area since $b_k$ and $b_{k+n}$ are equally likely to be the same (+A and +A, or −A and −A) or different (+A and −A, or −A and +A). Thus

$$
\begin{aligned}
R_g(\tau) &= \lim_{\mathbb{T}\to\infty} \frac{1}{\mathbb{T}} \int_{-\mathbb{T}/2}^{\mathbb{T}/2} g(t)\,g(t - \tau)\,dt \\
&=
\begin{cases}
\displaystyle \lim_{N\to\infty} \frac{1}{NT_b}\, NA^2 (T_b - |\tau|), & |\tau| \le T_b \\
0, & \text{otherwise}
\end{cases} \\
&=
\begin{cases}
A^2 (1 - |\tau|/T_b), & |\tau| \le T_b \\
0, & \text{otherwise}
\end{cases} \\
&= A^2\,\mathrm{trian}\left(\frac{\tau}{2T_b}\right)
\end{aligned}
$$

The autocorrelation function of a bipolar random binary waveform of amplitude A and bit duration $T_b$ is therefore a triangular pulse having duration $2T_b$ and amplitude $A^2$, as shown in Figure 3.21b. Notice that $R_g(\tau)_{\max} = R_g(0)$ = power of g(t); the duration $2T_b$ of $R_g(\tau)$ is directly proportional to bit duration $T_b$ and will become narrower (signifying higher variability and spectral content) as bit duration decreases or, conversely, bit rate increases; and there is no sequence of local maxima in $R_g(\tau)$, which indicates that g(t) does not contain any periodic components.
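The triangular autocorrelation of a bipolar random binary waveform can be estimated from a simulated waveform. The sketch below is a discrete-time approximation under stated assumptions: each bit is held for a fixed number of samples, lags are whole numbers of samples, and the time average is taken over a single long realisation. The function names, seed, and parameters are hypothetical.

```python
import random

def random_binary_wave(n_bits, samples_per_bit, amp=1.0, seed=3):
    """Bipolar waveform: each equally likely bit is held at +amp or -amp for one bit duration."""
    rng = random.Random(seed)
    wave = []
    for _ in range(n_bits):
        level = amp if rng.random() < 0.5 else -amp
        wave.extend([level] * samples_per_bit)
    return wave

def autocorrelation(wave, lag):
    """Time-averaged product g(t) g(t - tau) at a lag given in samples (lag >= 0)."""
    n = len(wave) - lag
    return sum(wave[i + lag] * wave[i] for i in range(n)) / n

def triangular_model(lag, samples_per_bit, amp=1.0):
    """A^2 (1 - |tau| / T_b) for |tau| <= T_b and zero otherwise."""
    x = abs(lag) / samples_per_bit
    return amp ** 2 * (1 - x) if x <= 1 else 0.0

if __name__ == "__main__":
    samples_per_bit, amp = 8, 2.0
    wave = random_binary_wave(20_000, samples_per_bit, amp)
    for lag in (0, 2, 4, 6, 8, 12, 16):
        print(lag, round(autocorrelation(wave, lag), 3), triangular_model(lag, samples_per_bit, amp))
```

The estimate at zero lag equals the signal power A², and the estimates should fall off roughly linearly to zero at a lag of one bit duration.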
3.5.6 Covariance and Correlation Coefficient

It is often necessary to obtain an objective estimate of the similarity between two signals using a quantity that gives an indication of the level of synchronisation in the variability of the two signals. For example, during each signalling interval the receiver in a binary digital transmission system may compare an incoming signal with a standard waveform (e.g. the signal designated for conveying binary 0) in order to establish the identity of the received signal. Covariance is the most used measure of joint variability and is applicable to both random and deterministic signals. When normalised to have a value in the range [−1, 1] it is referred to as the correlation coefficient of the two signals. The covariance of two signals X and Y, denoted cov[X, Y], is the expectation of the product of the deviations of the two signals from their respective means $\mu_X$ and $\mu_Y$. Thus

$$
\mathrm{cov}[X, Y] = E[(X - \mu_X)(Y - \mu_Y)]
\tag{3.116}
$$

Since the expectation operation is linear, we may expand the above equation to express covariance of two signals as the expectation of the product of the two signals less the product of the means (i.e. expected values) of the signals

$$
\begin{aligned}
\mathrm{cov}[X, Y] &= E[(X - \mu_X)(Y - \mu_Y)] \\
&= E[XY - \mu_Y X - \mu_X Y + \mu_X \mu_Y] \\
&= E[XY] - \mu_Y E[X] - \mu_X E[Y] + \mu_X \mu_Y \\
&= E[XY] - \mu_Y \mu_X - \mu_X \mu_Y + \mu_X \mu_Y \\
&= E[XY] - \mu_X \mu_Y
\end{aligned}
$$

That is, we have an expression for covariance in terms of uncentred moments, namely

$$
\mathrm{cov}[X, Y] = E[XY] - E[X]E[Y]
\tag{3.117}
$$
Two signals are said to be uncorrelated if their covariance is zero. You will be able to show in Question 3.10 that if two signals X and Y are independent random variables then E[XY ] = E[X]E[Y ] = 𝜇X 𝜇 Y . It therefore follows from Eq. (3.117) that the covariance of two independent random signals is zero. Thus, if X and Y are independent then they are also uncorrelated. Note, however, that the converse is not necessarily true. That is, two independent signals are always uncorrelated, whereas two uncorrelated signals are not necessarily independent. See Question 3.11.
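The point that uncorrelated signals need not be independent can be illustrated with a small exact calculation. The sketch below is only an illustration and is not taken from Question 3.11: it assumes a hypothetical signal X that takes the values −1, 0, +1 with equal probability and sets Y = X², so Y is completely determined by X yet Eq. (3.117) gives zero covariance.

```python
from fractions import Fraction

# Hypothetical example: X is -1, 0 or +1 with equal probability, and Y = X**2.
xs = [-1, 0, 1]
p = Fraction(1, 3)

def expect(f):
    """Expectation of f(X) over the three equally likely values of X."""
    return sum(p * f(x) for x in xs)

e_x = expect(lambda x: x)           # E[X]
e_y = expect(lambda x: x**2)        # E[Y] with Y = X**2
e_xy = expect(lambda x: x * x**2)   # E[XY] = E[X**3]
cov_xy = e_xy - e_x * e_y           # Eq. (3.117)

# Independence would require P(X=0, Y=0) = P(X=0) * P(Y=0).
p_joint = p                         # X = 0 forces Y = 0, so the joint probability is 1/3
p_product = p * p                   # P(X=0) * P(Y=0) = 1/3 * 1/3

print("cov[X, Y] =", cov_xy)
print("independent?", p_joint == p_product)
```

Exact fractions are used here so that the zero covariance is not blurred by rounding.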
3.5 Signal Characterisation
The covariance of two signals gives an indication of the linear relationship between the two signals. If both signals vary in sync such that one signal is large when the other is large and small when the other is small then their covariance is positive, whereas if one is small when the other is large (and vice versa) then their covariance is negative. An indication of both the strength and the trend of the association between two signals is obtained by dividing their covariance by the product of the standard deviations $\sigma_X$ and $\sigma_Y$ of the two signals to give a value in the range (−1, 1), known as the Pearson correlation coefficient of the two signals $X$ and $Y$, which we will denote as $r_{X,Y}$. Thus

$$
\begin{aligned}
r_{X,Y} &= \frac{\mathrm{cov}[X, Y]}{\sigma_X \sigma_Y} = \frac{E[(X - \mu_X)(Y - \mu_Y)]}{\sqrt{E[(X - \mu_X)^2]}\,\sqrt{E[(Y - \mu_Y)^2]}} \\
&= \frac{E[XY] - E[X]E[Y]}{\sqrt{\left(E[X^2] - (E[X])^2\right)\left(E[Y^2] - (E[Y])^2\right)}}
\end{aligned} \tag{3.118}
$$
where we have employed Eq. (3.117) for covariance and Eq. (3.22) for variance.

We will often only have a finite set of data comprising $N$ corresponding values of each of $X$ and $Y$, namely $\{x_1, x_2, \ldots, x_N\}$ and $\{y_1, y_2, \ldots, y_N\}$. In such cases, we perform the above averaging operations over this dataset to obtain an unbiased estimate of covariance and an estimate of the Pearson correlation coefficient, respectively referred to as sample covariance 𝓬𝓸𝓿[X, Y] and sample Pearson correlation coefficient 𝓻X,Y

$$
\begin{aligned}
\text{𝓬𝓸𝓿}[X, Y] &= \frac{1}{N-1} \sum_{k=1}^{N} (x_k - \bar{X})(y_k - \bar{Y}) \\
\text{𝓻}_{X,Y} &= \frac{\sum_{k=1}^{N} (x_k - \bar{X})(y_k - \bar{Y})}{\sqrt{\sum_{k=1}^{N} (x_k - \bar{X})^2}\,\sqrt{\sum_{k=1}^{N} (y_k - \bar{Y})^2}} \\
\text{where } \bar{X} &= \frac{1}{N}\sum_{k=1}^{N} x_k; \qquad \bar{Y} = \frac{1}{N}\sum_{k=1}^{N} y_k
\end{aligned} \tag{3.119}
$$
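As a minimal sketch of how Eq. (3.119) might be evaluated on a small dataset, the following uses only the Python standard library. The function names `sample_cov` and `sample_pearson` and the data values are illustrative assumptions, not part of the text.

```python
import math

def sample_cov(x, y):
    """Sample covariance of Eq. (3.119), using the 1/(N - 1) factor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)

def sample_pearson(x, y):
    """Sample Pearson correlation coefficient of Eq. (3.119)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Illustrative data only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
print(sample_cov(x, y), sample_pearson(x, y))
```

Note that the normalising factors cancel in the ratio, which is why the sample Pearson coefficient needs no explicit 1/(N − 1).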
Note that if the population means $E[X] \equiv \mu_X$ and $E[Y] \equiv \mu_Y$ are known then these should be used in the above expressions in place of the sample means $\bar{X}$ and $\bar{Y}$, in which case the factor 1/(N − 1) in the first line is replaced with 1/N.

The Pearson correlation coefficient is computed on deviations of the raw data from the mean and is widely used in the field of statistics. However, in communication systems applications the correlation coefficient is defined slightly differently, based on directly aggregating the signal values (rather than their deviations from the mean). We will refer to this variant definition simply as the correlation coefficient of signals $X$ and $Y$, denoted as $\rho_{X,Y}$ and given by the expression

$$
\rho_{X,Y} = \frac{E[XY]}{\frac{1}{2}\left(E[X^2] + E[Y^2]\right)} \tag{3.120}
$$
The manner of carrying out the required averaging in the above equation is dictated entirely by the type of signals involved. Denoting the signals as $g_1(t)$ and $g_2(t)$, the averaging is carried out over the entire time duration of the signals if they are energy signals, power signals, or ergodic random processes, or over one cycle if the signals are periodic. Thus, we have for energy signals of duration $T_s$

$$
\rho_{g_1(t),g_2(t)} = \frac{\dfrac{1}{T_s}\displaystyle\int_0^{T_s} g_1(t)g_2(t)\,dt}{\dfrac{1}{2}\left(\dfrac{1}{T_s}\displaystyle\int_0^{T_s} g_1^2(t)\,dt + \dfrac{1}{T_s}\displaystyle\int_0^{T_s} g_2^2(t)\,dt\right)}
= \frac{\displaystyle\int_0^{T_s} g_1(t)g_2(t)\,dt}{\dfrac{1}{2}\left(\displaystyle\int_0^{T_s} g_1^2(t)\,dt + \displaystyle\int_0^{T_s} g_2^2(t)\,dt\right)}
$$

From the definition of energy in Eq. (3.101), we see that the denominator in the last expression is the average of the energies $E_1$ and $E_2$ of $g_1(t)$ and $g_2(t)$, respectively. Thus

$$
\rho_{g_1(t),g_2(t)} = \frac{\displaystyle\int_0^{T_s} g_1(t)g_2(t)\,dt}{\text{Average energy}} = \frac{2\displaystyle\int_0^{T_s} g_1(t)g_2(t)\,dt}{E_1 + E_2} \tag{3.121}
$$
Similarly, if $g_1(t)$ and $g_2(t)$ are periodic signals having period $T$ and respective powers $P_1$ and $P_2$ then

$$
\rho_{g_1(t),g_2(t)} = \frac{2\displaystyle\int_{-T/2}^{T/2} g_1(t)g_2(t)\,dt}{(P_1 + P_2)\,T} \tag{3.122}
$$

And if the signals are nonperiodic power signals or ergodic random processes with respective powers $P_1$ and $P_2$, then

$$
\rho_{g_1(t),g_2(t)} = \frac{\displaystyle\lim_{\mathbb{T}\to\infty} \frac{2}{\mathbb{T}} \int_{-\mathbb{T}/2}^{\mathbb{T}/2} g_1(t)g_2(t)\,dt}{P_1 + P_2} \tag{3.123}
$$
Two signals $X$ and $Y$ are said to be orthogonal if the expectation of their product is zero, i.e. if $E[XY] = 0$. It follows from Eq. (3.120) that orthogonal signals have a zero correlation coefficient, and conversely that if two signals have a zero correlation coefficient then they are orthogonal. We will see in later chapters that the principle of signal orthogonality and the process of signal correlation both play a central role in signal transmission and detection.

Throughout this book we make use of the correlation coefficient as defined in Eq. (3.120) and applied to special cases in Eqs. (3.121) to (3.123). For comparison, Table 3.1 gives a summary of the properties and special values of the two correlation coefficients discussed in this section. In the table and Eq. (3.124) below, $a$, $b$, $c$, $d$ are constants and $b > 0$, $d > 0$.

There is an important feature that is worth highlighting. In the third row of entries we see that the standard correlation coefficient distinguishes between two signals $X$ and $bX$ that differ only in amplitude, whereas the Pearson correlation coefficient does not. This makes the standard correlation coefficient a more suitable parameter in telecommunications, where it is often necessary to distinguish between signals that differ only in amplitude (e.g. amplitude shift keying (ASK) and amplitude and phase shift keying (APSK) signals). The fourth row also shows that, unlike the standard correlation coefficient, the Pearson correlation coefficient does not distinguish between two signals that differ by a constant value. In general, the Pearson correlation coefficient is invariant under changes to a signal by a constant additive term or positive multiplicative factor, whereas the standard correlation coefficient is sensitive to any of these changes. That is

$$
\begin{aligned}
r_{a+bX,\,c+dY} &= r_{X,Y} \\
\rho_{a+bX,\,c+dY} &\ne \rho_{X,Y}
\end{aligned} \tag{3.124}
$$
We also see from the last two rows of the table that the Pearson coefficient is more reliable for checking for the independence of random signals, whereas the standard coefficient is better suited to detecting the orthogonality of deterministic signals.
Table 3.1 Selected properties and values of correlation coefficients.

| Signal 1 | Signal 2 | Pearson correlation coefficient, Eq. (3.118) | Standard correlation coefficient, Eq. (3.120) |
|---|---|---|---|
| X | X | r X,X = +1 | 𝜌X,X = +1 |
| X | −X | r X,−X = −1 | 𝜌X,−X = −1 |
| X | bX | r X,bX = +1 | 𝜌X,bX = 2b/(1 + b²) |
| X | a + X | r X,a+X = +1 | 𝜌X,a+X = (aE[X] + E[X²]) / (aE[X] + E[X²] + a²/2) |
| X | Y orthogonal to X, i.e. E[XY] = 0 | r X,Y = 0 if and only if E[X] = 0 or E[Y] = 0 | 𝜌X,Y = 0 |
| X | Y and X independent, i.e. E[XY] = E[X]E[Y] | r X,Y = 0 | 𝜌X,Y = 0 if and only if E[X] = 0 or E[Y] = 0 |
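The third and fourth rows of Table 3.1 can be checked numerically. The sketch below is a hedged illustration: the helper names `pearson` and `rho`, the data values, and the choices b = 3 and a = 5 are assumptions made for the example, and expectations are taken as simple averages over the listed values.

```python
import math

def mean(values):
    return sum(values) / len(values)

def pearson(x, y):
    """Pearson coefficient in the uncentred-moment form of Eq. (3.118)."""
    exy = mean([p * q for p, q in zip(x, y)])
    var_x = mean([p * p for p in x]) - mean(x) ** 2
    var_y = mean([q * q for q in y]) - mean(y) ** 2
    return (exy - mean(x) * mean(y)) / math.sqrt(var_x * var_y)

def rho(x, y):
    """Standard correlation coefficient of Eq. (3.120)."""
    exy = mean([p * q for p, q in zip(x, y)])
    return exy / (0.5 * (mean([p * p for p in x]) + mean([q * q for q in y])))

x = [1.0, 2.0, 3.0, 4.0]       # illustrative signal values
b, a = 3.0, 5.0                # illustrative scale factor and offset
scaled = [b * v for v in x]    # bX
shifted = [a + v for v in x]   # a + X

print("X vs bX:   r =", pearson(x, scaled), " rho =", rho(x, scaled))
print("X vs a+X:  r =", pearson(x, shifted), " rho =", rho(x, shifted))
```

With these values the table predicts 𝜌 = 2b/(1 + b²) = 0.6 for the scaled signal, while the Pearson coefficient stays at +1 in both cases.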
Worked Example 3.7

In the M-ary phase shift keying (M-PSK) modulation scheme, data bits are transmitted in groups of $\log_2 M$ bits at a time using a set of $M$ unique sinusoidal symbols defined by

$$
g_k(t) = A\,\mathrm{rect}\!\left(\frac{t - \frac{1}{2}T_s}{T_s}\right) \cos\!\left(2\pi f_c t + \theta_o + \frac{2\pi k}{M}\right), \qquad k = 0, 1, 2, 3, \cdots, M-1; \quad f_c = n/T_s
$$

One of the $M$ symbols is selected and transmitted in each signalling interval of duration $T_s$ depending on the identity of the $\log_2 M$ bits within that interval. Each symbol is a sinusoidal carrier of amplitude $A$, phase $\theta_o + 2\pi k/M$ (where $\theta_o$ is a constant phase offset), and frequency $f_c = n/T_s$, where $n$ is a positive integer, which ensures that (with a period $T_s/n$) the carrier completes an integer number of cycles in each signalling interval. Derive expressions for:

(a) Symbol energy $E_k$ of each transmitted symbol $g_k(t)$.
(b) The correlation coefficient of adjacent symbols (i.e. two symbols that differ in phase by the smallest amount) in the transmission scheme.

(a) The rectangular pulse factor $\mathrm{rect}\!\left(\left(t - \frac{1}{2}T_s\right)/T_s\right)$ in the expression for symbol $g_k(t)$ ensures that the symbol has duration $T_s$. See Eq. (2.16) for the definition of a rectangular pulse. Thus, each symbol is a sinusoid of amplitude $A$ and duration $T_s$, which yields energy $E_k$ given by

$$
E_k = \text{Power} \times \text{Duration} = \frac{A^2}{2} \times T_s = \frac{A^2 T_s}{2}
$$

All symbols in a PSK transmission scheme therefore have the same energy given by the above expression.

(b) These are energy signals, so Eq. (3.121) yields the correlation coefficient of adjacent symbols $g_k(t)$ and $g_{k+1}(t)$, $k = 0, 1, 2, \ldots, M-2$ as
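$$
\begin{aligned}
\rho_{g_k(t),g_{k+1}(t)} &= \frac{2\displaystyle\int_0^{T_s} g_k(t)\,g_{k+1}(t)\,dt}{E_k + E_{k+1}} \\
&= \frac{2}{A^2 T_s} \int_0^{T_s} A\cos\!\left(2\pi f_c t + \theta_o + \frac{2\pi k}{M}\right) \cdot A\cos\!\left(2\pi f_c t + \theta_o + \frac{2\pi (k+1)}{M}\right) dt \\
&= \frac{2}{A^2 T_s}\, A^2 \int_0^{T_s} \cos\!\left(2\pi f_c t + \theta_o + \frac{2\pi k}{M}\right) \cdot \cos\!\left(2\pi f_c t + \theta_o + \frac{2\pi k}{M} + \frac{2\pi}{M}\right) dt
\end{aligned}
$$

Using Eq. (B.6) in Appendix B for the product of cosines in the integrand, and noting that $f_c T_s = n$ and that $\sin(4\pi n + \beta) = \sin(\beta)$, we obtain

$$
\begin{aligned}
\rho_{g_k(t),g_{k+1}(t)} &= \frac{2}{A^2 T_s} \cdot A^2 \cdot \frac{1}{2} \int_0^{T_s} \left[\cos\!\left(\frac{2\pi}{M}\right) + \cos\!\left(4\pi f_c t + 2\theta_o + \frac{4\pi k}{M} + \frac{2\pi}{M}\right)\right] dt \\
&= \frac{1}{T_s} \left[\, t\cos\!\left(\frac{2\pi}{M}\right) + \frac{\sin\!\left(4\pi f_c t + 2\theta_o + \frac{4\pi k}{M} + \frac{2\pi}{M}\right)}{4\pi f_c} \right]_{t=0}^{t=T_s} \\
&= \frac{1}{T_s} \left[\, T_s\cos\!\left(\frac{2\pi}{M}\right) + \frac{\sin\!\left(4\pi f_c T_s + 2\theta_o + \frac{4\pi k}{M} + \frac{2\pi}{M}\right) - \sin\!\left(2\theta_o + \frac{4\pi k}{M} + \frac{2\pi}{M}\right)}{4\pi f_c} \right] \\
&= \frac{1}{T_s}\left[T_s \cos(2\pi/M)\right] \\
&= \cos(2\pi/M)
\end{aligned}
$$

Dropping the subscripts for convenience, we see that $\rho = -1$ for binary phase shift keying (BPSK, $M = 2$); $\rho = 0$ for quaternary PSK (QPSK, $M = 4$); $\rho = 0.7071$ for 8PSK; $\rho = 0.9239$ for 16PSK; etc. Thus, $\rho$ increases rapidly towards +1 as $M$ increases, and this has great significance in the design of communication systems, necessitating a trade-off between transmitted power and transmission bandwidth, an issue we explore further in later chapters.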
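The result 𝜌 = cos(2𝜋/M) can be cross-checked by evaluating Eq. (3.121) numerically. The following is a rough sketch under stated assumptions: the parameter values (A = 1, T s = 1, n = 2, 𝜃 o = 0.3 rad) and the function names are illustrative, and the integrals are approximated with a simple midpoint rule.

```python
import math

def psk_symbol(t, k, M, A=1.0, fc=2.0, theta0=0.3):
    """Value at time t of the k-th M-PSK symbol (t assumed inside the symbol interval)."""
    return A * math.cos(2 * math.pi * fc * t + theta0 + 2 * math.pi * k / M)

def rho_adjacent(M, k=0, Ts=1.0, n=2, steps=4000):
    """Correlation coefficient of symbols k and k+1 via Eq. (3.121), midpoint rule."""
    fc = n / Ts                    # integer number of carrier cycles per symbol
    dt = Ts / steps
    cross = e1 = e2 = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt
        g1 = psk_symbol(t, k, M, fc=fc)
        g2 = psk_symbol(t, k + 1, M, fc=fc)
        cross += g1 * g2 * dt      # integral of g_k * g_{k+1}
        e1 += g1 * g1 * dt         # energy E_k
        e2 += g2 * g2 * dt         # energy E_{k+1}
    return 2 * cross / (e1 + e2)

for M in (2, 4, 8, 16):
    print(M, rho_adjacent(M), math.cos(2 * math.pi / M))
```

The two printed columns should agree closely for each M, whatever phase offset is assumed.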
3.6 Linear Time Invariant System Analysis

A linear time invariant (LTI) system is one that is both linear and time invariant. Let the transformation performed by a system on its input $x(t)$ to obtain its output $y(t)$ be represented by the operator $\mathcal{T}$ such that we may write $y(t) = \mathcal{T}\{x(t)\}$. An LTI system therefore obeys the following rules

$$
\begin{aligned}
&\text{Given that } y_1(t) = \mathcal{T}\{x_1(t)\} \text{ and } y_2(t) = \mathcal{T}\{x_2(t)\}, \text{ then} \\
&\quad a_1 y_1(t) + a_2 y_2(t) = \mathcal{T}\{a_1 x_1(t) + a_2 x_2(t)\} && \text{(i)} \\
&\quad y_1(t - \tau) = \mathcal{T}\{x_1(t - \tau)\} && \text{(ii)}
\end{aligned} \tag{3.125}
$$
where a1 and a2 are arbitrary constants. Rule (i) is known as the principle of superposition, and states that the system produces an output by always doing the same thing to every input and then adding the results together.
Rule (ii) expresses time-invariance: what the system does is not dependent on the time the input is applied. That is, the only change to the system output in response to a delayed input is a delay of the same amount. In practice the system will not be indefinitely time-invariant, and it will be acceptable to treat the system as time-invariant if it is time-invariant only for the duration of a call or connection. In this case there will be a new challenge to obtain the system or channel characterisation (a task known as channel estimation) at the start of each call since the channel behaviour may vary from call to call. We will also assume that the system is causal, which means that there is no system response before an input is applied. Refer to Section 2.10 in Chapter 2 for a detailed discussion of basic system properties.

It will be useful in the ensuing discussion to note the following alternative expression of Eq. (3.125), which uses the notation $x(t) \xrightarrow{\;R\;} y(t)$ to mean ‘x(t) yields response y(t)’

$$
\begin{aligned}
&\text{Given that } x_1(t) \xrightarrow{\;R\;} y_1(t) \text{ and } x_2(t) \xrightarrow{\;R\;} y_2(t), \text{ then} \\
&\quad a_1 x_1(t) + a_2 x_2(t) \xrightarrow{\;R\;} a_1 y_1(t) + a_2 y_2(t) && \text{(i)} \\
&\quad x_1(t - \tau) \xrightarrow{\;R\;} y_1(t - \tau) && \text{(ii)}
\end{aligned} \tag{3.126}
$$
3.6.1 LTI System Response

An LTI system may be fully characterised in the time domain by its impulse response $h(t)$, which is the output of the system when the excitation or input signal is the unit impulse function $\delta(t)$, as illustrated in Figure 3.22 for both continuous-time (CT) and discrete-time (DT) LTI systems. We are interested in the response $y(t)$ of a CT LTI system to an arbitrary excitation $x(t)$ and in the response $y(n)$ of a DT LTI system to an arbitrary input sequence $x[n]$, as shown in Figure 3.23.

Figure 3.22 Impulse response. (A CT LTI system with input 𝛿(t) and output h(t); a DT LTI system with input 𝛿[n] and output h(n).)

Figure 3.23 Response to arbitrary input. (A CT LTI system h(t) with input x(t) and output y(t); a DT LTI system h[n] with input x[n] and output y(n).)

A clarification of notation is important here. We will use square brackets such as in $x[n]$, $y[n]$, $h[n]$, $g[n]$, etc., to refer to the entire sequence of a DT signal, and round brackets such as in $y(n)$ to denote the nth sample of the sequence. So, $x(n)$ is the present sample of the input, $x(n-1)$ is the immediate past input sample, $x(n+1)$ is the immediate future or next input sample, and so on. Also, the present output sample $y(n)$ will in general be computed using a mathematical operation that involves one or more input samples (i.e. $x[n]$) rather than just the present input sample $x(n)$.

Considering first the CT case, Figure 3.24a shows a sample (of small thickness $\Delta\tau$) of an arbitrary input signal $x(t)$ taken at time $t = \tau$. This sample has height $x(\tau)$ and hence area $x(\tau)\cdot\Delta\tau$, and may be represented as $x(\tau)\cdot\Delta\tau\cdot\delta(t-\tau)$, which is an impulse of weight $x(\tau)\cdot\Delta\tau$ located at $t = \tau$. In view of Eq. (3.126), the response of the system to this weighted and delayed impulse will be a similarly weighted and delayed impulse response. That is

$$
x(\tau)\cdot\Delta\tau\cdot\delta(t-\tau) \xrightarrow{\;R\;} x(\tau)\cdot\Delta\tau\cdot h(t-\tau) \tag{3.127}
$$

Figure 3.24 Response y(t) of an LTI CT system to an arbitrary input x(t). (Panel (a): a single weighted impulse x(𝜏)·Δ𝜏·𝛿(t − 𝜏) and its response x(𝜏)·Δ𝜏·h(t − 𝜏); panel (b): the sum of weighted impulses at t = kΔ𝜏 and the corresponding sum of responses; panel (c): the limiting integrals for x(t) and y(t) = x(t) ∗ h(t).)
Figure 3.24b shows the entire signal $x(t)$ being approximated as a sequence of samples, each of thickness $\Delta\tau$, taken at $t = k\Delta\tau$, where $k$ takes on integer values from $-\infty$ to $\infty$, as necessary in order to span the entire duration of $x(t)$. The kth sample occurs at $t = k\Delta\tau$, has height $x(k\Delta\tau)$ and thickness $\Delta\tau$, and is therefore the impulse $x(k\Delta\tau)\cdot\Delta\tau\cdot\delta(t - k\Delta\tau)$. The input signal $x(t)$ is therefore approximated by the sum of these impulses from $k = -\infty$ to $k = \infty$ as

$$
x(t) \approx \sum_{k=-\infty}^{\infty} x(k\Delta\tau)\cdot\Delta\tau\cdot\delta(t - k\Delta\tau) \tag{3.128}
$$

In line with Eq. (3.126), the response to this sum of weighted and delayed impulses will be a sum of similarly weighted and delayed impulse responses. That is

$$
\sum_{k=-\infty}^{\infty} x(k\Delta\tau)\cdot\Delta\tau\cdot\delta(t - k\Delta\tau) \xrightarrow{\;R\;} \sum_{k=-\infty}^{\infty} x(k\Delta\tau)\cdot\Delta\tau\cdot h(t - k\Delta\tau) \tag{3.129}
$$
In the limit $\Delta\tau \to 0$, the approximation of $x(t)$ in Eq. (3.128) becomes exact, as shown in Figure 3.24c, the discrete instants $k\Delta\tau$ become a continuous variable $\tau$, and the summations in Eq. (3.129) (and in Figure 3.24b) become integrations with $\Delta\tau$ replaced by $d\tau$, so that

$$
x(t) = \int_{-\infty}^{\infty} x(\tau)\,\delta(t-\tau)\,d\tau \xrightarrow{\;R\;} y(t) = \int_{-\infty}^{\infty} x(\tau)\,h(t-\tau)\,d\tau \equiv x(t) * h(t)
$$
Notice that the left-hand side of the arrow is simply the sifting property of the impulse function encountered earlier in Eq. (2.31), taking the even property of the impulse function into account. The right-hand side is called the convolution integral, the convolution operation being denoted as above with an asterisk. This is a very important result. It states that the output $y(t)$ of an LTI system in response to an arbitrary input $x(t)$ is obtained by convolving $x(t)$ with the impulse response $h(t)$ of the system

$$
y(t) = \int_{-\infty}^{\infty} x(\tau)\,h(t-\tau)\,d\tau \equiv x(t) * h(t) \tag{3.130}
$$

Noting that $t$ is a constant as far as the integration is concerned, and substituting $t - \lambda = \tau$ (so that $d\tau = -d\lambda$, $\lambda = t - \tau = -\infty$ when $\tau = \infty$, and $\lambda = \infty$ when $\tau = -\infty$) yields

$$
y(t) = -\int_{\infty}^{-\infty} x(t-\lambda)\,h(\lambda)\,d\lambda = \int_{-\infty}^{\infty} h(\lambda)\,x(t-\lambda)\,d\lambda
$$

Restoring our use of $\tau$ (in place of $\lambda$) yields

$$
y(t) = \int_{-\infty}^{\infty} h(\tau)\,x(t-\tau)\,d\tau \equiv h(t) * x(t) \tag{3.131}
$$

Thus, the convolution operation is commutative, which means that the same result is obtained whatever the ordering of the signals being convolved. The convolution operation is also associative and distributive as summarised in the next equation.

$$
\begin{aligned}
x(t) * h(t) &= h(t) * x(t) && \text{Commutative law} \\
x(t) * [h_1(t) * h_2(t)] &= [x(t) * h_1(t)] * h_2(t) && \text{Associative law} \\
x(t) * [h_1(t) + h_2(t)] &= x(t) * h_1(t) + x(t) * h_2(t) && \text{Distributive law}
\end{aligned} \tag{3.132}
$$
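The three laws in Eq. (3.132) can be spot-checked with short finite sequences standing in for the CT signals. This is only a sketch: the helpers `conv` and `add` and the sequence values are illustrative assumptions, and a check on a few sequences is a sanity test rather than a proof.

```python
def conv(x, h):
    """Full discrete convolution of two finite sequences that both start at index 0."""
    y = [0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

def add(a, b):
    """Element-wise sum, zero-padding the shorter sequence on the right."""
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return [p + q for p, q in zip(a, b)]

# Illustrative integer sequences, so the comparisons are exact
x, h1, h2 = [1, 2, 3], [1, -1], [2, 0, 1]

print(conv(x, h1) == conv(h1, x))                               # commutative
print(conv(x, conv(h1, h2)) == conv(conv(x, h1), h2))           # associative
print(conv(x, add(h1, h2)) == add(conv(x, h1), conv(x, h2)))    # distributive
```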
These properties have important practical implications. For example, they imply that the order of arrangement of a cascade connection of LTI stages does not affect the overall output, and that the impulse response of a system comprising two LTI stages is the convolution of the impulse responses of the two stages. Note that a cascade connection means a series connection in which there is impedance matching between stages so that there is no signal reflection and the output of one stage is entirely delivered to the next stage and equals the input of that stage.

If the LTI system is causal then $h(t) = 0$ for $t < 0$, since $h(t)$ is the response to a signal $\delta(t)$ that is applied at $t = 0$. Since the integrand in Eq. (3.131) has $h(\tau)$ as a factor, the integral will be zero in the interval $\tau = (-\infty, 0)$ where $h(\tau) = 0$, and hence

$$
\begin{aligned}
y(t) = \int_{-\infty}^{\infty} h(\tau)\,x(t-\tau)\,d\tau &= \int_{-\infty}^{0} h(\tau)\,x(t-\tau)\,d\tau + \int_{0}^{\infty} h(\tau)\,x(t-\tau)\,d\tau \\
&= \int_{0}^{\infty} h(\tau)\,x(t-\tau)\,d\tau
\end{aligned}
$$

Now making the substitution $t - \tau = \lambda$ in the last integral, which means $\tau = t - \lambda$, $d\tau = -d\lambda$, $\lambda = t$ when $\tau = 0$, $\lambda = -\infty$ when $\tau = \infty$, we obtain

$$
\begin{aligned}
y(t) = \int_{0}^{\infty} h(\tau)\,x(t-\tau)\,d\tau &= -\int_{t}^{-\infty} h(t-\lambda)\,x(\lambda)\,d\lambda = \int_{-\infty}^{t} x(\lambda)\,h(t-\lambda)\,d\lambda \\
&\equiv \int_{-\infty}^{t} x(\tau)\,h(t-\tau)\,d\tau
\end{aligned}
$$
The response $y(t)$ of a causal LTI system having impulse response $h(t)$ to an input signal $x(t)$ is therefore given by the expressions

$$
\begin{aligned}
y(t) &= \int_{0}^{\infty} h(\tau)\,x(t-\tau)\,d\tau \\
&= \int_{-\infty}^{t} x(\tau)\,h(t-\tau)\,d\tau
\end{aligned} \qquad \text{(Causal system)} \tag{3.133}
$$

Notice that the second integral indicates that only values of the input occurring prior to time $t$ in the interval $(-\infty, t)$ contribute to the output $y(t)$ at time $t$, and specifically that future input values occurring in the interval $(t, \infty)$ do not contribute to $y(t)$. This is a manifestation of the system’s causality. If both the system and the input signal are causal (i.e. $h(t), x(t) = 0$ for $t < 0$), then the second integral, which contains the factor $x(\tau)$ in its integrand, will be zero in the interval $\tau = (-\infty, 0)$ where $x(\tau) = 0$, so that we have

$$
\begin{aligned}
y(t) &= \int_{0}^{t} x(\tau)\,h(t-\tau)\,d\tau \\
&= \int_{0}^{t} h(\tau)\,x(t-\tau)\,d\tau
\end{aligned} \qquad \text{(Causal input signal and causal system)} \tag{3.134}
$$
where we have used the substitution $t - \tau = \lambda$ (as previously done) to derive the last integral from the first. The first integral indicates that, because the system is causal, only past inputs (starting in this case at $t = 0$ in view of the input signal’s causality) contribute to $y(t)$.

Now turning our attention to a DT system that is characterised by impulse response $h[n]$, let the system operation be denoted by the operator $\mathcal{T}$, so that

$$
h(n) = \mathcal{T}\{\delta(n)\} \tag{3.135}
$$

Recall from Eq. (2.34) that an arbitrary DT signal $x[n]$ may be expressed as a sum of delayed weighted impulses as

$$
x[n] = \sum_{k=-\infty}^{\infty} x(k)\,\delta(n-k)
$$

where $x[n]$ denotes the entire DT signal, whereas $x(k)$ is the value of $x[n]$ at the kth sampling instant. By virtue of linearity, the response $y(n)$ of the system to $x[n]$ will be

$$
\begin{aligned}
y(n) = \mathcal{T}\{x[n]\} &= \mathcal{T}\left\{\sum_{k=-\infty}^{\infty} x(k)\,\delta(n-k)\right\} \\
&= \sum_{k=-\infty}^{\infty} x(k)\,\mathcal{T}\{\delta(n-k)\}
\end{aligned}
$$

In view of time invariance, it follows from Eq. (3.135) that

$$
\mathcal{T}\{\delta(n-k)\} = h(n-k)
$$

Therefore, if an LTI system is characterised by impulse response $h[n]$ then its output $y(n)$ in response to an arbitrary input $x[n]$ is given by the convolution sum

$$
y(n) = \sum_{k=-\infty}^{\infty} x(k)\,h(n-k) = x[n] * h[n] \tag{3.136}
$$
Figure 3.25 Finite duration sequences: (a) Input sequence x(k) plotted against k; (b) Finite impulse response h(n) plotted against n.
This convolution sum is the discrete equivalent of the convolution integral of Eq. (3.130). In the above summation, substituting $n - k = m$ (and hence $k = n - m$, $m = -\infty$ when $k = \infty$, $m = \infty$ when $k = -\infty$) yields

$$
\begin{aligned}
y(n) = \sum_{m=\infty}^{-\infty} x(n-m)\,h(m) &= \sum_{m=-\infty}^{\infty} h(m)\,x(n-m) \\
&\equiv \sum_{k=-\infty}^{\infty} h(k)\,x(n-k) = h[n] * x[n]
\end{aligned} \tag{3.137}
$$
Thus, $y(n) = x[n] * h[n] = h[n] * x[n]$, which means that the convolution of DT signals is commutative just as is the convolution of CT signals. In fact, the commutative, associative, and distributive laws discussed earlier for convolution of CT signals are also applicable in the DT case.

It is often the case that the input sequence $x[n]$ is of finite duration so that $x(k)$ in Eq. (3.136) is zero outside the interval $k = (k_1, k_2)$. For example, in Figure 3.25a $x(k)$ is zero outside the range $k = (-2, 3)$, so $k_1 = -2$ and $k_2 = 3$. The summation in Eq. (3.136) now only needs to be performed within this range to obtain system response

$$
y(n) = \sum_{k=k_1}^{k_2} x(k)\,h(n-k), \qquad x(k) = 0 \text{ for } k < k_1 \text{ or } k > k_2 \tag{3.138}
$$

Making the substitution $m = n - k$ in the above summation, and subsequently restoring $k$ as the index of summation (in place of $m$), yields a sum whose index runs from $n - k_1$ down to $n - k_2$

$$
y(n) = \sum_{k=n-k_1}^{n-k_2} h(k)\,x(n-k)
$$
If in addition to the input sequence $x[k]$ being confined to the range $k = (k_1, k_2)$, the impulse response $h[k]$ is also zero outside the range $k = (n_1, n_2)$ then the factor $h(k)$ in the above summation will be zero when $k$ ($= n - k_1$) is less than $n_1$, i.e. $n < n_1 + k_1$, and when $k$ ($= n - k_2$) exceeds $n_2$, i.e. $n > n_2 + k_2$. This means that the output sequence $y[n]$ will be zero for $n < n_1 + k_1$ or $n > n_2 + k_2$, and so we may write

$$
\begin{aligned}
&\text{When } x(k) = 0 \text{ for } k < k_1 \text{ or } k > k_2, \text{ and } h(n) = 0 \text{ for } n < n_1 \text{ or } n > n_2, \text{ then} \\
&y(n) = \begin{cases} \displaystyle\sum_{k=k_1}^{k_2} x(k)\,h(n-k), & n = n_1 + k_1, \cdots, n_2 + k_2 \\ 0, & \text{otherwise} \end{cases}
\end{aligned} \tag{3.139}
$$
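Equation (3.139) translates fairly directly into a pair of nested loops. The sketch below is one possible rendering: the function name `convolve_finite`, the way start indices are passed, and the sample values are assumptions made for illustration (the values are not read from Figure 3.25).

```python
def convolve_finite(x, k1, h, n1):
    """Evaluate y(n) = sum_k x(k) h(n - k) for finite sequences, as in Eq. (3.139).

    x holds x(k1), x(k1 + 1), ..., x(k2); h holds h(n1), ..., h(n2).
    Returns (y, n_start), where y[0] is the output sample at n = n1 + k1.
    """
    k2 = k1 + len(x) - 1
    n2 = n1 + len(h) - 1
    y = []
    for n in range(n1 + k1, n2 + k2 + 1):      # only range where y(n) can be nonzero
        total = 0
        for k in range(k1, k2 + 1):
            m = n - k                          # argument of h
            if n1 <= m <= n2:                  # h(m) is zero outside n1..n2
                total += x[k - k1] * h[m - n1]
        y.append(total)
    return y, n1 + k1

# Illustrative values: x(-1), x(0), x(1) and h(0), h(1)
x = [1, 2, 3]
h = [1, 1]
y, start = convolve_finite(x, -1, h, 0)
print(start, y)
```

The output has length (k2 − k1 + 1) + (n2 − n1 + 1) − 1, which is the number of integers from n1 + k1 to n2 + k2.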
Figure 3.25b shows a plot of h[n] for a causal finite impulse response (FIR) system in which n1 = 0 (as is always the case for a causal system) and n2 = 4. Convolution is a key operation in signal transmission, so it is important to have complete clarity about its meaning and evaluation. We do this in the next two sections through further discussions and worked examples that emphasise both a graphical interpretation and direct mathematical computation in the case of CT signals and a tabulated approach for DT signals. You may at this point wish to review the basic signal operations of time shifting and time reversal discussed in Section 3.2.
3.6.2 Evaluation of Convolution Integral

The convolution $y(t) = x(t) * h(t)$ of two CT signals $x(t)$ and $h(t)$ has a value $y(t_o)$ at time $t = t_o$ that is computed through the following steps.

(a) Time reverse $h(t)$ to obtain $h(-t)$.
(b) Delay $h(-t)$ by $t_o$ to obtain $h(-t + t_o) \equiv h(t_o - t)$. These two steps can be denoted as $h(t_o - t) \equiv \mathbb{D}[\mathbb{TR}[h(t)], t_o]$, where $\mathbb{TR}[g(t)]$ denotes time reversal of signal $g(t)$ and $\mathbb{D}[g(t), t_o]$ denotes delay of $g(t)$ by $t_o$.
(c) Multiply $x(t)$ by $h(t_o - t)$ to obtain the product waveform $x(t)h(t_o - t)$.
(d) The desired value $y(t_o)$ is given by the total area under $x(t)h(t_o - t)$. This is the integration of $x(t)h(t_o - t)$ with respect to $t$ in the range $t = (-\infty, \infty)$.

The above steps describe a graphical approach to obtain the value of the convolution of two CT signals $x(t)$ and $h(t)$ at any given time $t = t_o$. Mathematically, these steps correspond to determining the output at any arbitrary time $t$ by evaluating the integral

$$
y(t) = \int_{-\infty}^{\infty} x(\tau)\,h(t-\tau)\,d\tau \tag{3.140}
$$

Worked Example 3.8 Graphical Evaluation of Convolution Integral
A system has the causal finite impulse response $h(t)$ shown in Figure 3.26a. Determine the output $y(t)$ of this system when the input is the rectangular pulse $x(t)$ also shown. Discuss in what ways the input signal $x(t)$ has been modified by this system.

Figure 3.26 Worked Example 3.8: (a) Impulse response h(t) and input signal x(t); (b) h(t − 𝜏) and x(𝜏) for t ≤ −5 and for t ≥ 9; (c) h(t − 𝜏) and x(𝜏) for −5 ≤ t ≤ −1 and for 5 ≤ t ≤ 9; (d) h(t − 𝜏) and x(𝜏) for −1 ≤ t ≤ 5.

We solve this problem by evaluating

$$
y(t) = \text{Area under } x(\tau) \times \mathbb{D}[\mathbb{TR}[h(\tau)], t] = \int_{-\infty}^{\infty} x(\tau)\,h(t-\tau)\,d\tau
$$

at enough values of $t$ to obtain a smooth waveform of the output $y(t)$. In carrying out this evaluation, it is convenient to partition $t$ into distinct regions within which the area under the product waveform is derived from the same geometric shape. Figure 3.26b shows that for $t \le -5$ and for $t \ge 9$ there is no overlap between $x(\tau)$ and $h(t-\tau)$, so the product waveform $x(\tau)h(t-\tau) = 0$. Thus

$$
y(t) = 0, \quad t \le -5; \qquad y(t) = 0, \quad t \ge 9
$$

Next, the left-hand side of Figure 3.26c shows that for $-5 \le t \le -1$ the shaded portion of $h(t-\tau)$ overlaps with $x(\tau)$. The area under the product waveform is the area of a vertical trapezium (of base $b$ and parallel sides $a_1$ and $a_2$) multiplied by $x(\tau) = 2$. Notice that $b$ is the gap between $t$ and $-5$, so $b = t - (-5) = t + 5$. Also, $a_1 = 3$, and the time taken by the linear slope to rise to level $a_2$ is the gap between $-5$ and $t - 4$, which is $-5 - (t - 4) = -(1 + t)$. Since this slope rises through 3 units in 4 seconds, it means that $a_2 = -(1 + t) \times 3/4$. Therefore, in this region

$$
\begin{aligned}
y(t) &= \frac{1}{2}(a_1 + a_2) \times b \times 2 \\
&= \left(3 - \frac{3}{4}(1 + t)\right)(t + 5) \\
&= 3(3 - t)(t + 5)/4, \qquad -5 \le t \le -1
\end{aligned}
$$
Another region of overlap, $5 \le t \le 9$, is shown on the right-hand side of Figure 3.26c. Here the area under the product waveform is the area of a triangle (of base $d$ and height $a$) multiplied by $x(\tau) = 2$. Note that $d = 5 - (t - 4) = 9 - t$ and $a$ is the level reached in a time of $d$ seconds by the above linear slope (which rises by 3 units in 4 seconds), so $a = 3d/4$. Thus, in this region

$$
\begin{aligned}
y(t) &= \frac{1}{2}\, a \times d \times 2 = \frac{3}{4} d^2 \\
&= \frac{3}{4}(9 - t)^2, \qquad 5 \le t \le 9
\end{aligned}
$$

Figure 3.27 Output y(t) of the system in Worked Example 3.8 along with the input x(t) for comparison. (Horizontal axis: t, sec.)
Figure 3.26d shows one more region of overlap in which $h(t-\tau)$ lies fully within the interval of $x(\tau)$. The area under the product waveform is therefore the area of a triangle (of base 4 and height 3) multiplied by $x(\tau) = 2$. Thus, in this region

$$
y(t) = \frac{1}{2} \times 4 \times 3 \times 2 = 12, \qquad -1 \le t \le 5
$$

To summarise, the output of the system is given by

$$
y(t) = \begin{cases}
0, & t \le -5 \\
3(3 - t)(t + 5)/4, & -5 \le t \le -1 \\
12, & -1 \le t \le 5 \\
3(9 - t)^2/4, & 5 \le t \le 9 \\
0, & t \ge 9
\end{cases}
$$
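The piecewise result can be cross-checked by evaluating the convolution integral numerically. The sketch below rests on assumptions inferred from the worked solution rather than read from Figure 3.26a: h(t) is taken to start at 3 at t = 0 and fall linearly to 0 at t = 4 s, and x(t) is taken to be a pulse of height 2 over −5 ≤ t ≤ 5. The function names are illustrative.

```python
def h(t):
    # Assumed impulse response: 3 at t = 0, falling linearly to 0 at t = 4 s
    return 3.0 * (1.0 - t / 4.0) if 0.0 <= t <= 4.0 else 0.0

def x(t):
    # Assumed input: rectangular pulse of height 2 spanning -5 s to 5 s
    return 2.0 if -5.0 <= t <= 5.0 else 0.0

def y_numeric(t, steps=4000):
    """Midpoint-rule estimate of the integral of x(tau) h(t - tau) over tau in [t - 4, t]."""
    dtau = 4.0 / steps
    total = 0.0
    for i in range(steps):
        offset = (i + 0.5) * dtau           # distance of tau above t - 4
        total += x(t - 4.0 + offset) * h(4.0 - offset)
    return total * dtau

def y_closed(t):
    """Piecewise closed form from the worked example."""
    if t <= -5 or t >= 9:
        return 0.0
    if t <= -1:
        return 3 * (3 - t) * (t + 5) / 4
    if t <= 5:
        return 12.0
    return 3 * (9 - t) ** 2 / 4

for t in (-6, -3, 0, 2, 7, 10):
    print(t, y_numeric(t), y_closed(t))
```

Only the interval from t − 4 to t is integrated, since the assumed h(t − 𝜏) is zero elsewhere.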
This output is plotted in Figure 3.27 along with the input for ease of comparison. We see that the system has changed the input signal in several ways: (i) scaling: the amplitude of $y(t)$ is six times that of $x(t)$; (ii) delay: $y(t)$ reaches its peak 4 s after $x(t)$ reaches peak; (iii) smoothing: a rectangular input waveform $x(t)$ with abrupt transitions to/from peak is smoothed into an output $y(t)$ having a gradual transition to/from peak; (iv) spreading: the duration of $y(t)$ is 4 s longer than that of $x(t)$. Note that only the last two changes constitute a distortion.

Worked Example 3.9 Direct Evaluation of Convolution Integral

A system has the impulse response $h(t) = e^{-\beta t}u(t)$, $\beta > 0$. Determine the output $y(t)$ of this system when the input is the sinusoidal signal $x(t) = \cos(\omega t)$. Discuss in what ways the input signal $x(t)$ has been modified by this system and identify the type of system.
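This problem requires evaluation of Eq. (3.140) with $x(t) = \cos(\omega t)$ and $h(t) = e^{-\beta t}u(t)$, and hence $x(\tau) = \cos(\omega\tau)$ and $h(t-\tau) = e^{-\beta(t-\tau)}u(t-\tau)$

$$
y(t) = \int_{-\infty}^{\infty} \cos(\omega\tau)\,e^{-\beta(t-\tau)}\,u(t-\tau)\,d\tau
$$

But

$$
u(t-\tau) = \begin{cases} 1, & \tau \le t \\ 0, & \tau > t \end{cases}
$$

so the integral simplifies to

$$
y(t) = e^{-\beta t} \int_{-\infty}^{t} \cos(\omega\tau)\,e^{\beta\tau}\,d\tau
$$

To avoid needing to use integration by parts, let us invoke Euler’s formula $e^{j\omega\tau} = \cos(\omega\tau) + j\sin(\omega\tau)$, and hence

$$
\begin{aligned}
\cos(\omega\tau) &= \frac{1}{2}\left(e^{j\omega\tau} + e^{-j\omega\tau}\right) \\
\sin(\omega\tau) &= \frac{1}{2j}\left(e^{j\omega\tau} - e^{-j\omega\tau}\right) = -\frac{1}{2}j\left(e^{j\omega\tau} - e^{-j\omega\tau}\right)
\end{aligned}
$$

to transform the integrand into exponentials that are readily integrated

$$
\begin{aligned}
y(t) &= \frac{1}{2}e^{-\beta t} \int_{-\infty}^{t} \left[e^{(\beta+j\omega)\tau} + e^{(\beta-j\omega)\tau}\right] d\tau \\
&= \frac{1}{2}e^{-\beta t} \left[\frac{e^{(\beta+j\omega)\tau}}{\beta+j\omega} + \frac{e^{(\beta-j\omega)\tau}}{\beta-j\omega}\right]_{-\infty}^{t} = \frac{1}{2}e^{-\beta t} \left[\frac{e^{(\beta+j\omega)t}}{\beta+j\omega} + \frac{e^{(\beta-j\omega)t}}{\beta-j\omega} - 0\right] \\
&= \frac{1}{2}\left[\frac{e^{j\omega t}}{\beta+j\omega} + \frac{e^{-j\omega t}}{\beta-j\omega}\right] = \frac{1}{2}\left[\frac{e^{j\omega t}}{\beta+j\omega} \times \left(\frac{\beta-j\omega}{\beta-j\omega}\right) + \frac{e^{-j\omega t}}{\beta-j\omega} \times \left(\frac{\beta+j\omega}{\beta+j\omega}\right)\right] \\
&= \frac{1}{\beta^2+\omega^2}\left[\beta\,\frac{1}{2}\left(e^{j\omega t} + e^{-j\omega t}\right) - \omega\,\frac{1}{2}j\left(e^{j\omega t} - e^{-j\omega t}\right)\right] \\
&= \frac{1}{\beta^2+\omega^2}\left[\beta\cos(\omega t) + \omega\sin(\omega t)\right]
\end{aligned}
$$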
Combining the cosine and sine terms as learnt in Section 2.7.3 finally yields

$$
\begin{aligned}
y(t) &= \frac{1}{\beta^2+\omega^2}\sqrt{\beta^2+\omega^2}\,\cos\!\left(\omega t - \tan^{-1}(\omega/\beta)\right) \\
&= \frac{1}{\sqrt{\beta^2+\omega^2}}\cos\!\left(\omega t - \tan^{-1}(\omega/\beta)\right)
\end{aligned}
$$

We see that the response of this system to a sinusoidal input is also a sinusoidal signal of the same frequency but of different amplitude and phase. This describes the behaviour of LTI systems in general towards an input signal that comprises one or more sinusoidal signals. The output frequencies will always be the same as input frequencies and the only changes effected by the system on the signal will be some modification of input amplitudes through the system’s gain response and some phase shift through the system’s phase response. We will learn more about this in the next chapter. In this case, with input and output amplitudes given, respectively, by

$$
A_{\text{in}} = 1, \qquad A_{\text{out}} = \frac{1}{\sqrt{\beta^2+\omega^2}}
$$

the attenuation $L$ (in dB) of the system is

$$
\begin{aligned}
L &= 20\log_{10}\!\left(\frac{A_{\text{in}}}{A_{\text{out}}}\right) = 20\log_{10}\!\left(\sqrt{\beta^2+\omega^2}\right) \\
&= 10\log_{10}(\beta^2+\omega^2) = 10\log_{10}\!\left[\beta^2(1+\omega^2/\beta^2)\right] \\
&= 20\log_{10}\beta + 10\log_{10}(1+\omega^2/\beta^2)
\end{aligned}
$$

This attenuation increases with (angular) frequency $\omega$ from a minimum $L_{\min} = 20\log_{10}\beta$ dB at DC ($\omega = 0$). The system is therefore a lowpass filter, and its cut-off frequency $f_1$ (or 3 dB bandwidth $B_{3\text{dB}}$) is defined as the frequency in hertz (Hz) at which its attenuation is 3 dB higher than $L_{\min}$. Clearly, this is the frequency at which $10\log_{10}(1+\omega^2/\beta^2) = 3$ dB, which means $\omega^2/\beta^2 = 1$, so $\omega = \beta$. Hence (since frequency is $\omega/2\pi$)

$$
f_1 = B_{3\text{dB}} = \frac{\beta}{2\pi}
$$
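The closed-form response and the cut-off frequency can be sanity-checked numerically. This is a sketch under stated assumptions: the values 𝛽 = 2 and 𝜔 = 3 are arbitrary, the function names are illustrative, and the convolution integral is rewritten in the form of Eq. (3.133) and truncated at u = 20, where the exponential factor is negligible.

```python
import math

def y_numeric(t, beta, omega, u_max=20.0, steps=40000):
    """Midpoint-rule estimate of the integral of e^{-beta u} cos(omega (t - u)) over u >= 0."""
    du = u_max / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * du
        total += math.exp(-beta * u) * math.cos(omega * (t - u))
    return total * du

def y_closed(t, beta, omega):
    """Closed-form output derived in the worked example."""
    amplitude = 1.0 / math.sqrt(beta**2 + omega**2)
    return amplitude * math.cos(omega * t - math.atan(omega / beta))

def attenuation_db(beta, omega):
    """Attenuation L = 10 log10(beta^2 + omega^2) in dB."""
    return 10.0 * math.log10(beta**2 + omega**2)

beta, omega = 2.0, 3.0
for t in (0.0, 0.7, 2.5):
    print(t, y_numeric(t, beta, omega), y_closed(t, beta, omega))

# Attenuation at omega = beta exceeds the DC attenuation by 10 log10(2), about 3 dB
print(attenuation_db(beta, beta) - attenuation_db(beta, 0.0))
print("cut-off frequency in Hz:", beta / (2 * math.pi))
```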
3.6.3 Evaluation of Convolution Sum

We will discuss the evaluation of convolution sums by learning how to determine the output of two classes of systems, namely recursive and non-recursive discrete-time (DT) systems. A non-recursive DT system is one whose output $y(n)$ depends only on the present and past values of the input sequence $x[n]$. Thus, the output of this system is given by the linear difference equation

$$
y(n) = \sum_{k=0}^{N} a_k\,x(n-k) \qquad \text{(non-recursive DT system)} \tag{3.141}
$$

Comparing this equation with Eq. (3.137) for the output of a DT LTI system having impulse response $h[n]$

$$
y(n) = h[n] * x[n] = \sum_{k=-\infty}^{\infty} h(k)\,x(n-k) \equiv \sum_{k=0}^{N} a_k\,x(n-k)
$$

we see that $h(k) = a_k$ in the above line, which means that the impulse response $h[n]$ of the system corresponds to the multiplier coefficients $a_k$ in Eq. (3.141). That is

$$
h[n] = \{a_0, a_1, a_2, \cdots, a_N\} \tag{3.142}
$$
Figure 3.28 shows an illustrative sketch of this impulse response along with a block diagram of the system, where the delay operation $\mathbb{D}[x(n), 1]$ that delays its input by one sampling interval is now denoted in short form simply as $\mathbb{D}$. The convention used to represent information in the block diagram is as follows:

● The direction of signal flow along a branch is indicated by an arrow.
● Branches leading into the delay elements are labelled with their signal values. For example, the input of the first delay element is $x(n)$; the input of the second delay element is $x(n-1)$, and so on.
● Branches leading into the summing device ∑ are labelled with a branch transmittance or coefficient that specifies a multiplication factor by which the branch modifies a signal flowing through it. For example, moving left to right, the first branch feeds signal $x(n)h(0)$ into the summer, the second branch feeds $x(n-1)h(1)$ into the summer, and so on.

Equation (3.141) and Figure 3.28 represent an Nth-order finite impulse response (FIR) system, also described as an FIR filter or processor, since its impulse response is a sequence of finite length $N + 1$, where $N$ is the number of delay elements in the processor.
Figure 3.28 Non-recursive DT system or finite impulse response (FIR) filter: (a) impulse response h(n), with samples h(k) = a_k at k = 0, 1, …, N; and (b) block diagram of Nth-order FIR filter.
An alternative and straightforward way to obtain the impulse response h[n] of an FIR system from its linear difference equation is by noting that h(n) is the output when the input x[n] is the unit impulse sequence 𝛿[n]. Thus, replacing x[n] by 𝛿[n] in Eq. (3.141) yields

$$\begin{aligned}
h(n) &= \sum_{k=0}^{N} a_k\,\delta(n-k) \\
&= a_0\delta(n) + a_1\delta(n-1) + a_2\delta(n-2) + a_3\delta(n-3) + \cdots + a_N\delta(n-N) \\
&= \begin{cases}
a_0, & n = 0 \\
a_1, & n = 1 \\
a_2, & n = 2 \\
\;\vdots \\
a_N, & n = N \\
0, & n > N, \text{ or } n < 0
\end{cases}
\end{aligned}$$
An FIR system is guaranteed always to be stable since a necessary and sufficient condition for the stability of an LTI system is that

$$\begin{aligned}
\int_{-\infty}^{\infty} |h(t)|\,dt &< \infty, && \text{Stable CT LTI System} \\
\sum_{n=-\infty}^{\infty} |h(n)| &< \infty, && \text{Stable DT LTI System}
\end{aligned} \tag{3.143}$$

and in this case

$$\sum_{n=-\infty}^{\infty} |h(n)| = \sum_{n=0}^{N} |a_n| < \infty$$

provided all coefficients $a_k$ are finite, which will be the case if the system is realisable.
A recursive DT system is one whose output y(n) depends on past and present values of both the input x[n] and output y[n]. A recursive implementation always involves feedback, and in general may depend on the present input, N past values of the input, and M past values of the output. A general form of the linear difference equation of a recursive DT system is therefore

$$y(n) = \sum_{k=0}^{N} b_k\, x(n-k) - \sum_{k=1}^{M} a_k\, y(n-k) \tag{3.144}$$
Obviously, it is essential that at least one of the $a_k$ coefficients is nonzero for the system to be recursive (i.e. for the present output to depend on past output values). When that is the case, the impulse response will be an unending (i.e. infinite) sequence. For this reason, a recursive DT system is called an infinite impulse response (IIR) filter. The stability of an IIR filter is not guaranteed but depends on the values of the coefficients being carefully chosen in order for the resulting impulse response h[n] to satisfy Eq. (3.143).

Worked Example 3.10 Convolution Sum for an FIR System

We wish to determine the response of a causal FIR system to the input sequence x[n] shown in Figure 3.29a and defined by

$$x(n) = \begin{cases}
4, & n = -2 \\
8, & n = -1 \\
12, & n = 0 \\
8, & n = 1 \\
4, & n = 2 \\
0, & \text{Otherwise}
\end{cases} \tag{3.145}$$

The impulse response h[n] of the system, shown in Figure 3.29b, is defined by

$$h(n) = \begin{cases}
10, & n = 0 \\
6, & n = 1 \\
3, & n = 2 \\
2, & n = 3 \\
0, & \text{Otherwise}
\end{cases} \tag{3.146}$$

The output of this FIR system is obtained by evaluating the convolution sum specified by Eq. (3.139)

$$y(n) = \begin{cases}
\displaystyle\sum_{k=k_1}^{k_2} x(k)\,h(n-k), & n = n_1 + k_1, \cdots, n_2 + k_2 \\
0, & \text{Otherwise}
\end{cases}$$
noting that the input sequence x[k] is zero outside the interval k = (−2, 2), so k1 = −2 and k2 = 2; and the impulse response sequence h[n] is zero outside the interval n = (0, 3), so n1 = 0 and n2 = 3. Therefore y(n) is nonzero only within the interval n = (n1 + k1 , n2 + k2 ) = (−2, 5). This means that we only need to compute y(−2), y(−1), y(0), y(1), y(2), y(3), y(4), and y(5), the other output samples being zero. What the above convolution sum operation means is that the output sample y(n) is the sum of the product of corresponding elements of the sequences x[k] and h[n − k]. Figure 3.29c–g should provide complete clarity about this important operation, so let us consider them one by one. The sequence h[−k] is shown in Figure 3.29c, obtained by time-reversing the sequence h[k] in Figure 3.29b. The sequence h[n − k] for n = −3 is required to compute y(−3) and is shown in the bottom half of Figure 3.29d, with x[k] shown in the top half for ease of matching corresponding elements. We see that at least one element in every pair of corresponding elements is zero. For example, at k = −6, h(n − k) = 2 but x(k) = 0; at k = 1, x(k) = 8
but h(n − k) = 0; and at k = 4, both x(k) and h(n − k) are zero. Thus, every product of corresponding elements is zero and hence y(−3) = 0. From Figure 3.29d it is easy to see that this will be the outcome for all n ≤ −3. That is, y(n) = 0 for n ≤ −3. Figure 3.29e shows h[n − k] for n = 6 in the bottom half and x[k] in the top half, from which we see that there is no overlap between the two sequences when n ≥ 6. That is, the product of every pair of corresponding elements is zero. Thus, y(n) = 0 for n ≥ 6.

Figure 3.29 Worked Example 3.10: (a) input sequence x(k); (b) impulse response h(k); (c) time-reversed sequence h(−k); (d)–(g) x(k) (top) and h(n − k) (bottom) for n = −3, n = 6, n = 0, and n = 2, respectively.
Figure 3.29f shows h[n − k] for n = 0 in the bottom half and x[k] in the top half. We see that there is some overlap of the two sequences, which yields

$$y(0) = (4 \times 3) + (8 \times 6) + (12 \times 10) = 180$$

Figure 3.29g shows h[n − k] for n = 2 in the bottom half and x[k] in the top half. Multiplying corresponding elements within the overlap region and summing these products, we obtain

$$y(2) = (8 \times 2) + (12 \times 3) + (8 \times 6) + (4 \times 10) = 140$$

Carrying on in this way, we compute y(−2), y(−1), …, y(5) to obtain the output sequence y[n] with the complete set of values

$$y(n) = \begin{cases}
40, & n = -2 \\
104, & n = -1 \\
180, & n = 0 \\
184, & n = 1 \\
140, & n = 2 \\
72, & n = 3 \\
28, & n = 4 \\
8, & n = 5 \\
0, & \text{Otherwise}
\end{cases}$$
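The graphical procedure of this worked example can be cross-checked numerically. The sketch below is one possible way to do so; the helper name `convolve_sum` and the convention of passing the starting index of each sequence are assumptions introduced here. It evaluates the convolution sum for the sequences of Eqs. (3.145) and (3.146) while tracking the index of the first output sample, n1 + k1.

```python
def convolve_sum(x, k1, h, n1):
    """Convolution sum of two finite sequences.

    x holds x(k1), x(k1 + 1), ...; h holds h(n1), h(n1 + 1), ...
    Returns (start, y), where y[i] is the output sample y(start + i)
    and start = n1 + k1.
    """
    y = [0] * (len(x) + len(h) - 1)
    for i, x_k in enumerate(x):
        for j, h_m in enumerate(h):
            # x(k1 + i) * h(n1 + j) contributes to y(k1 + n1 + i + j)
            y[i + j] += x_k * h_m
    return n1 + k1, y


x = [4, 8, 12, 8, 4]   # x(-2) .. x(2), Eq. (3.145)
h = [10, 6, 3, 2]      # h(0) .. h(3),  Eq. (3.146)

start, y = convolve_sum(x, -2, h, 0)
for i, value in enumerate(y):
    print(f"y({start + i}) = {value}")
```

The printed samples run from y(−2) to y(5), matching the interval n = (n1 + k1, n2 + k2) identified in the worked example.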
Convolution sum evaluation involves summing the sequence x(k)h[n − k] from k = k1 to k = k2 to obtain y(n). This operation may be conveniently and more compactly carried out using a tabular layout in which the first row contains the elements x(k1)h[n − k1], the second row is x(k1 + 1)h[n − (k1 + 1)], and so on to the last row x(k2)h[n − k2]. The first column of the table is for n = n1 + k1, the second column for n = n1 + k1 + 1, and so on to the last column n = n2 + k2. The rows are therefore

$$\begin{aligned}
&x(k_1)\{h(n_1 + k_1 - (k_1)), \cdots, h(n_2 + k_2 - (k_1))\} \\
&x(k_1 + 1)\{h(n_1 + k_1 - (k_1 + 1)), \cdots, h(n_2 + k_2 - (k_1 + 1))\} \\
&\quad\vdots \\
&x(k_2)\{h(n_1 + k_1 - (k_2)), \cdots, h(n_2 + k_2 - (k_2))\}
\end{aligned}$$

and the output y(n) is obtained as the sum of each column n, for n = n1 + k1 to n = n2 + k2. In this example, with k1 = −2, k2 = 2, n1 = 0, n2 = 3, making use of the specifications of x(n) and h(n) in Eqs. (3.145) and (3.146), the first row is

$$x(-2)\{h(0), \cdots, h(7)\} = 4 \times \{10, 6, 3, 2, 0, 0, 0, 0\}$$

the second row is

$$x(-1)\{h(-1), \cdots, h(6)\} = 8 \times \{0, 10, 6, 3, 2, 0, 0, 0\}$$

and so on until the last row k2, which is

$$x(2)\{h(-4), \cdots, h(3)\} = 4 \times \{0, 0, 0, 0, 10, 6, 3, 2\}$$
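The row-and-column construction just described can be sketched as follows. This is an illustrative layout only: storing each sequence as a dictionary keyed by its index is a convenience chosen here, not something prescribed by the text. Each row is x(k) multiplied by the shifted impulse response, and each output sample is a column sum.

```python
# Sequences from Eqs. (3.145) and (3.146), stored as {index: value}
x = {-2: 4, -1: 8, 0: 12, 1: 8, 2: 4}
h = {0: 10, 1: 6, 2: 3, 3: 2}

k1, k2 = min(x), max(x)          # extent of the input sequence
n1, n2 = min(h), max(h)          # extent of the impulse response

# One column per output instant n = n1 + k1, ..., n2 + k2
columns = list(range(n1 + k1, n2 + k2 + 1))

# One row per input sample: x(k) * h(n - k), with h = 0 outside its extent
rows = [[x[k] * h.get(n - k, 0) for n in columns]
        for k in range(k1, k2 + 1)]

# y(n) is the sum of the products in column n
y = [sum(column) for column in zip(*rows)]

for k, row in zip(range(k1, k2 + 1), rows):
    print(f"x({k}) row:", row)
print("y(n):     ", y)
```

Laying the products out this way makes the shifting pattern visible: each successive row is the same scaled impulse response moved one column to the right.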
The tabular layout is shown below.

| n → | −2 | −1 | 0 | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|---|---|
| x(−2)h[n − (−2)] | 4 × 10 | 4 × 6 | 4 × 3 | 4 × 2 | 4 × 0 | 4 × 0 | 4 × 0 | 4 × 0 |
| x(−1)h[n − (−1)] | 8 × 0 | 8 × 10 | 8 × 6 | 8 × 3 | 8 × 2 | 8 × 0 | 8 × 0 | 8 × 0 |
| x(0)h[n] | 12 × 0 | 12 × 0 | 12 × 10 | 12 × 6 | 12 × 3 | 12 × 2 | 12 × 0 | 12 × 0 |
| x(1)h[n − 1] | 8 × 0 | 8 × 0 | 8 × 0 | 8 × 10 | 8 × 6 | 8 × 3 | 8 × 2 | 8 × 0 |
| x(2)h[n − 2] | 4 × 0 | 4 × 0 | 4 × 0 | 4 × 0 | 4 × 10 | 4 × 6 | 4 × 3 | 4 × 2 |
| y(n) | 40 | 104 | 180 | 184 | 140 | 72 | 28 | 8 |
Note that the last row y[n] is the sum of the products in each column.

Worked Example 3.11 IIR System

(a) Determine the impulse response of a causal first-order recursive system governed by the linear difference equation

$$y(n) = x(n) + \alpha\, y(n-1)$$

(b) Hence determine the output y(n) of this system at instant n = 8 when the input is the step sequence 5u[n] and 𝛼 = 0.85. Comment on the stability of this system.

(a) Impulse response h(n) is the output y(n) when the input x(n) is the unit impulse 𝛿(n). Substituting in the given linear difference equation yields

$$h(n) = \delta(n) + \alpha\, h(n-1)$$

Let us write out the output sequence starting at n = −2, recalling that h(n) = 0 for n < 0, since the system is causal; and 𝛿(k) = 0 for k ≠ 0, 𝛿(k) = 1 for k = 0

$$\begin{aligned}
\text{For } n = -2:&\quad h(-2) = \delta(-2) + \alpha h(-2-1) = 0 + \alpha \times h(-3) = 0 \\
\text{For } n = -1:&\quad h(-1) = \delta(-1) + \alpha h(-1-1) = 0 + \alpha \times h(-2) = 0 \\
\text{For } n = 0:&\quad h(0) = \delta(0) + \alpha h(0-1) = 1 + \alpha \times h(-1) = 1 \\
\text{For } n = 1:&\quad h(1) = \delta(1) + \alpha h(1-1) = 0 + \alpha \times h(0) = \alpha \times 1 = \alpha \\
\text{For } n = 2:&\quad h(2) = \delta(2) + \alpha h(2-1) = 0 + \alpha \times h(1) = \alpha \times \alpha = \alpha^2 \\
\text{For } n = 3:&\quad h(3) = \delta(3) + \alpha h(3-1) = 0 + \alpha \times h(2) = \alpha \times \alpha^2 = \alpha^3 \\
&\quad\vdots
\end{aligned}$$

Therefore, the system has impulse response given by

$$h(n) = \alpha^n u(n) \tag{3.147}$$

which is clearly an unending sequence, i.e. an infinite impulse response (IIR).
(b) The output of this system is obtained by evaluating the convolution sum

$$y(n) = \sum_{k=-\infty}^{\infty} x(k)\,h(n-k)$$

with h(n − k) obtained by substituting n − k for n in Eq. (3.147)

$$h(n-k) = \alpha^{n-k} u(n-k) = \begin{cases} \alpha^{n-k}, & k \le n \\ 0, & \text{Otherwise} \end{cases}$$

Furthermore

$$x(k) = 5u(k) = \begin{cases} 5, & k \ge 0 \\ 0, & \text{Otherwise} \end{cases}$$

Substituting these expressions and constraints into the summation yields

$$\begin{aligned}
y(n) &= 5\sum_{k=0}^{n} \alpha^{n-k} = 5\sum_{k=0}^{n} \alpha^{n}/\alpha^{k} = 5\alpha^{n}\sum_{k=0}^{n} (1/\alpha)^{k} \\
&= 5\alpha^{n}\left[1 + (1/\alpha) + (1/\alpha)^2 + \cdots + (1/\alpha)^n\right]
\end{aligned}$$

The term in square brackets is the sum $S_{n+1}$ of the first n + 1 terms of a geometric series having first term = 1 and constant ratio = 1/𝛼, which is given by

$$S_{n+1} = \frac{1 - (1/\alpha)^{n+1}}{1 - (1/\alpha)}$$

Therefore

$$y(n) = 5\alpha^{n}\left[\frac{1 - (1/\alpha)^{n+1}}{1 - (1/\alpha)}\right] = 5\,\frac{\alpha^{n} - 1/\alpha}{1 - 1/\alpha}$$
For n = 8, 𝛼 = 0.85, we obtain y(8) = 25.6128. If the coefficient magnitude |𝛼| > 1, then |y(n)| → ∞ as n → ∞, so the output grows indefinitely, and the system is unstable. If |𝛼| < 1, then y(n) → 5/(1 − 𝛼) as n → ∞ and the system is stable. The system is also unstable if |𝛼| = 1, since the impulse response h(n) = 𝛼ⁿu(n) then fails the absolute summability condition of Eq. (3.143).
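As a numerical cross-check of this worked example, the recursion y(n) = x(n) + 𝛼y(n − 1) can be run directly and compared with the closed-form result. The sketch below assumes zero initial condition y(−1) = 0 (consistent with a causal system at rest); the function name `first_order_iir` is introduced here for illustration.

```python
def first_order_iir(x, alpha):
    """Run y(n) = x(n) + alpha * y(n - 1) for n = 0 .. len(x) - 1.

    Assumption: the system is initially at rest, i.e. y(-1) = 0.
    """
    y = []
    previous = 0.0
    for x_n in x:
        previous = x_n + alpha * previous
        y.append(previous)
    return y


alpha = 0.85

# Impulse response: the input is the unit impulse, so h(n) should equal alpha**n
h = first_order_iir([1.0] + [0.0] * 8, alpha)

# Step response to 5u[n], evaluated for n = 0 .. 8
y = first_order_iir([5.0] * 9, alpha)

# Closed-form expression derived in part (b), evaluated at n = 8
closed_form = 5 * (alpha**8 - 1 / alpha) / (1 - 1 / alpha)

print(round(y[8], 4), round(closed_form, 4))
print("steady-state value 5/(1 - alpha) =", 5 / (1 - alpha))
```

Running the recursion for more samples shows the output settling towards 5/(1 − 𝛼), as stated for the stable case.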
3.6.4 Autocorrelation and Convolution

We discussed the mathematical operations of autocorrelation and convolution in Sections 3.5.5 and 3.6.2 along with their defining equations (repeated below for convenience)

$$\begin{aligned}
R_x(\tau) &= \int_{-\infty}^{\infty} x(t)\,x(t-\tau)\,dt, && \text{Autocorrelation of } x(t) \\
x_1(t) * x_2(t) &= \int_{-\infty}^{\infty} x_1(\tau)\,x_2(t-\tau)\,d\tau, && \text{Convolution of } x_1(t) \text{ and } x_2(t)
\end{aligned}$$
Autocorrelation is an operation on a single signal, whereas convolution involves two signals. However, the mathematical expressions of these two operations bear some similarity that is worth exploring to establish a relationship which we may exploit in system analysis. Consider the convolution of x(t) with a second signal which is a time-reversed version of x(t). As discussed in Section 3.6.2, computing the value of x(t) ∗ x(−t) at t = 𝜏 involves the following steps:
1. Time-reverse the second signal x(−t). This yields x(t).
2. Delay the time-reversed signal by 𝜏. This yields x(t − 𝜏).
3. Take the area under the product of the first signal and the time-reversed and 𝜏-delayed second signal. This is the integration of x(t)x(t − 𝜏) between the limits t = −∞ and t = ∞, which is simply the autocorrelation of x(t).

Therefore, the autocorrelation of a signal x(t) is equivalent to convolving the signal with a time-reversed version of itself, and we may write

$$R_x(\tau) = x(\tau) * x(-\tau) \tag{3.148}$$
We return to this relationship between autocorrelation and convolution in the next chapter after learning about FTs. By taking the FT of both sides of the above equation, we will obtain a very important result that states that the FT of the autocorrelation function of a signal is the spectral density of the signal.
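The relationship in Eq. (3.148) has a direct discrete-time counterpart that is easy to verify numerically. The sketch below uses a short, arbitrarily chosen real sequence (the values are illustrative only) and compares the autocorrelation computed from its definition with the convolution of the sequence with its time-reversed copy.

```python
def convolve(a, b):
    """Full convolution sum of two finite sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, a_i in enumerate(a):
        for j, b_j in enumerate(b):
            out[i + j] += a_i * b_j
    return out


def autocorrelation(x):
    """R(lag) = sum_n x(n) x(n - lag), for lag = -(L - 1) .. (L - 1)."""
    L = len(x)
    return [sum(x[n] * x[n - lag] for n in range(L) if 0 <= n - lag < L)
            for lag in range(-(L - 1), L)]


x = [1.0, 3.0, -2.0, 0.5]            # arbitrary illustrative sequence

r_direct = autocorrelation(x)         # from the definition
r_via_conv = convolve(x, x[::-1])     # x convolved with time-reversed x

print(r_direct)
print(r_via_conv)
```

The middle element of either list is the zero-lag value, which equals the energy of the sequence, and both lists are symmetric about it.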
3.7 Summary

Telecommunication is primarily concerned with the processing of information and the transfer of information-bearing signals from one point to another through a transmission medium. Chapters 2 and 3 have provided a comprehensive introduction to signals and systems and their characterisation and analysis in the time domain. The topics covered, and the approach and depth of treatment, were carefully selected to strike a delicate balance between comprehensive rigour and succinct simplicity. The aim was to minimise needless mathematical hurdles and present material that is fresh and accessible for newcomers, insightful and informative for everyone, and free of knowledge gaps in supporting further study and the material presented in subsequent chapters.

Randomness, arising from an intractable interplay of deterministic laws and causes, is an ever-present phenomenon in real life, and telecommunication is no exception. Whether in the form of noise as an unwanted addition to the wanted signal, undesirable haphazard environmental interventions in a wireless transmission medium, sequence of binary 1’s and 0’s in an information-bearing bitstream, or the behaviour and service demands of a sizeable population of users, random signals are a constant presence in telecommunications. We discussed the most important aspects of the basic tools needed to characterise and analyse random signals and treated in some detail six of the standard statistical distributions that are most frequently encountered in communication systems design and analysis.

This chapter began with a treatment of basic signal operations and continued into a discussion of various measures of a signal’s strength and patterns of variability applicable to both deterministic and random signals.
The concepts of active and reactive power were also briefly introduced, and we explored, with the help of worked examples and consideration of implications, various approaches to determine expressions for the power, autocorrelation functions, and the correlation coefficient of different signals. We also discussed the analysis of LTI systems, an important class of CT and DT systems which obey the principle of superposition in addition to having characteristics that do not change with time. We characterised an LTI system in terms of its impulse response, which is the output of the system when the input is a unit impulse. Armed with this impulse response, the response of the system to an arbitrary input signal may be obtained through time domain analysis by convolving the input signal with the system’s impulse response. We distinguished between non-recursive or FIR and recursive or IIR DT systems, and learnt various graphical, mathematical, and tabular approaches for evaluating convolution integrals (in the case of CT systems) and convolution sums (for DT systems). In the next chapter, we turn our attention to frequency domain representation and analysis of signals and systems. We will develop the tools and sound understanding of concepts needed to exploit all the simplifications afforded by the frequency domain in the analysis and design of communication systems.
References

1 ITU-R. (2007). Recommendation P.1057-2: probability distributions relevant to radiowave propagation modelling. https://www.itu.int/rec/R-REC-P.1057-2-200708-S/en (accessed 12th June 2019).
2 Rice, S.O. (1944). Mathematical analysis of random noise. Bell System Technical Journal 23 (3): 282–332.
3 Rice, S.O. (1945). Mathematical analysis of random noise. Bell System Technical Journal 24 (1): 46–156.
4 Schwartz, M. (2005). Mobile Wireless Communications. Cambridge: Cambridge University Press.
Questions

3.1 A 925 MHz radio signal is received via a primary path and three secondary paths of excess delays 100, 150, and 326.67 ns and respective attenuations (relative to the primary path) 1.2, 3.5, and 4.2 dB.
(a) Determine the total excess delay, the mean delay, and the rms delay spread of the multipath transmission medium.
(b) If the power arriving via the primary path is −50 dBm, determine the total received power at the receiver.

3.2 A 1 GHz radio signal is received via a primary path and one secondary path. The primary signal power is −100 dBm, and the secondary path is attenuated by 3 dB and delayed by 100.25 ns relative to the primary path. Determine the received signal power.

3.3 A random variable Θ is uniformly distributed in the interval (−𝜋, 𝜋).
(a) Determine the mean and variance of Θ.
(b) Determine the characteristic function of Θ.
(c) Use the characteristic function obtained in (b) to evaluate the mean and variance of Θ and compare these results with those of (a).

3.4 Show that the mean E[X] (i.e. expected number of arrivals in interval D) and mean square value E[X²] of a Poisson arrival process X are as given by the expressions in Eq. (3.90).

3.5 Using the definition of energy in Eq. (3.101) or otherwise, determine the energy of the following signals:
(a) $g_1(t) = A\,\mathrm{rect}(t/\tau)$
(b) $g_2(t) = A\cos(2\pi f t)\,\mathrm{rect}(t/\tau)$, where 𝜏 = n/f and n is an integer ≥ 1
(c) $g_3(t) = A\,\mathrm{trian}(t/\tau)$
(d) $g_4(t) = A\,\mathrm{sinc}(t/T_s)$

3.6 Derive an expression (in terms of amplitude A and duty cycle d) for the rms value of the raised cosine pulse train g(t) shown in Figure Q3.6. One cycle of the pulse train is defined by

$$g_T(t) = \begin{cases} \dfrac{A}{2}\left[1 + \cos\left(\dfrac{2\pi}{\tau}t\right)\right], & -\dfrac{\tau}{2} \le t \le \dfrac{\tau}{2} \\ 0, & \text{Otherwise} \end{cases}$$

3.7 Derive an expression for the autocorrelation function of a unipolar random binary waveform in which a rectangular pulse of duration $T_b$ is sent with amplitude +A for binary 1 and amplitude 0 for binary 0, and binary 1’s and 0’s are equally likely to occur. (HINT: follow the same steps as in Worked Example 3.6c with only one change: use a voltage level of 0 V for binary 0, rather than a voltage level of −A used in that worked example.)
Figure Q3.6 Raised cosine pulse train g(t) of amplitude A and duty cycle d = 𝜏/T.
3.8 Derive expressions for the autocorrelation function of the following signals:
(a) $g_1(t) = A\,\mathrm{rect}(t/\tau)$
(b) $g_2(t) = A\cos(2\pi f t)\,\mathrm{rect}(t/\tau)$, where 𝜏 = n/f and n is an integer ≥ 1
(c) $g_3(t) = A\,\mathrm{trian}(t/\tau)$
(d) $g_4(t) = A\,\mathrm{sinc}(t/T_s)$

3.9 The alternate mark inversion (AMI) line code represents binary 1 using alternate-polarity half-width rectangular pulses of amplitude A, and binary 0 with no pulse (i.e. a pulse of zero amplitude). A half-width pulse is a pulse of amplitude A and width $T_b/2$ at the start of the bit interval followed by no pulse (i.e. zero amplitude) for the remaining half of the bit interval, where $T_b$ is bit duration. If binary 1’s and 0’s are equally likely, show that the autocorrelation function of the AMI line code (random waveform) is

$$R_{\mathrm{AMI}}(\tau) = \frac{A^2}{4}\,\mathrm{trian}\left(\frac{\tau}{T_b}\right) - \frac{A^2}{8}\,\mathrm{trian}\left(\frac{\tau \pm T_b}{T_b}\right)$$

3.10 The expectation of the product of two random signals X and Y is defined as

$$E[XY] = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} xy\,p_{X,Y}(x, y)\,dx\,dy$$

where $p_{X,Y}(x, y)$ is the joint PDF of X and Y. Show that if X and Y are independent then E[XY] = E[X]E[Y].

3.11 Consider a random signal X which is uniformly distributed in the interval (−a, a), where a is some positive real constant. Let us define another random signal Y = 2X². Clearly, X and Y are not independent. Show that X and Y are uncorrelated, which proves that two uncorrelated signals are not necessarily independent.

3.12 Show that the Pearson correlation coefficient is invariant under changes to one or both signals by a multiplicative constant factor or additive constant term. (Note: this requires you to show that the Pearson correlation coefficient of signals X and Y, denoted $r_{X,Y}$, is the same as the Pearson correlation coefficient of a + bX and c + dY, denoted $r_{a+bX,\,c+dY}$, where a, b, c, d are constants and b > 0, d > 0.)

3.13 M-ary amplitude shift keying (M-ASK) employs the following set of symbols for transmission

$$g_k(t) = kA\,\mathrm{rect}\left(\frac{t - \tfrac{1}{2}T_s}{T_s}\right)\cos(2\pi f_c t), \qquad k = 0, 1, 2, 3, \cdots, M-1; \quad f_c = n/T_s$$

where A is some constant and n > 0 is a positive integer.
Derive an expression for the correlation coefficient of the largest-amplitude adjacent symbols $g_{M-1}(t)$ and $g_{M-2}(t)$ and examine how this coefficient varies with M. [Note that M is always an integer power of 2, so it may take on only the values M = 2, 4, 8, 16, 32, 64, …].

3.14 M-ary frequency shift keying (M-FSK) employs the following set of symbols for transmission

$$g_k(t) = A\,\mathrm{rect}\left(\frac{t - \tfrac{1}{2}T_s}{T_s}\right)\cos[2\pi(k+1)(f_c/2)t], \qquad k = 0, 1, 2, 3, \cdots, M-1; \quad f_c = n/T_s$$

where A is some constant and n > 0 is a positive integer. Show that every pair of symbols $g_m(t)$ and $g_p(t)$, m ≠ p, in the set is orthogonal over a symbol duration in the interval (0, $T_s$) and hence has the correlation coefficient $\rho_{g_m(t),\,g_p(t)} = 0$.

3.15 A causal CT system has impulse response

$$h(t) = 5\,\mathrm{rect}\left(\frac{t}{6} - \frac{1}{2}\right)$$

where t is time in seconds. Use a graphical convolution approach to determine and plot a smooth waveform of the output of the system for each of the following input signals
(a) $x(t) = \mathrm{rect}\left(\dfrac{t}{12}\right)$
(b) $x(t) = 10\,\mathrm{trian}\left(\dfrac{t}{8} - \dfrac{1}{4}\right)$
(NOTE: you may wish to refer to Sections 2.6.3 and 2.6.5 for definitions of the rect() and trian() pulses used above.)
3.16 The impulse response h[n] of an LTI discrete-time system is shown in Figure Q3.16a.
(a) Determine and fully sketch the output sequence y[n] of this system when the input x[n] is the discrete-time signal shown in Figure Q3.16b.
(b) Based on your result in (a), discuss all the effects of this system on a signal passed through it, and identify what type of filter the system might be.

Figure Q3.16 Impulse response h[n] and input sequence x[n] for Question 3.16: (a) h(n); (b) x(n).

3.17 If the input sequence of Figure 3.25a is transmitted through a finite impulse response (FIR) filter with impulse response as shown in Figure 3.25b, determine and fully sketch the resulting output sequence y[n] of the system.

3.18 Calculate the energy or power (depending on type) of each of the following signals.
(a) Impulse function 𝛿(t)
(b) Sinc function sinc(t)
(c) Unit step function u(t)
(d) Signum function sgn(t)
(e) Complex exponential function exp(j𝜔t).

3.19 Obtain expressions for the (a) average value, (b) rms value, and (c) power of a full-wave rectified sinusoidal signal g(t) = |A cos(2𝜋ft + 𝜙)|. Your formulas should be applicable at all values of initial phase 𝜙 and not only at 𝜙 = −90°. How do your expressions compare with those for an unrectified sinusoidal signal A cos(2𝜋ft + 𝜙)?
4 Frequency Domain Analysis of Signals and Systems
There is a smart way and a hard way in every engineering problem-solving. The fun is in doing it smart.

In this Chapter
✓ Fourier series: a step-by-step treatment of the frequency domain representation of periodic signals with application to the analysis of important telecom signals, including sinusoidal pulse trains, binary amplitude shift keying, staircase waveforms, flat-top sampling, and trapezoidal pulse trains.
✓ Fourier transform (FT): frequency domain analysis of nonperiodic waveforms and energy signals.
✓ Discrete Fourier transform (DFT): an extension of frequency domain analysis techniques to discrete-time (DT) signals and systems, along with richly illustrated discussions of practical issues such as fast Fourier transform (FFT), frequency resolution, spectral leakage, spectral smearing, periodograms, etc.
✓ Laplace transform (LT) and z-transform (ZT): a brief introduction to give you a complete picture of frequency domain tools and their relationship with other transform techniques.
✓ Frequency domain characterisation and analysis of linear systems.
✓ Worked examples: a wide selection to further deepen your insight and help you master problem-solving application and interpretation of important concepts.
✓ End-of-chapter questions: an essential test of your grasp of the subject matter.
4.1 Introduction

Telecommunication signals and systems can be characterised in both the time and the frequency domains. Time domain description is presented in the previous two chapters. In the frequency domain we specify the frequency, phase, and relative amplitude of each of the sinusoidal components that constitute the signal, either through a spectral plot as displayed on a spectrum analyser or through a mathematical expression giving the Fourier series or FT of the signal. From these the bandwidth, spectral density, and other spectral characteristics of the signal can be determined, as well as signal energy or power. For a system, we specify its gain response and phase response, which may be combined into a single complex quantity known as the transfer function of the system. This function describes how the system scales the amplitude and shifts the phase of a sinusoidal signal that is transmitted through it. The transfer function also leads to a specification of the bandwidth of the system.

We hope that it is your ambition to have a sound understanding of the principles of communication engineering and to establish a robust foundation and the confidence necessary for a successful and pleasurable engagement
with telecoms, either in your career or in further studies and research. In that case, we recognise that this chapter is pivotal for you. So, we have taken great care to present the material in a step-by-step manner, using appropriate worked examples at every stage to help you see how each key concept is interpreted in the right context and applied to engineering problem solving. It is strongly recommended that you give yourself plenty of time and perhaps set yourself small section-by-section targets as a motivation to work diligently and carefully through the entire chapter. It is much better to take a little more time to reach a place of excellence and confidence than to rush through this chapter and risk coming away with hazy information or, worse, a misplaced confidence in your knowledge and ability. We start our discussion with the Fourier series applicable to continuous-time (CT) periodic signals. Using a mix of heuristic, graphical, and mathematical approaches, we explore the topic of Fourier series at a depth and breadth that are considered complete for the needs of modern engineering. We emphasise how to derive the Fourier series either from first principles by evaluating integrals or by applying some of its properties. We develop in full the relationships between the sinusoidal and complex exponential forms of the Fourier series and link these to double-sided and single-sided amplitude and phase spectra. We then employ the tool of Fourier series to analyse flat-top sampling, sinusoidal pulse trains, and binary amplitude shift keying (BASK) with very interesting and insightful results. We also derive Fourier series expressions for the most general specification of a staircase waveform and for the trapezoidal pulse train (which reduces to rectangular, triangular, sawtooth, and ramp pulse trains as special cases). 
We then derive the FT applicable to continuous-time nonperiodic signals as a limiting case of the Fourier series when signal period T tends to infinity. We learn how to calculate the FT of a signal from first principles, discuss the properties of the FT and how they are leveraged in problem solving, and present a tabulated list of standard FT pairs for reference purposes. To illustrate the relevance and power of the FT as a system analysis and design tool, we undertake the evaluation of digital transmission pulses from both time domain and frequency domain perspectives and come away with interesting findings on system capacity implications. Today’s world is increasingly driven by digital technology and dominated by discrete signals. So, it is only a logical step for us to extend the powerful tools of Fourier series and FT to be applicable to discrete-time (DT) signals in the respective forms of discrete-time Fourier series (DTFS) and discrete-time Fourier transform (DTFT), which we merge into one tool known simply as the discrete Fourier transform (DFT). We discuss the DFT in detail, including the basics of an algorithm for its efficient computation known as the fast Fourier transform (FFT). We endeavour to fully equip you to avoid common pitfalls in the use of FFT for data analysis, and to be able to correctly select the values of parameters needed for reliable spectral analysis especially of nondeterministic signals, and to correctly interpret results with full knowledge of their limitations. The scope of this book precludes a detailed treatment of other transform techniques, but for the sake of completeness and, importantly, to help you see how they fit into the frequency domain picture, we briefly introduce the Laplace transform (LT) and z-transform (ZT). The LT is presented as a generalisation of the FT. It is a versatile tool in the design and analysis of continuous-time systems. 
The ZT is a generalisation of the DTFT for application in the design and analysis of DT systems, most popularly in the field of digital filters and digital signal processing (DSP). It is hoped that readers who are involved with control systems, electrical circuits, DSP, etc., will find the interrelationships among FT, LT, and ZT presented in this chapter both illuminating and of practical use. We briefly outline the inverse relationships between time and frequency domains primarily to help you gain an important early appreciation for the inherency of trade-offs in system design. The brief discussion of this issue is also aimed at establishing how operations and forms (e.g. periodic versus nonperiodic and continuous versus discrete) in one domain translate into the other. Finally, we turn our attention to using FT tools to characterise and analyse linear systems. The main parameter of such a system is its transfer function, which we derive and link to the impulse response of the previous chapter. We attempt to bring clarity to the oft-misused terminology of bandwidth. And we rely on the transfer function parameter as we explore the concepts of distortionless transmission, equalisation, amplitude and phase distortions, and
the calculation of the output signal and output spectral density of linear systems. For completeness, we also briefly discuss and demonstrate the harmonic and intermodulation distortions that arise due to nonlinear operation.
4.2 Fourier Series

We learnt in connection with Figure 2.26 that adding together harmonically related sinusoids yields a periodic signal. By appropriately selecting the amplitude, frequency, and phase of such sinusoids, any arbitrary periodic signal can be realised as the sum of these harmonically related sinusoids. This is in fact the Fourier theorem, which states:

Any periodic function or waveform g(t) having period T can be expressed as the sum of sinusoidal signals with frequencies at integer multiples (called harmonics) of the fundamental frequency $f_o = 1/T$, and with appropriate amplitudes and phases.

Figure 4.1 illustrates the realisation of an arbitrary periodic staircase waveform g(t) by adding together its harmonic sinusoids. This process is known as Fourier synthesis. In Figure 4.1a the synthesised waveform $g_s(t)$ is the sum of a DC component of value 1.05, a first harmonic (a sinusoid of amplitude 2.83 volts (V), frequency $f_o$, and phase −98°), and a second harmonic (a sinusoid of amplitude 0.62 V, frequency $2f_o$, and phase 47.4°), where $f_o = 1/T$, and T is the period of g(t) as indicated on the figure. That is

$$g_s(t) = 1.05 + 2.83\cos(2\pi f_o t - 98^\circ) + 0.62\cos(4\pi f_o t + 47.4^\circ)$$
Figure 4.1 Fourier synthesis of periodic staircase waveform g(t) using (a) DC + first two harmonics; (b) DC + first 15 harmonics; and (c) DC + first 80 harmonics.

Using the techniques learnt in Section 3.5, the normalised powers of g(t) and gs(t) are readily calculated as 5.65 and 5.3 W respectively, which means that the DC and first two harmonics of this particular waveform
contain 93.83% of its power. The approximation of g(t) by gs (t) is improved as more harmonic sinusoids of the right amplitude and phase are added to gs (t). This is illustrated in Figure 4.1b in which gs (t) consists of the DC and the first 15 harmonics, making up 98.53% of the power. In Figure 4.1c the DC and first 80 harmonics are summed to produce an even closer approximation to g(t) containing 99.71% of the power. Notice that ripples have been markedly reduced in the synthesised waveform from Figure 4.1b–c. However, overshoots will persist at points of a jump discontinuity (where, for example, an ideal rectangular pulse train (RPT) changes level instantaneously from one value to another). This problem is known as Gibbs phenomenon, which refers to the oscillation of a Fourier synthesised waveform near a jump, with an overshoot that does not die out but approaches a finite limit as more and more harmonics are added. One lesson to draw from this brief exercise in Fourier synthesis is that a waveform transmitted through a transmission medium or system will be noticeably distorted unless the system has enough bandwidth to pass all significant frequency components of the waveform. We return to this important thought later in the chapter. It should be pointed out that the summation of harmonic sinusoids as stipulated in the above Fourier theorem is only guaranteed to converge (i.e. to approximate the function g(t) better and better as more harmonics are added) if, over one cycle, the function (i) is absolutely integrable, (ii) is of bounded variation, and (iii) has a finite number of finite discontinuities. These three conditions are known as Dirichlet conditions and are satisfied by the telecommunication signals that we will deal with, so this caveat is more for information than for any serious practical caution. 
We may therefore proceed with confidence to the task of the Fourier analysis of periodic waveforms to determine the amplitudes and phases of their harmonic components.
4.2.1 Sinusoidal Form of Fourier Series

The Fourier theorem stipulates that a periodic signal g(t) may be expanded as a Fourier series of the form

$$g(t) = A_0 + \sum_{n=1}^{\infty} a_n \cos(2\pi n f_o t) + \sum_{n=1}^{\infty} b_n \sin(2\pi n f_o t) \tag{4.1}$$

where

● $f_o = 1/T$ is the fundamental frequency, T being the period of g(t).
● $n f_o$ is the nth harmonic frequency; n = 1, 2, 3, …
● $A_0$ is the DC or average value of g(t).
● $a_n$ is the cosine coefficient and $b_n$ the sine coefficient.
The amplitude and phase of the nth harmonic of g(t) are not obvious in Eq. (4.1) because in this form of the Fourier series the harmonic at each frequency is split into a cosine component $a_n \cos(2\pi n f_o t)$ and a sine component $b_n \sin(2\pi n f_o t)$. We know from Section 2.7.3 that we can combine these two components (since they have the same frequency $n f_o$ but differ in phase by 90°) into a single resultant sinusoid $A_n \cos(2\pi n f_o t + \phi_n)$. The Fourier series of g(t) therefore takes the more compact and informative form

$$g(t) = A_0 + \sum_{n=1}^{\infty} A_n \cos(2\pi n f_o t + \phi_n) \tag{4.2}$$
where $A_n$ and $\phi_n$ are, respectively, the amplitude and phase of the nth harmonic component of g(t). Figure 4.2 shows the geometry and expression for $A_n$ and $\phi_n$ in terms of $a_n$ and $b_n$ for all combinations of the magnitude and sign of the coefficients $a_n$ and $b_n$. We see that to determine the phase $\phi_n$ we first compute an acute angle $\alpha$ using the absolute values of $a_n$ and $b_n$ as

$$\alpha = \tan^{-1}\left(|b_n| / |a_n|\right) \tag{4.3}$$
4.2 Fourier Series
Figure 4.2 Amplitude $A_n$ and phase $\phi_n$ of nth harmonic in terms of cosine and sine coefficients $a_n$ and $b_n$.
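The phase of the nth harmonic is then given by

$$\phi_n = \begin{cases} \alpha, & b_n \le 0,\ a_n \ge 0 \\ -\alpha, & b_n > 0,\ a_n \ge 0 \\ 180^\circ - \alpha, & b_n \le 0,\ a_n < 0 \\ \alpha - 180^\circ, & b_n > 0,\ a_n < 0 \end{cases} \tag{4.4}$$

And the amplitude is

$$A_n = \sqrt{a_n^2 + b_n^2} \tag{4.5}$$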
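The case analysis of Eqs. (4.3) to (4.5) can be tried out numerically. The following MATLAB sketch uses hypothetical coefficient pairs (one per quadrant, not taken from the text) and compares the phase from Eq. (4.4) with the four-quadrant arctangent of $(-b_n, a_n)$, which should agree for these values. The variable names are illustrative.

```matlab
% Hypothetical cosine and sine coefficients, one pair per quadrant
an = [ 3,  3, -3, -3];
bn = [-4,  4, -4,  4];

An    = sqrt(an.^2 + bn.^2);            % amplitude, Eq. (4.5)
alpha = atand(abs(bn)./abs(an));        % acute angle in degrees, Eq. (4.3)

phin = zeros(size(an));                 % phase in degrees, Eq. (4.4)
for k = 1:numel(an)
    if an(k) >= 0 && bn(k) <= 0
        phin(k) = alpha(k);
    elseif an(k) >= 0 && bn(k) > 0
        phin(k) = -alpha(k);
    elseif an(k) < 0 && bn(k) <= 0
        phin(k) = 180 - alpha(k);
    else
        phin(k) = alpha(k) - 180;
    end
end

% Cross-check: an*cos(x) + bn*sin(x) = An*cos(x + phin) requires
% An*cos(phin) = an and An*sin(phin) = -bn
phin_check = atan2d(-bn, an);
disp([An; phin; phin_check]);
```

Each column of the displayed matrix lists the amplitude followed by the two phase values for one coefficient pair.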
A plot of the amplitudes $A_n$ of the harmonic sinusoids against their frequencies $n f_o$ is called the magnitude or amplitude spectrum, whereas a plot of $\phi_n$ versus $n f_o$ is called the phase spectrum. The normalised DC power $P_{dc}$ of the signal, the normalised power $P_{nf_o}$ in the nth harmonic of the signal (for n > 0), and the AC power $P_{ac}$ (which is the total normalised power in all the harmonics of the signal, i.e. power in all the sinusoidal components of the signal having frequency f > 0) are given by

$$\begin{aligned} P_{dc} &= A_0^2 \\ P_{nf_o} &= \frac{1}{2} A_n^2 \\ P_{ac} &= \frac{1}{2} \sum_{n=1}^{\infty} A_n^2 \end{aligned} \tag{4.6}$$
Figure 4.3 Total area under the waveform of a harmonic sinusoid in an interval T spanning one cycle of the fundamental waveform (of frequency $f_o$) is zero. (The figure plots the first four harmonics, of frequencies $f_o = 1/T$, $2f_o$, $3f_o$, and $4f_o$, against t over the interval −T/2 to T/2.)

Using the last line of the above equation to determine AC power is understandably cumbersome since it involves summing a large and theoretically infinite number of components. It is usually more straightforward to obtain the AC power of g(t) by first calculating the total power $P_t$ of g(t) in the time domain (from its waveform structure, using equations such as (3.104) to (3.111)) and then determining AC power as

$$P_{ac} = P_t - P_{dc} \tag{4.7}$$
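As a quick numerical illustration of Eqs. (4.6) and (4.7), the sketch below builds a hypothetical signal from a DC term and two harmonics (the amplitudes and phases are made up for illustration) and compares the AC power obtained by summing harmonic powers with the value obtained as total power minus DC power. The total power is estimated here as the mean square of uniformly spaced samples over one period.

```matlab
T  = 1e-3;  fo = 1/T;                   % hypothetical period and fundamental
N  = 1000;
t  = (0:N-1)*T/N;                       % one period, end point excluded

A0   = 2;                               % hypothetical DC value
An   = [3 1.5];                         % hypothetical harmonic amplitudes
phin = [30 -60]*pi/180;                 % hypothetical phases in radians

g = A0*ones(size(t));
for n = 1:numel(An)
    g = g + An(n)*cos(2*pi*n*fo*t + phin(n));
end

Pt         = mean(g.^2);                % total normalised power (time domain)
Pdc        = A0^2;                      % Eq. (4.6), first line
Pac_series = 0.5*sum(An.^2);            % Eq. (4.6), last line
Pac_time   = Pt - Pdc;                  % Eq. (4.7)
fprintf('Pac (series) = %.4f, Pac (Pt - Pdc) = %.4f\n', Pac_series, Pac_time);
```

For a waveform with many significant harmonics, only the second route avoids the long summation.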
It is a trivial matter to determine the frequency of each of the harmonic sinusoids present in a periodic signal g(t). The frequency $f_n$ of the nth harmonic is simply $n f_o$, which is n times the reciprocal of the period of g(t)

$$f_n = n f_o = n/T \tag{4.8}$$
Note that the frequency of the first harmonic (n = 1) is the same as the fundamental frequency $f_o$ of the signal. Determining the amplitude $A_n$ and phase $\phi_n$ of the nth harmonic, for n = 1, 2, 3, …, in general requires first calculating the cosine and sine coefficients $a_n$ and $b_n$ in Eq. (4.1) and then employing Eqs. (4.4) and (4.5) or Figure 4.2 to obtain $A_n$ and $\phi_n$. So how do we determine these Fourier coefficients?

An important observation for this task is illustrated in Figure 4.3, namely that the area of any harmonic sinusoid in a span of one cycle of the fundamental waveform is zero. Figure 4.3 shows only the first four harmonic sinusoidal waveforms, but the scenario will be similar for all other harmonics: the nth harmonic completes precisely n positive half-cycles and n negative half-cycles within the interval T, so their contributions to the total area cancel exactly.

A mathematical proof of the above observation is straightforward. The sinusoids

$$\cos(2\pi m f_o t) \quad \text{and} \quad \sin(2\pi m f_o t), \qquad m = 1, 2, 3, \cdots$$

are harmonics of fundamental frequency $f_o$ and fundamental period $T = 1/f_o$ (which means that $f_o T = 1$). The area of $\cos(2\pi m f_o t)$ in an interval (−T/2, T/2) spanning one fundamental cycle is given by the integral

$$\begin{aligned} \int_{-T/2}^{T/2} \cos(2\pi m f_o t)\,dt &= \left.\frac{\sin(2\pi m f_o t)}{2\pi m f_o}\right|_{-T/2}^{T/2} \\ &= \frac{\sin(2\pi m f_o T/2) - \sin(2\pi m f_o (-T/2))}{2\pi m f_o} \\ &= \frac{2\sin(\pi m f_o T)}{2\pi m f_o} \\ &= \frac{\sin(m\pi)}{\pi m f_o} \quad (\text{since } f_o T = 1) \\ &= 0 \quad (\text{since } \sin(m\pi) = 0,\ m = 1, 2, 3, \cdots) \end{aligned}$$

Similarly, the area of $\sin(2\pi m f_o t)$ in this interval is

$$\begin{aligned} \int_{-T/2}^{T/2} \sin(2\pi m f_o t)\,dt &= \left.\frac{-\cos(2\pi m f_o t)}{2\pi m f_o}\right|_{-T/2}^{T/2} \\ &= -\left[\frac{\cos(2\pi m f_o T/2) - \cos(2\pi m f_o (-T/2))}{2\pi m f_o}\right] \\ &= 0 \quad \text{since } \cos(\theta) = \cos(-\theta) \end{aligned}$$
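The zero-area result can also be confirmed numerically. The sketch below is a minimal check, assuming a period of T = 0.1 ms chosen only for illustration, that integrates the first four cosine and sine harmonics over one fundamental cycle using MATLAB's `integral` function.

```matlab
T  = 0.1e-3;  fo = 1/T;                 % assumed period, for illustration only
M  = 4;                                 % number of harmonics checked
areaCos = zeros(1, M);
areaSin = zeros(1, M);
for m = 1:M
    areaCos(m) = integral(@(t) cos(2*pi*m*fo*t), -T/2, T/2);
    areaSin(m) = integral(@(t) sin(2*pi*m*fo*t), -T/2, T/2);
    fprintf('m = %d: cos area = %.2e, sin area = %.2e\n', m, areaCos(m), areaSin(m));
end
```

The printed areas are zero to within the numerical tolerance of the integration routine.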
Armed with this knowledge, we make the following observations in Eq. (4.1), repeated here for convenience with the index n replaced by m

$$g(t) = A_0 + \sum_{m=1}^{\infty} a_m \cos(2\pi m f_o t) + \sum_{m=1}^{\infty} b_m \sin(2\pi m f_o t) \tag{4.9}$$
Since the right-hand side consists of $A_0$ and harmonics, taking the area of (i.e. integrating) both sides of Eq. (4.9) over an interval of one cycle eliminates the contribution of each harmonic and leaves us with an expression for $A_0$. By trigonometric identity, multiplying $\cos(2\pi m f_o t)$ or $\sin(2\pi m f_o t)$ by $\cos(2\pi n f_o t)$ creates harmonics at frequencies $(m + n)f_o$ and $(m − n)f_o$, except when m = n. Therefore, multiplying both sides of Eq. (4.9) by $\cos(2\pi n f_o t)$ before integrating over an interval of one cycle eliminates the contribution of every newly created harmonic and leaves us with an expression for $a_n$. Similarly, multiplying both sides of Eq. (4.9) by $\sin(2\pi n f_o t)$ before integrating over an interval of one cycle leads to an expression for $b_n$.

Thus, for the DC component $A_0$, integrating both sides of Eq. (4.9)

$$\int_{-T/2}^{T/2} g(t)\,dt = \int_{-T/2}^{T/2} A_0\,dt + \int_{-T/2}^{T/2}\left[\sum_{m=1}^{\infty} a_m \cos(2\pi m f_o t)\right]dt + \int_{-T/2}^{T/2}\left[\sum_{m=1}^{\infty} b_m \sin(2\pi m f_o t)\right]dt$$

Interchanging the order of integration and summation on the right-hand side

$$\int_{-T/2}^{T/2} g(t)\,dt = \int_{-T/2}^{T/2} A_0\,dt + \sum_{m=1}^{\infty}\left[a_m \int_{-T/2}^{T/2} \cos(2\pi m f_o t)\,dt\right] + \sum_{m=1}^{\infty}\left[b_m \int_{-T/2}^{T/2} \sin(2\pi m f_o t)\,dt\right]$$

The second and third terms on the right-hand side are zero, being the sums of areas of various harmonic sinusoidal waveforms in an interval of one fundamental cycle. Therefore

$$\int_{-T/2}^{T/2} g(t)\,dt = \int_{-T/2}^{T/2} A_0\,dt = A_0 t\Big|_{-T/2}^{T/2} = A_0\frac{T}{2} - A_0\left(-\frac{T}{2}\right) = A_0 T$$

which yields

$$A_0 = \frac{1}{T}\int_{-T/2}^{T/2} g(t)\,dt \tag{4.10}$$
For the cosine coefficient $a_n$, we multiply Eq. (4.9) through by $\cos(2\pi n f_o t)$ before integrating

$$\begin{aligned} \int_{-T/2}^{T/2} g(t)\cos(2\pi n f_o t)\,dt = {} & \int_{-T/2}^{T/2} A_0 \cos(2\pi n f_o t)\,dt + \int_{-T/2}^{T/2}\left[\sum_{m=1}^{\infty} a_m \cos(2\pi m f_o t)\right]\cos(2\pi n f_o t)\,dt \\ & + \int_{-T/2}^{T/2}\left[\sum_{m=1}^{\infty} b_m \sin(2\pi m f_o t)\right]\cos(2\pi n f_o t)\,dt \end{aligned}$$

Recognising that on the right-hand side the first term evaluates to zero, and interchanging the order of integration and summation in the second and third terms yields

$$\int_{-T/2}^{T/2} g(t)\cos(2\pi n f_o t)\,dt = \sum_{m=1}^{\infty}\int_{-T/2}^{T/2}\left[a_m \cos(2\pi m f_o t)\cos(2\pi n f_o t) + b_m \sin(2\pi m f_o t)\cos(2\pi n f_o t)\right]dt$$

Using trigonometric identities (Appendix B), the integrand on the right-hand side expands to

$$\frac{a_m}{2}\left[\cos(2\pi(m - n)f_o t) + \cos(2\pi(m + n)f_o t)\right] + \frac{b_m}{2}\left[\sin(2\pi(m - n)f_o t) + \sin(2\pi(m + n)f_o t)\right]$$

These are harmonic sinusoids which all integrate to zero over interval (−T/2, T/2), except at m = n, where the first term is

$$\frac{a_n}{2}\cos(2\pi(n - n)f_o t) = \frac{a_n}{2}\cos(0) = \frac{a_n}{2}$$

and the third term is zero, since

$$\frac{b_n}{2}\sin(2\pi(n - n)f_o t) = \frac{b_n}{2}\sin(0) = 0$$

Therefore, the only surviving term in the integrand is $a_n/2$, so that

$$\int_{-T/2}^{T/2} g(t)\cos(2\pi n f_o t)\,dt = \int_{-T/2}^{T/2}\frac{a_n}{2}\,dt = a_n\frac{T}{2}$$

which yields

$$a_n = \frac{2}{T}\int_{-T/2}^{T/2} g(t)\cos(2\pi n f_o t)\,dt \tag{4.11}$$
Finally, for the sine coefficient $b_n$, multiply Eq. (4.9) by $\sin(2\pi n f_o t)$ and integrate

$$\begin{aligned} \int_{-T/2}^{T/2} g(t)\sin(2\pi n f_o t)\,dt = {} & \int_{-T/2}^{T/2} A_0 \sin(2\pi n f_o t)\,dt + \int_{-T/2}^{T/2}\left[\sum_{m=1}^{\infty} a_m \cos(2\pi m f_o t)\right]\sin(2\pi n f_o t)\,dt \\ & + \int_{-T/2}^{T/2}\left[\sum_{m=1}^{\infty} b_m \sin(2\pi m f_o t)\right]\sin(2\pi n f_o t)\,dt \end{aligned}$$

On the right-hand side, ignore the first term (which evaluates to zero), and interchange the order of integration and summation in the second and third terms

$$\int_{-T/2}^{T/2} g(t)\sin(2\pi n f_o t)\,dt = \sum_{m=1}^{\infty}\int_{-T/2}^{T/2}\left[a_m \cos(2\pi m f_o t)\sin(2\pi n f_o t) + b_m \sin(2\pi m f_o t)\sin(2\pi n f_o t)\right]dt$$

Using trigonometric identities, the integrand on the right-hand side expands to

$$\frac{a_m}{2}\left[\sin(2\pi(n - m)f_o t) + \sin(2\pi(n + m)f_o t)\right] + \frac{b_m}{2}\left[\cos(2\pi(m - n)f_o t) - \cos(2\pi(m + n)f_o t)\right]$$

These are harmonic sinusoids which all integrate to zero over interval (−T/2, T/2), except at m = n, where the first term is zero and the third term is $b_n/2$. Therefore, the only surviving term in the integrand is $b_n/2$, so that

$$\int_{-T/2}^{T/2} g(t)\sin(2\pi n f_o t)\,dt = \int_{-T/2}^{T/2}\frac{b_n}{2}\,dt = b_n\frac{T}{2}$$

which yields

$$b_n = \frac{2}{T}\int_{-T/2}^{T/2} g(t)\sin(2\pi n f_o t)\,dt \tag{4.12}$$
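The derivations of Eqs. (4.11) and (4.12) rest on the fact that products of harmonic sinusoids integrate to zero over one cycle unless they are the same function, in which case the integral is T/2. A minimal numerical sketch of this property follows, assuming a normalised period T = 1 and checking only the first three harmonics.

```matlab
T = 1;  fo = 1/T;                       % normalised period (assumption)
M = 3;                                  % harmonics 1..M are checked
Icc = zeros(M);  Iss = zeros(M);  Ics = zeros(M);
for m = 1:M
    for n = 1:M
        Icc(m,n) = integral(@(t) cos(2*pi*m*fo*t).*cos(2*pi*n*fo*t), -T/2, T/2);
        Iss(m,n) = integral(@(t) sin(2*pi*m*fo*t).*sin(2*pi*n*fo*t), -T/2, T/2);
        Ics(m,n) = integral(@(t) cos(2*pi*m*fo*t).*sin(2*pi*n*fo*t), -T/2, T/2);
    end
end
disp(Icc);   % approximately (T/2) on the diagonal, zero elsewhere
disp(Iss);   % approximately (T/2) on the diagonal, zero elsewhere
disp(Ics);   % approximately zero everywhere
```

The diagonal entries correspond to the surviving m = n terms in the two derivations.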
We have now derived expressions for all three Fourier coefficients, which reveal that the DC component $A_0$ is the average of g(t); the cosine coefficient $a_n$ is twice the average of $g(t)\cos(2\pi n f_o t)$; and the sine coefficient $b_n$ is twice the average of $g(t)\sin(2\pi n f_o t)$. To summarise, any periodic signal g(t) of period T may be expressed in a sinusoidal form of the Fourier series as

$$g(t) = A_0 + \sum_{n=1}^{\infty} A_n \cos(2\pi n f_o t + \phi_n)$$

where

$$\begin{aligned} A_0 &= \frac{1}{T}\int_{-T/2}^{T/2} g(t)\,dt \\ A_n &= \sqrt{a_n^2 + b_n^2}; \qquad n = 1, 2, 3, \cdots \\ \phi_n &= \begin{cases} \alpha, & b_n \le 0,\ a_n \ge 0 \\ -\alpha, & b_n > 0,\ a_n \ge 0 \\ 180^\circ - \alpha, & b_n \le 0,\ a_n < 0 \\ \alpha - 180^\circ, & b_n > 0,\ a_n < 0 \end{cases}; \qquad \alpha = \tan^{-1}\left(\frac{|b_n|}{|a_n|}\right) \\ a_n &= \frac{2}{T}\int_{-T/2}^{T/2} g(t)\cos(2\pi n f_o t)\,dt \\ b_n &= \frac{2}{T}\int_{-T/2}^{T/2} g(t)\sin(2\pi n f_o t)\,dt \end{aligned} \tag{4.13}$$
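The expressions in Eq. (4.13) can be evaluated numerically when a closed form is inconvenient. The sketch below is one way to do this in MATLAB using `integral`. The test waveform (zero for t < 0 and a ramp of slope 2 for t > 0 within one cycle) and the normalised period T = 1 are assumptions chosen purely for illustration.

```matlab
T  = 1;  fo = 1/T;                      % normalised period (assumption)
g  = @(t) abs(t) + t;                   % hypothetical one-cycle definition on (-T/2, T/2)

A0 = (1/T)*integral(g, -T/2, T/2);      % DC component
N  = 5;                                 % number of harmonics evaluated
an = zeros(1, N);  bn = zeros(1, N);
for n = 1:N
    an(n) = (2/T)*integral(@(t) g(t).*cos(2*pi*n*fo*t), -T/2, T/2);
    bn(n) = (2/T)*integral(@(t) g(t).*sin(2*pi*n*fo*t), -T/2, T/2);
end
An   = sqrt(an.^2 + bn.^2);             % amplitude of nth harmonic
phin = atan2d(-bn, an);                 % phase in degrees (four-quadrant form)

fprintf('A0 = %.4f\n', A0);
disp([(1:N).' an.' bn.' An.' phin.']);  % columns: n, an, bn, An, phin
```

The same loop applies to any other waveform once `g` is replaced by its one-cycle definition.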
The Fourier series stated in full in Eq. (4.13) is not a mere mathematical indulgence. Rather, it has important practical applications, giving a complete picture of the frequency content or spectrum and hence bandwidth of g(t). This information is vital in the design and analysis of signal transmission systems and will be emphasised throughout this chapter. It is not always necessary to evaluate all the integrals in Eq. (4.13) in order to derive the Fourier series of g(t). The following properties will sometimes apply and may be exploited to significantly simplify or reduce computation.

4.2.1.1 Time Shifting

By substituting $t − t_o$ for t wherever it occurs in the Fourier series equation, we obtain the Fourier series of the signal $g(t − t_o)$ as

$$\begin{aligned} g(t - t_o) &= A_0 + \sum_{n=1}^{\infty} A_n \cos(2\pi n f_o (t - t_o) + \phi_n) \\ &= A_0 + \sum_{n=1}^{\infty} A_n \cos(2\pi n f_o t + \phi_n - 2\pi n f_o t_o) \end{aligned}$$
We see that time shifting by $t_o$ changes only the phase spectrum, adding a shift of $−2\pi n f_o t_o$ to the phase of the nth harmonic, whereas the amplitude spectrum is unchanged. Letting $\mathbb{A}_{x(t),n}$ and $\Phi_{x(t),n}$, respectively, denote the amplitude and phase spectra of signal x(t), we may write

$$\begin{aligned} \mathbb{A}_{g(t - t_o),\,n} &= \mathbb{A}_{g(t),\,n} \\ \Phi_{g(t - t_o),\,n} &= \Phi_{g(t),\,n} - 2\pi n \alpha; \qquad \alpha = t_o/T \end{aligned} \tag{4.14}$$
where we have expressed the time shift $t_o$ as a fraction $\alpha$ of the waveform's period and invoked $f_o T = 1$. If signal x(t) is the result of a combination of horizontal and vertical shifts of another signal whose Fourier series is already known then this time shifting property may be exploited to obtain the Fourier series of x(t) without the need to work from first principles involving the evaluation of the integrals in Eq. (4.13). Note that a downward (upward) vertical shift corresponds to subtracting (adding) a DC component. For example, in Figure 4.4 the bipolar rectangular waveform $g_1(t)$ is the result of delaying the unipolar RPT g(t) by $t_o = T/8$ and subtracting $A_1 = 10$ V. So (with $\alpha = 1/8$ in this case), if the Fourier series of g(t) is known, which means that expressions for $A_n$ and $\phi_n$ as functions of n are known for g(t), then the Fourier series of $g_1(t)$ may be quickly derived as

$$\begin{aligned} g_1(t) = g(t - t_o) - A_1 &= A_0 - 10 + \sum_{n=1}^{\infty} A_n \cos(2\pi n f_o t + \phi_n - 2\pi n \alpha) \\ &= A_0 - 10 + \sum_{n=1}^{\infty} A_n \cos(2\pi n f_o t + \phi_n - n\pi/4) \end{aligned}$$
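The time shifting property of Eq. (4.14) can be checked numerically. In the sketch below, the coefficients are estimated as twice the average of the sampled products over one period. The test signal, its harmonic content, and the delay of T/8 are hypothetical values chosen for illustration.

```matlab
T  = 1;  fo = 1/T;  N = 400;            % normalised period, samples per period
t  = (0:N-1)*T/N;                       % one period, end point excluded

% Hypothetical periodic test signal containing harmonics 1 and 3
g  = @(t) 1 + 2*cos(2*pi*fo*t + pi/6) + 0.5*cos(2*pi*3*fo*t - pi/3);
to = T/8;  alpha = to/T;                % delay expressed as a fraction of T

n   = [1 3];                            % harmonics present in g
A   = zeros(size(n));  Ad   = zeros(size(n));
phi = zeros(size(n));  phid = zeros(size(n));
for k = 1:numel(n)
    c  = cos(2*pi*n(k)*fo*t);   s  = sin(2*pi*n(k)*fo*t);
    a  = 2*mean(g(t).*c);       b  = 2*mean(g(t).*s);        % original signal
    ad = 2*mean(g(t - to).*c);  bd = 2*mean(g(t - to).*s);   % delayed signal
    A(k)  = hypot(a, b);    phi(k)  = atan2(-b, a);
    Ad(k) = hypot(ad, bd);  phid(k) = atan2(-bd, ad);
end

% Difference between measured and predicted phase, wrapped to (-pi, pi]
err = angle(exp(1j*(phid - (phi - 2*pi*n*alpha))));
disp([A; Ad]);     % amplitudes before and after the delay
disp(err);         % should be close to zero
```

The amplitudes are unaffected by the delay, and the phase of the nth harmonic changes by −2πnα.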
As another example of this application, the bipolar triangular waveform $g_3(t)$ in Figure 4.4d is derived from the unipolar triangular pulse train $g_2(t)$ in Figure 4.4c by setting $\tau = T$ (which implies duty cycle d = 1), subtracting a DC component $A_1$, and applying a time shift $t_o = −T/5$. So, if the Fourier series of $g_2(t)$ is known, we may obtain the Fourier series of $g_3(t)$ simply by setting d = 1 in the Fourier series for $g_2(t)$, subtracting $A_1$, and adding $2n\pi/5$ to the phase spectrum expression.

4.2.1.2 Time Reversal

By substituting −t for t wherever it occurs in the Fourier series equation, we obtain the Fourier series of the signal g(−t) as

$$\begin{aligned} g(-t) &= A_0 + \sum_{n=1}^{\infty} A_n \cos(2\pi n f_o (-t) + \phi_n) \\ &= A_0 + \sum_{n=1}^{\infty} A_n \cos(-(2\pi n f_o t - \phi_n)) \\ &= A_0 + \sum_{n=1}^{\infty} A_n \cos(2\pi n f_o t - \phi_n) \qquad \text{since } \cos(\theta) = \cos(-\theta) \end{aligned}$$

Therefore, the time domain operation of time reversal does not alter the amplitude spectrum, but it changes the phase spectrum by a factor of −1. That is

$$\begin{aligned} \mathbb{A}_{g(-t),\,n} &= \mathbb{A}_{g(t),\,n} \\ \Phi_{g(-t),\,n} &= -\Phi_{g(t),\,n} \end{aligned} \tag{4.15}$$
4.2.1.3 Even and Odd Functions
If g(t) is an even function, i.e. g(−t) = g(t), then the integrand g(t)sin(2𝜋nf o t) in the integral for bn is an odd function and its area in the range (−T/2 ≤ t ≤ 0) is the same magnitude but of opposite sign as its area in the range (0 ≤ t ≤ T/2), which leads to bn = 0. In that case, we see from Figure 4.2 that the phase spectrum will take on only two possible values, namely 0∘ if an is positive or 180∘ if an is negative.
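Figure 4.4 Waveform examples for Fourier series: (a) unipolar RPT g(t) of amplitude A = 30 V, pulse width $\tau$, period T, and duty cycle $d = \tau/T$; (b) bipolar rectangular waveform $g_1(t)$ with $A_2 = 20$ V, $A_1 = 10$ V, $\tau = T/4$, $t_o = T/8$, and $d = \tau/T = 1/4$; (c) unipolar triangular pulse train $g_2(t)$ of amplitude A, pulse width $\tau$, $d = \tau/T$, and $f_o = 1/T$; (d) bipolar triangular waveform $g_3(t)$ with $A_2 = 40$ V, $A_1 = 20$ V, and durations $\tau_1$ and $\tau_2$; (e) bipolar RPT $g_4(t)$ with levels A and −A, each of width $\tau/2$, and $d = \tau/T$; (f) periodic staircase waveform $g_5(t)$ with step amplitudes $A_1, A_2, A_3, \ldots$ and step widths $\tau_1, \tau_2, \tau_3, \ldots$, where $d_k = \tau_k/T$, k = 1, 2, 3, …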
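If, on the other hand, g(t) is an odd function, i.e. g(−t) = −g(t), then the integrand $g(t)\cos(2\pi n f_o t)$ in the integral for $a_n$ is an odd function, and this leads to $a_n = 0$. Under this condition, the phase spectrum will also have only two possible values, namely −90° if $b_n$ is positive or 90° if $b_n$ is negative. To summarise

$$\begin{aligned} g(t)\ \text{even:} \quad & b_n = 0, \quad \phi_n = \begin{cases} 0^\circ, & a_n\ \text{positive} \\ 180^\circ, & a_n\ \text{negative} \end{cases} \\ g(t)\ \text{odd:} \quad & a_n = 0, \quad \phi_n = \begin{cases} -90^\circ, & b_n\ \text{positive} \\ 90^\circ, & b_n\ \text{negative} \end{cases} \end{aligned} \tag{4.16}$$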
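The symmetry rules summarised in Eq. (4.16) are easy to observe numerically. The sketch below uses two hypothetical one-cycle waveforms, one even and one odd, and estimates the coefficients as twice the average of the sampled products. The waveform shapes and the normalised period are assumptions made for illustration.

```matlab
T = 1;  fo = 1/T;  N = 1000;            % normalised period, samples per period
t = (-N/2:N/2-1)*T/N;                   % samples spanning one cycle from -T/2

g_even = 1 - 2*abs(t)/T;                % hypothetical even waveform
g_odd  = t.*(1 - 2*abs(t)/T);           % hypothetical odd waveform

M = 3;                                  % harmonics examined
an_e = zeros(1, M);  bn_e = zeros(1, M);
an_o = zeros(1, M);  bn_o = zeros(1, M);
for n = 1:M
    c = cos(2*pi*n*fo*t);  s = sin(2*pi*n*fo*t);
    an_e(n) = 2*mean(g_even.*c);   bn_e(n) = 2*mean(g_even.*s);
    an_o(n) = 2*mean(g_odd.*c);    bn_o(n) = 2*mean(g_odd.*s);
end
disp([an_e; bn_e]);   % sine coefficients of the even waveform vanish
disp([an_o; bn_o]);   % cosine coefficients of the odd waveform vanish
```

Only one set of coefficients therefore needs to be computed for a waveform known to be even or odd.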
Therefore, when deriving the Fourier series of a given signal g(t) from first principles, if the function is not already either even or odd, it is prudent to check whether a horizontal shift of g(t) by some amount $t_o$ would result in either an even or an odd function $g(t − t_o)$. If that is the case then only one coefficient ($a_n$ if even or $b_n$ if odd) needs to be calculated to obtain the Fourier series of $g(t − t_o)$, which significantly reduces computational effort. The correct Fourier series for g(t) is then obtained by invoking the time shifting property and adding $2\pi n f_o t_o$ to the phase of the derived Fourier series for $g(t − t_o)$.

4.2.1.4 Piecewise Linear Functions
If the waveform g(t) is piecewise linear or a combination of standard geometrical shapes (e.g. rectangular, triangular, sawtooth, ramp, trapezoidal, etc.) then $A_0$ is the mean value of g(t) obtained by dividing the net area within one cycle of g(t) by the period T of g(t). This net area may be easily calculated from the dimensions of the shape of the waveform without the need for any integration. Consider again the waveforms g(t), $g_1(t)$, $g_2(t)$, and $g_3(t)$ in Figure 4.4. The net area in one cycle of g(t) is $A\tau$, so the DC component of g(t) is $A_0 = A\tau/T = Ad$, where d is the duty cycle of the waveform (as discussed in Section 2.6). The net area in one cycle of $g_1(t)$ is $A_2\tau − A_1(T − \tau)$. Dividing by T and using the values $A_2 = 20$, $A_1 = 10$, and $\tau = T/4$ shown on the figure yields the DC component of $g_1(t)$ as

$$A_0 = \frac{A_2\tau - A_1(T - \tau)}{T} = \frac{1}{4}A_2 - \frac{3}{4}A_1 = 5 - 7.5 = -2.5\ \text{V}$$

The net area in one cycle of $g_2(t)$ is $A\tau/2$. Dividing this by T yields $A_0 = Ad/2$. To determine the DC component of $g_3(t)$, we note that $\tau_1 + \tau_2 = T$, and (since the upward and downward slopes of the triangle are equal) $A_2/A_1 = \tau_2/\tau_1$. Dividing the net area in one cycle of $g_3(t)$ by T yields $A_0$ as

$$\begin{aligned} A_0 &= \frac{1}{2}\frac{(A_2\tau_2 - A_1\tau_1)}{T} = \frac{1}{2}\frac{(A_2\tau_2 - A_1\tau_1)}{\tau_2 + \tau_1} = \frac{1}{2}\frac{(A_2\tau_2/\tau_1 - A_1)}{\tau_2/\tau_1 + 1} \\ &= \frac{1}{2}\frac{(A_2 \times A_2/A_1 - A_1)}{A_2/A_1 + 1} = \frac{1}{2}\frac{(A_2^2 - A_1^2)}{A_2 + A_1} \end{aligned}$$
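These area-based results are simple arithmetic and can be reproduced in a few lines. The sketch below uses the values marked on Figure 4.4 and, as a cross-check for the bipolar triangular waveform, averages samples of one cycle of a triangle running between $−A_1$ and $A_2$. The variable names and the sampled construction are illustrative assumptions.

```matlab
% Unipolar RPT g(t): net area A*tau per cycle
A = 30;  d = 1/4;
A0_g = A*d;

% Bipolar rectangular waveform g1(t): A2 for a fraction d, -A1 otherwise
A2 = 20;  A1 = 10;
A0_g1 = A2*d - A1*(1 - d);

% Bipolar triangular waveform g3(t) with peaks A2t and -A1t
A2t = 40;  A1t = 20;
A0_g3 = 0.5*(A2t^2 - A1t^2)/(A2t + A1t);

% Cross-check: mean of a sampled triangle between -A1t and A2t
N  = 1000;  x = (0:N-1)/N;              % one cycle, time origin arbitrary
g3 = -A1t + (A2t + A1t)*(1 - abs(2*x - 1));
A0_g3_num = mean(g3);

fprintf('A0(g) = %.2f V, A0(g1) = %.2f V, A0(g3) = %.2f V (sampled: %.2f V)\n', ...
        A0_g, A0_g1, A0_g3, A0_g3_num);
```

The time origin does not affect a mean value, so the sampled triangle need not be aligned with Figure 4.4d.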
If, for example, $A_2 = 40$ V and $A_1 = 20$ V, then $A_0 = \frac{1}{2}(1600 − 400)/60 = 10$ V.

The next three worked examples will apply the concepts we have discussed so far in this section and are designed to give you important skills in Fourier analysis and synthesis. In the first worked example, we derive the Fourier series of rectangular and triangular pulse trains from first principles. In the second, we learn how to quickly derive the Fourier series of a signal by manipulating a known Fourier series of a related standard waveform. In the third, we learn to interpret the Fourier series and extract some of its frequency domain information for use in Fourier synthesis. There are many other useful Fourier series exercises in the end-of-chapter questions for you to work on to gain confidence in your mastery of the topic.

Worked Example 4.1 Derivation of Fourier Series from First Principles
We wish to derive from first principles the Fourier series of the centred unipolar RPT g(t) shown in Figure 4.4a and the centred unipolar triangular pulse train $g_2(t)$ of Figure 4.4c. This means that we will evaluate the various Fourier series expressions given in Eq. (4.13). (NOTE: this worked example involves evaluation of integrals, including integration by parts. The next two worked examples will focus on the use and interpretation of the results obtained here without need for calculus.)

Deriving the Fourier series of a waveform simply involves calculating its DC component $A_0$, and obtaining expressions for the amplitude $A_n$ and phase $\phi_n$ of the nth harmonic, n = 1, 2, 3, … The unipolar RPT (g(t) in Figure 4.4a) has amplitude A, pulse width $\tau$, waveform period T, and duty cycle $d = \tau/T$. We observe that this waveform is an even function, so the sine coefficient $b_n$ in Eq. (4.13) is zero, and the Fourier series of g(t) in Eq. (4.1) simplifies to

$$g(t) = A_0 + \sum_{n=1}^{\infty} a_n \cos(2\pi n f_o t)$$
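which means that only the DC component $A_0$ and the cosine coefficient $a_n$ need to be calculated. As discussed above, the RPT has DC component $A_0 = Ad$. The cosine coefficient $a_n$ is obtained by evaluating the integral in Eq. (4.11), noting that within the integration interval (−T/2, T/2) the waveform has a constant value of A in the range $(−\tau/2, \tau/2)$ and is zero elsewhere. Thus

$$\begin{aligned} a_n &= \frac{2}{T}\int_{-T/2}^{T/2} g(t)\cos(2\pi n f_o t)\,dt = \frac{2}{T}\int_{-\tau/2}^{\tau/2} A\cos(2\pi n f_o t)\,dt \\ &= \frac{2A}{T}\left[\frac{\sin(2\pi n f_o t)}{2\pi n f_o}\right]_{-\tau/2}^{\tau/2} = \frac{2A}{T}\left[\frac{\sin(\pi n f_o \tau) - \sin(-\pi n f_o \tau)}{2\pi n f_o}\right] \\ &= \frac{2A}{T}\left[\frac{2\sin(\pi n f_o \tau)}{2\pi n f_o}\right] = \frac{2A}{T}\left[\frac{\sin(\pi n f_o \tau)}{\pi n f_o}\right] \times \frac{\tau}{\tau} \\ &= \frac{2A\tau}{T}\left[\frac{\sin(\pi n f_o \tau)}{\pi n f_o \tau}\right] \\ &= 2Ad\,\mathrm{sinc}(nd) \qquad (\text{since } d = \tau/T;\ f_o\tau = d) \end{aligned}$$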
where we have introduced the sinc function, $\mathrm{sinc}(x) = \sin(\pi x)/(\pi x)$, discussed in Section 2.6.8. The Fourier series of a centred unipolar RPT of amplitude A, waveform period T, and duty cycle d is therefore given by

$$g(t) = Ad + 2Ad\sum_{n=1}^{\infty} \mathrm{sinc}(nd)\cos(2\pi n f_o t); \qquad f_o = 1/T \tag{4.17}$$
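The closed-form coefficient $a_n = 2Ad\,\mathrm{sinc}(nd)$ can be compared against a direct numerical evaluation of Eq. (4.11). The sketch below assumes A = 30 V and d = 1/4 (the values used for Figure 4.4) with a normalised period T = 1, and defines its own helper `sincn` so that no toolbox is needed. The helper name is hypothetical.

```matlab
A = 30;  d = 1/4;                       % amplitude and duty cycle
T = 1;   tau = d*T;  fo = 1/T;          % normalised period (assumption)
sincn = @(x) sin(pi*x)./(pi*x);         % sinc for x ~= 0, which holds for n >= 1

n = 1:8;
an_num = zeros(size(n));
for k = n
    % g(t) = A inside (-tau/2, tau/2) and zero elsewhere in the cycle
    an_num(k) = (2/T)*integral(@(t) A*cos(2*pi*k*fo*t), -tau/2, tau/2);
end
an_formula = 2*A*d*sincn(n*d);          % coefficient in Eq. (4.17)

disp([n.' an_num.' an_formula.']);      % columns: n, numerical, closed form
```

The two columns of coefficients agree to within the tolerance of the numerical integration, and every fourth coefficient is zero because sinc(nd) vanishes when nd is an integer.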
A unipolar square wave is a special case of the above pulse train with d = 1/2 for which the formula for the amplitude of the nth harmonic sinusoid is

$$A_n = 2Ad\,\mathrm{sinc}(nd) = A\,\mathrm{sinc}(n/2) = A\frac{\sin(n\pi/2)}{n\pi/2} = \frac{2A}{n\pi}\sin\left(n\frac{\pi}{2}\right)$$

This simplifies to

$$A_n = \begin{cases} 0, & n\ \text{even} \\ \dfrac{2A}{n\pi}(-1)^{(n-1)/2}, & n\ \text{odd} \end{cases} \tag{4.18}$$
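Equation (4.18) gives a convenient way to observe the Gibbs phenomenon mentioned earlier in this section. The sketch below synthesises a unipolar square wave of unit amplitude (an assumed value) from its DC component and harmonics up to the 15th and then up to the 80th, and reports the peak overshoot above the flat top as a fraction of the jump size A.

```matlab
A = 1;  T = 1;  fo = 1/T;               % unit amplitude, normalised period (assumptions)
t = linspace(-T/2, T/2, 20001);         % fine grid to resolve the overshoot peak
Nh_list = [15 80];                      % highest harmonic included in each synthesis
ov = zeros(size(Nh_list));

for i = 1:numel(Nh_list)
    gs = (A/2)*ones(size(t));           % DC component A*d with d = 1/2
    for n = 1:2:Nh_list(i)              % even harmonics are zero, Eq. (4.18)
        An = (2*A/(n*pi))*(-1)^((n-1)/2);
        gs = gs + An*cos(2*pi*n*fo*t);
    end
    ov(i) = (max(gs) - A)/A;            % overshoot relative to the jump size A
    fprintf('Up to harmonic %d: overshoot = %.1f%% of the jump\n', Nh_list(i), 100*ov(i));
end
```

Adding harmonics narrows the ripples and moves them closer to the discontinuity, but the peak overshoot remains at a similar fraction of the jump.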
Turning our attention to the centred triangular pulse train $g_2(t)$ (Figure 4.4c) of amplitude A, pulse width $\tau$, waveform period T, and duty cycle $d = \tau/T$, we note that one cycle $g_{2,T}(t)$ of $g_2(t)$ lies in the range (−T/2, T/2) and is defined by

$$g_{2,T}(t) = \begin{cases} A(1 - 2|t|/\tau), & |t| \le \tau/2 \\ 0, & \text{otherwise} \end{cases}$$
This waveform is also an even function, so only $A_0$ and the cosine coefficient $a_n$ need to be calculated. As discussed earlier, this triangular pulse train has DC component $A_0 = Ad/2$. The cosine coefficient $a_n$ is obtained by evaluating the integral in Eq. (4.11) using the above expression for one cycle of the waveform. Thus

$$\begin{aligned} a_n &= \frac{2}{T}\int_{-T/2}^{T/2} g_2(t)\cos(2\pi n f_o t)\,dt = \frac{2}{T}\int_{-\tau/2}^{\tau/2} A(1 - 2|t|/\tau)\cos(2\pi n f_o t)\,dt \\ &= \frac{2}{T}\int_{-\tau/2}^{\tau/2} A\cos(2\pi n f_o t)\,dt - \frac{4A}{T}\int_{-\tau/2}^{\tau/2}\frac{|t|}{\tau}\cos(2\pi n f_o t)\,dt \end{aligned}$$
Notice that the first integral on the right-hand side has already been evaluated above as $2Ad\,\mathrm{sinc}(nd)$, and the second integral may be doubled and evaluated in the half-interval $(0, \tau/2)$. Thus

$$a_n = 2Ad\,\mathrm{sinc}(nd) - \frac{8A}{T\tau}\int_{0}^{\tau/2} t\cos(2\pi n f_o t)\,dt$$
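The integral on the right-hand side is of the form $\int f'g\,dt$, which equals $fg − \int fg'\,dt$, with $g \equiv t$, $f' \equiv \cos(2\pi n f_o t)$; $g' = 1$, $f = \sin(2\pi n f_o t)/(2\pi n f_o)$. Employing this (integration by parts technique) yields

$$\begin{aligned} \int t\cos(2\pi n f_o t)\,dt &= \frac{t\sin(2\pi n f_o t)}{2\pi n f_o} - \int\frac{\sin(2\pi n f_o t)}{2\pi n f_o}\,dt \\ &= \frac{t\sin(2\pi n f_o t)}{2\pi n f_o} + \frac{\cos(2\pi n f_o t)}{(2\pi n f_o)^2} \end{aligned}$$

and hence

$$\begin{aligned} a_n &= 2Ad\,\mathrm{sinc}(nd) - \frac{8A}{T\tau}\left[\frac{t\sin(2\pi n f_o t)}{2\pi n f_o} + \frac{\cos(2\pi n f_o t)}{(2\pi n f_o)^2}\right]_{0}^{\tau/2} \\ &= 2Ad\,\mathrm{sinc}(nd) - \frac{8A}{T\tau}\left[\frac{\frac{\tau}{2}\sin\left(2\pi n f_o\frac{\tau}{2}\right)}{2\pi n f_o} + \frac{\cos\left(2\pi n f_o\frac{\tau}{2}\right) - 1}{(2\pi n f_o)^2}\right] \\ &= 2Ad\,\mathrm{sinc}(nd) - \frac{2A\sin(\pi n d)}{\pi n} + \frac{2AT}{\tau}\frac{[1 - \cos(\pi n d)]}{(\pi n)^2} \end{aligned}$$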
Multiplying the second term by d/d (in order to make its denominator equal to the sine function argument in its numerator, which allows us to introduce the sinc function), noting that in the third term $2AT/\tau = A/(d/2)$ and that $1 − \cos(\pi n d)$ has trigonometric identity $2\sin^2(\pi n d/2)$, and multiplying the third term by (d/2)/(d/2), we obtain

$$\begin{aligned} a_n &= 2Ad\,\mathrm{sinc}(nd) - \frac{2A\sin(\pi n d)}{\pi n} \times \frac{d}{d} + \frac{A}{d/2}\,\frac{[2\sin^2(\pi n d/2)]}{(\pi n)^2} \times \frac{d/2}{d/2} \\ &= 2Ad\,\mathrm{sinc}(nd) - 2Ad\,\mathrm{sinc}(nd) + Ad\left[\frac{\sin(\pi n d/2)}{(\pi n d/2)}\right]^2 \\ &= Ad\,\mathrm{sinc}^2(nd/2) \end{aligned}$$

Therefore, the desired Fourier series (of a centred unipolar triangular pulse train $g_2(t)$ (Figure 4.4c) of amplitude A, waveform period T, and duty cycle d) is

$$g_2(t) = Ad/2 + Ad\sum_{n=1}^{\infty} \mathrm{sinc}^2(nd/2)\cos(2\pi n f_o t); \qquad f_o = 1/T \tag{4.19}$$
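Because the derivation of Eq. (4.19) involves several algebraic steps, a numerical cross-check is reassuring. The sketch below evaluates Eq. (4.11) directly for the triangular pulse and compares it with $Ad\,\mathrm{sinc}^2(nd/2)$. The amplitude, duty cycle, normalised period, and the helper `sincn` are assumptions made for illustration.

```matlab
A = 1;  d = 0.4;                        % hypothetical amplitude and duty cycle
T = 1;  tau = d*T;  fo = 1/T;           % normalised period (assumption)
sincn = @(x) sin(pi*x)./(pi*x);         % sinc for x ~= 0
g2    = @(t) A*(1 - 2*abs(t)/tau);      % one pulse, valid for |t| <= tau/2

n = 1:6;
an_num = zeros(size(n));
for k = n
    an_num(k) = (2/T)*integral(@(t) g2(t).*cos(2*pi*k*fo*t), -tau/2, tau/2);
end
an_formula = A*d*sincn(n*d/2).^2;       % coefficient in Eq. (4.19)

disp([n.' an_num.' an_formula.']);      % columns: n, numerical, closed form
```

All coefficients are non-negative here, consistent with the squared sinc function in the closed form.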
Worked Example 4.2
Derivation of Fourier Series by Manipulating Standard Waveforms
The Fourier series of a wide selection of signals may be derived, without evaluating any integrals, simply by manipulating already known Fourier series of standard waveforms. This worked example discusses such an approach. We derive the Fourier series here and leave its interpretation and application for Fourier synthesis to the next worked example.
Equations (4.17) and (4.19), respectively, give the Fourier series of the centred unipolar RPT (g(t) in Figure 4.4a) and centred unipolar triangular pulse train ($g_2(t)$ in Figure 4.4c). Based on these equations, determine the Fourier series of each of the following periodic waveforms:

(a) A bipolar rectangular waveform ($g_1(t)$ in Figure 4.4b).
(b) A bipolar triangular waveform ($g_3(t)$ in Figure 4.4d).
(c) A bipolar rectangular pulse train (RPT) ($g_4(t)$ in Figure 4.4e).
(d) An arbitrary periodic staircase waveform having amplitude $A_k$ in the kth step of width $\tau_k$, k = 1, 2, 3, … ($g_5(t)$ in Figure 4.4f).
(a) As earlier discussed, $g_1(t) = g(t − t_o) − A_1$. Using this relation in the Fourier series of g(t) given in Eq. (4.17), by replacing t with $t − t_o$ wherever it occurs in that equation and subtracting $A_1$ from the series, we obtain the desired Fourier series of $g_1(t)$ as

$$\begin{aligned} g_1(t) = g(t - t_o) - A_1 &= Ad - A_1 + 2Ad\sum_{n=1}^{\infty} \mathrm{sinc}(nd)\cos(2\pi n f_o (t - t_o)) \\ &= Ad - A_1 + 2Ad\sum_{n=1}^{\infty} \mathrm{sinc}(nd)\cos(2\pi n f_o t - 2\pi n f_o t_o) \end{aligned}$$

Putting A = 30, $A_1 = 10$, $t_o = T/8$, and d = 1/4, and recalling that $f_o T = 1$, yields

$$\begin{aligned} g_1(t) &= 30 \times \frac{1}{4} - 10 + 60 \times \frac{1}{4}\sum_{n=1}^{\infty} \mathrm{sinc}(nd)\cos(2\pi n f_o t - 2\pi n f_o T/8) \\ &= -2.5 + 15\sum_{n=1}^{\infty} \mathrm{sinc}(nd)\cos(2\pi n f_o t - n\pi/4) \end{aligned}$$
(b) The triangular waveform $g_3(t)$ is a special case of the triangular pulse train $g_2(t)$ of Figure 4.4c with d = 1, amplitude $A = A_2 + A_1$, DC component $A_1$ subtracted (because it has been shifted vertically downwards through $A_1$), and time shifted by $t_o = −T/5$ (since it has been shifted horizontally to the left by T/5, a value determined by counting the grids on the graph of $g_3(t)$). This means that $g_3(t)$ has been advanced and made to start earlier. Therefore, with the Fourier series of $g_2(t)$ given by Eq. (4.19) as

$$g_2(t) = \frac{Ad}{2} + Ad\sum_{n=1}^{\infty} \mathrm{sinc}^2(nd/2)\cos(2\pi n f_o t)$$

the desired Fourier series of $g_3(t)$ is obtained as

$$\begin{aligned} g_3(t) = g_2(t - t_o) - A_1 &= \frac{Ad}{2} - A_1 + Ad\sum_{n=1}^{\infty} \mathrm{sinc}^2(nd/2)\cos(2\pi n f_o t - 2\pi n f_o t_o) \\ &= 10 + 60\sum_{n=1}^{\infty} \mathrm{sinc}^2(nd/2)\cos(2\pi n f_o t + 2n\pi/5) \end{aligned}$$
where we substituted d = 1, $t_o = −T/5$, A = 60, $A_1 = 20$.

(c) The waveform $g_4(t)$ has bipolar pulse width $\tau$, waveform period T and duty cycle $d = \tau/T$. In Figure 4.5, we show that $g_4(t)$ is the sum of two RPTs $g_{4a}(t)$ and $g_{4b}(t)$. Since $g_{4a}(t)$ has amplitude A, duty cycle d/2, and is time-shifted relative to the centred RPT g(t) by $t_o = \tau/4$, the Fourier series of $g_{4a}(t)$ is obtained by replacing d with d/2 and t with $t − \tau/4$ in Eq. (4.17), noting that $f_o\tau = f_o dT = d$. Thus

$$\begin{aligned} g_{4a}(t) &= Ad/2 + Ad\sum_{n=1}^{\infty} \mathrm{sinc}(nd/2)\cos(2\pi n f_o t - 2\pi n f_o \tau/4) \\ &= Ad/2 + Ad\sum_{n=1}^{\infty} \mathrm{sinc}(nd/2)\cos(2\pi n f_o t - nd\pi/2) \end{aligned}$$

Figure 4.5 Worked Example 4.2(c). (The figure shows the bipolar RPT $g_4(t)$ as the sum of $g_{4a}(t)$, an RPT of amplitude A and pulse width $\tau/2$, and $g_{4b}(t)$, an RPT of amplitude −A and pulse width $\tau/2$, all of period T.)
The Fourier series of $g_{4b}(t)$ is similarly obtained from Eq. (4.17) by replacing d with d/2, replacing t with $t + \tau/4$ (since $g_{4b}(t)$ is advanced by $\tau/4$ relative to g(t)), and replacing A with −A (since $g_{4b}(t)$ has amplitude −A). Thus

$$g_{4b}(t) = -Ad/2 - Ad\sum_{n=1}^{\infty} \mathrm{sinc}(nd/2)\cos(2\pi n f_o t + nd\pi/2)$$
The desired Fourier series of $g_4(t)$ is the sum of the two series

$$g_4(t) = Ad\sum_{n=1}^{\infty} \mathrm{sinc}(nd/2)\left[\cos(2\pi n f_o t - nd\pi/2) - \cos(2\pi n f_o t + nd\pi/2)\right]$$

The term in square brackets is of the form cos(D) − cos(E), which has the trigonometric identity

$$\cos(D) - \cos(E) = 2\sin\left(\frac{E + D}{2}\right)\sin\left(\frac{E - D}{2}\right)$$

Substituting $E \equiv 2\pi n f_o t + nd\pi/2$; $D \equiv 2\pi n f_o t − nd\pi/2$ yields

$$g_4(t) = 2Ad\sum_{n=1}^{\infty} \mathrm{sinc}\left(n\frac{d}{2}\right)\sin\left(\pi n\frac{d}{2}\right)\sin(2\pi n f_o t)$$

Multiplying the $\sin(\pi n d/2)$ factor by $(\pi n d/2)/(\pi n d/2)$ allows us to convert it into $(\pi n d/2)\,\mathrm{sinc}(nd/2)$, and this leads to

$$\begin{aligned} g_4(t) &= \pi A d^2\sum_{n=1}^{\infty} n\,\mathrm{sinc}^2(nd/2)\sin(2\pi n f_o t) \\ &= \pi A d^2\sum_{n=1}^{\infty} n\,\mathrm{sinc}^2(nd/2)\cos(2\pi n f_o t - 90^\circ) \end{aligned}$$
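The sine coefficient in this series, $\pi A d^2 n\,\mathrm{sinc}^2(nd/2)$, can be checked against a direct evaluation of Eq. (4.12) for a bipolar pulse equal to −A on $(−\tau/2, 0)$ and +A on $(0, \tau/2)$. The sketch below does this with hypothetical values of A and d, a normalised period, and a locally defined helper `sincn`.

```matlab
A = 1;  d = 0.5;                        % hypothetical amplitude and duty cycle
T = 1;  tau = d*T;  fo = 1/T;           % normalised period (assumption)
sincn = @(x) sin(pi*x)./(pi*x);         % sinc for x ~= 0

n = 1:6;
bn_num = zeros(size(n));
for k = n
    % g4(t) = -A on (-tau/2, 0), +A on (0, tau/2), zero elsewhere in the cycle
    neg = integral(@(t) -A*sin(2*pi*k*fo*t), -tau/2, 0);
    pos = integral(@(t)  A*sin(2*pi*k*fo*t),  0, tau/2);
    bn_num(k) = (2/T)*(neg + pos);
end
bn_formula = pi*A*d^2*n.*sincn(n*d/2).^2;

disp([n.' bn_num.' bn_formula.']);      % columns: n, numerical, closed form
```

The coefficients are never negative, so every nonzero harmonic carries the same phase.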
The form of this Fourier series is as expected since $g_4(t)$ is an odd function, so its Fourier series consists only of sine harmonics $\sin(2\pi n f_o t)$ having a phase of ±90°; and it has no DC component. In this case all harmonics will have the same phase of −90° since the amplitude of the nth harmonic, which is the factor $\pi A d^2 n\,\mathrm{sinc}^2(nd/2)$ in the above series, will always be positive.

(d) This task is a generalisation of (c). A periodic staircase waveform can be treated as the sum of RPTs with appropriate delays. For example, the waveform $g_5(t)$ in Figure 4.4f is the sum of (i) RPT of amplitude $A_1$ and delay $\tau_1/2$ having pulse width $\tau_1$ and duty cycle $d_1 = \tau_1/T$; (ii) RPT of amplitude $A_2$ and delay $\tau_1 + \tau_2/2$ having pulse width $\tau_2$ and duty cycle $d_2 = \tau_2/T$; (iii) RPT of amplitude $A_3$ and delay $\tau_1 + \tau_2 + \tau_3/2$ having pulse width $\tau_3$ and duty cycle $d_3 = \tau_3/T$; and so on. Referring to each component RPT as a step within one cycle of the staircase waveform, the Fourier series of the kth RPT (of amplitude $A_k$ and pulse width or step duration $\tau_k$) is

$$g_k(t) = A_k d_k + 2A_k d_k\sum_{n=1}^{\infty} \mathrm{sinc}(n d_k)\cos\left[2\pi n f_o t - 2\pi n\left(\frac{d_k}{2} + \sum_{i=1}^{k-1} d_i\right)\right]$$

The Fourier series of an m-step staircase periodic waveform is therefore

$$\begin{aligned} g(t) &= \sum_{k=1}^{m}\left\{A_k d_k + 2A_k d_k\sum_{n=1}^{\infty} \mathrm{sinc}(n d_k)\cos\left[2\pi n f_o t - 2\pi n\left(\frac{d_k}{2} + \sum_{i=1}^{k-1} d_i\right)\right]\right\} \\ d_k &= \frac{\tau_k}{T}; \qquad f_o = \frac{1}{T} \end{aligned} \tag{4.20}$$

This result is completely general and applies to an arbitrary staircase waveform with m steps in one cycle, including flat-top-sampled signals, in which case the steps will be of the same duration equal to the sampling interval. It is, however, important to strictly follow the correct convention in identifying the steps: the m steps are counted starting at t = 0 within the interval (0, T). If the first step straddles the y axis then it must be counted as two steps, the first being the portion of the step beyond t = 0 and the second being the remaining portion of the step ending at t = T.

Worked Example 4.3 Fourier Synthesis

The Fourier series of the bipolar rectangular waveform $g_1(t)$ shown in Figure 4.4b was derived in the previous worked example as

$$g_1(t) = -2.5 + 15\sum_{n=1}^{\infty} \mathrm{sinc}(nd)\cos(2\pi n f_o t - n\pi/4)$$

For waveform period T = 0.1 ms

(a) Determine the frequency, amplitude, and phase of the DC component and first 12 harmonic components of $g_1(t)$. Present your result in tabular form.
(b) Synthesise $g_1(t)$ using its DC component and first five harmonics.
(c) Write a MATLAB code to synthesise and plot $g_1(t)$ using its DC component and first 20 harmonics.
(a) We start by simply comparing the Fourier series of $g_1(t)$ with the standard form of Fourier series in Eq. (4.2) and matching corresponding terms

$$-2.5 + 15\sum_{n=1}^{\infty} \mathrm{sinc}(nd)\cos(2\pi n f_o t - n\pi/4) \equiv A_0 + \sum_{n=1}^{\infty} A_n \cos(2\pi n f_o t + \phi_n)$$
This shows that:

• DC component of $g_1(t)$: $A_0 = −2.5 = 2.5\angle 180^\circ$
• Amplitude of nth harmonic of $g_1(t)$: $A_n = 15\,\mathrm{sinc}(nd)$
• Phase of nth harmonic of $g_1(t)$: $\phi_n = −n\pi/4 = −n \times 45^\circ$

Notice how we treated the negative sign of the DC component as a phase shift of 180°, which will be taken to the phase spectrum, leaving just the magnitude of the DC component in the amplitude spectrum. This is done because the amplitude spectrum $A_n$ must have only positive real values. All phase information in the amplitude (e.g. a factor of −1 for 180°, or a factor of j for 90°, or any other angle) is transferred and consigned to the phase spectrum.

The amplitude of each harmonic is calculated using the above formula for $A_n$ with d = 1/4 (as specified in Figure 4.4b), for n = 1, 2, …, 12. For example, $A_1 = 15\,\mathrm{sinc}(1/4) = 15\sin(\pi/4)/(\pi/4) = 13.505$; $A_2 = 15\,\mathrm{sinc}(1/2) = 9.55$; and so on. If an amplitude turns out negative then 180° is added to the phase of that harmonic (to account for the factor of −1) and the absolute value is retained as amplitude.

The phase of each harmonic is computed using the above formula for $\phi_n$ and then converted to a value in the range −180° to 180° by adding or subtracting an integer number of cycles (i.e. 360°) as necessary. For example, the phase of the 10th harmonic is $\phi_{10} = −10 \times 45^\circ = −450^\circ \equiv −450^\circ + 360^\circ = −90^\circ$. The phase of the fifth harmonic is the phase given by the above formula plus 180° transferred from the negative amplitude. Thus, $\phi_5 = −5 \times 45^\circ + 180^\circ = −225^\circ + 180^\circ = −45^\circ$. If the amplitude of a harmonic is zero then its phase is by convention set to zero irrespective of the value given by the phase formula. For example, $\phi_4 = 0^\circ$ because $A_4 = 0$.

The fundamental frequency of the waveform is $f_o = 1/T = 1/(0.1 \times 10^{−3}) = 10$ kHz. So, the nth harmonic has frequency $n f_o = 10n$ kHz. The full set of results is tabulated in Table 4.1.

Table 4.1 Harmonic components of $g_1(t)$ in Figure 4.4b.

| n | Frequency = $n f_o$, kHz | 15 sinc(n/4) | −n × 45° | $A_n$ | $\phi_n$, deg |
|---|---|---|---|---|---|
| 0 | 0 (DC) | — | — | 2.5 | 180 |
| 1 | 10 | 13.50 | −45 | 13.50 | −45 |
| 2 | 20 | 9.55 | −90 | 9.55 | −90 |
| 3 | 30 | 4.50 | −135 | 4.50 | −135 |
| 4 | 40 | 0 | −180 | 0 | 0 |
| 5 | 50 | −2.70 | −225 | 2.70 | −45 |
| 6 | 60 | −3.18 | −270 | 3.18 | −90 |
| 7 | 70 | −1.93 | −315 | 1.93 | −135 |
| 8 | 80 | 0 | −360 | 0 | 0 |
| 9 | 90 | 1.50 | −405 | 1.50 | −45 |
| 10 | 100 | 1.91 | −450 | 1.91 | −90 |
| 11 | 110 | 1.23 | −495 | 1.23 | −135 |
| 12 | 120 | 0 | −540 | 0 | 0 |
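The tabulated amplitudes and phases can be regenerated programmatically by applying the conventions stated above: a negative amplitude contributes 180° to the phase, phases are wrapped into the range −180° to 180°, and the phase of a zero-amplitude harmonic is set to zero. The following MATLAB sketch is one way to do so. The helper `sincn` and the zero threshold of 1e-12 are assumptions introduced here.

```matlab
fo_kHz = 10;                            % fundamental frequency, 1/T with T = 0.1 ms
sincn  = @(x) sin(pi*x)./(pi*x);        % sinc for x ~= 0

n   = (1:12).';
raw = 15*sincn(n/4);                    % signed coefficient 15*sinc(n/4)
raw(abs(raw) < 1e-12) = 0;              % sin(k*pi) is not exactly zero in floating point

ph = -45*n;                             % phase from the formula, in degrees
ph(raw < 0) = ph(raw < 0) + 180;        % move the sign of the amplitude into the phase
ph = mod(ph + 180, 360) - 180;          % wrap into the range -180 to 180 degrees
ph(raw == 0) = 0;                       % convention for zero-amplitude harmonics

An = abs(raw);
disp([n, fo_kHz*n, round(100*raw)/100, round(100*An)/100, ph]);
```

The displayed columns are n, frequency in kHz, the signed coefficient, the amplitude, and the phase in degrees.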
(b) The amplitude and phase of each sinusoidal component of $g_1(t)$ is displayed in the table above up to the 12th harmonic. Using these values, the synthesised waveform obtained by summing the DC and first five harmonics is the signal

$$\begin{aligned} g_s(t) = {} & -2.5 + 13.50\cos(2\pi f_o t - 45^\circ) + 9.55\cos(4\pi f_o t - 90^\circ) \\ & + 4.50\cos(6\pi f_o t - 135^\circ) + 2.70\cos(10\pi f_o t - 45^\circ) \end{aligned}$$

where $f_o$ = 10 000 Hz. This signal is plotted in Figure 4.6a in the interval t = (−150, 150) μs to cover three cycles of 100 μs (= 0.1 ms) each. Synthesising and plotting a waveform as we have done provides a good check that the amplitudes and phases of the harmonics were correctly calculated.

(c) The following MATLAB code generates and plots the synthesised waveform by adding the DC component and first 20 harmonics. The plot is shown in Figure 4.6b after some tidying up to include labels, grid, etc.

    T = 0.1e-3;                      %Period of waveform
    fo = 1/T;                        %Fundamental frequency
    d = 1/4;                         %Duty cycle
    Nh = 20;                         %Number of harmonics summed
    t = (-1.5*T:T/200:1.5*T);        %Time instants for synthesised signal gst
    gst = -2.5 + zeros(size(t));     %Initialise gst with DC component
    for n = 1:Nh                     %Now add the harmonics, one at a time.
        gst = gst + 15*sinc(n*d)*cos(2*pi*n*fo*t - n*pi/4);
    end
    plot(t, gst);
Figure 4.6 Worked Example 4.3: synthesis of g1(t) of Figure 4.4b using DC and (a) 5 harmonics; (b) 20 harmonics. [Both panels plot the synthesised waveform from −150 μs to 150 μs, with a vertical axis running from −10 to 20.]
4.2.2 Complex Exponential Form of Fourier Series

The Fourier series discussed in the previous section expresses a periodic signal g(t) as a sum of harmonic sinusoids. This format directly represents a single-sided spectrum view of g(t) having amplitude and phase spectral lines of respective heights $A_n$ and $\phi_n$ at frequencies $nf_o$, n = 0, 1, 2, 3, … A complex exponential form of the Fourier series, on the other hand, expresses g(t) as a sum of complex exponentials and directly represents a double-sided spectrum view of g(t). If g(t) is real then the complex exponentials summed will occur as complex conjugate pairs so that their imaginary parts cancel when added, thereby producing a real-valued signal. The implication of this observation is that the amplitude spectrum of a real signal will always be an even function of frequency, whereas the phase spectrum will always be an odd function.

The complex exponential form of the Fourier series may be derived from the sinusoidal form of the series by invoking Eqs. (2.47) and (2.48) (based on Euler's formula) to express $\cos(2\pi n f_o t)$ and $\sin(2\pi n f_o t)$ in terms of complex exponentials as

$$\begin{aligned}
\cos(2\pi n f_o t) &= \tfrac{1}{2}\left(e^{j2\pi n f_o t} + e^{-j2\pi n f_o t}\right)\\
\sin(2\pi n f_o t) &= -j\,\tfrac{1}{2}\left(e^{j2\pi n f_o t} - e^{-j2\pi n f_o t}\right)
\end{aligned} \tag{4.21}$$
Note that $e^{j2\pi n f_o t}$ and $e^{-j2\pi n f_o t}$ are complex conjugates. When these are added together on the right-hand side of the first line of Eq. (4.21), their imaginary parts (which are equal and of opposite sign) cancel, whereas the real part is doubled. Subtracting the complex conjugates (as done in the second line of the above equation) cancels out their real parts (which are equal) and doubles their imaginary part, which is then converted to a real value by the factor −j. Replacing $\cos(2\pi n f_o t)$ and $\sin(2\pi n f_o t)$ in the Fourier series of Eq. (4.1) using the right-hand side of Eq. (4.21) yields

$$\begin{aligned}
g(t) &= A_0 + \sum_{n=1}^{\infty}\left[a_n\cos(2\pi n f_o t) + b_n\sin(2\pi n f_o t)\right]\\
&= A_0 + \sum_{n=1}^{\infty}\tfrac{1}{2}\left[a_n\left(e^{j2\pi n f_o t} + e^{-j2\pi n f_o t}\right) - jb_n\left(e^{j2\pi n f_o t} - e^{-j2\pi n f_o t}\right)\right]\\
&= A_0 + \sum_{n=1}^{\infty}\left\{\tfrac{1}{2}(a_n - jb_n)e^{j2\pi n f_o t} + \tfrac{1}{2}(a_n + jb_n)e^{-j2\pi n f_o t}\right\}
\end{aligned}$$

Note that $A_0$, $a_n$, and $b_n$ in the above equation are real coefficients if g(t) is real, and they are given by the following equations (derived in the previous section)

$$A_0 = \frac{1}{T}\int_{-T/2}^{T/2} g(t)\,dt;\qquad a_n = \frac{2}{T}\int_{-T/2}^{T/2} g(t)\cos(2\pi n f_o t)\,dt;\qquad b_n = \frac{2}{T}\int_{-T/2}^{T/2} g(t)\sin(2\pi n f_o t)\,dt$$

The factor $\tfrac{1}{2}(a_n - jb_n)$ in the first term of the above summation for g(t) will in general be a complex coefficient (which we will denote as $C_n$) given by

$$\begin{aligned}
C_n &= \tfrac{1}{2}(a_n - jb_n)\\
&= \frac{1}{T}\int_{-T/2}^{T/2} g(t)\cos(2\pi n f_o t)\,dt - j\frac{1}{T}\int_{-T/2}^{T/2} g(t)\sin(2\pi n f_o t)\,dt\\
&= \frac{1}{T}\int_{-T/2}^{T/2} g(t)\left[\cos(2\pi n f_o t) - j\sin(2\pi n f_o t)\right]dt\\
&= \frac{1}{T}\int_{-T/2}^{T/2} g(t)e^{-j2\pi n f_o t}\,dt
\end{aligned}$$
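Setting n = 0 in this equation, we see that

$$C_0 = \frac{1}{T}\int_{-T/2}^{T/2} g(t)e^{-j2\pi \times 0 \times f_o t}\,dt = \frac{1}{T}\int_{-T/2}^{T/2} g(t)\,dt = A_0$$

Therefore, substituting $C_n$ into the summation for g(t) yields

$$\begin{aligned}
g(t) &= A_0 + \sum_{n=1}^{\infty}\left\{\tfrac{1}{2}(a_n - jb_n)e^{j2\pi n f_o t} + \tfrac{1}{2}(a_n + jb_n)e^{-j2\pi n f_o t}\right\}\\
&= \sum_{n=0}^{\infty} C_n e^{j2\pi n f_o t} + \sum_{n=1}^{\infty}\tfrac{1}{2}(a_n + jb_n)e^{-j2\pi n f_o t}
\end{aligned}$$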
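Making the substitution n = −m in the second summation (which means that m = −n and therefore at the summation limits when n = 1, m = −1, and when n = ∞, m = −∞)

$$g(t) = \sum_{n=0}^{\infty} C_n e^{j2\pi n f_o t} + \sum_{m=-1}^{-\infty}\tfrac{1}{2}(a_{-m} + jb_{-m})e^{j2\pi m f_o t}$$

To determine how $a_{-m}$ and $b_{-m}$ are, respectively, related to $a_m$ and $b_m$, we simply use the expressions for these coefficients given above, whence we find that

$$\begin{aligned}
a_{-m} &= \frac{2}{T}\int_{-T/2}^{T/2} g(t)\cos(2\pi(-m)f_o t)\,dt = \frac{2}{T}\int_{-T/2}^{T/2} g(t)\cos(2\pi m f_o t)\,dt = a_m\\
b_{-m} &= \frac{2}{T}\int_{-T/2}^{T/2} g(t)\sin(2\pi(-m)f_o t)\,dt = -\frac{2}{T}\int_{-T/2}^{T/2} g(t)\sin(2\pi m f_o t)\,dt = -b_m
\end{aligned}$$

That is

$$a_{-m} = a_m;\qquad b_{-m} = -b_m \tag{4.22}$$

and hence

$$\begin{aligned}
g(t) &= \sum_{n=0}^{\infty} C_n e^{j2\pi n f_o t} + \sum_{m=-1}^{-\infty}\tfrac{1}{2}(a_{-m} + jb_{-m})e^{j2\pi m f_o t}\\
&= \sum_{n=0}^{\infty} C_n e^{j2\pi n f_o t} + \sum_{m=-1}^{-\infty}\tfrac{1}{2}(a_m - jb_m)e^{j2\pi m f_o t}\\
&= \sum_{n=0}^{\infty} C_n e^{j2\pi n f_o t} + \sum_{m=-1}^{-\infty} C_m e^{j2\pi m f_o t}
\end{aligned}$$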
Returning to our (preferred) use of index n (rather than m) in the second summation, we finally obtain

$$\begin{aligned}
g(t) &= \sum_{n=0}^{\infty} C_n e^{j2\pi n f_o t} + \sum_{n=-1}^{-\infty} C_n e^{j2\pi n f_o t}\\
&= \sum_{n=-\infty}^{\infty} C_n e^{j2\pi n f_o t}
\end{aligned}$$

which completes our work. So, what have we achieved? We have established that any periodic signal g(t) of period T may be expressed as a sum of complex exponentials

$$g(t) = \sum_{n=-\infty}^{\infty} C_n e^{j2\pi n f_o t} = \sum_{n=-\infty}^{\infty} |C_n| e^{j(2\pi n f_o t + \phi_n)} \tag{4.23}$$
where $\phi_n$ is the angle of the complex coefficient $C_n$, which is determined from g(t) by integration over one cycle as

$$C_n = \frac{1}{T}\int_{-T/2}^{T/2} g(t)e^{-j2\pi n f_o t}\,dt;\qquad n = \cdots, -2, -1, 0, 1, 2, \cdots \tag{4.24}$$

This coefficient is related to the coefficients $a_n$ and $b_n$ of the sinusoidal Fourier series (discussed in the previous section) by

$$C_n = \tfrac{1}{2}(a_n - jb_n) \tag{4.25}$$

Since the summation for g(t) in Eq. (4.23) is from −∞ to ∞, the index n occurs in pairs n = ±1, ±2, ±3, …, and so do the coefficients $(C_{-1}, C_1)$, $(C_{-2}, C_2)$, $(C_{-3}, C_3)$, … Considering, for example, n = 2, the pair of coefficients is

$$\begin{aligned}
C_2 &= \tfrac{1}{2}(a_2 - jb_2)\\
C_{-2} &= \tfrac{1}{2}(a_{-2} - jb_{-2}) = \tfrac{1}{2}(a_2 + jb_2) = C_2^*
\end{aligned}$$

where we have made use of Eq. (4.22), and the asterisk denotes complex conjugation. In general

$$C_{-n} = C_n^* = \tfrac{1}{2}(a_n + jb_n),\qquad n = 1, 2, 3, \cdots \tag{4.26}$$

Corresponding coefficients on either side of the summation in Eq. (4.23) are therefore complex conjugates. This means that we only need to calculate the coefficients for the positive side of the summation (n = 1, 2, 3, …). The corresponding coefficients for the negative side (n = −1, −2, −3, …) then follow simply by complex conjugation (i.e. changing the sign of the imaginary part). It is useful to note that replacing n with −n in Eq. (4.24) has the effect of complex conjugating the right-hand side. This provides alternative proof of the complex conjugate relationship between $C_n$ and $C_{-n}$ for real g(t). Note also that if g(t) is an even function then $b_n = 0$ (according to Eq. (4.16)) and $C_n$ will be exclusively real-valued. But if g(t) is odd then $a_n = 0$ and $C_n$ is exclusively imaginary-valued, with $C_0 = 0$. That is

$$\begin{aligned}
C_n &= C_{-n} = a_n/2, && g(t)\ \text{even}\\
C_n &= -C_{-n} = -jb_n/2, && g(t)\ \text{odd}
\end{aligned} \tag{4.27}$$
It is worth emphasising that the Fourier series technique provides alternative methods of describing the same periodic signal, in the time domain by specifying the waveform g(t) or in the frequency domain by specifying the Fourier coefficients Cn . Once we have a specification in one domain, the specification of the signal in the other domain may be fully determined. So, given g(t), we may obtain Cn using Eq. (4.24), a process known as Fourier analysis; and given Cn , we may fully reconstruct g(t) using Eq. (4.23), a process known as Fourier synthesis.
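The analysis and synthesis steps just described can be tried numerically. The sketch below is illustrative only: the test signal, its period, and the number of samples per cycle are all assumptions, and the integral of Eq. (4.24) is approximated by an average over uniformly spaced samples of one cycle.

```matlab
T  = 1e-3;  fo = 1/T;            % assumed period and fundamental frequency
M  = 2000;                       % assumed number of samples per period
t  = (0:M-1)*T/M - T/2;          % one cycle, -T/2 <= t < T/2
g  = 3 + 4*cos(2*pi*fo*t + pi/3) + 2*sin(2*pi*3*fo*t);   % hypothetical real test signal

N  = 5;  n = -N:N;               % coefficient indices -N..N
C  = zeros(size(n));
for k = 1:numel(n)
    % Fourier analysis, Eq. (4.24): Cn = (1/T) * integral of g(t) exp(-j 2 pi n fo t) dt,
    % approximated here by the mean of the integrand over one cycle
    C(k) = mean(g .* exp(-1j*2*pi*n(k)*fo*t));
end

% Fourier synthesis, Eq. (4.23): rebuild the signal from its coefficients
gs = zeros(size(t));
for k = 1:numel(n)
    gs = gs + C(k)*exp(1j*2*pi*n(k)*fo*t);
end

symErr = max(abs(C - conj(fliplr(C))));   % C(-n) should equal conj(C(n)) for real g(t)
recErr = max(abs(gs - g));                % synthesised signal should match g(t)
```

For this test signal the only non-zero coefficients are at n = 0, ±1, and ±3, and the imaginary parts of the conjugate pairs cancel in the synthesis, so `gs` is real to within round-off.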
4.2.3 Amplitude and Phase Spectra

Insight into the utility of the sinusoidal and complex exponential forms of the Fourier series and how each form is interpreted and visualised to extract practical information about the frequency content, bandwidth, and power of the signal may be gained by employing the concept of phasors, which was introduced in Chapter 2. Figure 4.7a shows a circle of unit radius |EG| = 1, centred in the complex plane with real and imaginary axes shown. Considering the triangle EFG with angle θ at E, it follows by definition of the cosine and sine functions that |EF| = cos θ, |FG| = sin θ.

A few comments on basic arithmetic are in order here. Every number (in the complex plane) has both magnitude and direction or angle. For example, +5 has magnitude 5 and angle 0°, denoted 5∠0° (pronounced '5 angle 0'); −5 = 5∠180°; j5 = 5∠90°; and (in Figure 4.7a) EG = |EG|∠θ; etc. The correct addition and subtraction of numbers always respects their angles, although phasor addition (which is basically the same process) is often treated as if it were something special. It isn't! For example, 4 − 3 = 1 because we add 3∠180° to 4∠0° by moving (from the origin) through 4 units in the direction of 0° to get to a point, and then moving from that point through 3 units in the direction of 180° to complete the operation, which takes us to the point 1∠0° as the answer. Without this respect for the angle of each number, we could have ended up with the wrong result, such as 7∠0° or 5∠36.87° (if −3 were added as if it had angle 0° or 90°). Notice therefore in Figure 4.7a that

$$|EG|\angle\theta = 1\angle\theta = |EF|\angle 0^\circ + |FG|\angle 90^\circ = \cos\theta + j\sin\theta = e^{j\theta}\qquad\text{(by Euler's formula)}$$

Figure 4.7 Concept of positive and negative frequencies: (a) Unit radius in complex plane; (b) Phasor or complex exponential function at positive frequency; (c) Phasor at negative frequency.
This means that $Ae^{j\theta}$, or A exp(jθ), represents a complex number A∠θ having magnitude A and angle θ. In Figure 4.7b and c, this idea is extended to a complex exponential function by making θ a function of time as follows: in (b), the radial arm or phasor of magnitude A rotates counterclockwise at a rate of f cycles per second, starting initially at angle ϕ (radian) at time t = 0. Thus, ϕ is the phase (i.e. initial angle) of the function. Angular displacement is by convention positive in the counterclockwise direction, so this phasor is said to have a positive frequency f Hz (or positive angular frequency 2πf rad/s), the angle of the phasor at any time instant t will be θ = 2πft + ϕ, and this variation produces a complex exponential function

$$g_1(t) = Ae^{j(2\pi ft+\phi)} = A\cos(2\pi ft + \phi) + jA\sin(2\pi ft + \phi) \tag{4.28}$$

which has constant magnitude A, constant positive frequency f, constant phase ϕ, and sinusoidally varying real and imaginary parts. In Figure 4.7c the rotation is clockwise at f cycles/sec starting initially at −ϕ, so (since clockwise angular displacement is negative), the angle of the phasor at time t is −(2πft + ϕ) ≡ 2π(−f)t − ϕ, and this variation produces a complex exponential function

$$g_2(t) = Ae^{j(2\pi(-f)t-\phi)} = A\cos(2\pi ft + \phi) - jA\sin(2\pi ft + \phi) \tag{4.29}$$
which has constant magnitude A and negative frequency −f. The two complex exponentials $g_1(t)$ and $g_2(t)$, produced by phasors counter-rotating at equal rate, are complex conjugates which, when combined (i.e. by adding Eqs. (4.28) and (4.29) and halving the result), yield

$$A\cos(2\pi ft + \phi) = \frac{A}{2}e^{j(2\pi ft+\phi)} + \frac{A}{2}e^{j(2\pi(-f)t-\phi)} \tag{4.30}$$

Figure 4.7b and c and Eq. (4.30) show that:

● The complex exponential function $\frac{A}{2}e^{j(2\pi ft+\phi)}$ has at all times a real and an imaginary component, and corresponds to a single phasor of amplitude A/2 that rotates counterclockwise at f cycles per second in the complex plane, starting initially at angle (i.e. orientation relative to the +x axis reference direction) equal to ϕ. This function is said to have phase ϕ and positive frequency f (in view of the convention for angular measure that regards counterclockwise angular displacement as positive and clockwise angular displacement as negative). Figure 4.8a gives a frequency domain view of this complex exponential, depicting its amplitude spectrum which contains a single vertical line of height A/2 located at point f along the frequency axis, and its phase spectrum which has a single vertical line of height ϕ at f.
● The complex exponential function $\frac{A}{2}e^{-j(2\pi ft+\phi)}$ corresponds to a single phasor of amplitude A/2 that rotates clockwise at f cycles per second in the complex plane, starting initially at angle −ϕ. This function is said to have phase −ϕ and (negative) frequency −f. Its amplitude spectrum contains a single vertical line of height A/2 located at point −f along the frequency axis, and its phase spectrum contains a single vertical line of height −ϕ at frequency −f. See Figure 4.8b.
● The sinusoidal function A cos(2πft + ϕ) consists of two counter-rotating phasors or complex exponentials, each of amplitude A/2. One phasor has phase ϕ and rotates counterclockwise at f cycles/second (which corresponds to positive frequency f). The other phasor has phase −ϕ and rotates clockwise at f cycles/second (which corresponds to negative frequency −f). When these two phasors are added, their imaginary parts cancel, producing a resultant function that is always real-valued. Therefore, as shown in Figure 4.8c, the amplitude spectrum of the sinusoid A cos(2πft + ϕ) contains two lines of height A/2 at locations ±f; and its phase spectrum also contains two lines, one of height ϕ at f, and the other of height −ϕ at −f.

Figure 4.8 Frequency domain view of sinusoidal and complex exponential functions. [Amplitude and phase spectra of: (a) (A/2) exp(j(2πft + ϕ)); (b) (A/2) exp(j(−2πft − ϕ)); (c) A cos(2πft + ϕ); (d) single-sided spectrum of A cos(2πft + ϕ).]
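The cancellation of imaginary parts described in the last bullet can be checked numerically. The following is a minimal sketch in which the amplitude, frequency, and phase values are arbitrary assumptions chosen only for illustration.

```matlab
A = 2;  f = 50;  phi = pi/6;            % assumed amplitude, frequency (Hz), phase (rad)
t = linspace(0, 2/f, 401);              % time instants covering two cycles

g1 = (A/2)*exp( 1j*(2*pi*f*t + phi));   % counterclockwise phasor: frequency +f, phase +phi
g2 = (A/2)*exp(-1j*(2*pi*f*t + phi));   % clockwise phasor: frequency -f, phase -phi
s  = g1 + g2;                           % right-hand side of Eq. (4.30)

maxImag = max(abs(imag(s)));                            % imaginary parts cancel
maxDiff = max(abs(real(s) - A*cos(2*pi*f*t + phi)));    % real parts add up to the sinusoid
```

Both `maxImag` and `maxDiff` should be at round-off level, which is the numerical counterpart of Figure 4.8c: two lines of height A/2 at ±f with phases ±ϕ.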
It should therefore be stressed that negative frequency is not merely an abstract mathematical concept but a practical quantity with physical meaning. Negative frequency is the rate of change of angle in the clockwise direction exhibited by the clockwise-rotating phasor or complex exponential exp(−j2𝜋ft) – assuming f > 0, whereas positive frequency is the rate of change of angle in the counterclockwise direction exhibited by the
counterclockwise-rotating phasor or complex exponential exp(j2πft). A sinusoidal signal A cos(2πft + ϕ) is composed of two complex exponentials of amplitude A/2 that vary in sync at frequencies ±f so that their real parts reinforce at all times and their imaginary parts cancel. This requires the complex exponentials to have respective phases ±ϕ. The correct and complete spectrum of this sinusoid is therefore as shown in Figure 4.8c for amplitude and phase. However, the spectrum of the sinusoid A cos(2πft + ϕ) is often reduced for convenience to the single-sided format shown in Figure 4.8d, where its amplitude spectrum is depicted as a single spectral line of height A at positive frequency f, and its phase spectrum as a single line of height ϕ at f.

4.2.3.1 Double-sided Spectrum
A double-sided spectrum provides amplitude and phase information on both the positive and negative frequency content of a signal, which means all the complex exponentials that constitute the signal. This double-sided spectral representation is really the accurate and unabridged frequency domain view of a signal and leads to a lot of mathematical simplification when dealing, for example, with passband transmission systems in which the signal is translated to a different frequency band. The complex exponential form of the Fourier series, given by Eq. (4.23), indicates that a periodic signal g(t) of period T is made up of a DC component $C_0$ and pairs of complex exponentials

$$C_n e^{j2\pi n f_o t};\qquad n = \pm 1, \pm 2, \pm 3, \cdots;\qquad f_o = 1/T$$

where $C_n$ is in general complex-valued, having magnitude $|C_n|$ and angle $\phi_n$, so it may be expressed as

$$C_n = |C_n|\angle\phi_n = |C_n|e^{j\phi_n}$$

which means that the complex exponentials are of the form

$$|C_n|e^{j\phi_n}e^{j2\pi n f_o t} = |C_n|e^{j(2\pi n f_o t+\phi_n)};\qquad n = \pm 1, \pm 2, \pm 3, \cdots$$

revealing that their amplitude is $|C_n|$, phase is $\phi_n$, and frequency is $nf_o$. From Eqs. (4.26), (4.25), and (4.13), it follows that these amplitudes and phases satisfy the relations

$$|C_n| = |C_{-n}| = \tfrac{1}{2}\sqrt{a_n^2 + b_n^2} = \tfrac{1}{2}A_n;\qquad n > 0 \tag{4.31}$$

where $A_n$ ≡ amplitude of the nth harmonic in the sinusoidal form of Fourier series, and

$$\begin{aligned}
\phi_n &= \text{Angle of } \left\{C_n \equiv \tfrac{1}{2}(a_n - jb_n)\right\}
= \begin{cases}
\alpha, & b_n \le 0,\ a_n \ge 0\\
-\alpha, & b_n > 0,\ a_n \ge 0\\
180^\circ - \alpha, & b_n \le 0,\ a_n < 0\\
\alpha - 180^\circ, & b_n > 0,\ a_n < 0
\end{cases};\qquad \alpha = \tan^{-1}\left(\frac{|b_n|}{|a_n|}\right)\\
\phi_{-n} &= \text{Angle of } \left\{C_{-n} = \tfrac{1}{2}(a_n + jb_n)\right\} = -\phi_n
\end{aligned} \tag{4.32}$$
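As a check on the quadrant logic of Eq. (4.32), the piecewise rule can be compared with the angle returned directly for $C_n = \tfrac{1}{2}(a_n - jb_n)$. This is a sketch only: the coefficient values below are hypothetical and were picked to exercise each of the four cases, plus the cases $a_n = 0$ and $b_n = 0$.

```matlab
a = [ 3  3 -3 -3  0  2];        % hypothetical a_n values
b = [-4  4 -4  4 -1  0];        % hypothetical b_n values

alpha = atand(abs(b)./abs(a));  % alpha of Eq. (4.32) in degrees (90 when a_n = 0)
phi = zeros(size(a));
k = (b <= 0 & a >= 0);  phi(k) =  alpha(k);         % first case
k = (b >  0 & a >= 0);  phi(k) = -alpha(k);         % second case
k = (b <= 0 & a <  0);  phi(k) = 180 - alpha(k);    % third case
k = (b >  0 & a <  0);  phi(k) = alpha(k) - 180;    % fourth case

phiDirect = angle((a - 1j*b)/2)*180/pi;   % angle of Cn computed directly, degrees
maxErr = max(abs(phi - phiDirect));       % the two should agree
```

In practice the direct `angle` call is the more convenient route, since it resolves the quadrant without any case analysis.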
Note that this phase $\phi_n$ is the same as the phase of the nth harmonic in the sinusoidal form of Fourier series. We may now summarise the features of the double-sided spectrum of a periodic signal g(t) as follows:

● The double-sided spectrum consists of a DC component and pairs of spectral lines at negative and positive frequencies $\pm nf_o$, for n = 1, 2, 3, …
● The spectrum is a discrete line spectrum, with a constant gap or spacing between adjacent spectral lines equal to the fundamental frequency $f_o$ of the periodic signal. This spacing depends only on the period of the signal according to $f_o = 1/T$.
● Each spectral line in a double-sided amplitude spectrum (DSAS) represents a complex exponential component of the signal g(t). The height and location (along the frequency axis) of a spectral line respectively specify the amplitude and frequency of the complex exponential that the line represents. The DSAS is therefore a plot of $|C_n|$ against $nf_o$ for n = …, −3, −2, −1, 0, 1, 2, 3, …, which, in view of Eq. (4.31), is an even function of frequency if the signal g(t) is real.
● The double-sided phase spectrum (DSPS) of g(t) is a plot of the phase $\phi_n$ of the complex exponential components of g(t) against $nf_o$, n = …, −3, −2, −1, 0, 1, 2, 3, … If g(t) is a real signal then $\phi_{-n} = -\phi_n$ and the phase spectrum is therefore an odd function of frequency. Note that the phase $\phi_0$ of the DC component $C_0$ will be either 0° if $C_0 \ge 0$ or 180° if $C_0$ is negative.

Figure 4.9 (a) Double-sided amplitude spectrum; (b) Double-sided phase spectrum; (c) Single-sided amplitude spectrum; (d) Single-sided phase spectrum.
Figure 4.9a and b show the double-sided amplitude and phase spectra, respectively. Since each spectral line in Figure 4.9a is a complex exponential of normalised power $|C_n|^2$, according to Eq. (3.109), the total normalised power of g(t) may be determined in the frequency domain by summing the powers of all the complex exponential components represented in its amplitude spectrum, or in the time domain (discussed in Chapter 3) by averaging the squared signal over one cycle. Thus

$$P = \sum_{n=-\infty}^{\infty} |C_n|^2 = \frac{1}{T}\int_{-T/2}^{T/2} g^2(t)\,dt \tag{4.33}$$
4.2.3.2 Single-sided Spectrum
We emphasise from the outset that the single-sided spectrum is a contraction of the true spectrum of real signals that is commonly used simply for convenience. Because the amplitude spectrum of real signals is even and the phase spectrum is odd, the right-hand side of the spectrum (for n ≥ 0) contains all the amplitude and phase
information needed to fully represent a real signal in the frequency domain. However, we cannot simply delete the left half of a DSAS (Figure 4.9a), as this would eliminate half of the ac power content of the signal. To ensure that it contains the full power as well as the spectral information of the signal, a single-sided spectrum is a representation of the sinusoidal Fourier series

$$g(t) = A_0 + \sum_{n=1}^{\infty} A_n\cos(2\pi n f_o t + \phi_n)$$

using one spectral line per harmonic sinusoid. A single-sided amplitude spectrum (SSAS) is a plot of the values of $|A_n|$ in the above equation against $nf_o$ for n = 0, 1, 2, 3, …, as shown in Figure 4.9c. The SSAS is a discrete line spectrum covering only positive frequencies with a constant gap or spacing between adjacent spectral lines equal to the fundamental frequency $f_o$ of g(t). Using Eq. (4.31), the height of the spectral line at $f = nf_o$ in a SSAS is related to that of a DSAS by

$$|A_n| = \begin{cases} |C_0|, & n = 0\\ 2|C_n|, & n \ge 1 \end{cases} \tag{4.34}$$

A single-sided phase spectrum (SSPS), shown in Figure 4.9d, is a discrete line plot of the values of $\phi_n$ (confined to the range −180° to 180° and including an additional phase shift of 180° if $A_n < 0$) against $nf_o$ for n = 0, 1, 2, 3, …
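Equation (4.34) and the accompanying phase rule can be sketched as a short conversion from double-sided coefficients to single-sided line heights. The coefficient values below are hypothetical, chosen so that $C_{-n} = C_n^*$ as required for a real signal and so that the DC term is negative.

```matlab
% Hypothetical double-sided coefficients C(n) for n = -3..3 of a real signal
n = -3:3;
C = [0.5*exp(-1j*pi/4), 0, 2*exp(1j*pi/3), -1.5, 2*exp(-1j*pi/3), 0, 0.5*exp(1j*pi/4)];

pos  = (n >= 0);                 % keep the DC line and the positive-frequency lines
An   = abs(C(pos));              % |C0| for the DC line ...
An(2:end) = 2*An(2:end);         % ... and 2|Cn| for n >= 1, as in Eq. (4.34)
phin = angle(C(pos))*180/pi;     % single-sided phase in degrees (180 for a negative DC term)
phin(An == 0) = 0;               % zero-amplitude lines are given zero phase
```

Here `An` holds the SSAS heights for n = 0, 1, 2, 3 and `phin` the corresponding single-sided phases; the negative-frequency lines are not needed because they carry no independent information for a real signal.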
From the above discussion and Figure 4.9, we see that a double-sided spectrum may be converted into a single-sided spectrum by transferring all spectral lines at negative frequency locations in the amplitude spectrum to add to their counterparts at positive frequency locations, and by discarding all negative frequency lines in the phase spectrum. Conversely, we may convert a single-sided spectrum into double-sided by taking each spectral line at $f = nf_o$, n = 1, 2, 3, … (representing the nth harmonic sinusoidal component of g(t) and having amplitude $A_n$ and phase $\phi_n$) and dividing it equally to a pair of locations on the frequency axis, one at the positive frequency $f = nf_o$ having amplitude $A_n/2$ and phase $\phi_n$; the other at the negative frequency $f = -nf_o$ having amplitude $A_n/2$ and phase $-\phi_n$. Note that, although the DC component $A_0$ is untouched in this conversion, it may in fact be viewed as a pair of components $A_0/2$ at f = +0 and $A_0/2$ at f = −0.

For later application to signal bandwidth and power analysis, we note here that the total normalised power $P_N$ in the DC and first N harmonics of a signal may be determined from the SSAS (where each line represents a sinusoid of normalised power $A_n^2/2$) or from the DSAS (where each line represents a complex exponential of normalised power $|C_n|^2$), except for the DC component with normalised power $A_0^2$

$$P_N = A_0^2 + \frac{1}{2}\sum_{n=1}^{N} A_n^2 = C_0^2 + 2\sum_{n=1}^{N} |C_n|^2 \tag{4.35}$$
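Equation (4.35) can be illustrated with a rectangular pulse train, whose harmonic amplitudes have the form $2Ad\,\mathrm{sinc}(nd)$ as in Eq. (4.17). The sketch below assumes a pulse amplitude of 100, a duty cycle of 1/4, and N = 12; these values, and the use of the time-domain mean square $A^2 d$ of a unipolar pulse train as the reference, are assumptions made for illustration.

```matlab
A = 100;  d = 1/4;  N = 12;      % assumed pulse amplitude, duty cycle, number of harmonics
n  = 1:N;
A0 = A*d;                        % DC component
An = abs(2*A*d*sinc(n*d));       % SSAS line heights; sinc(x) = sin(pi*x)/(pi*x)
Cn = An/2;                       % DSAS line heights for n >= 1

P_ss = A0^2 + 0.5*sum(An.^2);    % Eq. (4.35) evaluated from the single-sided spectrum
P_ds = A0^2 + 2*sum(Cn.^2);      % Eq. (4.35) evaluated from the double-sided spectrum
P_total = A^2*d;                 % mean square of the unipolar pulse train over one cycle
frac = P_ss/P_total;             % fraction of total power in DC plus first N harmonics
```

The two routes give the same `P_N`, and `frac` indicates how much of the signal power is captured by the first N harmonics.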
Finally, it is worth noting that the relationship between double-sided and single-sided spectra may be heuristically demonstrated as follows, without involving complex exponential functions as done hitherto. Since cos(θ) = cos(−θ), we may write

$$A_n\cos(\theta) = \frac{A_n}{2}\cos(\theta) + \frac{A_n}{2}\cos(-\theta)$$

Employing this identity in the sinusoidal form of the Fourier series, with $\theta \equiv 2\pi n f_o t + \phi_n$, leads to

$$\begin{aligned}
g(t) &= A_0 + \sum_{n=1}^{\infty} A_n\cos(2\pi n f_o t + \phi_n);\qquad\text{(single-sided)}\\
&= A_0 + \sum_{n=1}^{\infty}\left[\frac{A_n}{2}\cos(2\pi n f_o t + \phi_n) + \frac{A_n}{2}\cos(2\pi(-nf_o)t - \phi_n)\right]\\
&= A_0 + \sum_{\substack{n=-\infty\\ n\neq 0}}^{\infty}\frac{A_n}{2}\cos(2\pi n f_o t + \phi_n);\qquad \phi_{-n} = -\phi_n;\qquad\text{(double-sided)}
\end{aligned} \tag{4.36}$$
which, with the convention of using one spectral line per sinusoid, shows that the sinusoid $A_n\cos(2\pi n f_o t + \phi_n)$ is converted from a single spectral line of height $A_n$ at $f = nf_o$ in a SSAS to a pair of spectral lines of height $A_n/2$ at $f = \pm nf_o$ in a DSAS. In its phase spectrum, the sinusoid is also converted from a single line $\phi_n$ at $f = nf_o$ to a pair of lines of respective values $\pm\phi_n$ at $f = \pm nf_o$.

A note of caution is necessary here to avoid a common misunderstanding. Treating the first line of Eq. (4.36) as leading to a single-sided spectrum is not mathematically sound. This is a practice adopted simply for convenience in which the amplitude spectrum of the sinusoid $A_n\cos(2\pi n f_o t + \phi_n)$ is represented as a single spectral line of height $A_n$ at frequency $f = nf_o$; and its phase spectrum as a single spectral line of height $\phi_n$ at $f = nf_o$. A correct and mathematically rigorous interpretation has been discussed at length in the preceding pages, namely (i) a real signal has a double-sided spectrum; (ii) each spectral line represents a complex exponential function, rather than a sinusoidal function; (iii) the amplitude spectrum of the sinusoid $A_n\cos(2\pi n f_o t + \phi_n)$ consists of a pair of spectral lines of value $A_n/2$ at frequencies $f = \pm nf_o$; and its phase spectrum has corresponding values $\pm\phi_n$ at $f = \pm nf_o$.

Worked Example 4.4 Parameters of Complex Exponential Functions
The purpose of this worked example is to use a series of simple exercises to ensure you have complete clarity about complex exponential functions. Let us determine the following:

(a) The frequency and phase of $g(t) = \sqrt{-10}\exp[j(t - \pi/3)]$.
(b) The frequency and phase of $g(t) = \exp\left(j\frac{\pi}{2}t\right)$.
(c) The frequency of $g(t) = 20e^{-j(200\pi t-\pi/6)}$ and the angle, magnitude, and real part of this function at time t = 2.5 ms.
(d) The parameters of $e^{j}$.
(e) The parameters of $g(t) = B_0e^{-\lambda t}$.
(f) The parameters of $g(t) = 50\exp[(j30\pi - 2)t]$.

(a) We start with a straightforward manipulation to move all phase information from the coefficient $\sqrt{-10}$ into the angle of the complex exponential function g(t)

$$\begin{aligned}
g(t) &= \sqrt{-10}\exp[j(t - \pi/3)]\\
&= j\sqrt{10}\,e^{j(t-\pi/3)} = (\sqrt{10}\angle\pi/2)\,e^{j(t-\pi/3)}\\
&= (\sqrt{10}\,e^{j\pi/2})\,e^{j(t-\pi/3)} = \sqrt{10}\,e^{j(t+\pi/2-\pi/3)}\\
&= \sqrt{10}\,e^{j(t+\pi/6)}
\end{aligned}$$

Now comparing the above expression to the standard form of a complex exponential function $Ae^{j(2\pi ft+\phi)}$, where

A > 0 ≡ Amplitude
f ≡ Frequency
ϕ ≡ Phase
2πft + ϕ ≡ Angle at time t

we see that, for the given signal, 2πf = 1, which yields frequency f as

$$f = \frac{1}{2\pi}\ \text{Hz} = 159.2\ \text{mHz}$$

And the phase is read as ϕ = π/6 rad = 30°.

(b) The given function compares as follows with the standard form

$$g(t) = \exp\left(j\frac{\pi}{2}t\right) = \exp\left[j\left(\frac{\pi}{2}t + 0\right)\right] \equiv A\exp[j(2\pi ft + \phi)]$$
Thus, phase ϕ = 0, and frequency f is obtained from

$$2\pi f = \frac{\pi}{2};\qquad\text{and hence } f = 0.25\ \text{Hz}$$

(c) Again, the given function compares as follows with the standard form

$$g(t) = 20e^{-j(200\pi t-\pi/6)} = 20e^{j(-200\pi t+\pi/6)} \equiv Ae^{j(2\pi ft+\phi)}$$

Thus, frequency f is given by 2πf = −200π, which implies f = −100 Hz. The angle at t = 2.5 ms is

$$\theta(t)\big|_{t=2.5\times 10^{-3}} = -200\pi \times 2.5 \times 10^{-3} + \pi/6 = -\pi/2 + \pi/6 = -\pi/3\ \text{radian} = -60^\circ$$

The magnitude of a complex exponential function is constant and in this case is |A| = 20 at all times. The real part of $Ae^{j(2\pi ft+\phi)}$ is A cos(2πft + ϕ), which for the given function at t = 2.5 ms is

$$20\cos(-200\pi \times 2.5 \times 10^{-3} + \pi/6) = 20\cos(-60^\circ) = 10$$

(d) The term $e^{j}$ is not a function of time. It is simply a complex number in the same form as $e^{j\theta}$ which equals 1∠θ. So, in this case

$$e^{j} = 1\angle 1\ \text{rad} = 1\angle(180/\pi)^\circ = 1\angle 57.3^\circ = \cos(57.3^\circ) + j\sin(57.3^\circ) = 0.5403 + j0.8415$$

(e) The function $g(t) = B_0e^{-\lambda t}$ is not a complex exponential function, seeing there is no j factor in its exponent. Rather, g(t) is a real function that decays exponentially with time from an initial value $g(0) = B_0$ at t = 0 to a final value g(∞) = 0 at t = ∞. An exponentially decaying function is often characterised by its time constant τ, which is the time taken for the function to reduce to 1/e (or $e^{-1}$) times its initial value. Here, τ = 1/λ, since $g(\tau) = B_0e^{-\lambda\tau} \equiv B_0e^{-1}$. Note that e ≈ 2.718281828459046, so 1/e = 36.788%.

(f) The given function can be rearranged as

$$g(t) = 50e^{(j30\pi-2)t} = 50e^{-2t}e^{j30\pi t} \equiv Ae^{j(2\pi ft+\phi)}$$

which reveals that it is an amplitude-modulated complex exponential function. Its amplitude decays exponentially from an initial value of 50 with a time constant of 0.5 s, whereas its frequency and phase are constant (i.e. unmodulated) and, respectively, f = 15 Hz, ϕ = 0°.
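Parts (a), (c), and (d) of this worked example can be verified with a few lines of MATLAB. The sketch below introduces its own variable names for illustration and simply evaluates the given expressions.

```matlab
% (a) move the phase of sqrt(-10) into the exponent
c    = sqrt(-10);                  % equals j*sqrt(10)
ampA = abs(c);                     % amplitude sqrt(10)
phiA = angle(c) - pi/3;            % pi/2 - pi/3 = pi/6 rad
fA   = 1/(2*pi);                   % from 2*pi*f = 1, in Hz

% (c) g(t) = 20 exp(-j(200*pi*t - pi/6)) evaluated at t = 2.5 ms
t    = 2.5e-3;
gC   = 20*exp(-1j*(200*pi*t - pi/6));
angC = angle(gC)*180/pi;           % angle in degrees
magC = abs(gC);                    % magnitude
reC  = real(gC);                   % real part

% (d) e^j is the fixed complex number of magnitude 1 and angle 1 rad
eJ   = exp(1j);
```

The values of `phiA`, `angC`, `magC`, `reC`, and `eJ` can be compared directly with the results obtained by hand in the text.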
Worked Example 4.5 Amplitude and Phase Spectra of Pulse Train

We wish to sketch the single-sided and double-sided amplitude and phase spectra of the unipolar RPT x(t) shown in Figure 4.10 up to the 12th harmonic and to discuss its features.

Figure 4.10 Worked Example 4.5. [Unipolar rectangular pulse train x(t) in volts: amplitude A = 100 V, pulse width τ, period T = 1 ms, duty cycle d = τ/T = 1/4.]

The Fourier series of the centred unipolar RPT g(t) in Figure 4.4a was derived in Worked Example 4.1 as given in Eq. (4.17). Since the given waveform x(t) is the centred waveform g(t) delayed by half a pulse width (τ/2), we make use of the relationship x(t) = g(t − τ/2) in Eq. (4.17) to obtain

$$\begin{aligned}
x(t) = g(t - \tau/2) &= Ad + 2Ad\sum_{n=1}^{\infty}\mathrm{sinc}(nd)\cos(2\pi n f_o (t - \tau/2))\\
&= Ad + 2Ad\sum_{n=1}^{\infty}\mathrm{sinc}(nd)\cos(2\pi n f_o t - n\pi d),\qquad\text{since } f_o\tau = d
\end{aligned}$$

Substituting the signal parameters (A = 100, d = 1/4) given in Figure 4.10 yields

$$x(t) = 25 + 50\sum_{n=1}^{\infty}\mathrm{sinc}(n/4)\cos(2\pi n f_o t - n\pi/4);\qquad f_o = 1/10^{-3} = 1000\ \text{Hz}$$

Therefore

$$A_0 = 25;\qquad \mathbb{A}_n = 50\,\mathrm{sinc}(n/4);\qquad A_n = |\mathbb{A}_n|;\qquad
\phi_n = \begin{cases}
-45n^\circ, & \mathbb{A}_n > 0\\
180^\circ - 45n^\circ, & \mathbb{A}_n < 0\\
0^\circ, & \mathbb{A}_n = 0
\end{cases}$$
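The spectral values of this example can also be generated from the complex coefficients. The sketch below assumes $C_n = Ad\,\mathrm{sinc}(nd)\,e^{-jn\pi d}$, which is half the signed amplitude $2Ad\,\mathrm{sinc}(nd)$ with the phase $-n\pi d$ placed in the exponent, as implied by the series above; the round-off threshold of 1e-9 is an assumption. Taking `abs` and `angle` of each coefficient then applies the 180° correction for negative amplitudes automatically.

```matlab
A = 100;  d = 1/4;  T = 1e-3;  fo = 1/T;   % parameters from Figure 4.10
n = -12:12;                                % harmonic indices, both sides

% Assumed complex coefficients implied by the series for x(t)
Cn = A*d*sinc(n*d).*exp(-1j*n*pi*d);
Cn(abs(Cn) < 1e-9) = 0;                    % assumed tolerance: clear round-off at the nulls

magC = abs(Cn);                            % double-sided amplitude spectrum |Cn|
phiC = angle(Cn)*180/pi;                   % double-sided phase spectrum, degrees

pos  = (n >= 1);
An   = 2*magC(pos);                        % single-sided amplitudes for n = 1..12
phin = phiC(pos);                          % single-sided phases for n = 1..12
```

Plotting `magC` and `phiC` against `n*fo` (for example with `stem`) gives the double-sided spectra, and plotting `An` and `phin` against the positive frequencies gives the single-sided spectra.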
Notice that (as discussed earlier) the amplitude spectrum $A_n$ is defined to be positive and an extra phase shift of 180° is added to the phase of the nth harmonic whenever its amplitude is negative. Also, the phase of a harmonic is set to zero if its amplitude is zero. Computed values are tabulated in Table 4.2 up to the 12th harmonic.

Table 4.2 Worked Example 4.5: spectral values for rectangular pulse train of Figure 4.10.

 Frequency    n     𝔸n =            −n × 45°    An = |𝔸n|    ϕn,     Cn =      C−n =     ϕ−n = −ϕn,
 nfo, kHz           50 sinc(n/4)                              deg     An/2      An/2      deg
 0            0     —               —           25           0       C0 = 25   —         0
 1            1     45.02           −45         45.02        −45     22.51     22.51     45
 2            2     31.83           −90         31.83        −90     15.92     15.92     90
 3            3     15.01           −135        15.01        −135    7.50      7.50      135
 4            4     0               −180        0            0       0         0         0
 5            5     −9.00           −225        9.00         −45     4.50      4.50      45
 6            6     −10.61          −270        10.61        −90     5.31      5.31      90
 7            7     −6.43           −315        6.43         −135    3.22      3.22      135
 8            8     0               −360        0            0       0         0         0
 9            9     5.00            −405        5.00         −45     2.50      2.50      45
 10           10    6.37            −450        6.37         −90     3.18      3.18      90
 11           11    4.09            −495        4.09         −135    2.05      2.05      135
 12           12    0               −540        0            0       0         0         0

To obtain the values listed in the last three columns of the table for the double-sided spectrum, the values of $A_n$ obtained as above were divided in two to give the positive and negative frequency coefficients $C_n$ and $C_{-n}$, and the phase of the positive frequency was negated to obtain the phase of the negative frequency component. Take care to see how the values of $\phi_n$ in column six of the table are derived from column four. For example, for the 11th harmonic, 360° is added to the value of −495° in column four to bring it into the required interval of −180° to 180°; and for the fifth harmonic, 180° is added to the value in column four because the amplitude (in column three) is negative at that harmonic.

Figure 4.11 shows a discrete line plot of $\phi_n$ and $\phi_{-n}$ against $\pm nf_o$ to produce the double-sided phase spectrum. The SSAS is shown in Figure 4.12. This is a discrete line plot of $A_n$ against $nf_o$. A plot of $C_n$ and $C_{-n}$ against $\pm nf_o$ gives the DSAS shown in Figure 4.13.

Figure 4.11 Worked Example 4.5: Double-sided phase spectrum of waveform x(t) in Figure 4.10 having amplitude 100 V, duty cycle 1/4, period 1 ms, and delay 1/8 ms relative to centred waveform. [Axes: phase ϕn in degrees versus frequency nfo from −12 kHz to 12 kHz.]

We can now make the following observations regarding the spectrum of an RPT.

Spectral Line Spacing: spectral lines occur at frequencies $f = 0, \pm f_o, \pm 2f_o, \pm 3f_o, \ldots$, where $f = f_o = 1/T$ is the fundamental frequency component, $f = nf_o$ is the nth harmonic frequency, and T is the period of the pulse train. The question is often asked whether the spectral line spacing $f_o$ is affected by the pulse width τ or duty cycle d of
the pulse train. It is true that we may write d 1 = T 𝜏 from which it may be noted that keeping d fixed while changing 𝜏, or keeping 𝜏 fixed while varying d, would alter f o . However, f o is changed only because each of these actions changes T. In other words, f o will change if and only if T changes; and changing d and/or 𝜏 will influence f o only if that change affects T. So, yes, spectral line spacing f o depends exclusively on period T. Spectral Nulls: the amplitude of the nth harmonic sinusoidal component is An = 2Ad sinc(nd), n = 1, 2, 3, … However, since (see Section 2.6.8) fo =
sinc(1) = sinc(2) = sinc(3) = · · · = 0 harmonics corresponding to nd = 1, 2, 3, …, (hence n = 1/d, 2/d, 3/d, …) will have zero amplitude. Therefore, every (1/d)th harmonic will be absent from the spectrum. For example, a square wave is an RPT with d = 0.5, so it will not contain every (1/0.5)th harmonic. That is, even harmonics 2f o , 4f o , 6f o , etc. are absent from a square
Figure 4.12 Worked Example 4.5: Single-sided amplitude spectrum of rectangular pulse train of amplitude 100 V, duty cycle 1/4, period 1 ms. (Axes: amplitude $A_n$ in V against frequency $nf_o$ in kHz, from 0 to 12.)

Figure 4.13 Worked Example 4.5: Double-sided amplitude spectrum of rectangular pulse train of amplitude 100 V, duty cycle 1/4, period 1 ms. (Axes: amplitude $|C_n|$ in V against frequency $nf_o$ in kHz, from −12 to 12; the spectral envelope is marked.)
wave. In the current example (Figure 4.10), d = 1/4, and this explains why there is a null in the amplitude spectrum (Figure 4.12) at every fourth harmonic $4f_o$, $8f_o$, $12f_o$, etc.

Density of Spectral Lines in each Lobe: since nulls occur at every 1/d harmonic count, and spectral line spacing is $f_o$, it means that one lobe (between adjacent nulls) of the spectral envelope has width $f_o(1/d) = 1/\tau$. The number of spectral lines contained in one lobe therefore increases as duty cycle d decreases. If 1/d is an integer then the relationship is exactly

$$ \text{Number of spectral lines in one lobe} = \frac{1}{d} - 1 \tag{4.37} $$

Keeping pulse width $\tau$ constant also fixes the spectral lobe width. Under this condition, if the period T of the RPT is increased (meaning that $d = \tau/T$ decreases) then spectral line spacing ($f_o = 1/T$) decreases and hence the number of spectral lines increases.

Null Bandwidth: the width of the main lobe in the positive frequency range of the amplitude spectrum is known as the null bandwidth $B_{null}$ of the pulse train. This is the range from f = 0 to the first null in the spectrum, which occurs at n = 1/d or $f = nf_o = (1/d)f_o = 1/\tau$. Thus

$$ B_{null} = \frac{1}{\tau} \tag{4.38} $$

and the bandwidth of the pulse train is inversely proportional to the pulse width. Narrower pulses are often required, for example, in time division multiplexing to pack several independent (information-bearing) pulses into a fixed sampling period T. We see therefore that such capacity increase is achieved at the expense of (at least) a proportionate increase in bandwidth requirements. The constant of inverse proportionality is 1 for rectangular-shaped pulses, as in Eq. (4.38), but will be a different value for other pulse shapes. Equation (4.38) is a special case of what is actually a more general phenomenon of the inverse relationship between time domain and frequency domain representations. A shorter duration event (such as a pulse) in the time domain is represented in the frequency domain using a wider bandwidth (or frequency range) and vice versa.
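The observations above can be checked numerically. The following is a minimal sketch in Python, assuming the parameters of Worked Example 4.5 (amplitude 100 V, duty cycle 1/4, period 1 ms, delay 1/8 ms); the function names `sinc`, `wrap_deg`, and `rpt_harmonic` are illustrative choices and are not taken from the text. It applies the two phase conventions described earlier (fold a negative amplitude into a 180° phase shift, and set the phase to zero at a null) and then evaluates Eqs. (4.37) and (4.38).

```python
import math

def sinc(x: float) -> float:
    """Normalised sinc, sin(pi*x)/(pi*x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def wrap_deg(angle: float) -> float:
    """Bring an angle in degrees into the interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0

def rpt_harmonic(n: int, A: float, d: float, delay_fraction: float):
    """Return (|A_n|, phi_n in degrees) for harmonic n >= 1 of a delayed RPT.

    delay_fraction is the delay relative to the centred waveform,
    expressed as a fraction of the period T.
    """
    signed = 2 * A * d * sinc(n * d)       # signed amplitude (column three)
    phase = -360.0 * n * delay_fraction    # phase due to the delay (column four)
    if abs(signed) < 1e-9:                 # spectral null: zero amplitude, zero phase
        return 0.0, 0.0
    if signed < 0:                         # fold the negative sign into the phase
        phase += 180.0
    return abs(signed), wrap_deg(phase)

A, d, T_ms, delay_ms = 100.0, 0.25, 1.0, 0.125
for n in range(1, 13):
    amp, phase = rpt_harmonic(n, A, d, delay_ms / T_ms)
    print(f"n={n:2d}  |A_n|={amp:6.2f} V  phi_n={phase:7.1f} deg  C_n={amp / 2:5.2f} V")

lines_per_lobe = round(1 / d) - 1          # Eq. (4.37), valid when 1/d is an integer
null_bandwidth_khz = 1 / (d * T_ms)        # Eq. (4.38) with tau = d*T, in kHz
print(lines_per_lobe, null_bandwidth_khz)
```

With these parameters the loop reproduces the amplitude and phase columns of Table 4.2, with three spectral lines per lobe and a null bandwidth of 4 kHz.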
4.2.4 Fourier Series Application to Selected Waveforms

Fourier analysis is applicable to a very wide range of signals that may be generated in the laboratory or encountered in telecommunications, and it gives valuable insight into the frequency content, bandwidth, and distribution of power among the various frequency components of the signal. Apart from the Dirichlet conditions discussed earlier (which will always be satisfied by such signals), the only other requirement for the applicability of the Fourier series technique is that the signal must be periodic. This latter condition can be satisfied by focusing on the signal interval of interest and assuming that outside of this interval the signal has the same repetitive waveform. In this section we wish to briefly demonstrate the versatility of the Fourier series tool by applying it to a selection of three special waveforms, namely the flat-top-sampled signal, the BASK signal, and the trapezoidal pulse train. Other interesting applications are left as exercises at the end of this chapter.

4.2.4.1 Flat-top-Sampled Signal
Sampling is an important signal processing task in telecommunications and Chapter 9 is devoted to its detailed treatment. Here we wish to demonstrate how the Fourier series technique may be employed to gain useful insight into the frequency domain effect of two types of sampling, namely flat-top sampling and instantaneous sampling. For this purpose, consider one cycle of the signal (in Figure 4.14a)

$$ \begin{aligned} g(t) &= 1 + 5\sin(2\pi f_o t) + 3\sin(4\pi f_o t) + 2\sin(8\pi f_o t) \\ &= 1 + 5\cos(2\pi f_o t - 90^\circ) + 3\cos(4\pi f_o t - 90^\circ) + 2\cos(8\pi f_o t - 90^\circ) \end{aligned} \tag{4.39} $$

which contains frequency components of respective amplitudes 1, 5, 3, and 2 units at DC, $f_o$, $2f_o$, and $4f_o$. This signal is sampled at regular intervals $T_s$ corresponding to a sampling frequency $F_s = 1/T_s$ chosen to be four times the maximum frequency component of g(t), i.e. $F_s = 16f_o$. There are therefore 16 samples in one cycle of g(t). Figure 4.14a depicts instantaneous sampling, which requires a switch to operate momentarily at each sampling instant $nT_s$, n = 0, 1, 2, …, to produce an impulsive sample of weight equal to $g(nT_s)$, which we may express in terms of the unit impulse as

$$ g_\delta(n) = g(nT_s)\,\delta(t - nT_s); \quad n = 0, 1, 2, \cdots; \quad T_s = T/16; \quad T = 1/f_o $$
Figure 4.14 One cycle of signal g(t) and its (a) instantaneously sampled signal $g_\delta(t)$; and (b) sample-and-hold signal $g_\Pi(t)$.
Figure 4.14b shows a more realisable sample-and-hold signal $g_\Pi(t)$ in which the value of g(t) at each sampling instant is held constant until the next sampling instant, thereby producing a staircase waveform $g_\Pi(t)$ as shown. The Fourier series of $g_\Pi(t)$ is given by Eq. (4.20), repeated below for convenience

$$ g_\Pi(t) = \sum_{k=1}^{m} \left\{ A_k d_k + 2A_k d_k \sum_{n=1}^{\infty} \mathrm{sinc}(nd_k) \cos\left[ 2\pi nf_o t - 2\pi n\left( \frac{d_k}{2} + \sum_{i=1}^{k-1} d_i \right) \right] \right\}; \quad d_k = \frac{\tau_k}{T}; \quad f_o = \frac{1}{T} $$

with number of steps m = 16, pulse width $\tau_k = T_s$ for all k, and hence $d_k = T_s/T = 1/16$, $A_k = g((k-1)T_s) = g((k-1)/16f_o)$. For example, using Eq. (4.39), $A_2$ is calculated as

$$ \begin{aligned} A_2 &= g(T_s) = g(1/16f_o) \\ &= 1 + 5\sin\left(2\pi f_o \times \frac{1}{16f_o}\right) + 3\sin\left(4\pi f_o \times \frac{1}{16f_o}\right) + 2\sin\left(8\pi f_o \times \frac{1}{16f_o}\right) \\ &= 1 + 5\sin\left(\frac{\pi}{8}\right) + 3\sin\left(\frac{\pi}{4}\right) + 2\sin\left(\frac{\pi}{2}\right) \\ &= 1 + 1.9134 + 2.1213 + 2 = 7.0347 \end{aligned} $$

The Fourier series for the sample-and-hold signal $g_\Pi(t)$ in Figure 4.14b is therefore

$$ g_\Pi(t) = \frac{1}{16}\sum_{k=1}^{16} \left\{ g\left(\frac{k-1}{16f_o}\right) + 2g\left(\frac{k-1}{16f_o}\right) \sum_{n=1}^{\infty} \mathrm{sinc}\left(\frac{n}{16}\right) \cos\left[ 2\pi nf_o t - \left(\frac{2k-1}{16}\right) n\pi \right] \right\} \tag{4.40} $$
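The step amplitudes $A_k$ that enter Eq. (4.40) are simply the samples of g(t). A minimal sketch of their evaluation follows, assuming a normalised fundamental frequency of 1 Hz (the sample values do not depend on this choice); the names `g`, `steps`, and `dc` are illustrative only.

```python
import math

def g(t: float, fo: float = 1.0) -> float:
    """Signal of Eq. (4.39)."""
    w = 2 * math.pi * fo * t
    return 1 + 5 * math.sin(w) + 3 * math.sin(2 * w) + 2 * math.sin(4 * w)

fo = 1.0                     # assumed fundamental frequency in Hz
m = 16                       # number of steps (samples) per cycle
Ts = 1 / (m * fo)            # sampling interval, Ts = T/16

# steps[k - 1] holds A_k = g((k - 1) Ts) for k = 1, ..., 16
steps = [g((k - 1) * Ts, fo) for k in range(1, m + 1)]
dc = sum(steps) / m          # average of the step amplitudes

print(f"A_2 = {steps[1]:.4f}")
print(f"DC  = {dc:.4f}")
```

The second step evaluates to 7.0347, in agreement with the hand calculation above, and the average of the 16 steps equals the DC component of g(t).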
This equation states that $g_\Pi(t)$ has DC component

$$ A_0 = \frac{1}{16}\sum_{k=1}^{16} g\left(\frac{k-1}{16f_o}\right) \tag{4.41} $$

and that the amplitude $A_n$ and phase $\phi_n$ of the nth harmonic of $g_\Pi(t)$ are obtained by adding m sinusoids, each of frequency $nf_o$, having respective amplitudes $A_{n,k}$ and phases $\phi_{n,k}$ given by

$$ A_{n,k} = \frac{1}{8}\, g\left(\frac{k-1}{16f_o}\right) \mathrm{sinc}\left(\frac{n}{16}\right); \quad \phi_{n,k} = \left(\frac{1-2k}{16}\right) n\pi; \quad k = 1, 2, 3, \cdots, 16 \tag{4.42} $$

Using the method for sinusoidal addition in Section 2.7.3, it is straightforward to obtain the results tabulated in Table 4.3 for the frequency components of $g_\Pi(t)$ up to the fourth harmonic. For ease of comparison, the table also includes the frequency components of g(t) from Eq. (4.39), with their amplitudes and phases denoted as $A_{no}$ and $\phi_{no}$ merely for distinction from those of $g_\Pi(t)$. The aperture distortion loss (ADL), defined as $20\log_{10}(A_{no}/A_n)$ dB, for each harmonic is also included in the last column of the table. The following vital insight into the effect of sample-and-hold operation is unveiled.

● The sample-and-hold operation introduces a distortion known as aperture distortion that affects both the amplitude and phase of the frequency components of $g_\Pi(t)$ when compared to those of the original signal g(t). It may be seen in the last column of Table 4.3 that the amplitude of the nth harmonic of $g_\Pi(t)$ is reduced from that of the corresponding harmonic in g(t) by a factor equal to the ADL, which increases with harmonic frequency $nf_o$. It will become clear in Chapter 9 that this increase of ADL with frequency is according to the relation

$$ \mathrm{ADL} = -20\log_{10}\left( \left| \mathrm{sinc}\left( \frac{nf_o}{F_s} \right) \right| \right)\ \mathrm{dB} \tag{4.43} $$
Using values from Table 4.3, the synthesised or reconstructed signal $g_{\Pi s}(t)$, based on the DC and first four harmonics, is

$$ g_{\Pi s}(t) = 1 + 4.97\cos(2\pi f_o t - 101.25^\circ) + 2.92\cos(4\pi f_o t - 112.5^\circ) + 1.8\cos(8\pi f_o t - 135^\circ) $$

This signal is shown in Figure 4.15 along with the original and sample-and-hold signals g(t) and $g_\Pi(t)$. The signal $g_{\Pi s}(t)$, reconstructed from all the frequency components contained in the sample-and-hold signal $g_\Pi(t)$ within the bandwidth of the original signal g(t), can be seen to be altered both in amplitude and phase (note, for example, the discernible delay) when compared to g(t). This distortion is due entirely to aperture effect introduced by the sample-and-hold operation.

Table 4.3 Harmonic components of signal g(t) compared to those of sample-and-hold signal $g_\Pi(t)$ derived by Fourier analysis. The sampling rate employed to produce $g_\Pi(t)$ is four times the maximum frequency component of g(t).

n | g_Π(t): A_n | g_Π(t): ϕ_n (deg) | g(t): A_no | g(t): ϕ_no (deg) | ADL = 20 log(A_no/A_n), dB
0 | 1 | 0 | 1 | 0 | 0
1 | 4.9679 | −101.25 | 5 | −90 | 0.056
2 | 2.9235 | −112.5 | 3 | −90 | 0.224
3 | 0 | — | 0 | — | —
4 | 1.8006 | −135 | 2 | −90 | 0.912

Figure 4.15 Aperture effect causes a distortion in the signal $g_{\Pi s}(t)$ that is reconstructed using all the frequency components contained in the sample-and-hold signal $g_\Pi(t)$ within the frequency band of the original signal g(t).
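The sinusoidal addition that produces Table 4.3 can be sketched numerically by treating each of the 16 sinusoids of Eq. (4.42) as a phasor and summing them. This is only one way to carry out the addition and is not necessarily the method of Section 2.7.3; the function name `sample_hold_harmonic` and the normalised fundamental frequency of 1 Hz are assumptions made for illustration.

```python
import cmath
import math

def sinc(x: float) -> float:
    """Normalised sinc, sin(pi*x)/(pi*x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def g(t: float, fo: float = 1.0) -> float:
    """Signal of Eq. (4.39)."""
    w = 2 * math.pi * fo * t
    return 1 + 5 * math.sin(w) + 3 * math.sin(2 * w) + 2 * math.sin(4 * w)

def sample_hold_harmonic(n: int, m: int = 16, fo: float = 1.0):
    """Return (A_n, phi_n in degrees) of the nth harmonic of g_Pi(t).

    Adds the m sinusoids of Eq. (4.42) as phasors; for m = 16 the
    factor 2/m equals the 1/8 that appears in Eq. (4.42).
    """
    total = 0j
    for k in range(1, m + 1):
        amp = (2 / m) * g((k - 1) / (m * fo), fo) * sinc(n / m)  # A_{n,k}
        phase = (1 - 2 * k) * n * math.pi / m                    # phi_{n,k}
        total += amp * cmath.exp(1j * phase)
    if abs(total) < 1e-9:        # absent harmonic: phase is undefined, report zero
        return 0.0, 0.0
    return abs(total), math.degrees(cmath.phase(total))

original = {1: 5.0, 2: 3.0, 4: 2.0}     # A_no from Eq. (4.39)
for n in (1, 2, 3, 4):
    amp, phase = sample_hold_harmonic(n)
    if n in original:
        adl = 20 * math.log10(original[n] / amp)
        print(f"n={n}: A_n={amp:.4f}, phi_n={phase:.2f} deg, ADL={adl:.3f} dB")
    else:
        print(f"n={n}: A_n={amp:.4f} (harmonic absent)")
```

Running the sketch reproduces the amplitudes, phases, and ADL values listed in Table 4.3.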
● Employing Eq. (4.42) to obtain more harmonics (n = 4, 5, 6, …) of the sample-and-hold signal $g_\Pi(t)$ leads to a more faithful synthesis of $g_\Pi(t)$, as shown in Figure 4.16 for synthesis of $g_{\Pi s}(t)$ obtained by adding the DC and the first 36 harmonics of $g_\Pi(t)$. Notice that $g_{\Pi s}(t)$ does not approach closer to g(t) but to $g_\Pi(t)$. Aperture distortion cannot be remedied by passing more and more harmonics in the sample-and-hold signal $g_\Pi(t)$ through to contribute to the signal reconstruction. What that does is merely better reconstruct the sample-and-hold waveform, but not the original signal.

Figure 4.16 Signal $g_{\Pi s}(t)$ synthesised using DC and the first 36 harmonics of sample-and-hold signal $g_\Pi(t)$.

● Figure 4.17 shows the single-sided amplitude and phase spectra of $g_\Pi(t)$ up to the 36th harmonic. We notice a very interesting feature: the only frequency components in $g_\Pi(t)$ are the band of frequencies in the original signal g(t), a baseband we denote as $\pm f_m$, plus replications of this band at $F_s$, $2F_s$, and other integer multiples of the sampling frequency $F_s$. All frequency components are reduced in amplitude by the ADL of Eq. (4.43), so the higher replicated bands are more severely attenuated. For example, the amplitude of the frequency component $f_o$ is 5 units in the original signal g(t), whereas within $g_\Pi(t)$ the amplitude of this component is reduced as follows: in the baseband where its frequency is $f_o$, this component is reduced in amplitude by only 0.056 dB to 4.97 units. In the first replicated band where its frequency is $F_s + f_o$, the same component is reduced in amplitude by 24.665 dB to 0.29 units. And in the second replicated band where its frequency is $2F_s + f_o$, the component is reduced in amplitude by 30.426 dB to 0.15 units.

Figure 4.17 Single-sided amplitude and phase spectra of sample-and-hold signal $g_\Pi(t)$ of Figure 4.16 up to the 36th harmonic. Sampling rate $F_s = 4f_{max}$. (Upper panel: amplitude $A_n$ against $nf_o$, showing the baseband, the 1st replicated band at $F_s \pm f_m$, and the 2nd replicated band at $2F_s \pm f_m$. Lower panel: phase $\phi_n$ in degrees against $nf_o$.)

● The maximum ADL within the frequency band of g(t) occurs at the maximum frequency component $f_{max}$ of g(t), and is given by Eq. (4.43) as
$$ \mathrm{ADL}_{max} = -20\log_{10}\left( \left| \mathrm{sinc}(f_{max}/F_s) \right| \right)\ \mathrm{dB} $$

Since |sinc(x)| → 1 as x → 0, it means that 20 log(|sinc(x)|) → 0 as x → 0. Therefore, the way to reduce ADL and hence aperture distortion in sample-and-hold signals is by choosing a sampling rate $F_s \gg f_{max}$. To illustrate, the maximum frequency component in our current example is $4f_o$, so let us repeat the above exercise using a sampling rate $F_s = 160f_o$. The results are presented in Table 4.4, where it can be seen that the ADL is negligible and phase shifts are very small, so the waveform reconstructed using the frequency components in the baseband of $g_\Pi(t)$ will be closely matched with g(t), as shown in Figure 4.18. This solution to ADL comes at a high price, however, since it will lead to a significant increase in bandwidth. A minimum sampling rate of $2f_{max}$ is required to avoid what is known as alias distortion, which we explore further in Chapter 9. At five times this minimum (with $F_s = 10f_{max}$), the maximum ADL is only 0.14 dB.

Table 4.4 Harmonic components of signal g(t) compared to those of sample-and-hold signal $g_\Pi(t)$ with sampling rate equal to 40 times the maximum frequency component of g(t).

n | g_Π(t): A_n | g_Π(t): ϕ_n | g(t): A_no | g(t): ϕ_no | ADL = 20 log(A_no/A_n), dB
0 | 1 | 0° | 1 | 0° | 0
1 | 4.9997 | −91.125° | 5 | −90° | 0.00056
2 | 2.9992 | −92.25° | 3 | −90° | 0.00223
3 | 0 | — | 0 | — | —
4 | 1.9979 | −94.5° | 2 | −90° | 0.00893

Figure 4.18 Sample-and-hold operation at high sampling rate $F_s = 40f_{max}$. Aperture distortion is negligible and there is a close match between the original signal g(t) and reconstructed signal.

Figure 4.19 Flat-top sampling at sampling rate $F_s = 1/T_s$ using sampling pulses of fractional width $d = \tau/T_s = 1/4$.
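The ADL figures quoted for sample-and-hold follow directly from Eq. (4.43). The sketch below evaluates it for the component $f_o$ in the baseband and in the first two replicated bands (with $F_s = 16f_o$), and then for the maximum frequency component at several sampling rates. The function name `adl_db` and the normalised value $f_o = 1$ are illustrative assumptions.

```python
import math

def sinc(x: float) -> float:
    """Normalised sinc, sin(pi*x)/(pi*x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def adl_db(freq: float, fs: float) -> float:
    """Aperture distortion loss of Eq. (4.43) in dB (sample-and-hold).

    Intended for frequencies away from the nulls of the sinc function.
    """
    return -20 * math.log10(abs(sinc(freq / fs)))

fo = 1.0                      # assumed (normalised) fundamental frequency
fs = 16 * fo                  # sampling rate used for Table 4.3
bands = [("baseband", fo),
         ("1st replicated band", fs + fo),
         ("2nd replicated band", 2 * fs + fo)]
for label, f in bands:
    loss = adl_db(f, fs)
    amplitude = 5 * 10 ** (-loss / 20)    # original amplitude is 5 units
    print(f"{label}: ADL = {loss:.3f} dB, amplitude = {amplitude:.2f}")

f_max = 4 * fo
for ratio in (2, 4, 10, 40):              # sampling rate as a multiple of f_max
    print(f"Fs = {ratio} f_max: ADL_max = {adl_db(f_max, ratio * f_max):.5f} dB")
```

The first loop returns losses of about 0.056 dB, 24.665 dB, and 30.426 dB, and the second confirms a maximum ADL of about 0.14 dB at $F_s = 10f_{max}$.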
The sample-and-hold operation discussed above is a special case of a more general class of sampling known as flat-top sampling. In sample-and-hold, the sampling operation is implemented using full-width sampling pulses that hold the value of g(t) constant for the entire sampling interval. More generally, fractional-width sampling pulses may be used that produce gΠ (t) by holding the output constant at g(nT s ) for only a fraction d at the start of each sampling interval. For the remaining fraction (1 − d) of the sampling interval the output is set to zero. The result is a flat-top-sampled signal such as shown in Figure 4.19 for d = 1/4 using the analogue signal g(t) earlier specified in Eq. (4.39). Note therefore that instantaneous sampling (Figure 4.14a) is flat-top sampling in the limit d → 0, whereas sample-and-hold (Figure 4.14b) is flat-top sampling with d = 1.
The Fourier series of Eq. (4.20) may be applied in a straightforward manner to the flat-top-sampled signal $g_\Pi(t)$ of Figure 4.19. In this case, the m steps occur in pairs of amplitudes $[g((k-1)T_s), 0]$ with corresponding pulse widths $[dT_s, (1-d)T_s]$ for k = 1, 2, 3, …, m, where $m = T/T_s$, T being the period of g(t), and $T_s$ the sampling interval. With this in mind, we may proceed as previously discussed to evaluate the Fourier series of $g_\Pi(t)$ and obtain a tabulated list of the amplitude and phase of each of its sinusoidal components up to any desired harmonic n. We have done this for $g_\Pi(t)$ produced using sampling rate $F_s = 4f_{max}$ and three fractional pulse widths d = 0.5, 0.1, and 0.01. Figure 4.20 shows the SSAS of $g_\Pi(t)$ up to the 68th harmonic for each value of d, from which we can see several interesting features.

Figure 4.20 Single-sided amplitude spectrum of flat-top-sampled signal at sampling rate $F_s$ using sampling pulses of fractional width (a) d = 0.5, (b) d = 0.1, and (c) d = 0.01. (BB ≡ baseband; RB1 ≡ 1st replicated band, etc.)

We notice (again) that the only frequency components present in a flat-top-sampled signal $g_\Pi(t)$ are those in the original analogue signal g(t), referred to as baseband, plus replications of this baseband at regular intervals $F_s$ along the frequency axis. There is a scale factor of d on the amplitude of each harmonic since in this sampling process the signal is held for only a fraction d of the time. In addition to this scale factor, there is also ADL, as earlier discussed. However, we now see that the width of the sampling pulse (expressed as a fraction d of the sampling interval) is a factor in the amount of ADL experienced: ADL reduces as d decreases until it becomes negligible at d → 0, which corresponds to instantaneous sampling.

Our Fourier analysis therefore reveals that flat-top sampling with d → 0 (or instantaneous sampling) of an analogue signal g(t) at a sampling rate $F_s$ produces a sampled signal $g_\Pi(t)$ that contains the entire and undistorted spectrum of g(t) plus replications of this undistorted spectrum at a regular spacing $F_s$. Provided $F_s \ge 2f_{max}$ so that the first replicated band (which contains frequencies down to $F_s - f_{max}$) does not overlap into the baseband of the sampled signal $g_\Pi(t)$ (which contains frequencies up to $f_{max}$), then adding together all the sinusoidal components within the baseband of the sampled signal $g_\Pi(t)$ will perfectly reconstruct g(t). This is illustrated in Figure 4.21, where a perfect synthesis of g(t) is achieved with only the DC and the first four harmonics in the flat-top-sampled signal $g_\Pi(t)$ at d = 0.01.
Note that the synthesised signal must be multiplied by 1/d to restore it to the scale of the original signal g(t). The situation suggests that this reconstruction may be achieved simply by putting gΠ (t) through a lowpass filter (LPF) that passes all frequencies up to f max and blocks all frequencies ≥ F s − f max , thereby blocking all the replicated bands in gΠ (t). Of course, adding all the sinusoidal components in gΠ (t), including those in the replicated bands, will synthesise gΠ (t), but interest is usually in reconstructing the original analogue signal g(t) from which the samples were taken.
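The scale factor d and the near-perfect baseband recovery at small d can be illustrated with a short calculation. The sketch below applies the step-pair form of Eq. (4.20) to the flat-top-sampled version of g(t), keeping only the nonzero pulses (the zero-amplitude steps contribute nothing to the sum). The function name `flat_top_harmonic`, the normalised fundamental of 1 Hz, and the choice of 16 samples per cycle are assumptions for illustration.

```python
import cmath
import math

def sinc(x: float) -> float:
    """Normalised sinc, sin(pi*x)/(pi*x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def g(t: float, fo: float = 1.0) -> float:
    """Signal of Eq. (4.39)."""
    w = 2 * math.pi * fo * t
    return 1 + 5 * math.sin(w) + 3 * math.sin(2 * w) + 2 * math.sin(4 * w)

def flat_top_harmonic(n: int, d: float, m: int = 16, fo: float = 1.0) -> complex:
    """Complex amplitude A_n*exp(j*phi_n) of the nth harmonic of g_Pi(t).

    Each sampling pulse has width d*Ts, i.e. duty cycle d/m relative to
    the period T, and starts at (k - 1)*Ts.
    """
    duty = d / m
    total = 0j
    for k in range(1, m + 1):
        centre = (k - 1) / m + duty / 2          # pulse centre as a fraction of T
        amp = 2 * g((k - 1) / (m * fo), fo) * duty * sinc(n * duty)
        total += amp * cmath.exp(-2j * math.pi * n * centre)
    return total

d = 0.01
for n, original in ((1, 5.0), (2, 3.0), (4, 2.0)):
    phasor = flat_top_harmonic(n, d)
    restored = abs(phasor) / d                   # multiply by 1/d to restore scale
    phase = math.degrees(cmath.phase(phasor))
    print(f"n={n}: restored amplitude = {restored:.4f} (original {original}), "
          f"phase = {phase:.3f} deg")
```

At d = 0.01 the restored baseband amplitudes agree with those of g(t) to within a small fraction of a percent and the phases stay within about half a degree of −90°.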
Figure 4.21 Original signal g(t), flat-top-sampled signal $g_\Pi(t)$ (sampled at rate of $F_s = 4f_{max}$ using sampling pulses of fractional width d = 0.01), and synthesised signal (obtained by adding only the baseband components in $g_\Pi(t)$).
We are now able to state the complete expression for the ADL experienced by a harmonic of frequency $nf_o$ in a flat-top-sampled signal that is generated at a sampling rate $F_s$ using sampling pulses of fractional width d

$$ \mathrm{ADL} = -20\log_{10}\left( \left| \mathrm{sinc}\left( \frac{nf_o d}{F_s} \right) \right| \right)\ \mathrm{dB} \tag{4.44} $$

Note that the previously discussed Eq. (4.43) is a special case of the above equation when d = 1. Also, when $nf_o d$ is an integer multiple of $F_s$, then $\mathrm{sinc}(nf_o d/F_s) = 0$ and its negative logarithm and hence the ADL in Eq. (4.44) will be infinitely large. In Figure 4.20a with $F_s = 16f_o$ and d = 0.5, this condition is satisfied at $f = 32f_o, 64f_o, \ldots$; and in Figure 4.17 with $F_s = 16f_o$ and d = 1, this condition is satisfied at $f = 16f_o, 32f_o, \ldots$ It is therefore this infinite attenuation of certain frequency components due to aperture loss that accounts for the absence of spectral lines at all integer multiples of $F_s$ in Figure 4.17 and at even integer multiples of $F_s$ in Figure 4.20a, although in both cases the DC component in the baseband would have been replicated at these locations.

We earlier discussed and demonstrated in Figure 4.18 a way to reduce ADL by using a large sampling rate $F_s$ and noted that this solution is wasteful in bandwidth. Equation (4.44) suggests a cheaper alternative solution by implementing flat-top sampling using fractional pulse width d → 0. This solution is demonstrated in Figure 4.21. In flat-top sampling, the maximum ADL within the frequency band of g(t) occurs at the maximum frequency component $f_{max}$ of g(t), and is given by

$$ \mathrm{ADL}_{max} = -20\log_{10}\left( \left| \mathrm{sinc}\left( \frac{f_{max} d}{F_s} \right) \right| \right)\ \mathrm{dB} \tag{4.45} $$

The worst-case scenario occurs at the minimum allowed sampling rate $F_s = 2f_{max}$ when the above expression reduces to

$$ \mathrm{ADL}_{max} = -20\log_{10}\left( \left| \mathrm{sinc}(d/2) \right| \right)\ \mathrm{dB} $$

So, at d = 1, 0.5, 0.1, and 0.01, $\mathrm{ADL}_{max}$ is respectively 3.9, 0.9, 0.04, and 0.0004 dB. This effect is apparent in the trend of reduction in spectral distortion in Figure 4.17 and Figure 4.20a–c.
This would suggest that spectral distortion due to aperture effect is negligible in flat-top sampling with d ≤ 0.1 and may therefore be ignored.
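Equation (4.44) is easy to tabulate for different pulse widths. The following sketch evaluates the worst case ($F_s = 2f_{max}$) for several values of d and also flags the frequencies at which the loss becomes infinite. The function name `flat_top_adl_db` and the threshold used to detect a null are illustrative choices.

```python
import math

def sinc(x: float) -> float:
    """Normalised sinc, sin(pi*x)/(pi*x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def flat_top_adl_db(freq: float, fs: float, d: float) -> float:
    """ADL of Eq. (4.44) in dB for sampling pulses of fractional width d.

    Returns infinity at the nulls, where freq*d is an integer multiple of fs.
    """
    s = abs(sinc(freq * d / fs))
    return math.inf if s < 1e-12 else -20 * math.log10(s)

# Worst case within the band of g(t): Fs = 2 f_max, with f_max normalised to 1
for d in (1.0, 0.5, 0.1, 0.01):
    print(f"d = {d}: ADL_max = {flat_top_adl_db(1.0, 2.0, d):.4f} dB")

# Nulls for Fs = 16 fo (fo normalised to 1)
print(flat_top_adl_db(32.0, 16.0, 0.5))   # d = 0.5: null at 32 fo
print(flat_top_adl_db(16.0, 16.0, 1.0))   # d = 1:   null at 16 fo
```

The worst-case loss falls rapidly as d is reduced, which is the basis of the observation that aperture effect may be ignored for small d.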
Figure 4.22 Sinusoidal pulse train $g_m(t)$ of amplitude A, pulse width $\tau$, period T, and duty cycle $d = \tau/T$, with m half cycles per pulse: (a) m = 1; (b) m = 2; (c) m = 3; (d) m = 4.
4.2.4.2 Binary ASK Signal and Sinusoidal Pulse Train
Modulation is a fundamental and indispensable signal processing operation in telecommunications and Chapters 7, 8, and 11 are devoted to its detailed treatment. Here we wish to demonstrate the versatility of the Fourier series method by applying it to analyse a simple digital modulation scheme to gain valuable insight into the frequency content (and hence bandwidth) of the signals produced by the scheme. In binary amplitude shift keying (BASK), a sequence of data bits is conveyed using a sinusoidal carrier signal by changing or shifting only the amplitude of the carrier between two values to signal binary 1 or 0. The best performance of the scheme in the presence of noise is achieved by setting one amplitude state to zero and the other to some nonzero value A, and this is commonly described as on–off keying (OOK). The highest frequency content (and bandwidth requirement) of a BASK signal will be when the data bits follow the fastest-changing sequence of 101010…, which produces a periodic BASK waveform that may therefore be analysed using the Fourier series method. It is a requirement of the scheme that the sinusoidal carrier must complete an integer number m of half-cycles in each bit interval, so the waveform will be a sinusoidal pulse train, as shown in Figure 4.22 for m = 1, 2, 3, 4. The pulse interval $\tau$ corresponds, say, to binary 1, the no-pulse interval corresponds to binary 0, and the duty cycle $d = \tau/T$ of the pulse train will have a value d = 1/2 for the sequence 101010… However, the analysis that follows will be carried out for a general value of d to allow the results to be applicable to slower-changing regular sequences such as 100100100…, where d = 1/3, 110110110… with d = 2/3, 10001000… with d = 1/4, etc. Note that in all cases $\tau$ is the bit duration and therefore the transmission bit rate is

$$ R_b = \frac{1}{\tau} \tag{4.46} $$
The sinusoidal carrier completes m half-cycles in $\tau$ seconds, or m/2 full cycles per $\tau$ seconds, which means that carrier frequency $f_c = m/(2\tau)$. Since $\tau = dT$ and $T = 1/f_o$ (where $f_o$ is the fundamental frequency of the periodic waveform), the relationships among carrier frequency $f_c$, number m of half-cycles per bit interval $\tau$, bit rate $R_b$, and fundamental frequency $f_o$ are given by

$$ f_c = \frac{m}{2\tau} = \frac{m}{2} R_b = \frac{mf_o}{2d} \tag{4.47} $$
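As a quick arithmetic illustration of Eqs. (4.46) and (4.47), the sketch below derives the pulse-train parameters from a bit rate, a carrier frequency, and a duty cycle. The function name `bask_parameters` and the numerical values (a 50 kb/s bit rate with a 500 kHz carrier) are chosen purely for illustration.

```python
def bask_parameters(bit_rate_hz: float, carrier_hz: float, d: float = 0.5):
    """Return (tau, T, fo, m) for a sinusoidal pulse train.

    tau = 1/Rb       bit duration, Eq. (4.46)
    T   = tau/d      period of the pulse train
    fo  = d*Rb       fundamental frequency (= 1/T)
    m   = 2*fc/Rb    half-cycles per bit interval, from Eq. (4.47)
    """
    tau = 1.0 / bit_rate_hz
    period = tau / d
    fo = d * bit_rate_hz
    m = 2.0 * carrier_hz / bit_rate_hz
    return tau, period, fo, m

tau, period, fo, m = bask_parameters(50e3, 500e3, d=0.5)
print(f"tau = {tau * 1e6:.0f} us, T = {period * 1e6:.0f} us, "
      f"fo = {fo / 1e3:.0f} kHz, m = {m:.0f}")

# The scheme requires m to be an integer number of half-cycles per bit
assert m == int(m), "carrier must complete a whole number of half-cycles per bit"
```

For the fastest-changing sequence (d = 1/2) these values give a 20 μs bit duration, a 40 μs period, a 25 kHz fundamental, and m = 20, so that the carrier sits at the 20th harmonic.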
Let us now turn our attention to the important task of analysis of the waveforms $g_m(t)$ in Figure 4.22. One cycle of this pulse train, in the interval 0 ≤ t ≤ T, is defined by

$$ g_{m,T}(t) = \begin{cases} A\sin(2\pi f_c t), & 0 \le t \le \tau \\ 0, & \text{elsewhere} \end{cases} \tag{4.48} $$
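The rms value $A_{rms}$, mean value (i.e. DC component) $A_0$, and Fourier coefficients $a_n$ and $b_n$ of this pulse train may be straightforwardly evaluated as follows from their respective definitions, using $f_c\tau = m/2$ from Eq. (4.47) and $f_o\tau = d$, where necessary.

$$ \begin{aligned} A_{rms}^2 &= \frac{1}{T}\int_0^T g_{m,T}^2(t)\,dt = \frac{1}{T}\int_0^\tau A^2\sin^2(2\pi f_c t)\,dt \\ &= \frac{A^2}{2T}\int_0^\tau [1 - \cos(4\pi f_c t)]\,dt, \quad \text{since } \sin^2\theta = \tfrac{1}{2}(1 - \cos 2\theta) \\ &= \frac{A^2}{2T}\left[ \tau - \left. \frac{\sin(4\pi f_c t)}{4\pi f_c} \right|_0^\tau \right] = \frac{A^2}{2}\left[ d - \frac{\sin(4\pi f_c \tau)}{4\pi f_c T} \right] \\ &= \frac{A^2}{2}\left[ d - \frac{\sin(2\pi m)}{4\pi f_c T} \right] \\ &= \frac{A^2}{2} d, \quad \text{since } \sin(2\pi m) = 0 \end{aligned} $$

The rms value therefore depends neither on frequency $f_c$ nor on number of half-cycles m in a pulse, and is given by

$$ A_{rms} = A\sqrt{d/2} \tag{4.49} $$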
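The condition d = 1 corresponds to a regular sinusoidal signal, in which case the above equation reduces to $A_{rms} = A/\sqrt{2}$ as expected. The DC component is evaluated as

$$ \begin{aligned} A_0 &= \frac{1}{T}\int_0^T g_{m,T}(t)\,dt = \frac{1}{T}\int_0^\tau A\sin(2\pi f_c t)\,dt \\ &= \frac{A}{T} \left. \frac{\cos(2\pi f_c t)}{2\pi f_c} \right|_\tau^0 \\ &= \frac{A}{T}\left[ \frac{\cos(0) - \cos(2\pi f_c \tau)}{2\pi f_c} \right] = \frac{A}{T}\left[ \frac{1 - \cos(2\pi m/2)}{2\pi m/2\tau} \right] \end{aligned} $$

Replacing $\tau/T$ by d yields

$$ A_0 = \frac{Ad}{m\pi}[1 - \cos(m\pi)] = \begin{cases} 0, & m \text{ even} \\ 2Ad/m\pi, & m \text{ odd} \end{cases} \tag{4.50} $$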
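Equations (4.49) and (4.50) can be confirmed by direct numerical integration of Eq. (4.48). The sketch below uses a simple midpoint rule over the pulse interval with the period normalised to T = 1; the function name `pulse_train_stats`, the number of integration steps, and the test values of A, d, and m are all assumptions made for illustration.

```python
import math

def pulse_train_stats(A: float, d: float, m: int, steps: int = 20000):
    """Return (DC value, rms value) of g_m(t) by midpoint-rule integration.

    The period is normalised to T = 1, so tau = d and fc = m/(2*tau).
    Only the pulse interval is integrated because g_m(t) is zero elsewhere.
    """
    tau = d
    fc = m / (2 * tau)
    dt = tau / steps
    mean = 0.0
    mean_square = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt
        value = A * math.sin(2 * math.pi * fc * t)
        mean += value * dt            # integral of g(t) over one period (T = 1)
        mean_square += value * value * dt
    return mean, math.sqrt(mean_square)

for A, d, m in ((1.0, 0.5, 3), (2.0, 0.25, 4), (1.0, 1.0, 1)):
    dc, rms = pulse_train_stats(A, d, m)
    dc_formula = A * d * (1 - math.cos(m * math.pi)) / (m * math.pi)   # Eq. (4.50)
    rms_formula = A * math.sqrt(d / 2)                                 # Eq. (4.49)
    print(f"A={A}, d={d}, m={m}: DC {dc:.6f} vs {dc_formula:.6f}, "
          f"rms {rms:.6f} vs {rms_formula:.6f}")
```

The numerical and closed-form values agree to many decimal places, including the zero DC component for even m.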
To evaluate the Fourier coefficients, we advance (i.e. shift leftward) each waveform in Figure 4.22 by $\tau/2$ so that the pulse is centred at t = 0. A little thought will show that the main cycle of the resulting waveform $g'_m(t)$ in the interval −T/2 ≤ t ≤ T/2 is now given by the following separate expressions for odd and even m

$$ g'_{m,T}(t) = \begin{cases} (-1)^{(m-1)/2} A\cos(2\pi f_c t), & -\tau/2 \le t \le \tau/2 \\ 0, & \text{elsewhere} \end{cases}; \quad m = 1, 3, 5, \cdots $$

$$ g'_{m,T}(t) = \begin{cases} (-1)^{m/2} A\sin(2\pi f_c t), & -\tau/2 \le t \le \tau/2 \\ 0, & \text{elsewhere} \end{cases}; \quad m = 2, 4, 6, \cdots \tag{4.51} $$

Notice that $g'_{m,T}(t)$ is an odd function when m is an even integer, whereas it is an even function when m is an odd integer. Therefore, in view of Eq. (4.16), we need to compute only $a_n$ when m is odd, and only $b_n$ when m is even. It was of course specifically for the benefit of this simplification that we carried out the time shift. As a side note, applying the trigonometric identities

$$ \left. \begin{aligned} \sin(\theta) &= \cos[\theta + (4k+3)\pi/2] \\ -\sin(\theta) &= \cos[\theta + (4k+1)\pi/2] \\ \cos(\theta) &= \cos[\theta + (4k)\pi/2] \\ -\cos(\theta) &= \cos[\theta + (4k+2)\pi/2] \end{aligned} \right\}; \quad k = 0, 1, 2, 3, \cdots \tag{4.52} $$

we see that, for all m, $g'_{m,T}(t)$ may be represented more simply by the following single expression

$$ g'_{m,T}(t) = \begin{cases} A\cos\left( 2\pi f_c t + (m-1)\dfrac{\pi}{2} \right), & -\tau/2 \le t \le \tau/2 \\ 0, & \text{elsewhere} \end{cases}; \quad m = 1, 2, 3, \cdots $$
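Now using Eq. (4.51) in Eq. (4.11) to compute $a_n$ for odd m yields

$$ \begin{aligned} a_n &= \frac{2}{T}\int_{-\tau/2}^{\tau/2} (-1)^{(m-1)/2} A\cos(2\pi f_c t)\cos(2\pi nf_o t)\,dt \\ &= \frac{(-1)^{(m-1)/2} A}{T}\int_{-\tau/2}^{\tau/2} [\cos(2\pi f_c + 2\pi nf_o)t + \cos(2\pi f_c - 2\pi nf_o)t]\,dt \\ &= \frac{(-1)^{(m-1)/2} A}{T}\left[ \frac{2\sin(2\pi f_c\tau/2 + 2\pi nf_o\tau/2)}{2\pi f_c + 2\pi nf_o} + \frac{2\sin(2\pi f_c\tau/2 - 2\pi nf_o\tau/2)}{2\pi f_c - 2\pi nf_o} \right] \\ &= (-1)^{(m-1)/2} A\frac{\tau}{T}\left[ \frac{\sin\pi(f_c\tau + nf_o\tau)}{\pi(f_c\tau + nf_o\tau)} + \frac{\sin\pi(f_c\tau - nf_o\tau)}{\pi(f_c\tau - nf_o\tau)} \right] \\ &= (-1)^{(m-1)/2} Ad[\mathrm{sinc}(nd + m/2) + \mathrm{sinc}(nd - m/2)], \quad m = 1, 3, 5, \cdots \end{aligned} \tag{4.53} $$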
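Similarly, using Eq. (4.51) in Eq. (4.12) to compute $b_n$ for even m yields

$$ \begin{aligned} b_n &= \frac{2}{T}\int_{-\tau/2}^{\tau/2} (-1)^{m/2} A\sin(2\pi f_c t)\sin(2\pi nf_o t)\,dt \\ &= \frac{(-1)^{m/2} A}{T}\int_{-\tau/2}^{\tau/2} [\cos(2\pi f_c - 2\pi nf_o)t - \cos(2\pi f_c + 2\pi nf_o)t]\,dt \\ &= \frac{(-1)^{m/2} A}{T}\left[ \frac{2\sin(2\pi f_c\tau/2 - 2\pi nf_o\tau/2)}{2\pi f_c - 2\pi nf_o} - \frac{2\sin(2\pi f_c\tau/2 + 2\pi nf_o\tau/2)}{2\pi f_c + 2\pi nf_o} \right] \\ &= (-1)^{m/2} A\frac{\tau}{T}\left[ \frac{\sin\pi(f_c\tau - nf_o\tau)}{\pi(f_c\tau - nf_o\tau)} - \frac{\sin\pi(f_c\tau + nf_o\tau)}{\pi(f_c\tau + nf_o\tau)} \right] \\ &= (-1)^{m/2} Ad[\mathrm{sinc}(nd - m/2) - \mathrm{sinc}(nd + m/2)], \quad m = 2, 4, 6, \cdots \end{aligned} $$

To summarise, the Fourier coefficients are given by

$$ \begin{aligned} &\left. \begin{aligned} a_n &= (-1)^{(m-1)/2} Ad[\mathrm{sinc}(nd - m/2) + \mathrm{sinc}(nd + m/2)] \\ b_n &= 0 \end{aligned} \right\}, \quad m = 1, 3, 5, \cdots \\ &\left. \begin{aligned} a_n &= 0 \\ b_n &= (-1)^{m/2} Ad[\mathrm{sinc}(nd - m/2) - \mathrm{sinc}(nd + m/2)] \end{aligned} \right\}, \quad m = 2, 4, 6, \cdots \end{aligned} \tag{4.54} $$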
Finally, using the relations provided in Eq. (4.13) to determine the amplitude $A_n$ and phase $\phi_n$ of the nth harmonic based on $a_n$ and $b_n$, and including specifically Eq. (4.14) to obtain the correct phase of $g_m(t)$, bearing in mind that Eq. (4.54) applies to the time-shifted $g'_m(t)$, we obtain the following Fourier series for the sinusoidal pulse trains $g_m(t)$ shown in Figure 4.22. Note that we have used $A_0$ given in Eq. (4.50), and that the following result applies for all m, although Figure 4.22 only shows m up to 4.

$$ g_m(t) = \begin{cases} A_0 + (-1)^{\frac{m-1}{2}} Ad \displaystyle\sum_{n=1}^{\infty} \left[ \mathrm{sinc}\left(nd - \tfrac{m}{2}\right) + \mathrm{sinc}\left(nd + \tfrac{m}{2}\right) \right] \cos(2\pi nf_o t - \pi nd), & m \text{ odd} \\ A_0 + (-1)^{\frac{m}{2}} Ad \displaystyle\sum_{n=1}^{\infty} \left[ \mathrm{sinc}\left(nd - \tfrac{m}{2}\right) - \mathrm{sinc}\left(nd + \tfrac{m}{2}\right) \right] \cos\left( 2\pi nf_o t - \pi\left(nd + \tfrac{1}{2}\right) \right), & m \text{ even} \end{cases} $$

$$ A_0 = \frac{Ad(1 - \cos m\pi)}{m\pi} \tag{4.55} $$

Employing again the trigonometric identities of Eq. (4.52) and the fact that $(-1)^m = 1$ when m is even and equals −1 when m is odd, we obtain the following single expression for this Fourier series, which applies for all m (whether even or odd)

$$ \begin{aligned} g_m(t) &= A_0 + \sum_{n=1}^{\infty} A_n\cos(2\pi nf_o t + \phi_n); \\ A_n &= Ad\left[ \mathrm{sinc}\left(nd - \frac{m}{2}\right) - (-1)^m\,\mathrm{sinc}\left(nd + \frac{m}{2}\right) \right]; \\ \phi_n &= (m - 1 - 2nd)\frac{\pi}{2}; \\ A_0 &= \frac{Ad(1 - \cos(m\pi))}{m\pi}; \quad m = 1, 2, 3, 4, \cdots \end{aligned} \tag{4.56} $$
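A useful sanity check on Eq. (4.56) is to compare it with harmonics obtained by integrating the waveform of Eq. (4.48) directly. The sketch below does this with a midpoint rule and the period normalised to T = 1. The complex amplitude $A_n e^{j\phi_n}$ is compared rather than $A_n$ and $\phi_n$ separately, because $A_n$ in Eq. (4.56) may be negative. The function names, the step count, and the test cases are illustrative assumptions.

```python
import cmath
import math

def sinc(x: float) -> float:
    """Normalised sinc, sin(pi*x)/(pi*x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def series_harmonic(n: int, A: float, d: float, m: int) -> complex:
    """Complex amplitude A_n*exp(j*phi_n) according to Eq. (4.56)."""
    amp = A * d * (sinc(n * d - m / 2) - (-1) ** m * sinc(n * d + m / 2))
    phase = (m - 1 - 2 * n * d) * math.pi / 2
    return amp * cmath.exp(1j * phase)

def numeric_harmonic(n: int, A: float, d: float, m: int, steps: int = 20000) -> complex:
    """Complex amplitude (2/T) * integral of g(t)*exp(-j*2*pi*n*fo*t), with T = 1."""
    tau = d
    fc = m / (2 * tau)
    dt = tau / steps
    total = 0j
    for i in range(steps):
        t = (i + 0.5) * dt
        total += A * math.sin(2 * math.pi * fc * t) * cmath.exp(-2j * math.pi * n * t) * dt
    return 2 * total

for m, d, n in ((1, 1.0, 1), (2, 0.5, 3), (3, 0.25, 5), (4, 1 / 3, 2)):
    closed = series_harmonic(n, 1.0, d, m)
    direct = numeric_harmonic(n, 1.0, d, m)
    print(f"m={m}, d={d:.3f}, n={n}: |difference| = {abs(closed - direct):.2e}")
```

For every case tried, the closed-form and numerically integrated harmonics differ only by the integration error.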
Before considering what Eq. (4.56) reveals about BASK, it is useful to highlight a few special cases. The sinusoidal pulse train $g_2(t)$ of Figure 4.22b corresponds to m = 2 in Eq. (4.56). Making this substitution yields

$$ g_2(t) = Ad\sum_{n=1}^{\infty} [\mathrm{sinc}(nd - 1) - \mathrm{sinc}(nd + 1)]\cos(2\pi nf_o t - \pi(nd - 1/2)) \tag{4.57} $$
Figure 4.23 (a) Centred cosine-shaped pulse train g(t) of amplitude A and duty cycle $d = \tau/T$; (b) Full wave rectifier waveform $g_1(t)$, with $d = \tau/T = 1$; (c) Full wave rectifier waveform $g_{1s}(t)$ synthesised with DC and first three harmonics.
The centred cosine-shaped pulse train g(t) of amplitude A and duty cycle d shown in Figure 4.23a is a special case of Eq. (4.56) with m = 1 and a time shift $t_o = -\tau/2$ ($= -d/2f_o$). We substitute m = 1 in Eq. (4.56) and add $-2\pi nf_o t_o = \pi nd$ to the phase to obtain the Fourier series for the centred cosine-shaped pulse train as

$$ g(t) = \frac{2Ad}{\pi} + Ad\sum_{n=1}^{\infty} [\mathrm{sinc}(nd - 1/2) + \mathrm{sinc}(nd + 1/2)]\cos(2\pi nf_o t) \tag{4.58} $$

Finally, the full wave rectifier waveform $g_1(t)$ of Figure 4.23b corresponds to Eq. (4.56) with m = 1 and d = 1. Making these substitutions yields

$$ g_1(t) = \frac{2A}{\pi} + A\sum_{n=1}^{\infty} \left[ \mathrm{sinc}\left(n - \frac{1}{2}\right) + \mathrm{sinc}\left(n + \frac{1}{2}\right) \right] \cos(2\pi nf_o t - n\pi) \tag{4.59} $$

You may wish to verify that the DC and the first three harmonics of this series produce the synthesised waveform

$$ g_{1s}(t) = 0.6366A - 0.4244A\cos(2\pi f_o t) - 0.0849A\cos(4\pi f_o t) - 0.0364A\cos(6\pi f_o t) $$

which is sketched in Figure 4.23c and shows a close resemblance to the full wave rectifier waveform.

Now turning our attention to applying Eq. (4.56) to BASK, we note that the waveform of a BASK system (implemented as OOK) for the fastest-changing sequence 101010… has d = 1/2, so Eq. (4.56) gives the spectral values

$$ \begin{aligned} A_n &= \frac{A}{2}\left[ \mathrm{sinc}\left(\frac{n-m}{2}\right) - (-1)^m\,\mathrm{sinc}\left(\frac{n+m}{2}\right) \right] \\ \phi_n &= (m - 1 - n)\frac{\pi}{2} \\ A_0 &= \frac{A(1 - \cos m\pi)}{2m\pi} \end{aligned} \tag{4.60} $$
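The special cases above lend themselves to a quick numerical check. The sketch below evaluates Eq. (4.56) with A = 1 for the full wave rectifier (m = 1, d = 1) and for an OOK waveform with d = 1/2, where m = 20 is chosen here only as an illustrative value. The function name `harmonic` is hypothetical.

```python
import math

def sinc(x: float) -> float:
    """Normalised sinc, sin(pi*x)/(pi*x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def harmonic(n: int, d: float, m: int, A: float = 1.0):
    """Return (A_n, phi_n in degrees) from Eq. (4.56)."""
    amp = A * d * (sinc(n * d - m / 2) - (-1) ** m * sinc(n * d + m / 2))
    phase_deg = (m - 1 - 2 * n * d) * 90.0
    return amp, phase_deg

# Full wave rectifier: m = 1, d = 1. Since phi_n = -n*180 degrees,
# cos(x - n*pi) = (-1)^n cos(x), so each term reduces to a signed cosine.
dc = 2 / math.pi
rectifier = [harmonic(n, 1.0, 1)[0] * (-1) ** n for n in (1, 2, 3)]
print(f"DC = {dc:.4f}, cosine coefficients = {[round(c, 4) for c in rectifier]}")

# OOK with d = 1/2 and an assumed m = 20: harmonics around the carrier (n = m)
for n in (19, 20, 21):
    amp, phase = harmonic(n, 0.5, 20)
    print(f"n={n}: A_n = {amp:.4f} A, phi_n = {phase:.0f} deg")
```

The rectifier coefficients match those of $g_{1s}(t)$, and for m = 20 the three harmonics nearest the carrier have amplitudes of about 0.3265A, 0.5A, and 0.3105A.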
Figure 4.24 Amplitude spectrum of BASK signal with m = 20. (The main lobe is centred on the carrier $f_c = 20f_o$, where the amplitude is A/2, and has null bandwidth $B_n = 4f_o$; the frequency axis extends to $40f_o$.)
If, for example, such a system operates at bit rate $R_b$ = 50 kb/s and employs a carrier of frequency $f_c$ = 500 kHz, the BASK waveform (for the fastest-changing sequence) will have $\tau = 1/R_b$ = 20 μs, period $T = 2\tau$ = 40 μs, fundamental frequency $f_o = 1/T$ = 25 kHz, and (from Eq. (4.47)) $m = 2df_c/f_o = 20$, so that $f_c = 20f_o$. Substituting m = 20 in Eq. (4.60) yields the amplitude and phase of any desired harmonic n. Figure 4.24 shows the amplitude spectrum of this BASK signal up to the 40th harmonic. We see that the main lobe of the spectrum is centred on $f_c$ with a null bandwidth $B_n = 4f_o$ (which is $4/T = 2/\tau = 2R_b$). The required bandwidth for transmitting at bit rate $R_b$ using BASK is therefore $2R_b$. It turns out that other values of m also produce similar results, namely a bandwidth $2R_b$ centred on $f_c$, with a maximum amplitude at the carrier frequency. Phase spectrum is not shown in Figure 4.24, but you may wish to check that Eq. (4.60) gives respective values of 0°, −90°, and −180° for harmonics n = 19, 20 (i.e. the carrier), and 21 used below. A BASK signal is usually passed through an appropriate bandpass filter having a passband centred at $f_c$ to significantly reduce the amplitudes of remnant frequency components outside the main lobe. Synthesising the BASK waveform using only the three frequency components within the main lobe (i.e. bandwidth of $2R_b$ centred on $f_c$) produces the signal

$$ g_s(t) = 0.3265A\cos(38\pi f_o t) + 0.5A\cos(40\pi f_o t - 90^\circ) - 0.3105A\cos(42\pi f_o t) $$

which is sketched in Figure 4.25 along with the original OOK waveform for comparison. The signal $g_s(t)$ is clearly adequate for the detection of the data sequence at the receiver. Since the identified bandwidth is enough for the transmission of the fastest-changing BASK sequence, it is also enough for all other slower-changing sequences.

4.2.4.3 Trapezoidal Pulse Train
The trapezoidal pulse was introduced in Section 2.6 as a useful waveform shape for approximating realisable laboratory pulses because of its separate intervals of rising, constant, and falling edges. It is also the general pulse shape from which other popular pulses, such as rectangular, triangular, sawtooth, ramp, etc., are derived as special cases. The conditions for each of these special cases are specified in Eq. (2.27). The rms value of the trapezoidal pulse train is stated in Eq. (3.111) and derived in Worked Example 3.5. Here we wish to evaluate the Fourier series of the trapezoidal pulse train for the simple reason of providing a set of equations from which the Fourier series of a wide range of related waveforms may be quickly extracted without the need for fresh calculations starting from first principles. Figure 4.26 shows a unipolar trapezoidal pulse train g(t) of amplitude A and period T that is centred on the constant portion of the pulse. The pulse train is expressed as the sum of three periodic signals g1 (t), g2 (t), and g3 (t) of the same amplitude and period. The Fourier series of g(t) will be the sum of the series of these three component signals. We wish to determine the required Fourier coefficients.
Figure 4.25 OOK signal and gs(t) synthesised using three harmonics centred on carrier.
Figure 4.26 Trapezoidal pulse train g(t) expressed as a sum of three periodic signals, g(t) = g1(t) + g2(t) + g3(t), with τ = τr + τc + τf and d = τ/T = dr + dc + df, where dr = τr/T, dc = τc/T, and df = τf/T.
The DC component of g(t) is given by the area of the pulse divided by the period of the waveform. Thus

$$A_0 = \frac{1}{T}\left(\frac{1}{2}A\tau_r + \frac{1}{2}A\tau_f + A\tau_c\right) = A\left(d_r/2 + d_f/2 + d_c\right) \tag{4.61}$$
where dr, df, and dc are the respective duty cycles of the rising, falling, and constant pulse edges as defined in Figure 4.26. The component signal g1(t) is a centred unipolar RPT whose Fourier coefficients are derived in Worked Example 4.1. The results are

$$a_{n1} = 2Ad_c\,\mathrm{sinc}(nd_c), \qquad b_{n1} = 0 \tag{4.62}$$
Note that we have introduced an extra subscript to identify the signal to which the coefficients pertain. Next, we consider the component signal g2(t) for which both the cosine and sine coefficients an2 and bn2 must be calculated since g2(t) is neither even nor odd. An equation for the waveform of g2(t) within one cycle (−T/2, T/2) may be determined by seeking an expression in the form y = γ + βt, where β is the slope and γ is the y axis intercept (i.e. value at t = 0). To determine β and γ, note that the pulse of g2(t) falls from a value A at t = τc/2 to 0 in τf seconds, so its slope is −A/τf, which means that at t = 0 its value was higher than A by τc/2 × A/τf. Therefore, the expression for g2(t) within one cycle (−T/2, T/2) is

$$g_{2,T}(t) = \begin{cases} A\left(1 + \dfrac{\tau_c}{2\tau_f} - \dfrac{t}{\tau_f}\right), & \dfrac{\tau_c}{2} \le t \le \tau_f + \dfrac{\tau_c}{2} \\[2mm] 0, & \text{otherwise} \end{cases} \tag{4.63}$$
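Substituting this expression in Eq. (4.11) allows us to evaluate an2 as follows

$$\begin{aligned} a_{n2} &= \frac{2}{T}\int_{-T/2}^{T/2} g_{2,T}(t)\cos(2\pi n f_o t)\,dt = \frac{2}{T}\int_{\tau_c/2}^{\tau_f+\tau_c/2} A\left(1 + \frac{\tau_c}{2\tau_f} - \frac{t}{\tau_f}\right)\cos(2\pi n f_o t)\,dt \\ &= \frac{2A}{T}\left(1 + \frac{\tau_c}{2\tau_f}\right)\int_{\tau_c/2}^{\tau_f+\tau_c/2}\cos(2\pi n f_o t)\,dt - \frac{2A}{T\tau_f}\int_{\tau_c/2}^{\tau_f+\tau_c/2} t\cos(2\pi n f_o t)\,dt \end{aligned}$$

The first integral on the right-hand side is straightforward and the second is evaluated in Worked Example 4.1 using integration by parts. Taking each integral in turn

$$\frac{2A}{T}\left(1 + \frac{\tau_c}{2\tau_f}\right)\int_{\tau_c/2}^{\tau_f+\tau_c/2}\cos(2\pi n f_o t)\,dt = \frac{2A}{T}\left(1 + \frac{\tau_c}{2\tau_f}\right)\left.\frac{\sin(2\pi n f_o t)}{2\pi n f_o}\right|_{\tau_c/2}^{\tau_f+\tau_c/2}$$

$$\frac{2A}{T\tau_f}\int_{\tau_c/2}^{\tau_f+\tau_c/2} t\cos(2\pi n f_o t)\,dt = \frac{2A}{T\tau_f}\left.\left[\frac{t\sin(2\pi n f_o t)}{2\pi n f_o} + \frac{\cos(2\pi n f_o t)}{(2\pi n f_o)^2}\right]\right|_{\tau_c/2}^{\tau_f+\tau_c/2}$$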
Substituting the integral limits for t and simplifying (with a little patience, using when necessary fo T = 1, fo τf = df, fo τc = dc, sin(πx)/πx ≡ sinc(x), and the trigonometric identities cos(A + B) ≡ cos A cos B − sin A sin B and 1 − cos 2A ≡ 2 sin² A) leads to

$$a_{n2} = Ad_f\cos(n\pi d_c)\,\mathrm{sinc}^2(nd_f) + Ad_c\,\mathrm{sinc}(nd_c)\left[\mathrm{sinc}(2nd_f) - 1\right] \tag{4.64}$$
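Next, to determine the sine coefficient bn2 of g2(t), we use Eq. (4.63) in Eq. (4.12)

$$\begin{aligned} b_{n2} &= \frac{2}{T}\int_{\tau_c/2}^{\tau_f+\tau_c/2} A\left(1 + \frac{\tau_c}{2\tau_f} - \frac{t}{\tau_f}\right)\sin(2\pi n f_o t)\,dt \\ &= \frac{2A}{T}\left(1 + \frac{\tau_c}{2\tau_f}\right)\int_{\tau_c/2}^{\tau_f+\tau_c/2}\sin(2\pi n f_o t)\,dt - \frac{2A}{T\tau_f}\int_{\tau_c/2}^{\tau_f+\tau_c/2} t\sin(2\pi n f_o t)\,dt \\ &= \frac{A}{n\pi}\left(1 + \frac{\tau_c}{2\tau_f}\right)\Big.\cos(2\pi n f_o t)\Big|_{\tau_f+\tau_c/2}^{\tau_c/2} - \frac{A}{n\pi\tau_f}\left.\left[\frac{\sin(2\pi n f_o t)}{2\pi n f_o} - t\cos(2\pi n f_o t)\right]\right|_{\tau_c/2}^{\tau_f+\tau_c/2} \end{aligned}$$

Again, following the same simplification steps used for an2 leads to

$$b_{n2} = Ad_f\sin(n\pi d_c)\,\mathrm{sinc}^2(nd_f) + \frac{A}{n\pi}\cos(n\pi d_c)\left[1 - \mathrm{sinc}(2nd_f)\right] \tag{4.65}$$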
We do not need to calculate the Fourier coefficients for the remaining component signal g3(t) from first principles if we observe that g3(t) is a time-reversed version of g2(t) with τf (and hence df) replaced by τr (and dr). We know from Eq. (4.15) that time reversal alters only the phase spectrum by a factor of −1, which is a complex conjugation operation achieved by multiplying the sine coefficient bn by −1 while leaving the cosine coefficient an unchanged. Thus

$$\begin{aligned} a_{n3} &= Ad_r\cos(n\pi d_c)\,\mathrm{sinc}^2(nd_r) + Ad_c\,\mathrm{sinc}(nd_c)\left[\mathrm{sinc}(2nd_r) - 1\right] \\ b_{n3} &= \frac{A}{n\pi}\cos(n\pi d_c)\left[\mathrm{sinc}(2nd_r) - 1\right] - Ad_r\sin(n\pi d_c)\,\mathrm{sinc}^2(nd_r) \end{aligned} \tag{4.66}$$
The cosine and sine coefficients of the trapezoidal pulse train g(t) are the sums of the respective coefficients derived above for the component signals. That is

$$a_n = a_{n1} + a_{n2} + a_{n3}, \qquad b_n = b_{n1} + b_{n2} + b_{n3} \tag{4.67}$$
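Therefore, including the DC component, the Fourier series of a trapezoidal pulse train has the coefficients

$$\begin{aligned} A_0 &= \frac{A}{2}\left(d_r + 2d_c + d_f\right) \\ a_n &= A\cos(\pi nd_c)\left[d_r\,\mathrm{sinc}^2(nd_r) + d_f\,\mathrm{sinc}^2(nd_f)\right] + Ad_c\,\mathrm{sinc}(nd_c)\left[\mathrm{sinc}(2nd_r) + \mathrm{sinc}(2nd_f)\right] \\ b_n &= \frac{A}{n\pi}\cos(\pi nd_c)\left[\mathrm{sinc}(2nd_r) - \mathrm{sinc}(2nd_f)\right] + A\sin(\pi nd_c)\left[d_f\,\mathrm{sinc}^2(nd_f) - d_r\,\mathrm{sinc}^2(nd_r)\right] \end{aligned} \tag{4.68}$$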
This is a very important and versatile result. It is left as an exercise for you to show that Eq. (4.17) for the unipolar RPT and Eq. (4.19) for the unipolar triangular pulse train follow straightforwardly from this result when the right conditions are placed on the duty cycles dr, dc, and df. Further practice is provided in the end-of-chapter questions to use this result to quickly derive the Fourier series of other trapezoid-related waveforms and pulse trains. This saves the effort required to solve from first principles. Here is a quick worked example to get you started.
Worked Example 4.6 Fourier Series of a Periodic Ramp Waveform
We wish to use Eq. (4.68) to quickly derive the Fourier series of the periodic ramp waveform x(t) shown in Figure 4.27a and to test the correctness of our derived series by synthesising x(t) using the DC and the first three harmonics of the series.
Figure 4.27 Worked Example 4.6: Periodic ramp waveform. (a) x(t), V, versus t, μs; (b) xs(t) = DC + first three harmonics; (c) Synthesis with DC + first 20 harmonics.
This waveform is a special case of the trapezoidal pulse train in Figure 4.26 with A = 100, τr = τc = 0, and τf = τ = T = 10 μs; so that dr = dc = 0, df = τ/T = 1, and fo = 1/T = 100 kHz. Introducing these conditions into the expressions for the Fourier coefficients of a trapezoidal pulse train given in Eq. (4.68) yields

$$\begin{aligned} A_0 &= \frac{100}{2}d_f = 50 \times 1 = 50 \\ a_n &= A\cos(0)\left[0 + d_f\,\mathrm{sinc}^2(nd_f)\right] + 0 = 100\,\mathrm{sinc}^2(n) = 0 \\ b_n &= \frac{100}{n\pi}\cos(0)\left[\mathrm{sinc}(2nd_r) - \mathrm{sinc}(2nd_f)\right] + 0 = \frac{100}{n\pi}\left[\mathrm{sinc}(0) - \mathrm{sinc}(2n)\right] = \frac{100}{n\pi} \end{aligned}$$

This condition (an = 0, bn > 0) corresponds to the third condition in the bottom half of Figure 4.2, from which it follows that An = |bn| = 100/nπ, and φn = −90°. The required Fourier series of the periodic ramp waveform x(t) is thus

$$x(t) \equiv A_0 + \sum_{n=1}^{\infty} A_n\cos(2\pi n f_o t + \phi_n) = 50 + \sum_{n=1}^{\infty}\frac{100}{n\pi}\cos(2\pi n f_o t - 90^\circ), \qquad f_o = 100\ \text{kHz} \tag{4.69}$$
We see that x(t) contains a DC component of 50 V and has an amplitude spectrum that is inversely proportional to harmonic number n, with every harmonic component having the same phase of −90°. It is always advisable to ascertain the correctness of a derived Fourier series by summing the first few terms to compare the synthesised waveform with the original. Doing so here and summing the DC and first three harmonics on the right-hand side of the above equation yields the synthesised waveform

$$x_s(t) = 50 + 31.83\sin(2\pi f_o t) + 15.92\sin(4\pi f_o t) + 10.61\sin(6\pi f_o t); \qquad f_o = 100\ \text{kHz}$$

which is sketched in Figure 4.27b and compares well with x(t). The derived series is therefore correct. Adding more harmonics (not required) would of course produce a closer fit as demonstrated in Figure 4.27c by summing the DC and the first 20 harmonics.
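The coefficient formulas of Eq. (4.68) and the synthesis check above can also be reproduced numerically. The following is a minimal sketch, not part of the original text: the function names `trapezoid_coeffs` and `synthesise` are hypothetical, and the code assumes NumPy's `np.sinc(x)`, which is sin(πx)/(πx) and so matches the sinc definition used here.

```python
import numpy as np

def trapezoid_coeffs(n, A, dr, dc, df):
    """Cosine and sine coefficients (a_n, b_n) of Eq. (4.68) for harmonics n >= 1."""
    n = np.asarray(n, dtype=float)
    an = (A * np.cos(np.pi * n * dc)
          * (dr * np.sinc(n * dr) ** 2 + df * np.sinc(n * df) ** 2)
          + A * dc * np.sinc(n * dc) * (np.sinc(2 * n * dr) + np.sinc(2 * n * df)))
    bn = (A / (n * np.pi) * np.cos(np.pi * n * dc)
          * (np.sinc(2 * n * dr) - np.sinc(2 * n * df))
          + A * np.sin(np.pi * n * dc)
          * (df * np.sinc(n * df) ** 2 - dr * np.sinc(n * dr) ** 2))
    return an, bn

def synthesise(t, A, dr, dc, df, fo, n_harmonics):
    """Sum the DC term and the first n_harmonics terms of the series."""
    A0 = 0.5 * A * (dr + 2 * dc + df)
    x = np.full_like(t, A0, dtype=float)
    for n in range(1, n_harmonics + 1):
        an, bn = trapezoid_coeffs(n, A, dr, dc, df)
        x += an * np.cos(2 * np.pi * n * fo * t) + bn * np.sin(2 * np.pi * n * fo * t)
    return x

# Ramp of Worked Example 4.6: A = 100, dr = dc = 0, df = 1, T = 10 us
A, T = 100.0, 10e-6
fo = 1.0 / T
an, bn = trapezoid_coeffs(np.arange(1, 4), A, dr=0.0, dc=0.0, df=1.0)
print(np.round(bn, 2))          # sine coefficients of the first three harmonics

t = np.linspace(0.0, T, 11)
print(np.round(synthesise(t, A, 0.0, 0.0, 1.0, fo, 3), 1))
```

Setting dr = df = 0 in the same function gives the centred RPT coefficients of Eq. (4.62), which is a convenient sanity check on the implementation.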
4.3 Fourier Transform
The Fourier transform (FT) is simply an extension of the Fourier series to nonperiodic signals and will exist if a signal satisfies the Dirichlet conditions earlier stated for Fourier series. In fact, if g(t) is an energy signal, which means that

$$E = \int_{-\infty}^{\infty} |g(t)|^2\,dt < \infty$$

then its FT will exist. However, note that the Dirichlet condition regarding integrability is more relaxed and simply requires that g(t) be absolutely integrable, which means that it should satisfy the condition

$$\int_{-\infty}^{\infty} |g(t)|\,dt < \infty \tag{4.70}$$
Figure 4.28 illustrates to scale the gradual evolution of a signal from being a periodic waveform gT(t) to a nonperiodic pulse g(t). The amplitude spectrum of the signal at each stage is also shown, and we see how the spacing of spectral lines gradually reduces until the spectrum eventually ceases to be discrete. Let us now formalise these observations and derive an expression for the FT of nonperiodic signals. Recall from Eqs. (4.23) and (4.24) that a periodic waveform gT(t) may be expressed as

$$\begin{aligned} g_T(t) &= \sum_{n=-\infty}^{\infty} C_n e^{j2\pi n f_o t} && \text{(i)} \\ C_n &= \frac{1}{T}\int_{-T/2}^{T/2} g(t)e^{-j2\pi n f_o t}\,dt, \qquad f_o = \frac{1}{T} && \text{(ii)} \end{aligned} \tag{4.71}$$
which means that gT(t) contains one complex exponential of magnitude |Cn| in each frequency interval of size fo centred at nfo, for n = …, −3, −2, −1, 0, 1, 2, 3, … As shown in Figure 4.28, a nonperiodic waveform g(t) arises from gT(t) in the limit T → ∞. In this limit, fo = 1/T becomes infinitesimally small and is denoted df; the discrete frequencies nfo become a continuum of frequencies, denoted f; and g(t) therefore contains a continuum of complex exponentials. The FT of g(t), denoted G(f), is the coefficient (having magnitude |G(f)| and angle ∠G(f)) of the totality of complex exponentials contained in g(t) per unit frequency. In other words, the amplitude |G(f)| of the FT of g(t) equals the amplitude of the complex exponential component of g(t) lying in an infinitesimally small frequency interval of size df centred at f (i.e. from f − df/2 to f + df/2) divided by the size df of the interval. And the angle ∠G(f) of the FT is the phase of that complex exponential. This definition of the FT implies that

$$G(f) = \lim_{T\to\infty}\frac{C_n}{f_o} = \lim_{T\to\infty} C_n T \tag{4.72}$$
Figure 4.28 Evolution of a signal from periodic to nonperiodic. Waveforms gT(t) for T = τ, T = 5τ, T = 20τ, and g(t) = gT(t) as T → ∞, each shown with its amplitude spectrum (spectral line spacing fo = 1/T).
But as T → ∞, fo → df, so we may also express the relationship between Cn and G(f) as

$$G(f) = \frac{C_n}{df}; \quad\Rightarrow\quad C_n = G(f)\,df \tag{4.73}$$
Substituting Eq. (4.71)(ii) for Cn into Eq. (4.72), we obtain an expression for G(f) as

$$G(f) = \lim_{T\to\infty} C_n T = \lim_{T\to\infty} T\left[\frac{1}{T}\int_{-T/2}^{T/2} g(t)e^{-j2\pi n f_o t}\,dt\right] = \int_{-\infty}^{\infty} g(t)e^{-j2\pi ft}\,dt \tag{4.74}$$
Given the time domain representation g(t) of a nonperiodic signal, Eq. (4.74) provides a means of determining its FT G(f), which is the frequency domain representation of the signal. G(f) is in general a complex number. Its magnitude |G(f)| gives the amplitude spectrum of g(t), and its angle ∠G(f) gives the phase spectrum of g(t), denoted φg(f). Note that G(f) has units of amplitude spectral density, which means that if the signal g(t) is measured in volts then its FT will have units of volts per hertz (V/Hz). To establish a means of deriving the time domain signal g(t) from a frequency domain specification G(f) of the signal, a process known as inverse Fourier transform (IFT), we note that in the limit T → ∞, the periodic signal gT(t) in Eq. (4.71)(i) becomes the nonperiodic signal g(t), the coefficient Cn in that equation becomes G(f)df, and the summation in the same equation of a discrete set of complex exponentials of frequency nfo becomes integration of a continuum of complex exponentials of frequency f. Thus

$$g(t) = \lim_{T\to\infty} g_T(t) = \lim_{T\to\infty}\sum_{n=-\infty}^{\infty} C_n e^{j2\pi n f_o t} = \int_{-\infty}^{\infty} G(f)e^{j2\pi ft}\,df \tag{4.75}$$
Equation (4.74) gives the FT of g(t), whereas Eq. (4.75) gives the IFT of G(f). The two functions g(t) and G(f) are said to form an FT pair, a relationship often denoted as

$$g(t) \rightleftharpoons G(f) \tag{4.76}$$

which states that the function on the right-hand side is the FT of the left-hand side; and that the left-hand side is the IFT of the right-hand side. Be careful, however, to avoid the not uncommon mistake of writing g(t) = G(f). The FT and IFT operations are also sometimes expressed as

$$G(f) = F[g(t)], \qquad g(t) = F^{-1}[G(f)] \tag{4.77}$$

Worked Example 4.7 Fourier Transform Derived from First Principles
We wish to determine the FT of the following pulses
(a) g(t) = A rect(t/τ), shown in Figure 4.29a
(b) g1(t) = A cos(mπt/τ) rect(t/τ), m = 3, shown in Figure 4.30a
and to sketch their amplitude spectra.
Figure 4.29 Worked Example 4.7: Rectangular pulse and its amplitude and phase spectra. (a) g(t) = A rect(t/τ); (b) |G(f)|, peak value Aτ, null bandwidth Bn = 1/τ; (c) ∠G(f), deg.
Figure 4.30 Worked Example 4.7: Sinusoidal pulse and its amplitude spectrum. (a) g1(t) = A cos(mπt/τ) rect(t/τ), m = 3; (b) amplitude spectrum, peak value 0.509Aτ.
(a) The given signal g(t) is a unipolar rectangular pulse of amplitude A and duration τ centred at the origin, as shown in Figure 4.29a. A straightforward evaluation of Eq. (4.74) using the given specification of g(t) yields G(f) as follows

$$\begin{aligned} G(f) &= \int_{-\infty}^{\infty} g(t)e^{-j2\pi ft}\,dt \\ &= \int_{-\tau/2}^{\tau/2} Ae^{-j2\pi ft}\,dt, \qquad \text{since } g(t) = \begin{cases} A, & -\tau/2 \le t \le \tau/2 \\ 0, & \text{otherwise} \end{cases} \\ &= A\left.\frac{e^{-j2\pi ft}}{-j2\pi f}\right|_{-\tau/2}^{\tau/2} = \frac{A}{j2\pi f}\left[e^{j2\pi f\tau/2} - e^{-j2\pi f\tau/2}\right] \\ &= \frac{A}{j2\pi f}\,2j\sin(2\pi f\tau/2), \qquad \text{(using Euler's formula)} \\ &= A\tau\,\frac{\sin(\pi f\tau)}{\pi f\tau}, \qquad \text{(multiplying by } \tau/\tau) \\ &= A\tau\,\mathrm{sinc}(f\tau) \end{aligned}$$

The amplitude spectrum |G(f)| is sketched in Figure 4.29b and the phase spectrum φg(f) = ∠G(f) in Figure 4.29c. We see that a rectangular pulse of duration τ has a sinc-shaped amplitude spectrum with nulls at f = ±1/τ, ±2/τ, ±3/τ, … In this case in which the pulse is centred on the y axis, the spectrum is entirely real, so the phase spectrum is 0° when G(f) is positive and is ±180° when G(f) is negative, which happens in alternate lobes of the spectrum. Notice that the amplitude spectrum is an even function of frequency, whereas the phase spectrum is an odd function of frequency, and this feature will always be the case whenever the signal g(t) is real.
(b) The given signal g1(t) is a sinusoidal pulse of amplitude A and duration τ centred at the origin and completing an odd number m of half-cycles within the interval τ, as shown in Figure 4.30a. In this case m = 3, but we will derive the FT expression for a general m = 1, 3, 5, 7, … Substituting the given specification of g1(t) in Eq. (4.74) and using Euler's formula to expand the integral yields

$$\begin{aligned} G_1(f) &= \int_{-\infty}^{\infty} g_1(t)e^{-j2\pi ft}\,dt = \int_{-\tau/2}^{\tau/2} A\cos\left(\frac{m\pi}{\tau}t\right)e^{-j2\pi ft}\,dt \\ &= A\int_{-\tau/2}^{\tau/2}\cos\left(\frac{m\pi}{\tau}t\right)\cos(2\pi ft)\,dt - jA\int_{-\tau/2}^{\tau/2}\cos\left(\frac{m\pi}{\tau}t\right)\sin(2\pi ft)\,dt \end{aligned}$$

The second integral is zero (since the integrand is an odd function) and the first integrand is even so the integral may be doubled and evaluated in the half-interval (0, τ/2) to obtain

$$\begin{aligned} G_1(f) &= 2A\int_0^{\tau/2}\cos\left(\frac{m\pi}{\tau}t\right)\cos(2\pi ft)\,dt \\ &= A\int_0^{\tau/2}\left[\cos((2\pi f + m\pi/\tau)t) + \cos((2\pi f - m\pi/\tau)t)\right]dt \\ &= A\left.\left[\frac{\sin((2\pi f + m\pi/\tau)t)}{2\pi f + m\pi/\tau} + \frac{\sin((2\pi f - m\pi/\tau)t)}{2\pi f - m\pi/\tau}\right]\right|_0^{\tau/2} \\ &= A\left[\frac{\sin(\pi(f\tau + m/2))}{\pi(2f + m/\tau)} + \frac{\sin(\pi(f\tau - m/2))}{\pi(2f - m/\tau)}\right] \end{aligned}$$

Multiplying the top and bottom of the right-hand side by τ/2 allows us to introduce the sinc function and obtain the result

$$G_1(f) = A\frac{\tau}{2}\left[\mathrm{sinc}(f\tau + m/2) + \mathrm{sinc}(f\tau - m/2)\right]$$

The amplitude spectrum |G1(f)| is shown in Figure 4.30b for m = 3. The phase spectrum ∠G1(f) (not shown) is 0° and 180° in alternate lobes, starting with 0° in the highest-amplitude lobe.
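Both closed-form results of this worked example can be cross-checked by evaluating the defining integral of Eq. (4.74) numerically. The sketch below is an illustration only: the helper name `ft_midpoint`, the values A = 2 and τ = 1 ms, and the use of a midpoint-rule sum are all assumptions made for the check, not part of the original example.

```python
import numpy as np

A, tau, m = 2.0, 1e-3, 3           # illustrative values (assumed)
N = 4000                           # number of integration cells across the pulse
dt = tau / N
t = -tau / 2 + (np.arange(N) + 0.5) * dt      # cell midpoints in (-tau/2, tau/2)
f = np.linspace(-5 / tau, 5 / tau, 101)       # frequencies from -5/tau to 5/tau

def ft_midpoint(g):
    """Approximate G(f) = integral of g(t) exp(-j 2 pi f t) dt by a midpoint sum."""
    return (np.exp(-2j * np.pi * np.outer(f, t)) @ g) * dt

# (a) rectangular pulse and (b) sinusoidal pulse with m half-cycles
G_rect_num = ft_midpoint(np.full(N, A))
G_sin_num = ft_midpoint(A * np.cos(m * np.pi * t / tau))

# Closed forms derived above
G_rect = A * tau * np.sinc(f * tau)
G_sin = A * tau / 2 * (np.sinc(f * tau + m / 2) + np.sinc(f * tau - m / 2))

err_rect = np.max(np.abs(G_rect_num - G_rect))
err_sin = np.max(np.abs(G_sin_num - G_sin))
print(err_rect / (A * tau), err_sin / (A * tau))   # both should be very small
print(np.max(np.abs(G_sin)) / (A * tau))           # peak of |G1(f)| in units of A*tau
```

The last printed value can be compared with the peak marked on the amplitude spectrum in Figure 4.30b.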
4.3.1 Properties of the Fourier Transform
The FT has several properties which are often exploited in communication system analysis and to reduce the computations needed to derive the FT of related signals. It is important that you familiarise yourself with these features and properties which we summarise below, mostly without proof, along with examples showing how some of them are used in practice. In what follows, it is given that

$$g(t) \rightleftharpoons G(f), \qquad g_1(t) \rightleftharpoons G_1(f), \qquad g_2(t) \rightleftharpoons G_2(f)$$

and that a, a1, a2, and to are constants.
4.3.1.1 Even and Odd Functions
Using Euler's formula, the FT may be expressed in terms of sine and cosine functions as

$$G(f) = \int_{-\infty}^{\infty} g(t)e^{-j2\pi ft}\,dt = \int_{-\infty}^{\infty} g(t)\cos(2\pi ft)\,dt - j\int_{-\infty}^{\infty} g(t)\sin(2\pi ft)\,dt$$
Referring to the properties of even and odd functions listed in Section 2.5.5, and noting that the cosine function is even whereas the sine function is odd, we see that the first integral will be zero when g(t) is an odd function of time, and similarly the second integral will be zero when g(t) is even. Thus

$$G(f) = \begin{cases} 2\displaystyle\int_0^{\infty} g(t)\cos(2\pi ft)\,dt, & g(t)\ \text{even} \\[2mm] -2j\displaystyle\int_0^{\infty} g(t)\sin(2\pi ft)\,dt, & g(t)\ \text{odd} \end{cases} \tag{4.78}$$

This result indicates that:
● The FT of an even function of time is a real function of frequency, with its phase spectrum therefore limited in values to only ∠G(f) = 0° or ±180° or both.
● The FT of an odd function of time is an imaginary function of frequency, with its phase spectrum therefore limited in values to only ∠G(f) = ±90°.
4.3.1.2 Linearity

$$a_1 g_1(t) + a_2 g_2(t) \rightleftharpoons a_1 G_1(f) + a_2 G_2(f) \tag{4.79}$$

Equation (4.79) indicates that the FT obeys the principle of superposition. This means that the FT is a linear operator whose output in the presence of multiple simultaneous inputs is the result of treating each input as if it were present alone and then adding the individual outputs together.
4.3.1.3 Time Shifting

$$g(t - t_o) \rightleftharpoons G(f)\exp(-j2\pi ft_o) = |G(f)|\angle[\angle G(f) - 2\pi ft_o] \tag{4.80}$$

This indicates that the time domain action of delaying a signal g(t) by time to does not alter the signal's amplitude spectrum in any way; but it adds −2πfto to the phase spectrum of g(t).
4.3.1.4 Frequency Shifting

$$g(t)\exp(j2\pi f_c t) \rightleftharpoons G(f - f_c) \tag{4.81}$$

Therefore, multiplying a signal by exp(j2πfc t) in the time domain has the effect in the frequency domain of translating its entire spectrum rightward by fc along the frequency axis. Similarly, multiplying by exp(−j2πfc t) will translate the spectrum leftward through fc along the frequency axis. Furthermore, since

$$\cos(2\pi f_c t) = \frac{1}{2}\left[\exp(j2\pi f_c t) + \exp(-j2\pi f_c t)\right]$$

it follows from Eq. (4.81) and the linearity property of Eq. (4.79) that

$$g(t)\cos(2\pi f_c t) \rightleftharpoons \frac{1}{2}G(f - f_c) + \frac{1}{2}G(f + f_c) \tag{4.82}$$

The effect of multiplication by cos(2πfc t), an operation carried out, for example, in a mixer circuit, is therefore to translate the signal's spectrum by ±fc along the frequency axis, applying a scale factor of one-half in the process.
4.3.1.5 Time Scaling

$$g(at) \rightleftharpoons \frac{1}{|a|}G(f/a) \tag{4.83}$$

This indicates that compressing a signal in the time domain (e.g. by shortening its duration) causes its spectrum to expand by the same factor, and vice versa. This is a statement of the inverse relationship that exists between the description of a signal in the time and frequency domains.
4.3.1.6 Time Reversal

$$g(-t) \rightleftharpoons G(-f) \tag{4.84}$$

Equation (4.84) states that reversal in the time domain corresponds to reversal in the frequency domain. We know that if g(t) is real then its amplitude spectrum is even (and hence unchanged by a reversal operation), whereas its phase spectrum is odd and therefore changed in sign by a reversal operation. Thus, time reversal does not alter the amplitude spectrum of a real signal but multiplies its phase spectrum by a factor of −1.
4.3.1.7 Complex Conjugation

$$g^*(t) \rightleftharpoons G^*(-f) \tag{4.85}$$

Thus, complex conjugation in the time domain is equivalent to complex conjugation plus reversal (i.e. flipping about f = 0) in the frequency domain. If g(t) is real then complex conjugation has no effect on the signal in the time domain and therefore no effect in the frequency domain, so the spectrum of g(t) and the spectrum of g*(t) given above must be identical. Similarly, if g(t) is real then g(−t) = g*(−t), which means that the spectrum of g(−t) given in Eq. (4.84) as G(−f) must be identical with the spectrum of g*(−t), which, according to Eq. (4.85), is obtained by complex conjugation and reversal of G(−f) to produce G*(f). In summary

$$\left.\begin{aligned} G^*(-f) &= G(f) \\ G(-f) &= G^*(f) \\ g(-t) &\rightleftharpoons G^*(f) \end{aligned}\right\}, \qquad g(t)\ \text{real} \tag{4.86}$$
4.3.1.8 Duality

$$G(t) \rightleftharpoons g(-f) \tag{4.87}$$

Equation (4.87) states that if a signal g2(t) = G(t) is identical in shape to the spectrum of g(t), then the spectrum of g2(t) will be G2(f) = g(−f), having a shape that is identical to a time-reversed version of g(t). For example, we established in Worked Example 4.7 that a rectangular pulse has a sinc spectrum. It follows from this duality property that a sinc pulse will have a rectangular spectrum. To be more specific, since

$$\mathrm{rect}\left(\frac{t}{\tau}\right) \equiv g(t) \rightleftharpoons \tau\,\mathrm{sinc}(f\tau) \equiv G(f)$$

Equation (4.87) implies that

$$\begin{aligned} G(t) = \tau\,\mathrm{sinc}(t\tau) &\rightleftharpoons g(-f) = \mathrm{rect}(-f/\tau) \\ \mathrm{sinc}(t\tau) &\rightleftharpoons \frac{1}{\tau}\mathrm{rect}(f/\tau), \qquad \text{(since rect() is even)} \\ \mathrm{sinc}(2Bt) &\rightleftharpoons \frac{1}{2B}\mathrm{rect}\left(\frac{f}{2B}\right) \end{aligned}$$

Thus, a sinc pulse with nulls at intervals of t = 1/2B has a brick wall spectrum of bandwidth equal to B.
4.3.1.9 Differentiation

$$\frac{d}{dt}g(t) \rightleftharpoons j2\pi fG(f), \qquad \frac{d^n}{dt^n}g(t) \rightleftharpoons (j2\pi f)^n G(f) \tag{4.88}$$

Differentiation in the time domain is therefore a form of high-pass filtering operation. It boosts the high-frequency components by multiplying the amplitude spectrum by a factor of 2πf, which increases with frequency. This relatively reduces the amplitudes of the low-frequency components. Furthermore, the factor j indicates that the phase of each frequency component is advanced by 90°.
4.3.1.10 Integration

$$\int_{-\infty}^{t} g(\lambda)\,d\lambda \rightleftharpoons \frac{G(f)}{j2\pi f} + \frac{G(0)}{2}\delta(f) \tag{4.89}$$

Integration in the time domain is therefore a lowpass filtering operation. The amplitude of each frequency component is multiplied by a factor of 1/(2πf), which is inversely proportional to frequency. This attenuates the high-frequency components relative to the low-frequency components. The phase of each component is also reduced by 90°.
4.3.1.11 Multiplication

$$g_1(t)g_2(t) \rightleftharpoons \int_{-\infty}^{\infty} G_1(\gamma)G_2(f - \gamma)\,d\gamma = G_1(f) * G_2(f) \tag{4.90}$$

Multiplication of two signals in the time domain therefore corresponds to convolution of their FTs in the frequency domain.
4.3.1.12 Convolution

$$g_1(t) * g_2(t) \rightleftharpoons G_1(f)G_2(f) \tag{4.91}$$

The convolution of two signals or functions in the time domain corresponds to the multiplication of their FTs in the frequency domain. Multiplication being a simpler operation than convolution, this property is exploited extensively in systems analysis. For example, we saw in Section 3.6.1 that the output y(t) of a linear time invariant (LTI) system is the convolution of the input signal x(t) with the system's impulse response h(t). This property indicates that, in analysing an LTI system to determine its output, we may sidestep the convolution operation by adopting the following alternative analysis steps
1. Determine the FT of x(t): x(t) ⇌ X(f)
2. Determine the FT of h(t): h(t) ⇌ H(f)
3. Multiply the two FTs: X(f)H(f) = Y(f)
4. The desired output y(t) is the IFT of this product: y(t) = F⁻¹[Y(f)].
These steps might seem long-winded. However, FTs are extensively tabulated and therefore the above steps are often faster than undertaking the convolution operation. We return to this method of analysis later in this chapter. The convolution property may also be applied as follows to show that convolving a signal g(t) with the impulse function δ(t) leaves the signal unchanged. Since F[δ(t)] = 1, it follows by the convolution property that

$$g(t) * \delta(t) \rightleftharpoons G(f) \times 1 = G(f)$$

which states that the FT of g(t) ∗ δ(t) is G(f) and hence that

$$g(t) * \delta(t) = g(t) \tag{4.92}$$
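The four analysis steps listed for an LTI system can be illustrated with sampled signals, using the discrete Fourier transform (computed with NumPy's FFT routines) as a stand-in for the FT. This is a sketch under stated assumptions rather than the book's method: the sequences x and h are made-up values, the function name `convolve_via_fft` is hypothetical, and both transforms are zero-padded to the full output length so that the product corresponds to ordinary (linear) rather than circular convolution.

```python
import numpy as np

def convolve_via_fft(x, h):
    """Convolve two real sequences by multiplying their transforms."""
    n = len(x) + len(h) - 1        # length of the full linear convolution
    X = np.fft.fft(x, n)           # step 1: transform of the input (zero-padded)
    H = np.fft.fft(h, n)           # step 2: transform of the impulse response
    Y = X * H                      # step 3: multiply the two transforms
    return np.fft.ifft(Y).real     # step 4: inverse transform gives the output

x = np.array([1.0, 2.0, 3.0, 4.0])     # illustrative input samples (assumed)
h = np.array([0.5, 0.25, 0.125])       # illustrative impulse response (assumed)

y_fft = convolve_via_fft(x, h)
y_direct = np.convolve(x, h)           # direct time-domain convolution
print(np.allclose(y_fft, y_direct))

# Discrete counterpart of Eq. (4.92): convolving with a unit sample leaves x unchanged
delta = np.array([1.0])
print(np.allclose(convolve_via_fft(x, delta), x))
```

For short sequences the direct convolution is just as quick; the transform route pays off for long signals, which mirrors the remark that multiplication is the simpler operation.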
4.3.1.13 Areas

$$\begin{aligned} g(0) &= \int_{-\infty}^{\infty} G(f)\,df && \text{(a)} \\ G(0) &= \int_{-\infty}^{\infty} g(t)\,dt && \text{(b)} \end{aligned} \tag{4.93}$$

The total area under the spectrum of a signal gives the value of the signal at time t = 0, whereas the total area under the waveform of the signal gives the value of the spectrum of the signal at f = 0.
Figure 4.31 Worked Example 4.8. (a) Given: g(t) = A rect(t/τ) ⇌ G(f) = Aτ sinc(fτ). Desired: (b) g1(t) ⇌ G1(f)?; (c) g2(t) ⇌ G2(f)?; (d) g3(t) = A trian(t/τ) ⇌ G3(f)?
4.3.1.14 Energy

$$\int_{-\infty}^{\infty} |g(t)|^2\,dt = \int_{-\infty}^{\infty} |G(f)|^2\,df \tag{4.94}$$

This is the so-called Parseval's theorem (also known as Rayleigh's energy theorem) which allows the same integration process to be used to calculate signal energy in either domain. Each integral must be finite, implying that g(t) is an energy signal.
Worked Example 4.8
Application of Fourier Transform Properties
Given that

$$A\,\mathrm{rect}(t/\tau) \rightleftharpoons A\tau\,\mathrm{sinc}(f\tau)$$

we wish to apply some of the above FT properties to quickly determine the FT of each of the pulses g1(t), g2(t), and g3(t) shown in Figure 4.31b–d without the need to start from first principles. The first pulse g1(t) is the result of delaying the centred rectangular pulse g(t) (shown again in Figure 4.31a) by τ/2 and multiplying by a scale factor −1. Thus

$$g_1(t) = -A\,\mathrm{rect}\left(\frac{t - \tau/2}{\tau}\right)$$

We take the FT of both sides of the above equation, applying the linearity and time shifting properties of the FT, to obtain the FT of g1(t) as

$$\begin{aligned} G_1(f) &= -G(f)e^{-j2\pi f\tau/2} = -A\tau\,\mathrm{sinc}(f\tau)e^{-j\pi f\tau} \\ &= e^{-j\pi}A\tau\,\mathrm{sinc}(f\tau)e^{-j\pi f\tau}, \qquad (\text{since } -1 = e^{-j\pi}) \\ &= A\tau\,\mathrm{sinc}(f\tau)e^{-j\pi(1 + f\tau)} \end{aligned}$$
The second pulse g2(t) in Figure 4.31c is a bipolar pulse obtained as follows from two centred rectangular pulses g1/2(t) of amplitude A and duration τ/2: (i) advance one pulse by τ/4; (ii) delay the other g1/2(t) by τ/4 and scale by −1; (iii) add the two pulses of (i) and (ii). That is

$$\begin{aligned} g_2(t) &= g_{1/2}\left(t + \frac{\tau}{4}\right) - g_{1/2}\left(t - \frac{\tau}{4}\right) \\ &= A\,\mathrm{rect}\left(\frac{t + \tau/4}{\tau/2}\right) - A\,\mathrm{rect}\left(\frac{t - \tau/4}{\tau/2}\right) \\ &= A\,\mathrm{rect}\left(\frac{2(t + \tau/4)}{\tau}\right) - A\,\mathrm{rect}\left(\frac{2(t - \tau/4)}{\tau}\right) \end{aligned}$$

When compared to the rectangular pulse A rect(t/τ) whose FT we are given, the two terms on the right-hand side are the same rectangular pulse compressed by a factor of 2 and, respectively, advanced and delayed by τ/4. We therefore take the FT of both sides of the above equation, employing the FT properties of time scaling (Eq. (4.83)) and time shifting (Eq. (4.80)), to obtain the FT of g2(t) as

$$\begin{aligned} G_2(f) &= A\frac{\tau}{2}\mathrm{sinc}\left(f\frac{\tau}{2}\right)e^{j2\pi f\tau/4} - A\frac{\tau}{2}\mathrm{sinc}\left(f\frac{\tau}{2}\right)e^{-j2\pi f\tau/4} \\ &= A\frac{\tau}{2}\mathrm{sinc}\left(f\frac{\tau}{2}\right)\left[e^{j\pi f\tau/2} - e^{-j\pi f\tau/2}\right] = A\frac{\tau}{2}\mathrm{sinc}\left(f\frac{\tau}{2}\right)2j\sin(\pi f\tau/2) \\ &= A\frac{\tau}{2}\mathrm{sinc}\left(f\frac{\tau}{2}\right)2j\sin(\pi f\tau/2) \times \frac{\pi f\tau/2}{\pi f\tau/2} = j\pi\frac{A\tau^2}{2}f\,\mathrm{sinc}\left(f\frac{\tau}{2}\right)\mathrm{sinc}\left(f\frac{\tau}{2}\right) \\ &= j\pi\frac{A\tau^2}{2}f\,\mathrm{sinc}^2\left(f\frac{\tau}{2}\right) \end{aligned}$$

Notice that G2(f) is imaginary as expected since g2(t) is an odd function. The phase of G2(f) is ∠G2(f) = 90° for f > 0, and ∠G2(f) = −90° for f < 0.
Moving on to the third pulse (Figure 4.31d), we observe that the given triangular pulse g3(t) can be expressed in terms of the above bipolar pulse g2(t) since

$$\frac{dg_3(t)}{dt} = \begin{cases} 0, & t < -\tau/2 \\ 2A/\tau, & -\tau/2 \le t < 0 \\ -2A/\tau, & 0 < t \le \tau/2 \\ 0, & t > \tau/2 \end{cases} = \frac{2}{\tau}g_2(t)$$

Integrating both sides yields

$$g_3(t) = \frac{2}{\tau}\int_{-\infty}^{t} g_2(\lambda)\,d\lambda$$

Thus, g3(t) is the integral of g2(t) along with a scale factor 2/τ. And since we know G2(f) from the previous solution, we take the FT of both sides of the above equation, employing the integration property of FT (with G2(0) = 0) and the linearity property, to obtain the FT of g3(t) as

$$G_3(f) = \frac{2}{\tau}\frac{G_2(f)}{j2\pi f} = \frac{2}{\tau} \times \frac{1}{j2\pi f} \times j\pi\frac{A\tau^2}{2}f\,\mathrm{sinc}^2\left(f\frac{\tau}{2}\right) = A\frac{\tau}{2}\mathrm{sinc}^2\left(f\frac{\tau}{2}\right)$$

This spectrum is real as expected since g3(t) is even. In addition, G3(f) is positive at all frequencies, so its phase ∠G3(f) = 0° at all frequencies.
To summarise, we have obtained the FT of the delayed (and polarity-inverted) rectangular pulse g1(t), the bipolar rectangular pulse g2(t), and the triangular pulse g3(t) = A trian(t/τ) as follows

$$\begin{aligned} G_1(f) &= A\tau\,\mathrm{sinc}(f\tau)e^{-j\pi(1 + f\tau)} \\ G_2(f) &= j\pi\frac{A\tau^2}{2}f\,\mathrm{sinc}^2\left(f\frac{\tau}{2}\right) \\ G_3(f) &= A\frac{\tau}{2}\mathrm{sinc}^2\left(f\frac{\tau}{2}\right) \end{aligned}$$

These expressions in general evaluate to complex numbers, and it is important to be able to interpret them to extract the correct magnitude and angle, which, respectively, provide amplitude and phase spectral values. For example, the magnitude and angle of G1(f) are given by

$$|G_1(f)| = |A\tau\,\mathrm{sinc}(f\tau)|, \qquad \angle G_1(f) = \begin{cases} 0, & A\tau\,\mathrm{sinc}(f\tau) = 0 \\ -(1 + f\tau)\pi, & A\tau\,\mathrm{sinc}(f\tau) > 0 \\ -\pi\tau f, & A\tau\,\mathrm{sinc}(f\tau) < 0 \end{cases}$$

The spectra of the above three pulses are plotted in Figure 4.32a–c. It would be useful to take a moment to verify some of the values plotted, especially for ∠G1(f), to satisfy yourself that you have the required understanding. The plot of ∠G3(f) = 0° is trivial and not shown.
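One way to verify the three results of this worked example is to compare them with a direct numerical evaluation of Eq. (4.74). The sketch below does this with a midpoint-rule sum; the normalised values A = 1 and τ = 1, the grid sizes, and all variable names are assumptions chosen for illustration.

```python
import numpy as np

A, tau = 1.0, 1.0                  # normalised amplitude and duration (assumed)
N = 8000                           # even, so that t = 0 falls on a cell boundary
dt = tau / N
t = -tau / 2 + (np.arange(N) + 0.5) * dt     # midpoints in (-tau/2, tau/2)
f = np.linspace(-4 / tau, 4 / tau, 81)

# Pulses of Figure 4.31b-d sampled at the midpoints
t1 = t + tau / 2                             # g1 occupies 0 <= t <= tau
g1 = np.full(N, -A)                          # delayed, polarity-inverted rectangle
g2 = np.where(t < 0, A, -A)                  # bipolar pulse: +A then -A
g3 = A * (1 - 2 * np.abs(t) / tau)           # triangular pulse of duration tau

def ft(g, tt):
    """Midpoint-rule approximation of the FT integral over the samples tt."""
    return (np.exp(-2j * np.pi * np.outer(f, tt)) @ g) * dt

G1_num, G2_num, G3_num = ft(g1, t1), ft(g2, t), ft(g3, t)

# Closed forms obtained above from the FT properties
G1 = A * tau * np.sinc(f * tau) * np.exp(-1j * np.pi * (1 + f * tau))
G2 = 1j * np.pi * (A * tau**2 / 2) * f * np.sinc(f * tau / 2) ** 2
G3 = (A * tau / 2) * np.sinc(f * tau / 2) ** 2

for name, num, ref in [("G1", G1_num, G1), ("G2", G2_num, G2), ("G3", G3_num, G3)]:
    print(name, np.max(np.abs(num - ref)))   # maximum deviation over the grid
```

Taking `np.abs` and `np.angle` of the closed-form arrays gives amplitude and phase values that can be compared against Figure 4.32; note that `np.angle` returns the principal value in (−π, π], so the phase of G1(f) appears wrapped.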
4.3.2 Table of Fourier Transforms
The FT of a signal may be derived from first principles by evaluating the defining FT integral in Eq. (4.74) or by applying one or more of the FT properties in Eqs. (4.79)–(4.91) to manipulate a known FT to obtain the FT of a desired related signal. We have dealt in detail with both these methods in the preceding pages and two worked examples. Yet a third way to obtain the FT of a time domain waveform or the IFT of a frequency domain signal description is to read this information from a table of FT pairs for the most common standard signals, such as Table 4.5. If a desired signal is not in the table, it may still be possible to obtain its FT by manipulating a related entry in the table according to the FT properties discussed.
Worked Example 4.9 Analysis of Digital Transmission Pulses
In digital transmission, information is conveyed by transmitting energy signals of duration Ts. These signals are referred to as pulses or symbols, and each unique pulse is associated with a unique group of bits. The symbol rate is Rs = 1/Ts, and if each transmitted symbol represents k bits then the bit rate of the transmission system is Rb = kRs. The minimum number of unique symbols required to cover all possible combinations of k bits is M = 2^k, hence the system is described as M-ary transmission. In the case of a modulated transmission system, the symbols are sinusoidal pulses of duration Ts and the uniqueness of each symbol is achieved through a unique state of the carrier signal, corresponding to a unique set of values of the amplitude, phase, and frequency of the sinusoidal waveform that is constrained to duration Ts to form the transmitted symbol. The constraining of an infinite-duration sinusoid to duration Ts is achieved by multiplying the sinusoid by a suitable windowing pulse of duration Ts. It is worth mentioning that in this role the sinusoid is called a carrier in obvious reference to its function of conveying or carrying information. In this worked example, we wish to examine the time and frequency domain effects of using the two windowing pulses, a rectangular pulse w1(t) and a raised cosine pulse w2(t), shown in Figure 4.33a and b, and to comment on their impact on symbol rate and interference considerations.
Table 4.5 Fourier transform pairs.
No. | Time domain signal | Description | Fourier transform
1 | δ(t) | Unit impulse, Eq. (2.28) | 1
2 | 1 | Unit constant | δ(f)
3 | δ(t − to) | Delayed unit impulse | exp(−j2πfto)
4 | exp(j2πfc t) | Complex exponential | δ(f − fc)
5 | sgn(t) | Signum function, Eq. (2.13) | 1/(jπf)
6 | 1/(πt) | | −j sgn(f)
7 | u(t) | Unit step, Eq. (2.12) | (1/2)δ(f) + 1/(j2πf)
8 | rect(t/τ) | Rectangular pulse, Eq. (2.16) | τ sinc(fτ)
9 | sinc(t/τ) | Sinc pulse with nulls at ±τ, ±2τ, ±3τ, … | τ rect(fτ)
10 | trian(t/τ) | Triangular pulse, Eq. (2.21) | (τ/2) sinc²(fτ/2)
11 | cos(2πfc t) | | (1/2)δ(f − fc) + (1/2)δ(f + fc)
12 | sin(2πfc t) | | (1/2j)δ(f − fc) − (1/2j)δ(f + fc)
13 | (1/2)[1 + cos(2πt/τ)] rect(t/τ) | Raised cosine pulse of duration τ | (τ/2) sinc(fτ) + (τ/4) sinc(fτ + 1) + (τ/4) sinc(fτ − 1)
14 | cos(mπt/τ) rect(t/τ) | Centred sinusoidal pulse with odd number m of half-cycles in −τ/2 ≤ t ≤ τ/2 | (τ/2)[sinc(fτ + m/2) + sinc(fτ − m/2)]
15 | sin(mπt/τ) rect(t/τ) | Centred sinusoidal pulse with even number m of half-cycles in −τ/2 ≤ t ≤ τ/2 | j(τ/2)[sinc(fτ + m/2) − sinc(fτ − m/2)]
16 | exp(−πt²) | Gaussian function | exp(−πf²)
17 | exp(−at)u(t), a > 0 | | 1/(a + j2πf)
18 | exp(−a|t|), a > 0 | | 2a/[a² + (2πf)²]
19 | exp(−at) t u(t), a > 0 | | 1/(a + j2πf)²
20 | cos(2πfc t)u(t) | Single-sided cosine | (1/4)[δ(f − fc) + δ(f + fc)] + j2πf/[(2πfc)² − (2πf)²]
21 | sin(2πfc t)u(t) | Single-sided sine | (1/4j)[δ(f − fc) − δ(f + fc)] + 2πfc/[(2πfc)² − (2πf)²]
22 | exp(−at) cos(2πfc t)u(t) | Exponentially decaying single-sided cosine | (a + j2πf)/[(2πfc)² + (a + j2πf)²]
23 | exp(−at) sin(2πfc t)u(t) | Exponentially decaying single-sided sine | 2πfc/[(2πfc)² + (a + j2πf)²]
4.3 Fourier Transform
Figure 4.32 Worked Example 4.8. (a) Amplitude and phase spectra of the delayed and polarity-inverted rectangular pulse $g_1(t)$ of duration $\tau$ (shown in Figure 4.31b); (b) amplitude and phase spectra of the bipolar rectangular pulse $g_2(t)$ of duration $\tau$; (c) amplitude spectrum of the triangular pulse $g_3(t)$ of duration $\tau$, for which $\angle G_3(f) = 0^\circ$. The frequency axes are marked in multiples of $1/\tau$.
We will assume a carrier that completes 20 cycles in one symbol interval, so the basic expression for the sinusoidal carrier is $g(t) = \cos(2\pi f_c t)$, with $f_c = 20R_s$, where $R_s = 1/T_s$ is the symbol rate (in symbols per second, called baud). Multiplying $g(t)$, which is periodic and of infinite duration, by the given window functions $w_1(t)$ and $w_2(t)$ produces the respective sinusoidal pulses $g_1(t)$ and $g_2(t)$, of duration $T_s$, shown in Figure 4.33c and d. We see that $g_2(t)$ tapers to zero at symbol boundaries, whereas $g_1(t)$ does not. Since the state of the carrier will change from one symbol interval to the next (depending on the identity of data bits in each interval), jump discontinuities will occur in $g_1(t)$ when its phase or amplitude undergoes a sudden change at a symbol boundary. For example, a change in phase from $0^\circ$ to $180^\circ$ will cause a jump discontinuity equal to double the carrier amplitude. Such discontinuities give rise to higher-frequency components in the transmitted waveform, which cause interference into adjacent channels, or force a reduction in bit rate or in the number of users when the problem is mitigated.
Figure 4.33 Worked Example 4.9: Windowing pulses, each plotted over $-T_s/2 \le t \le T_s/2$. (a) Rectangular pulse $w_1(t)$; (b) Raised cosine pulse $w_2(t)$; (c) Rectangular windowed sinusoidal pulse $g_1(t)$; (d) Raised cosine windowed sinusoidal pulse $g_2(t)$.
In contrast, the tapering of $g_2(t)$ eliminates discontinuities and ensures a smooth transition of signal level through symbol boundaries, whatever the state of the carrier in each symbol interval. This reduces high-frequency content in the transmitted waveform and hence minimises interference into adjacent channels.

Let us now turn to the frequency domain, with the help of the Fourier transform (FT), to provide a more quantitative analysis of the above time domain observations. We have the following FT pairs

$$w_1(t) \rightleftharpoons W_1(f); \qquad g_1(t) \rightleftharpoons G_1(f)$$

$$w_2(t) \rightleftharpoons W_2(f); \qquad g_2(t) \rightleftharpoons G_2(f)$$

Our interest is in $G_1(f)$ and $G_2(f)$, but since $g_1(t)$ and $g_2(t)$ are, respectively, the result of multiplying $w_1(t)$ and $w_2(t)$ by $\cos(2\pi f_c t)$, it follows from Eq. (4.82) that the spectra $G_1(f)$ and $G_2(f)$ will be the respective translations of $W_1(f)$ and $W_2(f)$ through $\pm f_c$. All the spectrum information about $G_1(f)$ and $G_2(f)$ is therefore contained in $W_1(f)$ and $W_2(f)$, except for spectral location, so it will be enough for us to examine $W_1(f)$ and $W_2(f)$ instead. The FT of a rectangular pulse is derived in Worked Example 4.7, from which we have

$$W_1(f) = T_s\,\mathrm{sinc}(fT_s)$$

The raised cosine pulse $w_2(t)$ in Figure 4.33b is given by

$$w_2(t) = \left[\frac{1}{2} + \frac{1}{2}\cos\left(\frac{2\pi}{T_s}t\right)\right]\mathrm{rect}\left(\frac{t}{T_s}\right) = \frac{1}{2}\mathrm{rect}\left(\frac{t}{T_s}\right) + \frac{1}{2}\mathrm{rect}\left(\frac{t}{T_s}\right)\cos\left(2\pi \cdot \frac{1}{T_s} \cdot t\right)$$

Taking the FT of both sides and (for the second term on the right-hand side) invoking the FT property in Eq. (4.82), which specifies the frequency domain effect of multiplying by a sinusoid (of frequency $1/T_s$ in this case), we obtain the following result quoted in Entry 13 of Table 4.5

$$W_2(f) = \frac{T_s}{2}\mathrm{sinc}(fT_s) + \frac{T_s}{4}\mathrm{sinc}\left[\left(f + \frac{1}{T_s}\right)T_s\right] + \frac{T_s}{4}\mathrm{sinc}\left[\left(f - \frac{1}{T_s}\right)T_s\right]$$

$$= \frac{T_s}{2}\mathrm{sinc}(fT_s) + \frac{T_s}{4}\mathrm{sinc}(fT_s + 1) + \frac{T_s}{4}\mathrm{sinc}(fT_s - 1) \tag{4.95}$$
We calculate $|W_1(f)|$ and $|W_2(f)|$ using the above formulas from $f = -25/T_s$ to $f = 25/T_s$ in steps of $1/(100T_s)$ to give a smooth plot. In order to better display the small values and variations, which in a linear plot would be indistinguishable from zero, we will use logarithmic units and plot the normalised amplitude spectra shown in Figure 4.34 and defined as the following quantities

$$Z_1 = 20\log_{10}\left(\frac{|W_1(f)|}{\max(|W_1(f)|)}\right); \qquad Z_2 = 20\log_{10}\left(\frac{|W_2(f)|}{\max(|W_2(f)|)}\right)$$

In the plot, the y axis is terminated at −50 dB to zoom in better on the region of interest. Note that the right-hand side of the amplitude spectra $|G_1(f)|$ and $|G_2(f)|$ will be identical to Figure 4.34 except for the addition of $f_c$ to the frequency values along the x axis. We see that $g_1(t)$ contains strong and persistent sidelobes (and hence high-frequency components), with its 25th sidelobe still above −40 dB (relative to peak). In contrast, the sidelobes of $g_2(t)$ decay very rapidly so that by the fourth sidelobe the amplitude level of frequency components is already below −50 dB. To understand the impact that this will have on digital transmission based on these two pulses, let $B$ denote transmission bandwidth (which means that the occupied frequency range is $f_c - B/2 \to f_c + B/2$) and let it be required to keep interference into adjacent channels below −30 dB. To satisfy this specification, $B/2$ must extend beyond $f_c$ until the start of the sidelobe that first dips below −30 dB. Reading from Figure 4.34, we have

$$\frac{B}{2} = \begin{cases} 10R_s, & \text{for } g_1(t) \\ 2R_s, & \text{for } g_2(t) \end{cases}$$

Figure 4.34 Worked Example 4.9: Normalised amplitude spectra $Z_1$ and $Z_2$ (in dB, plotted from $-25R_s$ to $25R_s$ with the threshold interference level of −30 dB marked) of rectangular and raised cosine windows respectively. $B$ indicates bandwidth after translation to location $f_c \gg B$.

This means that transmission bandwidth $B$ would be 20 times the symbol rate when using $g_1(t)$, compared to four times the symbol rate if using $g_2(t)$. A larger required transmission bandwidth per user inevitably leads to a reduction in the number of users that can be simultaneously supported. If, on the other hand, allocated bandwidth $B$ is fixed (as is typically the case), it means that we would be limited to operating at a symbol rate $R_s \le B/20$ when using $g_1(t)$ and $R_s \le B/4$ with $g_2(t)$. Therefore, using the raised cosine window would allow a fivefold increase in symbol rate (and hence bit rate) when compared to the rectangular window.

The required solution is complete, but to give a little insight into the reason for the rapid decay of sidelobes in the raised cosine pulse, we show in Figure 4.35 a linear plot of $W_2(f)$ in Eq. (4.95) along with its three component sinc pulses. We see that the polarity of the sidelobes of the two lower-amplitude (and frequency-shifted) sinc pulses and the polarity of the larger sinc pulse are opposite at every point, and this produces a rapid cancellation of sidelobes when they are summed to give $W_2(f)$. Note, however, that, on the downside, there is a broadening of the main lobe of $W_2(f)$ when compared to the lobe width of the larger sinc pulse. This is a result of the main lobes of the two frequency-shifted sinc pulses extending beyond the main lobe of the larger sinc pulse on either side.

Figure 4.35 Spectrum $W_2(f)$ of a raised cosine pulse of duration $T_s$ is the sum of three component sinc pulses, $\frac{T_s}{2}\mathrm{sinc}(fT_s)$, $\frac{T_s}{4}\mathrm{sinc}(fT_s + 1)$ and $\frac{T_s}{4}\mathrm{sinc}(fT_s - 1)$, plotted from $-6R_s$ to $6R_s$.
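The bandwidth readings of Worked Example 4.9 can be checked numerically. The following is a minimal sketch, not the method used to produce Figure 4.34: it assumes the normalised frequency variable x = f·Ts, a scan from 0 to 25 in steps of 0.01, and hypothetical helper names (`sinc`, `w1`, `w2`, `half_bandwidth`). It finds the last frequency at which the normalised spectrum is still at or above −30 dB and rounds up to the next spectral null (an integer multiple of Rs).

```python
import math

def sinc(x):
    """Normalised sinc, sin(pi x)/(pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def w1(x):
    """W1(f)/Ts for the rectangular window, with x = f*Ts."""
    return sinc(x)

def w2(x):
    """W2(f)/Ts for the raised cosine window, Eq. (4.95), with x = f*Ts."""
    return 0.5 * sinc(x) + 0.25 * sinc(x + 1) + 0.25 * sinc(x - 1)

def half_bandwidth(spectrum, threshold_db=-30.0, x_max=25.0, steps_per_unit=100):
    """Return B/2 in units of Rs: the first null beyond which the
    normalised spectrum stays below threshold_db (within the scanned range)."""
    peak = abs(spectrum(0.0))
    last_above = 0.0
    for i in range(int(x_max * steps_per_unit) + 1):
        x = i / steps_per_unit
        mag = abs(spectrum(x)) / peak
        # Skip exact zeros, whose level is minus infinity in dB
        if mag > 0 and 20 * math.log10(mag) >= threshold_db:
            last_above = x
    return math.ceil(last_above)

if __name__ == "__main__":
    print("B/2 for rectangular window  :", half_bandwidth(w1), "x Rs")
    print("B/2 for raised cosine window:", half_bandwidth(w2), "x Rs")
```

Under these assumptions the scan gives the same values as those read from the figure, namely B/2 = 10Rs for the rectangular window and 2Rs for the raised cosine window.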
4.3.3 Fourier Transform of Periodic Signals

We derived the FT by extending the Fourier series to apply to nonperiodic signals, and we did so by following the evolution of a signal from periodic $g_T(t)$ to nonperiodic $g(t)$ as illustrated in Figure 4.28. It turns out that taking those steps in reverse will allow us to extend the FT to be applicable to periodic signals, thus closing the loop that connects the FT and the Fourier series. Every periodic signal $g_T(t)$ of period $T$ is the result of repeating a pulse $g(t)$ of duration $T$ over and over along the time axis at regular intervals $0, \pm T, \pm 2T, \pm 3T, \ldots$, where the pulse $g(t)$ constitutes the fundamental shape of the periodic signal. That is, $g(t)$ is the waveform of $g_T(t)$ in the interval $(-T/2, T/2)$, so we may write

$$g(t) = \begin{cases} g_T(t), & -T/2 \le t \le T/2 \\ 0, & \text{elsewhere} \end{cases} \tag{4.96}$$
We assume that the pulse $g(t)$ is an energy signal. It is therefore guaranteed to have an FT $G(f)$, the magnitude of which is shown in the bottom row of Figure 4.28 for the specific case of a triangular pulse. Other fundamental shapes of an arbitrary periodic signal will understandably have a different spectral shape to what is shown in Figure 4.28. We see in Figure 4.28 that the spectrum of $g_T(t)$, which we will denote as $G_T(f)$, is a weighted sampling of $G(f)$ at a regular frequency spacing $f_o = 1/T$. The sample at frequency $f = nf_o$, $n = 0, \pm 1, \pm 2, \pm 3, \ldots$, is a spectral line that represents a complex exponential of frequency $nf_o$ and coefficient $C_n = f_o G(nf_o)$, by definition of $G(f)$ in Eq. (4.72). We represent this spectral line as $f_o G(nf_o)\delta(f - nf_o)$, which is an impulse of weight $f_o G(nf_o)$ located at $f = nf_o$. The FT of the periodic signal is a collection or sum of these impulses for all $n$. Thus

$$G_T(f) = f_o \sum_{n=-\infty}^{\infty} G(nf_o)\,\delta(f - nf_o), \qquad f_o = 1/T \tag{4.97}$$
This is an important result. It indicates that, whereas the FT $G(f)$ of a nonperiodic signal $g(t)$ is continuous, the FT $G_T(f)$ of a periodic signal $g_T(t)$ is discrete, having spectral lines or impulses of weight $f_o G(nf_o)$ at discrete frequency points $nf_o$. Each spectral line represents a complex exponential having frequency $nf_o$, amplitude $|f_o G(nf_o)|$, and phase equal to the angle of $f_o G(nf_o)$, which is simply $\angle G(nf_o)$ since $f_o$ is positive.

For interested readers, an alternative and mathematically rigorous derivation of Eq. (4.97) may be obtained as follows. Let us express $g_T(t)$ as a complex exponential Fourier series according to Eq. (4.23)

$$g_T(t) = \sum_{n=-\infty}^{\infty} C_n e^{j2\pi nf_o t}, \qquad f_o = 1/T$$

The coefficient $C_n$ in the above equation is related to the FT $G(f)$ of the fundamental shape of $g_T(t)$ according to the definition of $G(f)$ in Eq. (4.72) as

$$C_n = f_o\, G(f)\big|_{f = nf_o} = f_o G(nf_o)$$

Making this substitution for $C_n$ in the previous equation yields

$$g_T(t) = f_o \sum_{n=-\infty}^{\infty} G(nf_o)\, e^{j2\pi nf_o t} \tag{4.98}$$

We notice that the right-hand side is a sum of complex exponentials, each of which is Fourier transformable according to entry 4 in Table 4.5

$$e^{j2\pi nf_o t} \rightleftharpoons \delta(f - nf_o)$$

Therefore, denoting the FT of $g_T(t)$ as $G_T(f)$ and taking the FT of both sides of Eq. (4.98) yields

$$G_T(f) = f_o \sum_{n=-\infty}^{\infty} G(nf_o)\,\delta(f - nf_o), \qquad f_o = 1/T$$
This is the earlier heuristically derived Eq. (4.97), and is simply an alternative way of expressing the double-sided discrete spectrum of a periodic signal.
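As an illustration of the relation $C_n = f_o G(nf_o)$ behind Eq. (4.97), the sketch below compares the two routes to the Fourier series coefficients of a rectangular pulse train. The choice of pulse (rect(t/τ), whose FT is τ sinc(fτ) by entry 8 of Table 4.5), the parameter values, and the function names are assumptions made purely for illustration.

```python
import cmath
import math

def sinc(x):
    """Normalised sinc, sin(pi x)/(pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def coeff_from_ft(n, tau, T):
    """C_n = f_o * G(n f_o), using G(f) = tau * sinc(f * tau) for rect(t/tau)."""
    fo = 1.0 / T
    return fo * tau * sinc(n * fo * tau)

def coeff_by_integration(n, tau, T, steps=5000):
    """C_n = (1/T) * integral over one period of g_T(t) exp(-j 2 pi n t / T) dt.
    The pulse equals 1 on (-tau/2, tau/2) and 0 elsewhere in the period,
    so the midpoint rule is applied over the pulse width only."""
    dt = tau / steps
    total = 0j
    for i in range(steps):
        t = -tau / 2 + (i + 0.5) * dt
        total += cmath.exp(-2j * math.pi * n * t / T) * dt
    return total / T

if __name__ == "__main__":
    tau, T = 0.25, 1.0  # assumed pulse width and period (duty cycle 25 %)
    for n in range(0, 5):
        a = coeff_from_ft(n, tau, T)
        b = coeff_by_integration(n, tau, T)
        print(n, round(a, 6), round(b.real, 6))
```

The two columns printed should agree to within the accuracy of the numerical integration, which is what the weighted-sampling interpretation of Eq. (4.97) predicts.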
4.4 Discrete Fourier Transform

Our discussion has so far been focused on continuous-time (CT) signals, starting with the Fourier series for periodic CT signals and adapting this to obtain the FT for nonperiodic CT signals. The Fourier series may also be adapted as discussed below to obtain a discrete Fourier transform (DFT), which is applicable to discrete-time (DT) signals. Consider a DT signal represented by $g[n] = \{g(0), g(1), g(2), \ldots, g(N - 1)\}$. This is a sequence of $N$ samples obtained by sampling one cycle (of duration $T$) of a CT signal $g(t)$ at regular intervals $T_s$, known as the sampling interval. The sampling rate $F_s$ is therefore

$$F_s = \frac{1}{T_s}; \quad \Rightarrow \quad F_s T_s = 1 \tag{4.99}$$
The sampling is carried out by dividing interval $T$ of $g(t)$ into $N$ equal sub-intervals of size $T_s$. One sample is then taken at the start of each subinterval, yielding the samples $g(0), g(1), g(2), \ldots, g(N - 1)$. The samples may, however, also be taken at the midpoint of each subinterval, provided there is consistency so that all samples are uniformly spaced in time. Recall that $g(t)$ is constituted of complex exponentials and Eq. (4.24) gives the coefficient of the $k$th complex exponential in $g(t)$ as

$$C_k = \frac{1}{T}\int_{-T/2}^{T/2} g(t)\,e^{-j2\pi kf_o t}\,dt$$

This $k$th complex exponential has frequency $kf_o = k/T$, amplitude equal to the magnitude $|C_k|$ of $C_k$ (which is in general complex-valued), and phase equal to the angle $\angle C_k$ of $C_k$. As a result of the time variable in $g(t)$ becoming discrete in $g[n]$, the right-hand side of the above equation for $C_k$ will change as follows to describe the complex exponentials in $g[n]$: $t$ becomes $nT_s$, the impulse of weight $g(t)\,dt$ becomes the sample $g(nT_s) \equiv g(n)$, and the integral becomes a summation from $n = 0$ to $N - 1$. Thus

$$C_k = \frac{1}{T}\sum_{n=0}^{N-1} g(n)\,e^{-j2\pi kf_o nT_s}$$

Substituting $T_s = T/N$ and $f_o T = 1$ and defining the DFT of the data sequence $g[n]$, denoted $G(k)$, as the coefficient per unit frequency interval of the $k$th of the complex exponentials that constitute the data sequence, i.e. $G(k) = C_k/f_o = C_k T$, we obtain

$$G(k) = \sum_{n=0}^{N-1} g(n)\,e^{-j\frac{2\pi}{N}kn}, \qquad k = 0, 1, 2, \ldots, N - 1 \tag{4.100}$$
Equation (4.100) defines the DFT of $g[n]$. The spacing of spectral lines, usually referred to as frequency resolution, in the spectrum $G[k]$ is

$$\Delta f = f_o = \frac{1}{T} = \frac{1}{NT_s} = \frac{F_s}{N} \tag{4.101}$$

where $T$ is the total duration of $g[n]$; and the $k$th sample $G(k)$ of the spectrum is located at frequency $f_k$ (in Hz) given by

$$f_k = kf_o = \frac{k}{T} = k\frac{F_s}{N} \tag{4.102}$$
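Equations (4.100) and (4.102) translate almost line for line into code. The following is a minimal sketch by direct evaluation (no speed-up is attempted); the function names `dft` and `bin_frequencies` and the example sampling rate are assumptions for illustration only.

```python
import cmath

def dft(g):
    """Direct evaluation of Eq. (4.100):
    G(k) = sum_{n=0}^{N-1} g(n) exp(-j 2 pi k n / N), k = 0..N-1."""
    N = len(g)
    return [
        sum(g[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
        for k in range(N)
    ]

def bin_frequencies(N, Fs):
    """Frequency (Hz) of each transform sample, Eq. (4.102): f_k = k Fs / N."""
    return [k * Fs / N for k in range(N)]

if __name__ == "__main__":
    g = [1, 1, 1, 1]   # a constant (DC) sequence
    Fs = 8000.0        # assumed sampling rate in Hz
    for f, G in zip(bin_frequencies(len(g), Fs), dft(g)):
        print(f"{f:8.1f} Hz  |G| = {abs(G):.3f}")
```

For the constant sequence used here, all of the content falls in the k = 0 (DC) sample, whose value is the plain sum of the data.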
It is implicitly assumed that the sequence $g[n]$ repeats outside the interval $T$ within which the $N$ samples $\{g(0), g(1), g(2), \ldots, g(N - 1)\}$ were taken. This means that

$$g(n) = g(n \pm N) \tag{4.103}$$

Thus, $g[n]$ is a periodic DT signal having period $N$ samples, and is said to complete one cycle (or $2\pi$ rad) in a span of $N$ samples, so that its fundamental angular frequency (in rad/sample) is given by

$$\Omega_o = \frac{2\pi}{N} \tag{4.104}$$

The angular frequency $\Omega_k$ of the $k$th sample $G(k)$ of the spectrum is

$$\Omega_k = k\Omega_o = k\frac{2\pi}{N} \tag{4.105}$$

and this is a factor in the exponent of the DFT expression. Equation (4.100) states that a data sequence $g[n] = \{g(0), g(1), g(2), \ldots, g(N - 1)\}$, of length $N$, has a DFT $G[k] = \{G(0), G(1), G(2), \ldots, G(N - 1)\}$, also of length $N$. The $k$th element $G(k)$ of the DFT sequence is obtained as a weighted sum of $g[n]$ using corresponding weights $W[n] = \{1, e^{-j\Omega_k}, e^{-j2\Omega_k}, \ldots, e^{-j(N-1)\Omega_k}\}$. The weights are all of unit magnitude but of increasing angle $-n\Omega_k$, so the essence of the weighting is simply to rotate each data sample $g(n)$, treated as a phasor, through $n\Omega_k$ radians in the clockwise direction prior to summing to obtain $G(k)$. In the case of $G(0)$, all weights have unit magnitude and zero angle (since $\Omega_k = k\Omega_o = 0 \times \Omega_o = 0$), so $G(0)$ is just the (unweighted) sum of the entire data sequence.

To derive an expression for the inverse discrete Fourier transform (IDFT), which allows the exact data sequence $g[n]$ to be recovered from the transform sequence $G[k]$, recall the complex exponential Fourier series of $g(t)$

$$g(t) = \sum_{k=-\infty}^{\infty} C_k e^{j2\pi kf_o t}$$

and adapt this series to a data sequence by making the following changes: replace $t$ with $nT_s$; note that $g(t)\,dt \equiv g(n)$, so that $g(t) \equiv g(n)/dt$; replace $dt$ with $T_s$; and recall that (by definition) $G(k) = C_k T$, so that $C_k = G(k)/T$, $k = 0, 1, 2, \ldots, N - 1$. Introducing these changes into the above equation, and noting again that $T_s = T/N$ and $f_o T = 1$, we obtain

$$\frac{g(n)}{T_s} = \sum_{k=0}^{N-1} \frac{G(k)}{T} e^{j2\pi kf_o nT_s} = \frac{1}{T}\sum_{k=0}^{N-1} G(k)\,e^{j2\pi kf_o nT/N}$$

$$g(n) = \frac{T_s}{T}\sum_{k=0}^{N-1} G(k)\,e^{j\frac{2\pi}{N}kn} = \frac{1}{N}\sum_{k=0}^{N-1} G(k)\,e^{j\frac{2\pi}{N}kn}$$

That is

$$g(n) = \frac{1}{N}\sum_{k=0}^{N-1} G(k)\,e^{j\frac{2\pi}{N}kn}, \qquad n = 0, 1, 2, \ldots, N - 1 \tag{4.106}$$
Equation (4.106) is the IDFT of $G(k)$. It states that the original data sample $g(n)$ is the weighted average of the transform sequence $G[k] = \{G(0), G(1), G(2), \ldots, G(N - 1)\}$ using corresponding weights $W[k] = \{1, e^{j\Omega_n}, e^{j2\Omega_n}, \ldots, e^{j(N-1)\Omega_n}\}$, where $\Omega_n = n(2\pi/N)$. The weighting simply rotates each transform sample $G(k)$ through $k\Omega_n$ radians counterclockwise prior to averaging (i.e. summing the rotated samples and dividing by $N$) to obtain $g(n)$. In the case of $g(0)$, where $\Omega_n = 0$, there is no rotation prior to averaging, so $g(0)$ is just the (unweighted) average of the entire transform sequence $G[k]$.

To summarise, a discrete signal may be represented in two ways: either in the time domain by $g[n]$, which shows the signal's waveform features, or in the frequency domain by $G[k]$, showing the signal's spectral characteristics. $g[n]$ and $G[k]$ are said to form a DFT pair, a relationship denoted as

$$g[n] \rightleftharpoons G[k]$$

Note our careful choice of index notation, using $n$ to denote a sample count in the time domain and $k$ for a count in the frequency domain. Given $g[n]$, then $G[k]$ may be computed; and given $G[k]$, then an exact reconstruction of $g[n]$ may be obtained using the pair of equations

$$\begin{aligned}
G(k) &= \sum_{n=0}^{N-1} g(n)\,e^{-j\frac{2\pi}{N}kn}, & k &= 0, 1, 2, \ldots, N - 1 & &\text{(DFT)} \\
g(n) &= \frac{1}{N}\sum_{k=0}^{N-1} G(k)\,e^{j\frac{2\pi}{N}kn}, & n &= 0, 1, 2, \ldots, N - 1 & &\text{(IDFT)}
\end{aligned} \tag{4.107}$$
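The claim that the DFT pair of Eq. (4.107) allows exact reconstruction can be checked with a short round trip. This is a minimal sketch by direct evaluation of both sums; the function names and the test sequence are illustrative assumptions.

```python
import cmath

def dft(g):
    """Forward transform of Eq. (4.107): weighted sum with clockwise rotation."""
    N = len(g)
    return [
        sum(g[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
        for k in range(N)
    ]

def idft(G):
    """Inverse transform of Eq. (4.107): weighted average with
    counterclockwise rotation (note the 1/N factor)."""
    N = len(G)
    return [
        sum(G[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N)) / N
        for n in range(N)
    ]

if __name__ == "__main__":
    g = [0.5, -1.0, 2.0, 3.5, 0.0, 1.0]  # arbitrary real data sequence
    recovered = idft(dft(g))
    # Imaginary parts are rounding residue only for a real input sequence
    print([round(x.real, 12) for x in recovered])
```

Up to floating-point rounding, the recovered sequence equals the original one, and its first element is the plain average of the transform sequence, as noted for g(0) above.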
We should point out that some literature will describe the DFT derived above as discrete-time Fourier series (DTFS), with the word series emphasising the discrete nature of the frequency domain representation $G[k]$. Such literature will also define the discrete-time Fourier transform (DTFT), an adaptation of the FT equations (Eqs. (4.74) and (4.75)) for a nonperiodic CT signal $g(t)$, namely

$$\begin{aligned}
G(f) &= \int_{-\infty}^{\infty} g(t)\,e^{-j2\pi ft}\,dt & &\text{(FT)} \\
g(t) &= \int_{-\infty}^{\infty} G(f)\,e^{j2\pi ft}\,df & &\text{(IFT)}
\end{aligned} \tag{4.108}$$

for application to a nonperiodic DT signal $g[n]$, which is the result of sampling $g(t)$ at regular intervals $T_s$ or sampling rate $F_s = 1/T_s$. For completeness, we derive the DTFT equation by making the following changes in the above FT equation: $g(t)\,dt \to g(n)$; $t \to nT_s = n/F_s$; integration $\to$ summation over all $n$. This leads to

$$G(f) = \sum_{n=-\infty}^{\infty} g(n)\,e^{-jn2\pi f/F_s} \qquad \text{(DTFT)} \tag{4.109}$$

The DTFT may be expressed in terms of an angle parameter $\Omega$ in radians defined as the normalised frequency $f$ (i.e. $f$ divided by $F_s$) multiplied by $2\pi$

$$\Omega = 2\pi\frac{f}{F_s} \tag{4.110}$$

So, $f = F_s$ corresponds to $\Omega = 2\pi$, $f = F_s/2$ corresponds to $\Omega = \pi$, and DC corresponds to $\Omega = 0$. In terms of $\Omega$, the DTFT equation becomes

$$G(\Omega) = \sum_{n=-\infty}^{\infty} g(n)\,e^{-jn\Omega} \qquad \text{(DTFT)} \tag{4.111}$$
However, to avoid any confusion we will stick to an exclusive use of $f$ (in Hz) for the frequency domain variable and $t$ (in seconds) for the time domain variable. Equation (4.111) is stated simply for completeness and to help you relate our discussion with other conventions that you may encounter in the literature. Returning therefore to Eq. (4.109) and replacing $f$ with $f + F_s$ we find

$$G(f + F_s) = \sum_{n=-\infty}^{\infty} g(n)\,e^{-jn2\pi\left(\frac{f + F_s}{F_s}\right)} = \sum_{n=-\infty}^{\infty} g(n)\,e^{-jn2\pi\left(\frac{f}{F_s}\right)} e^{-jn2\pi} = \sum_{n=-\infty}^{\infty} g(n)\,e^{-jn2\pi\left(\frac{f}{F_s}\right)} = G(f)$$

where the third step follows because $e^{-jn2\pi} = 1$ for integer $n$. This means that the DTFT $G(f)$ is periodic with period $F_s$. So, one cycle of $G(f)$ lies in the frequency range $(-F_s/2, F_s/2)$ and the rest of $G(f)$ consists of replications of this cycle along the frequency axis at integer multiples of $F_s$. This agrees with our discovery in Figure 4.20c about the frequency domain effect of instantaneous sampling at rate $F_s$.
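The periodicity of the DTFT can be confirmed numerically for any finite sequence, for which the sum in Eq. (4.109) has only a finite number of nonzero terms. The sketch below assumes a short example sequence and sampling rate; the function name `dtft` is hypothetical.

```python
import cmath

def dtft(g, f, Fs):
    """Eq. (4.109) for a sequence that is zero outside n = 0..len(g)-1:
    G(f) = sum_n g(n) exp(-j n 2 pi f / Fs)."""
    return sum(g[n] * cmath.exp(-2j * cmath.pi * n * f / Fs) for n in range(len(g)))

if __name__ == "__main__":
    g = [1.0, 0.5, -0.25, 2.0, 0.0, -1.0]  # assumed data sequence
    Fs = 8000.0                            # assumed sampling rate in Hz
    for f in (0.0, 1234.5, 3999.0):
        a = dtft(g, f, Fs)
        b = dtft(g, f + Fs, Fs)            # one period further along the f axis
        print(f, abs(a - b))
```

The printed differences are at the level of floating-point rounding, consistent with G(f + Fs) = G(f).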
To obtain the inverse discrete-time Fourier transform (IDTFT) expression, we make the following changes to the IFT expression in Eq. (4.108): $dt \to T_s$; $g(t) \to g(n)/dt = g(n)/T_s = g(n)F_s$; $t \to nT_s = n/F_s$; integration over $(-\infty, \infty)$ $\to$ integration over one cycle $(-F_s/2, F_s/2)$ of $G(f)$. This gives

$$g(n) = \frac{1}{F_s}\int_{-F_s/2}^{F_s/2} G(f)\,e^{jn2\pi f/F_s}\,df \qquad \text{(IDTFT)} \tag{4.112}$$

In terms of $\Omega$ in Eq. (4.110), we use $df = d\Omega \cdot F_s/2\pi$ and $f = F_s/2 \to \Omega = \pi$ to obtain

$$g(n) = \frac{1}{2\pi}\int_{-\pi}^{\pi} G(\Omega)\,e^{jn\Omega}\,d\Omega \qquad \text{(IDTFT)} \tag{4.113}$$
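As earlier stated, we will stick to the use of the frequency variable $f$. For comparison, let us collect in one place the DFT (also called DTFS), IDFT, DTFT, and IDTFT expressions that we have derived above

$$\begin{aligned}
G(k) &= \sum_{n=0}^{N-1} g(n)\,e^{-j\frac{2\pi}{N}kn}, \quad k = 0, 1, 2, \ldots, N - 1 & &\text{(DFT)} \\
g(n) &= \frac{1}{N}\sum_{k=0}^{N-1} G(k)\,e^{j\frac{2\pi}{N}kn}, \quad n = 0, 1, 2, \ldots, N - 1 & &\text{(IDFT)} \\
G(f) &= \sum_{n=-\infty}^{\infty} g(n)\,e^{-jn2\pi f/F_s} & &\text{(DTFT)} \\
G(\Omega) &= \sum_{n=-\infty}^{\infty} g(n)\,e^{-jn\Omega}, \quad \Omega = 2\pi\frac{f}{F_s} & &\text{(DTFT)} \\
g(n) &= \frac{1}{F_s}\int_{-F_s/2}^{F_s/2} G(f)\,e^{jn2\pi f/F_s}\,df & &\text{(IDTFT)} \\
g(n) &= \frac{1}{2\pi}\int_{-\pi}^{\pi} G(\Omega)\,e^{jn\Omega}\,d\Omega & &\text{(IDTFT)}
\end{aligned} \tag{4.114}$$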
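The DTFT and IDTFT expressions collected in Eq. (4.114) can be exercised together for a short finite sequence. The sketch below is an illustration only: it assumes a midpoint-rule approximation of the integral over one cycle (−Fs/2, Fs/2), an arbitrary number of integration steps, and hypothetical function names.

```python
import cmath

def dtft(g, f, Fs):
    """DTFT of a finite sequence g(0..N-1), taken as zero elsewhere."""
    return sum(g[n] * cmath.exp(-2j * cmath.pi * n * f / Fs) for n in range(len(g)))

def idtft(G_of_f, n, Fs, steps=64):
    """IDTFT: g(n) = (1/Fs) * integral over (-Fs/2, Fs/2) of
    G(f) exp(j n 2 pi f / Fs) df, approximated by the midpoint rule."""
    df = Fs / steps
    total = 0j
    for i in range(steps):
        f = -Fs / 2 + (i + 0.5) * df
        total += G_of_f(f) * cmath.exp(2j * cmath.pi * n * f / Fs) * df
    return total / Fs

if __name__ == "__main__":
    g = [1, 0, 0, 1]   # sequence of Worked Example 4.10(b)
    Fs = 8000.0        # assumed sampling rate in Hz
    spectrum = lambda f: dtft(g, f, Fs)
    # Samples n = 0..3 are recovered; n = 4, 5 lie outside the data and give 0
    print([round(idtft(spectrum, n, Fs).real, 9) for n in range(6)])
```

Integrating over a single cycle is sufficient because, as shown earlier, G(f) repeats with period Fs.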
Now consider the above DTFT expression. If the nonperiodic CT signal $g(t)$, from which the samples $g(n)$ are taken, is of duration $T$ and occupies the interval $t = (0, T)$ and it is sampled at interval $T_s$ to produce the data sequence $g[n] = \{g(0), g(T_s), \ldots, g([N - 1]T_s)\} \equiv \{g(0), g(1), \ldots, g(N - 1)\}$, the values of $g(n)$ in the DTFT expression will be zero outside the range $n = (0, N - 1)$ so that the DTFT expression reduces to

$$G(f) = \sum_{n=0}^{N-1} g(n)\,e^{-jn2\pi f/F_s} \qquad \text{(DTFT)}$$
If we take samples of the DTFT at frequency intervals $f_o = 1/T = 1/NT_s = F_s/N$, the $k$th sample will be at frequency $f = kF_s/N$, having a value, denoted $G(k)$, that is obtained by substituting $f = kF_s/N$ into the above equation to obtain

$$G\left(k\frac{F_s}{N}\right) \equiv G(k) = \sum_{n=0}^{N-1} g(n)\,e^{-jn2\pi\frac{kF_s}{N} \times \frac{1}{F_s}} = \sum_{n=0}^{N-1} g(n)\,e^{-j\frac{2\pi}{N}kn} \equiv \text{DFT}$$

The DFT is therefore a sequence of equally spaced samples of the DTFT, the samples being taken at frequency intervals (called frequency resolutions) equal to the reciprocal of the duration $T$ of the DT signal represented by the DTFT. The DFT is therefore an appropriate tool for frequency domain representation of both periodic and nonperiodic data sequences, and we will have no need for DTFT or its inverse in this book.

If the data sequence is periodic (of period $T$ and fundamental frequency $f_o = 1/T$) then the DFT $G[k]$ provides the magnitude (per unit frequency) and phase of the (discrete set of) harmonic complex exponentials (at frequencies $0, f_o, 2f_o, \ldots, (N/2)f_o$) that constitute the periodic data sequence. We will see later that the rest of the elements of the sequence of $G[k]$, for $k = N/2 + 1, N/2 + 2, \ldots, N - 1$, are respective complex conjugates of the elements at $N/2 - 1, N/2 - 2, \ldots, 1$ and correspond to frequencies $kf_o$. If, on the other hand, the data sequence is nonperiodic (of duration $T$) then the DFT $G[k]$ provides the magnitude (per unit frequency) and phase of samples of the continuum of harmonic complex exponentials that constitute the data sequence. In other words, a nonperiodic data sequence $g[n]$ of duration $T$ has a continuous spectrum given by the DTFT, and the DFT provides samples of this spectrum at resolution $f_o = 1/T$. This resolution can be improved (i.e. $f_o$ made smaller) by increasing $T$. But since the data sequence $g[n]$ and its original duration $T$ and sampling rate $F_s$ (and hence number of samples $N$) are usually fixed, one way to change the resolution to a new value $f_o'$ is by appending zeros to $g[n]$, as illustrated in Figure 4.36, where the DFT of the zero-padded data has a resolution that is one-quarter that of the DFT of $g[n]$.

Figure 4.36 Appending zeros to improve the frequency resolution of the DFT of a data sequence $g[n]$. Upper panel: $g[n] = \{g(0), g(1), g(2), g(3)\}$; $N = 4$; $f_o = 1/T = 1/(4T_s)$. Lower panel: $\{g[n], 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0\}$; $N = 16$; $f_o' = 1/T' = 1/(16T_s) = f_o/4$.

Worked Example 4.10 DFT of Simple Sequences

We wish to calculate the DFT of the following simple sequences

(a) A unit impulse $\delta[n] = \{1, 0, 0, 0, 0, 0, 0, 0\}$
(b) Data sequence $g[n] = \{1, 0, 0, 1\}$

(a) Number of samples $N = 8$, so angular frequency $\Omega_o = 2\pi/N = \pi/4$, and $\Omega_k = k\pi/4$. The $k$th transform sample $G(k)$ is the weighted sum of $\delta[n]$ using weights

$$W[n] = \{1, e^{-jk\pi/4}, e^{-jk\pi/2}, e^{-j3k\pi/4}, e^{-jk\pi}, e^{-j5k\pi/4}, e^{-j3k\pi/2}, e^{-j7k\pi/4}\}$$

Therefore

$$G(k) = \sum_{n=0}^{N-1} \delta(n)W(n) = 1 \times 1 + 0 \times e^{-jk\pi/4} + 0 \times e^{-jk\pi/2} + \cdots + 0 \times e^{-j7k\pi/4} = 1$$

and hence $G[k] = \{1, 1, 1, 1, 1, 1, 1, 1\}$.
(b) In this case, number of samples $N = 4$, so $\Omega_o = 2\pi/N = \pi/2$ and $\Omega_k = k\pi/2$. The $k$th transform sample $G(k)$ is the weighted sum of $g[n] = \{1, 0, 0, 1\}$ using weights $W[n] = \{1, e^{-jk\pi/2}, e^{-jk\pi}, e^{-j3k\pi/2}\}$. We therefore obtain

$$\begin{aligned}
G(0) &= \mathrm{sum}[\{1, 0, 0, 1\} \times \{1, e^{-jk\pi/2}, e^{-jk\pi}, e^{-j3k\pi/2}\}]; \quad k = 0 \\
&= \mathrm{sum}[\{1, 0, 0, 1\} \times \{1, 1, 1, 1\}] = 1 + 0 + 0 + 1 = 2 \\[4pt]
G(1) &= \mathrm{sum}[\{1, 0, 0, 1\} \times \{1, e^{-jk\pi/2}, e^{-jk\pi}, e^{-j3k\pi/2}\}]; \quad k = 1 \\
&= 1 + e^{-j3\pi/2} = 1 + \cos(-3\pi/2) + j\sin(-3\pi/2) \\
&= 1 + 0 + j = \sqrt{2}\angle 45^\circ \\[4pt]
G(2) &= \mathrm{sum}[\{1, 0, 0, 1\} \times \{1, e^{-jk\pi/2}, e^{-jk\pi}, e^{-j3k\pi/2}\}]; \quad k = 2 \\
&= 1 + e^{-j3\pi} = 1 + \cos(-3\pi) + j\sin(-3\pi) \\
&= 1 - 1 + j \times 0 = 0 \\[4pt]
G(3) &= \mathrm{sum}[\{1, 0, 0, 1\} \times \{1, e^{-jk\pi/2}, e^{-jk\pi}, e^{-j3k\pi/2}\}]; \quad k = 3 \\
&= 1 + e^{-j9\pi/2} = 1 + \cos(-9\pi/2) + j\sin(-9\pi/2) \\
&= 1 + 0 - j = \sqrt{2}\angle{-45^\circ}
\end{aligned}$$

That is, $G[k] = \{2, 1 + j, 0, 1 - j\} = \{2, \sqrt{2}\angle 45^\circ, 0, \sqrt{2}\angle{-45^\circ}\}$.
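The result of Worked Example 4.10(b) also gives a convenient way to see the effect of zero-padding described before the example. In the sketch below (function names are assumptions, and the padded length of 16 mirrors Figure 4.36), the sequence {1, 0, 0, 1} is padded with twelve zeros; every fourth sample of the 16-point DFT should then coincide with the 4-point DFT, while the samples in between are additional samples of the same underlying DTFT.

```python
import cmath

def dft(g):
    """Direct DFT: G(k) = sum_n g(n) exp(-j 2 pi k n / N)."""
    N = len(g)
    return [
        sum(g[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
        for k in range(N)
    ]

def zero_pad(g, total_length):
    """Append trailing zeros so that the sequence has total_length samples."""
    return list(g) + [0] * (total_length - len(g))

if __name__ == "__main__":
    g = [1, 0, 0, 1]
    G4 = dft(g)                  # resolution f_o = Fs/4
    G16 = dft(zero_pad(g, 16))   # resolution f_o' = Fs/16 = f_o/4
    for k in range(4):
        # Original sample k sits at index 4k of the finer-resolution spectrum
        print(k, abs(G4[k] - G16[4 * k]))
```

Zero-padding adds no new information about the signal; under the assumptions here it only samples the same continuous spectrum on a finer frequency grid.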
4.4.1 Properties of the Discrete Fourier Transform

Properties of the DFT closely mirror those of the FT presented in Section 4.3.1, so we will not repeat them here. For example, the linearity property of DFT states that if

$$g_1[n] \rightleftharpoons G_1[k] \quad \text{and} \quad g_2[n] \rightleftharpoons G_2[k]$$

then

$$a_1 g_1[n] + a_2 g_2[n] \rightleftharpoons a_1 G_1[k] + a_2 G_2[k]$$

and the property of convolution in the time domain states that

$$g_1[n] * g_2[n] \rightleftharpoons G_1[k]G_2[k]$$

where, because the sequences are implicitly periodic with period $N$, the convolution is circular, i.e. indices are taken modulo $N$. Parseval's theorem, earlier stated in Eq. (4.94) for CT signals, takes the following form for a real DT signal $g[n]$ of length $N$

$$\sum_{n=0}^{N-1} g^2(n) = \frac{1}{N}\sum_{k=0}^{N-1} |G(k)|^2 \tag{4.115}$$
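Both the convolution property and Parseval's theorem in Eq. (4.115) can be checked numerically. The sketch below assumes that the convolution is interpreted as circular (indices taken modulo N), which is the form under which the product rule holds for the DFT; the function names and data values are illustrative assumptions.

```python
import cmath

def dft(g):
    """Direct DFT: G(k) = sum_n g(n) exp(-j 2 pi k n / N)."""
    N = len(g)
    return [
        sum(g[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
        for k in range(N)
    ]

def circular_convolve(a, b):
    """Circular convolution of two length-N sequences (indices modulo N)."""
    N = len(a)
    return [sum(a[m] * b[(n - m) % N] for m in range(N)) for n in range(N)]

def parseval_sides(g):
    """Return (time-domain energy, frequency-domain energy) of Eq. (4.115)."""
    G = dft(g)
    return sum(x * x for x in g), sum(abs(X) ** 2 for X in G) / len(g)

if __name__ == "__main__":
    g1 = [1.0, 2.0, 0.0, -1.0]
    g2 = [0.5, 0.0, 1.5, 1.0]
    lhs = dft(circular_convolve(g1, g2))
    rhs = [x * y for x, y in zip(dft(g1), dft(g2))]
    print(max(abs(p - q) for p, q in zip(lhs, rhs)))  # rounding-level difference
    print(parseval_sides(g1))                          # the two sides agree
```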
There are, however, two features and properties that are specific to DFT.

4.4.1.1 Periodicity

$$G(k) = G(k + N) \tag{4.116}$$

This means that the DFT is periodic with period $N$. The frequency resolution of the spectrum $G[k]$ is $f_o = 1/T$, so this period corresponds to a frequency interval

$$Nf_o = \frac{N}{T} = \frac{N}{NT_s} = \frac{1}{T_s} = F_s$$

which is as expected from the sampling theorem, since sampling replicates the spectrum of the original CT signal at intervals of the sampling frequency $F_s$. The periodicity property of Eq. (4.116) may be readily proved by substituting $k + N$ for $k$ on both sides of the DFT expression in Eq. (4.114)

$$\begin{aligned}
G(k + N) &= \sum_{n=0}^{N-1} g(n)\,e^{-j\frac{2\pi}{N}(k+N)n} = \sum_{n=0}^{N-1} g(n)\,e^{-j\frac{2\pi}{N}kn}\,e^{-j\frac{2\pi}{N}Nn} \\
&= \sum_{n=0}^{N-1} g(n)\,e^{-j\frac{2\pi}{N}kn}\,e^{-j2\pi n} \\
&= \sum_{n=0}^{N-1} g(n)\,e^{-j\frac{2\pi}{N}kn} \quad (\text{since } e^{-j2\pi n} = 1) \\
&= G(k)
\end{aligned}$$
4.4.1.2 Symmetry

$$G(N/2 - m) = G^*(N/2 + m) \quad \text{for real } g[n] \tag{4.117}$$

The DFT $G[k] = \{G(0), G(1), G(2), \ldots, G(N - 1)\}$ of a real data sequence is a double-sided spectrum, with its amplitude spectrum having even symmetry about $k = N/2$, and its phase spectrum having odd symmetry about the same point. This means that the highest-frequency component in $g[n]$ is

$$f_{\max} = \frac{N}{2}f_o = \frac{F_s}{2} \tag{4.118}$$

This is as expected in order to obey the sampling theorem and therefore avoid what is known as alias distortion. Substituting $m = 0$ in Eq. (4.117) gives $G(N/2) = G^*(N/2)$, which means that the value $G(N/2)$ of the transform sequence at the highest-frequency component is always a real number if the data sequence $g[n]$ is real. Alternatively, substituting $k = N/2$ in the DFT expression yields

$$G\left(\frac{N}{2}\right) = \sum_{n=0}^{N-1} g(n)\,e^{-j\frac{2\pi}{N} \cdot \frac{N}{2} \cdot n} = \sum_{n=0}^{N-1} g(n)\,e^{-jn\pi} = \sum_{n=0}^{N-1} g(n)\cos(n\pi)$$

which is real if $g[n]$ is real. The symmetry property may be readily proved by substituting $N/2 + m$ for $k$ on both sides of the DFT expression in Eq. (4.114), which leads to

$$G\left(\frac{N}{2} + m\right) = \sum_{n=0}^{N-1} g(n)\,e^{-j\frac{2\pi}{N}\left(\frac{N}{2} + m\right)n} = \sum_{n=0}^{N-1} g(n)\,e^{-jn\pi}\,e^{-j\frac{2\pi}{N}mn} = \sum_{n=0}^{N-1} g(n)\cos(n\pi)\,e^{-j\frac{2\pi}{N}mn}$$

Now substituting $N/2 - m$ for $k$ on both sides, we find that

$$G\left(\frac{N}{2} - m\right) = \sum_{n=0}^{N-1} g(n)\,e^{-j\frac{2\pi}{N}\left(\frac{N}{2} - m\right)n} = \sum_{n=0}^{N-1} g(n)\,e^{-jn\pi}\,e^{j\frac{2\pi}{N}mn} = \sum_{n=0}^{N-1} g(n)\cos(n\pi)\,e^{j\frac{2\pi}{N}mn} = G^*(N/2 + m)$$

Note that the last line holds because the expression for $G(N/2 - m)$ differs from that of $G(N/2 + m)$ only in the sign of $j$, which indicates a complex conjugate relationship.
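The periodicity and symmetry properties in Eqs. (4.116) and (4.117) can be spot-checked for a real data sequence. The following is a minimal sketch; the helper `dft_sample`, which evaluates the DFT sum at any integer k (including values outside 0..N−1), and the data values are assumptions for illustration.

```python
import cmath

def dft_sample(g, k):
    """Evaluate the DFT sum at an arbitrary integer index k."""
    N = len(g)
    return sum(g[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))

if __name__ == "__main__":
    g = [1.0, 2.0, 0.5, -1.0, 3.0, 0.0, -2.0, 1.5]  # real sequence, N = 8
    N = len(g)

    # Periodicity, Eq. (4.116): G(k) = G(k + N)
    for k in range(N):
        print("period  ", k, abs(dft_sample(g, k) - dft_sample(g, k + N)))

    # Symmetry, Eq. (4.117): G(N/2 - m) = conj(G(N/2 + m)) for real g[n]
    for m in range(N // 2):
        lower = dft_sample(g, N // 2 - m)
        upper = dft_sample(g, N // 2 + m)
        print("symmetry", m, abs(lower - upper.conjugate()))

    # G(N/2) is real for a real data sequence
    print("imag part of G(N/2):", dft_sample(g, N // 2).imag)
```

All printed differences, and the imaginary part of G(N/2), should be zero to within floating-point rounding.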
4.4.2 Fast Fourier Transform

The DFT algorithm, as implemented by direct evaluation of the DFT formula in Eq. (4.114), is computationally inefficient because it requires $N$ complex multiplications and $N - 1$ complex additions to obtain each transform sample, making a total of $N^2$ multiplications and $N(N - 1)$ additions to compute the DFT. An $N$-point DFT with $N = 1024$, for example, would therefore require over a million complex multiplications and over a million complex additions to complete, which would be very time-consuming. A more efficient algorithm for computing the DFT and its inverse can be developed by observing that many of the calculations in the DFT expression produce the same result and therefore do not need to be done afresh each time. Denoting the factor $e^{-j2\pi/N}$ in the DFT expression as $W_N$, we can highlight some of the inherent computational simplifications as follows

$$\begin{aligned}
W_N &= e^{-j2\pi/N} \\
W_N^0 &= 1 \\
W_N^2 &= W_{N/2} \\
W_N^N &= 1 \\
W_N^{N/2} &= -1 \\
W_N^{N/4} &= -j \\
W_N^{3N/4} &= j \\
W_N^{(k + N/2)} &= -W_N^k \\
W_N^{(k + lN)(n + mN)} &= W_N^{kn}; \quad l, m = 0, \pm 1, \pm 2, \ldots
\end{aligned} \tag{4.119}$$
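The relations in Eq. (4.119) are easy to confirm numerically. The sketch below is an illustrative check only; the function names, the tolerance, and the ranges of k, n, l and m tried are assumptions.

```python
import cmath

def W(N, p=1):
    """Twiddle factor W_N raised to the power p: exp(-j 2 pi p / N)."""
    return cmath.exp(-2j * cmath.pi * p / N)

def check_twiddle_identities(N, tol=1e-9):
    """Return True if every relation in Eq. (4.119) holds for this N
    (N is assumed to be a multiple of 4 so that N/4 is an integer)."""
    close = lambda a, b: abs(a - b) < tol
    ok = close(W(N, 0), 1)
    ok = ok and close(W(N, 2), W(N // 2))
    ok = ok and close(W(N, N), 1)
    ok = ok and close(W(N, N // 2), -1)
    ok = ok and close(W(N, N // 4), -1j)
    ok = ok and close(W(N, 3 * N // 4), 1j)
    for k in range(N):
        ok = ok and close(W(N, k + N // 2), -W(N, k))
        for n in range(N):
            for l in (-1, 0, 2):
                for m in (-2, 0, 1):
                    ok = ok and close(W(N, (k + l * N) * (n + m * N)), W(N, k * n))
    return ok

if __name__ == "__main__":
    for N in (4, 8, 16):
        print(N, check_twiddle_identities(N))
```

These identities are what allow repeated products in the DFT sum to be reused, which is the source of the savings exploited by the FFT.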
It is a straightforward matter to derive each of the above relations. For example

$$W_N^N = (e^{-j2\pi/N})^N = e^{-j2\pi N/N} = e^{-j2\pi} = \cos(-2\pi) + j\sin(-2\pi) = 1 + 0 = 1$$

and

$$W_N^2 = (e^{-j2\pi/N})^2 = e^{-j4\pi/N} = e^{-j2\pi/(N/2)} \equiv W_{N/2}$$

Fast Fourier transform (FFT) is the name given to a range of algorithms that exploit these simplifications to significantly speed up DFT computation [1–3]. The most commonly used method is the Cooley–Tukey FFT algorithm [1]. The radix-$2^k$ Cooley–Tukey algorithm ($k = 1, 2, 3, 4$) is applicable to an $N$-point DFT when $N$ is an integer power of $2^k$. For example, the radix-16 algorithm is applicable to an $N$-point DFT for $N = 256, 4096, 65\,536, \ldots$ since these are all integer powers of 16. If the original data sequence does not satisfy this requirement, trailing zeros are added to the sequence to make it of length $2^{kL}$, where $L$ is an integer. The DFT is decomposed into $2^k$ smaller DFTs each of size $N/2^k$, and this decomposition is repeated successively until after $L$ stages we reach $N$ single-point DFTs where each transform is simply equal to the data sample. This is therefore a divide and conquer algorithm. If the decomposition is accomplished by going sequentially through the data sequence and taking one sample at a time to form the sequence for each of the smaller DFTs in turn, the technique is described as decimation-in-time (DIT) FFT. For example, the radix-2 DIT FFT algorithm decomposes the DFT of $g[n]$ into two DFT expressions, one involving the even-numbered samples of $g[n]$ and the other involving the odd-numbered samples. If, on the other hand, the data sequence is partitioned into sequential blocks that are allocated to each of the smaller DFTs, the technique is known as decimation-in-frequency (DIF) FFT. We discuss below the radix-2 DIF FFT algorithm.

The DIF FFT algorithm is developed as follows by splitting the data sequence $g[n]$ into two halves: $n = 0 \to N/2 - 1$ and $n = N/2 \to N - 1$

$$G(k) = \sum_{n=0}^{N-1} g(n)W_N^{kn} = \sum_{n=0}^{(N/2)-1} g(n)W_N^{kn} + \sum_{n=N/2}^{N-1} g(n)W_N^{kn}$$
Making a change of index $m \equiv n - N/2$ in the second summation, its summation limits become $m = 0 \to (N/2) - 1$; $g(n)$ becomes $g(m + N/2)$ and $W_N^{kn}$ becomes $W_N^{k(m + N/2)}$. We then return to using index $n$ in this summation simply by replacing $m$ with $n$. Thus

$$\begin{aligned}
G(k) &= \sum_{n=0}^{(N/2)-1} g(n)W_N^{kn} + \sum_{n=0}^{(N/2)-1} g(n + N/2)W_N^{k(n + N/2)} \\
&= \sum_{n=0}^{(N/2)-1} \left[g(n) + (-1)^k g(n + N/2)\right]W_N^{kn}, \quad \text{since } W_N^{kN/2} = (W_N^{N/2})^k = (-1)^k
\end{aligned}$$
Separating the even points of the transform $\{G(0), G(2), \ldots, G(N - 2)\}$ at which $(-1)^k = 1$ from the odd points $\{G(1), G(3), \ldots, G(N - 1)\}$ at which $(-1)^k = -1$ leads to

$$\begin{aligned}
G(2m) &= \sum_{n=0}^{(N/2)-1} \left[g(n) + g\left(n + \frac{N}{2}\right)\right]W_N^{2mn} \\
&= \sum_{n=0}^{(N/2)-1} g_a(n)W_{N/2}^{mn}, \quad \text{since } W_N^{2mn} = (W_N^2)^{mn} = (W_{N/2})^{mn} \\
&\equiv G_a(m), \quad m = 0, 1, \ldots, \frac{N}{2} - 1
\end{aligned}$$

and

$$\begin{aligned}
G(2m + 1) &= \sum_{n=0}^{(N/2)-1} \left[g(n) - g\left(n + \frac{N}{2}\right)\right]W_N^{(2m+1)n} \\
&= \sum_{n=0}^{(N/2)-1} \left[g(n) - g\left(n + \frac{N}{2}\right)\right]W_N^n W_N^{2mn} \\
&= \sum_{n=0}^{(N/2)-1} \left[\left(g(n) - g\left(n + \frac{N}{2}\right)\right)W_N^n\right]W_{N/2}^{mn} \\
&= \sum_{n=0}^{(N/2)-1} g_b(n)W_{N/2}^{mn} \equiv G_b(m), \quad m = 0, 1, \ldots, \frac{N}{2} - 1
\end{aligned}$$

where $G_a[m]$ is the $(N/2)$-point DFT of the sequence $\{g_a(n)\}$, with

$$g_a(n) = g(n) + g(n + N/2); \quad n = 0, 1, \ldots, N/2 - 1$$

and $G_b[m]$ is the $(N/2)$-point DFT of the sequence $\{g_b(n)\}$, with

$$g_b(n) = [g(n) - g(n + N/2)]W_N^n = g(n)W_N^n + g(n + N/2)W_N^{n + N/2}; \quad n = 0, 1, \ldots, N/2 - 1$$
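The decomposition just derived maps naturally onto a recursive routine. The following is a minimal sketch of a radix-2 DIF FFT written for clarity rather than speed: it assumes the input length is a power of 2, uses recursion instead of the in-place butterfly structure of practical implementations, and the function names are illustrative.

```python
import cmath

def fft_dif(g):
    """Radix-2 decimation-in-frequency FFT (recursive sketch).
    Forms g_a(n) = g(n) + g(n + N/2) and g_b(n) = [g(n) - g(n + N/2)] W_N^n,
    whose (N/2)-point DFTs give the even and odd transform points."""
    N = len(g)
    if N == 1:
        return list(g)            # single-point DFT equals the data sample
    if N % 2:
        raise ValueError("sequence length must be a power of 2")
    half = N // 2
    ga = [g[n] + g[n + half] for n in range(half)]
    gb = [(g[n] - g[n + half]) * cmath.exp(-2j * cmath.pi * n / N)
          for n in range(half)]
    G = [0j] * N
    G[0::2] = fft_dif(ga)         # G(2m)     = G_a(m)
    G[1::2] = fft_dif(gb)         # G(2m + 1) = G_b(m)
    return G

def dft_direct(g):
    """Direct O(N^2) evaluation of the DFT, used only for comparison."""
    N = len(g)
    return [
        sum(g[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
        for k in range(N)
    ]

if __name__ == "__main__":
    g = [1, 0, 0, 1]
    print(fft_dif(g))   # compare with {2, 1 + j, 0, 1 - j} from Worked Example 4.10
```

Each level of recursion halves the transform size, so there are log2 N levels with about N/2 twiddle-factor multiplications per level, compared with N² multiplications for direct evaluation.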
What we have achieved above is the reduction of an N-point DFT to two (N/2)-point DFTs. This process is repeated until, after L = log2 N stages, we reach N trivial single-point DFTs, where each transform is simply equal to the data sample. The following two worked examples for N = 4 and N = 8 should serve to clarify this algorithm further.

Worked Example 4.11 DIF FFT for N = 4

We wish to use the DIF FFT algorithm to compute the DFT G[k] of a data sequence g[n] of length N = 4 samples, where

$$g[n] = \{g(0), g(1), g(2), g(3)\} \rightleftharpoons G[k] = \{G(0), G(1), G(2), G(3)\}$$

Since N = 4, the algorithm is implemented in L = log2 4 = 2 stages as follows:

STAGE 1, N = 4: we have one 4-point DFT, namely G[k], to compute, hence N = 4. The 4-point DFT is reduced to two 2-point DFTs denoted $G_a[m]$ for the even points of G[k], and $G_b[m]$ for the odd points. Since G[k] = {G(0), G(1), G(2), G(3)}, its even points are $G_a[m] = \{G(0), G(2)\}$ and its odd points are $G_b[m] = \{G(1), G(3)\}$.

$G_a[m]$ is the DFT of the data sequence $g_a[n]$ obtained from the original sequence g[n] as $g_a[n] = \{g(n) + g(n + N/2)\}$, n = 0 to N/2 − 1, which is n = 0, 1. Therefore, $g_a[n] = \{g(0) + g(2),\ g(1) + g(3)\}$.

Next, $G_b[m]$ is the DFT of the data sequence $g_b[n]$ obtained from the original sequence g[n] as $g_b[n] = \{g(n)W_N^n + g(n + N/2)W_N^{n+N/2}\}$; N = 4; n = 0, 1. Thus, $g_b[n] = \{g(0)W_4^0 + g(2)W_4^2,\ g(1)W_4^1 + g(3)W_4^3\}$, which, with $W_4^0 = 1$, $W_4^1 = -j$, $W_4^2 = -1$, and $W_4^3 = j$, reduces to $g_b[n] = \{g(0) - g(2),\ j(g(3) - g(1))\}$.

This completes stage 1, which we may summarise as follows

$$\begin{aligned} g_a[n] &= \{g(0) + g(2),\ g(1) + g(3)\} \rightleftharpoons G_a[m] = \{G(0), G(2)\} \\ g_b[n] &= \{g(0) - g(2),\ j(g(3) - g(1))\} \rightleftharpoons G_b[m] = \{G(1), G(3)\} \end{aligned} \tag{4.120}$$
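The stage 1 split of Eq. (4.120), and its repetition down to single-point DFTs, can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `dif_split` and `dif_fft` are hypothetical, the test sequence {1, 0, 0, 1} is an arbitrary choice, and the recursion returns G[k] in natural order by interleaving the even and odd points rather than working in place.

```python
import cmath

def dif_split(g):
    """One DIF stage: form g_a(n) = g(n) + g(n + N/2) and
    g_b(n) = [g(n) - g(n + N/2)] * W_N^n, with W_N = exp(-j*2*pi/N)."""
    N = len(g)
    half = N // 2
    ga = [g[n] + g[n + half] for n in range(half)]
    gb = [(g[n] - g[n + half]) * cmath.exp(-2j * cmath.pi * n / N)
          for n in range(half)]
    return ga, gb

def dif_fft(g):
    """Recursive DIF FFT; len(g) is assumed to be an integer power of 2."""
    N = len(g)
    if N == 1:
        return list(g)          # a single-point DFT equals the data sample
    ga, gb = dif_split(g)
    G = [0] * N
    G[0::2] = dif_fft(ga)       # even-numbered points G(0), G(2), ...
    G[1::2] = dif_fft(gb)       # odd-numbered points G(1), G(3), ...
    return G

ga, gb = dif_split([1, 0, 0, 1])   # stage 1 for an arbitrary 4-point sequence
G = dif_fft([1, 0, 0, 1])          # approximately [2, 1+1j, 0, 1-1j]
```

For this sequence, stage 1 gives ga = {1, 1} and gb = {1, j}, in agreement with the general expressions in Eq. (4.120).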
STAGE 2, N = 2: we now have two 2-point DFTs, $G_a[m]$ and $G_b[m]$, to compute, hence N = 2. Each DFT is reduced to two single-point DFTs. Reaching single-point DFTs indicates the end of the process. We use the subscript notations aa for the even point of $G_a[m]$, ba for the odd point of $G_a[m]$, ab for the even point of $G_b[m]$, and bb for the odd point of $G_b[m]$.

From Eq. (4.120), we see that the desired first transform point G(0) is the even point of $G_a[m]$, which is denoted $G_{aa}[m]$, and this is the DFT of $g_{aa}[n]$ obtained from $g_a[n]$ as $g_{aa}[n] = \{g_a(n) + g_a(n + N/2)\}$, n = 0 to N/2 − 1, which is n = 0, since N = 2. Therefore, $g_{aa}[n] = \{g_a(0) + g_a(1)\} = g(0) + g(2) + g(1) + g(3)$.

Again from Eq. (4.120), the desired second transform point G(1) is the even point of $G_b[m]$, which is denoted $G_{ab}[m]$, and this is the DFT of $g_{ab}[n]$ obtained from $g_b[n]$ as $g_{ab}[n] = \{g_b(n) + g_b(n + N/2)\}$, n = 0. Therefore, $g_{ab}[n] = \{g_b(0) + g_b(1)\} = g(0) - g(2) + j(g(3) - g(1))$.

From Eq. (4.120), the third transform point G(2) is the odd point of $G_a[m]$, which is denoted $G_{ba}[m]$, and this is the DFT of $g_{ba}[n]$ obtained from $g_a[n]$ as

$$\begin{aligned} g_{ba}[n] &= \{g_a(n)W_N^n + g_a(n + N/2)W_N^{n+N/2}\}; \quad N = 2; \quad n = 0 \\ &= g_a(0)W_2^0 + g_a(1)W_2^1 \\ &= g_a(0) - g_a(1) \end{aligned}$$
4 Frequency Domain Analysis of Signals and Systems
Substituting the values of $g_a[n]$ from Eq. (4.120) gives

$$g_{ba}[n] = g_a(0) - g_a(1) = g(0) + g(2) - g(1) - g(3)$$

Finally, from Eq. (4.120), the fourth transform point G(3) is the odd point of $G_b[m]$, which is denoted $G_{bb}[m]$, and this is the DFT of $g_{bb}[n]$ obtained from $g_b[n]$ as $g_{bb}[n] = \{g_b(n)W_N^n + g_b(n + N/2)W_N^{n+N/2}\} = g_b(0) - g_b(1)$. Substituting the values of $g_b[n]$ from Eq. (4.120) yields $g_{bb}[n] = g(0) - g(2) - j(g(3) - g(1))$.

This completes our work. To summarise, the DFT of the 4-point data sequence g[n] = {g(0), g(1), g(2), g(3)} is the 4-point transform sequence G[k] = {G(0), G(1), G(2), G(3)}, where $G(0) \equiv G_{aa}[m]$ is the DFT of the single data point $g_{aa}[n] = g(0) + g(2) + g(1) + g(3)$, so G(0) = g(0) + g(2) + g(1) + g(3). Similar comments apply to the other computations presented above. Thus

$$\begin{aligned} G(0) &= g_{aa}[n] = g(0) + g(1) + g(2) + g(3) \\ G(1) &= g_{ab}[n] = g(0) - g(2) + j(g(3) - g(1)) \\ G(2) &= g_{ba}[n] = g(0) + g(2) - g(1) - g(3) \\ G(3) &= g_{bb}[n] = g(0) - g(2) - j(g(3) - g(1)) \end{aligned}$$

If the data sequence is g[n] = {1, 0, 0, 1}, as in the previous worked example, then we obtain G[k] = {G(0), G(1), G(2), G(3)} = {2, 1 + j, 0, 1 − j}, which agrees with the result obtained earlier. You are right to feel that the direct computation of the DFT in Worked Example 4.10 was more straightforward. The FFT algorithm is designed for computer code implementation (as will become clearer shortly) and it achieves a huge saving in computation time when N is large.

Worked Example 4.12 DIF FFT for N = 8

We wish to use the DIF FFT algorithm to compute the DFT G[k] of a data sequence g[n] of length N = 8 samples, where

$$g[n] = \{g(0), g(1), g(2), g(3), g(4), g(5), g(6), g(7)\} \rightleftharpoons G[k] = \{G(0), G(1), G(2), G(3), G(4), G(5), G(6), G(7)\}$$

A more detailed explanation of the algorithm steps is given in the previous worked example, which should be studied first before attempting this one.
Since N = 8, the algorithm is implemented in L = log2 8 = 3 stages as follows:

STAGE 1, N = 8: the even points of G[k] are $G_a[m] = \{G(0), G(2), G(4), G(6)\}$ and this is the DFT of $g_a[n] = \{g(n) + g(n + 4)\}$; n = 0, 1, 2, 3. Thus

$$g_a[n] = \{g(0) + g(4),\ g(1) + g(5),\ g(2) + g(6),\ g(3) + g(7)\} \rightleftharpoons G_a[m] = \{G(0), G(2), G(4), G(6)\}$$

The odd points of G[k] are $G_b[m] = \{G(1), G(3), G(5), G(7)\}$, which is the DFT of

$$g_b[n] = \{g(n)W_8^n + g(n + 4)W_8^{n+4}\} \equiv \{g_b(0), g_b(1), g_b(2), g_b(3)\}; \quad n = 0, 1, 2, 3$$

$$g_b(0) = g(0)W_8^0 + g(4)W_8^4; \ \text{etc.}$$

STAGE 2, N = 4: decompose each of the 4-point DFTs, $G_a[m]$ and $G_b[m]$, into the following 2-point DFTs:

Even points of $G_a[m]$

$$g_{aa}[n] = \{g_a(n) + g_a(n + 2)\} \rightleftharpoons G_{aa}[p] = \{G(0), G(4)\}; \quad n = 0, 1$$

Odd points of $G_a[m]$

$$g_{ba}[n] = \{g_a(n)W_4^n + g_a(n + 2)W_4^{n+2}\} \rightleftharpoons G_{ba}[p] = \{G(2), G(6)\}; \quad n = 0, 1$$

Even points of $G_b[m]$

$$g_{ab}[n] = \{g_b(n) + g_b(n + 2)\} \rightleftharpoons G_{ab}[p] = \{G(1), G(5)\}; \quad n = 0, 1$$

Odd points of $G_b[m]$

$$g_{bb}[n] = \{g_b(n)W_4^n + g_b(n + 2)W_4^{n+2}\} \rightleftharpoons G_{bb}[p] = \{G(3), G(7)\}; \quad n = 0, 1$$

STAGE 3, N = 2: take the even point and odd point of each of the above 2-point DFTs to decompose them into the following single-point DFTs (with n = 0 in every case)

$$\begin{aligned} g_{aaa}[n] &= \{g_{aa}(n) + g_{aa}(n + 1)\} \rightleftharpoons G_{aaa}[q] = G(0) \\ g_{baa}[n] &= \{g_{aa}(n)W_2^n + g_{aa}(n + 1)W_2^{n+1}\} \rightleftharpoons G_{baa}[q] = G(4) \\ g_{aba}[n] &= \{g_{ba}(n) + g_{ba}(n + 1)\} \rightleftharpoons G_{aba}[q] = G(2) \\ g_{bba}[n] &= \{g_{ba}(n)W_2^n + g_{ba}(n + 1)W_2^{n+1}\} \rightleftharpoons G_{bba}[q] = G(6) \\ g_{aab}[n] &= \{g_{ab}(n) + g_{ab}(n + 1)\} \rightleftharpoons G_{aab}[q] = G(1) \\ g_{bab}[n] &= \{g_{ab}(n)W_2^n + g_{ab}(n + 1)W_2^{n+1}\} \rightleftharpoons G_{bab}[q] = G(5) \\ g_{abb}[n] &= \{g_{bb}(n) + g_{bb}(n + 1)\} \rightleftharpoons G_{abb}[q] = G(3) \\ g_{bbb}[n] &= \{g_{bb}(n)W_2^n + g_{bb}(n + 1)W_2^{n+1}\} \rightleftharpoons G_{bbb}[q] = G(7) \end{aligned}$$
We may then work backwards from this final stage up to the first stage to express each transform point as a linear combination of samples of the data sequence g[n]. To demonstrate this for G(1): the stage 3 result is used to express G(1) in terms of $g_{ab}[n]$, then the stage 2 result is used to replace $g_{ab}[n]$ with $g_b[n]$, and finally the stage 1 result is used to replace $g_b[n]$ with g[n]. Thus

$$\begin{aligned} G(1) &= g_{ab}(n) + g_{ab}(n + 1); \quad n = 0 \\ &= g_{ab}(0) + g_{ab}(1) \\ &= g_b(0) + g_b(2) + g_b(1) + g_b(3) \\ &= g(0)W_8^0 + g(4)W_8^4 + g(2)W_8^2 + g(6)W_8^6 + g(1)W_8^1 + g(5)W_8^5 + g(3)W_8^3 + g(7)W_8^7 \\ &= g(0) - g(4) - jg(2) + jg(6) + \frac{1}{\sqrt{2}}(1 - j)g(1) - \frac{1}{\sqrt{2}}(1 - j)g(5) - \frac{1}{\sqrt{2}}(1 + j)g(3) + \frac{1}{\sqrt{2}}(1 + j)g(7) \end{aligned}$$

The FFT algorithm may be conveniently and elegantly presented in the form of a signal flow graph, as shown in Figure 4.37 for the above 8-point DIF FFT. A signal flow graph is an interconnection of nodes and branches having the following characteristics:

● Direction of signal flow along a branch is indicated by an arrow.
● The input of a node is the sum of signals in all branches entering that node.
● The signal in a branch exiting a node equals the input of that node.
● A branch modifies a signal flowing through it by the branch transmittance, which may be a multiplication by 1 (i.e. no change) if the branch has no label or coefficient, or a multiplication by the branch coefficient or label.
Figure 4.37 Signal flow graph for an 8-point DIF FFT. [Inputs g(0) to g(7) pass through three stages of butterflies with twiddle factors $W_8^m$, $W_4^m$, and $W_2^m$. Output locations, top to bottom: G(0) at 000, G(4) at 001, G(2) at 010, G(6) at 011, G(1) at 100, G(5) at 101, G(3) at 110, G(7) at 111. One butterfly and one twiddle factor are labelled.]
The signal flow graph of the FFT algorithm consists of a repetitive structure, called a butterfly, with two inputs of the form x(n) and x(n + N/2) and two outputs y1 and y2. One such butterfly is shaded in the bottom right side in Figure 4.37. The downward-going diagonal and bottom branches of each butterfly have a coefficient known as a twiddle factor of the form $W_M^m$, where M = N at the first stage and decreases progressively by a factor of two through the stages, and m ≤ M − 1. There are N/2 butterflies per stage and log2 N stages. Each butterfly performs computations of the form

$$\begin{aligned} y_1 &= x(n) + x(n + N/2) \\ y_2 &= x(n)W_N^n + x(n + N/2)W_N^{n+N/2} = [x(n) - x(n + N/2)]\,W_N^n \end{aligned}$$

This means that there are one complex addition, one complex subtraction, and one complex multiplication per butterfly, or a total of (N/2)log2 N complex multiplications and N log2 N complex additions and subtractions in the FFT algorithm. Direct computation of the DFT requires N² complex multiplications and N(N − 1) complex additions, so this FFT algorithm is computationally much more efficient, with a saving of a factor of 2N/log2 N on complex multiplications and (N − 1)/log2 N on complex additions and subtractions.

The output ports of the signal flow graph have been numbered in binary on the right-hand side of Figure 4.37 to show that the algorithm produces the transform sequence stored in bit-reversed order. If the algorithm output is stored in an N-element MATLAB vector Gk, where N is an integer power of 2 (as required), then the following MATLAB code will restore the right order so that G(0) is in Gk(1), G(1) is in Gk(2), G(2) is in Gk(3), and so on

    n = (0:N-1)';
    Gk = Gk(1+bi2de(de2bi(n,log2(N)), 'left-msb'));

We may devise a simple way to compute the inverse DFT using the computationally efficient FFT algorithm developed above if we exploit the similarity between the DFT and IDFT expressions repeated below for convenience.

$$G(k) = \sum_{n=0}^{N-1} g(n)\,e^{-j\frac{2\pi}{N}kn}, \quad \text{DFT}$$

$$g(n) = \frac{1}{N}\sum_{k=0}^{N-1} G(k)\,e^{j\frac{2\pi}{N}kn}, \quad \text{IDFT}$$

Taking the complex conjugate of both sides of the IDFT expression yields

$$N g^*(n) = \sum_{k=0}^{N-1} G^*(k)\,e^{-j\frac{2\pi}{N}kn} \tag{4.121}$$

The right-hand side of Eq. (4.121) is similar in form to the DFT expression implemented by the FFT algorithm. It indicates that if we take the complex conjugate of G[k] before inputting it into the FFT algorithm then the output of the algorithm will be N times the complex conjugate of the data sequence g[n]. A block diagram of this inverse FFT algorithm is shown in Figure 4.38.

Figure 4.38 Inverse FFT algorithm. [Block diagram: G[k] → complex conjugate → G*[k] → FFT → Ng*[n] → complex conjugate → Ng[n] → multiplication by 1/N → g[n].]
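The butterfly computations, the bit-reversed output ordering, and the conjugation route to the inverse transform in Eq. (4.121) can be brought together in a short Python sketch. It is offered as an illustration under stated assumptions rather than a reference implementation: the function names are hypothetical, the sequence length is assumed to be an integer power of 2, and the bit reversal is done with plain integer operations instead of the MATLAB `bi2de`/`de2bi` route.

```python
import cmath

def fft_dif_bit_reversed(g):
    """Iterative DIF FFT built from butterflies; output is in bit-reversed order."""
    x = [complex(v) for v in g]
    N = len(x)
    M = N                                   # sub-transform size at this stage
    while M >= 2:
        half = M // 2
        for start in range(0, N, M):        # one group of butterflies per sub-transform
            for n in range(half):
                a, b = x[start + n], x[start + n + half]
                x[start + n] = a + b                                      # y1
                x[start + n + half] = (a - b) * cmath.exp(-2j * cmath.pi * n / M)  # y2
        M = half
    return x

def bit_reverse_reorder(x):
    """Move the value at location i to the index whose binary digits are those of i reversed."""
    N = len(x)
    bits = N.bit_length() - 1
    out = [0j] * N
    for i, v in enumerate(x):
        r = 0
        for b in range(bits):
            if i & (1 << b):
                r |= 1 << (bits - 1 - b)
        out[r] = v
    return out

def fft(g):
    return bit_reverse_reorder(fft_dif_bit_reversed(g))

def ifft(G):
    """Inverse FFT via Eq. (4.121): conjugate, forward FFT, conjugate, divide by N."""
    N = len(G)
    y = fft([complex(v).conjugate() for v in G])
    return [v.conjugate() / N for v in y]

raw = fft_dif_bit_reversed([1, 0, 0, 1])   # stored as G(0), G(2), G(1), G(3)
G = fft([1, 0, 0, 1])                       # natural order
g_back = ifft(G)                            # recovers the data sequence
```

Keeping the reordering separate from the butterflies mirrors the signal flow graph: the stages run in place, and the bit reversal is a final permutation of the stored outputs.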
4.4.3 Practical Issues in DFT Implementation

The DFT sequence G[k] = {G(0), G(1), …, G(N − 1)}, computed from N samples of a nonperiodic signal g(t) taken within an interval T to produce a data sequence g[n], gives the spectrum of g[n], being the magnitude (per unit frequency) and phase of the complex exponential components of the signal rect(t/T)g(t) at respective frequencies {0, Δf, 2Δf, …, (N/2)Δf}, where Δf = 1/T = 1/(NTs) = Fs/N is the frequency resolution, Ts is the sampling interval, and Fs = 1/Ts is the sampling frequency. The spectrum is folded over at frequency f = (N/2)Δf Hz, as earlier discussed. There are several practical issues which should be taken into consideration when evaluating and interpreting the DFT.

4.4.3.1 Aliasing
Alias distortion is a phenomenon whereby a high-frequency component in g(t) is reproduced within the spectrum G[k] of the sampled signal g[n] at a false lower frequency. To avoid this distortion, the signal g(t) must be passed through an anti-alias filter to limit its maximum frequency component or bandwidth (whichever is lower) to f max prior to sampling. Note that a bandpass signal will contain frequency components much higher than the signal bandwidth, in which case f max here refers to bandwidth, whereas in a lowpass or baseband signal, the bandwidth and maximum frequency component are equal. Once f max has been set in this way by filtering, the sampling rate is chosen as F s ≥ 2f max to obey the sampling theorem and therefore avoid alias distortion.
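A minimal Python sketch of how an unfiltered high-frequency component reappears at a false lower frequency is given below. The helper name `alias_frequency` and the values of 70 Hz and 100 Hz are illustrative assumptions, not taken from the text.

```python
import math

def alias_frequency(f, fs):
    """Apparent frequency (Hz) of a real sinusoid of frequency f when sampled at rate fs."""
    return abs(f - fs * round(f / fs))

fs = 100.0                              # assumed sampling rate, Hz
f_true = 70.0                           # assumed component above fs/2
f_alias = alias_frequency(f_true, fs)   # 30.0 Hz

# The samples of the 70 Hz cosine are indistinguishable from those of a 30 Hz cosine
samples_true = [math.cos(2 * math.pi * f_true * n / fs) for n in range(16)]
samples_alias = [math.cos(2 * math.pi * f_alias * n / fs) for n in range(16)]
```

Since the two sample sets coincide, no processing after sampling can tell the components apart, which is why the anti-alias filter must act before the sampler.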
4.4.3.2 Frequency Resolution
The spectrum G[k] is a sequence that represents a discrete selection – spaced apart in frequency by the frequency resolution Δf – of the continuum of complex exponentials that constitute g(t). It is therefore important to make Δf small in order not to skip over or miss a significant frequency component in g(t) and the resulting data sequence g[n]. With sampling rate Fs already fixed as above by the sampling theorem and the span T of the data sequence also fixed, the frequency resolution Δf = 1/T = Fs/N may be reduced by using a higher value of N in the DFT computation. This is achieved by appending N1 trailing zeros to g[n] = {g(0), g(1), …, g(N − 1)} such that $N + N_1 = 2^m$, and frequency points of G[k] form a faithfully representative selection from the continuum of frequency components of g(t).

The frequency resolution issue is illustrated in Figure 4.39 in which the DFT of a sampled signal of duration T = 5 s is computed at two different resolutions Δf. The signal g(t) is the sum of two sinusoids at frequencies 1 Hz and 1.3 Hz with respective amplitudes 1 and 0.5. The data sequence g[n] was obtained by taking N = 512 samples of g(t) within the interval T = 5 s, which corresponds to a sampling rate of Fs = 102.4 Hz that more than satisfies the sampling theorem, so there is no alias distortion issue. The middle plot shows the magnitude spectrum |G(k)| of g[n] computed at a resolution of Δf = 0.2 Hz, using the unpadded data sequence g[n]. The spectrum is shown in logarithmic units of dB relative to the maximum amplitude. We see that the frequency component of 1 Hz has been captured in |G(k)| and is depicted correctly at a relative amplitude of 0 dB, but the other frequency component of 1.3 Hz has been skipped over and missed in |G(k)| due to the poor resolution employed. In the bottom plot, the frequency resolution of the DFT computation was improved to Δf = 0.1 Hz by padding g[n] with 512 trailing zeros, so that Δf = Fs/N = 102.4/1024 = 0.1 Hz.
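As a rough check of the resolution figures quoted above, the Python sketch below computes Δf = Fs/N with and without zero padding and tests whether each sinusoid frequency falls on a DFT frequency point. The helper name `on_grid` and its tolerance are assumptions made for illustration.

```python
N = 512                      # samples taken within T
T = 5.0                      # observation interval, s
fs = N / T                   # sampling rate: 102.4 Hz

df_unpadded = fs / N         # 0.2 Hz, using g[n] as is
df_padded = fs / (N + 512)   # 0.1 Hz, after appending 512 trailing zeros

def on_grid(f, df, tol=1e-9):
    """True if frequency f coincides with an integer multiple of the resolution df."""
    ratio = f / df
    return abs(ratio - round(ratio)) < tol

captured_unpadded = {f: on_grid(f, df_unpadded) for f in (1.0, 1.3)}
captured_padded = {f: on_grid(f, df_padded) for f in (1.0, 1.3)}
```

At Δf = 0.2 Hz the 1.3 Hz component lies midway between the frequency points at 1.2 Hz and 1.4 Hz, whereas at Δf = 0.1 Hz it coincides with the point k = 13.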
We see that both frequency components of 1 Hz and 1.3 Hz are now captured in the spectrum. However, the relative amplitude of the 1.3 Hz component is −5.2 dB (rather than its true −6 dB value). This is because of spectral leakage, which is further discussed below.

Figure 4.39 DFT analysis of 5 s segment of g(t) at two different resolutions Δf using a rectangular window. Signal g(t) comprises 2 sinusoids at 1 Hz and 1.3 Hz having relative amplitudes 0 dB and −6 dB, respectively. [Top: g(t)rect(t/T), T = 5 s. Middle: |G(k)| in dB versus k at Δf = 1/T = 0.2 Hz. Bottom: |G(k)| in dB versus k at Δf = 0.1 Hz.]

4.4.3.3 Spectral Leakage
Each spectral line of g(t) is replaced by a scaled version of the amplitude spectrum W(f) of the windowing function w(t). The sidelobes of W(f) will somewhat corrupt adjacent spectral components, a problem described as spectral leakage. The default windowing function w(t) = rect(t/T) has significant spectral sidelobes in W(f), and this is the window employed in Figure 4.39. We see in the bottom plot of that figure that the frequency component at 1.3 Hz, which is present in g(t) at a relative amplitude of 20 log(0.5) = −6 dB, appears in the spectrum at a relative amplitude of −5.2 dB. This enhancement of its amplitude by 0.8 dB is due to interference from a strong spectral sidelobe of the 1 Hz component. Spectral leakage may be minimised by employing a windowing function that tapers at the ends and thus has smaller and more rapidly decaying sidelobes in its spectrum than those of the spectrum of a rectangular window. This makes spectral leakage into adjacent spectral lines negligible if these spectral lines are separated by more than the width of the main lobe of the windowing function.

4.4.3.4 Spectral Smearing
When the main lobes of adjacent spectral components overlap, there will be a smearing of their spectral lobes and hence a blurring of the distinction between the spectral components involved. Windows with suppressed sidelobes will have wider main lobes and therefore their reduction of spectral leakage is achieved at the expense of a higher level of spectral smearing. The default rectangular window corresponds to the sequence w(n) = 1, n = 0, 1, 2, …, N − 1, where NTs = T is the duration of the signal and Fs = 1/Ts is the sampling rate. Several tapered windows have been defined, the most common of which include:

● Hamming window

$$w(n) = \frac{25}{46} - \frac{21}{46}\cos\left(\frac{2\pi n}{N-1}\right); \quad n = 0, 1, 2, \ldots, N-1 \tag{4.122}$$

● Raised cosine or Hanning window

$$w(n) = \frac{1}{2}\left[1 - \cos\left(\frac{2\pi n}{N-1}\right)\right] = \sin^2\left(\frac{n\pi}{N-1}\right); \quad n = 0, 1, 2, \ldots, N-1 \tag{4.123}$$

● Blackman–Harris window

$$w(n) = a_0 - a_1\cos\left(\frac{2\pi n}{N-1}\right) + a_2\cos\left(\frac{4\pi n}{N-1}\right) - a_3\cos\left(\frac{6\pi n}{N-1}\right); \quad n = 0, 1, 2, \ldots, N-1 \tag{4.124}$$

$$a_0 = 0.35875; \quad a_1 = 0.48829; \quad a_2 = 0.14128; \quad a_3 = 0.01168$$

● Kaiser–Bessel window

$$w(n) = \frac{I_0\left(\pi\alpha\sqrt{1 - \left(\dfrac{2n}{N-1} - 1\right)^2}\right)}{I_0(\pi\alpha)}; \quad n = 0, 1, 2, \ldots, N-1$$

$$I_0(x) = \frac{1}{2\pi}\int_0^{2\pi}\exp(x\cos\phi)\,d\phi = \sum_{m=0}^{\infty}\frac{(x/2)^{2m}}{(m!)^2} \approx \sum_{m=0}^{32}\frac{(x/2)^{2m}}{(m!)^2} \tag{4.125}$$
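The window definitions in Eqs. (4.122) to (4.125) translate directly into code. The Python sketch below is one way to generate them; the function names are hypothetical, the Bessel function I0 (the zeroth order modified Bessel function of the first kind) is evaluated with the 33-term truncated series of Eq. (4.125), and the default α = 3 is an assumed typical setting.

```python
import math

def hamming(N):
    return [25 / 46 - (21 / 46) * math.cos(2 * math.pi * n / (N - 1)) for n in range(N)]

def hanning(N):
    return [0.5 * (1 - math.cos(2 * math.pi * n / (N - 1))) for n in range(N)]

def blackman_harris(N):
    a0, a1, a2, a3 = 0.35875, 0.48829, 0.14128, 0.01168
    return [a0 - a1 * math.cos(2 * math.pi * n / (N - 1))
               + a2 * math.cos(4 * math.pi * n / (N - 1))
               - a3 * math.cos(6 * math.pi * n / (N - 1)) for n in range(N)]

def bessel_i0(x, terms=33):
    """Truncated series of Eq. (4.125): sum over m = 0..32 of (x/2)^(2m) / (m!)^2."""
    return sum((x / 2) ** (2 * m) / math.factorial(m) ** 2 for m in range(terms))

def kaiser_bessel(N, alpha=3.0):
    denom = bessel_i0(math.pi * alpha)
    window = []
    for n in range(N):
        # max() guards against a tiny negative argument from floating-point rounding
        arg = max(0.0, 1 - (2 * n / (N - 1) - 1) ** 2)
        window.append(bessel_i0(math.pi * alpha * math.sqrt(arg)) / denom)
    return window

N = 9                                   # short odd length so that n = 4 is the exact midpoint
w_hamming = hamming(N)
w_hanning = hanning(N)
w_bh = blackman_harris(N)
w_kb = kaiser_bessel(N)
```

Each tapered window peaks at 1 in the middle of the sequence and falls to a small value (zero for the raised cosine window) at the two ends, which is what suppresses the spectral sidelobes.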
Figure 4.40 Rectangular and Blackman–Harris windows and their amplitude spectra. [Top: window functions of duration T. Bottom: amplitude spectra in dB versus frequency.]
where I 0 is the zeroth order modified Bessel function of the first kind, and 𝛼 is a variable parameter that decides the trade-off between main lobe width and sidelobe level. Typically, 𝛼 = 3. Figures 4.40 and 4.41 show plots of the time domain functions and amplitude spectra of some of the above windows. Notice, for example, how the Blackman–Harris window has negligible sidelobes, with levels below −90 dB, but its main lobe is four times as wide as the main lobe of the rectangular window. The rectangular window, on the other hand, has the narrowest main lobe of all the windows but has the highest sidelobe levels, up to −13.3 dB for the first sidelobe. Figure 4.42a–c show the results of a high-resolution DFT analysis (at Δf = 1/32 Hz) of a 0.5 s segment of a signal g(t) that consists of two sinusoids of relative amplitudes 0 dB and − 5 dB and frequencies as specified in each plot. In (a) the signal was multiplied by a raised cosine window to extract the 0.5 s segment and N = 512 samples were taken to produce the data sequence g[n], which was then padded with 32 256 zeros before the FFT algorithm was used to compute the amplitude spectrum |G(k)| shown. A similar procedure was followed for plots (b) and (c), but with a rectangular window used in (b). We see in (a) that the two spectral components have been clearly detected by the DFT analysis and their lobes are distinct and at the correct amplitudes. There are no discernible effects of spectral leakage or spectral smearing. In (b), involving two spectral components at 10 and 14 Hz, the rectangular window avoids spectral smearing (due to its narrow main lobe), but there is discernible spectral leakage effect as the peak of the 14 Hz lobe is at −4 dB (an increase of 1 dB) and has been shifted slightly away from its true 14 Hz location. In (c), the raised cosine window, with its wide main lobe, creates a significant overlap between the main lobes of the 10 and 14 Hz spectral components. 
This produces significant spectral smearing, which blurs the identity of the 14 Hz component. Spectral smearing between the 10 and 14 Hz components was avoided in Figure 4.42b by using a rectangular window. However, if the spectral components are sufficiently close then even the rectangular windowed data will experience spectral smearing. Padding the windowed data with more zeros will increase frequency resolution but is ineffective in reducing spectral smearing. One way to reduce spectral smearing (in order to be able to detect more closely spaced spectral components) is to increase the window duration T so that its main spectral lobe becomes proportionately narrower. This solution requires observing the original signal g(t) and data sequence g[n] over a longer duration T. Figure 4.42d demonstrates the effectiveness of a longer observation interval T in combating spectral smearing. The observation interval used for DFT analysis was doubled to T = 1 s in (d). As a result, the width of the main lobe of the raised cosine window was halved. Using this window on the data produced a spectrum free of both spectral leakage and spectral smearing between the 10 and 14 Hz components, as seen in Figure 4.42d.

Figure 4.41 Raised-cosine and Hamming windows and their amplitude spectra. [Top: Hanning (raised-cosine) and Hamming window functions of duration T. Bottom: amplitude spectra in dB versus frequency, from −6/T to 6/T.]

Figure 4.42 High resolution DFT analysis of 0.5 s segment, (in (a) – (c)), and 1 s segment, (in (d)), of a signal g(t) that contains two frequencies f1 = 10 Hz and f2 at relative amplitudes 0 dB and −5 dB: (a) Raised-cosine window; f2 = 20 Hz; (b) Rectangular window; f2 = 14 Hz; (c) Raised-cosine window; f2 = 14 Hz; (d) Raised-cosine window; f2 = 14 Hz. [Each panel plots |G(k)| in dB versus f in Hz. Panel annotations: (a) no spectral leakage, no spectral smearing; (b) spectral leakage, no spectral smearing; (c) spectral smearing; (d) no spectral leakage, no spectral smearing.]

The width W of the main spectral lobe is different for each window, but may in general be expressed as a multiple of the reciprocal of the window duration T in the form

$$W = \frac{M}{T} \tag{4.126}$$
where the value of M depends on window type. For example, from Figure 4.34 and Worked Example 4.9, M = 2 (the lowest possible) for a rectangular window and M = 4 for a raised cosine window. For two spectral components at frequencies f2 and f1 to be resolvable (i.e. spectral smearing between them is avoided) in a DFT analysis, their frequency difference must be no smaller than W. That is

$$f_2 - f_1 = \frac{M}{T}; \quad \Rightarrow \quad T = \frac{M}{f_2 - f_1} \tag{4.127}$$
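Eq. (4.127) is easy to evaluate in code. The Python sketch below uses the values M = 2 for a rectangular window and M = 4 for a raised cosine window quoted in this section; the dictionary and function names are hypothetical.

```python
MAIN_LOBE_FACTOR = {"rectangular": 2, "raised_cosine": 4}   # values of M in W = M/T

def min_duration(f1, f2, window):
    """Shortest segment duration T (s) at which f1 and f2 are resolvable, Eq. (4.127)."""
    return MAIN_LOBE_FACTOR[window] / abs(f2 - f1)

def resolvable(f1, f2, T, window):
    """True if the frequency difference is no smaller than the main lobe width W = M/T."""
    return abs(f2 - f1) >= MAIN_LOBE_FACTOR[window] / T

# Components at 10 Hz and 14 Hz observed over 0.5 s and 1 s segments
short_raised_cosine = resolvable(10, 14, 0.5, "raised_cosine")   # False: W = 8 Hz
short_rectangular = resolvable(10, 14, 0.5, "rectangular")       # True:  W = 4 Hz
long_raised_cosine = resolvable(10, 14, 1.0, "raised_cosine")    # True:  W = 4 Hz
t_min = min_duration(10, 14, "raised_cosine")                    # 1.0 s
```

These three cases reproduce the behaviour seen for the 10 Hz and 14 Hz components: smearing with a 0.5 s raised cosine window, and none with either a 0.5 s rectangular window or a 1 s raised cosine window.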
Thus, for example, if we wish to resolve the spectral components in a signal segment down to a frequency difference of 1 Hz then the segment must be at least four seconds long (assuming a raised cosine window is employed). Of course, you may reduce the required segment length to two seconds by using a rectangular window, but spectral leakage may then cause a significant distortion in the results obtained.

4.4.3.5 Spectral Density and Its Variance
We now have all the tools we need to look at the distribution of the energy or power of a signal amongst its various frequency components. This information is important in revealing any hidden repetitive patterns within the signal, identifying the location of strong or dominant spectral components, guiding decision making in the allocation of system resources for signal coding across the various frequency bands of the signal, and providing a means to compute the amount of energy or power in all or selected sub-bands of the signal. It was established in Eq. (3.148) that the autocorrelation function Rg (𝜏) of signal g(t) is the convolution (in the time domain) of g(t) with its time-reversed version. That is Rg (𝜏) = g(𝜏) ∗ g(−𝜏) We now know from Eq. (4.91) that if we take the FT of both sides of the above equation the right-hand side of the result will be the multiplication of the FT of g(t), denoted G(f ), with the FT of g(−t), which we know from Eq. (4.84) to be G(−f ). Thus F[Rg (𝜏)] = G(f )G(−f ) If g(t) is a real signal then G(−f ) equals the complex conjugate of G(f ), according to Eq. (4.86), and this allows us to write F[Rg (𝜏)] = G(f )G∗ (f ) = |G(f )|2
(4.128)
From the discussion surrounding Eq. (4.74), if g(t) is in volts (V) then |G(f)| is in V/Hz and therefore the squared magnitude of the FT on the right-hand side of the above equation is a quantity in V²/Hz², which (noting that Hz ≡ s⁻¹) may be manipulated as follows

$$\frac{\text{V}^2}{\text{Hz}^2} = \frac{\text{V}^2 \cdot \text{s}}{\text{Hz}} = \frac{\text{W} \cdot \text{s}}{\text{Hz}} = \frac{\text{J}}{\text{Hz}} \equiv \text{joules/hertz}$$

The squared magnitude of the FT of g(t) is therefore a quantity in units of energy per unit frequency, called the energy spectral density (ESD) of g(t) and denoted $\Psi_g(f)$. We conclude from Eq. (4.128) that the FT of the autocorrelation function of an energy signal is the ESD of the signal. That is

$$R_g(\tau) \rightleftharpoons |G(f)|^2 = \Psi_g(f) \equiv \text{ESD in J/Hz} \tag{4.129}$$
The energy in an infinitesimally small frequency band (f, f + df) is $\Psi_g(f)\,df$, and the total energy E of the signal is obtained by summing these contributions over the entire frequency axis −∞ < f < ∞ so that

$$E = \int_{-\infty}^{\infty} \Psi_g(f)\,df = \int_{-\infty}^{\infty} |G(f)|^2\,df = 2\int_0^{\infty} |G(f)|^2\,df \quad \text{(even symmetry)} \tag{4.130}$$
Since signal bandwidth is the range of significant positive frequencies in the signal, it is important to note that Energy per unit bandwidth = 2 × ESD
(4.131)
Equating the above frequency domain expression for energy with the time domain formula given in Eq. (3.101) for the energy of CT signals leads to Parseval’s theorem stated in Eq. (4.94) and before that in Eq. (3.102). We may extend the concept of spectral density to power signals g(t), including ergodic (and hence stationary) random processes, by applying the above analysis to a segment of g(t) of duration 𝕋, spanning the interval −𝕋/2 ≤ t ≤ 𝕋/2. The FT of this segment of g(t) is denoted $G_{\mathbb{T}}(f)$ so that its energy per unit frequency is $|G_{\mathbb{T}}(f)|^2$ as earlier established. Since power is the average rate of energy, or energy per unit of time, the power per unit frequency is $|G_{\mathbb{T}}(f)|^2/\mathbb{T}$, a quantity which has units of (J/Hz)/s or W/Hz. The interval of analysis is then extended to cover the entire signal g(t) by letting 𝕋 → ∞. The power spectral density (PSD) of g(t), denoted $S_g(f)$, is therefore given by

$$S_g(f) = \lim_{\mathbb{T} \to \infty} \frac{|G_{\mathbb{T}}(f)|^2}{\mathbb{T}} \tag{4.132}$$

If the power signal is periodic, with period T, then it is represented by the complex exponential Fourier series of Eq. (4.23), which shows that the signal contains complex exponentials of magnitude $|C_n|$, given by Eq. (4.24), and power $|C_n|^2$ at frequencies $nf_o$ for n = 0, ±1, ±2, ±3, … The frequency spacing is $f_o$, and the power per unit frequency is a function of $nf_o$ given by

$$S_g(nf_o) = \frac{|C_n|^2}{f_o} = |C_n|^2\,T \tag{4.133}$$
Seeing in Eq. (3.114) that the definition of autocorrelation function for a power signal has the same form as that for an energy signal except for the factor $\lim_{\mathbb{T} \to \infty} \frac{1}{\mathbb{T}}$, it follows by the same steps as for energy signals that the autocorrelation function of a power signal and its PSD form a FT pair, so we may write

$$R_g(\tau) \rightleftharpoons S_g(f) \equiv \text{PSD in W/Hz}$$
(4.134)
The power P of g(t) is obtained by integrating its PSD over all frequencies. Thus

$$P = \int_{-\infty}^{\infty} S_g(f)\,df = 2\int_0^{\infty} S_g(f)\,df \tag{4.135}$$
And this also means that Power per unit bandwidth = 2 × PSD
(4.136)
It is important to always remember the relationship stated in Eq. (4.136). For example, white noise has a constant power per unit bandwidth denoted No, so the PSD of white noise is No/2, but the total white noise power in a bandwidth B is P = NoB.
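The factor of 2 between PSD and power per unit bandwidth can be checked numerically. The Python sketch below integrates a flat two-sided PSD of No/2 over the band −B ≤ f ≤ B; the function name and the values chosen for No and B are purely illustrative assumptions.

```python
def power_from_two_sided_psd(psd, bandwidth, steps=1000):
    """Integrate a two-sided PSD S(f) over -B <= f <= B using the midpoint rule."""
    df = 2 * bandwidth / steps
    return sum(psd(-bandwidth + (i + 0.5) * df) for i in range(steps)) * df

No = 4e-21          # assumed power per unit bandwidth, W/Hz
B = 1e6             # assumed bandwidth, Hz

P = power_from_two_sided_psd(lambda f: No / 2, B)   # close to No * B = 4e-15 W
```

The integration covers both negative and positive frequencies, a total span of 2B, so a PSD of No/2 yields the power No·B that would be read off directly from the power per unit bandwidth.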
Figure 4.43 Partitioning of data of length N into half-overlapping segments, each data segment being of length $M = 2^m$. Both the data length and the segment length are integer powers of 2: (a) three half-overlapping segments, $N = 2^{m+1}$; and (b) seven half-overlapping segments, $N = 2^{m+2}$.
A plot of spectral density as a function of frequency is often called a periodogram. When applied to nondeterministic signals, such as a random process or data contaminated by noise, the periodogram will exhibit spectral variance because the exact values of the transform sequence G[k] computed on the data sequence g[n] will somewhat vary from one sample function (i.e. observation over duration 𝕋) of the random process to the next or due to random changes in the values of the contaminating noise samples. One way to reduce this variance and come closer to a true estimate of the spectral density of such signals is to compute the periodogram over each of several half-overlapping segments of the data and then to average the computed periodograms. Each data segment is multiplied by an appropriate window function, as earlier discussed, before its DFT is computed. Figure 4.43 shows the partitioning scheme for three and seven half-overlapping segments.

If both the original data length N and the segment length M are required to be integer powers of two then there are strict constraints on the values of N and M and the number S of half-overlapping data segments. For example, a 1024-length data sequence can be partitioned into three half-overlapping segments each of length 512. A 2048-length data sequence can be partitioned into seven half-overlapping 512-length segments. And so on. In general, the possible values are

$$\begin{aligned} S &= 2^k - 1; \quad k = 2, 3, 4, \ldots \\ M &= 2^m; \quad m \equiv \text{positive integer} \geq 5 \\ N &= 2^{m+k-1} \end{aligned} \tag{4.137}$$
If the samples of the N-length data sequence are numbered n = 0, 1, 2, …, N − 1, and the segments are numbered i = 0, 1, 2, …, S − 1, then the ith segment will span the data samples

$$n = \frac{N}{2^k}\,i \ \rightarrow\ n = \frac{N}{2^k}(i + 2) - 1; \quad i = 0, 1, 2, \ldots, S - 1 \tag{4.138}$$
Figure 4.44 Periodogram of noise contaminated double-sinusoid signal. [Magnitude in dB versus f in Hz, from 0 to 250 Hz.]
Often, we will know the data length N and wish to determine the segment length M and number of segments S. In that case, we work backwards in Eq. (4.137) and write

$$k + m = 1 + \log_2 N \tag{4.139}$$

We then choose k to give the required number of segments as $S = 2^k - 1$. This choice of k leaves us with only one possible value for m = 1 + log2 N − k. For example, if $N = 4096 = 2^{12}$, then k + m = 13; and if we choose k = 3 (for seven half-overlapping segments) then m = 10, which gives $M = 2^{10} = 1024$. That is, a 4096-length data sequence may be partitioned into seven half-overlapping 1024-length segments. Other segment lengths are possible by choosing a different value for k. Figure 4.44 shows the periodogram (in dB relative to maximum level) of a noise contaminated data sequence that contains two sinusoids at frequencies 100 and 200 Hz with relative amplitudes 0 dB and −5 dB. The analysis was carried out as above over an 8 s segment of the data, using a raised cosine window and Fs = 512 Hz, N = 4096, S = 7, M = 512.
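The bookkeeping in Eqs. (4.137) to (4.139) can be sketched in Python as follows. The function name `partition` and its error handling are assumptions made for illustration; the sketch returns only the segment boundaries and leaves the windowing, DFT, and averaging steps to the reader.

```python
def partition(N, k):
    """Return (S, M, spans) for a data length N (a power of 2) and chosen k >= 2.

    S = 2^k - 1 half-overlapping segments, each of length M = 2^m with
    m = 1 + log2(N) - k; spans lists the (first, last) sample index of each segment.
    """
    log2_N = N.bit_length() - 1
    if N != 1 << log2_N:
        raise ValueError("N must be an integer power of 2")
    m = 1 + log2_N - k
    if k < 2 or m < 1:
        raise ValueError("k is out of range for this N")
    S = 2 ** k - 1
    M = 2 ** m
    step = N >> k                      # N / 2^k, i.e. half a segment length
    spans = [(step * i, step * (i + 2) - 1) for i in range(S)]
    return S, M, spans

S3, M3, spans3 = partition(1024, 2)    # three segments of length 512
S7, M7, spans7 = partition(4096, 3)    # seven segments of length 1024
```

For N = 1024 and k = 2 the segments span samples 0 to 511, 256 to 767, and 512 to 1023, so each segment overlaps half of its neighbour.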
4.5 Laplace and z-transforms

In the previous three sections, we discussed Fourier analysis in detail with an emphasis on its engineering applications and the ability not only to formulate or derive the mathematical expressions but to graphically illustrate, interpret, and apply the results. This foundation is enough for our needs of communication systems analysis and design in this book. However, for the sake of completeness and to see how they fit in with Fourier techniques, there are two related transforms that we must briefly introduce.
4.5.1 Laplace Transform

The Laplace transform (LT) is defined for an arbitrary function or signal g(t) as

$$G(s) = \int_{-\infty}^{\infty} g(t)\,e^{-st}\,dt, \quad s = \sigma + j\omega \tag{4.140}$$
The parameter s is a complex variable with real part σ and imaginary part ω ≡ 2πf that takes on values in the so-called s-plane shown in Figure 4.45a. Explicitly using these real and imaginary parts in the above definition allows us to express the LT in the form

$$G(s)|_{s=\sigma+j2\pi f} = \int_{-\infty}^{\infty} [g(t)e^{-\sigma t}]\,e^{-j\omega t}\,dt \equiv \int_{-\infty}^{\infty} [g(t)e^{-\sigma t}]\,e^{-j2\pi ft}\,dt \tag{4.141}$$
Figure 4.45 s- and z-planes. [(a) s-plane, with axes σ and jω, showing the stable and unstable regions. (b) z-plane, with axes Re(z) and Im(z), showing the unit circle and the points f = 0, f = Fs/4, and f = Fs/2.]
Thus, the LT of g(t) is the Fourier transform (FT) of $g(t)e^{-\sigma t}$. So, provided g(t) is a practical signal (i.e. it is finite and does not have an infinite number of discontinuities) then its LT will converge if the FT of $g(t)e^{-\sigma t}$ exists, and this means if $g(t)e^{-\sigma t}$ is absolutely integrable as follows

$$\int_{-\infty}^{\infty} |g(t)e^{-\sigma t}|\,dt < \infty \tag{4.142}$$
The LT therefore exists for some causal signals g(t)u(t) that do not have a FT. In such cases, the region of convergence (ROC) of the LT is the region σ ≥ σ1 in the s-plane where σ has a large enough value to make $g(t)e^{-\sigma t}$ decay sufficiently rapidly with time to satisfy Eq. (4.142). Setting σ = 0 in Eq. (4.141), we find that the LT of g(t) becomes identical with the FT of g(t). So, provided g(t) is absolutely integrable, we may write

$$G(f) = G(s)|_{\sigma=0} \tag{4.143}$$
The LT is therefore a generalisation of the FT, both being applicable to CT signals and systems. If the ROC of the LT of a signal includes the imaginary axis in the s-plane and the LT is evaluated exclusively along this imaginary axis, the result of this computation is the FT of the signal, which provides complete information about the magnitudes and phases of the frequency components of the signal, as previously studied in detail.

The existence of the LT for a wide range of signals, and its various features, such as transforming convolution into multiplication and linear differential equations into algebraic equations, make the LT a very effective tool for the analysis and design of electrical, control, and other CT systems. The LTs of standard functions have been widely tabulated and simple methods such as partial fraction expansion may be used for the inverse operation of determining the corresponding t-domain signal for a given s-domain function without the need to evaluate the inverse LT integral.

We should mention that Eq. (4.140) defines the so-called bilateral Laplace transform. If the signal is causal, or if we arbitrarily choose the time of interest to start at t = 0, then the unilateral Laplace transform is enough, and is defined for a function g(t) as

$$G(s) = \int_{0}^{\infty} g(t)e^{-st}\,dt, \quad s = \sigma + j\omega \tag{4.144}$$
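As a brief illustration of a signal that has an LT but no FT, consider the growing exponential $g(t) = e^{at}u(t)$ with $a > 0$. This is a standard textbook case chosen here for illustration; it is not taken from the discussion above. Applying Eq. (4.144):

```latex
\begin{aligned}
G(s) &= \int_{0}^{\infty} e^{at}e^{-st}\,dt
      = \int_{0}^{\infty} e^{-(s-a)t}\,dt \\
     &= \left[\frac{-e^{-(s-a)t}}{s-a}\right]_{0}^{\infty}
      = \frac{1}{s-a}, \qquad \sigma > a
\end{aligned}
```

The magnitude of the integrand is $e^{-(\sigma-a)t}$, which decays with time only when $\sigma > a$. The ROC therefore lies to the right of the line $\sigma = a$ and excludes the imaginary axis, so Eq. (4.143) cannot be used and this signal has no FT.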
4.5.2 z-transform
The z-transform (ZT) G(z) of a data sequence g[n] is defined as

$$G(z) = \sum_{n=-\infty}^{\infty} g(n)z^{-n} \tag{4.145}$$

where z is a complex variable that takes on values in the so-called z-plane as shown in Figure 4.45b. Recalling the DTFT expression in Eq. (4.114)

$$G(\Omega) = \sum_{n=-\infty}^{\infty} g(n)e^{-jn\Omega}, \quad \text{where } \Omega = 2\pi\frac{f}{F_s} \tag{4.146}$$
we see that the ZT is indeed a generalisation of the DTFT. More specifically, when the ZT is evaluated at $z = e^{j\Omega}$ each term $z^{-n}$ becomes $e^{-jn\Omega}$ and the ZT reduces to the DTFT. That is

$$G(\Omega) = G(z)|_{z=e^{j\Omega}} \tag{4.147}$$
Since $e^{j\Omega} = 1\angle\Omega$, which is the unit circle (centred at the origin) in the z-plane, it follows that evaluating the ZT of g[n] exclusively along this unit circle yields the DTFT of g[n]. Points along this circle correspond to all the frequency components of g[n], according to Eq. (4.146), which is repeated below with f as subject

$$f = \frac{\Omega}{2\pi}F_s \tag{4.148}$$

So, the DC component (i.e. f = 0) is at $\Omega = 0$ and the highest-frequency component of g[n], which is of value $f = F_s/2$ in view of the sampling theorem, is at $\Omega = \pi$ radian. To further emphasise: a nonperiodic data sequence g[n] obtained by sampling at rate $F_s$ is composed of a continuum of complex exponentials, complex conjugate pairs of which form a sinusoid of the same frequency. If we know the ZT G(z) of this data sequence then we can obtain the magnitude (per unit frequency) and phase of these complex exponentials at any desired frequency f as the magnitude and angle of the complex number

$$G(z)|_{z=1\angle 2\pi f/F_s}$$

For example, if $F_s$ = 500 Hz, the maximum frequency component of the data sequence is 250 Hz, and the magnitude and phase of the 125 Hz component of the data sequence are provided by the value of the ZT at angle 90° counterclockwise along the unit circle. Note that the maximum value of $\Omega$ is $\pi$ radian, and that the bottom half of the unit circle in Figure 4.45b is for values of $\Omega$ from 0 rad to $-\pi$ rad clockwise (i.e. not $\Omega = \pi$ to $2\pi$), and corresponds to the negative frequencies from 0 to $-F_s/2$ which form the complex conjugates for the components represented by corresponding points along the upper half of the unit circle.

Returning to Eq. (4.145) and expressing z in its polar form $z = r\angle\Omega = re^{j\Omega}$, where r is the magnitude of z (not necessarily 1) and $\Omega$ is its angle, we obtain

$$G(z) = \sum_{n=-\infty}^{\infty} [g(n)r^{-n}]e^{-jn\Omega}, \quad z = re^{j\Omega} \tag{4.149}$$
This indicates that the ZT of g[n] is the DTFT of $g[n]r^{-n}$. Thus, the ZT may exist for certain data sequences that do not have a DTFT, and this existence will be in a region of the z-plane (outside the unit circle r = 1) where r is large enough to make the sequence $g[n]r^{-n}$ decay sufficiently rapidly to zero as n increases. It is often the case that the data sequence g[n] is zero for n < 0, so that the one-sided or unilateral z-transform is applicable, defined as

$$G(z) = \sum_{n=0}^{\infty} g(n)z^{-n} \tag{4.150}$$
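The link between the ZT and the DTFT can be checked numerically. The following minimal Python sketch evaluates the unilateral ZT of Eq. (4.150) at a point on the unit circle and compares it with a direct DTFT sum as in Eq. (4.146). The three-sample sequence and the helper names `zt` and `dtft` are hypothetical choices made for illustration; the sampling rate of 500 Hz and the 125 Hz component follow the example given earlier.

```python
import cmath
import math

def zt(g, z):
    """Unilateral z-transform of a finite causal sequence g[0..N-1] at point z."""
    return sum(gn * z ** (-n) for n, gn in enumerate(g))

def dtft(g, omega):
    """DTFT of the same sequence at normalised frequency omega (radians)."""
    return sum(gn * cmath.exp(-1j * n * omega) for n, gn in enumerate(g))

Fs = 500.0                      # sampling rate in Hz (from the example in the text)
f = 125.0                       # frequency of interest in Hz
omega = 2 * math.pi * f / Fs    # angle along the unit circle, Eq. (4.146)
z = cmath.exp(1j * omega)       # point on the unit circle, z = 1 at angle omega

g = [8.0, 2.0, 1.0]             # illustrative sequence (an assumption)

G_z = zt(g, z)
G_omega = dtft(g, omega)

print(f"angle along unit circle = {math.degrees(omega):.1f} deg")
print(f"ZT on unit circle       = {G_z:.4f}")
print(f"DTFT                    = {G_omega:.4f}")
print(f"magnitude = {abs(G_z):.4f}, phase = {math.degrees(cmath.phase(G_z)):.2f} deg")
```

Both sums agree to within rounding error, and the angle works out to 90°, consistent with the statement that the 125 Hz component sits a quarter of the way around the unit circle when the sampling rate is 500 Hz.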
The relationship between the ZT and the FT should now be very clear from the above discussions. The ZT is to DT signals and systems what the LT is to CT signals and systems. It facilitates easy solutions for difference equations (the discrete equivalent of differential equations) that describe, for example, digital filters and is a very effective tool for the analysis and design of DSP systems. A relationship also exists between the ZT and the LT which may be shown by mapping the s-plane into the z-plane through the substitution

$$z = e^{sT_s} = e^{(\sigma+j\omega)T_s} = e^{\sigma T_s}e^{j\omega T_s} = e^{\sigma T_s}\angle\omega T_s = e^{\sigma T_s}\angle 2\pi f/F_s \tag{4.151}$$
The following may be deduced from Eq. (4.151):

When $\sigma = 0$ (in the s-plane), $z = e^{j\omega T_s} = 1\angle 2\pi f/F_s$ in the z-plane. Thus, the y axis in the s-plane corresponds to the unit circle in the z-plane. This is as expected since the FT is computed along the y axis (i.e. the $j\omega$ axis) in the s-plane, but along the unit circle in the z-plane. The angle of z is $2\pi f/F_s$, so an increase of frequency f from f = 0 to $f = F_s/2$ completes a half-cycle counterclockwise around the top half of the unit circle, whereas a decrease in f from 0 to $-F_s/2$ completes a half-cycle clockwise around the bottom half of the unit circle. This relationship between frequency and angular displacement along the unit circle was earlier noted when discussing the link between the DTFT and the ZT. Repeated cycles around this circle will merely repeat the same values, consistent with the $F_s$-periodic nature of the spectrum of sampled signals and systems.

When $\sigma < 0$ (which corresponds to the left half of the s-plane), the magnitude of z is $|z| = e^{\sigma T_s} < 1$, since the sampling interval $T_s$ is a positive quantity. But |z| < 1 corresponds to points inside the unit circle. Thus, the entire left half of the s-plane is mapped into the interior of the unit circle in the z-plane. A stable causal system must have all its poles in the left half of the s-plane, so if a causal system is to be stable then the ZT of its impulse response must have all poles inside the unit circle and none on or outside the circle.

Equation (4.150) indicates that the ZT is a power series in $z^{-1}$. When the ZT of a signal is expressed as a power series then the identity of the signal is obvious as simply the coefficients of the series. For example, if $G(z) = 8 + 2z^{-1} + z^{-2}$, then g[n] = {8, 2, 1, 0, 0, …} at respective instants n = {0, 1, 2, …}. It therefore follows that multiplication by $z^{-k}$ has the effect of introducing a delay of k sampling intervals.

The ROC of the ZT is the range of values of z where the power series of Eq. (4.150), i.e. $g(0) + g(1)z^{-1} + g(2)z^{-2} + \cdots$, converges so that G(z) is finite. Values of z at which G(z) is infinite are known as the poles of G(z), whereas values of z at which G(z) = 0 are known as the zeros of G(z). For a causal finite-duration signal such as an energy signal, the summation in Eq. (4.150) yields a finite value except at z = 0, so the ROC is everywhere in the z-plane except z = 0. But for causal infinite-duration sequences, there may be poles at locations other than zero. In that case, the ROC is defined as everywhere outside a circle centred at the origin and having a radius equal to the magnitude of the pole furthest from the origin. That is, draw a line from the origin to the pole that is furthest from the origin, and the ROC is everywhere outside the circle centred at the origin and having that line as its radius.

The inverse z-transform (IZT), denoted $Z^{-1}$, is an operation that yields the DT sequence g[n] given its ZT G(z). That is

$$g[n] = Z^{-1}[G(z)] \tag{4.152}$$
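The s-plane to z-plane mapping of Eq. (4.151) discussed above can be checked with a few sample points. The sketch below is only an illustration: the three s-plane locations, the sampling rate of 500 Hz, and the helper names `s_to_z` and `region` are assumptions introduced here, not values from the text.

```python
import cmath
import math

def s_to_z(s: complex, Ts: float) -> complex:
    """Map a point s = sigma + j*omega in the s-plane to z = exp(s*Ts)."""
    return cmath.exp(s * Ts)

def region(z: complex, tol: float = 1e-9) -> str:
    """Classify z as inside, on, or outside the unit circle."""
    r = abs(z)
    if r < 1 - tol:
        return "inside"
    if r > 1 + tol:
        return "outside"
    return "on"

Fs = 500.0       # assumed sampling rate, Hz
Ts = 1 / Fs      # sampling interval, s

# Hypothetical s-plane points: one in each half-plane and one on the j-omega axis
points = {
    "left half-plane": complex(-50.0, 2 * math.pi * 100.0),
    "imaginary axis": complex(0.0, 2 * math.pi * 125.0),
    "right half-plane": complex(20.0, 2 * math.pi * 50.0),
}

for label, s in points.items():
    z = s_to_z(s, Ts)
    print(f"{label}: |z| = {abs(z):.4f}, "
          f"angle = {math.degrees(cmath.phase(z)):.1f} deg, "
          f"{region(z)} the unit circle")
```

Points with negative $\sigma$ land inside the unit circle, points on the $j\omega$ axis land on it at angle $2\pi f/F_s$, and points with positive $\sigma$ land outside, in line with the stability condition stated above.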
The most common form for expressing the ZT is as a ratio of two polynomials in $z^{-1}$. In this form, the inverse ZT may be obtained using several methods such as power series expansion or partial fraction expansion. A discussion of these methods, though straightforward, is beyond the scope of this book. The interested reader is referred to any one of the many good textbooks on signals and systems and DSP, such as [4 and 5].

Worked Example 4.13 z-transform
We wish to derive the ZT (in closed-form) of the following sequences:
(a) The unit impulse 𝛿[n].
(b) The unit step sequence u[n].

(a) Recalling that 𝛿[n] = 0 everywhere except at n = 0 where it equals 1, the ZT of 𝛿[n] follows straightforwardly from the definition of ZT as

$$\begin{aligned}
Z[\delta(n)] &= \sum_{n=0}^{\infty} \delta(n)z^{-n} \\
&= \delta(0)z^{0} + \delta(1)z^{-1} + \delta(2)z^{-2} + \cdots \\
&= 1 \times 1 + 0 \times z^{-1} + 0 \times z^{-2} + \cdots \\
&= 1
\end{aligned}$$

This result is independent of z, so the transform’s ROC is everywhere in the z-plane.

(b) Applying the ZT definition gives

$$Z[u(n)] = \sum_{n=0}^{\infty} u(n)z^{-n} = \sum_{n=0}^{\infty} 1 \cdot z^{-n} = 1 + z^{-1} + z^{-2} + z^{-3} + \cdots \equiv S_{\infty}$$

The sum denoted $S_{\infty}$ is a geometric series having first term 1 and constant ratio $z^{-1}$. Writing down the equation for the sum of the first N + 1 terms of this series and also a second equation obtained by multiplying both sides of the first equation by $z^{-1}$

$$S_N = \sum_{n=0}^{N} z^{-n} = 1 + z^{-1} + z^{-2} + z^{-3} + \cdots + z^{-N}$$

$$z^{-1}S_N = z^{-1} + z^{-2} + z^{-3} + \cdots + z^{-N} + z^{-(N+1)}$$

Now subtracting the two equations

$$S_N(1 - z^{-1}) = 1 - z^{-(N+1)}$$

Thus

$$S_N = \frac{1 - z^{-(N+1)}}{1 - z^{-1}}$$

$$S_{\infty} = \lim_{N\to\infty} S_N = \lim_{N\to\infty} \frac{1 - 1/z^{(N+1)}}{1 - z^{-1}} = \frac{1}{1 - z^{-1}}, \quad |z| > 1$$

Multiplying top and bottom of the right-hand side by z yields the required transform

$$Z[u(n)] = \frac{z}{z-1}, \quad |z| > 1$$
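The convergence condition |z| > 1 in part (b) can be seen numerically. The minimal sketch below compares partial sums of the series with the closed form z/(z − 1) at one point inside the ROC and one outside it. The two test points and the function names are hypothetical choices for illustration.

```python
import cmath
import math

def partial_sum(z: complex, N: int) -> complex:
    """Sum of the first N + 1 terms of 1 + z**-1 + z**-2 + ..."""
    return sum(z ** (-n) for n in range(N + 1))

def closed_form(z: complex) -> complex:
    """Closed-form ZT of the unit step, valid only for |z| > 1."""
    return z / (z - 1)

# Illustrative points (assumptions): same angle, different radii
z_in_roc = 1.25 * cmath.exp(1j * math.pi / 3)    # |z| = 1.25, inside the ROC
z_out_roc = 0.80 * cmath.exp(1j * math.pi / 3)   # |z| = 0.80, outside the ROC

for N in (10, 50, 200):
    err = abs(partial_sum(z_in_roc, N) - closed_form(z_in_roc))
    size = abs(partial_sum(z_out_roc, N))
    print(f"N = {N:3d}: error inside ROC = {err:.3e}, "
          f"|S_N| outside ROC = {size:.3e}")
```

Inside the ROC the error shrinks towards zero as N grows, whereas outside the ROC the partial sums grow without bound, so the closed form does not represent the series there.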
4.6 Inverse Relationship Between Time and Frequency Domains
We have learnt that signals may be fully specified either in the time domain or in the frequency domain. The former gives the instantaneous values of the signal as a function of time, e.g. g(t), whereas the latter specifies the spectrum of the signal. In general the spectrum is given by the FT G(f) of the signal, but in the special case of a periodic signal having period T, the spectrum is specified as the amplitude $A_n$ and phase $\phi_n$ of each sinewave (of frequency $nf_o$) that adds to make up the signal, where $f_o = 1/T$. Once we specify g(t) then G(f) can be determined by Fourier analysis. If G(f) is specified then g(t) follows from Fourier synthesis. Thus, the time and frequency domains provide alternative methods of describing the same signal.

There are important general relationships between the features of a signal waveform g(t) and the features of its spectrum G(f). The shorter the time duration of a signal, the broader its spectrum. Observe in Figure 4.29 that the null bandwidth $B_n$ (considered as the range of significant positive frequencies) of the rectangular pulse of duration $\tau$ is

$$B_n = \frac{1}{\tau} \tag{4.153}$$
Thus, as demonstrated in Figure 4.46, if we expand the duration of a signal then its spectrum contracts by the same factor and vice versa, in such a way that the product of signal duration and signal bandwidth, called the time-bandwidth product, remains constant. The value of this constant depends on pulse shape. In the special case of a rectangular pulse, Eq. (4.153) and Figure 4.46 show that the constant is 1, which is the lowest possible for all pulse shape types, but this comes at the expense of stronger spectral sidelobes (beyond the main lobe). It can be seen from the bottom three plots in Figure 4.46 that the triangular pulse has a time-bandwidth product equal to 2. Note that we have used a null bandwidth definition for this discussion. The choice of a different bandwidth definition will change the value of the constant but not the inverse relationship.

Figure 4.46 Null bandwidths of rectangular and triangular pulses of various durations. [Panels: rectangular pulses g1(t), g2(t), g3(t) of duration τ = 1 ms, 0.5 ms, 0.25 ms with Bn = 1 kHz, 2 kHz, 4 kHz; triangular pulses g4(t), g5(t), g6(t) of duration τ = 1 ms, 0.5 ms, 0.25 ms with Bn = 2 kHz, 4 kHz, 8 kHz.]

A signal cannot be both strictly band-limited and strictly duration-limited. If the spectrum of a signal is precisely zero outside a finite frequency band then the time domain waveform g(t) will have infinite duration, although g(t) may tend to zero as t → ∞. Conversely, if g(t) is precisely zero outside a finite time duration then the spectrum G(f) will carry on and on, although |G(f)| will tend to zero as f → ∞. For example, a (duration-limited) rectangular pulse has a sinc spectrum, which carries on and on, and a (strictly band-limited) rectangular spectrum belongs to a sinc pulse.

Sudden changes in a signal produce high-frequency components in its spectrum. Figure 4.47 shows the spectra of two pulses of the same duration, but of different shapes. One pulse is rectangular and has sharp transitions between zero and maximum value, whereas the other pulse is cosine shaped with no sharp transitions. It can be observed that the amplitudes of the higher-frequency components decay less rapidly in the rectangular pulse.

Figure 4.47 Comparing the effect of sudden transitions in signals. There are more significant higher frequency components in g1(t) than in g2(t).

A signal that is periodic in the time domain will be discrete in the frequency domain with the spectrum consisting of spectral lines spaced apart by the reciprocal of the signal’s time domain period. Each spectral line in a single-sided spectrum represents a harmonic sinusoidal component of the signal. A signal that is nonperiodic in the time domain will be continuous in the frequency domain with the spectrum containing a continuum of spectral lines.

If a signal is continuous in the time domain then it will be nonperiodic in the frequency domain. Apart from the special cases of an impulse function and white noise, the amplitude spectrum of a CT signal will eventually tend to zero as frequency increases and will certainly not have a periodic pattern.

Finally, if a signal is discrete in the time domain then it will be periodic in the frequency domain. A DT signal is usually the result of taking regular data samples at interval $T_s$ or sampling rate $F_s = 1/T_s$. The spectrum of such a signal will have a repetitive (i.e. periodic) pattern at a regular frequency spacing $F_s$.
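The null bandwidths quoted for Figure 4.46 and the time-bandwidth products of 1 and 2 can be reproduced numerically. The sketch below approximates the FT of a pulse by a midpoint-rule sum and scans upward in frequency for the first null. It assumes unit-height pulses centred at t = 0 with total duration τ; the function names, the scan step of 1/(20τ), and the null threshold are illustrative assumptions.

```python
import cmath
import math

def rect(t: float, tau: float) -> float:
    """Unit-height rectangular pulse of duration tau centred at t = 0."""
    return 1.0 if abs(t) <= tau / 2 else 0.0

def tri(t: float, tau: float) -> float:
    """Unit-height triangular pulse of total duration tau centred at t = 0."""
    return max(0.0, 1.0 - 2.0 * abs(t) / tau)

def spectrum(pulse, tau: float, f: float, steps: int = 2000) -> complex:
    """Midpoint-rule approximation of the FT of pulse(t) over [-tau/2, tau/2]."""
    dt = tau / steps
    total = 0j
    for k in range(steps):
        t = -tau / 2 + (k + 0.5) * dt
        total += pulse(t, tau) * cmath.exp(-2j * math.pi * f * t) * dt
    return total

def first_null(pulse, tau: float, rel_tol: float = 1e-6, max_k: int = 200) -> float:
    """Scan f = k/(20*tau), k = 1, 2, ..., for the first point where |G(f)| is ~0."""
    peak = abs(spectrum(pulse, tau, 0.0))
    for k in range(1, max_k + 1):
        f = k / (20 * tau)
        if abs(spectrum(pulse, tau, f)) < rel_tol * peak:
            return f
    raise ValueError("no null found in the scanned range")

for tau in (1e-3, 0.5e-3, 0.25e-3):
    for name, pulse in (("rectangular", rect), ("triangular", tri)):
        Bn = first_null(pulse, tau)
        print(f"{name:11s} tau = {tau * 1e3:.2f} ms: "
              f"Bn = {Bn / 1e3:.1f} kHz, tau x Bn = {tau * Bn:.2f}")
```

Halving the duration doubles the null bandwidth for both shapes, while the product of duration and null bandwidth stays fixed at a value that depends only on the pulse shape.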
4.7 Frequency Domain Characterisation of LTI Systems
4.7.1 Transfer Function
The frequency response or transfer function H(f) of an LTI system is defined as the ratio between system output and input when the input is a sinusoidal signal of frequency f. It indicates how the system alters the amplitude and phase of each sinusoid passed through it and is in general therefore a complex quantity which we may express in the form

$$H(f) \equiv |H(f)|\angle\phi_H(f) = |H(f)|e^{j\phi_H(f)} \tag{4.154}$$

where |H(f)| and $\phi_H(f)$ are, respectively, the magnitude and phase of H(f). When a sinusoid of frequency $f_k$ is passed through a system with transfer function H(f) the sinusoid will have its amplitude multiplied by a factor of $|H(f_k)|$ and its phase increased by $\phi_H(f_k)$ radians, so we may write

$$\cos(2\pi f_k t + \phi_k) \xrightarrow{\;R\;} |H(f_k)|\cos[2\pi f_k t + \phi_k + \phi_H(f_k)] \tag{4.155}$$

|H(f)| therefore specifies gain as a function of frequency and is called the amplitude response or gain response of the system. The phase of H(f) is given by

$$\phi_H(f) = \tan^{-1}\left[\frac{\text{Im}\{H(f)\}}{\text{Re}\{H(f)\}}\right] \tag{4.156}$$

where Im{x} and Re{x} give the imaginary and real parts of x respectively. $\phi_H(f)$ specifies phase shift as a function of frequency and is called the phase response of the system.

In Section 3.6 we learnt how to determine the output of an LTI system using a time domain characterisation of the system in terms of its impulse response h(t). Given the transfer function H(f) of an LTI system, which is a frequency domain characterisation of the system, we can also obtain a general expression for the response of the system to an arbitrary input signal by taking the following steps, illustrated in Figure 4.48.

A single sinusoidal input of frequency $f_k$ yields a sinusoidal response at the same frequency with amplitude gain $|H(f_k)|$ and phase shift $\phi_H(f_k)$, whatever the value of $f_k$. In view of the principle of superposition stated in Eq. (3.126), when the input is a group of sinusoids of frequencies $f_k$, k = 1, 2, 3, …, N, the output will be the result of the system changing each sinusoidal signal in amplitude by a factor of $|H(f_k)|$ and in phase by a shift $\phi_H(f_k)$, and then adding these modified sinusoids together. Recall that this principle of superposition implies that what the system does to any one of the sinusoidal signals is not affected in any way by the presence of other signals. But an arbitrary signal x(t) is in fact a sum of sinusoids expressed as

$$x(t) = \int_{-\infty}^{\infty} X(f)e^{j2\pi ft}\,df \tag{4.157}$$
where X(f) is the FT of x(t). The integrand $[X(f)df]\exp(j2\pi ft)$ at the pair of frequency locations ±(f → f + df) of infinitesimally small width df represents a sinusoid of frequency f and amplitude 2|X(f)|df. The right-hand side and hence x(t) is therefore a continuous sum of sinusoids of all frequencies in the range f = (0, ∞). So, when we present x(t) to our system the output y(t) will be the result of modifying each input sinusoid of frequency f by the complex factor H(f) which includes both a gain factor and a phase shift, and then adding them all together in the same continuous (i.e. integral) fashion. That is

$$y(t) = \int_{-\infty}^{\infty} X(f)H(f)e^{j2\pi ft}\,df \tag{4.158}$$

Figure 4.48 Frequency domain approach in LTI system analysis. [Panels: (a) a single sinusoid $\cos(2\pi f_k t + \phi_k)$ through the LTI system H(f) gives $|H(f_k)|\cos[2\pi f_k t + \phi_k + \phi_H(f_k)]$; (b) a sum of N sinusoids $A_k\cos(2\pi f_k t + \phi_k)$ gives the sum of $A_k|H(f_k)|\cos[2\pi f_k t + \phi_k + \phi_H(f_k)]$; (c) an arbitrary input x(t) gives y(t) as in Eq. (4.158); (d) input spectrum X(f) gives output spectrum Y(f) = X(f)H(f).]
We see that Eqs. (4.157) and (4.158) have the same format. So since X(f) in Eq. (4.157) is the FT of x(t), it follows that X(f)H(f) in Eq. (4.158) must be the FT of y(t), denoted Y(f). Thus

$$Y(f) = H(f)X(f) \tag{4.159}$$

This important result states that the output spectrum Y(f) for a transmission through an LTI system is the product of the input spectrum and the system’s transfer function or frequency response H(f). Therefore, we now have two approaches to the analysis of an LTI system. In the time domain approach of Figure 3.24 we convolve the input signal x(t) with the system’s impulse response h(t) to give the output signal y(t). We can then obtain the output spectrum Y(f) if required by taking the FT of y(t). In the frequency domain approach of Figure 4.48, we multiply the input signal’s spectrum X(f) by the system’s transfer function H(f) to obtain the output signal’s spectrum Y(f). If desired, we may then take the IFT of Y(f) to obtain the output signal’s waveform y(t).

But how are h(t) and H(f) related? To answer this question we invoke the convolution property of the FT stated in Eq. (4.91). Taking the FT of both sides of the time domain relation y(t) = x(t) ∗ h(t) we obtain

$$\begin{aligned}
F[y(t)] &= F[x(t) * h(t)] = F[x(t)]F[h(t)] \\
Y(f) &= X(f)F[h(t)]
\end{aligned}$$

Equating the right-hand side of this equation with the right-hand side of Eq. (4.159) yields

$$F[h(t)] = H(f)$$

The impulse response h(t) and frequency response H(f) therefore form a FT pair

$$h(t) \rightleftharpoons H(f) \tag{4.160}$$
Equation (4.160) completes the picture of interrelationship between the two analysis approaches. H(f) can be determined theoretically by analysing the system’s circuit diagram or the channel’s signal propagation mechanisms. It may also be obtained experimentally by measuring the system’s gain and phase shift for a sinusoidal input of frequency stepped through the range of interest. The impulse response h(t) of the system is then obtained by taking the IFT of H(f). The next worked example presents the circuit analysis method.

Worked Example 4.14 Analysis of a Simple Lowpass Filter
In Chapter 3, Worked Example 3.9, we analysed an LPF using a time domain approach. In this example, we will follow an alternative frequency domain approach, which is the preferred method for most engineers. Determine the transfer function H(f) and impulse response h(t) of the simple RC LPF shown in Figure 4.49. For R = 1 kΩ and C = 79.58 nF, determine
(a) The output spectrum when the input signal is an impulse function 𝛿(t).
(b) The output voltage $v_2(t)$ for an input voltage $v_1(t) = 10\cos(2000\pi t + 30^\circ)$ V.
(c) The 3 dB bandwidth of the filter.

Figure 4.49 RC low-pass filter: (a) Circuit diagram; (b) Equivalent circuit for a sinusoidal input signal of amplitude $V_1$ and frequency f, with $Z_1 = R$ and $Z_2 = -j/2\pi fC$.
The equivalent RC circuit for a sinusoidal input signal of frequency f is shown in Figure 4.49b. The input voltage $V_1(f)$ is divided between the resistance R and the capacitance C according to the ratio of their impedances $Z_1$ and $Z_2$, respectively. The voltage drop across C is the required output voltage. Thus

$$H(f) \equiv \frac{V_2(f)}{V_1(f)} = \frac{Z_2}{Z_1 + Z_2} = \frac{-j/2\pi fC}{R - j/2\pi fC}$$

Multiplying the top and bottom of the last term by $j2\pi fC$ yields

$$\begin{aligned}
H(f) &= \frac{1}{1 + j2\pi fRC} = \frac{1}{\sqrt{1 + (2\pi fRC)^2}\,\angle\tan^{-1}(2\pi fRC)} \\
&= \frac{1}{\sqrt{1 + 4\pi^2 f^2 R^2 C^2}}\,\angle -\tan^{-1}(2\pi fRC) \\
&\equiv |H(f)|\angle\phi_H(f)
\end{aligned}$$

where

$$\begin{aligned}
|H(f)| &= \frac{1}{\sqrt{1 + 4\pi^2 f^2 R^2 C^2}} \equiv \text{Gain response} \\
\phi_H(f) &= -\tan^{-1}(2\pi fRC) \equiv \text{Phase response}
\end{aligned} \tag{4.161}$$

The impulse response h(t) is the IFT of H(f)

$$\begin{aligned}
h(t) = F^{-1}[H(f)] &= F^{-1}\left[\frac{1}{1 + j2\pi fRC}\right] \\
&= F^{-1}\left[\frac{1}{RC}\left(\frac{1}{1/RC + j2\pi f}\right)\right] \\
&= \frac{1}{RC}F^{-1}\left[\frac{1}{1/RC + j2\pi f}\right]
\end{aligned}$$

Looking in the list of FT pairs in Table 4.5, we find in entry 17 that

$$e^{-at}u(t) \rightleftharpoons \frac{1}{a + j2\pi f}$$

Therefore

$$F^{-1}\left[\frac{1}{1/RC + j2\pi f}\right] = e^{-t/RC}u(t)$$

And hence

$$h(t) = \frac{1}{RC}e^{-t/RC}u(t)$$

(a) Applying Eq. (4.159) and noting from Table 4.5 that the FT of a unit impulse is 1, we obtain the output spectrum when the input is 𝛿(t) as

$$Y(f) = H(f)F[\delta(t)] = H(f)$$

This confirms Eq. (4.160) and is also a general result: whenever the input is a unit impulse, the spectrum of the output signal gives the transfer function of the system. Substituting the values of R and C in the amplitude and phase expressions for H(f) yields the required amplitude and phase spectra

$$|H(f)| = \frac{1}{\sqrt{1 + 4\pi^2 f^2 (79.58 \times 10^{-9})^2 (1000)^2}} = \frac{1}{\sqrt{1 + (f/2000)^2}}$$

$$\phi_H(f) = -\tan^{-1}[2\pi fCR] = -\tan^{-1}(5 \times 10^{-4} f)$$

(b) $v_1(t)$ is a sinusoid of frequency f = 1000 Hz. The output $v_2(t)$ will be a sinusoid of the same frequency but with amplitude reduced by the factor |H(f)| and phase increased by $\phi_H(f)$. Thus

$$\begin{aligned}
v_2(t) &= 10|H(f)|\cos[2000\pi t + 30^\circ + \phi_H(f)]\,|_{f=1000} \\
&= \frac{10}{\sqrt{1 + (1000/2000)^2}}\cos[2000\pi t + 30^\circ + \tan^{-1}(-5 \times 10^{-4} \times 10^{3})] \\
&= 8.94\cos[2000\pi t + 30^\circ - 26.6^\circ] \\
&= 8.94\cos(2000\pi t + 3.4^\circ)
\end{aligned}$$

(c) The 3 dB bandwidth of the filter is the frequency $f_2$ at which the gain response |H(f)| is $1/\sqrt{2}$ of its peak value $|H(f)|_{\max}$. We see from Eq. (4.161) that $|H(f)|_{\max} = 1$ at f = 0, so we may write

$$|H(f_2)| = \frac{1}{\sqrt{1 + \left(\dfrac{f_2}{2 \times 10^3}\right)^2}} \equiv \frac{|H(f)|_{\max}}{\sqrt{2}} = \frac{1}{\sqrt{2}}$$

Squaring both sides reduces the equation to

$$1 + \left(\frac{f_2}{2 \times 10^3}\right)^2 = 2$$

and hence $f_2$ = 2 kHz.

(d) The impulse response h(t), gain response |H(f)| and phase response $\phi_H(f)$ of this LPF are plotted in Figure 4.50. You may wish to compare what we have done here with Worked Example 3.9 to decide which method you prefer and why.
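The numbers in Worked Example 4.14 can be checked with a few lines of Python. The sketch below is a minimal illustration: it evaluates H(f) = 1/(1 + j2πfRC) for the stated component values and reproduces the output amplitude and phase of part (b) and the 3 dB bandwidth of part (c). The function name `H` and the variable names are choices made here for illustration.

```python
import cmath
import math

R = 1e3          # resistance in ohms (from the worked example)
C = 79.58e-9     # capacitance in farads (from the worked example)

def H(f: float) -> complex:
    """Transfer function of the RC lowpass filter at frequency f in Hz."""
    return 1 / (1 + 1j * 2 * math.pi * f * R * C)

# Part (b): input v1(t) = 10 cos(2000*pi*t + 30 deg), i.e. f = 1000 Hz
f_in = 1000.0
amp_in, phase_in_deg = 10.0, 30.0
amp_out = amp_in * abs(H(f_in))
phase_out_deg = phase_in_deg + math.degrees(cmath.phase(H(f_in)))
print(f"v2(t) = {amp_out:.2f} cos(2000*pi*t + {phase_out_deg:.1f} deg) V")

# Part (c): |H(f2)| = 1/sqrt(2) when 2*pi*f2*R*C = 1
f_3dB = 1 / (2 * math.pi * R * C)
print(f"3 dB bandwidth = {f_3dB:.1f} Hz")
print(f"gain at f_3dB  = {abs(H(f_3dB)):.4f}")
```

The computed amplitude and phase agree with the values 8.94 V and 3.4° obtained above, and the 3 dB bandwidth comes out within a fraction of a hertz of 2 kHz because C was evidently chosen to make 2πRC very close to 5 × 10⁻⁴ s.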
Figure 4.50 Impulse response and transfer function of RC low-pass filter. [Panels: h(t) versus t from 0 to 5RC with peak value 1/RC; 20 log|H(f)| in dB and $\phi_H(f)$ in degrees, each versus f from −12 kHz to 12 kHz.]

4.7.2 Output Spectral Density of LTI Systems
It is useful to determine how the ESD of an energy signal and the PSD of a power signal are affected by transmission through an LTI system. We know from Eq. (4.159) that the output spectrum Y(f) of an LTI system of transfer function H(f) is related to the input spectrum X(f) by

$$Y(f) = X(f)H(f)$$

Taking the squared magnitude of both sides of this equation to obtain

$$|Y(f)|^2 = |X(f)|^2|H(f)|^2$$

and noting, from Eq. (4.129), that the square of the magnitude spectrum of a signal is the ESD of the signal, we obtain the relationship

$$\Psi_y(f) = \Psi_x(f)|H(f)|^2 \tag{4.162}$$
where $\Psi_y(f)$ and $\Psi_x(f)$ are, respectively, the ESD of the output and input signals y(t) and x(t). Since PSD is simply the average rate of ESD, a similar relationship between output PSD $S_y(f)$ and input PSD $S_x(f)$ will hold for power signals. Thus

$$S_y(f) = S_x(f)|H(f)|^2 \tag{4.163}$$

Therefore, the effect of the system is to modify (in other words, colour) the signal’s spectral density by the factor $|H(f)|^2$, which is the square of the system’s gain response. For an important example, consider white noise as the input signal. We know that this has a constant PSD = $N_o/2$. In view of Eq. (4.163), the noise at the system output will be coloured noise of PSD

$$S_{cn}(f) = \frac{N_o}{2}|H(f)|^2 \tag{4.164}$$

Summing the above contribution over the entire frequency axis yields the total noise power at the output of the system

$$P_n = \frac{N_o}{2}\int_{-\infty}^{\infty} |H(f)|^2\,df = N_o\int_{0}^{\infty} |H(f)|^2\,df \tag{4.165}$$
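To make Eq. (4.165) concrete, the following sketch applies it to the RC lowpass filter of Worked Example 4.14. It integrates $|H(f)|^2$ numerically up to a large but finite frequency and compares the result with the closed form $N_o/(4RC)$, which is the standard result for a first-order RC filter and is not derived in this section. The value of $N_o$, the truncation frequency and the step count are hypothetical choices for illustration.

```python
import math

R = 1e3          # ohms (Worked Example 4.14)
C = 79.58e-9     # farads (Worked Example 4.14)
No = 1e-9        # assumed noise PSD parameter in W/Hz (hypothetical value)

def gain_sq(f: float) -> float:
    """Squared gain response |H(f)|^2 of the RC lowpass filter."""
    return 1 / (1 + (2 * math.pi * f * R * C) ** 2)

def noise_power(No: float, f_max: float, steps: int) -> float:
    """Midpoint-rule estimate of No * integral of |H(f)|^2 from 0 to f_max."""
    df = f_max / steps
    return No * sum(gain_sq((k + 0.5) * df) for k in range(steps)) * df

# Truncate the infinite upper limit at 2 MHz (1000 times the 3 dB bandwidth)
P_numeric = noise_power(No, f_max=2e6, steps=200_000)
P_closed = No / (4 * R * C)   # standard closed form for a first-order RC filter

print(f"numerical estimate = {P_numeric:.4e} W")
print(f"closed form        = {P_closed:.4e} W")
print(f"relative error     = {abs(P_numeric - P_closed) / P_closed:.2e}")
```

The small residual difference comes from truncating the integral, since $|H(f)|^2$ decays only as $1/f^2$ and a little noise power lies above any finite upper limit.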
4.7.3 Signal and System Bandwidths
We have up to now used the term bandwidth at various points in the book but without a formal definition. The word bandwidth has in recent times become commonplace in society and is very often misused, even in technical circles. You hear of computer bandwidth and smartphone bandwidth (sic), etc. Download and upload speeds (in bits per second) of your broadband connection are perfectly correct societal usages. However, in telecommunications, bandwidth is not measured in bits per second and there is no direct one-to-one relation between bandwidth and bit rate. How much bit rate is achieved within a given transmission bandwidth very much depends on several link
design factors, such as modulation and coding schemes, ratio between signal power and noise plus interference, level of acceptable bit error rate, etc. We will explore the interplay of these parameters throughout the book, but for now it is time for a precise definition of bandwidth. The Fourier series expression for periodic signals – see Eq. (4.2) – would suggest that such signals contain an infinite number of harmonic frequencies. Similarly, the Fourier integral expression of Eq. (4.75) suggests including contributions from frequency components up to infinity in the continuous summation for the signal g(t). In realisable signals, however, the amplitude An of the nth harmonic sinusoid becomes very small for large n, and G(f ) becomes insignificant for large f , so these high-frequency components may be neglected with negligible effect on the signal g(t). The only exception will be if g(t) is an impulse function, in which case G(f ) = 1. The bandwidth of a signal is the range of positive frequencies of the sinusoidal signal components making a significant contribution to the signal’s power or energy. This is the width in hertz (Hz) of the significant SSAS or periodogram of the signal. The bandwidth of a communication channel (i.e. transmission medium) or electronic device or (in general) system is the range of positive frequencies of the sinusoidal signals that may be passed through the system from its input to its output without significant distortion. A lowpass or baseband signal contains significant positive frequencies down to or near DC (f = 0) and its bandwidth is usually taken as the maximum significant frequency component even if there is no DC component. A bandpass signal, on the other hand, has its significant positive frequencies centred at a high-frequency fc ≫ 0, and its bandwidth is always given by the range from the lowest to the highest significant positive frequency components. It is important to always bear this distinction in mind. 
For example, a sinusoidal message signal of frequency $f_m$ is treated as a baseband signal and its bandwidth is therefore $B = f_m$, but a sinusoidal carrier signal of high frequency $f_c$ is regarded as a bandpass signal, hence bandwidth B = 0. A similar explanation is applicable to lowpass and bandpass systems, where the former passes frequencies near DC and the latter passes frequencies centred around $f_c \gg 0$. A precise statement of what is significant in the above discussion leads to various bandwidth definitions, including (i) subjective bandwidth, (ii) null bandwidth, (iii) X dB bandwidth, (iv) fractional power containment bandwidth, and (v) noise equivalent bandwidth.

4.7.3.1 Subjective Bandwidth
The significant frequency range is chosen to be just enough for a subjectively acceptable signal quality. This bandwidth definition is employed exclusively for audio signals and the luminance and chrominance signals of television. For example, speech signals typically contain frequencies in the range 50 Hz to 7 kHz. The low-frequency components (50–200 Hz) enhance speaker recognition and naturalness, whereas the high-frequency components (3.5–7 kHz) enhance intelligibility, such as being able to distinguish between the sounds of ‘s’ and ‘f’. Good subjective speech quality is, however, obtained in telephone systems with the baseband spectrum limited to the range 300–3400 Hz and this subjective bandwidth has been adopted as the ITU-T standard for telephony. And in television, because the eye is not very sensitive to changes in colour, the fine details of colour changes are usually omitted to allow the use of a smaller bandwidth for the chrominance signals (represented as two colour-difference components, each of typical subjective bandwidth 1.3 MHz).

Examples of subjective bandwidths that have been standardised by international agreement for the purpose of maximising system capacity and radio spectrum exploitation include analogue telephone speech = 3400 Hz, AM broadcast signal = 10 kHz, FM broadcast signal = 200 kHz, and analogue television (TV) broadcast signal = 6 or 8 MHz (depending on standard and including luminance and chrominance signals). Note that analogue telephone and TV signals are nowadays digitised prior to transmission using digital technology.

4.7.3.2 Null Bandwidth
The significant frequency range extends up to the first null in the amplitude spectrum of the signal. Clearly, this bandwidth definition applies only to signals (mostly pulses) whose spectra have well-defined nulls. If the double-sided spectrum of the signal has a main lobe bounded by one null (zero amplitude spectral point) at a positive frequency $f_2$ and another null at a lower frequency $f_1$, then the bandwidth of the signal is given by

$$\text{Null bandwidth} = \begin{cases} f_2, & f_1 < 0 \ \text{(lowpass signal)} \\ f_2 - f_1, & f_1 > 0 \ \text{(bandpass signal)} \end{cases} \tag{4.166}$$

Figure 4.51 illustrates each of the two cases in Eq. (4.166).

Figure 4.51 Null-bandwidth: (a) Lowpass signal; (b) Bandpass signal.

4.7.3.3 3 dB Bandwidth
The spectra of many signals and the gain response of most systems do not have well-defined nulls. In such cases the significant frequency range extends up to the point at which the amplitude spectrum of the signal or the gain response of the system is down by X dB from its peak value. This bandwidth definition is mainly applied to filters or systems. The most common specification is for X = 3, known as the 3 dB bandwidth or half-power bandwidth since a 3 dB drop in amplitude corresponds to a reduction in signal power by half. Values of X = 6 and X = 60 are sometimes used to give the shape factor of a filter. Figure 4.52 shows the gain response of a fifth order Butterworth LPF with 3 dB bandwidth = 10 kHz and 60 dB bandwidth = 40 kHz. In general, if the gain response |H(f)| of a filter or system has peak value $|H(f)|_{\max}$, then the cut-off frequency $f_X$ for determining the X dB bandwidth of the system is obtained by solving the equation

$$|H(f_X)| = |H(f)|_{\max} \times 10^{-X/20} \tag{4.167}$$
Note that this was the approach followed in Worked Example 4.14c to determine the 3 dB bandwidth of the RC LPF. If, however, a plot (such as Figure 4.52) of the normalised gain response is available then it is a straightforward matter to read the bandwidth (for any value of X) from such a graph. Figure 4.53 shows various examples of 3 dB bandwidth.
Figure 4.52 Gain response of a fifth order Butterworth lowpass filter. [Axes: normalised gain in dB, from 0 down to −100 dB, versus frequency from −100 kHz to 100 kHz.]

Figure 4.53 3 dB bandwidths: (a) strictly bandlimited lowpass system; (b) non-bandlimited lowpass system; (c) strictly bandlimited bandpass system; and (d) non-bandlimited bandpass system.
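When no plot is to hand, Eq. (4.167) can be solved numerically. The sketch below does this by bisection for a fifth order Butterworth lowpass filter, assuming the standard Butterworth gain response $|H(f)| = 1/\sqrt{1 + (f/f_c)^{2n}}$ with $f_c$ = 10 kHz. That formula is a well-known result rather than one given in this section, and the function names and search limits are illustrative assumptions.

```python
import math

def butterworth_gain(f: float, fc: float, order: int) -> float:
    """Standard Butterworth lowpass gain response (assumed form, peak value 1)."""
    return 1 / math.sqrt(1 + (f / fc) ** (2 * order))

def x_db_bandwidth(gain, x_db: float, f_hi: float,
                   peak: float = 1.0, tol: float = 1e-6) -> float:
    """Solve gain(f) = peak * 10**(-x_db/20) on [0, f_hi] by bisection.

    Assumes gain(f) decreases monotonically with f, as for a lowpass response.
    """
    target = peak * 10 ** (-x_db / 20)
    lo, hi = 0.0, f_hi
    if gain(hi) > target:
        raise ValueError("f_hi is too small to bracket the cut-off frequency")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if gain(mid) > target:
            lo = mid      # still above the target gain: move right
        else:
            hi = mid      # at or below the target gain: move left
    return (lo + hi) / 2

fc = 10e3   # Butterworth cut-off parameter in Hz (assumption matching Figure 4.52)

def gain(f: float) -> float:
    return butterworth_gain(f, fc, order=5)

b3 = x_db_bandwidth(gain, 3, f_hi=1e6)
b60 = x_db_bandwidth(gain, 60, f_hi=1e6)
print(f"3 dB bandwidth  = {b3 / 1e3:.2f} kHz")
print(f"60 dB bandwidth = {b60 / 1e3:.2f} kHz")
```

The results are approximately 10 kHz and 40 kHz, in line with the values quoted for Figure 4.52. The 3 dB figure falls very slightly below $f_c$ because a drop of exactly 3 dB is a gain of 0.7079, marginally above $1/\sqrt{2}$.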
A filter is designed to have a non-flat frequency response, in order to pass a specified band of frequencies with little or no attenuation while heavily attenuating (or ideally blocking) all frequencies outside the passband. In most cases, it is desired that the filter’s gain is constant within the passband. There is, however, a class of filters, known as equalisers, that are designed not only to exclude signals in the stopband but also to compensate for the distortion effects introduced into the passband by, for example, the transmission medium. An equaliser has
a non-flat frequency response, which when combined with the response of the transmission medium gives the desired overall flat response. Examples of filters include the lowpass filter (LPF), which passes only frequencies below a cut-off frequency F1. A highpass filter (HPF) passes only frequencies above a cut-off frequency F2. A bandpass filter (BPF) passes only those frequencies in the range from F1 to F2. A bandstop filter (BSF) passes all frequencies except those in the range from F1 to F2. The normalised frequency response of an ideal LPF is shown in Figure 4.54a. The gain of the filter drops from 1 to zero at the cut-off frequency F1. Such a brick wall filter is not feasible in real time. A realisable filter, shown in Figure 4.54b, requires a finite frequency interval in which to make the transition from passband, where the minimum normalised gain is 1/√2 = 0.707, to stopband, where the maximum gain is 𝛼min. The interval from F1 to the start Fs of the stopband is known as the transition band.
4.7.3.4 Fractional Power Containment Bandwidth
The significant frequency range is the minimum range containing no less than a specified fraction of the total signal power. For example, the 95% fractional power containment bandwidth of a baseband signal is the range of frequencies, starting from DC, that contains at least 95% of the total signal power. In the case of a bandpass signal, this range is centred on the centre frequency. Fractional power containment bandwidth definition is used mainly for specifying transmission bandwidths. For example, in frequency modulation (FM), Carson’s bandwidth is the width of the band of frequencies (centred on the carrier frequency) which contains at least 98% of the total power in the FM signal. The Federal Communications Commission (FCC) specifies occupied bandwidth as the frequency range containing 99% of signal power, with exactly 0.5% of the power above and 0.5% below the occupied band. The p percent (p%) power containment bandwidth Bp of a periodic signal is obtained by first calculating the total power Pt of the signal in the time domain (from its waveform structure) and then totalling the cumulative power PN in the frequency components starting from DC – see Eq. (4.35) – until a harmonic np is first reached that
satisfies the first line of Eq. (4.168) below. Bp is then given by the second line.

$$A_0^2 + \frac{1}{2}\sum_{n=1}^{n_p} A_n^2 \ge \frac{p}{100} P_t$$
$$B_p = n_p f_o, \qquad f_o \equiv \text{fundamental frequency} \tag{4.168}$$

Figure 4.54 Frequency response of (a) ideal (also called brick wall) lowpass filter; (b) realisable lowpass filter. [Panel (a): passband from −F1 to F1 with gain 1, stopband elsewhere. Panel (b): passband with minimum gain 1/√2, transition band from F1 to Fs, and stopband with maximum gain 𝛼min.]
The above computation may be carried out by trial and error (using a computer code that returns the cumulative power up to any given harmonic), or it may be conveniently laid out in tabular form as done in Worked Example 4.15.

For nonperiodic signals, Bp may be obtained by integrating the signal's single-sided periodogram from f = 0 up to the lowest positive frequency at which the cumulative energy or power reaches p% of the total. For example, an energy signal g(t) of energy E (computed from its waveform structure) is guaranteed to have a FT G(f). So, Bp is obtained by evaluating the following equation (using numerical integration if |G(f)|² cannot be integrated to obtain a closed form expression)

$$\frac{2}{E}\int_0^{B_p} |G(f)|^2\,df = \frac{p}{100} \tag{4.169}$$
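Eq. (4.169) can be evaluated numerically when a closed form is inconvenient. The sketch below is illustrative only: it assumes a rectangular pulse of amplitude A and duration τ, for which |G(f)| = Aτ|sinc(fτ)| and E = A²τ, applies the trapezoidal rule, and finds Bp by bisection. The function names, step counts, and search limit are hypothetical choices.

```python
import math

def sinc(x):
    """Normalised sinc: sin(pi x)/(pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def energy_fraction(b, amp=1.0, tau=1.0, steps=4000):
    """Left-hand side of Eq. (4.169) for a rectangular pulse (trapezoidal rule).

    Assumes |G(f)| = amp * tau * |sinc(f * tau)| and E = amp**2 * tau.
    """
    energy = amp ** 2 * tau
    h = b / steps
    total = 0.0
    for k in range(steps + 1):
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * (amp * tau * sinc(k * h * tau)) ** 2
    return 2.0 * total * h / energy

def containment_bandwidth(p, tau=1.0, f_hi=None, iterations=40):
    """Lowest B at which energy_fraction(B) reaches p/100, by bisection.

    f_hi must be large enough that energy_fraction(f_hi) exceeds p/100;
    the default of 10/tau is adequate up to roughly p = 98.
    """
    lo, hi = 0.0, (10.0 / tau if f_hi is None else f_hi)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if energy_fraction(mid, tau=tau) < p / 100.0:
            lo = mid
        else:
            hi = mid
    return hi

main_lobe_fraction = energy_fraction(1.0)   # band from 0 to 1/tau, with tau = 1
b90 = containment_bandwidth(90.0)           # in units of 1/tau
print(f"Fraction of energy up to first null: {main_lobe_fraction:.4f}")
print(f"90% containment bandwidth: {b90:.3f}/tau")
```

The bisection relies on the cumulative energy being a non-decreasing function of B, which holds because the integrand |G(f)|² is never negative.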
Worked Example 4.15 Further Analysis of Rectangular Pulse Train

We wish to analyse an RPT (such as Figure 3.19a) having amplitude A = 10 V, duty cycle d = 1/5, and period T = 10 μs to determine the following quantities:

(a) Its 95% fractional power containment bandwidth B95, determined to the nearest whole percentage.
(b) Percentage of power contained in the main lobe.
(c) Percentage of power contained in the first and second sidelobes.

The total power of the RPT is obtained from Eq. (3.105) as

$$P_t = A_{\text{rms}}^2 = A^2 d = 20\ \text{W}$$

The Fourier series of the RPT is derived in Worked Example 4.1 and is as stated in Eq. (4.17), with DC component A0 = Ad = 2 V, amplitude of nth harmonic An = 2Ad|sinc(nd)| = 4|sinc(nd)|, and fundamental frequency fo = 1/T = 100 kHz. A tabular layout of the computations is given in Table 4.6. Column 1 is a list of the harmonic numbers from n = 0 to 16. Column 2 lists the amplitude of each harmonic, with the corresponding power in column 3, calculated as A0² for DC power, and An²/2 for the power in each harmonic n = 1 to 16. Column 4 presents a running total of cumulative power starting from DC. Finally, column 5 converts column 4 to percentage power by dividing each column 4 entry by total power Pt and multiplying by 100.

(a) Looking down the fifth column of Table 4.6, we see that the percentage of power first reaches or exceeds 95% (to the nearest whole integer) at harmonic n = 8. Thus the 95% fractional power containment bandwidth of this pulse train is B95 = 8fo = 800 kHz.

(b) It can be seen from Table 4.6 that the first null of the spectrum occurs at harmonic n = 5, so this is the end of the main lobe. Column 5 indicates that at this point the cumulative power is 90.2878% of total power. Thus, the percentage of power in the main lobe is 90.29%.

(c) The first sidelobe is the frequency range from the first null at n = 5 to the second null at n = 10. We see from Table 4.6 that, over this range, power increases from 90.2878% to 94.9946%. Thus, percentage of power in the first sidelobe is 94.9946 − 90.2878 = 4.71%. Similarly, the second sidelobe is the range from n = 10 to n = 15. Over this range, power increases from 94.9946% to 96.6413%. The percentage of power in the second sidelobe is therefore 96.6413 − 94.9946 = 1.65%.
Table 4.6 Worked Example 4.15. Harmonics of rectangular pulse train and their power contributions.

n     An, V     Pn, W     Cumulative power P∑, W    % of total power, r∑ = 100(P∑/Pt)
0     2         4         4                         20.0000
1     3.7420    7.0011    11.0011                   55.0056
2     3.0273    4.5823    15.5834                   77.9171
3     2.0182    2.0366    17.6200                   88.1000
4     0.9355    0.4376    18.0576                   90.2878
5     0         0         18.0576                   90.2878
6     0.6237    0.1945    18.2520                   91.2602
7     0.8649    0.3741    18.6261                   93.1305
8*    0.7568    0.2864    18.9125                   94.5625
9     0.4158    0.0864    18.9989                   94.9946
10    0         0         18.9989                   94.9946
11    0.3402    0.0579    19.0568                   95.2839
12    0.5046    0.1273    19.1841                   95.9204
13    0.4657    0.1085    19.2925                   96.4627
14    0.2673    0.0357    19.3283                   96.6413
15    0         0         19.3283                   96.6413
16    0.2339    0.0273    19.3556                   96.7780

* The harmonic at which cumulative power first reaches 95% (rounded to the nearest integer) is marked with an asterisk.
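The trial-and-error route mentioned for Eq. (4.168) might look like the following minimal sketch, which regenerates the columns of Table 4.6 from the stated values A = 10 V, d = 1/5, and T = 10 μs. The variable names are hypothetical; the rounding rule follows the "nearest whole percentage" convention of the worked example, and a strict threshold is shown alongside for comparison.

```python
import math

def sinc(x):
    """Normalised sinc: sin(pi x)/(pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

A, d, T = 10.0, 0.2, 10e-6        # amplitude (V), duty cycle, period (s)
f_o = 1.0 / T                     # fundamental frequency, 100 kHz
P_t = A ** 2 * d                  # total power from the waveform, 20 W

rows = []                         # (n, An, Pn, cumulative power, % of total)
cumulative = 0.0
for n in range(17):
    if n == 0:
        amp = A * d               # DC component A0
        power = amp ** 2
    else:
        amp = 2 * A * d * abs(sinc(n * d))
        power = amp ** 2 / 2
    cumulative += power
    rows.append((n, amp, power, cumulative, 100 * cumulative / P_t))

# First harmonic whose cumulative percentage reaches 95%
n95_rounded = next(n for n, *_, pct in rows if round(pct) >= 95)  # nearest whole %
n95_strict = next(n for n, *_, pct in rows if pct >= 95)          # no rounding
B95 = n95_rounded * f_o

for n, amp, power, cum, pct in rows:
    print(f"{n:2d}  {amp:7.4f}  {power:7.4f}  {cum:8.4f}  {pct:8.4f}")
print(f"B95 (rounded rule) = {B95 / 1e3:.0f} kHz; strict rule gives n = {n95_strict}")
```

The choice of rounding rule matters here: with the percentage rounded to the nearest whole number the threshold is met at n = 8, whereas a strict 95% threshold is not met until n = 11.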
4.7.3.5 Noise Equivalent Bandwidth
We are often interested in the noise power at the output of a system, for example to determine signal-to-noise ratio (SNR), which gives a measure of transmission quality. Usually, the system, through its frequency-dependent gain response |H(f)|, will pass more noise at some frequencies than at others. But as far as output noise power is concerned this system is equivalent to a system with a rectangular-shaped gain response that equally admits all the white noise falling within its finite passband. As illustrated in Figure 4.55 this noise-equivalent system has a bandwidth B and a constant gain response K (within its passband) equal to the maximum value |H(f)|max of the gain response of the original system. We may formally define the noise equivalent bandwidth of a system as the bandwidth of a noiseless ideal brick wall filter, of constant gain response equal to the maximum gain response of the actual system, which passes the same amount of noise power through to its output as does a noiseless version of the actual system when both have white noise of equal PSD as input. From Eq. (4.165) and as shown in Figure 4.55, the noise power at the output of the equivalent system is

$$P_{ne} = K^2 N_o B \quad \text{(noise power at system output)} \tag{4.170}$$
For equivalence this noise power must be equal to the noise power Pna at the output of the actual system – obtained earlier in Eq. (4.165) and repeated in Figure 4.55. Equating the expressions for Pne and Pna yields
an expression for B as

$$B = \frac{\int_0^\infty |H(f)|^2\,df}{|H(f)|_{\max}^2} \tag{4.171}$$

Figure 4.55 Noise equivalent bandwidth B. [The figure compares the actual gain response |H(f)|, of peak value |H(f)|max, with the noise-equivalent rectangular gain response |He(f)| of constant gain K = |H(f)|max over −B ≤ f ≤ B. With white noise input of PSD Sx(f) = No/2, the actual system has output noise power Pna = No ∫₀^∞ |H(f)|² df and the equivalent system has output noise power Pne = K²NoB.]
Equation (4.171) defines the noise-equivalent bandwidth of the system. This is a very useful concept, which allows us to work with the more convenient ideal filter, as far as noise is concerned. In other words, we replace the actual system of frequency-dependent gain response |H(f)| with an ideal brick wall filter of bandwidth B, given by Eq. (4.171). Once we know the noise-equivalent bandwidth of a system, it is a matter of straightforward multiplication to obtain noise power using

$$P_n = N_o B \quad \text{(noise power referred to system input)} \tag{4.172}$$
for power referred to input, or Eq. (4.170) for output power. We will adopt this approach in all system noise calculations, and it will be taken for granted that B (except where otherwise indicated) refers to noise-equivalent bandwidth. If, however, this bandwidth is not known then the 3 dB bandwidth of the system may be used in its place. This substitution underestimates noise power but the error will be small if the filter response has steep sides with a small transition width.

Worked Example 4.16 Noise Equivalent Bandwidth of Raised Cosine Filter

The raised cosine filter is widely employed in digital transmission systems to reduce inter-symbol interference. For transmission at symbol rate Rs (baud), the normalised gain response of the filter is

$$|H(f)| = \begin{cases} 1, & |f| \le f_1 \\ \dfrac{1}{2}\left[1 + \cos\left(\pi\dfrac{|f| - f_1}{f_2 - f_1}\right)\right], & f_1 \le |f| \le f_2 \\ 0, & |f| \ge f_2 \end{cases} \tag{4.173}$$

$$f_1 = (1 - \alpha)R_s/2; \qquad f_2 = (1 + \alpha)R_s/2; \qquad 0 \le \alpha \le 1$$

where 𝛼 is known as the roll-off factor of the filter.
Figure 4.56 Gain response of raised cosine filter for various values of roll-off factor 𝛼. [Plot of |H(f)| versus f from −Rs to Rs for 𝛼 = 0, 0.2, 0.5, and 1.]
We wish to determine the null bandwidth Bnull and noise equivalent bandwidth B of the filter in terms of 𝛼 and Rs. Putting f = f2 in Eq. (4.173) yields |H(f2)| = 0.5(1 + cos 𝜋) = 0. Thus, the filter reaches its first and only null at f = f2 = (1 + 𝛼)Rs/2, so its null bandwidth is

$$B_{\text{null}} = \begin{cases} (1 + \alpha)\dfrac{R_s}{2}, & \text{Baseband } (f \text{ centred at } 0) \\ (1 + \alpha)R_s, & \text{Bandpass } (f \text{ centred at } f_c \gg R_s) \end{cases} \tag{4.174}$$

This null bandwidth has a minimum baseband value Bnull = Rs/2 when 𝛼 = 0 (which corresponds to an ideal Nyquist channel, unrealisable in real time), and a maximum value Bnull = Rs when 𝛼 = 1 (which corresponds to a full-cosine roll-off filter). The gain response of the filter is shown in Figure 4.56 for various values of 𝛼. With a gain of zero at all frequencies above Bnull, the raised cosine filter has the effect of limiting the spectrum of signals passed through it to an occupied bandwidth equal to Bnull. The (occupied) transmission bandwidth Bocc of a radio transmission link that employs a raised cosine filter of roll-off factor 𝛼 and operates at symbol rate Rs baud is therefore given by

$$B_{\text{occ}} = R_s(1 + \alpha) \tag{4.175}$$
To compute the noise equivalent bandwidth B of the raised cosine filter, we use Eq. (4.173) in (4.171) with |H(f)|max = 1, employing the substitution 𝜃 = 𝜋(f − f1)/(f2 − f1) to evaluate the resulting integrals

$$\begin{aligned}
B &= \int_0^\infty |H(f)|^2\,df = \int_0^{f_1} 1\,df + \frac{1}{4}\int_{f_1}^{f_2}\left[1 + \cos\left(\pi\frac{f - f_1}{f_2 - f_1}\right)\right]^2 df + \int_{f_2}^{\infty} 0\cdot df \\
&= f_1 + \frac{1}{4}\int_{f_1}^{f_2}\left[\frac{3}{2} + 2\cos\left(\pi\frac{f - f_1}{f_2 - f_1}\right) + \frac{1}{2}\cos\left(2\pi\frac{f - f_1}{f_2 - f_1}\right)\right] df \\
&= f_1 + \frac{3}{8}(f_2 - f_1) + \frac{1}{4}\,\frac{f_2 - f_1}{\pi}\int_0^{\pi} 2\cos\theta\,d\theta + \frac{1}{4}\,\frac{f_2 - f_1}{\pi}\int_0^{\pi}\frac{1}{2}\cos 2\theta\,d\theta \\
&= f_1 + \frac{3}{8}(f_2 - f_1) + 0 + 0 = \frac{3}{8}f_2 + \frac{5}{8}f_1 \\
&= \frac{3}{8}(1 + \alpha)\frac{R_s}{2} + \frac{5}{8}(1 - \alpha)\frac{R_s}{2} = \frac{R_s}{2}(1 - \alpha/4)
\end{aligned}$$

Thus, the noise equivalent bandwidth of a raised cosine filter of roll-off factor 𝛼 is

$$B = \begin{cases} (1 - \alpha/4)\dfrac{R_s}{2}, & \text{Baseband } (f \text{ centred at } 0) \\ (1 - \alpha/4)R_s, & \text{Bandpass } (f \text{ centred at } f_c \gg R_s) \end{cases} \tag{4.176}$$

Equations (4.175) and (4.176) are two very important results that will serve us well in the rest of the book.

Figure 4.57 Distortionless transmission system. [Input x(t) is applied to a system H(f) whose output is y(t) = Kx(t − to), a scaled copy of x(t) delayed by to.]
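The closed-form baseband result of Worked Example 4.16 can be cross-checked by evaluating Eq. (4.171) numerically. The sketch below is only an illustration: it uses a midpoint-rule integration of Eq. (4.173), and the symbol rate, roll-off factor, and function names are hypothetical choices.

```python
import math

def raised_cosine_gain(f, rs, alpha):
    """Normalised gain |H(f)| of the raised cosine filter, Eq. (4.173)."""
    f1 = (1 - alpha) * rs / 2
    f2 = (1 + alpha) * rs / 2
    af = abs(f)
    if af <= f1:
        return 1.0
    if af >= f2:
        return 0.0
    return 0.5 * (1 + math.cos(math.pi * (af - f1) / (f2 - f1)))

def noise_equivalent_bandwidth(gain, f_hi, peak=1.0, steps=100000):
    """Eq. (4.171) by midpoint-rule integration of |H(f)|^2 over [0, f_hi].

    f_hi must cover all frequencies at which the gain is non-negligible.
    """
    h = f_hi / steps
    area = sum(gain((k + 0.5) * h) ** 2 for k in range(steps))
    return area * h / peak ** 2

rs, alpha = 1e6, 0.5                       # hypothetical: 1 Mbaud, roll-off 0.5
b_numeric = noise_equivalent_bandwidth(
    lambda f: raised_cosine_gain(f, rs, alpha), f_hi=rs)
b_closed = (rs / 2) * (1 - alpha / 4)      # baseband case of Eq. (4.176)
b_occ = rs * (1 + alpha)                   # Eq. (4.175)
print(f"Numerical B   = {b_numeric / 1e3:.3f} kHz")
print(f"Closed-form B = {b_closed / 1e3:.3f} kHz")
print(f"Occupied bandwidth = {b_occ / 1e6:.2f} MHz")
```

The same `noise_equivalent_bandwidth` helper accepts any gain function, so it can be reused for responses that have no convenient closed-form integral.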
4.7.4 Distortionless Transmission

When a signal is transmitted through a medium, it is desired that the received signal be an exact copy of the transmitted signal except for some propagation delay, which causes the signal to be received later than it was transmitted. This ideal performance is known as distortionless transmission. We have observed that a transmission medium can be characterised by its transfer function, which completely specifies how the medium modifies the amplitudes and phases of frequency components in an information signal. We are interested in determining the specification of a transfer function that would achieve distortionless transmission. Figure 4.57 illustrates the input and output waveforms of a distortionless transmission channel or system having transfer function H(f), which we seek to determine. The channel output y(t) is a scaled and delayed version of the input x(t) according to

$$y(t) = Kx(t - t_o) \tag{4.177}$$

where K is a positive real constant. If y(t) has FT Y(f) and x(t) has FT X(f), then recalling the time shifting property of FT given by Eq. (4.80) and taking the FT of both sides of this equation yields

$$Y(f) = KX(f)\exp(-j2\pi f t_o) = X(f)\,K\exp(-j2\pi f t_o) \equiv X(f)H(f)$$

where we have used the fact stated in Eq. (4.159) that the output spectrum Y(f) is the product of the input spectrum X(f) and the system's transfer function H(f). Therefore, to achieve distortionless transmission, the required channel transfer function specification is

$$H(f) = K\exp(-j2\pi f t_o) = K\angle{-2\pi t_o f} \equiv |H(f)|\angle\phi_H(f) \tag{4.178}$$

We see that two conditions must be satisfied, as illustrated in Figure 4.58. First, the gain response of the channel must be constant and, second, the phase response of the channel must be linear

$$|H(f)| = K, \qquad \text{Constant gain response}$$
$$\phi_H(f) = -2\pi t_o f, \qquad \text{Linear phase response} \tag{4.179}$$
The transfer function of practical transmission channels will in general not satisfy the above conditions without additional filtering. Over the frequency band of interest, any departure of the gain response of the channel from a constant value K gives rise to attenuation distortion. Similarly, any departure of the phase response from a linear
graph gives rise to phase distortion, also called delay distortion. A parameter known as the group delay 𝜏g of the channel is related to the slope of the channel's phase response by

$$\tau_g(f) = -\frac{1}{2\pi}\frac{d\phi_H(f)}{df} \tag{4.180}$$

Figure 4.58 (a) Gain response, and (b) phase response of a distortionless transmission system. [Panel (a): constant gain |H(f)| = K. Panel (b): linear phase 𝜙H(f) with slope = −2π × delay.]
Substituting the expression for 𝜙H(f) from Eq. (4.179) into the above definition, we see that a channel with no phase distortion and hence a constant-slope phase response will have a constant group delay 𝜏g(f) = to. For all other channels, the phase response is nonlinear, and this means that group delay will be a function of frequency, varying over the frequency band of interest. A frequency domain measure of the phase distortion incurred in transmission through a distorting channel is usually given by differential delay 𝜏d, which is the difference between the maximum and minimum values of group delay within the frequency band of interest.

$$\tau_d = \tau_g(f)\big|_{\max} - \tau_g(f)\big|_{\min} \tag{4.181}$$
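Eqs. (4.180) and (4.181) can be illustrated with a short numerical sketch. The example below assumes, purely for illustration, a channel with the phase response of a first-order RC lowpass filter, 𝜙H(f) = −arctan(f/fc), and estimates the phase slope by a central difference. The cut-off frequency, band of interest, and function names are hypothetical.

```python
import math

def rc_phase(f, fc):
    """Phase response (rad) of a first-order RC lowpass filter (assumed example)."""
    return -math.atan(f / fc)

def group_delay(phase, f, df=1.0):
    """Eq. (4.180), with the phase slope estimated by a central difference."""
    slope = (phase(f + df) - phase(f - df)) / (2 * df)
    return -slope / (2 * math.pi)

fc = 10e3                                          # hypothetical cut-off, Hz
band = [k * 100.0 for k in range(0, 101)]          # band of interest: 0 to 10 kHz
delays = [group_delay(lambda f: rc_phase(f, fc), f) for f in band]

tau_d = max(delays) - min(delays)                  # differential delay, Eq. (4.181)
print(f"Group delay at 0 Hz    = {delays[0] * 1e6:.3f} us")
print(f"Group delay at 10 kHz  = {delays[-1] * 1e6:.3f} us")
print(f"Differential delay     = {tau_d * 1e6:.3f} us")
```

For this assumed phase response the group delay is 1/(2πfc) at DC and falls to half that value at f = fc, so the differential delay over the band 0 to fc is 1/(4πfc).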
This frequency domain measure of phase distortion is not to be confused with the time domain measures of delay, namely average delay and rms delay spread, which are discussed in Section 3.2.1 in connection with multipath propagation.

Distortionless transmission through a channel may be approximated over a desired frequency range by using a filter known as an equaliser at the output of the channel to compensate for the amplitude and phase distortions caused by the channel. The arrangement is as shown in Figure 4.59. Since the overall system is distortionless, we may write

$$H_c(f)H_e(f) = K\exp(-j2\pi f t_o)$$
$$|H_c(f)|\angle\phi_{H_c}(f) \times |H_e(f)|\angle\phi_{H_e}(f) = K\angle{-2\pi t_o f}$$
$$|H_c(f)||H_e(f)|\angle\big(\phi_{H_c}(f) + \phi_{H_e}(f)\big) = K\angle{-2\pi t_o f}$$

Figure 4.59 Channel equalisation. [Input signal → distorting channel Hc(f) → equaliser He(f) → output signal; the cascade forms a distortionless channel H(f) = Hc(f)He(f).]

Equating the magnitude and phase terms on both sides yields

$$|H_c(f)||H_e(f)| = K$$
$$\phi_{H_c}(f) + \phi_{H_e}(f) = -2\pi t_o f$$

Thus

$$|H_e(f)| = \frac{K}{|H_c(f)|}$$
$$\phi_{H_e}(f) = -[2\pi t_o f + \phi_{H_c}(f)] \tag{4.182}$$

Eq. (4.182) stipulates that

● The gain response of the equaliser should be proportional to the reciprocal of the gain response of the distorting channel. This provides gain (or attenuation) equalisation.
● The sum of the phase response of the equaliser and that of the distorting channel should be a linear function of frequency. This gives phase (or delay) equalisation.
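The two conditions of Eq. (4.182) can be checked numerically for a particular channel. The sketch below is illustrative only: the distorting channel is assumed to be a first-order lowpass response Hc(f) = 1/(1 + jf/fc), and the values of K, to, fc, and the test frequencies are hypothetical.

```python
import cmath
import math

K, T0 = 1.0, 1e-3          # hypothetical overall gain and delay (s)

def channel(f, fc=3e3):
    """Hypothetical distorting channel: first-order lowpass response."""
    return 1.0 / (1.0 + 1j * f / fc)

def equaliser(f):
    """He(f) = K exp(-j 2 pi f to) / Hc(f), which satisfies Eq. (4.182)."""
    return K * cmath.exp(-2j * math.pi * f * T0) / channel(f)

freqs = [300.0, 1000.0, 3400.0]                    # test frequencies, Hz
overall = [channel(f) * equaliser(f) for f in freqs]

# Overall gain should equal K at every frequency
gains = [abs(h) for h in overall]
# Residual phase after removing the linear term -2*pi*f*to should be zero
phase_errors = [cmath.phase(h * cmath.exp(2j * math.pi * f * T0))
                for f, h in zip(freqs, overall)]

for f, g, e in zip(freqs, gains, phase_errors):
    print(f"f = {f:6.0f} Hz: overall gain = {g:.6f}, phase error = {e:.2e} rad")
```

Removing the linear term before taking the phase avoids the wrap-around at ±π that would otherwise obscure the comparison with −2πtof.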
Attenuation equalisation alone may be enough in some applications such as speech transmission. For example, the attenuation (in dB) of audio telephone lines increases as the square root of frequency. The gain response of the transmission medium in this case is

$$|H_c(f)| = K_1\exp(-a\sqrt{f})$$

An equaliser with gain response

$$|H_e(f)| = K_2\exp(+a\sqrt{f})$$

will adequately compensate for the attenuation distortion of the medium. The overall gain response of the combination of transmission medium and equaliser in tandem is flat and independent of frequency, since

$$|H_c(f)||H_e(f)| = K_1\exp(-a\sqrt{f})\,K_2\exp(+a\sqrt{f}) = K_1K_2 = K, \quad \text{a constant}$$
4.7.5 Attenuation and Delay Distortions

An important cause of attenuation and delay distortions is multipath propagation, which arises when the signal arrives at the receiver having travelled over more than one path with different delays. Such a situation arises, for example, when a radio signal is received both by direct transmission and by reflection from an obstacle. If we consider two paths that differ in propagation time by Δ𝜏, this translates to a phase difference of 2𝜋fΔ𝜏 between two signals x1(t) and x2(t) of frequency f arriving over the two paths. If the phase difference is an integer multiple of 2𝜋 then x1(t) and x2(t) add constructively to give a received signal x(t) that has an enhanced amplitude. However, if the phase difference is an odd integer multiple of 𝜋 then the two components add destructively, and the received signal is severely attenuated. If x1(t) and x2(t) have equal amplitude then x(t) is zero under this situation. In practice, the direct signal (termed the primary signal) has a larger amplitude than the reflected signal (termed the secondary signal). The received signal amplitude therefore varies from a nonzero minimum under destructive interference to a maximum value under constructive interference. For values of phase difference other than integer multiples of 𝜋, the amplitude and phase of x(t) are determined according to the method of sinusoidal signal addition studied in Section 2.7.3. There are three important consequences:

● Because the phase difference between the two paths is a function of frequency, the amplitude and phase of the received (resultant) signal depend on frequency. Some frequencies are severely attenuated, whereas some are enhanced. This results in attenuation and phase distortion.
● The propagation delay difference between the two paths depends on the location of the receiver. This gives rise to fast fading in mobile communication systems where there may be dynamic relative motion involving receiver, transmitter, and multipath sources.
● In digital communications, multipath propagation over two or more differently delayed paths gives rise to pulse broadening or dispersion. One transmitted narrow pulse becomes a sequence of two or more narrow pulses at the receiver and this is received as one broadened pulse. Broader pulses place a limit on the pulse rate (also known as the symbol rate) that can be used without the overlap of adjacent pulses.

An equaliser may be used, as earlier discussed, to compensate for amplitude and phase distortions over a desired frequency range. However, in a dynamic mobile communication environment, the gain and phase responses of the channel will vary with time. The equaliser will therefore need to be implemented as an adaptive filter with filter coefficients that are regularly optimised to satisfy Eq. (4.182) in the face of changing channel conditions.
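The frequency dependence of a two-path channel can be made concrete with a small sketch. It assumes a simple model that is not given explicitly in the text: a primary path of unit gain plus one secondary path of relative amplitude b and excess delay Δ𝜏, so that H(f) = 1 + b exp(−j2𝜋fΔ𝜏). The numerical values and names are hypothetical.

```python
import cmath
import math

def two_path_response(f, delta_tau, b):
    """H(f) = 1 + b exp(-j 2 pi f delta_tau): primary path plus one echo (assumed model)."""
    return 1.0 + b * cmath.exp(-2j * math.pi * f * delta_tau)

delta_tau = 1e-6      # hypothetical excess delay of the secondary path, 1 us
b = 0.5               # hypothetical relative amplitude of the secondary signal

f_destructive = 1.0 / (2 * delta_tau)    # phase difference = pi
f_constructive = 1.0 / delta_tau         # phase difference = 2 pi

g_min = abs(two_path_response(f_destructive, delta_tau, b))
g_max = abs(two_path_response(f_constructive, delta_tau, b))
fade_depth_db = 20 * math.log10(g_max / g_min)

print(f"Gain at {f_destructive / 1e3:.0f} kHz (destructive)   = {g_min:.3f}")
print(f"Gain at {f_constructive / 1e3:.0f} kHz (constructive) = {g_max:.3f}")
print(f"Peak-to-null gain variation = {fade_depth_db:.2f} dB")
```

With b < 1 the minimum gain is 1 − b rather than zero, which matches the observation that a weaker secondary signal produces a nonzero minimum under destructive interference.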
4.7.6 Nonlinear Distortions

Our discussion of attenuation distortion was concerned with the modification of the shape of a signal's amplitude spectrum by a transmission medium. The amplitudes of existing frequency components were modified, but no new frequency components were created. When the transmission medium or communication system is nonlinear then the output signal is subject to a nonlinear distortion, which is characterised by the following:

● The output signal is no longer directly proportional to the input signal.
● The output signal contains frequency components that are not present in the input signal.

A common example of nonlinear distortion is an amplifier that is overdriven into saturation or operated in its nonlinear region. Nonlinear distortion may also be as a result of clipping when the input signal exceeds the input range of a system, such as a quantiser. Analysis of nonlinear distortion can be very complex for an information signal, which contains a band of frequencies, and for a transmission system having a nonlinearity that may require a high-order polynomial for accurate representation. However, to demonstrate how the new frequency components are created, let us take a simple signal x(t) consisting of only two sinusoids of zero phase at angular frequencies 𝜔1 and 𝜔2

$$x(t) = A_1\cos(\omega_1 t) + A_2\cos(\omega_2 t) \tag{4.183}$$

Let this signal be transmitted through a nonlinear medium whose output y(t) is a nonlinear function of the input, represented by a third-order polynomial

$$y(t) = a_1 x(t) + a_2 x^2(t) + a_3 x^3(t) \tag{4.184}$$

Equation (4.184) specifies what is known as the transfer characteristic of the nonlinear medium. You may verify that using Eq. (4.183) in Eq. (4.184) gives the output of this nonlinear medium as

$$\begin{aligned}
y(t) ={}& \tfrac{1}{2}a_2(A_1^2 + A_2^2) \\
&+ \left[a_1A_1 + \tfrac{1}{4}a_3(3A_1^3 + 6A_2^2A_1)\right]\cos\omega_1 t \\
&+ \left[a_1A_2 + \tfrac{1}{4}a_3(3A_2^3 + 6A_1^2A_2)\right]\cos\omega_2 t \\
&+ \tfrac{1}{2}a_2A_1^2\cos 2\omega_1 t + \tfrac{1}{2}a_2A_2^2\cos 2\omega_2 t \\
&+ \tfrac{1}{4}a_3A_1^3\cos 3\omega_1 t + \tfrac{1}{4}a_3A_2^3\cos 3\omega_2 t \\
&+ a_2A_1A_2\cos(\omega_1 + \omega_2)t + a_2A_1A_2\cos(\omega_1 - \omega_2)t \\
&+ \tfrac{3}{4}a_3A_1^2A_2\cos(2\omega_1 + \omega_2)t + \tfrac{3}{4}a_3A_2^2A_1\cos(2\omega_2 + \omega_1)t \\
&+ \tfrac{3}{4}a_3A_1^2A_2\cos(2\omega_1 - \omega_2)t + \tfrac{3}{4}a_3A_2^2A_1\cos(2\omega_2 - \omega_1)t
\end{aligned} \tag{4.185}$$
Let us take a moment to examine this equation. Note first that the nonlinear distortion is caused by the coefficients a2, a3, … If these coefficients were identically zero, the transmission medium would be linear and y(t) would be an exact replica of x(t), except for a gain factor of a1, which would include a phase shift if a1 is complex. Now observe the following distortions:

● There is a DC component (i.e. f = 0), which was not present in the input. This happens whenever any of the even coefficients a2, a4, … in Eq. (4.184) is nonzero.
● The output amplitude of a frequency component present in the input signal no longer depends exclusively on the system gain, but also on the amplitude of the other frequency component(s).
● For each input frequency component 𝜔k, there appear new frequency components in the output signal at m𝜔k, m = 2, 3, …, N, where N is the order of the nonlinearity (N = 3 in the Eq. (4.184) illustration). Since these new frequencies are harmonics of the input frequency, this type of distortion is termed harmonic distortion.
● For any two input frequencies 𝜔1 and 𝜔2, there appear new components at m𝜔1 ± n𝜔2, |m| + |n| = 2, 3, …, N. These are the sum and difference of the harmonic frequencies. This type of distortion is termed intermodulation distortion. The frequency component at m𝜔1 ± n𝜔2 is called an intermodulation product (IMP) of order |m| + |n|. The power in an IMP decreases with its order.

Some of the above new frequencies may fall in adjacent channel bands in frequency division multiplex (FDM) or multicarrier satellite transponder systems and appear as unwanted interference. Increasing signal power to boost the SNR, for example by increasing A1 and A2 in Eq. (4.183), also increases the harmonic and intermodulation products. The practical way to minimise this type of distortion is to ensure that amplifiers and other system components operate in their linear region.
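The terms of Eq. (4.185) can be confirmed numerically by passing a two-tone signal through the polynomial of Eq. (4.184) and examining the spectrum of the result. The following is a minimal sketch, assuming unit input amplitudes (A1 = A2 = 1, a hypothetical choice), the coefficients a1 = 1, a2 = 0.6, a3 = 0.12, and input frequencies f1 = fo and f2 = 7fo. A direct DFT over one fundamental period is used so that no external library is needed.

```python
import cmath
import math

a1, a2, a3 = 1.0, 0.6, 0.12      # polynomial coefficients of Eq. (4.184)
A1 = A2 = 1.0                    # assumed input amplitudes
k1, k2 = 1, 7                    # f1 = fo and f2 = 7 fo
N = 64                           # samples per period; exceeds 2 * 21 (highest term is 3 f2)

def x(t):
    """Two-tone input of Eq. (4.183); t is in units of the period 1/fo."""
    return A1 * math.cos(2 * math.pi * k1 * t) + A2 * math.cos(2 * math.pi * k2 * t)

# Output of the third-order nonlinearity, sampled over one period
y = [a1 * x(n / N) + a2 * x(n / N) ** 2 + a3 * x(n / N) ** 3 for n in range(N)]

def amplitude(k):
    """Single-sided amplitude of the output component at k * fo (direct DFT)."""
    c = sum(y[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N)) / N
    return abs(c) if k == 0 else 2 * abs(c)

spectrum = {k: amplitude(k) for k in range(N // 2)}
for k, amp in spectrum.items():
    if amp > 1e-9:
        print(f"{k:2d} fo: amplitude = {amp:.4f}")
```

With these values the second-order products at f2 ± f1 (6fo and 8fo) have amplitude a2A1A2 = 0.6, whereas the third-order products such as f2 − 2f1 (5fo) have amplitude (3/4)a3A1²A2 = 0.09.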
Figure 4.60 demonstrates the transmission of a signal x(t) comprising two frequencies f1 and f2 through a third-order nonlinear system with coefficients a1 = 1, a2 = 0.6, and a3 = 0.12. The harmonic distortion and intermodulation distortion components can be seen in the output spectrum. Notice that the second-order IMPs f2 ± f1 have higher amplitudes than the third-order IMPs f2 ± 2f1 and 2f2 ± f1. This is as expected since IMP power decreases with its order.

Nonlinearity is, however, not always a nuisance. It finds extensive application in telecommunication for modulation, demodulation, frequency up-conversion and down-conversion, etc. For example, if a message signal (containing a band of frequencies f1) is added to a sinusoidal carrier signal of frequency f2 ≫ f1 and the sum signal is then passed through a nonlinear device, the band of second-order IMPs centred around f2 would constitute an amplitude modulated (AM) signal. Figure 4.60 is the special case in which the message signal is a single sinusoid of frequency f1. The output spectrum contains frequency components f2 − f1, f2 and f2 + f1 which are, respectively, the lower side frequency, carrier, and upper side frequency of the AM signal. In this application of nonlinearity, the other 'unwanted' frequency components in the output would be excluded using a bandpass filter. We have more to say on this in Chapter 7.

Figure 4.60 Input and output spectra of signal transmission through a nonlinear system. [Input spectrum: components at f1 = fo and f2 = 7fo. Nonlinear system: y = x + 0.6x² + 0.12x³. Output spectrum: components at f = 0, f1, 2f1, 3f1, f2 − 2f1, f2 − f1, f2, f2 + f1, f2 + 2f1, 2f2 − f1, 2f2, 2f2 + f1, and 3f2. Horizontal axis: nfo.]
4.8 Summary

The subject of frequency domain analysis of signals and systems is foundational to the study of communication engineering. Although calculus is an indispensable tool for anyone wishing to acquire complete technical mastery of this important subject, this chapter followed an engineering-first approach that made use of maths only where necessary and prioritised clarity of understanding as well as engineering problem solving skill over pure mathematical rigour. You should now have a sound understanding of the various transform techniques and their interrelationships, including the sinusoidal and complex exponential forms of the Fourier series, the Fourier transform (FT), the discrete time Fourier series (DTFS), discrete time Fourier transform (DTFT), discrete Fourier transform (DFT), the fast Fourier transform (FFT), the Laplace transform (LT), and the z-transform (ZT).

Our treatment emphasised how to derive from first principles the frequency domain representation of a wide range of signals, including energy signals, power signals, periodic and nonperiodic signals, and discrete- (DT) and continuous-time (CT) signals. We also learnt how, given the transform of signal A (from standard tables or otherwise), we may derive the spectrum of a related signal B by exploiting relevant transform properties to modify the given transform according to the time domain relationships between signals A and B. The most versatile result that we derived in this regard was the Fourier series coefficients for a trapezoidal pulse train, which can be applied through straightforward modifications to obtain the Fourier series of a wide range of standard signals, including unipolar and bipolar rectangular, triangular, sawtooth, and ramp waveforms.

One of the novel undertakings of the chapter was our use of Fourier analysis to gain valuable insights into the features and operational constraints of various practical telecom signals and systems.
In this way, we discovered aperture distortion in sample-and-hold signals, the frequency domain effect of flat-top sampling, and indeed the sampling theorem, the spectrum and bandwidth requirements of BASK, the effectiveness of tapered window pulses, such as the raised cosine pulse, in reducing adjacent channel interference and boosting digital transmission system capacity, etc. We examined a range of practical issues in DFT implementation and discussed ways to avoid alias distortion, improve frequency resolution, minimise spectral leakage, prevent spectral smearing, and reduce the variance of computed periodograms for nondeterministic signals.

We discussed the inverse relationship between time and frequency domains, an early indication of what will be a recurring theme throughout this book, that communication system design is more a game of trade-offs than of free lunches. In this case, narrower pulses would allow us to operate at a higher symbol rate, but this advantage or improvement comes at the price of a proportionately wider bandwidth.

Fourier analysis underpins two important concepts that play a pivotal role in telecommunications, namely:

● The spectrum of a signal, which specifies the relative amplitude and phase of each member of the collection of sinusoids that constitute the signal.
● The frequency response of a communication system, which specifies how the system will alter the amplitude and phase of a sinusoidal input signal at a given frequency when that sinusoid is passed through the system.
The signal spectrum gives a complete specification of the signal from which important signal parameters such as bandwidth, power, and signal duration can be computed. The frequency response of a linear time-invariant system gives a full description of the system, allowing us to determine the output signal spectrum corresponding to an arbitrary input signal, by virtue of the principle of superposition. Other important transmission parameters can then be obtained, such as system gain, output signal power, and attenuation and phase distortions.

For a nonlinear system or device, the superposition principle does not hold. In this case, we may employ the device's transfer characteristic, which specifies the output signal as a polynomial function of the input. By doing this, we find that the effect of nonlinearity is to introduce new frequency components (known as harmonic and intermodulation products) into the output signal, which were not present in the input signal. Nonlinearity is undesirable in transmission media and repeaters but is exploited in transmitters and receivers to perform important signal processing tasks, such as frequency translation, modulation, and demodulation.

At this point, it is worth summarising the major issues in communication system design, which we address in this book.

● The message signal must be transformed into a transmitted signal that requires a minimum transmission bandwidth and experiences minimum attenuation and distortion in the transmission medium. Furthermore, those transformation techniques are preferred in which the information contained in the transmitted signal is insensitive to small distortions in the spectrum of the transmitted signal.
● The system should approach as closely as possible to a distortionless transmission. This may require the use of equalisers to shape the overall frequency response of the system and the implementation of measures to minimise noise and interference.
● The original information should be recovered at the intended destination from a received signal that is in general a weak, noise-corrupted, and somewhat distorted version of the transmitted signal.
● We must be able to evaluate the performance of a communication system and to fully appreciate the significance of various design parameters and the trade-offs involved.
In this chapter, we have introduced important concepts and tools, which will be relied upon throughout the book to develop the above systems design and analysis skills. We delve deeper into this task in the next chapter with a study of the three main transmission media in modern communication systems.
References
1 Cooley, J.W. and Tukey, J.W. (1965). An algorithm for machine computation of complex Fourier series. Mathematics of Computation 19: 297–301.
2 Brigham, E.O. (1988). Fast Fourier Transform and Its Applications. Pearson. ISBN: 978-0133075052.
3 Bouguezel, S., Ahmad, M., and Swamy, M. (2006). An alternate approach for developing higher radix FFT algorithms. APCCAS 2006–2006 IEEE Asia Pacific Conference on Circuits and Systems. DOI: https://doi.org/10.1109/APCCAS.2006.342373
4 Haykin, S. and Van Veen, B. (2003). Signals and Systems, 2e. Wiley. ISBN: 978-0471378518.
5 Ifeachor, E. and Jervis, B. (2001). Digital Signal Processing: A Practical Approach. Prentice Hall. ISBN: 978-0201596199.
Questions
4.1 (a) Determine the fundamental frequency and first three harmonics of the periodic voltage waveforms shown in Figure Q4.1. (b) Does g2(t) have the same amplitude spectrum as g3(t)? If not, giving reasons for your answer but without deriving any Fourier series, which of the two waveforms will have stronger higher frequency components?
Figure Q4.1 Question 4.1. Periodic voltage waveforms: (a) g1(t), volts, against t, μs; (b) g2(t), volts, against t, ms; (c) g3(t), volts, against t, ms.
4.2 Given the voltage waveform
$$g(t) = 10 - 20\cos(100\pi t) + 10\cos(200\pi t + \pi/3) - 5\cos(500\pi t - \pi/2)\ \text{volts}$$
(a) Sketch the single-sided amplitude and phase spectra of g(t). (b) Express g(t) as a sum of complex exponentials and hence sketch its double-sided amplitude and phase spectra. (c) Determine the total power in the amplitude spectrum of (a) and (b) and hence comment on the equivalence or not of the two frequency domain representations.
4.3 Given that the Fourier series of a centred unipolar rectangular pulse train (RPT) (e.g. Figure 4.4a) of amplitude A, waveform period T, and duty cycle d is
$$g(t) = Ad + 2Ad\sum_{n=1}^{\infty}\mathrm{sinc}(nd)\cos(2\pi n f_o t); \quad f_o = 1/T$$
(a) Determine the Fourier series of the bipolar waveform g1(t) in Figure Q4.3. (b) For T = 100 μs, sketch fully labelled double-sided amplitude and phase spectra of g1(t) up to the ninth harmonic.
Figure Q4.3 Question 4.3. Bipolar waveform g1(t), V, against t (labelled values: 10 and –30 V; T/6, T/3, T/2, and T).
Figure Q4.4 Question 4.4. Triangular waveforms of amplitude A and period T: (a) g1(t); (b) g2(t).
4.4 Figure Q4.4a shows a triangular waveform g1(t) of amplitude A and period T. (a) Obtain an expression for the normalised power of the waveform in terms of A and T. (b) Starting from entry 10 of Table 4.5 for the Fourier transform of a triangular pulse, show that the double-sided Fourier series of g1(t) is given by
$$g_1(t) = \frac{A}{2}\sum_{n=-\infty}^{\infty}\mathrm{sinc}^2(n/2)\cos(2\pi n f_o t), \quad f_o = 1/T$$
(c) Hence, sketch the single-sided amplitude spectrum of g1(t) up to the sixth harmonic. (d) Determine the null bandwidth of g1(t) and the fraction of total power contained therein. (e) The triangular waveform g2(t) shown in Figure Q4.4b is identical to g1(t) in every respect, except that g2(t) leads g1(t) by one-quarter of a period. Sketch the single-sided phase spectrum (SSPS) of g2(t) up to the seventh harmonic.
4.5 Given that the Fourier series of a centred unipolar triangular pulse train (e.g. g2(t) in Figure 4.4c) of amplitude A, waveform period T, and duty cycle d is
$$g_2(t) = Ad/2 + Ad\sum_{n=1}^{\infty}\mathrm{sinc}^2(nd/2)\cos(2\pi n f_o t), \quad f_o = 1/T$$
(a) Determine the Fourier series of the bipolar triangular waveform g3(t) in Figure Q4.5. (b) For T = 1 ms, sketch a fully labelled single-sided amplitude spectrum of g3(t) up to the sixth harmonic.
Figure Q4.5 Question 4.5. Bipolar triangular waveform g3(t) with levels A and –A, shown from –T/2 to T/2.
Figure Q4.6 Question 4.6. Raised cosine pulse train g(t) of amplitude A, pulse duration τ, period T, and duty cycle d = τ/T.
4.6 Derive from first principles the Fourier series of the raised cosine pulse train g(t) of amplitude A and duty cycle d shown in Figure Q4.6. One cycle of the pulse train is defined by
$$g_T(t) = \begin{cases} \dfrac{A}{2}\left[1 + \cos\left(\dfrac{2\pi}{\tau}t\right)\right], & -\dfrac{\tau}{2} \le t \le \dfrac{\tau}{2} \\ 0, & \text{otherwise} \end{cases}$$
4.7 Figure Q4.7 shows three pulse trains g1(t), g2(t), and g3(t) that differ only in their pulse shape. g1(t) is rectangular, g2(t) is triangular, and g3(t) is raised cosine as defined in Question 4.6. (a) Determine the 99% fractional power containment bandwidth of each pulse train, rounding percentage values to the nearest integer. (b) Determine the percentage of power in the main lobe of the spectrum of each pulse train. (c) Determine the percentage of power in each of the first three sidelobes of the spectrum of each pulse train. (d) Based on the above results, discuss the effects of pulse shape on the spectral content of signals.
Figure Q4.7 Question 4.7. Pulse trains g1(t), g2(t), and g3(t), each of amplitude 100 V, with d = τ/T = 1/3 and T = 50 μs.
Figure Q4.8 Question 4.8. Pulse trains g1(t), g2(t), g3(t), and g4(t) of amplitude A, period T, and duty cycle d = τ/T; for g3(t), τ = τr + τc.
4.8 (a) Starting from the results in Eq. (4.68) for the coefficients of the Fourier series of a trapezoidal pulse train, derive the Fourier series of each of the waveforms g1(t), g2(t), g3(t), and g4(t) shown in Figure Q4.8. (b) Validate each of the Fourier series derived above by carrying out a Fourier synthesis of each waveform using its series up to the 15th harmonic, selecting your own waveform parameter values (i.e. amplitude, period, and duty cycles). Note that MATLAB code as in Worked Example 4.3c will help ease the synthesis effort.
4.9 The triangular pulse train g1(t) and ramp pulse train g2(t) shown in Figure Q4.9 have the same amplitude, pulse duration, and waveform period. Calculate the following parameters for each waveform. (a) Null bandwidth. (b) Ninety-nine percent fractional power containment bandwidth. (c) Percentage of power in the main spectral lobe. (d) Percentage of power in each of the first three spectral sidelobes. How do the parameters compare between the two signals? Give reasons for any differences.
Figure Q4.9 Question 4.9. Triangular pulse train g1(t) and ramp pulse train g2(t), each of amplitude 10 V, with d = τ/T = ½ and T = 10 μs.
4.10 Figure Q4.10 shows a periodic staircase waveform g(t). (a) Use Eq. (4.20) to calculate the values of the DC component A0 and amplitudes A1, A2, and A3 and phases ϕ1, ϕ2, and ϕ3 of the first three harmonics of g(t) at respective frequencies fo, 2fo, and 3fo. (b) Check the correctness of your results by plotting the synthesised waveform
$$g_s(t) = A_0 + A_1\cos(2\pi f_o t + \phi_1) + A_2\cos(4\pi f_o t + \phi_2) + A_3\cos(6\pi f_o t + \phi_3)$$
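The Fourier synthesis asked for in Questions 4.8(b) and 4.10(b) can be carried out in any numerical environment. The following is a minimal Python sketch (not the MATLAB code of Worked Example 4.3c), assuming the amplitude-phase form g_s(t) = A0 + Σ An cos(2πn fo t + ϕn). The function name `synthesise` and the parameter values are illustrative choices, and the test waveform is the centred RPT series quoted in Question 4.3, with sinc(x) taken as sin(πx)/(πx).

```python
import math

def sinc(x):
    """Normalised sinc, sin(pi x)/(pi x); assumed to match the book's sinc()."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def synthesise(A0, amps, phases, fo, t):
    """Sum A0 + sum_n A_n cos(2 pi n fo t + phi_n) for n = 1, 2, ..., len(amps)."""
    total = A0
    for n, (An, phin) in enumerate(zip(amps, phases), start=1):
        total += An * math.cos(2 * math.pi * n * fo * t + phin)
    return total

# Illustrative parameters (not taken from any question).
A, d, T = 1.0, 0.5, 1e-3
fo = 1 / T
n_harmonics = 15

# Coefficients of the centred RPT series: A0 = A d, An = 2 A d sinc(n d), phase 0.
A0 = A * d
amps = [2 * A * d * sinc(n * d) for n in range(1, n_harmonics + 1)]
phases = [0.0] * n_harmonics

g_centre = synthesise(A0, amps, phases, fo, 0.0)    # middle of the pulse
g_gap = synthesise(A0, amps, phases, fo, T / 2)     # middle of the gap
print(f"g_s(0) = {g_centre:.4f}, g_s(T/2) = {g_gap:.4f}")
```

With only 15 harmonics the synthesised values approach, but do not exactly equal, the pulse amplitude A and the zero level; evaluating `synthesise` over a grid of t values and plotting the result gives the visual check the questions call for.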
4.11 Given that a periodic signal g(t) of period T is represented by the Fourier series
$$g(t) = A_0 + \sum_{n=1}^{\infty} a_n\cos(2\pi n f_o t) + \sum_{n=1}^{\infty} b_n\sin(2\pi n f_o t), \quad f_o = 1/T$$
and that the normalised power P of a periodic signal g(t) of period T is
$$P = \frac{1}{T}\int_{-T/2}^{T/2} g^2(t)\,dt$$
show that
$$P = A_0^2 + \frac{1}{2}\sum_{n=1}^{\infty} A_n^2$$
where $A_n^2 = a_n^2 + b_n^2$.
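A numerical check can make the result of Question 4.11 more concrete, although it is no substitute for the proof. The sketch below applies it to the centred RPT series quoted in Question 4.3, for which a_n = 2Ad sinc(nd) and b_n = 0. It assumes sinc(x) = sin(πx)/(πx), and the values of A and d are illustrative only.

```python
import math

def sinc(x):
    """Normalised sinc, sin(pi x)/(pi x); assumed to match the book's sinc()."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def power_from_series(A, d, n_max):
    """P = A0^2 + (1/2) * sum(An^2), truncated at harmonic n_max."""
    A0 = A * d
    total = A0 ** 2
    for n in range(1, n_max + 1):
        An = 2 * A * d * sinc(n * d)   # b_n = 0 for the centred pulse train
        total += 0.5 * An ** 2
    return total

def power_time_domain(A, d):
    """(1/T) * integral of g^2 over one period: g = A for a fraction d of the period."""
    return A ** 2 * d

A, d = 2.0, 0.25   # illustrative values, not from the text
p_series = power_from_series(A, d, 100_000)
p_time = power_time_domain(A, d)
print(f"series: {p_series:.6f}  time domain: {p_time:.6f}")
```

The truncated series sum approaches the time-domain value as more harmonics are included, which is what the identity in Question 4.11 predicts.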
4.12
The Fourier series of a sinusoidal pulse train (Figure 4.22) that completes m half-cycles per pulse interval is given by Eq. (4.56). A binary amplitude shift keying (BASK) system operates in on–off keying (OOK) fashion at a carrier frequency of 500 kHz and bit rate 50 kb/s. Employ this equation to determine the amplitude spectrum of this BASK signal for the following bit sequences: (a) 100100100… (b) 110110110… (c) 110011001100… Compare the bandwidth and frequency content of each spectrum with the spectrum for the fastest-changing sequence 101010… discussed in Figure 4.24.
4.13 Figure Q4.13 shows the waveform of a binary phase shift keying (BPSK) signal g(t) for the fastest-changing bit sequence 101010… The carrier (of frequency fc) completes an integer number M of cycles in each bit interval and is transmitted with phase −90° for binary 1 and +90° for binary 0. (a) Derive the Fourier series of g(t). (b) Use this Fourier series to calculate and sketch the amplitude spectrum of a BPSK signal for the fastest-changing sequence 101010… for a system that operates at a bit rate of 50 kb/s with carrier frequency fc = 500 kHz. (c) Compare the spectrum and bandwidth of the above signal with those of Figure 4.24 for the same bit sequence transmitted at the same bit rate and carrier frequency but using binary amplitude shift keying.
Figure Q4.10 Question 4.10. Periodic staircase waveform g(t), V, against t, ms.
Figure Q4.13 Question 4.13. BPSK waveform g(t) with levels A and –A for the fastest-changing bit sequence 101010…
4.14 Figure Q4.14 shows the waveform of a binary frequency shift keying (BFSK) signal g(t) for the fastest-changing bit sequence 101010… The carrier frequency is set to f2 (which completes M cycles in a single bit interval) for binary 1 and to f1 (which completes Q cycles in a single bit interval) for binary 0, where M > Q and both are positive integers. (a) Derive the Fourier series (FS) of g(t). (b) Use this FS to calculate and sketch the amplitude spectrum of a BFSK signal for the fastest-changing sequence 101010… for a system that operates at a bit rate of 50 kb/s with carrier frequencies f2 = 500 kHz for binary 1 and f1 = 200 kHz for binary 0. (c) Compare the spectrum and bandwidth of the above signal with those of Figure 4.24 for the same bit sequence transmitted at the same bit rate and carrier frequency but using BASK.
4.15 The triangular pulse g(t) shown in Figure Q4.15a has its Fourier transform (FT) listed in row 10 of Table 4.5 as
$$G(f) = A\frac{\tau}{2}\,\mathrm{sinc}^2\left(\frac{\tau}{2}f\right)$$
Starting from this result and applying relevant FT properties, determine the FT of the bipolar triangular pulse g1(t) shown in Figure Q4.15b.
Figure Q4.14 Question 4.14. BFSK waveform g(t) with levels A and –A for the fastest-changing bit sequence 101010…
Figure Q4.15 Question 4.15. (a) Triangular pulse g(t) = A trian(t/τ), extending from –τ/2 to τ/2; (b) bipolar triangular pulse g1(t) with levels A and –A, extending from –τ to τ.
4.16 (a) Starting from first principles, derive an expression for the FT of the ramp pulse shown in Figure Q4.16a in terms of its amplitude A and duration τ. (b) Hence, using the time shifting property of FT, find the FT of the delayed ramp pulse g1(t) in Figure Q4.16b. (c) Hence, using the time reversal and time shifting properties of FT, find the FT of the ramp pulse g2(t) in Figure Q4.16c. (d) Hence, using the above results and the FT of a rectangular pulse listed in row 8 of Table 4.5, find the FT of the trapezoidal pulse g3(t) in Figure Q4.16d, which has rising edge duration τr, constant level duration τc, and falling edge duration τf. (e) Show that, under the right conditions, the result in (d) will reduce to the FT of a triangular pulse as stated in Q4.15.
4.17
Eq. (4.68) gives the Fourier coefficients of the trapezoidal pulse train (Figure 4.26), which has period T. In the limit T → ∞, this pulse train reduces to a trapezoidal pulse as shown in Figure Q4.16d. Starting from Eq. (4.68) and using the relationship between the Fourier transform G(f ) and the Fourier series coefficient Cn given in Eq. (4.72), obtain an expression for the FT of a trapezoidal pulse.
4.18
By direct computation using its defining formula, find the DFT sequence G[k] of the following data sequences: (a) g[n] = {0, 1, 1, 0} (b) g[n] = {2, 0, 1, 1}.
4.19 (a) By direct computation using its defining formula, find the inverse DFT of each of the transform sequences G[k] obtained in Question 4.18. How does each inverse DFT compare with the original data sequence? (b) Use each of the above DFT pairs to verify Parseval’s theorem stated in Eq. (4.115) for discrete-time (DT) signals. Do this by calculating the energy of each sequence in both time and frequency domains as suggested by the theorem and checking if both results are equal.
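Hand calculations of the kind requested in Questions 4.18 and 4.19 can be cross-checked numerically. The following is a minimal Python sketch that assumes a common DFT convention: no scaling on the forward transform and a factor 1/N on the inverse. The book's defining formula and the exact form of Eq. (4.115) may place the scaling differently, so adjust accordingly. The test sequence is illustrative and deliberately not one of those in the questions.

```python
import cmath

def dft(g):
    """Direct DFT: G[k] = sum_n g[n] exp(-j 2 pi k n / N) (assumed convention)."""
    N = len(g)
    return [sum(g[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def idft(G):
    """Inverse DFT with the 1/N factor placed on the inverse (assumed convention)."""
    N = len(G)
    return [sum(G[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N)) / N
            for n in range(N)]

g = [1, 2, 3, 4]      # illustrative data sequence
G = dft(g)
g_back = idft(G)      # should reproduce g to within rounding error

# Parseval check under the convention above:
# sum |g[n]|^2 = (1/N) * sum |G[k]|^2
energy_time = sum(abs(x) ** 2 for x in g)
energy_freq = sum(abs(X) ** 2 for X in G) / len(G)
print(G, g_back, energy_time, energy_freq)
```

Both functions evaluate the defining sums directly, so they need on the order of N² complex multiplications; this is the cost that the FFT of Question 4.20 reduces.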
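Figure Q4.16 Question 4.16. (a) Ramp pulse g(t) of amplitude A and duration τ; (b) delayed ramp pulse g1(t); (c) ramp pulse g2(t); (d) trapezoidal pulse g3(t) of amplitude A with rising edge duration τr, constant level from –τc/2 to τc/2, and falling edge duration τf.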
4.20 (a) Sketch a fully labelled signal flow graph for a four-point decimation-in-frequency (DIF) fast Fourier transform (FFT) algorithm. (b) Using this signal flow graph, find the transform sequence of each of the data sequences g[n] = {0, 1, 1, 0} and g[n] = {2, 0, 1, 1}. How do your results here compare with those obtained in Question 4.18 using a direct DFT evaluation?
4.21 Derive a closed-form expression for the z-transform (ZT) of each of the following sequences, where u[n] denotes the unit sequence: (a) $n\,u[n]$ (b) $e^{-\alpha n}u[n]$ (c) $\alpha^n u[n]$.
4.22 Consider the transmission of a signal through the simple LTI system shown in Figure Q4.22. (a) Obtain the transfer function H(f) of the system. (b) Using Table 4.5 or otherwise, obtain the impulse response h(t) of the system. For R = 200 Ω and C = 39.79 nF: (c) Determine the output voltage v2(t) for the following input voltages: (i) $v_1(t) = 10\cos(1000\pi t)$; (ii) $v_1(t) = 10\cos(8 \times 10^4\pi t)$. Comment on your results by discussing the action of the circuit. (d) Obtain and sketch the amplitude and phase spectra of the output when the input signal is a centred RPT (see Figure 4.4a) of amplitude A = 10 V, pulse duration τ = 20 μs, and waveform period T = 100 μs. Discuss the manner in which the pulse train has been distorted by the system.
Figure Q4.22 Question 4.22. Circuit with components C and R, input voltage v1(t), and output voltage v2(t).
4.23 Figure Q4.23 shows a parallel LC circuit, usually referred to as a tank circuit. (a) Obtain an expression for the transfer function H(f) of the circuit and hence plot the amplitude and phase response of the filter as a function of frequency for C = 0.5 nF, L = 240 μH, and R = 15 Ω. (b) Show that the circuit has maximum gain at the frequency (known as the resonant frequency)
$$f_r = \frac{1}{2\pi\sqrt{LC}}$$
(c) Show that the circuit has 3 dB bandwidth
$$B_{3\mathrm{dB}} = \frac{R}{2\pi L}$$
(d) A filter’s quality factor may be defined as the ratio Q = fr/B3dB. It gives a measure of the selectivity of the filter, i.e. how narrow the filter’s passband is compared to its centre frequency. What is the Q of this filter for the given component values? (e) The shape factor of a filter is defined as the ratio between its 60 dB bandwidth and 6 dB bandwidth. It gives a measure of how steeply the filter’s attenuation increases beyond the passband. An ideal bandpass filter has a shape factor of 1. Determine the shape factor of this filter.
Figure Q4.23 Question 4.23. Circuit with components C, L, and R, input voltage v1(t), and output voltage v2(t).
4.24 A system has the impulse response h(t) = 10δ(t). (a) Determine the response of this system to an input x(t) = 50 cos(800πt + 45°). (b) Is the system distortionless? Explain. (c) Is there a limit on the symbol rate which this system can support?
4.25
Determine the response of a system with impulse response h(t) = 20u(t) to a pulse input x(t) = 50cos(800𝜋t)rect(t/0.01 − 1/2). Discuss the extent of pulse dispersion in the system and the implication on the symbol rate that can be employed without intersymbol interference.
4.26 Determine the noise equivalent bandwidth of a Gaussian filter with frequency response $H(f) = \exp(-4a\pi^2 f^2 - j2\pi f t_o)$, where a and to are constants. How does this compare with the 3 dB bandwidth of the filter?
4.27 The gain response of a Butterworth lowpass filter (LPF) of order n is given by
$$|H(f)| = \frac{1}{\sqrt{1 + (f/f_1)^{2n}}}$$
(a) Determine the noise equivalent bandwidth B of the filter. (b) For filter order n = 1 to 5, determine how much error in dB would be incurred if the 3 dB bandwidth of this filter were used for noise power calculation in place of B. Comment on the trend of the error as n increases.
4.28 In Worked Example 3.6 the autocorrelation function of a sinusoidal signal g(t) = Am cos(2πfm t + ϕm) was derived as
$$R_g(\tau) = \frac{A_m^2}{2}\cos(2\pi f_m\tau)$$
Based on this result, show that the power spectral density (PSD) of the sinusoidal signal g(t) is
$$S_g(f) = \frac{A_m^2}{4}\delta(f - f_m) + \frac{A_m^2}{4}\delta(f + f_m)$$
and hence that the power obtained as the total area under the PSD curve is $P = A_m^2/2$, in agreement with the power calculated in the time domain.
4.29 Using a frequency domain approach, show that the response y(t) of a system with impulse response h(t) = 5rect(t/6 − 0.5) to an input x(t) = rect(t/12) is given by
$$y(t) = 30\left[\mathrm{trian}\left(\frac{t}{12}\right) + \mathrm{trian}\left(\frac{t-6}{12}\right)\right]$$
How easy is this method compared to the graphical time-domain approach employed to solve the same problem in Question 3.15a? [NOTE: You may wish to refer to Sections 2.6.3 and 2.6.5 for definitions of the rect() and trian() pulses used above.]
5 Transmission Media
Our greatest hindrance is not so much what we don’t have as what we don’t use.
In this Chapter
✓ Overview of closed and open transmission media.
✓ Signal attenuation and impairments in various transmission media including copper lines, optical fibre, and radio.
✓ Transmission line theory: a detailed discussion of wave attenuation and reflection on metallic transmission lines and various techniques of line termination and impedance matching.
✓ Scattering parameters of two-port networks.
✓ Optical fibre: a historical review and brief discussion of fibre types, extrinsic and intrinsic losses, dispersion, and optical fibre link design.
✓ Radio: a nonmathematical review of Maxwell’s equations and a discussion of various radio wave propagation modes, effects, and mechanisms, and path loss calculations on terrestrial and earth–space radio paths.
✓ Worked examples to demonstrate the interpretation and application of concepts and to deepen your insight and hone your skills in engineering problem solving.
✓ End-of-chapter questions to test your understanding and (in some cases) extend your knowledge of the material covered.
5.1 Introduction
The transmission medium provides a link between the transmitter and the receiver in a communication system. One important note on terminology should be made here. We will sometimes use the term communication channel to mean transmission medium but will use the word channel on its own to mean the bandwidth or other system resource devoted to one user in a multiple-user communication system. For example, in Chapter 13 we refer to a 30-channel TDM, which means a system that can simultaneously convey 30 independent user signals in one transmission medium using time division multiplexing (TDM).
There are two broad classifications of transmission media, namely closed and open media. Closed media enable communication between a transmitter and a specific physically connected receiver. They include twisted wire pair (also called paired cable), coaxial cable, optical fibre, and metallic waveguide. Open media provide broadcast (point-to-area communication) and mobility capabilities in telecommunication, which are not possible with closed media. They include all forms of electromagnetic wave (e.g. radio and infrared) propagation not only in the earth’s atmosphere, but also in empty space and seawater.
Communication Engineering Principles, Second Edition. Ifiok Otung. © 2021 John Wiley & Sons Ltd. Published 2021 by John Wiley & Sons Ltd. Companion Website: www.wiley.com/go/otung
A metallic waveguide is preferred to coaxial cables above about 3 GHz for very short links, for example to link an antenna to its transmitter or receiver. It is a metallic tube of rectangular or circular cross-section. An electromagnetic wave is launched into the waveguide using a wire loop or probe. The wave travels down the guide by repeated reflections between the walls of the guide. Ideally, energy is totally reflected at the walls and there are no losses. In practice, although the guide walls are polished to enhance reflection, some absorption takes place leading to losses, which, however, are small compared to cable losses at these frequencies.
Our discussion of closed transmission media will be limited to metallic copper lines and optical fibre. We begin the discussion of each transmission medium with a nonmathematical treatment followed by a deeper analysis of the relevant wave propagation phenomena and concepts that equip us to understand and quantify the limitations and impairments of each medium. Our goal is twofold: to develop the tools necessary for determining signal attenuation, and hence the received signal strength at various points in each transmission medium, and to apply our understanding of the characteristics of each transmission medium in the design of a communication system, including particularly its signal power budget.
5.2 Metallic Line Systems
Figure 5.1 shows various metallic line examples, including unscreened twisted pair (UTP), screened twisted pair (STP), coaxial line, and microstrip line. The conducting metal is usually copper, and the insulation or dielectric material includes polystyrene (with relative permittivity εr = 2.5), polyethylene (εr = 2.3), and Teflon (εr = 2.1). Metallic lines dominated the signal transmission landscape until the advent of commercial optical fibre transmission in the 1980s, culminating in TAT-8 (the 8th transatlantic cable system), the world’s first transoceanic undersea optical fibre system, in 1988. The use of metallic lines for power transmission and short-distance telecommunication links, for example to connect an outdoor antenna to an indoor unit or a broadband/landline hub to a street cabinet device, or for transmission within buildings, will continue into the foreseeable future. However, long-distance metallic line communication links are now obsolete. Even the use of copper cables in the link between local exchange and customer premises (known as local loop or last mile) of the public switched telephone network (PSTN) is not expected to continue much beyond 2025 due to the rapid deployment of optical fibre to connect between street cabinet or customer premises and the local exchange. Many of the applications of metallic lines in communication systems summarised below are therefore largely historical and discussed for the sake of completeness.
Metallic lines can be used in a balanced (also called differential) mode or in an unbalanced (also called single-ended) mode. In the unbalanced mode, one wire carries the signal while the other serves as the ground connection. In the balanced mode, both wires carry the signal v(t) divided such that one wire carries +½v(t) while the other carries −½v(t). There is symmetry about ground. The receiving end reads the signal as the difference between the voltages on the two wires, which therefore cancels out common-mode noise. Two-wire lines use balanced modes and incorporate twisting, as shown in the top row of Figure 5.1, to ensure that both wires pick up roughly the same amount of noise and interference. Coaxial lines use the unbalanced mode, with the inner conductor carrying the signal and the outer conductor grounded.
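The cancellation of common-mode noise in balanced operation can be illustrated with a short numerical sketch. The Python code below is an idealised model only: it assumes that twisting makes the interference picked up by the two wires exactly equal, and the signal and interference waveforms (a 1 kHz tone and a 50 Hz pickup) are hypothetical choices made for illustration.

```python
import math

def signal(t):
    """Wanted signal v(t): a 1 kHz tone of 1 V amplitude (illustrative)."""
    return math.cos(2 * math.pi * 1e3 * t)

def interference(t):
    """Common-mode pickup, assumed identical on both wires (idealised twisting)."""
    return 0.3 * math.sin(2 * math.pi * 50 * t)

def balanced_receive(t):
    """Balanced mode: wires carry +v/2 and -v/2; receiver takes the difference."""
    wire_a = +0.5 * signal(t) + interference(t)
    wire_b = -0.5 * signal(t) + interference(t)
    return wire_a - wire_b          # the common interference term cancels

def unbalanced_receive(t):
    """Unbalanced mode: one wire carries v, the other is the ground reference."""
    signal_wire = signal(t) + interference(t)
    ground_wire = 0.0               # reference conductor, assumed noise-free
    return signal_wire - ground_wire

times = [n * 1e-4 for n in range(200)]   # 20 ms of samples
err_bal = max(abs(balanced_receive(t) - signal(t)) for t in times)
err_unbal = max(abs(unbalanced_receive(t) - signal(t)) for t in times)
print(f"max error, balanced: {err_bal:.2e}; unbalanced: {err_unbal:.2e}")
```

In this model the balanced receiver recovers v(t) to within rounding error, whereas the unbalanced receiver retains the full interference. On a real line the pickup on the two wires is only roughly equal, so the cancellation is partial.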
5.2.1 Wire Pairs
Wire pairs are made from copper in several standardised diameters, e.g. 0.32, 0.4, 0.5, 0.63, 0.9, and 1.1 mm. The conductors are insulated with plastic, which also protects against humidity effects. A pair of insulated conductors may be twisted together, and the pair contained in plastic insulation to provide the go and return paths for one circuit connection, as in Figure 5.2a. Four insulated conductors may be grouped together in a star-quad arrangement with diagonally opposite conductors forming a pair, as shown in Figure 5.2b. This arrangement reduces capacitance imbalance between pairs and increases packing density in multipair cable cores, when compared to the twisted-pair arrangement of Figure 5.2a. Often, e.g. in a local telephone link, many wire pairs are required between two points. This is realised by making a cable with tens or even hundreds of insulated wire pairs carried in the cable core. A sheath surrounds the cable core to give mechanical protection. If the sheath is made from a combination of plastic and aluminium foil then it also shields the wires from electrical and magnetic interference. The wire pairs are colour coded for pair identification.
Figure 5.1 Examples of metallic lines: unscreened twisted pair (UTP), with wire dimensions labelled d and s; screened twisted pair (STP), with braided copper screen; coaxial line with inner and outer conductor dimensions a and b, for which $L = \frac{\mu}{2\pi}\ln(b/a)$ H/m, $C = 2\pi\varepsilon/\ln(b/a)$ F/m, and $Z_o = \frac{60}{\sqrt{\varepsilon_r}}\ln(b/a)$; microstrip line of strip width w on a dielectric of thickness h, for which $Z_o \simeq \frac{377}{\sqrt{\varepsilon_r}\,([w/h] + 2)}$.
Figure 5.2 Wire pairs: (a) single pair; (b) star-quad arrangement of two wire pairs.
Limiting factors in the use of wire pairs include:
● Crosstalk between pairs.
● Interference noise from power lines, radio transmission, and lightning strikes.
● Severe bandwidth limitation.
The attenuation in decibels (dB) of wire pairs increases as the square root of frequency, as shown in Figure 5.3 for a 0.63 mm diameter copper wire pair. Wire pairs make a good transmission medium for low-bandwidth applications. Underground copper cable cores were a massive global investment designed for the analogue transmission technologies of the twentieth century. They began to be adapted for high-bandwidth digital transmission applications well into the first decade of the twenty-first century but are now being gradually superseded by the increasingly ubiquitous superfast fibre.
Figure 5.3 Attenuation (dB/km) of unloaded 0.63 mm copper wire pair at 20 °C, plotted against frequency from 1 kHz to 10 MHz.
Some of the applications of copper wire pairs include:
● A wire pair is used to carry a single voice channel in telephone circuits, where the frequency band of interest is 300–3400 Hz. It is important to maintain a constant attenuation in this band to avoid distortion. This is achieved through inductive loading, a technique invented by Oliver Heaviside in 1885 but which took several years to be appreciated and used. Here, extra inductance is inserted in the wire pair at a regular spacing. For example, a loading of 88 mH per 1.83 km is used in 0.63 mm audio cables to obtain a constant attenuation of ∼0.47 dB/km up to 3.5 kHz, which is down from about 0.9 dB/km at 3.5 kHz for an unloaded wire pair. Thus, loading reduces the attenuation of the wire pair at voice frequencies. However, the bandwidth of a loaded cable is much more restricted, with attenuation increasing very rapidly beyond 3.5 kHz.
● Extensive growth in the use of digital communication made it necessary to explore ways of transmitting digitised speech signals over these audio cables. This could not be done if the cable bandwidth were restricted to 3.5 kHz by inductive loading that was designed to support analogue speech transmission. A solution was found by removing the loading coils (called de-loading the cables) and installing regenerative repeaters at the old loading coil sites. The de-loaded cables were used to carry 24-channel and 30-channel time division multiplexed signals operating at bit rates 1.5 and 2 Mbit/s, respectively.
● Wire pairs were also specified for carrying 12–120 analogue voice channels in frequency division multiplexed systems, necessitating operation at bandwidths up to 550 kHz. The wires were of conductor diameter 1.2 mm and had an attenuation of about 2 dB/km at 120 kHz. The same type of wire pair was also used for digital communication systems operating at bit rates of 6 and 34 Mbit/s.
● With careful installation, wire pairs have been successfully deployed within buildings for some local area network (LAN) connections. UTP has been used for data rates up to 10 Mbit/s (e.g. 10Base-T Ethernet), with cable lengths up to 100 m. STP can support data rates up to 100 Mbit/s (e.g. 100Base-T Ethernet), with cable lengths of a few hundred metres.
Figure 5.4 ADSL system components. The local loop (to/from NSP) carries downstream transmission (≤ 24 Mb/s) and upstream transmission (≤ 3.3 Mb/s); in the subscriber equipment, a splitter passes POTS to the telephone handset and data via the ADSL modem to the user equipment.
● Wire pairs are still widely used (as at the end of 2019) to provide fixed-line broadband connection between communication equipment at a customer site and the local exchange (called central office in North America) of a network service provider (NSP). A range of digital subscriber line (DSL) technologies are used, which began to be deployed in the late 1990s. The abbreviation xDSL is often used, where x is a placeholder for specifying different DSL standards. The latest asymmetric digital subscriber line (ADSL2+) standard provides higher download transmission bit rate or downstream rate (up to 24 Mb/s) from the NSP to the subscriber, and lower upload bit rate or upstream rate (up to 3.3 Mb/s) from the subscriber to the NSP. ADSL allows simultaneous transmission of three independent services on one wire pair of maximum length ranging from 3.5 to 5.5 km (depending on transmission bit rate): (i) analogue voice – called plain old telephone service (POTS), (ii) downstream data, and (iii) upstream data. Figure 5.4 shows an FDM-based ADSL system, which requires a splitter and an ADSL modem to place the three services into separate frequency bands in the wire pair – POTS in the lower band from 0 to 4 kHz, upstream data in the next band up to 140 kHz, and downstream data in the final band up to 2.2 MHz. Other xDSL standards include SDSL (single-line DSL), which provides equal, and therefore symmetric, transmission rates in both directions using a single wire pair of maximum length 3 km. It is suitable for certain high-bandwidth applications such as video conferencing, which require identical upstream and downstream transmission speeds. The latest standard of very high-speed DSL (VDSL2-V+) uses up to 12 MHz of the wire pair bandwidth to support downstream rates up to 300 Mb/s and upstream rates up to 100 Mb/s, the top data rates being achieved only over short distances not exceeding 300 m.
● Another use of wire pairs in fixed-broadband connection is in a part-copper, part-fibre arrangement known as fibre-to-the-cabinet (FTTC). A wire pair connects customer equipment to a street cabinet from where the subscriber is connected via optical fibre to a digital subscriber line access multiplexer (DSLAM) located in the local exchange. Because the signal is carried over copper for a very short distance (typically 20𝜋l
(5.2)
A transmission line is a four-terminal device for conveying energy or information-bearing signals from one point to another. The signal enters through two terminals at the line’s input and leaves through two terminals at the output. A metallic line is characterised by the following four primary line constants:
● Resistance R of both conductors of the line per unit length in ohm/m (Ω/m).
● Inductance L of both conductors of the line per unit length in henry/m (H/m).
● Leakage conductance G between the line conductors per unit length in siemens/m (S/m). Note that the unit of siemens was formerly called mho (which is ohm written backwards).
● Leakage capacitance C between the line conductors per unit length in farad/m (F/m).
With these four parameters constant with time, the transmission line can be treated as a linear time invariant system, which allows us to focus on what the system does to a complex exponential input signal of frequency f and (hence) angular frequency ω = 2πf. The response of the system to some arbitrary input at any time may then be determined from our knowledge of the principle of superposition and the fact that the arbitrary input signal will simply be a collection of complex exponentials. We covered this strategy in exhaustive detail in previous chapters.
5.3 Transmission Line Theory
A brief digression is in order to explain the concepts of impedance and admittance in circuit analysis. The impedance Z of a circuit element (resistor, capacitor, or inductor) is the ratio between the voltage v across the element and the current i flowing through the element. Admittance Y is the reciprocal of impedance and hence the ratio between the current and the voltage of the element. For a resistor of resistance R, Ohm’s law states that v = iR, so that v/i = R, and therefore impedance

$$Z = R \tag{5.3}$$

For an insulator of leakage conductance G, Ohm’s law again states that i = vG, so that i/v = G, and therefore admittance

$$Y = G \tag{5.4}$$
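For a capacitor of capacitance C, the voltage v(t) = Q/C, where Q is the charge accumulated on the capacitor plates due to the flow of current until time t. Assuming a complex exponential current $i(t) = I_{\max} e^{j\omega t}$, we have

$$\begin{aligned}
v(t) &= \frac{1}{C}Q = \frac{1}{C}\int_{-\infty}^{t} I_{\max} e^{j\omega\tau}\,d\tau \\
&= \left.\frac{I_{\max} e^{j\omega\tau}}{j\omega C}\right|_{-\infty}^{t} = \frac{I_{\max} e^{j\omega t}}{j\omega C} \\
&= \frac{i(t)}{j\omega C}
\end{aligned}$$

Thus, the impedance Z and admittance Y of the capacitor are

$$Z = \frac{v(t)}{i(t)} = \frac{1}{j\omega C} = -\frac{j}{2\pi f C}; \qquad Y = j\omega C \tag{5.5}$$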
The presence of the factor j in the expression for the admittance of a capacitor is essential to capture the fact that the current i through the capacitor leads the voltage v across the capacitor by 90°. Finally, for an inductor of inductance L

$$\begin{aligned}
v(t) &= L\frac{di(t)}{dt} = L\frac{d}{dt}\left(I_{\max} e^{j\omega t}\right) \\
&= j\omega L I_{\max} e^{j\omega t} \\
&= j\omega L\, i(t)
\end{aligned}$$

Thus, the impedance of an inductor is

$$Z = \frac{v(t)}{i(t)} = j\omega L \tag{5.6}$$
This indicates that an inductor of inductance L has an impedance of magnitude 𝜔L at angular frequency 𝜔 and that the voltage across the inductor leads the current through it in phase by 90°. Returning to the transmission line problem at hand, it follows that, with the primary line constants stated earlier, a sinusoidal current of angular frequency 𝜔 (being a pair of conjugate complex exponentials at the same frequency) will see impedance Z per unit length (accounting for voltage drop along the line) and admittance Y per unit length (accounting for leakage current through the insulation) given by

$$Z = R + j\omega L; \qquad Y = G + j\omega C \tag{5.7}$$
An infinitesimally short section of the transmission line of length denoted dx will therefore have impedance Zdx and admittance Ydx, as shown in Figure 5.6. It is important to recognise the coordinate system that we will use
5 Transmission Media
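Figure 5.6 Transmission line section of length dx. This section length is hugely exaggerated for illustration. [Labels: voltage v + dv and current i + di at the section input (x + dx); voltage v and current i at the section output (x); series impedance Zdx; shunt admittance Ydx; +x direction.]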
throughout in our analysis. The transmission line lies along the x axis, which increases from right to left starting with the load termination at x = 0 (not shown in Figure 5.6). The change in voltage from v + dv at the section input to v at the output is because of a voltage drop dv across the impedance Zdx, which has current i + di flowing through it. Also, the change in current from i + di at the section input to i at the output is because of a leakage current di flowing through admittance Ydx, which has a voltage drop v across it. Applying Ohm’s law to these two situations allows us to write (since i ≫ di)
$$dv = (i + di)Z\,dx = iZ\,dx$$

$$di = vY\,dx$$

Restating the first and second lines above

$$\frac{dv}{dx} = iZ \quad \text{(i)}; \qquad \frac{di}{dx} = vY \quad \text{(ii)} \tag{5.8}$$

Taking the derivative of (i) with respect to (wrt) x and using (ii) to substitute for the resulting di/dx yields

$$\frac{d^2 v}{dx^2} = ZYv \tag{5.9}$$

Similarly, taking the derivative of (ii) wrt x and using (i) to substitute for dv/dx yields

$$\frac{d^2 i}{dx^2} = ZYi \tag{5.10}$$

Equation (5.9) is a second-order differential equation, the solution of which gives the voltage v on the line. The form of this equation, whereby the second derivative of v wrt x equals a constant times v, suggests that the variation of v with x is exponential and of the form $v = K_1 \exp(\gamma x)$, where $K_1$ and $\gamma$ are independent of x. We have

$$v = K_1 e^{\gamma x}; \qquad \frac{dv}{dx} = \gamma K_1 e^{\gamma x}; \qquad \frac{d^2 v}{dx^2} = \gamma^2 K_1 e^{\gamma x}$$

Substituting the above expressions for v and d²v/dx² into Eq. (5.9) yields

$$\gamma^2 K_1 e^{\gamma x} = ZYK_1 e^{\gamma x} \;\Rightarrow\; \gamma^2 = ZY \;\Rightarrow\; \gamma = \sqrt{ZY}$$

Thus, $v = K_1 \exp(\gamma x)$ is a solution, provided

$$\gamma = \sqrt{ZY} \equiv \alpha + j\beta \tag{5.11}$$
where $\alpha$ is the real part of $\gamma$ and $\beta$ its imaginary part. Following the same steps, we find that $v = K_2 \exp(-\gamma x)$ is also a solution, provided $\gamma = \sqrt{ZY}$. Therefore, the general solution for the voltage on the line is

$$v = K_1 e^{\gamma x} + K_2 e^{-\gamma x} \tag{5.12}$$
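Now that we have a solution for v, we make use of Eq. (5.8) (i) to obtain a solution for current i on the line as follows

$$\begin{aligned}
i &= \frac{1}{Z}\frac{dv}{dx} = \frac{1}{Z}\frac{d}{dx}\left(K_1 e^{\sqrt{ZY}\,x} + K_2 e^{-\sqrt{ZY}\,x}\right) \\
&= \frac{1}{Z}\left(\sqrt{ZY}\,K_1 e^{\gamma x} - \sqrt{ZY}\,K_2 e^{-\gamma x}\right)
\end{aligned}$$

Thus

$$i = \frac{K_1}{\sqrt{Z/Y}}\, e^{\gamma x} - \frac{K_2}{\sqrt{Z/Y}}\, e^{-\gamma x} \tag{5.13}$$

Equations (5.12) and (5.13) explicitly show the variation of v and i with distance x along the transmission line. The variation of v and i with time is implicit in $K_1$ and $K_2$, which are functions of time in the form of complex exponentials of angular frequency $\omega$ since the input signal is a complex exponential. So, substituting $K_1 = A_1 e^{j\omega t}$, $K_2 = A_2 e^{j\omega t}$ in the two equations, we obtain the final expressions for voltage and current on the transmission line as follows

$$\begin{aligned}
v &= A_1 e^{j\omega t} e^{\gamma x} + A_2 e^{j\omega t} e^{-\gamma x} \\
&= A_1 e^{j\omega t} e^{(\alpha + j\beta)x} + A_2 e^{j\omega t} e^{-(\alpha + j\beta)x} \\
&= A_1 e^{\alpha x} e^{j(\omega t + \beta x)} + A_2 e^{-\alpha x} e^{j(\omega t - \beta x)} \equiv v_1 + v_2
\end{aligned} \tag{5.14}$$

$$i = \frac{A_1}{\sqrt{Z/Y}}\, e^{\alpha x} e^{j(\omega t + \beta x)} - \frac{A_2}{\sqrt{Z/Y}}\, e^{-\alpha x} e^{j(\omega t - \beta x)} \equiv i_1 - i_2 \tag{5.15}$$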
Equations (5.14) and (5.15) are hugely significant results that encapsulate all the features of signal transmission on metallic lines, as elaborated in the subsections below.
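As a numerical sanity check on this derivation, the following minimal sketch confirms that the general solution of Eq. (5.12) satisfies Eq. (5.9) at a sample point. The line constants, frequency, and the amplitudes K1 and K2 are assumed purely for illustration and do not come from the text.

```python
import cmath
import math

def propagation_constant(R, L, G, C, omega):
    """Return (gamma, Z, Y) with Z = R + jwL, Y = G + jwC and gamma = sqrt(ZY)."""
    Z = complex(R, omega * L)
    Y = complex(G, omega * C)
    return cmath.sqrt(Z * Y), Z, Y

def voltage(x, gamma, K1, K2):
    """General solution of Eq. (5.12) at distance x from the load."""
    return K1 * cmath.exp(gamma * x) + K2 * cmath.exp(-gamma * x)

# Assumed illustrative values (not taken from the text)
R, L, G, C = 0.05, 300e-9, 1e-9, 40e-12      # ohm/m, H/m, S/m, F/m
omega = 2 * math.pi * 10e6                   # rad/s
K1, K2 = 1.0 + 0j, 0.3 - 0.2j

gamma, Z, Y = propagation_constant(R, L, G, C, omega)

# Central-difference estimate of d2v/dx2 at x0, compared with Z*Y*v(x0)
x0, h = 5.0, 1e-2
d2v = (voltage(x0 + h, gamma, K1, K2)
       - 2 * voltage(x0, gamma, K1, K2)
       + voltage(x0 - h, gamma, K1, K2)) / h ** 2
target = Z * Y * voltage(x0, gamma, K1, K2)
relative_residual = abs(d2v - target) / abs(target)
print(f"relative residual of Eq. (5.9): {relative_residual:.2e}")
```

The residual is limited only by the finite-difference step, so a small value here supports the algebra rather than proving it.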
5.3.1 Incident and Reflected Waves
The above solution stipulates that the voltage v is not just a signal varying with time, but it is a function of both time t and distance x along the line. In other words, the voltage v (and similarly for the current i) is a ‘disturbance’ or wave that propagates down the line while also varying with time at any given spot. The solution further stipulates that there are two voltage waves on the line, namely

$$v_1 = A_1 e^{\alpha x} e^{j(\omega t + \beta x)}; \qquad v_2 = A_2 e^{-\alpha x} e^{j(\omega t - \beta x)} \tag{5.16}$$

Since $v_1 = A_1 e^{\alpha x} e^{j(\omega t + \beta x)} \equiv A_1 e^{\alpha x}\angle(\omega t + \beta x)$, i.e. $v_1$ has magnitude $A_1 e^{\alpha x}$ and phase $\omega t + \beta x$ radian, it means that to track a particular phase of $v_1$ one must decrease x as time t increases so that phase $\omega t + \beta x$ remains constant. That is, the location x of a phase of $v_1$ (such as the crest for which $\omega t + \beta x = 0$) decreases with time. $v_1$ is therefore a wave travelling in the −x direction towards the load at x = 0 and is therefore referred to as the incident wave. The amplitude of $v_1$ equals $A_1$ at x = 0, t = 0. This amplitude increases exponentially with +x by the factor $e^{\alpha x}$. Another way of putting this is that the amplitude of the incident wave decreases exponentially as it travels along the line towards the load. By a similar consideration, we see that $v_2$ is a wave travelling in the +x direction away from the load and is therefore referred to as the reflected wave. The amplitude of $v_2$ equals $A_2$ at x = 0, t = 0, and then decreases exponentially with x by the factor $e^{-\alpha x}$. The line also carries incident and reflected current waves $i_1$ and $i_2$, respectively, given by

$$i_1 = \frac{A_1}{\sqrt{Z/Y}}\, e^{\alpha x} e^{j(\omega t + \beta x)}; \qquad i_2 = \frac{A_2}{\sqrt{Z/Y}}\, e^{-\alpha x} e^{j(\omega t - \beta x)} \tag{5.17}$$
5.3.2 Secondary Line Constants
Since impedance and admittance are in general complex quantities, the parameter $\gamma = \sqrt{ZY}$ given in Eq. (5.11) is also a complex quantity, known as the propagation constant of the line, having a real part $\alpha$ and an imaginary part $\beta$. From Eq. (5.16), the amplitudes $V_1(x + l)$ and $V_1(x)$ of the incident wave at distances x + l and x away from the load are

$$V_1(x + l) = A_1 e^{\alpha(x + l)}; \qquad V_1(x) = A_1 e^{\alpha x}$$

The attenuation in dB imposed by the transmission line on the signal over length l is

$$20\log_{10}\left(\frac{V_1(x + l)}{V_1(x)}\right) = 20\log_{10}\left(\frac{A_1 e^{\alpha(x + l)}}{A_1 e^{\alpha x}}\right) = 20\log_{10}\left(e^{\alpha l}\right) = 20\alpha l\log_{10} e = 8.686\,\alpha l \ \text{(dB)}$$

Thus

$$\text{Line attenuation} = 8.686\,\alpha \quad \text{dB per unit length} \tag{5.18}$$
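The conversion in Eq. (5.18) can be checked with a few lines of code. This is a minimal sketch; the function names and the sample values of attenuation constant and length are assumptions made for illustration.

```python
import math

# 20*log10(e): the number of decibels in one neper (about 8.686)
NEPER_TO_DB = 20 * math.log10(math.e)

def line_attenuation_db(alpha, length):
    """Attenuation in dB over `length` for attenuation constant `alpha` in Np per unit length."""
    return NEPER_TO_DB * alpha * length

def amplitude_ratio(alpha, length):
    """Ratio V1(x + l)/V1(x) of incident-wave amplitudes separated by `length`."""
    return math.exp(alpha * length)

# Assumed sample values: alpha in Np/m, length in m
alpha, length = 1.0e-3, 250.0
loss_from_formula = line_attenuation_db(alpha, length)
loss_from_ratio = 20 * math.log10(amplitude_ratio(alpha, length))
print(f"{NEPER_TO_DB:.3f} dB/Np, {loss_from_formula:.4f} dB, {loss_from_ratio:.4f} dB")
```

Both routes give the same loss, since the factor 8.686 is simply 20 log10(e).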
Note that this attenuation is equivalent to $\alpha$ neper per unit length (see Eq. (2.64)). The parameter $\alpha$ is called the attenuation constant of the line and has units of neper per metre (Np/m). A loss-free line has $\alpha = 0$. Recalling that the phase of $v_1$ is $\omega t + \beta x$ radian, it follows that, observing the wave at some time instant t (i.e. by keeping t fixed, which is akin to taking a snapshot of the wave), we will find that the phase of the wave changes by $\beta$ radian per unit length. The parameter $\beta$ is therefore known as the phase constant (sometimes called the wavenumber) of the line. The distance over which the phase of the wave changes by $2\pi$ radian is known as the wavelength of the wave, so $\beta\lambda = 2\pi$, and we may write

$$\beta = \frac{2\pi}{\lambda} \tag{5.19}$$

On a loss-free line, the incident and reflected wave expressions in Eq. (5.16) may be simplified by substituting $\alpha = 0$, expanding the complex exponential using Euler’s formula ($e^{j\theta} = \cos\theta + j\sin\theta$), and retaining only the real part to obtain

$$v_1 = A_1\cos(\omega t + \beta x); \qquad v_2 = A_2\cos(\omega t - \beta x) \tag{5.20}$$
Figure 5.7 shows snapshots of these two waves taken at time instants t = 0, T/4, and T/2, where T = 1/f = 2𝜋/𝜔 is the wave period. Putting a tracker (♣) on the leftmost crest (defined as 𝜔t + 𝛽x = 6𝜋) of the incident wave at t = 0, we see that at some later time t this crest will be at a location given by 𝛽x = 6𝜋 − 𝜔t or x = (6𝜋 − 𝜔t)/𝛽. Thus

At t = 0, crest is located at $x = \dfrac{6\pi}{\beta} = 3\lambda$ (since $2\pi/\beta = \lambda$)
At t = T/4, crest is located at $x = \left(6\pi - \omega\dfrac{T}{4}\right)\dfrac{1}{\beta} = 2.75\lambda$ (since $\omega T = 2\pi$)
At t = T/2, crest is located at $x = \left(6\pi - \omega\dfrac{T}{2}\right)\dfrac{1}{\beta} = 2.5\lambda$

Figure 5.7 Snapshots of incident and reflected waves at t = 0, T/4, and T/2. [Panels show v1 = A1 cos(ωt + βx) and v2 = A2 cos(ωt − βx) against distance x from 3λ down to 0, with β = 2π/λ and ω = 2π/T.]

So, a constant-phase point of the incident wave travels in the −x direction on the line at a speed known as the phase velocity and given by

$$v_p = \frac{3\lambda - 2.75\lambda}{T/4} = \frac{\lambda}{T} = f\lambda = \frac{\omega}{\beta} \tag{5.21}$$
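The crest-tracking argument can be reproduced numerically. The sketch below assumes a unit wavelength and unit period purely for convenience, so that positions read directly in wavelengths; the function name is hypothetical.

```python
import math

def incident_crest_position(t, omega, beta, phase=6 * math.pi):
    """Location x of the constant-phase point omega*t + beta*x = phase on the incident wave."""
    return (phase - omega * t) / beta

# Assumed normalisation: wavelength = 1 and period = 1
wavelength, period = 1.0, 1.0
beta = 2 * math.pi / wavelength
omega = 2 * math.pi / period

snapshot_times = (0.0, period / 4, period / 2)
positions = [incident_crest_position(t, omega, beta) for t in snapshot_times]

# Speed towards the load (the -x direction), estimated from the first two snapshots
v_p = (positions[0] - positions[1]) / (period / 4)
print(positions, v_p, omega / beta)
```

The estimated speed agrees with ω/β, as Eq. (5.21) requires.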
Similarly, putting a tracker (⧫) on the third crest of the reflected wave (defined as 𝜔t − 𝛽x = −2𝜋) at t = 0, we see that at some later time t this crest is at a location given by x = (2𝜋 + 𝜔t)/𝛽. Thus

At t = 0, tracked crest is at $x = \dfrac{2\pi}{\beta} = \lambda$
At t = T/4, tracked crest is at $x = \left(2\pi + \omega\dfrac{T}{4}\right)\dfrac{1}{\beta} = 1.25\lambda$
And at t = T/2, tracked crest is at $x = \left(2\pi + \omega\dfrac{T}{2}\right)\dfrac{1}{\beta} = 1.5\lambda$

So, a constant-phase point of the reflected wave travels to the left in the +x direction on the line at the same phase velocity $v_p$ given in Eq. (5.21), which is applicable to incident and reflected voltage and current waves on both lossy and loss-free lines. The constants discussed above are usually referred to as secondary line constants to distinguish them from the primary line constants (R, L, C, and G) from which they may be calculated using approximate formulas that are applicable at specified frequency ranges. In what follows we use the well-known approximation

$$(1 + a)^n = 1 + na + \frac{n(n - 1)}{2}a^2 + \frac{n(n - 1)(n - 2)}{3 \times 2}a^3 + \cdots \approx 1 + na, \qquad a \ll 1 \tag{5.22}$$
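At radio frequencies when 𝜔L ≫ R and 𝜔C ≫ G, propagation constant 𝛾 becomes

$$\begin{aligned}
\gamma &= \sqrt{ZY} = \sqrt{(R + j\omega L)(G + j\omega C)} = \sqrt{j\omega L\left(\frac{R}{j\omega L} + 1\right) j\omega C\left(\frac{G}{j\omega C} + 1\right)} \\
&= j\omega\sqrt{LC}\left(1 + \frac{R}{j\omega L}\right)^{1/2}\left(1 + \frac{G}{j\omega C}\right)^{1/2} \simeq j\omega\sqrt{LC}\left(1 + \frac{R}{2j\omega L}\right)\left(1 + \frac{G}{2j\omega C}\right) \\
&\simeq j\omega\sqrt{LC}\left(1 + \frac{R}{2j\omega L} + \frac{G}{2j\omega C} - \frac{1}{4}\,\frac{R}{\omega L}\,\frac{G}{\omega C}\right) \simeq \frac{R}{2}\sqrt{\frac{C}{L}} + \frac{G}{2}\sqrt{\frac{L}{C}} + j\omega\sqrt{LC} \\
&\equiv \alpha + j\beta
\end{aligned}$$

Therefore, at radio frequencies when 𝜔L ≫ R and 𝜔C ≫ G, attenuation constant and phase constant are given by

$$\alpha = \frac{R}{2}\sqrt{\frac{C}{L}} + \frac{G}{2}\sqrt{\frac{L}{C}}; \qquad \beta = \omega\sqrt{LC} \tag{5.23}$$

At audio frequencies when R ≫ 𝜔L but G ≪ 𝜔C, the propagation constant becomes

$$\begin{aligned}
\gamma &= \sqrt{(R + j\omega L)(G + j\omega C)} \simeq \sqrt{(R)(j\omega C)} \\
&= \sqrt{j\omega CR} = \sqrt{\omega CR}\,(j)^{1/2} = \sqrt{\omega CR}\,\left(e^{j\pi/2}\right)^{1/2} = \sqrt{\omega CR}\, e^{j\pi/4} \\
&= \sqrt{\omega CR} \times 1\angle 45^\circ = \sqrt{\omega CR}\left(\frac{1}{\sqrt{2}} + j\frac{1}{\sqrt{2}}\right) \\
&= \sqrt{\omega CR/2} + j\sqrt{\omega CR/2} \equiv \alpha + j\beta
\end{aligned}$$

Thus, at audio frequencies, when R ≫ 𝜔L and G ≪ 𝜔C, the attenuation constant 𝛼 and phase constant 𝛽 are given approximately by

$$\alpha = \beta = \sqrt{\omega CR/2} \tag{5.24}$$

Using the approximate formulas for 𝛽 in Eqs. (5.23) and (5.24), we obtain approximate formulas for phase velocity at radio and audio frequencies as

$$v_p = \frac{\omega}{\beta} \approx
\begin{cases}
\dfrac{\omega}{\omega\sqrt{LC}} = \dfrac{1}{\sqrt{LC}}, & \text{Radio frequencies} \\[2ex]
\dfrac{\omega}{\sqrt{\omega CR/2}} = \sqrt{\dfrac{2\omega}{RC}}, & \text{Audio frequencies}
\end{cases} \tag{5.25}$$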
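To see how good these approximations are, the following sketch compares Eqs. (5.23) and (5.24) with the exact value of 𝛾 from Eq. (5.11). The two sets of primary line constants are assumed values chosen only so that the radio-frequency and audio-frequency conditions hold; they do not describe any particular cable.

```python
import cmath
import math

def gamma_exact(R, L, G, C, omega):
    """Exact propagation constant, Eq. (5.11)."""
    return cmath.sqrt(complex(R, omega * L) * complex(G, omega * C))

def gamma_rf(R, L, G, C, omega):
    """Radio-frequency approximation, Eq. (5.23)."""
    alpha = (R / 2) * math.sqrt(C / L) + (G / 2) * math.sqrt(L / C)
    return complex(alpha, omega * math.sqrt(L * C))

def gamma_audio(R, C, omega):
    """Audio-frequency approximation, Eq. (5.24)."""
    a = math.sqrt(omega * C * R / 2)
    return complex(a, a)

def relative_error(approx, exact):
    return abs(approx - exact) / abs(exact)

# Assumed line constants per metre (illustrative only)
rf_line = dict(R=0.05, L=300e-9, G=1e-9, C=40e-12)
audio_line = dict(R=0.17, L=0.6e-6, G=1e-10, C=50e-12)
w_rf = 2 * math.pi * 10e6       # 10 MHz: wL >> R and wC >> G
w_audio = 2 * math.pi * 1e3     # 1 kHz: R >> wL and G << wC

err_rf = relative_error(gamma_rf(omega=w_rf, **rf_line),
                        gamma_exact(omega=w_rf, **rf_line))
err_audio = relative_error(gamma_audio(audio_line["R"], audio_line["C"], w_audio),
                           gamma_exact(omega=w_audio, **audio_line))

# Phase velocities from Eq. (5.25)
vp_rf = 1 / math.sqrt(rf_line["L"] * rf_line["C"])
vp_audio = math.sqrt(2 * w_audio / (audio_line["R"] * audio_line["C"]))
print(f"RF error {err_rf:.1e}, audio error {err_audio:.1e}")
print(f"vp (RF) {vp_rf:.3e} m/s, vp (audio) {vp_audio:.3e} m/s")
```

With these assumed values the audio-frequency error is noticeably larger than the radio-frequency error, because 𝜔L/R is small but not negligible at 1 kHz.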
5.3.3 Characteristic Impedance
The characteristic impedance $Z_o$ of a transmission line is the ratio between voltage and current for a single wave travelling in one direction. Equivalently, characteristic impedance is the input impedance of an infinite length of the line. It is also the ratio of electric to magnetic field strengths of a single wave and is analogous to the wave impedance of free space, but has a different value determined by the primary line constants. We will shortly find that when a line is terminated by its characteristic impedance, there is no reflection from the end of the line, so this is an important parameter in the design of efficient transmission line systems. From Eqs. (5.16) and (5.17) the ratio between incident voltage $v_1$ and incident current $i_1$ is

$$Z_o = \frac{v_1}{i_1} = \frac{A_1 e^{\alpha x} e^{j(\omega t + \beta x)}}{\dfrac{A_1}{\sqrt{Z/Y}}\, e^{\alpha x} e^{j(\omega t + \beta x)}} = \sqrt{\frac{Z}{Y}} = \sqrt{\frac{R + j\omega L}{G + j\omega C}} = \sqrt{\frac{L(R/L + j\omega)}{C(G/C + j\omega)}} \tag{5.26}$$

At radio frequency (RF) (where 𝜔 ≫ R/L, 𝜔 ≫ G/C) or for a loss-free line (where R = G = 0) or under the Heaviside condition (G/C = R/L), the last expression reduces to

$$Z_o = \sqrt{\frac{L}{C}} \tag{5.27}$$

which is independent of frequency and is also a real quantity. At audio frequencies when R ≫ 𝜔L but G ≪ 𝜔C, we have

$$Z_o = \sqrt{\frac{R}{j\omega C}} \tag{5.28}$$

In an actual line, G/C ≪ R/L so it was common practice in the past to add loading coils at regular intervals along transmission lines in order to make L large enough to satisfy the Heaviside condition

$$\frac{G}{C} = \frac{R}{L} \tag{5.29}$$

which, as stated above, is required to ensure $Z_o$ is real and independent of frequency. This coil loading practice has long been discontinued for reasons discussed in the overview of wire pairs. In the few remaining installations of metallic lines for long-distance (> ∼2 km) signal transmission nowadays, digital repeaters have replaced loading coils.

Worked Example 5.1 Primary and Secondary Line Constants
A coaxial cable, as shown in the third row of Figure 5.1, consists of copper-conducting material and air dielectric medium. The diameter of the inner conductor is a = 0.7 mm and the inner diameter of the outer conductor is b = 2.9 mm. Resistivity of the air medium is in excess of $10^9$ Ω m, so leakage conductance G ≈ 0. A matched load is connected to the receiving end of 100 m of this cable. The sending end is connected to a 10 MHz sinusoidal signal generator and adjusted to give 10 V rms measured across its input terminals. Given the following primary parameters of the relevant media

Resistivity of copper material $\rho = 1.72 \times 10^{-8}$ Ω m
Relative permeability of copper $\mu_r = 1$
Relative permittivity of air $\varepsilon_r = 1$
Permittivity of free space $\varepsilon_o = 8.8541878128 \times 10^{-12}$ F/m
Permeability of free space $\mu_o = 1.25663706212 \times 10^{-6}$ H/m

(a) Determine the primary line constants R, L, and C of the cable.
(b) Determine the secondary line constants 𝛼 and 𝛽.
(c) Calculate the magnitude of the receiving end current and receiving end voltage.
(d) Calculate the efficiency of the transmission line.

(a) Primary line constant R is resistance per unit length. Eq. (5.1) gives

$$R = \frac{\rho l}{A} = \frac{1.72 \times 10^{-8} \times 1}{\pi (a/2)^2} = \frac{1.72 \times 10^{-8} \times 1}{\pi (0.7 \times 10^{-3}/2)^2} = 0.045\ \Omega/\text{m}$$

Formulas for inductance L and capacitance C per unit length are given in row 3 of Figure 5.1

$$L = \frac{\mu}{2\pi}\ln\left(\frac{b}{a}\right) = \frac{\mu_r\mu_o}{2\pi}\ln\left(\frac{b}{a}\right) = \frac{1 \times 1.25663706212 \times 10^{-6}}{2\pi}\ln\left(\frac{2.9}{0.7}\right) = 284\ \text{nH/m}$$

$$C = \frac{2\pi\varepsilon}{\ln(b/a)} = \frac{2\pi\varepsilon_r\varepsilon_o}{\ln(b/a)} = \frac{2\pi \times 1 \times 8.8541878128 \times 10^{-12}}{\ln(2.9/0.7)} = 39.14\ \text{pF/m}$$

(b) Operating frequency f = 10 MHz, so we first check to see whether the RF condition (𝜔L ≫ R, 𝜔C ≫ G) is satisfied

$$\omega L = 2\pi \times 10 \times 10^{6} \times 284 \times 10^{-9} = 17.84 \gg R = 0.045$$

$$\omega C = 2\pi \times 10 \times 10^{6} \times 39.14 \times 10^{-12} = 0.00246 \gg G = 0$$

RF condition is therefore satisfied, so we employ the expressions for 𝛼 and 𝛽 in Eq. (5.23) that are applicable at RF to obtain (using the unrounded value R = 0.0447 Ω/m)

$$\alpha = \frac{R}{2}\sqrt{\frac{C}{L}} + \frac{G}{2}\sqrt{\frac{L}{C}} = \frac{0.0447}{2}\sqrt{\frac{3.914 \times 10^{-11}}{2.84 \times 10^{-7}}} + 0 = 2.622 \times 10^{-4}\ \text{Np/m}$$

$$\beta = \omega\sqrt{LC} = 2\pi \times 10 \times 10^{6}\sqrt{2.84 \times 10^{-7} \times 3.914 \times 10^{-11}} = 0.21\ \text{rad/m} = 12\ \text{deg/m}$$

(c) Characteristic impedance of the line (under RF conditions) is given by

$$Z_o = \sqrt{\frac{L}{C}} = \sqrt{\frac{2.84 \times 10^{-7}}{3.914 \times 10^{-11}}} = 85\ \Omega$$

Sending end voltage $V_s$ = 10 V rms. Since the line is terminated with a matched load, the signal source sees an input impedance equal to $Z_o$. Thus, sending end current $I_s = V_s/Z_o$ = 117.3 mA. As the voltage and current waves travel down the line through length l = 100 m to the receiving end, they are attenuated by 8.686𝛼l dB = 0.2278 dB. Thus, receiving end voltage $V_o$ and receiving end current $I_o$ are given by

$$V_o = V_s \times 10^{-0.2278/20} = 10 \times 10^{-0.2278/20} = 9.74\ \text{V rms}$$

$$I_o = I_s \times 10^{-0.2278/20} = 117.3 \times 10^{-0.2278/20} = 114.3\ \text{mA rms}$$

(d) Transmission line efficiency 𝜂 is given by the ratio between output power and input power. Thus

$$\eta = \frac{V_o I_o}{V_s I_s} \times 100\% = \frac{9.74 \times 0.1143}{10 \times 0.1173} \times 100\% = 94.89\%$$
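The arithmetic of this worked example can be reproduced with a short script. This is a sketch that follows the same formulas and the same RF approximations as the solution above; the variable names are illustrative, and intermediate values are kept unrounded, which is why the last digits may differ slightly from hand calculation with rounded numbers.

```python
import math

# Data from the worked example
rho_cu = 1.72e-8             # resistivity of copper, ohm m
a, b = 0.7e-3, 2.9e-3        # conductor diameters, m
eps_o = 8.8541878128e-12     # F/m
mu_o = 1.25663706212e-6      # H/m
f, length, Vs = 10e6, 100.0, 10.0   # Hz, m, V rms

# (a) Primary line constants (G taken as zero)
R = rho_cu / (math.pi * (a / 2) ** 2)
L = mu_o / (2 * math.pi) * math.log(b / a)
C = 2 * math.pi * eps_o / math.log(b / a)

# (b) Secondary line constants under the RF approximation, Eq. (5.23)
omega = 2 * math.pi * f
alpha = (R / 2) * math.sqrt(C / L)
beta = omega * math.sqrt(L * C)

# (c) Characteristic impedance, then receiving-end voltage and current
Zo = math.sqrt(L / C)
loss_db = 20 * math.log10(math.e) * alpha * length
Is = Vs / Zo
Vo = Vs * 10 ** (-loss_db / 20)
Io = Is * 10 ** (-loss_db / 20)

# (d) Efficiency as output power over input power
efficiency = 100 * (Vo * Io) / (Vs * Is)

print(f"R = {R:.4f} ohm/m, L = {L * 1e9:.1f} nH/m, C = {C * 1e12:.2f} pF/m")
print(f"alpha = {alpha:.4e} Np/m, beta = {beta:.4f} rad/m, Zo = {Zo:.2f} ohm")
print(f"loss = {loss_db:.4f} dB, Vo = {Vo:.3f} V, Io = {Io * 1e3:.1f} mA, "
      f"efficiency = {efficiency:.2f} %")
```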
5.3.4 Reflection and Transmission Coefficients
A transmission line is typically employed to convey a signal from a source to a receiving end or load that is characterised by an impedance $Z_L$. The signal will take some time to propagate as a wave along the line from its source to the load, which could be a transmitting antenna, a low noise amplifier, or other electrical system. We wish to examine what happens to the signal when it arrives at the load. Ideally, the signal should be delivered in its entirety to the load, as any reflection represents not only a signal loss to the load but also a potentially damaging return signal towards the source.

Figure 5.8 Line terminated by load impedance $Z_L$ at x = 0. [Labels: incident current i1, reflected current i2, resultant current i = i1 − i2, resultant voltage v = v1 + v2, line impedance Zx, load ZL at x = 0, +x direction.]

Figure 5.8 shows a transmission line terminated by a load impedance $Z_L$ at x = 0. Recall that distance along the line is measured leftward from the load. The impedance measured between the line pair at a distance x from the load is known as the line impedance at x, denoted $Z_x$. This impedance is the ratio between the resultant voltage and resultant current at the location and will in general vary with distance, not necessarily being equal to characteristic impedance $Z_o$ since there are two voltage waves $v_1$ and $v_2$ and two current waves $i_1$ and $i_2$ on the line. An incident current $i_1$ flows along the line towards the load and a reflected current $i_2$ flows along the same line away from the load, giving a resultant current $i = i_1 - i_2$ at any point. The incident voltage $v_1$ and reflected voltage $v_2$ combine to give a resultant voltage $v = v_1 + v_2$. The voltage reflection coefficient $\rho_v$ provides a measure of how much reflection takes place at the load and is defined as the ratio between reflected voltage $v_2$ and incident voltage $v_1$ at the load. At the load (x = 0), Eqs. (5.16), (5.17), and (5.26) give

$$v_1 = \left. A_1 e^{\alpha x} e^{j(\omega t + \beta x)}\right|_{x=0} = A_1 e^{j\omega t}$$

And similarly

$$v_2 = A_2 e^{j\omega t}; \qquad i_1 = \frac{A_1}{\sqrt{Z/Y}}\, e^{j\omega t} = \frac{A_1}{Z_o}\, e^{j\omega t}; \qquad i_2 = \frac{A_2}{Z_o}\, e^{j\omega t}$$

so that

$$\rho_v = \left.\frac{v_2}{v_1}\right|_{x=0} = \frac{A_2}{A_1}; \qquad A_2 = \rho_v A_1 = |\rho_v| A_1 \angle\theta_v \tag{5.30}$$
where $\theta_v$ is the angle of $\rho_v$, which is in general a complex quantity. $\theta_v$ is the phase shift that is added to $v_2$ purely by the reflection process. A reflection-induced phase shift will occur whenever the characteristic impedance $Z_o$ and load $Z_L$ are not both a pure resistance. A reflection-induced phase shift also occurs when both $Z_o$ and $Z_L$ are pure resistances with $Z_L < Z_o$, in which case $\theta_v = 180^\circ$. Since $Z_L = v/i = (v_1 + v_2)/(i_1 - i_2)$, and (by definition) $Z_o = v_1/i_1 = v_2/i_2$, which gives $i_1 = v_1/Z_o$ and $i_2 = v_2/Z_o$, it follows that

$$\begin{aligned}
Z_L &= \left.\frac{v_1 + v_2}{i_1 - i_2}\right|_{x=0} = \frac{A_1 e^{j\omega t} + A_2 e^{j\omega t}}{(A_1/Z_o)e^{j\omega t} - (A_2/Z_o)e^{j\omega t}} \\
&= \frac{A_1 + A_2}{A_1/Z_o - A_2/Z_o} = \frac{1 + A_2/A_1}{1 - A_2/A_1}\, Z_o \\
&= \frac{1 + \rho_v}{1 - \rho_v}\, Z_o
\end{aligned}$$

Rearranging to make $\rho_v$ the subject of the equation yields

$$\rho_v = \frac{Z_L - Z_o}{Z_L + Z_o} \equiv |\rho_v|\angle\theta_v \tag{5.31}$$
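Equation (5.31) is easy to explore numerically. The sketch below evaluates it for a few terminations on an assumed 100 Ω line; the function name and the convention of passing `math.inf` for an open circuit are choices made here for illustration.

```python
import cmath
import math

def reflection_coefficient(ZL, Zo):
    """Voltage reflection coefficient, Eq. (5.31). Pass math.inf for an open circuit."""
    if ZL == math.inf:
        return complex(1.0, 0.0)    # limiting value as ZL grows without bound
    return complex(ZL - Zo) / complex(ZL + Zo)

Zo = 100.0   # assumed characteristic impedance, ohm
cases = [("matched", 100.0), ("short circuit", 0.0), ("open circuit", math.inf),
         ("resistive, ZL < Zo", 50.0), ("complex load", 80 + 60j)]

results = {}
for label, ZL in cases:
    rho = reflection_coefficient(ZL, Zo)
    results[label] = (abs(rho), math.degrees(cmath.phase(rho)))
    print(f"{label:20s} |rho| = {results[label][0]:.4f}, "
          f"angle = {results[label][1]:7.2f} deg")
```

The resistive case with a load smaller than the characteristic impedance returns a negative real coefficient, that is, a reflection-induced phase shift of 180°, consistent with the remark above.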
where we have emphasised the fact that the reflection coefficient $\rho_v$ is in general complex, having magnitude $|\rho_v|$ and angle $\theta_v$, which amounts to a reflection-induced phase shift. The voltage transmission coefficient $\tau_v$ is defined as the ratio between the voltage v that is delivered to the load and the incident voltage $v_1$. Thus

$$\tau_v = \frac{v}{v_1} = \frac{v_1 + v_2}{v_1} = 1 + \frac{v_2}{v_1} = 1 + \rho_v = \frac{2Z_L}{Z_L + Z_o} \tag{5.32}$$

The voltage $v_L$ delivered to a load, which could be another transmission line, may be expressed in terms of incident voltage $v_1$ and voltage transmission coefficient $\tau_v$ as

$$v_L = \tau_v v_1 \tag{5.33}$$

Notice in Eq. (5.32) that if $Z_L = Z_o$, then $\tau_v = 1$ and the voltage delivered to the load equals the incident voltage. The current reflection coefficient $\rho_i$ is the ratio of reflected current $i_2$ to incident current $i_1$. With direction of current flow as shown in Figure 5.8, it follows that

$$\rho_i = -\frac{i_2}{i_1} = -\frac{v_2/Z_o}{v_1/Z_o} = -\frac{v_2}{v_1} = -\rho_v \tag{5.34}$$

The current transmission coefficient $\tau_i$ is the ratio between the current $i_L$ delivered to the load and the incident current $i_1$. Thus

$$\tau_i = \frac{i_L}{i_1} = \frac{i_1 - i_2}{i_1} = 1 - \frac{i_2}{i_1} = 1 + \rho_i = 1 - \rho_v \tag{5.35}$$
The primary coefficient is therefore the voltage reflection coefficient $\rho_v$. Once $\rho_v$ is known, all the other coefficients may be calculated as follows

$$\begin{aligned}
\text{Voltage transmission coefficient:} \quad & \tau_v = 1 + \rho_v \\
\text{Current reflection coefficient:} \quad & \rho_i = -\rho_v \\
\text{Current transmission coefficient:} \quad & \tau_i = 1 - \rho_v
\end{aligned} \tag{5.36}$$
The extent of reflections on a transmission line may also be quantified in terms of signal powers. A parameter known as return loss (RL) is defined as the ratio of incident power $P_1$ to reflected power $P_2$ expressed in dB. Thus

$$\begin{aligned}
\text{RL} &= 10\log_{10}\left(\frac{P_1}{P_2}\right) = 10\log_{10}\left(\frac{A_1^2}{A_2^2}\right) = 20\log_{10}\left(\left|\frac{A_1}{A_2}\right|\right) = -20\log_{10}\left(\left|\frac{A_2}{A_1}\right|\right) \\
&= -20\log_{10}(|\rho_v|)\ \text{dB}
\end{aligned} \tag{5.37}$$
Notice that in an open- or short-circuited line, |𝜌v | = 1, so that RL = 0, whereas in a matched line |𝜌v | = 0 and RL = ∞. The design goal is therefore usually a large return loss.
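Mismatch loss (ML) is defined as the ratio (in dB) between incident power $P_1$ and power $P_L$ delivered to the load. Thus

$$\begin{aligned}
\text{ML} &= 10\log_{10}\left(\frac{P_1}{P_L}\right) = 10\log_{10}\left(\frac{P_1}{P_1 - P_2}\right) \\
&= 10\log_{10}\left(\frac{1}{1 - P_2/P_1}\right) = -10\log_{10}(1 - P_2/P_1) \\
&= -10\log_{10}\left(1 - |\rho_v|^2\right)\ \text{dB}
\end{aligned} \tag{5.38}$$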
We see that an open- or short-circuited line leads to ML = ∞, since $|\rho_v| = 1$; and a matched line, for which $|\rho_v| = 0$, has ML = 0. Thus, the aim in design will be to achieve a negligibly small mismatch loss. Finally, the power transfer efficiency of the transmission line is defined as the percentage ratio between power delivered to the load and incident power. Thus

$$\text{Power transfer efficiency} = \frac{P_L}{P_1} \times 100\% = 100\,\frac{P_1 - P_2}{P_1} = 100\left(1 - |\rho_v|^2\right) \tag{5.39}$$
5.3.5 Standing Waves
The voltage signal v(x, t) observed on the transmission line at time t and distance x from the load is the resultant or sum of the incident wave $v_1$ and reflected wave $v_2$. There will be points along the line where $v_1$ and $v_2$ are exactly in phase and their resultant has maximum amplitude $A_{\max}$. There are also points along the line where $v_1$ and $v_2$ are exactly out of phase so that their resultant has minimum amplitude $A_{\min}$. The rest of the points along the line will have a range of amplitudes between $A_{\min}$ and $A_{\max}$. What we have here is therefore a situation where two oppositely propagating waves create a pattern of ‘disturbance’ at each location along the line. Once enough time has elapsed for the two waves to be established throughout the line, every point along the line is actively ‘disturbed’ but each point by a different amount, ranging from a peak-to-peak voltage variation of $2A_{\min}$ to a peak-to-peak variation of $2A_{\max}$. This pattern of disturbance is called a standing wave pattern. For simplicity, let us assume a loss-free line so that

$$v_1 = A_1\cos(\omega t + \beta x); \qquad v_2 = A_2\cos(\omega t - \beta x + \theta_v)$$

where $A_1$ and $A_2$ are positive and real and any reflection-induced phase shift in $v_2$ has been transferred into the angle as $\theta_v$. At point x, $v_1$ and $v_2$ are just two sinusoidal signals of the same (angular) frequency $\omega$ but different amplitudes $A_1$ and $A_2$ and different phases $\beta x$ and $-\beta x + \theta_v$. We therefore use the method of sinusoidal addition discussed in Section 2.7.3 to obtain the resultant signal $v(x, t) = A\cos(\omega t + \phi(x))$. Making use of relevant trigonometric identities to simplify the expression for resultant amplitude A, the in-phase component $A_I$, quadrature component $A_Q$, resultant amplitude A, and phase $\phi(x)$ of v(x, t) are

$$\begin{aligned}
A_I &= A_1\cos(\beta x) + A_2\cos(-\beta x + \theta_v) \\
A_Q &= A_1\sin(\beta x) + A_2\sin(-\beta x + \theta_v) \\
A &= \sqrt{A_I^2 + A_Q^2} = \sqrt{A_1^2 + A_2^2 + 2A_1A_2\cos(2\beta x - \theta_v)} \\
\phi(x) &= \tan^{-1}(A_Q/A_I)
\end{aligned}$$
There is a little more subtlety to the phase $\phi(x)$ than indicated by the above formula. See Eq. (2.52) for details. Our interest here is, however, only in the amplitude A. Using Eq. (5.30) to make the substitution $A_2 = |\rho_v|A_1$ in the above expression for A, we obtain that the sinusoidal voltage signal at a distance x from the load has amplitude

$$A = \sqrt{A_1^2 + A_2^2 + 2A_1A_2\cos(2\beta x - \theta_v)} = A_1\sqrt{1 + |\rho_v|^2 + 2|\rho_v|\cos(2\beta x - \theta_v)} \tag{5.40}$$

This equation indicates that the variation of resultant voltage amplitude with distance along the line is a function of the voltage reflection coefficient $\rho_v$. If $\rho_v = 0$ then $A = A_1$ at every point on the line. A line terminated with a matched load $Z_L = Z_o$ has no signal reflection and hence no standing wave pattern. In all other situations of an unmatched line, $\rho_v$ will be nonzero, and the resultant amplitude will vary with distance. Since, in Eq. (5.40), $\cos(2\beta x - \theta_v)$ ranges in value from −1 to +1, it means that, at locations x where $\cos(2\beta x - \theta_v) = 1$, the resultant voltage will have maximum amplitude $A_{\max}$, whereas locations x at which $\cos(2\beta x - \theta_v) = -1$ will have minimum amplitude $A_{\min}$. A point of maximum amplitude is often referred to as an antinode and a point of minimum amplitude a node. Locations of nodes and antinodes are fixed along the line; they do not drift with the waves; they just stand in their various locations, hence the term standing waves. Denoting antinode locations as $x_a$ and node locations as $x_d$, and recalling that $\cos\Theta = 1$ for $\Theta = 0, 2\pi, 4\pi, \ldots, 2n\pi$, and that $\cos\Theta = -1$ for $\Theta = \pi, 3\pi, 5\pi, \ldots, (2n+1)\pi$, we solve $\cos(2\beta x_a - \theta_v) = 1$ and $\cos(2\beta x_d - \theta_v) = -1$ to obtain the following results (using Eq. (5.19) to aid simplification)

$$\begin{aligned}
x_a &= \frac{2n\pi + \theta_v}{2\beta} = \left(n + \frac{\theta_v}{2\pi}\right)\frac{\lambda}{2} \\
x_d &= \frac{(2n + 1)\pi + \theta_v}{2\beta} = \left(n + \frac{1}{2} + \frac{\theta_v}{2\pi}\right)\frac{\lambda}{2}
\end{aligned} \qquad n = 0, 1, 2, 3, \cdots \tag{5.41}$$
Therefore, antinodes are spaced apart by half a wavelength along the line. Nodes are also spaced apart by 𝜆/2, and the distance between a node and an adjacent antinode is 𝜆/4. If $\theta_v = 0$ (i.e. no phase shift upon reflection) then the first antinode is located at the load and the first node is located one-quarter of a wavelength from the load. The maximum and minimum resultant amplitudes are obtained by substituting $\cos(2\beta x - \theta_v) = \pm 1$ into Eq. (5.40)

$$\begin{aligned}
A_{\max} &= A_1\sqrt{1 + |\rho_v|^2 + 2|\rho_v|} = A_1\sqrt{(1 + |\rho_v|)^2} = A_1(1 + |\rho_v|) \\
A_{\min} &= A_1\sqrt{1 + |\rho_v|^2 - 2|\rho_v|} = A_1\sqrt{(1 - |\rho_v|)^2} = A_1(1 - |\rho_v|)
\end{aligned} \tag{5.42}$$
The ratio $A_{\max}/A_{\min}$ is called the voltage standing wave ratio (VSWR). Thus

$$\text{VSWR} = \frac{1 + |\rho_v|}{1 - |\rho_v|} \tag{5.43}$$
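The standing wave results can be confirmed by sampling Eq. (5.40) over half a wavelength and comparing the extremes with Eqs. (5.41) to (5.43). In this sketch the incident amplitude, the reflection coefficient magnitude of 1/3, and the reflection phase of 90° are assumed values, and distance is expressed in wavelengths so that 2𝛽x = 4𝜋x.

```python
import math

def resultant_amplitude(x_wl, A1, rho_mag, theta_v):
    """Eq. (5.40) with x in wavelengths, so that 2*beta*x = 4*pi*x_wl."""
    return A1 * math.sqrt(1 + rho_mag ** 2
                          + 2 * rho_mag * math.cos(4 * math.pi * x_wl - theta_v))

# Assumed illustrative values
A1, rho_mag, theta_v = 1.0, 1 / 3, math.pi / 2

# Sample from the load out to half a wavelength (one full period of the pattern)
xs = [n / 1000 for n in range(501)]
amps = [resultant_amplitude(x, A1, rho_mag, theta_v) for x in xs]
A_max, A_min = max(amps), min(amps)

vswr_sampled = A_max / A_min
vswr_formula = (1 + rho_mag) / (1 - rho_mag)                 # Eq. (5.43)

# First antinode and node (n = 0) from Eq. (5.41), in wavelengths
first_antinode = (theta_v / (2 * math.pi)) / 2
first_node = (0.5 + theta_v / (2 * math.pi)) / 2
x_at_max = xs[amps.index(A_max)]
x_at_min = xs[amps.index(A_min)]
print(vswr_sampled, vswr_formula, x_at_max, first_antinode, x_at_min, first_node)
```

With these values the first antinode sits one-eighth of a wavelength from the load and the first node a quarter wavelength further on, which matches the 𝜆/4 node-to-antinode spacing stated above.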
VSWR is a real and positive parameter that gives an important indication of how well an electrical load such as an antenna is matched to a transmission line that feeds it. Under perfect matching, $\rho_v = 0$ and VSWR has its smallest value of 1.0. Since impedance is a function of frequency, such perfect matching will only be achieved at one frequency, say $f_c$. The level of mismatch and hence VSWR and the amount of lost power (i.e. power reflected away from the load) will then increase as signal frequency departs from $f_c$ up or down. So, the operational bandwidth of an antenna is typically specified as the range of frequencies over which VSWR remains below a certain level. It is straightforward to manipulate Eq. (5.43) to express the magnitude of reflection coefficient in terms of VSWR as

$$|\rho_v| = \frac{\text{VSWR} - 1}{\text{VSWR} + 1} \tag{5.44}$$

Figure 5.9 Standing wave pattern for various line terminations and VSWR. Locations of nodes (N) and antinodes (AN) are identified on each plot. [Four panels, each plotting resultant amplitude A (volt) against distance from load (wavelengths) for Zo = 100 Ω: matched line, ZL = Zo = 100 Ω, VSWR = 1; open line, ZL = ∞, VSWR = ∞; short-circuited line, ZL = 0 Ω, VSWR = ∞; resistive load RL > Zo, ZL = 200 Ω, VSWR = 2.]

Figure 5.9 shows the standing wave pattern on a transmission line for various line terminations and VSWR values. Plots of the resultant amplitude of the voltage signal at various points on the line starting from the load and covering up to one wavelength from the load are shown. For example, an open line has $Z_L = \infty$ so Eqs. (5.31) and (5.43) yield

$$\rho_v = \frac{Z_L - Z_o}{Z_L + Z_o} = \frac{Z_L}{Z_L} = 1\angle 0^\circ; \qquad \text{VSWR} = \frac{1 + |\rho_v|}{1 - |\rho_v|} = \frac{1 + 1}{1 - 1} = \infty$$
So, with 𝜃 v = 0, it follows from Eq. (5.41) that in an open-circuit the first antinode (for which n = 0) is located at distance xa = 0 from the load (and thus at the load), and the first node is located at one-quarter of a wavelength from the load. Notice also how the standing wave ripple reduces from being twice the incident wave amplitude when VSWR = ∞ (for open-circuit and short-circuit conditions) to zero when VSWR = 1.0 (for a matched line condition).
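5.3.6 Line Impedance and Admittance
The line impedance $Z_x$ at a distance x from load $Z_L$ (looking towards the load as indicated in Figure 5.10) on a transmission line of characteristic impedance $Z_o$ is the ratio between the voltage v measured across the line pair at x and the current i flowing in the line at the same point.

Figure 5.10 Line impedance $Z_x$. [Labels: cable of characteristic impedance Zo, load ZL, line impedance Zx at distance x from the load.]

Using the expressions for v and i in Eqs. (5.12) and (5.13) with $K_1 = A_1 e^{j\omega t}$, $K_2 = A_2 e^{j\omega t}$, $Z_o = \sqrt{Z/Y}$, $A_2 = \rho_v A_1$ as earlier established leads to

$$\begin{aligned}
Z_x = \frac{v}{i} &= \frac{K_1 e^{\gamma x} + K_2 e^{-\gamma x}}{\dfrac{K_1}{\sqrt{Z/Y}}\, e^{\gamma x} - \dfrac{K_2}{\sqrt{Z/Y}}\, e^{-\gamma x}} \\
&= \frac{A_1 e^{j\omega t} e^{\gamma x} + A_2 e^{j\omega t} e^{-\gamma x}}{\dfrac{A_1 e^{j\omega t}}{Z_o}\, e^{\gamma x} - \dfrac{A_2 e^{j\omega t}}{Z_o}\, e^{-\gamma x}} \\
&= Z_o\, \frac{e^{\gamma x} + \rho_v e^{-\gamma x}}{e^{\gamma x} - \rho_v e^{-\gamma x}}
\end{aligned}$$

Using Eq. (5.31) for $\rho_v$

$$\begin{aligned}
Z_x &= \left[\frac{e^{\gamma x} + \dfrac{Z_L - Z_o}{Z_L + Z_o}\, e^{-\gamma x}}{e^{\gamma x} - \dfrac{Z_L - Z_o}{Z_L + Z_o}\, e^{-\gamma x}}\right] Z_o
= \left[\frac{Z_L e^{\gamma x} + Z_o e^{\gamma x} + Z_L e^{-\gamma x} - Z_o e^{-\gamma x}}{Z_L e^{\gamma x} + Z_o e^{\gamma x} - Z_L e^{-\gamma x} + Z_o e^{-\gamma x}}\right] Z_o \\
&= \left[\frac{Z_L\left(e^{\gamma x} + e^{-\gamma x}\right) + Z_o\left(e^{\gamma x} - e^{-\gamma x}\right)}{Z_o\left(e^{\gamma x} + e^{-\gamma x}\right) + Z_L\left(e^{\gamma x} - e^{-\gamma x}\right)}\right] Z_o
= \left[\frac{Z_L + Z_o\,\dfrac{e^{\gamma x} - e^{-\gamma x}}{e^{\gamma x} + e^{-\gamma x}}}{Z_o + Z_L\,\dfrac{e^{\gamma x} - e^{-\gamma x}}{e^{\gamma x} + e^{-\gamma x}}}\right] Z_o
\end{aligned}$$

The hyperbolic functions sinh(x), cosh(x), and tanh(x) are defined as follows

$$\begin{aligned}
\sinh(x) &= \frac{e^{x} - e^{-x}}{2} = -j\sin(jx) \\
\cosh(x) &= \frac{e^{x} + e^{-x}}{2} = \cos(jx) \\
\tanh(x) &= \frac{\sinh(x)}{\cosh(x)} = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} = -j\tan(jx)
\end{aligned} \tag{5.45}$$

Introducing the hyperbolic tangent allows us to express $Z_x$ compactly in terms of load impedance and the line’s characteristic impedance and propagation constant as

$$Z_x = \left[\frac{Z_L + Z_o\tanh(\gamma x)}{Z_o + Z_L\tanh(\gamma x)}\right] Z_o \tag{5.46}$$

On a loss-free line ($\alpha = 0$), propagation constant $\gamma = \alpha + j\beta = j\beta$, so

$$\tanh(\gamma x) = \frac{e^{j\beta x} - e^{-j\beta x}}{e^{j\beta x} + e^{-j\beta x}} = \frac{2j\sin(\beta x)}{2\cos(\beta x)} = j\tan(\beta x)$$

and Eq. (5.46) reduces to

$$Z_x = \left[\frac{Z_L + jZ_o\tan(\beta x)}{Z_o + jZ_L\tan(\beta x)}\right] Z_o \tag{5.47}$$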
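The three equivalent forms of the line impedance derived above can be cross-checked numerically. In the sketch below the function names, the 50 Ω line, the load of 30 + j40 Ω, and the small attenuation constant used for the lossy case are all assumed for illustration; 𝛽 is set to 2𝜋 so that distance is measured in wavelengths.

```python
import cmath
import math

def line_impedance(ZL, Zo, gamma, x):
    """Eq. (5.46): line impedance at distance x from the load."""
    t = cmath.tanh(gamma * x)
    return Zo * (ZL + Zo * t) / (Zo + ZL * t)

def line_impedance_lossless(ZL, Zo, beta, x):
    """Eq. (5.47): loss-free special case (alpha = 0)."""
    t = math.tan(beta * x)
    return Zo * (ZL + 1j * Zo * t) / (Zo + 1j * ZL * t)

def line_impedance_from_waves(ZL, Zo, gamma, x):
    """Exponential form preceding Eq. (5.46), using rho_v from Eq. (5.31)."""
    rho = (ZL - Zo) / (ZL + Zo)
    fwd, back = cmath.exp(gamma * x), cmath.exp(-gamma * x)
    return Zo * (fwd + rho * back) / (fwd - rho * back)

# Assumed illustrative values
Zo, ZL = 50.0, 30 + 40j
beta = 2 * math.pi      # rad per wavelength
x = 0.2                 # wavelengths from the load

z_tanh = line_impedance(ZL, Zo, 1j * beta, x)
z_tan = line_impedance_lossless(ZL, Zo, beta, x)
z_waves = line_impedance_from_waves(ZL, Zo, 1j * beta, x)
z_at_load = line_impedance_lossless(ZL, Zo, beta, 0.0)

gamma_lossy = complex(0.05, beta)   # assumed small attenuation, Np per wavelength
z_lossy_tanh = line_impedance(ZL, Zo, gamma_lossy, x)
z_lossy_waves = line_impedance_from_waves(ZL, Zo, gamma_lossy, x)
print(z_tanh, z_tan, z_waves, z_at_load)
```

At x = 0 every form returns the load impedance itself, which is a quick check that the sign conventions have been carried through correctly.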
There are three special cases of this equation that find extensive applications in impedance matching and RF system design. First, if the line is short-circuited (i.e. $Z_L = 0$), then the line impedance $Z_x$ reduces at every point to a pure reactance, and hence the line admittance $Y_x = 1/Z_x$ is a pure susceptance. These are given by

$$\left.\begin{aligned}
Z_x &= jZ_o\tan(\beta x) \\
Y_x &= -jY_o\cot(\beta x) \\
y_x &= -j\cot(\beta x)
\end{aligned}\right\} \quad \text{Short-circuit stub} \quad (Y_o = 1/Z_o;\ y_x = Y_x/Y_o) \tag{5.48}$$

Second, if the line is open-circuited (i.e. $Z_L = \infty$), the line impedance also reduces to a pure reactance (and the line admittance to a pure susceptance) given by

$$\left.\begin{aligned}
Z_x &= -jZ_o\cot(\beta x) \\
Y_x &= jY_o\tan(\beta x) \\
y_x &= j\tan(\beta x)
\end{aligned}\right\} \quad \text{Open-circuit stub} \tag{5.49}$$

These behaviours provide a means of creating pure inductors and capacitors at higher frequencies by using a short piece of short- or open-circuited transmission line known as a resonant line section or tuning stub. A short-circuit stub is usually preferred because of fringing effects in open-circuit stubs. Third, when x = 𝜆/4, the term 𝛽x equals 𝜋/2. This means that the second term in the denominator and numerator of Eq. (5.47) dominates since tan(𝛽x) → ∞ at this point. Therefore, the input impedance of a loss-free quarter-wave line of characteristic impedance $Z_o$ terminated with load $Z_L$ is obtained as

$$\begin{aligned}
Z_{\lambda/4} &= \left[\frac{Z_L + jZ_o\tan(\beta\lambda/4)}{Z_o + jZ_L\tan(\beta\lambda/4)}\right] Z_o = \left[\frac{Z_L + jZ_o\tan(\pi/2)}{Z_o + jZ_L\tan(\pi/2)}\right] Z_o \\
&= \left[\frac{jZ_o\tan(\pi/2)}{jZ_L\tan(\pi/2)}\right] Z_o = \frac{Z_o}{Z_L}\, Z_o \\
&= Z_o^2/Z_L
\end{aligned} \tag{5.50}$$
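The three special cases can be illustrated with a short sketch. The function names and the 50 Ω and 100 Ω values are assumed for illustration; lengths are in wavelengths so that 𝛽x = 2𝜋 × length. The transformer impedance is obtained by rearranging Eq. (5.50) for a desired input impedance, which is one common way this result is applied.

```python
import math

def short_circuit_stub(Zo, length_wl):
    """Eq. (5.48): input impedance of a loss-free short-circuited stub."""
    return 1j * Zo * math.tan(2 * math.pi * length_wl)

def open_circuit_stub(Zo, length_wl):
    """Eq. (5.49): input impedance of a loss-free open-circuited stub."""
    return -1j * Zo / math.tan(2 * math.pi * length_wl)

def quarter_wave_input(Zo, ZL):
    """Eq. (5.50): input impedance of a loss-free quarter-wave line."""
    return Zo ** 2 / ZL

def lossless_line_impedance(ZL, Zo, length_wl):
    """Eq. (5.47), used here only to cross-check Eq. (5.50) numerically."""
    t = math.tan(2 * math.pi * length_wl)
    return Zo * (ZL + 1j * Zo * t) / (Zo + 1j * ZL * t)

Zo = 50.0                                   # assumed characteristic impedance, ohm
z_short = short_circuit_stub(Zo, 1 / 8)     # positive reactance (inductive)
z_open = open_circuit_stub(Zo, 1 / 8)       # negative reactance (capacitive)

# Quarter-wave section chosen so that a 100 ohm load looks like 50 ohm at its input
Z_target, Z_load = 50.0, 100.0
Z_section = math.sqrt(Z_target * Z_load)    # rearranged from Eq. (5.50)
z_in = quarter_wave_input(Z_section, Z_load)
z_in_check = lossless_line_impedance(Z_load, Z_section, 0.25)
print(z_short, z_open, Z_section, z_in, z_in_check)
```

An eighth-wavelength stub therefore presents a reactance equal in magnitude to the characteristic impedance, inductive when short-circuited and capacitive when open-circuited.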
Before leaving this topic, it is helpful to introduce admittance parameters, namely characteristic admittance Y o = 1/Z o , load admittance Y L = 1/Z L , and line admittance Y x = 1/Z x . Also, we divide impedances by Z o and admittances by Y o to obtain normalised parameters (denoted using lowercase letters) such as normalised characteristic impedance zo = Z o /Z o = 1, normalised characteristic admittance yo = Y o /Y o = 1, normalised load admittance yL = Y L /Y o = Z o /Z L , and normalised line admittance yx = Y x /Y o = Z o /Z x . We may then manipulate Eq. (5.47) to obtain an expression for normalised line admittance yx at distance x from normalised load yL as } { Z + jZ L tan(𝛽x) Zo + jZ L tan(𝛽x) Zo yx = = o = Zo Zx Zo [ZL + jZ o tan(𝛽x)] ZL + jZ o tan(𝛽x) Z ∕Z + j tan(𝛽x) = o L 1 + jZo ∕ZL tan(𝛽x) y + j tan(𝛽x) = L (5.51) 1 + jyL tan(𝛽x) Worked Example 5.2
Transmission Line Analysis 1
A loss-free 100 Ω transmission line is terminated in a load 80 + j60 Ω. We wish to analyse various aspects of the performance of the line.
(a) Determine the voltage reflection coefficient.
(b) Determine the voltage standing wave ratio.
(c) Determine impedance at a distance 0.3𝜆 from load.
(d) Determine impedance at a distance 𝜆/4 from load.
(e) Discuss the variation of impedance with distance from load and identify all salient features along with a comparison with the line’s standing wave pattern.
A 100 Ω sinusoidal signal generator of frequency 25 MHz which on open circuit gives 50 V rms is now connected to the input of a 2 m length of this line terminated by the above load.
(f) Sketch a phasor diagram showing the incident, reflected and resultant rms voltages at the load.
(g) Calculate the average power consumed by the load.
(h) Determine the maximum and minimum rms voltages on the line.
(i) Determine the impedance presented to the generator at the line input.
(a) Voltage reflection coefficient

$$
\rho_v = \frac{Z_L - Z_o}{Z_L + Z_o} = \frac{80 + j60 - 100}{80 + j60 + 100} = \frac{-20 + j60}{180 + j60} = \frac{63.246\angle 108.435^\circ}{189.737\angle 18.435^\circ} = \frac{1}{3}\angle 90^\circ
$$

(b) Voltage standing wave ratio

$$
\text{VSWR} = \frac{1 + |\rho_v|}{1 - |\rho_v|} = \frac{1 + 1/3}{1 - 1/3} = \frac{4/3}{2/3} = 2
$$

(c) At distance x = 0.3𝜆 from load, 𝛽x = 2𝜋/𝜆 × 0.3𝜆 = 0.6𝜋. Impedance Z x at this point is obtained using Eq. (5.47) since the line is loss-free

$$
\begin{aligned}
Z_x|_{x=0.3\lambda} \equiv Z_{0.3\lambda} &= \frac{80 + j60 + j100\tan(0.6\pi)}{100 + j(80 + j60)\tan(0.6\pi)} \times 100\\
&= \frac{80 - j247.77}{284.661 - j246.215} \times 100 = \frac{260.364\angle{-72.11^\circ}}{376.369\angle{-40.858^\circ}} \times 100\\
&= 69.18\angle{-31.25^\circ}\ \Omega = 59.14 - j35.89\ \Omega
\end{aligned}
$$

(d) Impedance at a distance 𝜆/4 from load is determined using Eq. (5.50)

$$
Z_{\lambda/4} = \frac{Z_o^2}{Z_L} = \frac{100^2}{80 + j60} = \frac{10^4(80 - j60)}{(80 + j60)(80 - j60)} = \frac{10^4(80 - j60)}{80^2 + 60^2} = 80 - j60\ \Omega = 100\angle{-36.87^\circ}\ \Omega
$$
(e) A full discussion requires plots of resultant amplitude and line impedance versus distance from load as shown in Figure 5.11. We note the following features and trends, which apply in general to all transmission lines. Maximum impedance Z max occurs at antinodes and is purely resistive with a value given by

$$
Z_{\max} = Z_o \times \text{VSWR} \tag{5.52}
$$

(i) Minimum impedance Z min occurs at nodes and is also purely resistive with a value given by

$$
Z_{\min} = Z_o/\text{VSWR} \tag{5.53}
$$

(ii) Looking at line sections between adjacent antinode and node, the imaginary part of impedance, called reactance, alternates between positive and negative as one goes from one section to the next. The line is inductive when reactance is positive and is capacitive when reactance is negative. The reactance is zero at nodes and antinodes.
(iii) The real part of impedance, called resistance, is always positive, having a minimum value of Z o /VSWR at nodes and a maximum value of Z o × VSWR at antinodes.
Figure 5.11 Worked Example 5.2e: Standing wave pattern (resultant amplitude) and line impedance (resistance and reactance, in Ω) plotted against distance from load, from 0 to 2𝜆, on transmission line with Z o = 100 Ω, Z L = 80 + j60 Ω, and VSWR = 2; Reflection-induced phase shift = 90°.
(iv) The standing wave pattern and line impedance are periodic along the line, having a period of half a wavelength (𝜆/2). This of course assumes a loss-free line. If there are losses then the standing wave pattern would be damped down as the signal propagates down the line due to an exponentially decaying amplitude.
(f) At the instant of connection, the signal is not yet reflected; therefore, the source sees impedance Z o . The scenario is as illustrated in Figure 5.12a. The rms voltage V i applied to the line, which travels down the loss-free line as the incident voltage, is obtained by voltage division between Rs and Z o as

$$
V_i = \frac{Z_o}{R_s + Z_o}V_s = \frac{100}{100 + 100} \times 50 = 25\ \text{V rms}
$$
A phasor diagram depicts the relative phases and amplitudes of sinusoidal signals of the same frequency that are summed to obtain a resultant signal. One of the signals must be chosen as reference and its phase arbitrarily set to zero. On transmission lines, the reference is usually the incident voltage. Thus, denoting the incident rms voltage, reflected rms voltage, and resultant rms voltage at load as V i , V r , and V L respectively, we have

$$
\begin{aligned}
V_i &= 25\angle 0^\circ\ \text{V rms}\\
V_r &= \rho_v V_i = \tfrac{1}{3}\angle 90^\circ \times 25\angle 0^\circ = 8.33\angle 90^\circ\ \text{V rms}\\
V_L &= V_i + V_r = 25\angle 0^\circ + 8.33\angle 90^\circ = 25 + j8.33 = 26.35\angle 18.4^\circ\ \text{V rms}
\end{aligned}
$$

Note that V L is the voltage delivered to the load, which is also given by Eq. (5.33) in terms of the voltage transmission coefficient and incident voltage as

$$
V_L = \tau_v V_i = (1 + \rho_v)V_i = \left(1 + \tfrac{1}{3}\angle 90^\circ\right) \times 25\angle 0^\circ = 26.35\angle 18.4^\circ\ \text{V rms}
$$

The phasor diagram depicting V i , V r , and V L is shown in Figure 5.12b.
Figure 5.12 Worked Example 5.2: (a) Line scenario at instant of connection (V s = 50 V rms, R s = 100 Ω, Z o = 100 Ω); (b) Phasor diagram (V i = 25∠0° V rms, V r = 8.33∠90° V rms, V L = 26.35∠18.4° V rms).
(g) You may wish to refer to Section 3.5.2 for a discussion of active or average power consumed in a load based on Figure 3.17 if you are new to the concepts of reactive and active powers. With Z L = 80 + j60 = 100∠36.87° ≡ |Z L |∠Z L , average power consumed by the load is

$$
P_L = \frac{V_{L\,\text{rms}}^2}{|Z_L|}\cos(\angle Z_L) = \frac{26.35^2}{100}\cos(36.87^\circ) = 5.56\ \text{W}
$$

(h) Maximum and minimum rms voltages V rmsmax and V rmsmin , respectively, occur at an antinode and a node and are given by

$$
\begin{aligned}
V_{\text{rms}\max} &= V_{i\,\text{rms}}(1 + |\rho_v|) = 25(1 + 1/3) = 33.33\ \text{V rms}\\
V_{\text{rms}\min} &= V_{i\,\text{rms}}(1 - |\rho_v|) = 25(1 - 1/3) = 16.67\ \text{V rms}
\end{aligned}
$$
(i) Since no information is provided, we assume a velocity ratio of 1 on the line, which means that signals propagate at the speed of light. The wavelength of the signal is therefore 𝜆 = v/f = (3 × 10^8)/(25 × 10^6) = 12 m. Thus, a 2 m length of the line is 𝜆/6 in terms of wavelength and what is required is to use Eq. (5.47) to calculate line impedance Z 𝜆/6

$$
\begin{aligned}
Z_{\lambda/6} &= Z_o\left[\frac{Z_L + jZ_o\tan(\beta\lambda/6)}{Z_o + jZ_L\tan(\beta\lambda/6)}\right] = Z_o\left[\frac{Z_L + jZ_o\tan(\pi/3)}{Z_o + jZ_L\tan(\pi/3)}\right]\\
&= \frac{80 + j60 + j100\tan(\pi/3)}{100 + j(80 + j60)\tan(\pi/3)} \times 100 = \frac{80 + j233.21}{-3.923 + j138.56} \times 100\\
&= 166.53 - j62.45\ \Omega = 177.86\angle{-20.56^\circ}\ \Omega
\end{aligned}
$$

Worked Example 5.3
Transmission Line Analysis 2
Measurements at 100 MHz show that a loss-free 75 Ω transmission line with air dielectric has a voltage maximum of 25 V and a voltage minimum of 12.5 V (located 45 cm from the load). Determine:
(a) Operating wavelength.
(b) Voltage reflection coefficient.
(c) Load impedance.
(d) Maximum line impedance.
(e) Return loss.
(f) Mismatch loss.
(g) Power transfer efficiency.
(h) Shortest distance between points of voltage minimum and maximum.
(a) Air dielectric has relative permittivity 𝜀r = 1, which gives propagation speed on the line v = c/√𝜀r = 3 × 10^8 m/s, and hence operating wavelength

$$
\lambda = \frac{v}{f} = \frac{3 \times 10^8}{100 \times 10^6} = 3\ \text{m}
$$

(b) To determine the voltage reflection coefficient 𝜌v , we first calculate VSWR using the Amax and Amin information provided, then the magnitude of 𝜌v using VSWR, and finally the angle of 𝜌v using the information given on the location of the voltage minimum

$$
\text{VSWR} = \frac{A_{\max}}{A_{\min}} = \frac{25}{12.5} = 2. \quad \text{Thus, } |\rho_v| = \frac{\text{VSWR} - 1}{\text{VSWR} + 1} = \frac{2 - 1}{2 + 1} = \frac{1}{3}
$$

From Eq. (5.41), voltage minima occur at

$$
x_d = \left(n + \frac{1}{2} + \frac{\theta_v}{2\pi}\right)\frac{\lambda}{2}
$$

and the first such minimum corresponds to n = 0 and occurs at location

$$
x_{d\min} = \left(\frac{1}{2} + \frac{\theta_v}{2\pi}\right)\frac{\lambda}{2} = 45\ \text{cm (given)}
$$

Thus, phase angle of 𝜌v is

$$
\theta_v = 4\pi(x_{\min}/\lambda - 1/4) = 720^\circ(45/300 - 0.25) = -72^\circ
$$

Hence

$$
\rho_v = |\rho_v|\angle\theta_v = \frac{1}{3}\angle{-72^\circ} = 0.103 - j0.317
$$

(c) Load impedance is determined from Z o and 𝜌v as follows using the equation just before Eq. (5.31)

$$
\begin{aligned}
Z_L &= \frac{1 + \rho_v}{1 - \rho_v}Z_o = \frac{1.103 - j0.317}{0.897 + j0.317} \times 75\\
&= \frac{82.73 - j23.78}{0.897 + j0.317} = \frac{86.07\angle{-16.04^\circ}}{0.9514\angle 19.46^\circ}\\
&= 90.47\angle{-35.5^\circ}\ \Omega = 73.66 - j52.54\ \Omega
\end{aligned}
$$
(d) Maximum line impedance Z max = Z o × VSWR = 75 × 2 = 150 Ω.
(e) Return loss, RL = −20 log10(|𝜌v |) = −20 log10(1/3) = 9.54 dB.
(f) Mismatch loss, ML = −10 log10(1 − |𝜌v |²) = −10 log10(8/9) = 0.51 dB.
(g) Power transfer efficiency 𝜂 = (1 − |𝜌v |²) × 100% = (8/9) × 100% = 88.9%.
(h) Shortest distance between points of voltage minimum and maximum is the separation between a node and an adjacent antinode, which is 𝜆/4 = 75 cm.
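The numerical results of Worked Examples 5.2 and 5.3 can be cross-checked with a few lines of code. The following Python sketch is only an illustration: the function and variable names are our own, and Eq. (5.47) is assumed to have the form used in Worked Example 5.2c.

```python
import cmath
import math

def reflection_coefficient(z_load, z0):
    """Voltage reflection coefficient at the load."""
    return (z_load - z0) / (z_load + z0)

def vswr_from_rho(rho):
    """Voltage standing wave ratio from the reflection coefficient."""
    return (1 + abs(rho)) / (1 - abs(rho))

def line_impedance(z_load, z0, x_over_lambda):
    """Eq. (5.47) for a loss-free line, with distance x given in wavelengths."""
    t = math.tan(2 * math.pi * x_over_lambda)
    return z0 * (z_load + 1j * z0 * t) / (z0 + 1j * z_load * t)

# Worked Example 5.2: Zo = 100 ohm, ZL = 80 + j60 ohm
rho_52 = reflection_coefficient(80 + 60j, 100.0)   # part (a)
vswr_52 = vswr_from_rho(rho_52)                    # part (b)
z_03 = line_impedance(80 + 60j, 100.0, 0.3)        # part (c)
z_gen = line_impedance(80 + 60j, 100.0, 2 / 12)    # part (i): 2 m of line, 12 m wavelength

# Worked Example 5.3: Zo = 75 ohm, Vmax = 25 V, Vmin = 12.5 V,
# first voltage minimum 45 cm from the load, wavelength 3 m
vswr_53 = 25 / 12.5
rho_mag = (vswr_53 - 1) / (vswr_53 + 1)
theta_v = 4 * math.pi * (0.45 / 3.0 - 0.25)        # n = 0 minimum of Eq. (5.41)
rho_53 = cmath.rect(rho_mag, theta_v)
z_load_53 = 75.0 * (1 + rho_53) / (1 - rho_53)     # part (c)
return_loss_db = -20 * math.log10(rho_mag)         # part (e)
mismatch_loss_db = -10 * math.log10(1 - rho_mag ** 2)  # part (f)

print(abs(rho_52), math.degrees(cmath.phase(rho_52)), vswr_52)
print(z_03, z_gen)
print(math.degrees(theta_v), z_load_53, return_loss_db, mismatch_loss_db)
```

Working with complex numbers directly avoids the repeated conversions between rectangular and polar form that the hand calculations require, so small rounding differences from the printed values are to be expected.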
5.3.7 Line Termination and Impedance Matching
We have seen that, whenever a transmission line of characteristic impedance Z o is terminated in a load Z L (as shown in Figure 5.10), a fraction 𝜌v = (Z L − Z o )/(Z L + Z o ) of the voltage signal is reflected back. The load Z L
could be another cable. For example, TV aerials use 75 Ω coaxial cable, whereas LANs use 50 Ω coax. Reflections represent a waste of some of the power intended for the load. Reflections can also give rise to unpleasant echo (due to multiple reflections from both ends of the line), distortion, instability at the source, or damage to sensitive system components. In extreme cases of high-power connection to a transmit-antenna, reflections could even lead to electrical breakdown at points of voltage maximum on the line. The following three special cases of line termination have consequences that are readily analysed. Matched circuit, Z L = Z o : the line is said to be terminated with a matched load (i.e. a load matched to the characteristic impedance of the line). In that case 𝜌v = 0, 𝜌i = 0 and there is no reflection and hence no standing waves along the line. The incident voltage is delivered in its entirety to the matched load. Assuming a loss-free line, Eq. (5.47) shows that the line impedance Z x at every point along the line equals the characteristic impedance Z o of the line, and Eq. (5.42) shows that wave amplitude is the same at every point of the line and equals the incident amplitude A1 . Open circuit, Z L = ∞: in an open circuit, 𝜌v = 1, 𝜌i = −1. On switching on the source, a current pulse i1 travels down the line, reaching the load after a time l/vp whence a reflected current pulse i2 = i1 is produced travelling in the opposite direction. The resultant current at every point on the line reached by i2 is i = i1 − i2 = 0. After a transient time 2l/vp this reflected current i2 reaches the source end so that current is zero everywhere on the line. Considering the steady-state open-circuit condition along the line, Eq. (5.42) indicates that maximum voltage amplitude Amax = 2A1 and minimum voltage amplitude Amin = 0 at antinodes and nodes, respectively. With reflection-induced phase shift 𝜃 v = 0, Eq. 
(5.41) gives antinode locations at x = 0, 𝜆/2, 𝜆, …; and node locations at 𝜆/4, 3𝜆/4, … Since line impedance under open-circuit condition follows from Eq. (5.47) as

$$
Z_x|_{Z_L=\infty} \equiv Z_{oc} = Z_o\left[\frac{Z_L + jZ_o\tan(\beta x)}{Z_o + jZ_L\tan(\beta x)}\right] = Z_o\left[\frac{Z_L}{jZ_L\tan(\beta x)}\right] = \frac{Z_o}{j\tan(\beta x)} \tag{5.54}
$$

it follows that at antinodes (x = 0, 𝜆/2, 𝜆, …) where voltage is maximum, impedance is infinite (since tan 𝛽x = 0) and therefore current is zero. At nodes (x = 𝜆/4, 3𝜆/4, …), we have tan(𝛽x) = tan((2𝜋/𝜆) × (𝜆/4)) = tan(𝜋/2) → ∞, etc., which means that impedance → 0; but voltage is zero, so current is also zero. Therefore steady-state current is zero in an open-circuit line.
Short circuit, Z L = 0: in a short circuit, 𝜌v = −1, 𝜌i = 1. On switching on the source, a current pulse i1 travels down the line, reaching the load after a time l/vp whence a reflected current pulse i2 = −i1 is produced travelling in the opposite direction. The resultant current at every point on the line reached by i2 is i = i1 − i2 = 2i1 , and this grows indefinitely after time 2l/vp as further reflections add to the current. Line impedance under short-circuit condition follows from Eq. (5.47) as

$$
Z_x|_{Z_L=0} \equiv Z_{sc} = Z_o\left[\frac{jZ_o\tan(\beta x)}{Z_o}\right] = jZ_o\tan(\beta x) \tag{5.55}
$$

Thus, from Eqs. (5.54) and (5.55), the product of open-circuit line impedance and short-circuit line impedance equals the square of the line’s characteristic impedance

$$
Z_{oc}Z_{sc} = Z_o^2 \tag{5.56}
$$
To avoid reflections due to a mismatch, proper termination must be used with Z L = Z o . A range of methods exist to match two cables of different characteristic impedances, or to match a transmission line of characteristic impedance Z o with an electrical load or system having input impedance Z L ≠ Z o . We briefly introduce the methods of quarter-wave transformer, attenuation pad, and single-stub matching, and present a detailed design example on the last.
Figure 5.13 Quarter-wave transformer: (a) Matching resistor R to Z o ; (b) Matching impedance Z L to Z o . In both cases the 𝜆/4 section of characteristic impedance Z o1 presents Z in = Z o1 ²/R = Z o to the Z o line; in (b) the load Z L is first connected through length d1 of a Z o2 line.
Quarter-wave transformer: Eq. (5.50) indicates that the impedance looking into the input terminals of an arrangement or subsystem consisting of a quarter-wavelength long transmission line of characteristic impedance Z o1 that is terminated by a resistive load R (as illustrated in Figure 5.13a) is

$$
Z_{in} = \frac{Z_{o1}^2}{R} \tag{5.57}
$$

It means that when this arrangement is connected to a transmission line of characteristic impedance Z o , there will be impedance matching if

$$
Z_o = Z_{in} = \frac{Z_{o1}^2}{R} \quad \text{or} \quad Z_{o1} = \sqrt{Z_o R} \tag{5.58}
$$
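As an illustration of Eqs. (5.57) and (5.58), the Python sketch below designs a quarter-wave section and then examines how the match degrades away from the design frequency. The values Z o = 50 Ω and R = 100 Ω are hypothetical, and `f_ratio` is our own name for the ratio of operating frequency to the frequency at which the section is exactly a quarter wavelength long; the section is assumed loss-free so that Eq. (5.47) applies.

```python
import math

def quarter_wave_section_impedance(z0, r_load):
    """Eq. (5.58): characteristic impedance Zo1 of the matching section."""
    return math.sqrt(z0 * r_load)

def section_input_impedance(r_load, z01, f_ratio):
    """Input impedance of the Zo1 section, whose electrical length is (pi/2)*f_ratio."""
    t = math.tan(0.5 * math.pi * f_ratio)
    return z01 * (r_load + 1j * z01 * t) / (z01 + 1j * r_load * t)

def vswr(z_in, z0):
    """VSWR on the Zo line when it is terminated by impedance z_in."""
    mag = abs((z_in - z0) / (z_in + z0))
    return (1 + mag) / (1 - mag)

z0, r_load = 50.0, 100.0                       # hypothetical line and load values
z01 = quarter_wave_section_impedance(z0, r_load)

# VSWR on the main line at several frequencies relative to the design frequency
sweep = {}
for f_ratio in (0.6, 0.8, 1.0, 1.2, 1.4):
    sweep[f_ratio] = vswr(section_input_impedance(r_load, z01, f_ratio), z0)

print(round(z01, 2), {k: round(v, 3) for k, v in sweep.items()})
```

The sweep gives a VSWR of essentially 1 at the design frequency and progressively larger values on either side of it, which is the narrowband behaviour expected of a single quarter-wave section.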
This relationship reveals one way to match a resistive load R to a transmission line (of characteristic impedance) Z o : we connect R to the line through a 𝜆/4-length of a cable of characteristic impedance Z o1 selected to satisfy Eq. (5.58). The desired value of Z o1 is achieved using the physical dimensions a and b of the 𝜆/4-length interface cable and the relative permittivity 𝜀r of the cable’s dielectric medium, as defined in Figure 5.1. Once the correct values of a, b and 𝜀r have been set, Eq. (5.58) will be satisfied at all frequencies since Z o and R are practically frequency independent over a wide range of frequencies. However, a perfect match is only achieved while the interface cable is of length exactly 𝜆/4, and for a physical cable length l this will be the case at one frequency f c given by

$$
f_c = \frac{c}{4l\sqrt{\varepsilon_r}} \tag{5.59}
$$

Here, 𝜀r is the relative permittivity of the dielectric medium between the inner and outer conductors of the interface cable and c = 3 × 10^8 m/s is the speed of light. For example, for 𝜀r = 1, a cable length l = 150 cm is a
quarter wavelength only at f c = 50 MHz. When one departs from this frequency, up or down, the specification will no longer be satisfied, and a mismatch will develop that leads to VSWR > 1, which increases as the difference between signal frequency and f c increases. The quarter-wave transformer described above transforms any real impedance R to another real impedance Z o = Z in given by Eq. (5.57). Since a line terminated by a general impedance Z L is purely resistive at its nodes and antinodes (see Figure 5.11), it is possible to apply the quarter-wave transformer method to transform a non-real load impedance Z L to a real impedance Z o in two stages as shown in Figure 5.13b. First connect Z L to length d1 of a cable of characteristic impedance Z o2 . The impedance seen looking into the input of this Z o2 -cable has a purely resistive value R determined by whether d1 is a nodal or anti-nodal distance according to the relations

$$
R = \begin{cases} Z_{o2} \times \text{VSWR}, & d_1 = (\theta_v/\pi)\lambda/4\\ Z_{o2}/\text{VSWR}, & d_1 = (1 + \theta_v/\pi)\lambda/4 \end{cases};\qquad \frac{Z_L - Z_{o2}}{Z_L + Z_{o2}} \equiv |\rho_v|\angle\theta_v;\quad \text{VSWR} = \frac{1 + |\rho_v|}{1 - |\rho_v|} \tag{5.60}
$$

With Z L transformed into R in this way, connecting this Z L -terminated Z o2 -cable of length d1 to the quarter-wave transformer as discussed above achieves the desired match to Z o .
Attenuation pads: an attenuation pad may be inserted to match one cable of, say, 75 Ω to another cable or device of a different impedance such as 50 Ω. Figure 5.14 shows three different arrangements. The minimum loss pad uses two resistors whereas the T-network and 𝜋-network pads make use of three resistors. Besides impedance matching, attenuation pads also serve an important purpose of introducing a controlled amount of attenuation into the line, which, for example, helps to prevent sustained multiple reflections in the event of a mismatch at both ends or unwanted discontinuities in the line, especially at joints.
The correct design values have been inserted in the minimum loss pad to match a 75 Ω cable to a 50 Ω cable and vice versa. To see that there is impedance matching in both directions, consider that the 75 Ω cable sees a 43.3 Ω resistance connected in series with a parallel connection of 86.6 Ω and 50 Ω resistances. Thus, the resistance seen by the 75 Ω cable looking into the minimum loss pad input is

$$
43.3 + \frac{86.6 \times 50}{86.6 + 50} = 43.3 + 31.7 = 75\ \Omega
$$

So, the minimum loss pad with the indicated resistance values does successfully match a 75 Ω cable to a 50 Ω cable. Now consider the matching achieved for signal flowing in the opposite direction. Looking back into the minimum loss pad, the 50 Ω cable also sees a 50 Ω impedance since it sees a parallel connection of 86.6 Ω with a series connection of 43.3 Ω and 75 Ω, which is equivalent to

$$
\frac{86.6 \times (43.3 + 75)}{86.6 + 43.3 + 75} = 50\ \Omega
$$

Figure 5.14 Attenuation pads: minimum loss pad (43.3 Ω and 86.6 Ω resistors between a 75 Ω cable and a 50 Ω cable), T-network and 𝜋-network (resistors R1 , R2 , R3 between an R o1 cable and an R o2 cable).

Single-stub matching: the stub tuner technique matches a load Z L to a transmission line of characteristic impedance Z o by connecting length d2 of a short-circuited Z o -line (called a stub) in parallel at distance d1 from the load, as shown in Figure 5.15. The stub is connected to the line at SS′. The design task is to calculate d2 and d1 . Recalling that the line admittance of a stub is given by Eq. (5.48), the idea behind single-stub matching is that d1 is the shortest distance from the load at which the real part of line admittance Y x equals line conductance Y o (= 1/Z o ), i.e. Y x = Y o + jB, and d2 is the length of the stub with input admittance −jB. Since the total admittance of two circuit elements that are connected in parallel is the sum of their individual admittances, summing the admittances of the stub and the line at SS′ gives a total admittance Y o (and hence total impedance Z o ) at SS′, which achieves the required matching of Z L to the line. Expressing in terms of normalised admittance, yx = 1 + jb and d2 is the length of the stub with normalised input admittance −jb.
In the next worked example, we show that, to match a load of normalised admittance yL = a − jh, the distance d1 is the smallest positive value in the set

$$
d_1 = \left\{\frac{1}{2\pi}\tan^{-1}\left(\frac{-h \pm \sqrt{a[h^2 + (a - 1)^2]}}{h^2 + a(a - 1)}\right) + \frac{n}{2}\right\}\lambda,\quad n = 0, 1, 2, \cdots \tag{5.61}
$$

This value of d1 is then employed to calculate b as

$$
b = \frac{h\tan^2\beta d_1 + (1 - h^2 - a^2)\tan\beta d_1 - h}{(1 + h\tan\beta d_1)^2 + a^2\tan^2\beta d_1} \tag{5.62}
$$

And finally, the design is completed by obtaining distance d2 as the smallest positive value in the set

$$
d_2 = \left\{\frac{\tan^{-1}(1/b)}{2\pi} + \frac{n}{2}\right\}\lambda,\quad n = 0, 1, 2, \cdots \tag{5.63}
$$
The quickest way to obtain the values of d1 and d2 needed for single-stub matching is to implement the above three equations in a short computer program on a software platform such as MATLAB. However, the Smith chart (briefly introduced later) may also be used to carry out a manual design.
Figure 5.15 Single-stub matching: a short-circuited stub of length d2 connected in parallel to the Z o -line at SS′, at distance d1 from the load Y L = 1/Z L .
Worked Example 5.4
Single-stub Matching
We wish to match a load Z L = 30 + j25 Ω to a loss-free 50 Ω transmission line using the stub tuning technique. The design solution requires us to calculate the minimum distances d1 and d2 in Figure 5.16a so that there is no reflection at point S on the line. Figure 5.16b shows the normalised admittances that we wish to achieve. The solution is in two steps. First, we use Eq. (5.51) to determine the minimum distance d1 at which the real part of the normalised line admittance is 1, i.e. yx = 1 + jb, where jb is the normalised susceptance of the line at that point. Second, we use Eq. (5.48) to calculate length d2 of the stub that has normalised input admittance equal to −jb. When this stub is connected in parallel to the line at SS′, the total normalised admittance yx = 1 (which means that Y x = Y o and Z x = Z o , i.e. total impedance at SS′ equals Z o ), so the required matching of Z L to the Z o line has been accomplished. Setting yx in Eq. (5.51) equal to 1 + jb at x = d1 , we have

$$
y_{d_1} = \frac{y_L + j\tan\beta d_1}{1 + jy_L\tan\beta d_1} = 1 + jb
$$

Given Z o = 50 and Z L = 30 + j25, normalised load yL (which we will denote as a − jh for convenience) in the above equation is

$$
\begin{aligned}
y_L = \frac{1}{z_L} = \frac{Z_o}{Z_L} &= \frac{50}{30 + j25} = \frac{50(30 - j25)}{(30 + j25)(30 - j25)}\\
&= \frac{1500 - j1250}{30^2 + 25^2} = \frac{1500}{1525} - j\frac{1250}{1525} = \frac{60}{61} - j\frac{50}{61}\\
&\equiv a - jh;\quad a = 60/61,\ h = 50/61
\end{aligned}
$$
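Figure 5.16 Worked Example 5.4: (a) Short-circuit stub of length d2 connected at SS′, a distance d1 from the load Z L = 30 + j25 Ω on a line with Z o = R o = 50 Ω; (b) Normalised admittances, with 1 + jb on the line and −jb for the stub at SS′, giving y = 1.0.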
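Thus

$$
\begin{aligned}
1 + jb &= \frac{a - jh + j\tan\beta d_1}{1 + j(a - jh)\tan\beta d_1} = \frac{a + j(\tan\beta d_1 - h)}{1 + h\tan\beta d_1 + ja\tan\beta d_1}\\
&= \frac{a + a\tan^2\beta d_1}{(1 + h\tan\beta d_1)^2 + a^2\tan^2\beta d_1} + j\,\frac{h\tan^2\beta d_1 + (1 - h^2 - a^2)\tan\beta d_1 - h}{(1 + h\tan\beta d_1)^2 + a^2\tan^2\beta d_1}
\end{aligned}
$$

The real part being equal to 1 and imaginary part equal to b means that

$$
\begin{aligned}
&a + a\tan^2\beta d_1 = (1 + h\tan\beta d_1)^2 + a^2\tan^2\beta d_1\\
&\Rightarrow (h^2 + a^2 - a)\tan^2\beta d_1 + 2h\tan\beta d_1 + (1 - a) = 0
\end{aligned} \tag{5.64}
$$

and

$$
b = \frac{h\tan^2\beta d_1 + (1 - h^2 - a^2)\tan\beta d_1 - h}{(1 + h\tan\beta d_1)^2 + a^2\tan^2\beta d_1} \tag{5.65}
$$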
Equation (5.64) is a quadratic equation in tan 𝛽d1 , and therefore yields

$$
\tan\beta d_1 = \frac{-h \pm \sqrt{a[h^2 + (a - 1)^2]}}{h^2 + a^2 - a} = -0.01,\ -2.49 \quad (\text{putting } a = 60/61,\ h = 50/61)
$$

Taking the inverse tangent of both sides, recalling that 𝛽 = 2𝜋/𝜆 and that tan(𝜃) = tan(𝜃 + n𝜋), n = 0, 1, 2, …, we obtain

$$
\begin{aligned}
d_1 &= \{\tan^{-1}(-0.01, -2.49) + n\pi\}\frac{\lambda}{2\pi},\quad n = 0, 1, 2, \cdots\\
&= \{(-0.01, -1.189) + n\pi\}\frac{\lambda}{2\pi}\\
&= \left\{\frac{-0.01}{2\pi} + \frac{n}{2}\right\}\lambda \ \text{ or } \ \left\{\frac{-1.189}{2\pi} + \frac{n}{2}\right\}\lambda\\
&= \{-0.002, 0.4984, 0.9984, \cdots\}\lambda \ \text{ or } \ \{-0.1892, 0.3108, 0.8108, \cdots\}\lambda\\
&= 0.3108\lambda \quad (\text{choosing the smallest positive result})
\end{aligned}
$$

We are now able to determine b in Eq. (5.65) by using this value of d1 (and hence 𝛽d1 = 2𝜋/𝜆 × 0.3108𝜆 = 1.9527) along with h = 50/61 and a = 60/61 to obtain b = 0.82664. This means that the normalised admittance at point SS′ on the line is 1 + jb. An admittance of value −jb must therefore be connected in parallel at this point in order to make total normalised admittance equal to 1 + jb − jb = 1, which establishes a match with the line. Since we know from Eq. (5.48) that a stub of length x has admittance −jcot(𝛽x) = −j/tan(𝛽x), we determine stub length d2 as the value of x at which −j/tan(𝛽x) = −jb or tan(𝛽x) = 1/b. Thus

$$
\begin{aligned}
\beta d_2 &= \tan^{-1}(1/b) + n\pi = \tan^{-1}(1/0.82664) + n\pi = 0.88002 + n\pi\\
d_2 &= (0.88002 + n\pi)/\beta = (0.88002 + n\pi)\frac{\lambda}{2\pi},\quad n = 0, 1, 2, \cdots\\
&= \left\{\frac{0.88002}{2\pi} + \frac{n}{2}\right\}\lambda = \{0.1401, 0.6401, 1.1401, \cdots\}\lambda\\
&= 0.1401\lambda \quad (\text{choosing the smallest positive value})
\end{aligned}
$$

In conclusion, the load Z L = 30 + j25 Ω may be matched to a 50 Ω line by connecting in parallel to the line a short-circuit stub of length 0.1401𝜆 at a distance 0.3108𝜆 from the load.
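The single-stub design procedure of Eqs. (5.61) to (5.63) lends itself to a short program. The following is a minimal Python sketch rather than a definitive implementation: the function name is our own, a short-circuit stub on the same Z o line is assumed, and degenerate cases (an already matched load, or a zero denominator in Eq. (5.61)) are deliberately not handled.

```python
import math

def single_stub_design(z_load, z0):
    """Return (d1, d2, b), with d1 and d2 in wavelengths, per Eqs. (5.61)-(5.63)."""
    y_load = z0 / z_load                     # normalised load admittance yL = a - jh
    a, h = y_load.real, -y_load.imag
    root = math.sqrt(a * (h ** 2 + (a - 1) ** 2))
    denom = h ** 2 + a * (a - 1)             # assumed non-zero in this sketch

    # Eq. (5.61): smallest positive distance over both roots of the quadratic
    candidates = []
    for sign in (1, -1):
        d = math.atan((-h + sign * root) / denom) / (2 * math.pi)
        while d <= 0:
            d += 0.5                         # tan repeats every half wavelength
        candidates.append(d)
    d1 = min(candidates)

    # Eq. (5.62): normalised susceptance of the line at distance d1
    t = math.tan(2 * math.pi * d1)
    b = (h * t ** 2 + (1 - h ** 2 - a ** 2) * t - h) / ((1 + h * t) ** 2 + a ** 2 * t ** 2)

    # Eq. (5.63): smallest positive short-circuit stub length (b assumed non-zero)
    d2 = math.atan(1 / b) / (2 * math.pi)
    while d2 <= 0:
        d2 += 0.5
    return d1, d2, b

# Worked Example 5.4: ZL = 30 + j25 ohm on a 50 ohm line
d1, d2, b = single_stub_design(30 + 25j, 50.0)

# Independent check: line admittance (Eq. 5.51) plus stub admittance (Eq. 5.48) at SS'
y_l = 50.0 / (30 + 25j)
t1 = math.tan(2 * math.pi * d1)
y_total = (y_l + 1j * t1) / (1 + 1j * y_l * t1) - 1j / math.tan(2 * math.pi * d2)
print(round(d1, 4), round(d2, 4), round(b, 5), y_total)
```

The final check sums the normalised line and stub admittances at SS′; a result close to 1 + j0 indicates that the computed d1 and d2 do produce a match.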
5.3.8 Scattering Parameters
Scattering parameters or S-parameters are used for characterising microwave networks, and their definitions are based on 50 Ω termination of network ports. They are readily measured in the laboratory using network analysers. Figure 5.17 shows a two-port network with incident and reflected voltages propagating in opposite directions in each port of the network. The voltages are identified using two subscripts, which are either digit 1 or 2. The first subscript identifies the port (using 1 for input port and 2 for output port), whereas the second subscript identifies the wave (using 1 for incident wave and 2 for reflected wave). It is important to note that incidence and reflection are relative to the ports and not the load or source. Thus, v21 is the wave incident on port 2, whereas v12 is the wave reflected from port 1. The reflected voltages v12 and v22 are given in terms of the incident voltages v11 and v21 through the S-parameters

$$
\begin{aligned}
v_{12} &= s_{11}v_{11} + s_{12}v_{21}\\
v_{22} &= s_{21}v_{11} + s_{22}v_{21}
\end{aligned} \tag{5.66}
$$

Figure 5.17 S-parameters: two-port network with generator vg of impedance Z g at port 1 (terminals 1, 1′) and load Z L at port 2 (terminals 2, 2′), showing voltages v11 and v12 at port 1, v21 and v22 at port 2, and port impedances Z 1 and Z 2 .
This indicates that the reflected voltage v12 observed at port 1 comes from two contributions, namely a reflection of the voltage v11 that is incident at port 1 and a transmission through the network of the voltage v21 that is incident on port 2. Similarly, the reflected voltage v22 observed at port 2 is the result of reflection of v21 at port 2 and transmission of v11 through the network from port 1 to port 2. If s12 = 0 then the signal v21 incident on port 2 does not make any contribution to the signal v12 propagating away from port 1. This means that signals propagate through the network in only one direction from port 1 to port 2. The network is therefore said to be unilateral if s12 = 0. Since v21 is the signal coming back (i.e. reflected) from load Z L connected at port 2, it follows that under conditions of matched load at port 2 (i.e. Z L = Z 2 ) the signal v21 = 0 and we obtain from Eq. (5.66)

$$
\left.\begin{aligned}
s_{11} &= \left.\frac{v_{12}}{v_{11}}\right|_{v_{21}=0}\\
s_{21} &= \left.\frac{v_{22}}{v_{11}}\right|_{v_{21}=0}
\end{aligned}\right\}\quad \text{Matched load at port 2} \tag{5.67}
$$
Furthermore, if Z g = Z 1 (i.e. matched load at port 1) and the generator vg is turned off so that signals in the network originate from v21 only (being reflected at port 2 to produce v22 and transmitted through the network to produce v12 ), there will be no reflection at the ‘load’ Z g and hence v11 = 0, so we have from Eq. (5.66)

$$
\left.\begin{aligned}
s_{22} &= \left.\frac{v_{22}}{v_{21}}\right|_{v_{11}=0}\\
s_{12} &= \left.\frac{v_{12}}{v_{21}}\right|_{v_{11}=0}
\end{aligned}\right\}\quad \text{Matched load at port 1} \tag{5.68}
$$
These S-parameters therefore have the following physical meaning and applications. Parameter s11 is the voltage reflection coefficient 𝜌v1 of the input port. Parameter s22 is the voltage reflection coefficient 𝜌v2 of the output port. |s21 |2 is the forward insertion power gain of the two-port network, usually expressed in dB. And |s12 |2 is the reverse insertion power gain of the two-port network, also called the reverse power leakage. It is important to note that these parameters are always defined (and measured) under a matched condition at one of the two ports. So, v11 is the available voltage from a matched generator and v22 is the voltage delivered to a matched load.
Worked Example 5.5
S-parameters
We wish to calculate the S-parameters of the T-network shown in Figure 5.18a and hence to determine the return loss of the network at its input and the insertion loss of the network. S-parameter definitions are based on 50 Ω source and 50 Ω terminations, so these must be adhered to as we seek to calculate each parameter for the given network. First, s11 is the voltage reflection coefficient 𝜌v1 of the input port, so to determine s11 , we terminate output port 2 with a (matched) 50 Ω load, as shown in Figure 5.18b, and under that configuration we determine the impedance Z 1 seen looking into port 1. Z 1 is a series connection of 25 Ω to a parallel connection between 50 Ω and a series connection of 25 Ω and 50 Ω. We adopt the convention of using ‘+’ to represent a series connection and ‘∥’ to represent a parallel connection. Thus

$$
\begin{aligned}
Z_1 &= 25 + (50 \parallel (25 + 50)) = 25 + \frac{50 \times (25 + 50)}{50 + (25 + 50)} = 25 + 30 = 55\ \Omega\\
s_{11} &\equiv \rho_{v1} = \frac{Z_1 - Z_o}{Z_1 + Z_o} = \frac{55 - 50}{55 + 50} = \frac{1}{21}\angle 0^\circ
\end{aligned}
$$

Figure 5.18 Worked Example 5.5: (a) T-network (25 Ω series arms and 50 Ω shunt arm); (b) Configuration for calculating s11 ; (c) Configuration for calculating s22 ; (d) Configuration for calculating s21 ; (e) Configuration for calculating s12 .
Next to s22 , which is the voltage reflection coefficient of the output port 2. So, to determine s22 we terminate input port 1 with 50 Ω, as shown in Figure 5.18c, and (under this configuration we) determine the impedance Z 2 seen looking into port 2. Thus

$$
\begin{aligned}
Z_2 &= 25 + (50 \parallel (25 + 50)) = 25 + 30 = 55\ \Omega\\
s_{22} &\equiv \rho_{v2} = \frac{Z_2 - Z_o}{Z_2 + Z_o} = \frac{55 - 50}{55 + 50} = \frac{1}{21}\angle 0^\circ
\end{aligned}
$$

We notice that s11 = s22 , but this is only due to the symmetry of this network and is in no way a general relationship for all networks. Parameter s21 is the ratio between voltage v22 at port 2 and voltage v11 at port 1 when a matched load is connected at port 2 and a matched generator is connected at port 1, as shown in Figure 5.18d. Note that the generator voltage is 2v11 in order for it to deliver an incident voltage equal to v11 when a matched load is connected to it at port 1. We calculate the required ratio v22 /v11 in two stages. First, va is the voltage across resistance 50 ∥ (25 + 50) when voltage 2v11 is divided between resistance 50 + 25 and resistance 50 ∥ (25 + 50). Second, v22 is the voltage across resistance 50 when voltage va is divided between resistance 25 and resistance 50. Thus

$$
\begin{aligned}
v_a &= \frac{50 \parallel 75}{(50 \parallel 75) + (50 + 25)}\,2v_{11} = \frac{30}{30 + 75}\,2v_{11} = \frac{4}{7}v_{11}\\
v_{22} &= \frac{50}{25 + 50}v_a = \frac{2}{3}v_a = \frac{2}{3} \times \frac{4}{7}v_{11} = \frac{8}{21}v_{11}\\
s_{21} &= \frac{v_{22}}{v_{11}} = \frac{8}{21}\angle 0^\circ
\end{aligned}
$$
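Parameter s12 is defined in a similar way to s21 but with the roles of ports 1 and 2 reversed, as shown in Figure 5.18e. It follows, as earlier explained, that

$$
\begin{aligned}
v_a &= \frac{50 \parallel 75}{(50 \parallel 75) + (50 + 25)}\,2v_{21} = \frac{30}{30 + 75}\,2v_{21} = \frac{4}{7}v_{21}\\
v_{12} &= \frac{50}{25 + 50}v_a = \frac{2}{3}v_a = \frac{2}{3} \times \frac{4}{7}v_{21} = \frac{8}{21}v_{21}\\
s_{12} &= \frac{v_{12}}{v_{21}} = \frac{8}{21}\angle 0^\circ
\end{aligned}
$$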
We observe again that s12 = s21 just for this network but not as a rule for all other networks. The return loss (RL) of a transmission line was defined in Eq. (5.37), and is applicable to a network but with S-parameter s11 replacing 𝜌v . Thus

$$
\text{RL} = -20\log_{10}(|s_{11}|) = -20\log_{10}(1/21) = 26.4\ \text{dB}
$$

Note that the higher the RL value, the better is the coupling of a signal from source into the network with minimal reflection. Insertion loss (IL) is the reciprocal of the forward insertion power gain |s21 |² of the network. Thus

$$
\text{IL} = 1/|s_{21}|^2 = 10\log_{10}(1/|s_{21}|^2)\ \text{dB} = -20\log_{10}(|s_{21}|) = -20\log_{10}(8/21) = 8.4\ \text{dB}
$$

Note that the lower the IL value, the better is the transfer of a signal through the network with minimal dissipation.
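The steps of Worked Example 5.5 can be expressed as a short Python sketch for any resistive T-network between matched terminations. The function names and argument order are our own, and the approach (input impedance for s11 and s22 , two-stage voltage division from a 2v generator for s21 and s12 ) simply mirrors the hand calculation above.

```python
import math

def parallel(r1, r2):
    """Equivalent resistance of two resistances connected in parallel."""
    return r1 * r2 / (r1 + r2)

def t_network_s_parameters(r_series_1, r_shunt, r_series_2, z0=50.0):
    """S-parameters (s11, s12, s21, s22) of a resistive T-network between z0 terminations."""
    # Reflection coefficients: impedance seen at one port with the other port matched
    z1 = r_series_1 + parallel(r_shunt, r_series_2 + z0)
    z2 = r_series_2 + parallel(r_shunt, r_series_1 + z0)
    s11 = (z1 - z0) / (z1 + z0)
    s22 = (z2 - z0) / (z2 + z0)

    # s21: generator 2*v11 behind z0 at port 1, matched load at port 2
    node = parallel(r_shunt, r_series_2 + z0)
    v_a = 2.0 * node / (node + z0 + r_series_1)     # shunt-node voltage per unit v11
    s21 = v_a * z0 / (r_series_2 + z0)

    # s12: same calculation with the roles of the ports reversed
    node_rev = parallel(r_shunt, r_series_1 + z0)
    v_a_rev = 2.0 * node_rev / (node_rev + z0 + r_series_2)
    s12 = v_a_rev * z0 / (r_series_1 + z0)
    return s11, s12, s21, s22

# T-network of Figure 5.18a: 25 ohm series arms and a 50 ohm shunt arm
s11, s12, s21, s22 = t_network_s_parameters(25.0, 50.0, 25.0)
return_loss_db = -20 * math.log10(abs(s11))
insertion_loss_db = -20 * math.log10(abs(s21))
print(s11, s12, s21, s22, round(return_loss_db, 1), round(insertion_loss_db, 1))
```

Because the network is purely resistive, all four parameters come out real; a network containing reactive elements would need complex impedances in place of the resistances.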
5.3.9 Smith Chart
The Smith chart of Figure 5.19 is a graphical tool that was once very popular for solving problems in transmission lines and impedance matching. Nowadays, such problems are more quickly solved using widely available software tools, but the Smith chart is still widely used in radio frequency (RF) circuit analysis software and measurement instruments such as network analysers as a display screen to present various RF parameters in a format that is more instructive than a simple tabular layout. We only briefly introduce features of the Smith chart here. The interested reader is referred to [1] for a range of worked problems solved using the Smith chart. The Smith chart is based on normalised impedances, with the centre (labelled O in Figure 5.19) representing the characteristic impedance zo = 1.0 of the line. The centre line (labelled BD) is the real axis. Circles of constant normalised resistance traverse the real axis and have values from 0.0 on the left (B) to ∞ on the right (D). The resistance circle that passes through O has a value of 1.0. Curves of constant normalised reactance radiate outward away from BD, with values from 0.0 along BD to ∞ for the curve that bends furthest away from BD. The reactance curves above BD are positive (inductive reactance), whereas the ones below are negative (capacitive reactance). The circumference of the chart consists of several circular scales. The innermost circular scale gives the angle of transmission coefficient in degrees from 0° at D going counterclockwise to 90° near B for positive angles, and from 0° at D going clockwise to −90° near B for negative angles. The next circular scale gives the angle of reflection coefficient in degrees from 0° at D to 180° at B, going counterclockwise for positive angles and clockwise for negative angles.
The next circular scale gives the distance moved in wavelengths along the transmission line towards the load from 0.0 at B going counterclockwise to 0.25 at D and continuing to 0.5 back at B. The outermost circular scale gives the distance in wavelengths towards the generator from 0.0 at B going clockwise to 0.25 at D and continuing to 0.5 back at B. Finally, the six bottom scales give various reflection and transmission coefficients and losses (in dB). The Smith chart is a very versatile tool for manually solving transmission line and impedance matching problems. For example, given a line of characteristic impedance Z o = 50 Ω that is terminated by a load Z L = 30 + j25 Ω, we mark the normalised load impedance zL = (30 + j25)/50 = 0.6 + j0.5 on the Smith chart as shown in Figure 5.20. Note that this mark is at the point of intersection between the normalised resistance circle 0.6 and the normalised reactance curve 0.5. A circle of radius OzL centred at O as shown in Figure 5.20 is a constant VSWR circle that passes through all possible impedance values on the transmission line. VSWR, Z max , Z min : moving from point zL clockwise along this circle is equivalent to moving from the load along the entire transmission line towards the source or generator. Since the maximum impedance on the line is Z o × VSWR (≡ VSWR normalised) and is entirely resistive, it follows that the intersection of this circle on the right-hand side of the real axis of the chart gives VSWR. Thus, VSWR = 2.24. This circle intersects the left-hand
363
364
5 Transmission Media
B
Figure 5.19
O
D
Smith chart.
side of the real axis of the chart at 0.44, which gives the normalised minimum impedance on the line. So, the line has Z max = Z o × 2.24 = 112 Ω and Z min = Z o × 0.44 = 22 Ω. Reflection coefficient: drawing a straight line OzL and extending it to intersect the circumference of the chart at F, the line OF intersects the circular scale for angle of reflection at 𝜃 v = 111.5∘ . Using a ruler or compass to transfer the radius OzL to read from the centre of the (second from) bottom scale towards the left of the scale gives a reading for |𝜌v | of 0.383. Thus, the reflection coefficient is 0.383∠111.5∘ . Note that the bottom scales are shown in Figure 5.19 but not in Figure 5.20. Line impedance: we may use the Smith chart to read the impedance at any point on the line by moving from F clockwise along the outermost circular scale (towards the generator) through the required distance in wavelengths. For example, to read the impedance at a distance 0.3𝜆 from the load, we move from F (= 0.0952) through 0.3 to get to point G (= 0.0952 + 0.3 = 0.3952). To locate point G on the transmission line, we draw line OG which intersects the constant VSWR circle (representing the transmission line) at zx = 0.64 − j0.56. This is the normalised line impedance at a distance 0.3𝜆 from the load, so line impedance Z x = zx × 50 = 32 − j28 = 42.5∠−41.2∘ .
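The chart readings in the example above can be cross-checked numerically. The following is a minimal sketch, assuming a lossless line and using the standard expressions for reflection coefficient, VSWR, and input impedance of a terminated line; the function names are illustrative and are not taken from the text.

```python
import cmath
import math


def reflection_coefficient(z_load, z0):
    """Voltage reflection coefficient of load z_load on a line of impedance z0."""
    return (z_load - z0) / (z_load + z0)


def vswr_from_rho(rho):
    """Voltage standing wave ratio from the complex reflection coefficient."""
    mag = abs(rho)
    return (1 + mag) / (1 - mag)


def line_impedance(z_load, z0, distance_wavelengths):
    """Impedance seen looking towards the load at the given distance
    (in wavelengths) from the load, assuming a lossless line."""
    t = math.tan(2 * math.pi * distance_wavelengths)
    return z0 * (z_load + 1j * z0 * t) / (z0 + 1j * z_load * t)


Z0 = 50.0          # characteristic impedance, ohms
ZL = 30 + 25j      # load impedance, ohms

rho = reflection_coefficient(ZL, Z0)
vswr = vswr_from_rho(rho)
z_max = Z0 * vswr  # maximum (purely resistive) impedance on the line
z_min = Z0 / vswr  # minimum (purely resistive) impedance on the line
zx = line_impedance(ZL, Z0, 0.3)

print(f"|rho| = {abs(rho):.3f}, angle = {math.degrees(cmath.phase(rho)):.1f} deg")
print(f"VSWR = {vswr:.2f}, Zmax = {z_max:.1f} ohm, Zmin = {z_min:.1f} ohm")
print(f"Z at 0.3 wavelengths from load = {zx.real:.1f} {zx.imag:+.1f}j ohm")
```

The computed values (|ρv| ≈ 0.382, angle ≈ 111.3°, VSWR ≈ 2.24, Zx ≈ 31.9 − j27.6 Ω) agree with the chart readings to within the accuracy with which a printed chart can be read.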
Figure 5.20 Example of Smith chart use. (Marked on the chart: zL = 0.6 + j0.5; zx = 0.64 − j0.56 at x = 0.3λ; points O, F, and G; θv = 111.5°; VSWR = 2.24; zmax = 2.24; zmin = 0.44; |ρv| = (VSWR − 1)/(VSWR + 1) = 0.383.)
5.4 Optical Fibre
Due to its cost, rigid structure, and bulk, a metallic waveguide is only suitable for very short links (e.g. to connect between an outdoor antenna and an indoor unit) at frequencies above about 2 GHz. Optical fibre, however, is a much cheaper and more suitable form of waveguide, which may be used for long-distance links. Signals are carried by light waves in the fibre. Optical fibre is nowadays the main transmission medium for long-distance high-capacity terrestrial communication, including intercontinental links (e.g. transoceanic cables), interexchange networks, metropolitan area networks (MANs), backhaul links for satellite and terrestrial wireless systems, and fixed broadband links for homes and businesses through fibre to the cabinet (FTTC) and fibre to the premises (FTTP). Most of the key developments that firmly established optical fibre as an indispensable transmission medium in the global telecommunications network infrastructure took place in the second half of the twentieth century and included:
● 1954: microwave amplification by stimulated emission of radiation (maser) and light amplification by stimulated emission of radiation (laser) are demonstrated by Charles H. Townes (1915–2015, USA) in 1954 and 1958, respectively, building on a theory first proposed in 1917 by Albert Einstein.
● 1966: in a landmark paper in IEE Proceedings [2], Charles Kao and George Hockham (UK) propose glass fibre for optical signal transmission. They are convinced that the prohibitive glass attenuation (∼1000 dB/km) at the time was due to absorption by impurities in the glass, and that losses less than 20 dB/km were attainable in better purified glass. In 1972, within six short years, Corning Glass Works (USA) produce germanium-doped fibre core having a loss of just 4 dB/km.
● 1974: John MacChesney and Paul O’Connor (Bell Labs, USA) develop the modified chemical vapour deposition (MCVD) process for the manufacture of ultra-pure glass, a method that remains the standard for mass-producing low-loss optical fibre cables.
● 1975: the first commercial continuous-wave semiconductor laser operating at room temperatures is developed at Laser Diode Labs (USA).
● 1987: the erbium doped fibre amplifier (EDFA), able to directly boost light signals, is developed by David Payne (UK) [3]. This leads to an all-optical system by 1991 which supports 100 times more data than a system that uses electronic amplification where light is converted to electrical and back again.
● 1988: the first transoceanic optical fibre cable, TAT-8, of length 5580 km is laid across the Atlantic Ocean to connect the USA with the UK and France. TAT-8 had a capacity of 280 Mb/s and was the eighth transatlantic cable but the first to use fibre, the previous ones having used copper coaxial cable.
● 1996: the Trans-Pacific Cable 5 (TPC-5), an all-optic fibre cable and the first to use optical amplifiers, is laid in a loop across the Pacific Ocean from San Luis Obispo in California to Hawaii, Guam, and Miyazaki in Japan and back to the Oregon coast. TPC-5 has a total cable length of 22 500 km and a transmission capacity of 5 Gb/s.

Figure 5.21 Optical fibre structure. (Labelled parts: core, cladding, coating.)
An optical fibre is a dielectric waveguide about 125 μm in diameter made from high-purity silica glass. Figure 5.21 shows the structure of optical fibre. It consists of a glass core made of nearly pure silicon dioxide (SiO2) or silica surrounded by a glass cladding, which has a slightly lower refractive index than the core and is instrumental in confining the transmitted information-bearing light signal entirely within the core. A plastic sheath or coating made from durable resin covers the cladding to protect both the core and the cladding from moisture and mechanical damage. A group of fibres is usually bundled together in a larger protective jacket known as a cable, which may hold in excess of a thousand individual fibres. The fibres were originally used in pairs, one for each transmission direction, but today the technique of wavelength division multiplexing (WDM) enables a single fibre to support bidirectional communication using separate light wavelengths, e.g. 1310 nm downstream and 1550 nm upstream. Light waves propagate within the inner core over long distances. The dimension of an optical fibre is usually stated as a pair of numbers in micrometres (μm), called microns, which gives the diameters of the core and cladding. For example, a 50/125 fibre has a core diameter of 50 μm and a cladding diameter of 125 μm.

Optical fibre offers several significant advantages as a transmission medium.
● Optical fibre conveys an optical carrier of very high frequency, around 200 000 GHz. Assuming a transmission bandwidth of only 1% of the supported carrier frequency, we see that optical fibre offers an enormous potential bandwidth of about 2000 GHz. This is practically a limitless bandwidth when one considers the requirements of foreseeable broadband communication services. However, the realisable bandwidth in practical optical communication systems will be constrained by the limitations of the electronic interface devices.
● Intensive research into optical fibre over the last 50 years has led to the development of optical fibre with attenuation reduced from a figure of about 20 dB/km in 1970 to around 0.2 dB/km and 0.5 dB/km today within transmission windows at 1550 nm and 1310 nm, respectively. This is in sharp contrast to an attenuation figure of around 7.5 dB/km at 10 MHz for the least lossy (standardised landline) coaxial cable. The much lower attenuation offered by optical fibre and the fact that this attenuation is independent of frequency (within the said transmission windows) mean that very high data rates can be supported using repeaters that are widely spaced at around 80–100 km. This spacing is an order of magnitude better than the largest repeater spacing used in normal-core coax to support a transmission bandwidth of only 4 MHz. This yields a significant reduction in initial and maintenance costs of the communication system. In fact, some long-haul communication systems and MANs may be implemented using optical fibre without intermediate repeaters.
● The fibre carries an optical signal, which is immune to electrical interference. Optical fibre can therefore be deployed in electrically noisy environments without the need for the expensive measures that must be adopted for wire pairs and coax.
● The problem of crosstalk is eliminated. The optical signal in one fibre is confined entirely within that fibre. It does not cause any direct or indirect effects on adjacent fibres in the same or other cable core.
● The fibre (an electrical insulator) presents infinite resistance to the transmitter and receiver that it connects. Voltage and current levels at the transmitter and receiver are therefore kept apart. The electrical isolation (of DC bias levels) of different parts of a communication system is desirable to simplify design. Optical fibre provides this isolation naturally, and no further measures like transformer or capacitor coupling need to be taken.
● Fibre is made from a cheap and abundant raw material, and advances in the manufacturing process have reached a stage where the finished fibre is now much cheaper than copper cable.
● Fibres are small in dimension (about the thickness of a human hair), low-weight, and flexible. They take up much less space than copper cables and can therefore be easily accommodated in existing cable conduits.
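The carrier frequency and potential bandwidth quoted in the first advantage follow from f = c/λ. The short sketch below evaluates this at the two transmission windows mentioned above; the 1% bandwidth fraction is the assumption used in the text, and the function name is illustrative.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s


def carrier_frequency_ghz(wavelength_nm):
    """Optical carrier frequency in GHz for a free-space wavelength in nm."""
    return C / (wavelength_nm * 1e-9) / 1e9


BANDWIDTH_FRACTION = 0.01  # 1% of carrier frequency, as assumed in the text

for wavelength_nm in (1310, 1550):
    f_ghz = carrier_frequency_ghz(wavelength_nm)
    print(f"{wavelength_nm} nm: carrier = {f_ghz:,.0f} GHz, "
          f"1% bandwidth = {BANDWIDTH_FRACTION * f_ghz:,.0f} GHz")
```

At 1550 nm this gives a carrier of roughly 193 000 GHz and a 1% bandwidth of roughly 1900 GHz, consistent with the rounded figures of 200 000 GHz and 2000 GHz.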
Optical fibre does have several disadvantages, but these are entirely manageable and are far outweighed by the above advantages. ●
●
●
●
Special and costly connectors are required to join two fibres and to couple a fibre to an optical source. Losses are always incurred even in the best splices or connectors. A good splice introduces a loss n1
(5.69)
Applying Snell’s law of refraction to the interface between the input medium (usually air of refractive index na ≈ 1) and the core of refractive index n2, and noting that (for total internal reflection to occur in the core) the maximum angle of refraction in the core is θ2 = 90° − θc, we obtain

$$n_a \sin\theta_a = n_2 \sin\theta_2$$

$$\sin\theta_a = \frac{n_2}{n_a}\sin\theta_2 = \frac{n_2}{n_a}\sin(90^\circ - \theta_c) = \frac{n_2}{n_a}\cos\theta_c$$

This angle θa is the maximum angle of acceptance. Any ray that is incident on the air/core interface at an angle larger than θa will have an angle of refraction larger than 90° − θc and hence will be incident on the core/cladding interface at an angle less than the critical angle and will therefore be partly refracted through into the cladding and partly reflected back into the core. In other words, the total internal reflection necessary to confine the light entirely within the core will only take place for rays within the cone of acceptance shown in Figure 5.23 and specified below in terms of the refractive indices of air na, core n2, and cladding n1.

Figure 5.23 Critical angle, cone of acceptance and numerical aperture. (Labelled: input medium (na), cladding (n1), core (n2), cone of acceptance, and angles θa, θ2, θc.)

Since cos²θc + sin²θc = 1 and sin θc = n1/n2 (by Eq. (5.69)), it follows that cos θc = √(1 − (n1/n2)²), so the expression for sin θa simplifies to

$$\sin\theta_a = \frac{n_2}{n_a}\sqrt{1 - (n_1/n_2)^2} = \frac{1}{n_a}\sqrt{n_2^2 - n_1^2}$$

sin θa gives a measure of the maximum angle of acceptance and is called the numerical aperture (NA). Thus

$$\text{Numerical aperture (NA)} = \frac{1}{n_a}\sqrt{n_2^2 - n_1^2} \tag{5.70}$$

It should be noted, however, that even for light falling in the cone of acceptance a small fraction will be rejected due to reflection at the input medium/core interface. Reflectivity or reflectance gives an indication of the fraction of light from a source that is reflected at the source/fibre interface so that it fails to be coupled into the fibre core. It is given by the expression

$$\text{Reflectivity} = \left(\frac{n_2 - n_a}{n_2 + n_a}\right)^2 \tag{5.71}$$

where n2 and na are, respectively, the refractive index of the core and the input medium (usually air). Once the light has been coupled, the number of modes that can propagate in the core of the fibre depends on core diameter D, NA, and wavelength λ, and may be estimated from the normalised frequency parameter υ

$$\upsilon = D\pi \cdot \frac{\text{NA}}{\lambda} \tag{5.72}$$

If υ ≤ 2.4 then only a single mode can be supported, but if υ > 2.4 then two or more modes exist, in which case if υ ≫ 1 then the actual number of modes M can be estimated using

$$M = \begin{cases} 0.5\upsilon^2, & \text{Step index} \\ 0.25\upsilon^2, & \text{Graded index} \end{cases} \tag{5.73}$$
Worked Example 5.6
A fibre has core and cladding refractive indices n2 = 1.5 and n1 = 1.485, respectively. If light of wavelength 1550 nm is coupled into the fibre from an air input medium of refractive index 1.0, determine:
(a) The NA of the coupling.
(b) The apex angle of the cone of acceptance.
(c) The reflectance of the coupling.
(d) The number of propagation modes in a 5/125 fibre.
(e) The number of propagation modes in a 62.5/125 graded index fibre.

(a) Numerical aperture (NA)

$$\text{NA} = \frac{1}{n_a}\sqrt{n_2^2 - n_1^2} = \frac{1}{1}\sqrt{1.5^2 - 1.485^2} = 0.2116$$

(b) The apex angle of the cone of acceptance is twice the angle θa shown in Figure 5.23. Thus, apex angle = 2θa = 2sin⁻¹(NA) = 2sin⁻¹(0.2116) = 24.4°.

(c) Reflectance

$$\text{Reflectance} = \left(\frac{n_2 - n_a}{n_2 + n_a}\right)^2 = \left(\frac{1.5 - 1}{1.5 + 1}\right)^2 = \left(\frac{0.5}{2.5}\right)^2 = 1/25$$

(d) Using the numerical aperture obtained in (a) and the given signal wavelength and fibre dimension (D = 5 μm = 5000 nm), the normalised frequency parameter is

$$\upsilon = D\pi \cdot \frac{\text{NA}}{\lambda} = 5000\pi \cdot \frac{0.2116}{1550} = 2.144$$

Since υ ≤ 2.4, there is only one mode of propagation.

(e) With D = 62.5 μm = 62 500 nm and the other parameters remaining the same as in (d), the normalised frequency parameter is a factor (62 500/5000) larger than the previous value, so υ = 26.81. This is much larger than 1, so the number of modes M is

$$M = 0.25\upsilon^2 = 0.25 \times 26.81^2 = 180$$
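The calculations of Worked Example 5.6 can be reproduced with a few lines of code. This is a minimal sketch of Eqs. (5.70) to (5.73); the function names are illustrative, and the mode-count estimate is only meaningful when υ ≫ 1, as stated in the text.

```python
import math


def numerical_aperture(n_core, n_clad, n_input=1.0):
    """Eq. (5.70): NA from core, cladding and input-medium refractive indices."""
    return math.sqrt(n_core**2 - n_clad**2) / n_input


def reflectivity(n_core, n_input=1.0):
    """Eq. (5.71): fraction of light reflected at the input medium/core interface."""
    return ((n_core - n_input) / (n_core + n_input)) ** 2


def normalised_frequency(core_diameter, na, wavelength):
    """Eq. (5.72): core diameter and wavelength must be in the same units."""
    return math.pi * core_diameter * na / wavelength


def estimated_modes(v, graded_index=False):
    """Eq. (5.73): single mode if v <= 2.4, else estimate valid for v >> 1."""
    if v <= 2.4:
        return 1
    return (0.25 if graded_index else 0.5) * v**2


n2, n1, wavelength_nm = 1.5, 1.485, 1550.0

na = numerical_aperture(n2, n1)
apex_deg = 2 * math.degrees(math.asin(na))
refl = reflectivity(n2)
v_5 = normalised_frequency(5000.0, na, wavelength_nm)       # 5/125 fibre
v_62 = normalised_frequency(62500.0, na, wavelength_nm)     # 62.5/125 fibre

print(f"(a) NA = {na:.4f}")
print(f"(b) apex angle = {apex_deg:.1f} deg")
print(f"(c) reflectance = {refl:.3f}")
print(f"(d) v = {v_5:.3f}, modes = {estimated_modes(v_5)}")
print(f"(e) v = {v_62:.2f}, modes = {estimated_modes(v_62, graded_index=True):.0f}")
```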
5.4.3 Attenuation in Optical Fibre
Attenuation in optical fibre may be broadly classified into two categories, namely intrinsic and extrinsic attenuation. Intrinsic attenuation arises from the interaction of light with the material of the fibre resulting in scattering and absorption of the light. Absorption loss is the result of a small portion of the signal energy being dissipated as heat by impurities in the fibre. The main contributors are traces of transition metal ions and hydroxyl ions. The glass material itself will also absorb and dissipate a small amount of the signal energy as heat. Scattering loss occurs due to the scattering of signal energy at points of refractive index irregularity in the fibre. The irregularities arise from the amorphous structure of silica material and are of a scale less than the wavelength of the light signal. On such a scale the resulting scattering is known as Rayleigh scattering, which causes the signal power to be attenuated by 8.686αl dB over a fibre length l, where α and 8.686α are the attenuation coefficient in Np/km and dB/km, respectively, and decrease as the inverse fourth power of wavelength. The combination of absorption and Rayleigh scattering gives an intrinsic fibre loss that has a minimum value of around 0.2 dB/km at a wavelength of 1550 nm. Extrinsic attenuation arises from external factors not related to the fibre material and includes bending loss, splicing loss, connector loss, and numerical aperture loss. These extrinsic losses may make up the bulk of the total attenuation on the link. Considerable bending loss will occur if the fibre is bent at some point into an arc of a few millimetres in radius, which allows some of the light energy to escape into the fibre cladding. The escape at sharp bends is by refraction in multimode fibre, and by radiation in single-mode fibre. Losses are also incurred at joints between two fibre lengths.
For a permanent joint by means of a fusion splice the loss is small, typically 0.1𝜆), or nonselective scatter (i.e. wavelength independent scatter) when scatterer sizes are much larger than wavelength. Rayleigh scattering within optical fibre is therefore the result of light interaction with molecules, particles, and refractive index discontinuities within the fibre whose sizes and dimensions are much smaller than the wavelength of the incident light. As illustrated in Figure 5.25, a portion of the incident light will be scattered away from the forward direction as back scatter (towards the source) and as side scatter (in other non-backward and non-forward directions) so that the amount of light propagating towards the intended destination is slightly reduced. This reduction in signal strength is conveniently expressed as specific attenuation, also called attenuation coefficient in dB/km, which for Rayleigh scattering in optical fibre may be estimated from

$$\alpha(\lambda) = 1.7\left(\frac{850}{\lambda}\right)^4 \ \text{dB/km} \tag{5.74}$$

where λ is wavelength in nm. So, Rayleigh scattering accounts for around 1.7 dB/km loss at 850 nm wavelength but only around 0.14 dB/km at 1600 nm.

It is worth mentioning that it is the phenomenon of scattering of light from the sun that is responsible for many of our observations of colour in nature. For example, Rayleigh scattering of light is inversely proportional to the fourth power of wavelength and is produced by molecules in the propagation medium whose sizes are much smaller than the wavelength of light. Shorter wavelengths (such as blue light) are therefore more Rayleigh scattered than longer wavelengths (such as red light) and this explains why the sky appears blue on a clear day when there are no particulates in the sky. The sizes of water droplets in a cloud are much larger than the wavelength of light and this gives rise to nonselective scatter whereby all wavelengths in the light from the sun are equally scattered, which explains why clouds appear white.

Figure 5.25 Scattering and absorption of electromagnetic (em) wave in a medium. (Scattering medium: incident em wave, side scatter, back scatter, forward scatter, and transmitted em wave weaker than the incident wave. Absorptive medium: incident em wave, heat, and transmitted em wave weaker than the incident wave.)

5.4.3.1.2 Nonlinear Effects
At high transmitted light intensity, the fibre medium will exhibit nonlinear behaviour that further contributes to signal attenuation and introduces nonlinear distortion such as the creation of new wavelength components in the output signal. The fundamental causes of a nonlinear system response in the fibre transmission medium include acoustic molecular vibrations and the Kerr effect. If variations in the electric field component of the transmitted light are strong enough, acoustic molecular vibrations and hence refractive index fluctuations may be generated in the fibre which give rise to scattering. Stimulated Brillouin scattering (SBS) describes the situation where the refractive index fluctuations produce a backward propagating wave called Stokes wave in response to the transmitted wave, which in this context is referred to as the pump wave. Another consequence of the molecular vibrations is known as stimulated Raman scattering (SRS), whereby scattered waves are produced in both forward and backward directions and some energy is transferred from shorter to longer wavelengths. SRS impairs multichannel transmission due to its relative attenuation of shorter wavelengths but may be beneficially exploited to achieve optical amplification. The Kerr effect describes a phenomenon in which the refractive index of a material is changed by an applied electric field. All materials exhibit this characteristic but some more strongly than others. Therefore, if the transmitted light intensity is sufficiently strong, the electric field component of the light will produce slight variations in the refractive index of the fibre material. This effect manifests itself in various ways in an optical communication signal. Since different intensities of light will therefore propagate at slightly different speeds, intensity modulation of transmission at a given wavelength will be converted into phase variation of that wavelength, an effect described as self-phase modulation (SPM). 
The variation of light intensity in one wavelength varies the refractive index seen by light in a different channel (or wavelength) and gives rise to phase variation in that other channel, an effect known as cross-phase modulation (CPM). Finally, in view of its nonlinear refractive index in the presence of a strong light intensity, the fibre medium has a nonlinear transfer characteristic. Therefore, if two or more wavelengths are simultaneously transmitted, new wavelength components will be generated in the medium. This effect is like the intermodulation distortion discussed in Section 4.7.6 but in optical systems it is known as four-wave mixing (FWM). The new wavelengths will cause interference in WDM optical systems where independent signals are simultaneously transmitted in the same fibre using several regularly spaced wavelengths.

5.4.3.1.3 Absorption
This is a process by which energy in an electromagnetic wave is converted to heat in a medium through which the wave is transmitted. See Figure 5.25. As a result, the strength of the wave is proportionately reduced with increasing length of propagation through the medium. It should be noted that the attenuation of a voltage wave on a metallic transmission line (discussed in Section 5.3) is also an absorption process by which energy in the voltage wave is converted to heat in the resistance of the metallic material. The reduction in signal strength caused by absorption is usually also quantified as specific attenuation in dB/km. The amount of light absorption in optical fibre depends on the wavelength of the light and comes from three main contributing factors:
● The presence of impurities in the glass, namely hydroxyl (OH−) and transition metal ions. These ions have absorption peaks at the various optical band wavelengths identified earlier. Absorption by these impurities may be reduced to negligible levels in ultra-pure fibre having impurity concentrations of less than 1 ppb.
● Intrinsic light absorption property of silica glass. The absorption of light by silica is negligible below 1500 nm and increases rapidly above 1600 nm. It is this factor that is responsible for the sharp rise in total intrinsic fibre loss above 1600 nm, as shown in Figure 5.24.
● The presence of defects in the atomic structure of the glass material of the fibre.
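As a numerical illustration of the scattering part of intrinsic loss, the Rayleigh estimate of Eq. (5.74) can be evaluated at several wavelengths. This is a minimal sketch; the function name is illustrative, and the result covers Rayleigh scattering only, not the absorption contributions listed above.

```python
def rayleigh_attenuation_db_per_km(wavelength_nm):
    """Eq. (5.74): Rayleigh scattering attenuation coefficient in dB/km."""
    return 1.7 * (850.0 / wavelength_nm) ** 4


for wavelength_nm in (850, 1310, 1550, 1600):
    alpha = rayleigh_attenuation_db_per_km(wavelength_nm)
    print(f"{wavelength_nm} nm: {alpha:.3f} dB/km")
```

The inverse fourth-power dependence is evident: doubling the wavelength reduces the Rayleigh attenuation coefficient by a factor of 16.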
5.4.3.2 Extrinsic Fibre Loss
5.4.3.2.1 Bending Loss
A bend in the fibre may cause the light within the fibre core to be incident on the core/cladding interface at an angle that is less than the critical angle of the medium, which breaches the condition for total internal reflection within the core and leads to a portion of the light escaping from the core. The light propagating within the core thereby suffers some loss or attenuation. This situation is illustrated in Figure 5.26 for both macrobend and microbend. Microbend losses are due to small wrinkles in the fibre, which are not visible to the naked eye, caused by temperature variations during manufacture, extreme temperature variations in installed cable, external forces that deform the cable jacket, or small-scale fibre discontinuities. Macrobend losses are due to bends that are visible to the naked eye caused, for example, by laying the fibre around a sharp corner or wrongly installing tie wraps. A minimum bend radius of 5–10 times the outer diameter of the fibre is recommended to avoid excessive macrobend loss. Longer wavelengths are more impacted by macrobending than shorter wavelengths are.

5.4.3.2.2 Splicing Loss
A long-distance optical fibre link will typically consist of multiple fibre segments that are joined together through splicing. There are two methods of splicing, namely fusion splicing and mechanical splicing. In fusion splicing, the two fibre ends are heated using an electric arc and melted and fused together to create a homogeneous, nonreflective, and reliable joint of very low loss, typically between 0.01 and 0.10 dB for a single-mode fibre. Mechanical splicing creates a fast and easy joining of two fibre lengths using adhesives and gels. It is typically employed in emergency link restoration situations and introduces a loss of around 0.05–0.2 dB for a single-mode fibre.

5.4.3.2.3 Connector Loss
When two fibre segments are joined together using connectors, small losses might be introduced by the following defects, leading to losses totalling typically between 0.1 and 0.25 dB.
● If there is a small air gap between the connected ends of the two fibre lengths as a result of a faulty mechanical connector or wrong fitting, a fraction of the light energy in the incoming fibre segment will be lost due to reflection at the glass/air interface created by the air gap. Equation (5.71) gives the reflectivity or fraction of lost power in such situations. In Worked Example 5.6, we calculate a typical reflectivity value of 1/25, which means that only a fraction 24/25 of the power would be delivered from one fibre to the next, representing a loss of −10log10(24/25) = 0.18 dB due to reflection at the air gap.
● Any misalignment of the two fibre cores at the connector will cause some of the light in the source fibre segment not to be passed into the core of the second fibre segment. The amount of loss depends on the degree of misalignment.
● If the core diameter Dr of the receiving fibre is smaller than the core diameter Ds of the source fibre, there will be a loss of −20log10(Dr/Ds) dB. This loss is avoidable in both directions of transmission by ensuring that only fibres of the same dimensions are connected.
● Contamination of the connectors (e.g. by finger oil or dirt particles) will lead to losses due to absorption and the scattering of light by the contaminants. It is therefore important that connectors are handled with care and examined using an inspection scope and properly cleaned (if necessary) prior to use.

Figure 5.26 Bending loss: (a) macrobend; (b) microbend. (In each case light escapes from the fibre core at the bend.)
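Two of the connector defects listed above have simple closed-form losses, which the following sketch evaluates. The function names are illustrative, and the 62.5 μm to 50 μm mismatch is a hypothetical example, not a case taken from the text. Following the text, the air-gap calculation accounts for reflection at a single glass/air interface.

```python
import math


def air_gap_reflection_loss_db(n_core, n_gap=1.0):
    """Loss due to reflection at one glass/air interface, using Eq. (5.71)."""
    reflectivity = ((n_core - n_gap) / (n_core + n_gap)) ** 2
    return -10 * math.log10(1 - reflectivity)


def core_mismatch_loss_db(d_receive, d_source):
    """Loss when the receiving core is smaller than the source core.
    No loss is incurred if the receiving core is the same size or larger."""
    if d_receive >= d_source:
        return 0.0
    return -20 * math.log10(d_receive / d_source)


print(f"Air gap (n = 1.5): {air_gap_reflection_loss_db(1.5):.2f} dB")
# Hypothetical mismatch: 62.5 um source core feeding a 50 um receiving core
print(f"62.5 -> 50 um: {core_mismatch_loss_db(50.0, 62.5):.2f} dB")
print(f"50 -> 62.5 um: {core_mismatch_loss_db(62.5, 50.0):.2f} dB")
```

The mismatch loss is direction dependent, which is why the text recommends connecting only fibres of the same dimensions when transmission in both directions is required.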
5.4.3.2.4 Numerical Aperture Loss
When two devices are coupled and the numerical aperture (NA) of the receiving device, denoted NAr, is smaller than the NA of the source device NAs, then only a fraction of the energy from the source device will be successfully transferred into the receiving device. The loss incurred is called NA loss and is given by

$$\text{NA loss} = -20\log_{10}\left(\frac{\text{NA}_r}{\text{NA}_s}\right) \ \text{dB} \tag{5.75}$$

Light-emitting diodes (LEDs) have NA ∼ 1, whereas laser diodes (LDs) have NA ∼ 0, so this loss may be ignored in laser sources.

The attenuation A of an optical fibre link of length l is given by the sum of all intrinsic and extrinsic losses as

$$A = \alpha l + \alpha_s n_s + \alpha_c n_c \ \text{dB} \tag{5.76}$$

where α is the attenuation coefficient (dB/km) of the fibre used in the link, αs is the mean splice loss, ns is the number of splices in the link, αc is the mean loss of line connectors, and nc is the number of line connectors used in the link.
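Equations (5.75) and (5.76) translate directly into a simple link loss calculation. The sketch below is illustrative only: the function names and all numerical values in the example (link length, number of splices and connectors, and the NA values) are hypothetical choices, with per-item losses picked from within the typical ranges quoted in the text.

```python
import math


def na_loss_db(na_receive, na_source):
    """Eq. (5.75): NA loss, incurred only when the receiving NA is smaller."""
    if na_receive >= na_source:
        return 0.0
    return -20 * math.log10(na_receive / na_source)


def link_attenuation_db(alpha_db_per_km, length_km,
                        splice_loss_db, n_splices,
                        connector_loss_db, n_connectors):
    """Eq. (5.76): total attenuation of an optical fibre link in dB."""
    return (alpha_db_per_km * length_km
            + splice_loss_db * n_splices
            + connector_loss_db * n_connectors)


# Hypothetical 80 km link at 1550 nm: 0.2 dB/km fibre,
# 20 fusion splices at 0.05 dB each, 2 connectors at 0.25 dB each
total_db = link_attenuation_db(0.2, 80.0, 0.05, 20, 0.25, 2)
print(f"Link attenuation = {total_db:.1f} dB")

# Hypothetical coupling from a device of NA 0.275 into a fibre of NA 0.2
print(f"NA loss = {na_loss_db(0.2, 0.275):.2f} dB")
```

In this hypothetical case the fibre itself contributes 16 dB of the 17.5 dB total, with splices and connectors adding 1.0 dB and 0.5 dB, respectively.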
5.4.4 Dispersion in Optical Fibre
Information is conveyed in optical fibre as pulses of light transmitted in sequential symbol intervals. Ideally, the width of each pulse should be unchanged by propagation in the medium to avoid the problem of intersymbol interference (ISI), which we identify in Section 5.2.3 under metallic lines. In practice, however, the different components or portions of a light pulse will experience slightly different propagation delays in the fibre and therefore will arrive at slightly different times at the receiving end and combine to produce a broader pulse than what was transmitted. The broadening of an electromagnetic pulse in a transmission medium is known as dispersion. In this case, each light pulse will be broader at exit from the fibre than at entry, as illustrated in Figure 5.27. The components of the pulse may be the various wavelengths that constitute the light signal or the range of angles of incidence in the coupled light beam or the two orthogonal polarisations of the light. If each of these components has a different path length or propagation time through the fibre then dispersion will occur. There are two types of dispersion in optical fibre, namely intermodal and intramodal dispersion.

Figure 5.27 Pulse dispersion. (An input pulse is broadened into a wider output pulse by the optical fibre medium.)

5.4.4.1 Intermodal Dispersion
Also called multimode or multipath or modal dispersion, this occurs only in multimode fibre as a result of the dependence of path length through the fibre on angle of incidence, as illustrated in Figure 5.22a and c. A multimode step index fibre has a uniform core refractive index, so all incident rays will travel at the same speed through the fibre core. The outer longer-path rays will therefore take longer to reach the receiver than the axial rays, thereby causing a broadening of the received light pulse. A measure of the amount of dispersion is given by the pulse spread τps, which is the difference in propagation time between a pulse or ray propagating in lowest-order mode along the core axis and another ray propagating by total internal reflection in highest-order mode along the longest path through the core. Modal dispersion is greatly reduced if the multimodal fibre core has a parabolic refractive index profile such that longer ray paths through the core have proportionately higher pulse propagation speeds. It is through this mechanism that the modal dispersion of multimode graded index fibre is in the range τps = 0.3 to 1.0 ns/km, whereas that of multimode step index fibre is τps = 50 ns/km.

5.4.4.2 Intramodal Dispersion
Also referred to as material dispersion, intramodal dispersion occurs in single-mode fibre and includes chromatic dispersion (CD) and polarisation mode dispersion (PMD). CD arises because each of the wavelength components of the light pulse will see a different fibre core refractive index, and therefore will travel at a slightly different speed. CD depends on the spectral width of the light, and on the type of single-mode fibre. It is usually specified in units of picoseconds per nanometre per kilometre. A CD of 1 ps/(nm km) means that a pulse of spectral width 1 nm will spread out by 1 ps for each kilometre travelled along the fibre. CD may be ignored in low-speed transmissions but must be considered in high-speed systems operating at 10 Gb/s or higher. Measures to combat CD include: (i) use of special fibres such as the dispersion shifted and nonzero dispersion shifted fibres stipulated by the ITU [4, 5], (ii) regeneration of the light pulses at regular intervals along the optical fibre link, and (iii) use of light sources and transmitters with narrow spectral width. The light pulse that is transmitted through a fibre core is usually unpolarised light that may be resolved into two pulses having mutually orthogonal polarisations, each describing the orientation of the electric field component of the light. Ideally, these two pulses travel at the same speed through the fibre so that, barring other sources of dispersion, the receiver sees a single undistorted and unpolarised pulse. However, random imperfections in practical fibres will disrupt the circular symmetry of the core structure, and this causes the refractive index of the fibre material to be dependent on light polarisation. Therefore, two orthogonal polarisation modes will travel at slightly different speeds through the fibre medium. An unpolarised light pulse sent by a transmitter will arrive at the receiver as two slightly time-displaced polarised pulses that combine to produce a single broadened pulse. 
The delay between the two polarisation modes over the length of the link is referred to as differential group delay (DGD) and the resulting distortion is known as polarisation mode dispersion (PMD), measured in picoseconds. The impact of PMD is, however, insignificant on links less than about 1600 km in length or 40 Gb/s in data rate.

The main effect of dispersion in optical communication systems is to place an upper limit on link symbol rate, i.e. the rate $R_s$ at which light pulses can be transmitted on the link. We may estimate this limitation by considering Figure 5.28 in which pulses of width $\tau$ are sent at regular intervals $T_s$. The symbol rate is therefore $R_s = 1/T_s$ and the duty cycle of the transmitted waveform is $d = \tau/T_s$. If there is to be no interference due to pulse spreading then the maximum allowed dispersion $\tau_{ps}$ is as shown in Figure 5.28b such that each broadened pulse just stops short of the sampling instant of the next symbol interval. Thus

$$\tau_{ps} = T_s - \tau/2 = T_s - dT_s/2 = T_s(1 - d/2)$$

This relationship allows us to express maximum symbol rate in terms of medium dispersion as

$$R_s = \frac{1}{T_s} = \frac{1 - d/2}{\tau_{ps}} = \frac{2 - d}{2\tau_{ps}} \tag{5.77}$$
Figure 5.28 illustrates a general case involving return-to-zero (RZ) pulses with waveform duty cycle $d < 1$. In the special case of full width pulses, called non-return-to-zero (NRZ) pulses, pulse width equals symbol interval, so that $d = 1$ and the above equation reduces to

$$R_s = \frac{1}{2\tau_{ps}} \tag{5.78}$$
Figure 5.28 Analysis of limitation placed on symbol rate by pulse broadening. (a) Transmitted pulse sequence of pulse width τ, with symbol intervals and sampling instants marked at Ts, 2Ts, ...; (b) Maximum allowed spread τps of the broadened pulse. Duty cycle d = τ/Ts.
Modal dispersion $\tau_{ps}$ in a multimode step index fibre may be estimated from the path difference between the outermost ray and the axial ray divided by the speed of light in the fibre. This leads to

$$\tau_{ps} \approx \frac{n_1 \Delta}{c}\, l; \qquad \Delta = \frac{n_2 - n_1}{n_2} \tag{5.79}$$

where $n_2$ = refractive index of fibre core; $n_1$ = refractive index of fibre cladding; $c$ = speed of light; $l$ = link length; and $\Delta$ ≡ normalised core-cladding refractive index difference. Note that the modal dispersion in a graded index fibre will be significantly lower than the value given by Eq. (5.79), which only applies to a step index fibre.

Worked Example 5.7 Dispersion and Symbol Rate
Information is transmitted on a 1 km long multimode step index optical fibre using RZ pulses of negligible duty cycle. The fibre has core and cladding refractive indices of 1.5 and 1.485, respectively.

(a) What is the maximum allowable symbol rate to avoid errors due to modal dispersion?
(b) If the above link is implemented using a graded index fibre with dispersion 0.4 ns/km, what is the new maximum symbol rate?
(c) If the link length in (b) is to be increased to 10 km without using intervening repeaters, what is the maximum bit rate possible on this longer link?

(a) We first calculate modal dispersion as

$$\frac{\tau_{ps}}{l} = \frac{n_2 \Delta}{c} = \frac{n_2 - n_1}{c} = \frac{0.015}{3 \times 10^5\ \text{km/s}} = 50\ \text{ns/km}$$

We then determine maximum symbol rate, noting that link length $l$ = 1 km, and that transmission is by RZ pulses of negligible duty cycle ($d \approx 0$)

$$R_s = \frac{2 - d}{2\tau_{ps}} = \frac{2 - 0}{2\tau_{ps}} = \frac{1}{\tau_{ps}} = \frac{1}{50\ \text{ns}} = \frac{10^9}{50} = 20\ \text{MBd}$$

(b) Total dispersion is now only 0.4 ns over the 1 km link. Using this value in the above calculation in place of 50 ns yields the new maximum symbol rate

$$R_s = \frac{1}{0.4\ \text{ns}} = \frac{10^9}{0.4} = 2.5\ \text{GBd}$$

(c) Total dispersion on the 10 km link will be 0.4 ns/km × 10 km = 4 ns. Therefore, the maximum allowed symbol rate on this longer link will be

$$R_s = \frac{1}{4\ \text{ns}} = \frac{10^9}{4} = 250\ \text{MBd}$$

Notice that as the repeater-less distance increases, the symbol rate decreases proportionately so that the product of symbol rate and link length is a constant called bandwidth-distance product or bandwidth-length product. The optical link in (b) thus has a bandwidth-length product of 2.5 GHz⋅km.
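The arithmetic of Worked Example 5.7 can be reproduced with a short script. This is only a sketch: the function names are hypothetical, the speed of light is taken as the rounded 3 × 10^5 km/s used in the example, and the step index estimate is the (n2 − n1)/c form used in part (a).

```python
C_KM_PER_S = 3e5  # rounded speed of light used in the worked example, km/s


def modal_dispersion_ns_per_km(n_core, n_clad):
    """Step index estimate from part (a): (n_core - n_clad) / c, in ns per km."""
    return (n_core - n_clad) / C_KM_PER_S * 1e9


def max_symbol_rate_bd(dispersion_ns, duty_cycle=0.0):
    """Eq. (5.77): Rs = (2 - d) / (2 * tau_ps), tau_ps given in ns, result in baud."""
    return (2.0 - duty_cycle) / (2.0 * dispersion_ns * 1e-9)


if __name__ == "__main__":
    tau_a = modal_dispersion_ns_per_km(1.5, 1.485) * 1.0   # 1 km step index link
    tau_b = 0.4 * 1.0                                      # 1 km graded index link
    tau_c = 0.4 * 10.0                                     # 10 km graded index link
    for label, tau in (("a", tau_a), ("b", tau_b), ("c", tau_c)):
        print(f"({label}) tau_ps = {tau:.3g} ns, Rs = {max_symbol_rate_bd(tau):.3g} Bd")
```

Setting `duty_cycle=1.0` gives the NRZ case of Eq. (5.78), which halves each of the rates above.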
Worked Example 5.8 Signal Power Budget on Optical Fibre Link

An optical fibre communication system has the following specifications:

● Transmit power = 1 mW.
● Receiver sensitivity = −42 dBm.
● Maximum length of continuous fibre section s = 2 km.
● Splicing loss 𝛼 = 0.5 dB per splice.
● Coupling loss = 1 dB at each end.

Determine the maximum length of this optical fibre link that can be operated without a repeater.
Let $l$ denote the maximum link length in km that we seek. It is more convenient to set out the solution compactly in tabular form, as shown in Table 5.2. Notice that all losses and the fade margin are entered as negative values so that they reduce the received signal power when algebraically summed in the final row. Received power is given in the final row of the table as

$$P_r = -(0.45l + 1.5)\ \text{dBm}$$

Since receiver sensitivity (i.e. minimum received signal power needed by the receiver for reliable operation) is specified as −42 dBm, we set the above expression for $P_r$ to this value and solve for link length $l$

$$P_r = -(0.45l + 1.5) = -42 \quad \Rightarrow \quad l = \frac{42 - 1.5}{0.45} = 90\ \text{km}$$

Table 5.2 Worked Example 5.8: Signal power budget on optical link.

| Quantity | Computation | Value |
|---|---|---|
| $P_t$, transmit output power | Given (1 mW) = 10 log10(1) dBm | 0 dBm |
| Total fibre loss | Fibre loss per km multiplied by total link length | −0.2l dB |
| Total coupling loss | Sum of coupling loss at either end (1 + 1 dB) | −2 dB |
| Total splicing loss | (Number of sections − 1)𝛼 = (l/s − 1)𝛼 | −0.5(l/2 − 1) dB |
| M, fade margin | This is zero at maximum link length since no minimum value was specified or required | −0 dB |
| $P_r$, received signal power | Algebraic sum of above 5 quantities | −(0.45l + 1.5) dBm |
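The budget of Table 5.2 is a linear equation in the link length, so it is easy to check numerically. The sketch below uses hypothetical function and parameter names; the 0.2 dB/km fibre loss is the value that appears in the table, and the number of splices is treated as the continuous quantity (l/s − 1), exactly as in the table.

```python
import math


def received_power_dbm(length_km, tx_mw=1.0, fibre_db_per_km=0.2,
                       section_km=2.0, splice_db=0.5,
                       coupling_db_each=1.0, margin_db=0.0):
    """Algebraic sum of the rows of Table 5.2 for a link of the given length."""
    tx_dbm = 10.0 * math.log10(tx_mw)            # 1 mW -> 0 dBm
    fibre_loss = fibre_db_per_km * length_km
    coupling_loss = 2.0 * coupling_db_each       # one coupler at each end
    splice_loss = splice_db * (length_km / section_km - 1.0)
    return tx_dbm - fibre_loss - coupling_loss - splice_loss - margin_db


def max_link_length_km(sensitivity_dbm, tx_mw=1.0, fibre_db_per_km=0.2,
                       section_km=2.0, splice_db=0.5,
                       coupling_db_each=1.0, margin_db=0.0):
    """Solve received_power_dbm(l) = sensitivity for l."""
    tx_dbm = 10.0 * math.log10(tx_mw)
    loss_per_km = fibre_db_per_km + splice_db / section_km      # 0.45 dB/km here
    fixed_loss = 2.0 * coupling_db_each - splice_db + margin_db  # 1.5 dB here
    return (tx_dbm - fixed_loss - sensitivity_dbm) / loss_per_km


if __name__ == "__main__":
    print(f"Maximum link length: {max_link_length_km(-42.0):.1f} km")
```

Passing a non-zero `margin_db` shows how quickly a fade margin shortens the repeater-less distance.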
5.5 Radio

Radio is a small portion of the wider electromagnetic spectrum tabulated in Table 5.3, the final column of which identifies the radio bands, optical bands, and ionising radiation bands. The radio spectrum has traditionally been divided into bands from ELF (extremely low frequency) to EHF (extra high frequency), with each band spanning a factor of 10 in frequency (and wavelength). The bands from UHF to EHF are collectively referred to as the microwave band, whereas the EHF band on its own is commonly called the millimetre wave band in recognition of the fact that wavelengths in this band have values ranging from 1 to 10 mm. Much of the microwave radio spectrum has been further divided into smaller sub-bands identified by letter designations, as listed in Table 5.4.

Table 5.3 The electromagnetic spectrum.

| Frequency band | Frequency | Wavelength |
|---|---|---|
| ELF (Extremely low frequency) | < 3 kHz | > 100 km |
| VLF (Very low frequency) | 3 – 30 kHz | 10 – 100 km |
| LF (Low frequency) | 30 – 300 kHz | 1 – 10 km |
| MF (Medium frequency) | 300 kHz – 3 MHz | 100 m – 1 km |
| HF (High frequency) | 3 – 30 MHz | 10 m – 100 m |
| VHF (Very high frequency) | 30 – 300 MHz | 1 m – 10 m |
| UHF (Ultra high frequency) | 300 MHz – 3 GHz | 10 cm – 1 m |
| SHF (Super high frequency) | 3 – 30 GHz | 1 cm – 10 cm |
| EHF (Extra high frequency) | 30 – 300 GHz | 1 mm – 1 cm |
| Sub-millimetric | 300 GHz – 3 THz | 100 µm – 1 mm |
| Far-infrared | 3 – 30 THz | 10 µm – 100 µm |
| Near-infrared | 30 – 430 THz | 698 nm – 10,000 nm |
| Visible light | 430 – 860 THz | 349 nm – 698 nm |
| Ultraviolet | 860 THz – 30 PHz | 10 nm – 349 nm |
| X-ray | 30 PHz – 300 EHz | 1 pm – 10,000 pm |
| Gamma ray | > 100 EHz | < 3 pm |

Band groupings marked in the final column: Radio Bands (ELF to EHF, with UHF to EHF marked Microwave and EHF marked Millimetre wave); Optical Bands; Ionizing Radiation.

NOTES: Standard SI prefixes: deca (da) = 10^1; hecto (h) = 10^2; kilo (k) = 10^3; mega (M) = 10^6; giga (G) = 10^9; tera (T) = 10^12; peta (P) = 10^15; exa (E) = 10^18; zetta (Z) = 10^21; yotta (Y) = 10^24; deci (d) = 10^−1; centi (c) = 10^−2; milli (m) = 10^−3; micro (µ) = 10^−6; nano (n) = 10^−9; pico (p) = 10^−12; femto (f) = 10^−15; atto (a) = 10^−18; zepto (z) = 10^−21; yocto (y) = 10^−24.

Table 5.4 Sub-bands of the microwave radio spectrum.

| Letter designation of band | Frequency range of band (GHz) |
|---|---|
| L | 1–2 |
| S | 2–4 |
| C | 4–8 |
| X | 8–12 |
| Ku | 12–18 |
| K | 18–26.5 |
| Ka | 26.5–40 |
| Q | 30–50 |
| U | 40–60 |
| V | 50–75 |
| E | 60–90 |
| W | 75–110 |
| F | 90–140 |
| D | 110–170 |

Figure 5.29 Snapshot of a vertically polarised electromagnetic wave propagating in the +z direction. The E field, E = Em cos(ωt – kz) x̂, lies along the x axis and the H field, H = Hm cos(ωt – kz) ŷ, along the y axis; the wavelength λ is marked along the z axis.

An electromagnetic wave consists of a changing electric field E, which generates a changing magnetic field H in the surrounding region, which in turn generates a changing electric field in the surrounding region, and so on. The coupled electric and magnetic fields therefore travel out in space. The speed of propagation in vacuum can be shown to be c = 299 792 458 m/s (often approximated as 3 × 10^8 m/s), which is the speed of light. The fields E and H are vector quantities having both magnitude and direction at every time instant and at every point in space covered by the wave. The direction or orientation of the E field defines the polarisation of the wave. Figure 5.29 shows a snapshot of a vertically polarised electromagnetic wave that propagates in the +z direction. Monitoring the E or H field strength at a fixed point in space we find that it varies sinusoidally with time, completing one cycle of values in a time T (called the wave period), which means that the wave completes 1/T cycles per second, a
quantity called the wave frequency f (in hertz, Hz). Since one cycle is 2𝜋 radian, it follows that the wave completes 2𝜋f radian per second, a quantity called the angular frequency of the wave and denoted 𝜔 (in rad/s). Taking a snapshot of the wave at a given time instant (as in Figure 5.29), we find that the fields also vary sinusoidally with distance z, completing one cycle in a distance 𝜆, called the wavelength of the wave. Again, since one cycle is 2𝜋 radian, it means that the wave completes 2𝜋 radian per 𝜆 metres, or 2𝜋/𝜆 radian per metre, a quantity called the wavenumber or phase constant of the wave and denoted 𝛽 (in rad/m).

Combining the two sinusoidal dependencies (on time t and distance z) gives an expression for the electric field value at any time t and distance z as E = Em cos(𝜔t ± 𝛽z). The crest (or peak) of this field is located wherever the argument 𝜔t ± 𝛽z of the cosine function in this expression equals zero. Therefore, to track (i.e. move in step with) this crest and hence determine the speed of propagation or phase velocity v of the wave, the value of z must change as time t increases in such a way that 𝜔t ± 𝛽z = 0 at all times. In the case of a negative sign, i.e. (𝜔t − 𝛽z), z must increase as t increases to keep 𝜔t − 𝛽z = 0, whereas in the case of a positive sign, i.e. (𝜔t + 𝛽z), z must decrease as t increases. Thus, Em cos(𝜔t − 𝛽z) represents a wave moving in the +z direction and Em cos(𝜔t + 𝛽z) is a wave moving in the −z direction.

The phase velocity is the ratio z/t in the equation 𝜔t − 𝛽z = 0. Thus, v = z/t = 𝜔/𝛽. Since 𝜔 = 2𝜋f and 𝛽 = 2𝜋/𝜆, it means that v = 𝜆f = 𝜆/T and 𝜆 = vT, from which we see that wavelength 𝜆 is the distance travelled by the wave in a time of one period T. The above parameters of a sinusoidally varying electromagnetic wave are related as follows

$$\begin{aligned}
\text{Period} &\equiv T && \text{(s)} \\
\text{Frequency:}\quad f &= 1/T && \text{(Hz)} \\
\text{Angular frequency:}\quad \omega &= 2\pi f && \text{(rad/s)} \\
\text{Wavelength} &\equiv \lambda && \text{(m)} \\
\text{Wavenumber:}\quad \beta &= 2\pi/\lambda && \text{(rad/m)} \\
\text{Phase velocity:}\quad v &= \omega/\beta = \lambda f && \text{(m/s)}
\end{aligned} \tag{5.80}$$
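The relations in Eq. (5.80) can be bundled into a small helper that derives every parameter from the frequency and the phase velocity. This is an illustrative sketch only; the function name and the dictionary keys are hypothetical, and the default phase velocity is the rounded 3 × 10^8 m/s quoted in the text.

```python
import math


def wave_parameters(freq_hz, phase_velocity=3e8):
    """Return the quantities of Eq. (5.80) for a sinusoidal wave."""
    period = 1.0 / freq_hz                      # T = 1/f (s)
    omega = 2.0 * math.pi * freq_hz             # angular frequency (rad/s)
    wavelength = phase_velocity / freq_hz       # lambda = v/f = vT (m)
    beta = 2.0 * math.pi / wavelength           # wavenumber (rad/m)
    return {
        "period_s": period,
        "omega_rad_per_s": omega,
        "wavelength_m": wavelength,
        "beta_rad_per_m": beta,
        "phase_velocity_m_per_s": omega / beta,  # recovers v = omega/beta
    }


if __name__ == "__main__":
    for f in (1e3, 100e6, 30e9):
        p = wave_parameters(f)
        print(f"f = {f:.3g} Hz: wavelength = {p['wavelength_m']:.3g} m, "
              f"beta = {p['beta_rad_per_m']:.3g} rad/m")
```

Recomputing the phase velocity as ω/β is a simple consistency check that the returned values satisfy v = 𝜆f.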
For radio waves in air

$$\text{Radio wavelength (m)} = \frac{\text{Speed of light}\ (\approx 3 \times 10^8\ \text{m/s})}{\text{Radio frequency (Hz)}} \tag{5.81}$$

Radio waves have been increasingly exploited for information transmission for over a hundred years and this trend is expected to continue into the foreseeable future. The information signal to be transmitted is first employed to vary a parameter (namely amplitude, frequency, phase, or combinations of these) of a high-frequency sinusoidal current signal in a process known as carrier modulation. The time-varying current signal is then fed into a suitably designed conducting structure or antenna, and this sets up a similarly time-varying magnetic field in the surrounding region, which in turn sets up a time-varying electric field, and so on. The antenna thereby radiates electromagnetic waves. Based on Faraday’s law of electromagnetic induction, we know that these waves can induce a similarly time-varying current signal in a distant receive-antenna. In this way the information signal may be recovered at the receiver, having travelled from the transmitter at the speed of light. For radio communications applications, an appropriate frequency band is usually chosen that satisfies the required coverage and propagation characteristics.
5.5.1 Maxwell’s Equations

Maxwell’s equations are a collection of four physical laws that successfully predict the existence of electromagnetic (EM) waves travelling at the speed of light. For simplicity, we will not discuss these equations in detail, but it is useful to be at least familiar with their word descriptions given below.

● The first of Maxwell’s equations is Gauss’s law, which states: the total electric flux emanating from a closed surface S is equal to the total charge enclosed within S. If the charge enclosed is zero, the electric flux lines are continuous.
● The second of Maxwell’s equations is Gauss’s law for magnetism, which states: the net magnetic flux out of a closed surface is always zero. Thus, there is no magnetic equivalent of electric charge.
● The third of Maxwell’s equations is Faraday’s law of electromagnetic induction, which stipulates that a time-varying magnetic field produces a time-varying electric field and formally states: the induced electromotive force (emf) along a closed path equals the rate of decrease of the magnetic flux through the area enclosed by the path.
● The fourth of Maxwell’s equations is Ampere’s modified law, which stipulates that a time-varying electric field produces a time-varying magnetic field and formally states: the line integral of the magnetic field intensity along a closed path equals the conduction and displacement current through the area enclosed by the path.
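For readers who prefer symbols, the four word statements above correspond to the following integral forms. This is a sketch in one common notation that the text itself does not introduce: D is the electric flux density, B the magnetic flux density, J the conduction current density, and Q_enc the charge enclosed by the closed surface S; C is a closed path bounding the area A.

```latex
\begin{aligned}
\oint_S \mathbf{D}\cdot d\mathbf{S} &= Q_{\text{enc}}
  && \text{(Gauss's law)} \\
\oint_S \mathbf{B}\cdot d\mathbf{S} &= 0
  && \text{(Gauss's law for magnetism)} \\
\oint_C \mathbf{E}\cdot d\mathbf{l} &= -\frac{d}{dt}\int_A \mathbf{B}\cdot d\mathbf{S}
  && \text{(Faraday's law)} \\
\oint_C \mathbf{H}\cdot d\mathbf{l} &= \int_A \left(\mathbf{J} + \frac{\partial \mathbf{D}}{\partial t}\right)\cdot d\mathbf{S}
  && \text{(Ampere's modified law)}
\end{aligned}
```

The minus sign in the third line expresses the "rate of decrease" of magnetic flux, and the two terms on the right of the fourth line are the conduction and displacement currents respectively.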
The above laws lead to wave equations in E and H which may be solved in a general propagation medium having primary parameters 𝜎, 𝜀, and 𝜇, being respectively the conductivity, electric permittivity, and magnetic permeability of the medium. We find in the solution that the E and H waves travel at the same speed, so we say that they are coupled together and constitute what is named electromagnetic wave – ‘electro’ for E and ‘magnetic’ for H. The solution yields expressions for important secondary parameters of the medium in terms of 𝜎, 𝜀, and 𝜇. These secondary parameters include those already introduced in Eq. (5.80) (i.e. wavenumber 𝛽, wavelength 𝜆, and phase velocity v) as well as impedance Z, refractive index n, and attenuation constant 𝛼.

The impedance Z at a given point in a medium is defined as the ratio between the amplitude $E_m$ of the electric field and the amplitude $H_m$ of the magnetic field observed at the same point. Since electric field is in volt per unit length and magnetic field is in ampere per unit length, note that this definition is consistent with the definition (in Section 5.3) of impedance on transmission lines as the ratio between voltage and current. The index of refraction or refractive index n of a medium is defined as the ratio between the speed c of electromagnetic wave in vacuum and the speed v of the same wave in the medium. That is

$$Z \equiv \frac{E_m}{H_m}; \qquad n \equiv \frac{\text{Speed of EM wave in vacuum}}{\text{Speed of same wave in medium}} \tag{5.82}$$
The attenuation constant 𝛼 is the rate (in neper per unit distance) at which the amplitudes of the E and H waves decay with distance travelled in the medium. Thus, if the EM wave propagates in the +z direction and its electric and magnetic components have amplitudes $E_m$ and $H_m$ at z = 0, then their amplitudes after travelling distance z = l in the medium will be

$$E(l) = E_m e^{-\alpha l}; \qquad H(l) = H_m e^{-\alpha l} \tag{5.83}$$
This means that if the attenuation constant 𝛼 is specified in Np/m, then the wave will be attenuated by 8.686𝛼 dB/m, since 20 log10(e) ≈ 8.686. The expressions for the six secondary parameters are tabulated in Table 5.5, including their exact forms in all media and in a perfect insulator (𝜎 = 0) and their approximations in good dielectric and good conductor media.

The electric permittivity 𝜀 of a medium is usually specified through the relative permittivity or dielectric constant $\varepsilon_r$ of the medium, from which $\varepsilon = \varepsilon_r \varepsilon_0$, where $\varepsilon_0 = 8.8541878128 \times 10^{-12}$ farad per metre (F/m) is the electric permittivity of free space or vacuum. Similarly, the relative permeability $\mu_r$ of a medium is usually specified, from which the magnetic permeability 𝜇 of the medium is $\mu = \mu_r \mu_0$, where $\mu_0 = 1.25663706212 \times 10^{-6}$ henry per metre (H/m) is the magnetic permeability of vacuum.

Table 5.5 gives very useful results. For example, impedance and wave speed in free space (where $\sigma = 0$, $\varepsilon = \varepsilon_0$, $\mu = \mu_0$) are given by

$$\begin{aligned}
Z &= \sqrt{\frac{\mu_0}{\varepsilon_0}} = \sqrt{\frac{1.25663706212 \times 10^{-6}}{8.8541878128 \times 10^{-12}}} = 376.73\ \Omega \\
v &= \frac{1}{\sqrt{\mu_0 \varepsilon_0}} = 299\,792\,458\ \text{m/s} \equiv c
\end{aligned} \tag{5.84}$$
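The free-space values quoted above, and the neper-to-decibel factor, are easy to confirm numerically. The following is a minimal sketch with hypothetical function names, using the values of 𝜀0 and 𝜇0 given in the text.

```python
import math

EPS0 = 8.8541878128e-12   # permittivity of vacuum, F/m (value from the text)
MU0 = 1.25663706212e-6    # permeability of vacuum, H/m (value from the text)


def free_space_impedance():
    """Z = sqrt(mu0 / eps0), in ohms."""
    return math.sqrt(MU0 / EPS0)


def free_space_speed():
    """v = 1 / sqrt(mu0 * eps0), in m/s."""
    return 1.0 / math.sqrt(MU0 * EPS0)


def neper_to_db(alpha_np):
    """Convert an attenuation in Np (or Np/m) to dB (or dB/m): 20*log10(e) per neper."""
    return 20.0 * math.log10(math.e) * alpha_np


if __name__ == "__main__":
    print(f"Z0 = {free_space_impedance():.2f} ohm")
    print(f"c  = {free_space_speed():.0f} m/s")
    print(f"1 Np = {neper_to_db(1.0):.3f} dB")
```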
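And the refractive index of a dielectric medium (where 𝜎 ≈ 0) is

$$n = \sqrt{\mu_r \varepsilon_r} \tag{5.85}$$

Table 5.5 Secondary parameters for electromagnetic wave in various media.

| Secondary parameter | Exact expression (all media) | Perfect insulator ($\sigma = 0$) | Good dielectric, $(\sigma/\omega\varepsilon)^2 \ll 1$ | Good conductor, $(\sigma/\omega\varepsilon)^2 \gg 1$ |
|---|---|---|---|---|
| Wave number, $\beta$ | $\omega\sqrt{\dfrac{\mu\varepsilon}{2}\left[\sqrt{1 + \left(\dfrac{\sigma}{\omega\varepsilon}\right)^2} + 1\right]}$ | $= \omega\sqrt{\mu\varepsilon}$ | $\approx \omega\sqrt{\mu\varepsilon}$ | $\approx \sqrt{\dfrac{\omega\sigma\mu}{2}}$ |
| Attenuation constant, $\alpha$ | $\omega\sqrt{\dfrac{\mu\varepsilon}{2}\left[\sqrt{1 + \left(\dfrac{\sigma}{\omega\varepsilon}\right)^2} - 1\right]}$ | $0$ | $\approx \dfrac{\sigma}{2}\sqrt{\dfrac{\mu}{\varepsilon}}$ | $\approx \sqrt{\dfrac{\omega\sigma\mu}{2}}$ |
| Wave impedance, $Z$ | $\sqrt{\dfrac{j\omega\mu}{\sigma + j\omega\varepsilon}}$ | $= \sqrt{\dfrac{\mu}{\varepsilon}}$ | $\approx \sqrt{\dfrac{\mu}{\varepsilon}}$ | $\approx \sqrt{\dfrac{\omega\mu}{2\sigma}}\,(1 + j)$ |
| Wavelength, $\lambda$ | $2\pi/\beta$ | $= \dfrac{2\pi}{\omega\sqrt{\mu\varepsilon}}$ | $\approx \dfrac{2\pi}{\omega\sqrt{\mu\varepsilon}}$ | $\approx 2\pi\sqrt{\dfrac{2}{\omega\sigma\mu}}$ |
| Phase velocity, $v$ | $\omega/\beta$ | $= 1/\sqrt{\mu\varepsilon}$ | $\approx 1/\sqrt{\mu\varepsilon}$ | $\approx \sqrt{2\omega/\sigma\mu}$ |
| Refractive index, $n$ | $c/v$ | $= \sqrt{\mu_r \varepsilon_r}$ | $\approx \sqrt{\mu_r \varepsilon_r}$ | $\approx \sqrt{\sigma\mu_r/2\omega\varepsilon_0}$ |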
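To see how closely the good-conductor and good-dielectric columns of Table 5.5 track the exact expressions, the exact forms can be evaluated directly. The sketch below is illustrative only: the function name is hypothetical, and the copper conductivity of 5.8 × 10^7 S/m used in the demonstration is an assumed textbook value, not one given in this section. The bracketed term for 𝛼 is rewritten as x²/(√(1 + x²) + 1), which is algebraically identical to √(1 + x²) − 1 but avoids loss of precision when x = 𝜎/𝜔𝜀 is very small.

```python
import cmath
import math

EPS0 = 8.8541878128e-12   # F/m
MU0 = 1.25663706212e-6    # H/m


def secondary_parameters(freq_hz, sigma, eps_r=1.0, mu_r=1.0):
    """Exact beta, alpha, Z, wavelength and phase velocity from Table 5.5."""
    omega = 2.0 * math.pi * freq_hz
    eps, mu = eps_r * EPS0, mu_r * MU0
    x = sigma / (omega * eps)                 # loss ratio sigma/(omega*eps)
    root = math.sqrt(1.0 + x * x)
    beta = omega * math.sqrt(mu * eps / 2.0 * (root + 1.0))
    # (root - 1) written in a numerically stable, algebraically equal form
    alpha = omega * math.sqrt(mu * eps / 2.0 * (x * x / (root + 1.0)))
    z = cmath.sqrt(1j * omega * mu / (sigma + 1j * omega * eps))
    return {"beta": beta, "alpha": alpha, "Z": z,
            "wavelength": 2.0 * math.pi / beta, "v": omega / beta}


if __name__ == "__main__":
    f = 1e6
    sigma_cu = 5.8e7                          # assumed conductivity of copper, S/m
    exact = secondary_parameters(f, sigma_cu)
    approx = math.sqrt(2.0 * math.pi * f * sigma_cu * MU0 / 2.0)
    print(f"copper at 1 MHz: alpha exact = {exact['alpha']:.6g} Np/m, "
          f"good-conductor approx = {approx:.6g} Np/m")
```

With 𝜎 = 0 the function reduces to the perfect insulator column, which is a convenient sanity check.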
The solution of Maxwell’s equations in free space further reveals that the electric and magnetic fields are mutually perpendicular to each other and to the direction of propagation, the orientations of the three vectors being related according to the right-hand rule: let the four fingers of the right hand be opened out together and the thumb outstretched perpendicular to them; then, if the fingers are curled to fold the electric field vector onto the magnetic field vector, the thumb points in the direction of propagation. Thus, a complete expression of each field for a plane electromagnetic wave propagating in the +z direction in a loss-free medium (𝛼 = 0) is

$$\begin{aligned}
\mathbf{E} &= E_m \cos(\omega t - \beta z)\,\hat{\mathbf{x}} \\
\mathbf{H} &= H_m \cos(\omega t - \beta z)\,\hat{\mathbf{y}}
\end{aligned} \tag{5.86}$$

where $\hat{\mathbf{x}}$, $\hat{\mathbf{y}}$ are unit vectors in the +x and +y directions, respectively, E is in units of volt per metre (V/m), H in ampere per metre (A/m), 𝜔 is angular frequency (rad/s), 𝛽 is wavenumber (rad/m), and $E_m$ and $H_m$ are the respective amplitudes of the fields.
5.5.2 Radio Wave Propagation Modes

How a radio wave makes its journey from transmitter to receiver, i.e. its mode of propagation, very much depends on the wave frequency. The modes of radio wave propagation in the atmosphere are depicted in Figure 5.30, which shows how a radio wave travels from a transmitter to a receiver in five different ways, namely ground wave, sky wave, line-of-sight (LOS), ionospheric scatter, and tropospheric scatter. Before discussing these propagation modes, it is useful first to review the vertical structure of the atmosphere.

Figure 5.30 Different radio wave propagation modes in the atmosphere: ground wave, line-of-sight (LOS), tropospheric scatter, ionospheric scatter, and sky wave, between a transmitter (Tx) and a receiver (Rx) on the earth.

Figure 5.31 Vertical structure of the earth’s atmosphere (not to scale). Layers shown, from the earth upwards: troposphere (to about 16 km, ending at the tropopause, −55 °C, a temperature inversion layer); stratosphere (ozone; positive temperature gradient; to 30 km, ending at the stratopause, 0 °C); mesosphere (negative temperature gradient; to 80 km, ending at the mesopause, −90 °C); ionosphere or thermosphere (tenuous; plasma; neutral molecules; to 2000 km), with the D-layer (< ~90 km), E-layer (< ~120 km), and F-layer (~200–600 km) marked.

Figure 5.31 shows the division of the earth’s atmosphere into four shells. The troposphere is the lowest layer of the atmosphere and contains approximately 75% of total gaseous mass and 99% of water vapour and aerosols. The average depth of the troposphere varies with latitude, reducing from about 18 km in the tropics to as low as 6 km
at the poles in winter. It has a negative temperature gradient of about −6.5 °C/km and terminates in most places with a temperature inversion layer called the tropopause, which effectively limits convection. Above the tropopause lies the stratosphere, a region that contains most of the atmospheric ozone and extends to about 30 km. Temperature increases with height to a maximum of about 0 °C at the stratopause. The heating of the stratopause is caused by absorption of the sun’s ultraviolet radiation by the ozone. The mesosphere is a region of negative temperature gradient that lies above the stratopause. It extends to about 80 km where temperatures reach a minimum of about −90 °C.

The ionosphere or thermosphere is a region of plasma (electrons and positive ions) and large quantities of neutral molecules extending from about 80 km to 2000 km above the earth’s surface. In this region, the sun’s radiation is strong enough to cause the ionisation of gas molecules. Since at this height the atmosphere is rarefied, the rate of recombination (of electron and ion back to neutral molecule) is very low. Thus, electrons, ions, and neutral molecules coexist. The amount of ionisation is usually given in terms of the number of free electrons per cubic metre.

The term homosphere is sometimes used to refer to the first 80 km of the atmosphere where atmospheric gases are well mixed, and the term heterosphere is used to refer to the rest of the atmosphere beyond this height where the gases tend to stratify according to weight. About 99.9998% of the atmosphere is contained in the homosphere, of which about 75% lies in the comparatively small tropospheric volume. The ionosphere is therefore very tenuous.
Figure 5.32 Radio wave propagation effects in the troposphere and ionosphere. Ionospheric effects (below about 3 GHz): Faraday rotation, direction of arrival variation, scintillation, group delay, dispersion, absorption, Doppler frequency shift. Tropospheric effects: refraction, gaseous absorption, rain attenuation, depolarisation, cloud & fog attenuation, scintillation. Free space path loss is marked along the path.
Practically all radio wave impairments in the atmosphere occur in the troposphere (due to interactions of the radio wave with gases, hydrometeors, i.e. airborne water and ice particles, and aerosols), and in the ionosphere (due to interactions of the radio wave with free electrons and ions). The only exception is free space path loss, which does not require the presence of matter in the transmission medium and is in any case, strictly speaking, not an impairment. Figure 5.32 shows a catalogue of various tropospheric and ionospheric propagation effects on a radio link between an earth station and a satellite.

5.5.2.1 Ground Wave

Ground-wave propagation is the dominant mode of propagation below about 2 MHz. The wave follows the contour of the earth by diffraction. This is the propagation mode used in amplitude modulation (AM) broadcasting. Reliable intercontinental communication can be achieved by ground-wave communication using high transmit power (>1 MW). The minimum usable frequency is determined by the condition that for efficient radiation the antenna should be longer than one-tenth of the signal wavelength. For example, a 1 kHz wave has a wavelength of 300 km – from Eq. (5.81) – and therefore would require an antenna at least 30 km long to efficiently radiate it. Energy in the surface waves is expended in setting up displacement and conduction currents in the earth. The resulting wave attenuation increases with frequency and depends on the salinity (salt content) of the earth’s surface. This attenuation is what sets the maximum usable frequency.

5.5.2.2 Sky Wave
In the sky-wave propagation mode, the atmosphere acts like a gigantic waveguide that is bounded below by the earth’s surface and above by a layer of the ionosphere. The wave travels by repeated reflections from these two layers. It should be noted, however, that describing this phenomenon as a reflection from an ionospheric layer is an oversimplification for convenience. What happens in practice is that the radio wave undergoes continuous refraction as it travels upwards through a decreasing refractive index profile until it reaches a layer in the ionosphere at which the condition for total internal reflection is satisfied and the wave is thereby reflected towards the earth. Sky wave is the dominant mode of propagation for frequencies between about 2 and 30 MHz. It is the sky wave mode of propagation that contributes to the multipath fading experienced by AM broadcasts at night when the transmitted signal reaches a reception point via both ground wave and sky wave (after reflection from the D-layer).

5.5.2.3 Line-of-sight (LOS)

This is the dominant mode of propagation for frequencies above about 30 MHz. The wave propagates directly from transmitter to receiver without relying on any intervening matter. The signal path must be above the radio horizon, which differs from the geographical horizon due to atmospheric refraction. If the two antennas are located on the earth’s surface then the maximum distance d between the transmit-antenna of height $h_t$ and the receive-antenna of height $h_r$ for LOS to still be achieved (i.e. for the receiver to still ‘see’ the transmitter) is given by

$$d = \sqrt{2R_e}\left(\sqrt{h_t} + \sqrt{h_r}\right) \tag{5.87}$$

Here, $R_e = 8.5 \times 10^6$ m is the equivalent radius of the earth, which is 4/3 times the true mean earth radius of 6378.137 km. Using an equivalent earth radius gives correct results with the radio path treated as a straight line from the transmitter to the receiver, whereas radio waves are bent due to refraction. We see from Eq. (5.87) that to increase d we must increase the transmitter and/or receiver height. This partly explains why the antennas used for radio relay links are usually located on raised towers. LOS is also the mode of propagation employed in (unobstructed) mobile communications and satellite communications, although Eq. (5.87) is clearly not applicable to the latter.

5.5.2.4 Satellite Communications
In satellite communications, one end of the link is a transmitting and/or receiving earth station and the other end is an earth-orbiting spacecraft or satellite equipped with signal reception and transmission facilities. If the satellite is placed in an eastward orbit directly above the equator at an altitude of 35 786 km then it circles this orbit in synchrony with the earth’s rotation and therefore remains above the same spot on the earth. This special orbit is described as geostationary, and its use allows signal transmission to and reception from the satellite by an earth station antenna that is pointed in a fixed direction (towards the satellite). Tracking is not required for continuous communication, which is a significant reduction in earth station complexity and costs. A unique advantage offered by satellite communication is broad-area coverage. One geostationary satellite can be ‘seen’ from about 34% of the earth’s entire surface at elevation angles not below 10∘ , allowing reliable communications to take place between any two points within this large coverage area. In its simplest design, the satellite acts as a repeater located in the sky. It receives a weak signal from an earth station transmitter, boosts the signal strength, and re-transmits it back to earth on a different frequency. The original signal can then be received by one or more earth stations. In this way, a live television broadcast of an important event can be provided to many countries using one satellite (and of course many receiving earth stations). A relay of just three geostationary satellites allows the entire world to be covered, except for the polar regions, which cannot be seen from a geostationary orbit (GEO). The main drawback of the GEO is that the large distance between the earth station and the satellite gives rise to large propagation delays (≥120 ms one way) and extremely weak received signal strengths. 
These problems are alleviated by using low earth orbits (LEOs), at altitudes from about 500 to 2000 km, and medium earth orbits (MEOs), at altitudes in theory from above 2000 km to below GEO height but in practice around 19 000 to 24 000 km. However, tracking is required for communication using one LEO or MEO satellite. Furthermore, communication from a given spot on the earth is limited to a short duration each day when the satellite is visible. The use of a constellation of many such satellites, equipped with intersatellite links (to facilitate seamless handover from one satellite that is disappearing below the horizon to another visible satellite), allows continuous communication using nontracking portable units. The Iridium satellite constellation is a good operational example of this type of satellite communication system design.
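Several of the distances and delays quoted in the line-of-sight and satellite discussions can be checked with a few lines of code. The sketch below is illustrative and rests on stated assumptions: the function names are hypothetical, the earth is treated as a sphere of radius 6378.137 km, the coverage fraction is computed as the area of the spherical cap from which the satellite appears at or above a given elevation angle, and the delay is the one-way value for an earth station directly beneath the satellite (the shortest possible path).

```python
import math

RE_EQUIV_M = 8.5e6            # equivalent earth radius used in Eq. (5.87), m
EARTH_RADIUS_KM = 6378.137    # earth radius quoted in the text, km
GEO_ALTITUDE_KM = 35786.0     # geostationary altitude quoted in the text, km
C_KM_PER_S = 299792.458       # speed of light, km/s


def los_distance_m(ht_m, hr_m):
    """Eq. (5.87): maximum LOS distance for antenna heights ht and hr (metres)."""
    return math.sqrt(2.0 * RE_EQUIV_M) * (math.sqrt(ht_m) + math.sqrt(hr_m))


def min_one_way_delay_ms(altitude_km=GEO_ALTITUDE_KM):
    """One-way propagation delay to a satellite directly overhead, in ms."""
    return altitude_km / C_KM_PER_S * 1e3


def coverage_fraction(min_elevation_deg, altitude_km=GEO_ALTITUDE_KM):
    """Fraction of a spherical earth seeing the satellite above min elevation."""
    el = math.radians(min_elevation_deg)
    r = EARTH_RADIUS_KM + altitude_km
    # earth-central angle from the sub-satellite point to the coverage edge
    gamma = math.acos(EARTH_RADIUS_KM * math.cos(el) / r) - el
    return (1.0 - math.cos(gamma)) / 2.0      # spherical cap area / sphere area


if __name__ == "__main__":
    print(f"LOS distance, 100 m and 10 m masts: {los_distance_m(100, 10) / 1e3:.1f} km")
    print(f"GEO minimum one-way delay: {min_one_way_delay_ms():.1f} ms")
    print(f"GEO coverage at >= 10 deg elevation: {100 * coverage_fraction(10):.1f} %")
```

Calling `coverage_fraction` and `min_one_way_delay_ms` with a LEO altitude (for example 780 km, an assumed figure) illustrates the trade-off described above: far smaller delay, but a much smaller footprint per satellite.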
5.5.2.5 Mobile Communications
A mobile communication link consists of a mobile unit, which is capable of large-scale motion, at one end and a fixed base station at the other end, both equipped with radio transmission and reception facilities. An important difference between the mobile link and a fixed satellite or radio relay link is the increased significance of multipath propagation. The transmitted radio signal reaches the other end of the link through various paths due to reflections from a plane earth’s surface, nearby buildings and/or terrain features and diffraction around obstacles. In fact, the LOS path is often blocked altogether at some locations of the mobile unit. As the mobile unit moves, the received signal strength will therefore vary rapidly in a complicated manner since the multipath components add destructively in some locations and constructively in others.

5.5.2.6 Ionospheric Scatter

Radio waves are scattered by refractive index irregularities in the lower ionosphere. This mode occurs for frequencies between about 30 and 60 MHz and can provide communication (beyond LOS) up to distances of several thousand kilometres.

5.5.2.7 Tropospheric Scatter
Radio waves at frequencies of about 40 MHz to 4 GHz are scattered by refractive index irregularities in the troposphere. This scattering can provide communication beyond the visual horizon to distances of several hundred kilometres. Reliable communication with annual availability up to 99.9% is readily achieved over distances up to 650 km on a single hop, or thousands of kilometres using several hops in tandem. Tropospheric scatter (or troposcatter) communication has today been largely replaced by satellite communications, but it remains an attractive method to the military because it requires transmitter–receiver alignment which makes interception of the signal by an unauthorised entity more difficult. It has been used to provide multichannel FDM telephony (with a capacity of 12–240 voice channels) or wideband data communication over long distances spanning large bodies of water or other inhospitable terrain. Tropospheric scatter can, however, be a source of interference in radio relay and satellite communication systems when two independent systems become coupled as a result of their radio paths traversing a common scattering volume, which causes the signal of one system to be scattered into the antenna of the other.
5.5.3 Radio Wave Propagation Effects

The presence of matter in the propagation medium and variations or discontinuities in the primary parameters of the medium (namely permittivity, permeability, and conductivity) will affect radio waves in various ways, leading to the propagation mechanisms of reflection, refraction, diffraction, and scattering and various propagation effects, as enumerated in Figure 5.32. It should be noted that scattering is both a mechanism (i.e. a means) of propagation and an effect that is imposed on radio waves. As the former, it enables beneficial applications such as troposcatter communication but also causes unwanted impairments such as interference. As the latter, it gives rise to attenuation of the radio signal. We will here, however, consider only the attenuation effect of scattering along with absorption as we briefly summarise various ionospheric and tropospheric effects on radio waves. The propagation mechanisms of reflection, refraction, and diffraction are discussed in Sections 5.5.4–5.5.6.

5.5.3.1 Ionospheric Effects
Table 5.6 summarises the frequency dependence and maximum values of various ionospheric effects at 1, 3, and 10 GHz. The values listed are for total electron content (TEC) = 10^18 electrons/m² and path elevation = 30°, and are meant to convey a rough idea of the scale of the various effects. In addition, we deduce from this table that ionospheric effects decrease rapidly with frequency and the ionosphere may therefore be regarded as transparent to radio waves above about 3 GHz.
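The rapid fall-off with frequency can be made concrete by scaling a known value according to its stated frequency dependence. The sketch below uses a hypothetical helper name and takes the Faraday rotation entry of Table 5.6 (108° at 1 GHz, with a 1/f² dependence) as its reference value.

```python
def scale_with_frequency(value_at_ref, f_ref_ghz, f_ghz, exponent=2):
    """Scale an effect that varies as 1/f**exponent from f_ref to f."""
    return value_at_ref * (f_ref_ghz / f_ghz) ** exponent


if __name__ == "__main__":
    faraday_at_1ghz_deg = 108.0   # Table 5.6 entry at 1 GHz
    for f in (1.0, 3.0, 10.0):
        rotation = scale_with_frequency(faraday_at_1ghz_deg, 1.0, f)
        print(f"{f:>4.0f} GHz: Faraday rotation ~ {rotation:.2f} deg")
```

The scaled values agree with the tabulated 3 GHz and 10 GHz entries to within rounding.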
Table 5.6 Frequency dependence and maximum values of various ionospheric effects (assuming TEC = 10^18 el/m², and path elevation angle = 30°).

| Effect | Frequency dependence | 1 GHz | 3 GHz | 10 GHz |
|---|---|---|---|---|
| Faraday rotation | 1/f² | 108° | 12° | 1.1° |
0.25 μs 1. In this case

$$m = \frac{14 - (-2)}{14 + (-2)} = 1.33$$

The AM signal is said to be overmodulated. There is then a portion of the AM waveform at which the amplitude $V_{am} = V_c + kv_m(t)$ is negative. This is equivalent to a phase shift of 180°, or phase reversal. There is a carrier phase reversal at every point where the top envelope crosses the x axis. The top and bottom envelopes cross each other at these points. You will also observe that the envelope of the AM signal is no longer a replica of the message signal. This envelope distortion makes it impossible to recover the original message signal from the AM waveform envelope. Overmodulation must be avoided by ensuring that the message signal $v_m(t)$ satisfies the following condition

$$kV_m \le V_c \tag{7.5}$$
7 Amplitude Modulation
Figure 7.3 AM signal with (a) 100% modulation; (b) Over-modulation. (Horizontal axis: t, ms. Panel (a): Vppmin = B − A = 0 and Vppmax = D − C = 12 V. Panel (b): Vppmin = B − A = −2 V and Vppmax = D − C = 14 V, with the phase reversals marked.)
where $V_m$ is a positive voltage equal to the maximum excursion of the message signal below 0 V, $V_c$ is the carrier amplitude, and k is the modulation sensitivity (usually k = 1).

Worked Example 7.1
An audio signal $v_m(t) = 30\sin(5000\pi t)$ V modulates the amplitude of a carrier $v_c(t) = 65\sin(50000\pi t)$ V.
(a) Sketch the AM waveform.
(b) What is the modulation factor?
(c) Determine the modulation sensitivity that would give a modulation index of 80%.
(d) If the message signal amplitude is changed to a new value that is 6 dB below the carrier amplitude, determine the resulting modulation factor.
(a) We will sketch the AM waveform over two cycles of the audio signal $v_m(t)$. The audio signal frequency $f_m$ and the carrier frequency $f_c$ must be known in order to determine how many carrier cycles are completed in one cycle of $v_m(t)$

$$ v_m(t) = 30\sin(5000\pi t) \equiv V_m\sin(2\pi f_m t) \;\Rightarrow\; f_m = 2.5\ \text{kHz} $$

$$ v_c(t) = 65\sin(50000\pi t) \equiv V_c\sin(2\pi f_c t) \;\Rightarrow\; f_c = 25\ \text{kHz} $$
Thus, vm (t) has the period T m = 1/f m = 0.4 ms and carrier frequency f c = 10f m , which means that the carrier completes 10 cycles in the time 0.4 ms that it takes the audio signal to complete one cycle. Following the steps outlined in Section 7.2.2 for sketching AM waveforms, we sketch two cycles of vm (t) at level V c = 65 V, and two cycles of −vm (t) at level −65 V. This defines the envelope of the AM waveform. We then sketch in the carrier, stretching its amplitude to always touch the envelope, and ensuring that there are exactly 10 cycles of this carrier in one cycle of the envelope. This completes the required sketch of the AM waveform vam (t), which is shown properly labelled in Figure 7.4. Note that we assumed modulation sensitivity k = 1. This is always the case, except where specifically otherwise indicated, as in (c) below.
7.2 AM Signals: Time Domain Description
Figure 7.4 AM waveform in Worked Example 7.1. (Vertical axis levels: ±35 V, ±65 V, and ±95 V, with Vppmax and Vppmin marked. Horizontal axis: t from 0 to 0.8 ms.)
(b) From Figure 7.4 and Eq. (7.2)

$$ V_{pp\max} = 2 \times 95 = 190\ \text{V}; \quad V_{pp\min} = 2 \times 35 = 70\ \text{V} $$

$$ m = \frac{190 - 70}{190 + 70} = 0.462 $$

(c) The expressions for $V_{pp\max}$ and $V_{pp\min}$, taking account of a nonunity modulation sensitivity k, are

$$ V_{pp\max} = 2(V_c + kV_m); \quad V_{pp\min} = 2(V_c - kV_m) $$

Using Eq. (7.2)

$$ m = \frac{2(V_c + kV_m) - 2(V_c - kV_m)}{2(V_c + kV_m) + 2(V_c - kV_m)} = \frac{kV_m}{V_c} $$

Therefore

$$ k = \frac{mV_c}{V_m} = \frac{0.8 \times 65}{30} = 1.73 $$

(d) Assuming k = 1, we have from (c) above

$$ m = \frac{V_m}{V_c} $$

Given that $V_m$ is 6 dB below $V_c$, we have

$$ 20\log_{10}\left(\frac{V_m}{V_c}\right) = -6 $$

or

$$ \frac{V_m}{V_c} = 10^{(-6/20)} \approx 0.5 $$

Therefore, m = 0.5. Observe that m is simply the dB value converted to a ratio.
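The arithmetic in parts (b) to (d) of Worked Example 7.1 can be checked with a few lines of Python. This is only a sketch: the function names are illustrative and not from the text, and it assumes the peak-to-peak form of Eq. (7.2) used in part (b).

```python
def modulation_factor(vpp_max, vpp_min):
    """Modulation factor from peak-to-peak envelope values, as used in part (b)."""
    return (vpp_max - vpp_min) / (vpp_max + vpp_min)

def sensitivity_for(m, vc, vm):
    """Modulation sensitivity k that gives modulation factor m, from m = k*Vm/Vc."""
    return m * vc / vm

def ratio_from_db(db):
    """Convert a voltage ratio expressed in dB to a plain ratio."""
    return 10 ** (db / 20)

VC, VM = 65.0, 30.0                       # amplitudes from Worked Example 7.1

m_b = modulation_factor(2 * (VC + VM), 2 * (VC - VM))   # part (b), k = 1
k_c = sensitivity_for(0.8, VC, VM)                      # part (c)
m_d = ratio_from_db(-6.0)                               # part (d), k = 1
m_over = modulation_factor(14.0, -2.0)                  # overmodulated case of Figure 7.3b

print(f"(b) m = {m_b:.3f}")
print(f"(c) k = {k_c:.2f}")
print(f"(d) m = {m_d:.3f}")
print(f"Figure 7.3b: m = {m_over:.2f}")
```

Note that −6 dB corresponds to a ratio slightly above 0.5, which is why the worked example rounds the result.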
7.3 Spectrum and Power of Amplitude Modulated Signals

By virtue of the Fourier theorem, any message signal $v_m(t)$ can be realised as the discrete or continuous sum of sinusoidal signals. The spectrum of an AM signal can therefore be obtained by considering a sinusoidal message signal and extending the result to information signals, which in general consist of a band of frequencies (or sinusoids).
7.3.1 Sinusoidal Modulating Signal

So, consider a carrier signal of amplitude $V_c$ and frequency $f_c$

$$ v_c(t) = V_c\cos(2\pi f_c t) \tag{7.6} $$

and a sinusoidal modulating signal of amplitude $V_m$ and frequency $f_m$

$$ v_m(t) = V_m\cos(2\pi f_m t) \tag{7.7} $$

Usually

$$ V_m \le V_c; \quad f_m \ll f_c \tag{7.8} $$

An expression for the AM signal $v_{am}(t)$ is obtained by replacing the constant carrier amplitude $V_c$ in Eq. (7.6) with the expression for the modulated amplitude $V_{am}$ given in Eq. (7.1)

$$ v_{am}(t) = [V_c + kv_m(t)]\cos(2\pi f_c t) = [V_c + V_m\cos(2\pi f_m t)]\cos(2\pi f_c t), \quad \text{for } k = 1 \tag{7.9} $$

where we have set k = 1 (as is usually the case), and substituted the expression for $v_m(t)$ from Eq. (7.7). Following the steps outlined in Section 7.2, the AM waveform given by Eq. (7.9) is sketched in Figure 7.5. It is obvious from Figure 7.5 that the maximum and minimum peak-to-peak amplitudes of $v_{am}(t)$ are given, respectively, by

$$ V_{pp\max} = (V_c + V_m) - (-V_c - V_m) = 2(V_c + V_m) $$

$$ V_{pp\min} = (V_c - V_m) - (-V_c + V_m) = 2(V_c - V_m) $$

Figure 7.5 AM waveform for a sinusoidal message signal of frequency $f_m$ and amplitude $V_m$. The plot is for carrier frequency $f_c = 100f_m$, and carrier amplitude $V_c = 3V_m$. (Marked envelope levels: $\pm(V_c + V_m)$, $\pm V_c$, and $\pm(V_c - V_m)$, with Vppmax and Vppmin indicated.)
From Eq. (7.2), we obtain a simple expression for the modulation factor m in the special case of a sinusoidal message signal and the usual case of unity modulation sensitivity

$$ m = \frac{2(V_c + V_m) - 2(V_c - V_m)}{2(V_c + V_m) + 2(V_c - V_m)} = \frac{V_m}{V_c} \tag{7.10} $$
The modulation factor is given by the ratio between the amplitude of a sinusoidal message signal and the amplitude of the carrier signal. The ideal modulation factor m = 1 is obtained when $V_m = V_c$, and overmodulation occurs whenever $V_m > V_c$. Returning to Eq. (7.9) and expanding it using the trigonometric identity

$$ \cos A\cos B = \tfrac{1}{2}\cos(A - B) + \tfrac{1}{2}\cos(A + B) $$

we obtain

$$ v_{am}(t) = V_c\cos(2\pi f_c t) + \frac{V_m}{2}\cos[2\pi(f_c - f_m)t] + \frac{V_m}{2}\cos[2\pi(f_c + f_m)t] \tag{7.11} $$

The AM signal therefore contains three frequency components, namely

● The carrier frequency $f_c$ with amplitude $V_c$.
● A frequency component $f_c - f_m$ with amplitude $\tfrac{1}{2}V_m$. This frequency component is called the LSF, since it lies below the carrier frequency.
● A frequency component $f_c + f_m$ with amplitude $\tfrac{1}{2}V_m$. This frequency component is called the USF, since it lies above the carrier frequency.
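The equivalence between the product form of Eq. (7.9) and the three-component form of Eq. (7.11) can be confirmed numerically. The following is a minimal sketch; the amplitudes, frequencies, and sampling step are arbitrary illustrative values rather than values from the text.

```python
import math

def am_product_form(t, vc, vm, fc, fm):
    """AM signal as a carrier with modulated amplitude, Eq. (7.9) with k = 1."""
    return (vc + vm * math.cos(2 * math.pi * fm * t)) * math.cos(2 * math.pi * fc * t)

def am_three_component_form(t, vc, vm, fc, fm):
    """AM signal as carrier + LSF + USF, Eq. (7.11)."""
    return (vc * math.cos(2 * math.pi * fc * t)
            + 0.5 * vm * math.cos(2 * math.pi * (fc - fm) * t)
            + 0.5 * vm * math.cos(2 * math.pi * (fc + fm) * t))

# Assumed illustrative values: Vc = 3 V, Vm = 1 V, fc = 100 kHz, fm = 1 kHz
VC, VM, FC, FM = 3.0, 1.0, 100e3, 1e3
STEP = 1e-7                                # 0.1 microsecond time step

# Compare the two forms over two message cycles (2 ms)
max_diff = max(
    abs(am_product_form(n * STEP, VC, VM, FC, FM)
        - am_three_component_form(n * STEP, VC, VM, FC, FM))
    for n in range(20000)
)
print(f"largest difference between the two forms: {max_diff:.2e} V")
```

The difference is at the level of floating-point rounding, as expected for an exact trigonometric identity.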
Note that the AM signal does not contain any component at the message signal frequency $f_m$. Figure 7.6 shows the single-sided amplitude spectra of $v_m(t)$, $v_c(t)$, and $v_{am}(t)$. You will observe by studying this figure along with Eq. (7.11) that AM translates the message frequency $f_m$ of amplitude $V_m$ to two side frequencies $f_c - f_m$ and $f_c + f_m$, each of amplitude $\tfrac{1}{2}V_m$. Let us denote this process as follows

$$ f_m\big|_{V_m} \xrightarrow[f_c]{\text{AM}} (f_c - f_m)\big|_{\frac{1}{2}V_m} + (f_c + f_m)\big|_{\frac{1}{2}V_m} \tag{7.12} $$
Equation (7.12) states that a frequency component $f_m$ of amplitude $V_m$ is translated by an AM process (that uses a carrier of frequency $f_c$) to two new frequency components at $f_c - f_m$ and $f_c + f_m$, each of amplitude $\tfrac{1}{2}V_m$. From Eq. (7.10), this amplitude can be expressed in terms of the carrier amplitude $V_c$ and the modulation factor m

$$ \frac{V_m}{2} = \frac{mV_c}{2} \tag{7.13} $$

Treating the sinusoidal message signal as a lowpass (i.e. baseband) signal, and the AM signal as a bandpass signal, it follows (see Section 4.7.3) that they have the following bandwidths

Message signal bandwidth = $f_m$
AM signal bandwidth = $(f_c + f_m) - (f_c - f_m) = 2f_m$

Thus, the AM bandwidth is twice the message bandwidth. It should be noted that erroneously treating the sinusoidal message signal $v_m(t)$ as a bandpass signal would give it zero bandwidth, since the width of significant frequency components is the difference between maximum and minimum frequency, which in this case are both equal to $f_m$, so that $B = f_m - f_m = 0$. This would lead to incorrect results.
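Figure 7.6 Single-sided amplitude spectrum of (a) sinusoidal modulating signal; (b) carrier signal; and (c) AM signal. (Panel (a): one line of amplitude $V_m$ at $f_m$. Panel (b): one line of amplitude $V_c$ at $f_c$. Panel (c): carrier of amplitude $V_c$ at $f_c$, with LSF at $f_c - f_m$ and USF at $f_c + f_m$, each of amplitude $\tfrac{1}{2}V_m$.)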
Worked Example 7.2
For the AM waveform $v_{am}(t)$ obtained in Worked Example 7.1
(a) Determine the frequency components present in the AM waveform and the amplitude of each component.
(b) Sketch the double-sided amplitude spectrum of $v_{am}(t)$.

(a) The sinusoidal message signal $v_m(t)$ of amplitude 30 V and frequency $f_m$ = 2.5 kHz is translated by the AM process with carrier frequency $f_c$ = 25 kHz in the manner given by Eq. (7.12)

$$ 2.5\ \text{kHz}\big|_{30\ \text{V}} \xrightarrow[25\ \text{kHz}]{\text{AM}} (25 - 2.5)\ \text{kHz}\big|_{15\ \text{V}} + (25 + 2.5)\ \text{kHz}\big|_{15\ \text{V}} $$

The carrier of amplitude 65 V is also a component of $v_{am}(t)$. So, there are three components as follows:
(i) LSF of frequency 22.5 kHz and amplitude 15 V.
(ii) Carrier of frequency 25 kHz and amplitude 65 V.
(iii) USF of frequency 27.5 kHz and amplitude 15 V.

(b) The double-sided amplitude spectrum showing each of the above components as a positive and negative frequency pair is sketched in Figure 7.7.
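The component list of Worked Example 7.2 follows mechanically from Eq. (7.11), and the double-sided spectrum follows by splitting each component into a positive and negative frequency pair of half the amplitude. A short sketch (function names are illustrative, not from the text):

```python
def am_components(vc, fc, vm, fm):
    """Single-sided (frequency, amplitude) pairs of an AM signal, per Eq. (7.11)."""
    return [(fc - fm, vm / 2), (fc, vc), (fc + fm, vm / 2)]

def double_sided(components):
    """Split each component into a +/- frequency pair, each of half the amplitude."""
    pairs = []
    for freq, amp in components:
        pairs.append((-freq, amp / 2))
        pairs.append((freq, amp / 2))
    return sorted(pairs)

# Values from Worked Examples 7.1 and 7.2 (frequencies in Hz, amplitudes in V)
single = am_components(vc=65.0, fc=25e3, vm=30.0, fm=2.5e3)
double = double_sided(single)

for freq, amp in single:
    print(f"single-sided: {freq / 1e3:6.1f} kHz  {amp:5.1f} V")
for freq, amp in double:
    print(f"double-sided: {freq / 1e3:6.1f} kHz  {amp:5.2f} V")
```

The double-sided amplitudes of 7.5 V for the side frequencies and 32.5 V for the carrier agree with the levels marked in Figure 7.7.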
Figure 7.7 Double-sided spectrum of AM waveform in Worked Example 7.2. (Vertical axis: $A_n$, volts, with levels 32.5 and 7.5. Horizontal axis: f, kHz, with components at ±22.5, ±25, and ±27.5.)

7.3.2 Arbitrary Message Signal

In practice, a message signal consists of a band of frequencies from $f_1$ to $f_m$, where $f_1$ may be zero and $f_m$ is larger than $f_1$ and finite. We will represent this band of frequencies symbolically by the trapezoidal spectrum shown in Figure 7.8. If this message signal is used to modulate a carrier of frequency $f_c$, then Eq. (7.12) leads to

$$ f_1\big|_{V_1} \xrightarrow[f_c]{\text{AM}} (f_c - f_1)\big|_{\frac{1}{2}V_1} + (f_c + f_1)\big|_{\frac{1}{2}V_1} $$

$$ f_m\big|_{V_m} \xrightarrow[f_c]{\text{AM}} (f_c - f_m)\big|_{\frac{1}{2}V_m} + (f_c + f_m)\big|_{\frac{1}{2}V_m} $$

Figure 7.8 Symbolic representation of the amplitude spectrum of a message signal. (Vertical axis: $|V_m(f)|$, with amplitude $V_1$ at $f_1$ and $V_m$ at $f_m$.)

Figure 7.9 Translation of the maximum and minimum frequency components of a message signal in AM. (The components at $f_1$ and $f_m$ map to $f_c \pm f_1$ with amplitude $0.5V_1$ and to $f_c \pm f_m$ with amplitude $0.5V_m$, alongside the carrier of amplitude $V_c$ at $f_c$.)
The above frequency translation is shown in Figure 7.9. Each of the remaining frequency components of the message signal is translated to a USF in the range $f_c + f_1$ to $f_c + f_m$, and an LSF lying between $f_c - f_m$ and $f_c - f_1$. The result is that the message signal spectrum is translated to a USB and an LSB, as shown in Figure 7.10. The shape of the message spectrum is preserved in the sidebands, although the LSB is mirror-inverted, whereas the USB is erect. However, the amplitude of each sideband is reduced by a factor of 2 compared to the message spectrum. The condition $f_c > f_m$ ensures that the LSB lies entirely along the positive-frequency axis and does not overlap with the negative-frequency LSB in a double-sided spectrum. We will see that, for the AM waveform to have an envelope that can be easily detected at the receiver, the carrier frequency must be much larger than the highest frequency component $f_m$ in the message signal. That is

$$ f_c \gg f_m \tag{7.14} $$
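Figure 7.10 Production of lower sideband (LSB) and upper sideband (USB) in AM. (The message spectrum $|V_m(f)|$ between $f_1$ and $f_m$ maps to an LSB from $f_c - f_m$ to $f_c - f_1$ and a USB from $f_c + f_1$ to $f_c + f_m$, with sideband amplitudes $0.5V_1$ and $0.5V_m$ and the carrier of amplitude $V_c$ at $f_c$.)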
A message signal (also called the baseband signal) is always regarded as a lowpass signal, since it usually contains positive frequency components near zero. Its bandwidth is therefore equal to the highest significant frequency component $f_m$. Thus, the message signal whose spectrum is represented by Figure 7.8 has bandwidth $f_m$, rather than $f_m - f_1$, which would be the case if the signal were treated as bandpass. The AM signal, on the other hand, results from the frequency translation of a baseband signal, and is therefore a bandpass signal with a bandwidth equal to the width of the band of significant positive frequencies. It follows that, for an arbitrary message signal of bandwidth $f_m$, the AM bandwidth (see Figure 7.10) is given by

$$ (f_c + f_m) - (f_c - f_m) = 2f_m $$

In general

$$ \text{AM bandwidth} = 2 \times \text{Message bandwidth} \tag{7.15} $$
Worked Example 7.3
A 1 MHz carrier is amplitude modulated by a music signal that contains frequency components from 20 Hz to 15 kHz. Determine
(a) The frequencies in the AM signal and sketch its double-sided amplitude spectrum.
(b) The transmission bandwidth of the AM signal.

(a) Carrier frequency $f_c$ = 1 MHz and the message frequency band is from $f_1$ = 20 Hz to $f_m$ = 15 kHz. The AM signal therefore contains the following frequencies:
(i) LSB in the frequency range

$$ f_c - (f_1 \to f_m) = 1\ \text{MHz} - (20\ \text{Hz} \to 15\ \text{kHz}) = 985\ \text{kHz} \to 999.98\ \text{kHz} $$

(ii) Carrier frequency at $f_c$ = 1 MHz = 1000 kHz
(iii) USB in the frequency range

$$ f_c + (f_1 \to f_m) = 1\ \text{MHz} + (20\ \text{Hz} \to 15\ \text{kHz}) = 1000.02\ \text{kHz} \to 1015\ \text{kHz} $$

The double-sided spectrum is shown in Figure 7.11. Note that this plot is a mixture of a discrete spectrum (for the carrier) with the y axis in volts, and a continuous spectrum (for the sidebands) with the y axis in V/Hz.

(b) The transmission bandwidth $B_T$ is given by Eq. (7.15)

$$ B_T = 2f_m = 2 \times 15\ \text{kHz} = 30\ \text{kHz} $$

Note that $B_T$ is the width of the positive frequency band in Figure 7.11.
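The sideband edges and transmission bandwidth of Worked Example 7.3 can be reproduced with a small helper. This is a sketch with an illustrative function name; frequencies are kept as integers in hertz so that the band edges come out exactly.

```python
def am_bands(fc, f1, fm):
    """Return (LSB range, USB range, bandwidth) for a message band f1..fm on carrier fc.

    All frequencies are in Hz. The bandwidth is the width of the band of
    positive frequencies, which equals 2*fm as in Eq. (7.15).
    """
    lsb = (fc - fm, fc - f1)
    usb = (fc + f1, fc + fm)
    bandwidth = usb[1] - lsb[0]
    return lsb, usb, bandwidth

# Worked Example 7.3: 1 MHz carrier, music band from 20 Hz to 15 kHz
lsb, usb, bt = am_bands(fc=1_000_000, f1=20, fm=15_000)

print(f"LSB: {lsb[0] / 1e3} kHz to {lsb[1] / 1e3} kHz")
print(f"USB: {usb[0] / 1e3} kHz to {usb[1] / 1e3} kHz")
print(f"Transmission bandwidth: {bt / 1e3} kHz")
```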
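Figure 7.11 AM spectrum in Worked Example 7.3. (Vertical axis: $V_{am}(f)$, with carrier lines of amplitude $\tfrac{1}{2}V_c$ at ±1000 kHz. Horizontal axis: f, kHz, with sidebands spanning 985 to 1015 kHz and the transmission bandwidth $B_T$ marked.)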
7.3.3 Power

The distribution of power in AM between the carrier and sidebands is easier to determine using the spectrum of a carrier modulated by a sinusoidal message signal. This spectrum is shown in Figure 7.6c, from which we obtain the following (normalised) power distribution:

Power in carrier

$$ P_c = \frac{V_c^2}{2} \tag{7.16} $$

Power in USF

$$ P_{USF} = \frac{(V_m/2)^2}{2} = \frac{V_m^2}{8} = \frac{(mV_c)^2}{8} = \frac{m^2}{4}P_c \tag{7.17} $$

Power in LSF

$$ P_{LSF} = P_{USF} = \frac{m^2}{4}P_c \tag{7.18} $$

Power in side frequencies (SF)

$$ P_{SF} = P_{LSF} + P_{USF} = \frac{m^2}{2}P_c \tag{7.19} $$

Total power in AM waveform

$$ P_t = P_c + P_{SF} = P_c + \frac{m^2}{2}P_c = P_c\left(1 + \frac{m^2}{2}\right) \tag{7.20} $$

Equation (7.19) states that the total power $P_{SF}$ in the side frequencies is a fraction $m^2/2$ of the power in the carrier. Thus, $P_{SF}$ increases with modulation factor. The maximum total side frequency power $P_{SF\max}$ is 50% of the carrier power, and this is obtained when m = 1, i.e. at 100% modulation. We may use Eq. (7.20) to express $P_c$ in terms of $P_t$

$$ P_c = \frac{P_t}{1 + m^2/2} = \frac{2}{2 + m^2}P_t \tag{7.21} $$
This allows us to determine $P_{SF}$ as a fraction of the total power in the AM waveform as follows

$$ P_{SF} = P_t - P_c = P_t - \left(\frac{2}{2 + m^2}\right)P_t = P_t\left(1 - \frac{2}{2 + m^2}\right) = \frac{m^2}{2 + m^2}P_t \tag{7.22} $$

Therefore the maximum power in side frequencies, obtained at 100% modulation (m = 1), is

$$ P_{SF\max} = \frac{1}{2 + 1}P_t = \frac{1}{3}P_t \tag{7.23} $$

This corresponds to a minimum carrier power

$$ P_{c\min} = \frac{2}{3}P_t \tag{7.24} $$

Equation (7.24) may also be obtained by putting m = 1 in Eq. (7.21). It shows that at least two-thirds of the transmitted AM power is contained in the carrier, which carries no information. The sidebands, which contain all the transmitted information, are only fed with at most one-third of the transmitted power. This is a serious demerit of AM. Equations (7.16)–(7.24) are derived assuming a sinusoidal modulating signal, but they are equally applicable to all AM signals involving arbitrary message signals. The only change is that LSF, USF, and SF are replaced by LSB, USB, and SB (sideband), respectively. The frequency domain representation of the arbitrary message and AM signals is presented in Figure 7.8. The above equations are applicable to this general case provided we define the modulation factor m in terms of the total sideband power $P_{SB}$ and carrier power $P_c$ as follows, based on Eq. (7.19)

$$ m = \sqrt{\frac{2P_{SB}}{P_c}} \tag{7.25} $$
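The power relations of Eqs. (7.17)–(7.22) are easy to tabulate in code. The sketch below uses illustrative function names and normalised power (unit load resistance), as in the text; the 40 kW carrier and m = 0.95 values are taken from Worked Example 7.4, which follows.

```python
def am_powers(carrier_power, m):
    """Split AM power between carrier and side frequencies, Eqs. (7.17)-(7.20)."""
    each_side = (m ** 2 / 4) * carrier_power      # P_LSF = P_USF
    sidebands = 2 * each_side                     # P_SF = (m^2 / 2) * Pc
    return {
        "carrier": carrier_power,
        "each_side": each_side,
        "sidebands": sidebands,
        "total": carrier_power + sidebands,       # Pt = Pc * (1 + m^2 / 2)
    }

def sideband_fraction(m):
    """Fraction of total power in the sidebands, Eq. (7.22): m^2 / (2 + m^2)."""
    return m ** 2 / (2 + m ** 2)

powers = am_powers(carrier_power=40e3, m=0.95)
print(f"total power at 95% modulation: {powers['total'] / 1e3:.2f} kW")
print(f"sideband share at 95% modulation: {100 * sideband_fraction(0.95):.1f}%")
print(f"sideband share at 100% modulation: {100 * sideband_fraction(1.0):.1f}%")
```

At m = 1 the sideband share reaches its maximum of one-third, which restates Eq. (7.23).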
Worked Example 7.4
An AM broadcast station operates at a modulation index of 95%.
(a) Determine what percentage of the total transmitted power is in the sidebands.
(b) If the transmitted power is 40 kW when the modulating signal is switched off, determine the total transmitted power at 95% modulation.

(a) Using Eq. (7.22), the ratio of sideband power $P_{SF}$ to total power $P_t$ is

$$ \frac{P_{SF}}{P_t} = \frac{m^2}{2 + m^2} = \frac{0.95^2}{2 + 0.95^2} = 0.311 $$

Therefore, 31.1% of the total power is in the sidebands.

(b) The unmodulated carrier power $P_c$ = 40 kW. Using Eq. (7.20), the total transmitted power $P_t$ is given by

$$ P_t = P_c\left(1 + \frac{m^2}{2}\right) = 40\left(1 + \frac{0.95^2}{2}\right) = 40(1 + 0.45) = 58\ \text{kW} $$

Worked Example 7.5
The carrier $v_c(t) = 100\sin(3 \times 10^6\pi t)$ V is amplitude modulated by the signal $v_m(t) = 60\sin(80 \times 10^3\pi t) + 30\sin(100 \times 10^3\pi t)$ V.
(a) Determine the total power in the AM signal.
(b) Determine the modulation index.
Figure 7.12 AM spectrum in Worked Example 7.5. (Vertical axis: $A_n$, volt, with levels 100, 30, and 15. Horizontal axis: f, kHz, with components at 1450, 1460, 1500, 1540, and 1550.)
(a) Carrier amplitude $V_c$ = 100 V, and carrier frequency $f_c$ = 1.5 MHz. The message signal contains two frequencies $f_1$ = 40 kHz with amplitude 60 V, and $f_2$ = 50 kHz with amplitude 30 V. From Eq. (7.12), the AM signal has the following frequency components, which are shown in the single-sided spectrum of Figure 7.12:
(i) $f_c - f_2$ = 1450 kHz with amplitude 15 V.
(ii) $f_c - f_1$ = 1460 kHz with amplitude 30 V.
(iii) $f_c$ = 1500 kHz with amplitude 100 V.
(iv) $f_c + f_1$ = 1540 kHz with amplitude 30 V.
(v) $f_c + f_2$ = 1550 kHz with amplitude 15 V.

The total power $P_t$ in the AM wave is the sum of the powers of these components. Therefore

$$ P_t = \frac{15^2}{2} + \frac{30^2}{2} + \frac{100^2}{2} + \frac{30^2}{2} + \frac{15^2}{2} = 6125\ \text{W} $$

(b) The modulation index can be determined from Eq. (7.25). The carrier power $P_c$, i.e. the power in the frequency component $f_c$, and the total sideband power $P_{SF}$ are given by

$$ P_c = \frac{100^2}{2} = 5000\ \text{W} $$

$$ P_{SF} = P_t - P_c = 6125 - 5000\ \text{W} = 1125\ \text{W} $$

Therefore modulation index is

$$ \sqrt{\frac{2P_{SB}}{P_c}} \times 100\% = \sqrt{\frac{2250}{5000}} \times 100\% = 67.08\% $$

You may wish to verify, using the last result of Worked Example 7.5, that in the case of a message signal consisting of multiple tones (i.e. sinusoids) of amplitudes $V_1$, $V_2$, $V_3$, …, the modulation factor m can be obtained from the following formula

$$ m = \sqrt{\left(\frac{V_1}{V_c}\right)^2 + \left(\frac{V_2}{V_c}\right)^2 + \left(\frac{V_3}{V_c}\right)^2 + \cdots} = \sqrt{m_1^2 + m_2^2 + m_3^2 + \cdots} \tag{7.26} $$

Here, $m_1$ is the modulation factor due to the carrier of amplitude $V_c$ being modulated only by the tone of amplitude $V_1$, and so on. See Question 7.5 for a derivation of Eq. (7.26).
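The verification suggested above can be done numerically: compute m once from the component powers using Eq. (7.25), and once from the tone amplitudes using Eq. (7.26). The sketch below uses illustrative function names and assumes normalised power, with each tone of amplitude V producing two side frequencies of amplitude V/2.

```python
import math

def modulation_factor_from_tones(vc, tone_amplitudes):
    """Eq. (7.26): root-sum-square of the per-tone modulation factors V_i / Vc."""
    return math.sqrt(sum((v / vc) ** 2 for v in tone_amplitudes))

def modulation_factor_from_power(vc, tone_amplitudes):
    """Eq. (7.25): m = sqrt(2 * P_SB / Pc), with normalised powers."""
    carrier_power = vc ** 2 / 2
    # Each tone gives an LSF and a USF, each of amplitude v/2 and power (v/2)^2 / 2
    sideband_power = sum(2 * ((v / 2) ** 2 / 2) for v in tone_amplitudes)
    return math.sqrt(2 * sideband_power / carrier_power)

# Worked Example 7.5: 100 V carrier, tones of 60 V and 30 V
m_tones = modulation_factor_from_tones(100.0, [60.0, 30.0])
m_power = modulation_factor_from_power(100.0, [60.0, 30.0])

print(f"m from Eq. (7.26): {100 * m_tones:.2f}%")
print(f"m from Eq. (7.25): {100 * m_power:.2f}%")
```

Both routes give the same value because the sideband power of each tone is proportional to the square of its amplitude.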
7.4 AM Modulators

7.4.1 Generation of AM Signal

AM signals can be generated using two different methods. In one, the carrier is passed through a device that has its gain varied linearly by the modulating signal and, in the other, both the carrier and modulating signal are passed through a nonlinear device.

7.4.1.1 Linearly-varied-gain Modulator

The first method is illustrated in Figure 7.13. The carrier signal is transmitted through a device whose gain G is varied linearly by the modulating signal according to the relation

$$ G = 1 + k_1 v_m(t) \tag{7.27} $$

The output of the device is given by

$$ v_o(t) = Gv_i(t) = [1 + k_1 v_m(t)]V_c\cos(2\pi f_c t) = [V_c + kv_m(t)]\cos(2\pi f_c t) \equiv v_{am}(t), \quad k = k_1 V_c \tag{7.28} $$
The output is therefore an AM signal. This modulator has a modulation sensitivity $k = k_1 V_c$, where $k_1$ is a constant that determines the sensitivity of the gain G to the value of the modulating signal. The operational amplifier (opamp) configuration shown in Figure 7.13b has the gain variation given by Eq. (7.27), and therefore will implement this method of AM. The variable input resistance $R_i$ is provided by a field-effect transistor (FET), which is biased (using a fixed DC voltage) to operate at the centre of its linear characteristic. A modulating signal $v_m(t)$ connected to its gate then causes the FET to be more conducting as $v_m(t)$ increases. In effect, the source-to-drain conductance $G_i$ (= $1/R_i$) of the FET is made to vary linearly with the modulating signal

$$ G_i = av_m(t) = \frac{1}{R_i} \quad \text{or} \quad R_i = \frac{1}{av_m(t)} \tag{7.29} $$

Figure 7.13 (a) AM generation using a variable gain device; (b) Opamp implementation. (Panel (a): the carrier $v_c(t) = V_c\cos(2\pi f_c t)$ is the input $v_i$ to a device of variable gain $G = 1 + k_1 v_m(t)$ controlled by the message signal $v_m(t)$; the output $v_o = Gv_i$ is the AM signal $v_{am}(t)$. Panel (b): opamp with feedback resistance $R_f$, variable input resistance $R_i$ controlled by $v_m(t)$, and the carrier applied to the noninverting terminal, giving $v_o(t) = (1 + R_f/R_i)v_c(t)$.)
Two important characteristics of an opamp, in the configuration shown, are that both the inverting (labelled −) and noninverting (labelled +) terminals are forced to the same potential, and negligible current flows into either of these terminals. There are two implications:

● The absence of current flow into the inverting terminal implies that the same current flows through $R_f$ and $R_i$, which are therefore in series. Thus, the voltage $v_o$ is shared between $R_f$ and $R_i$ according to the ratio of their resistance. In particular, the voltage drop across $R_i$, which is the voltage at the inverting terminal, is given by

$$ v_- = \frac{R_i}{R_f + R_i}v_o $$

● The voltage $v_- = v_c$, since the two terminals are at the same potential. Therefore, the output voltage $v_o$ and the carrier voltage $v_c$ are related by

$$ v_o = \frac{R_i + R_f}{R_i}v_c = \left[1 + \frac{R_f}{R_i}\right]v_c \tag{7.30} $$

Substituting Eq. (7.29) for $R_i$ gives the gain of the opamp circuit

$$ G = \frac{v_o}{v_c} = 1 + R_f av_m(t) = 1 + k_1 v_m(t) $$

We see that the amplifier has a gain that varies linearly with modulating signal as specified by Eq. (7.27), with $k_1 = R_f a$, and therefore $v_o$ is the AM signal given by Eq. (7.28).

7.4.1.2 Switching and Square-law Modulators
The second method for generating an AM signal is by adding together the carrier and modulating signals, and then passing the sum signal through a nonlinear device. Diodes and suitably biased transistors provide the required nonlinear characteristic. Transistors are, however, preferred because they also provide signal amplification. The summation can be realised by connecting the carrier and message voltages in series at the input of the nonlinear device, so that the device input is

$$ v_i(t) = v_m(t) + v_c(t) \tag{7.31} $$
7.4.1.2.1 Switching Modulator
The device is called a switching modulator if the principle of operation is by the device being switched on and off (at the carrier frequency) during the positive half of the carrier signal cycle. This requires that the carrier amplitude be much greater than the peak value of the message signal, so that the switching action is controlled entirely by the carrier signal. The output voltage $v_o(t)$ equals input voltage $v_i(t)$ during one-half of the carrier cycle (when the device is on) and equals zero during the remaining half-cycle (when the device is off). Therefore the output is effectively the product of input signal and a periodic square wave $v_s(t)$, which has amplitude 1 during half the carrier cycle (when the device is on), and amplitude 0 during the other half-cycle (when the device is off). Note then that $v_s(t)$ has period $T = 1/f_c$. We may write

$$ v_o(t) = v_i(t)v_s(t) = [v_m(t) + v_c(t)]v_s(t) \tag{7.32} $$
Recall from Chapter 4 that $v_s(t)$ can be written as a Fourier series

$$ v_s(t) = A_o + A_1\cos(2\pi f_c t) + A_3\cos(2\pi(3f_c)t) + A_5\cos(2\pi(5f_c)t) + \cdots $$

Substituting this expression in Eq. (7.32) and assuming for simplicity a sinusoidal message signal, we obtain the output signal

$$ \begin{aligned} v_o(t) &= [V_m\cos(2\pi f_m t) + V_c\cos(2\pi f_c t)][A_o + A_1\cos(2\pi f_c t) + \cdots] \\ &= A_o V_m\cos(2\pi f_m t) + A_o V_c\cos(2\pi f_c t) + A_1 V_m\cos(2\pi f_m t)\cos(2\pi f_c t) + A_1 V_c\cos^2(2\pi f_c t) + \cdots \\ &= A_o V_c\cos(2\pi f_c t) + \frac{A_1 V_m}{2}\cos 2\pi(f_c - f_m)t + \frac{A_1 V_m}{2}\cos 2\pi(f_c + f_m)t \\ &\quad + \frac{A_1 V_c}{2} + A_o V_m\cos(2\pi f_m t) + \frac{A_1 V_c}{2}\cos(4\pi f_c t) + \cdots \end{aligned} \tag{7.33} $$

The output $v_o(t)$ therefore contains the three frequency components that constitute an AM signal, namely the carrier at $f_c$, the LSF at $f_c - f_m$, and the USF at $f_c + f_m$. However, it also contains other components at DC (i.e. f = 0), the message frequency $f_m$, even harmonics of the carrier frequency (i.e. $2f_c$, $4f_c$, $6f_c$, …), and side frequencies around odd harmonics of the carrier frequency (i.e. $3f_c \pm f_m$, $5f_c \pm f_m$, …). You may verify this by including more terms in the Fourier series representation of $v_s(t)$, bearing in mind that it contains only odd harmonics of $f_c$. To obtain the required AM signal $v_{am}(t)$, we pass $v_o(t)$ through a bandpass filter (BPF) centred on $f_c$ and having a bandwidth equal to twice the message bandwidth. This filter passes the carrier frequency and the two sidebands and blocks all the other frequency components in $v_o(t)$ including $f_m$, the maximum frequency component in the message signal. The lowest frequency in the LSB is $f_c - f_m$, and this frequency must be larger than $f_m$ for the message signal to be excluded by the filter. Thus, the following condition must be satisfied

$$ f_c > 2f_m \tag{7.34} $$
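The in-band terms of Eq. (7.33) can be checked by simulating the switching action and measuring individual frequency components. The sketch below is illustrative only. It assumes a square wave that is on while the carrier is positive, for which the standard Fourier coefficients are $A_o = 1/2$ and $A_1 = 2/\pi$ (values not stated in this section), and it uses assumed signal parameters and a simple single-frequency correlation in place of a full spectrum analysis.

```python
import math

VC, VM = 10.0, 1.0        # assumed amplitudes, carrier much larger than message
FC, FM = 20e3, 1e3        # assumed carrier and message frequencies, Hz
FS = 20e6                 # sample rate: 1000 samples per carrier cycle
N = int(round(FS / FM))   # number of samples in one message cycle

def switch(t):
    """Square wave v_s(t): 1 while the carrier is positive, otherwise 0."""
    return 1.0 if math.cos(2 * math.pi * FC * t) > 0 else 0.0

def tone_amplitude(times, samples, freq):
    """Amplitude of the component at freq (non-zero, whole cycles in the record)."""
    re = sum(x * math.cos(2 * math.pi * freq * t) for t, x in zip(times, samples))
    im = sum(x * math.sin(2 * math.pi * freq * t) for t, x in zip(times, samples))
    return 2 * math.hypot(re, im) / len(samples)

# Mid-point sampling avoids landing exactly on the switching instants
times = [(i + 0.5) / FS for i in range(N)]
vo = [(VM * math.cos(2 * math.pi * FM * t) + VC * math.cos(2 * math.pi * FC * t))
      * switch(t) for t in times]                      # Eq. (7.32)

A0, A1 = 0.5, 2 / math.pi                              # assumed square-wave coefficients
predicted = {"LSF": A1 * VM / 2, "carrier": A0 * VC, "USF": A1 * VM / 2, "message": A0 * VM}
measured = {
    "LSF": tone_amplitude(times, vo, FC - FM),
    "carrier": tone_amplitude(times, vo, FC),
    "USF": tone_amplitude(times, vo, FC + FM),
    "message": tone_amplitude(times, vo, FM),
}
for name in predicted:
    print(f"{name:8s} predicted {predicted[name]:.4f} V, measured {measured[name]:.4f} V")
```

The measured values should agree with the predictions to within the small error introduced by sampling the square wave. The component at the message frequency is the one that the BPF must reject, which is the origin of the condition in Eq. (7.34).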
7.4.1.2.2 Square-law Modulator
The nonlinear circuit is called a square-law modulator if the device is continuously on, but with a variable resistance that depends on the input voltage. In this case, the output voltage is a nonlinear function of input voltage, which may be approximated by the quadratic expression

$$ v_o(t) = av_i(t) + bv_i^2(t) \tag{7.35} $$

Substituting Eq. (7.31) for $v_i(t)$, and again assuming a sinusoidal message signal for simplicity

$$ v_o(t) = a[V_m\cos(2\pi f_m t) + V_c\cos(2\pi f_c t)] + b[V_m\cos(2\pi f_m t) + V_c\cos(2\pi f_c t)]^2 \tag{7.36} $$

Expanding and simplifying the above equation using relevant trigonometric identities, we obtain

$$ \begin{aligned} v_o(t) &= \tfrac{1}{2}b(V_c^2 + V_m^2) + aV_m\cos(2\pi f_m t) + \tfrac{1}{2}bV_m^2\cos(4\pi f_m t) \\ &\quad + aV_c\cos(2\pi f_c t) + bV_c V_m\cos 2\pi(f_c - f_m)t + bV_c V_m\cos 2\pi(f_c + f_m)t \\ &\quad + \tfrac{1}{2}bV_c^2\cos(4\pi f_c t) \end{aligned} \tag{7.37} $$

Notice that $v_o(t)$ contains the AM signal comprising the carrier $f_c$ and sidebands $f_c \pm f_m$. However, there are also other components at DC, $f_m$, $2f_m$, and $2f_c$. These extra components can be excluded using a BPF as discussed above for the switching modulator. Because there is a component at twice the message frequency $2f_m$, which must be excluded by the filter while still passing the LSB, it means that if $f_m$ is the maximum frequency component of the message signal then the following condition must be satisfied

$$ f_c > 3f_m \tag{7.38} $$

Figure 7.14 Square-law and switching (AM) modulator. (Block diagram: message $v_m(t)$ and carrier $v_c(t)$ are summed to give $v_i(t)$, which drives a nonlinear device with output $v_o(t)$; a BPF then yields the AM signal $v_{am}(t)$.)
In Eq. (7.35), the nonlinear characteristic was approximated by a polynomial of order N = 2. In practice, a higher-order polynomial may be required to represent the input–output relationship of the nonlinear device. But the AM signal can still be obtained as discussed above using a BPF, provided the carrier frequency is sufficiently higher than the maximum frequency component $f_m$ of the message signal. There will in this general case be a component in $v_o(t)$ at $Nf_m$, and therefore the carrier frequency must satisfy the condition

$$ f_c > (N + 1)f_m \tag{7.39} $$
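The expansion in Eq. (7.37) and the carrier-frequency condition of Eq. (7.39) can both be checked numerically. The following is a sketch under assumed coefficient and signal values, with illustrative function names; it compares the direct quadratic characteristic of Eq. (7.35) with the sum of the seven components listed in Eq. (7.37).

```python
import math

def square_law_direct(t, a, b, vc, vm, fc, fm):
    """Output of the quadratic device, Eqs. (7.35) and (7.36)."""
    vi = vm * math.cos(2 * math.pi * fm * t) + vc * math.cos(2 * math.pi * fc * t)
    return a * vi + b * vi * vi

def square_law_components(a, b, vc, vm, fc, fm):
    """(frequency, amplitude) terms of Eq. (7.37); frequency 0 is the DC term."""
    return [
        (0.0, 0.5 * b * (vc ** 2 + vm ** 2)),
        (fm, a * vm),
        (2 * fm, 0.5 * b * vm ** 2),
        (fc, a * vc),
        (fc - fm, b * vc * vm),
        (fc + fm, b * vc * vm),
        (2 * fc, 0.5 * b * vc ** 2),
    ]

def sum_of_components(t, components):
    """Rebuild the waveform from cosine components of zero phase."""
    return sum(amp * math.cos(2 * math.pi * freq * t) for freq, amp in components)

def min_carrier_frequency(order, fm):
    """Lower bound on fc from Eq. (7.39) for a polynomial of the given order."""
    return (order + 1) * fm

# Assumed illustrative values
A, B, VC, VM, FC, FM = 1.0, 0.1, 2.0, 0.5, 50e3, 1e3
components = square_law_components(A, B, VC, VM, FC, FM)
max_diff = max(
    abs(square_law_direct(n * 1e-6, A, B, VC, VM, FC, FM)
        - sum_of_components(n * 1e-6, components))
    for n in range(2000)
)
print(f"largest mismatch between Eq. (7.36) and Eq. (7.37): {max_diff:.2e}")
print(f"fc must exceed {min_carrier_frequency(2, FM) / 1e3} kHz for a square-law device")
```

For order N = 2 the bound reduces to $3f_m$, in agreement with Eq. (7.38).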
Figure 7.14 shows a block diagram of the AM modulator based on one of the two nonlinear principles, i.e. switching or square law. The BPF is usually realised using an LC tuned circuit, which has a resonance frequency f c , and a bandwidth that is just large enough to pass the sidebands.
7.4.2 AM Transmitters

The AM transmitter is used to launch electromagnetic waves that carry information in their amplitude variations. It can be implemented using either low-level or high-level modulation.

7.4.2.1 Low-level Transmitter

Low-level modulation uses a nonlinear device, such as a diode, bipolar junction transistor (BJT), or FET, to generate a weak AM signal. This signal is then amplified in one or more linear amplifiers to bring it to the required high-power level before radiation by an antenna. A low-level modulation system is shown in Figure 7.15, where the amplitude modulator block consists of a nonlinear device followed by a BPF, as earlier discussed in connection with Figure 7.14.

Figure 7.15 Low-level AM transmitter. (Block diagram: the carrier signal and the audio frequency signal feed an amplitude modulator, whose low-power AM output passes through linear amplifiers to give the high-power AM fed to the antenna.)

Figure 7.16 High-level AM transmitter. (Block diagram: a crystal oscillator drives nonlinear (NL) tuned RF amplifiers to give a high-power carrier; the audio input passes through linear audio amplifiers to give high-power audio; both feed a final NL tuned RF amplifier, whose high-power AM output goes to the antenna.)
The radio frequency (RF) power amplifiers must be linear in order to preserve the AM signal envelope, avoiding the nonlinear distortions discussed in Section 4.7.6. The main disadvantage of these linear amplifiers is that they are highly inefficient in converting DC power to RF power. Transmitters for AM broadcasting rarely use low-level operation. The method was, however, widely used on international high frequency (HF) links for radiotelephony.

7.4.2.2 High-level Transmitter
In high-level modulation, the message signal directly varies the carrier amplitude in the final RF amplifier stage of the transmitter. Figure 7.16 shows the block diagram of a standard high-level broadcast transmitter. The carrier is generated by a crystal-controlled oscillator and amplified by several Class C tuned RF amplifiers. Some of these amplifiers may be operated as frequency multipliers to give the desired carrier frequency f c . The audio input is linearly amplified, and this may be by a Class A audio amplifier followed by a high-power Class B audio amplifier. This raises the audio signal to a level required to sufficiently modulate the carrier. The high-power audio is then connected (via transformer coupling) to a final (nonlinear) Class C tuned RF amplifier of passband f c −5 to f c + 5 kHz, where it amplitude modulates the high-power carrier signal. This gives a high-power AM signal that is launched by the antenna. High-power transmitters that radiate several kilowatts of power still use thermionic valves, at least in the last stage of amplification. However, low-power transmitters, such as those used in mobile units, may be fully implemented using solid-state components. The main advantage of the high-level transmitter is that it allows the use of the highly efficient, albeit nonlinear, Class C tuned amplifiers throughout the RF section. Unlike the audio signal, the single-frequency carrier can be nonlinearly amplified since the tuned section can be readily designed to eliminate the resulting harmonic products. The main drawback of a high-level transmitter is that it requires a high-power audio signal, which can only be achieved by using expensive high-power linear amplifiers. Nevertheless, almost all AM transmitters use high-level operation due to the above-mentioned overriding advantage.
7.5 AM Demodulators

Demodulation is the process of recovering the message signal from the received modulated carrier. If the message signal is digital, the demodulation process is more specifically called detection since the receiver detects the range of the modulated parameter and is not concerned with determining its precise value. However, beware! The most common usage treats demodulation and detection as synonymous terms.

A simple yet highly efficient circuit for demodulating an AM signal is commonly called an envelope or diode detector, although a more appropriate name would be envelope or diode demodulator. This is a noncoherent demodulation technique, which does not require a locally generated carrier. Coherent demodulation is also possible, involving mixing the AM signal with a reference carrier (extracted from the incoming AM signal), but this is a much more complex circuit and therefore is not commonly used. The input signal at a receiver usually consists of several weak carriers and their associated sidebands. A complete receiver system must therefore include an arrangement for isolating and amplifying the desired carrier before demodulation. A very efficient technique for achieving this goal is based on the superheterodyne principle, which we also discuss in this section.
7.5.1 Diode Demodulator

A simple diode demodulator circuit is shown in Figure 7.17a. The circuit consists of a diode, which has the ideal characteristic shown in Figure 7.17b, in series with an RC filter. The AM signal $v_{am}(t)$ is fed from a source with internal resistance $R_s$.

Figure 7.17 Diode demodulator: (a) circuit; (b) ideal diode characteristic; (c) input AM signal; (d) envelope demodulator output $v_o(t)$ (bold curve); (e) smoothed output $v_m(t)$. (Panel (a) shows the source resistance $R_s$, the diode with voltage $v_D$ and current $i_D$, the capacitor C and load resistance $R_L$, and a DC-blocking capacitor $C_b$ followed by an LPF. Panel (b) labels the reverse resistance $R_r = V_r/I_r = \infty$, since $I_r = 0$, and the forward resistance $R_f = V_f/I_f$.)
When the diode is reverse-biased, i.e. vD is negative, no current flows. That is, the diode has an infinite resistance (Rr = ∞) when reverse-biased. Under a forward bias (i.e. vD positive), the diode has a small and constant resistance Rf equal to the ratio of forward-bias voltage vD to diode current iD. To understand the operation of the demodulator, let the AM signal vam(t) be as shown in Figure 7.17c. This consists of a 50 kHz carrier that is 80% modulated by a 1 kHz sinusoidal message signal. Thus
$$v_{am}(t) = V_c[1 + m\sin(2\pi f_m t)]\sin(2\pi f_c t) = V_{am}(t)\sin(2\pi f_c t) \tag{7.40}$$
with m = 0.8, and the envelope Vam(t) = Vc[1 + m sin(2πfm t)]. During the first positive cycle of vam(t), the diode is forward-biased (i.e. vD ≥ 0), and has a small resistance Rf. The capacitor C then charges rapidly from 0 V towards Vam(1/(4fc)), which is the value of the envelope Vam at t = 1/(4fc). At the instant t = 1/(4fc), the AM signal vam(t) begins to drop in value below Vam(1/(4fc)), which causes the diode to be reverse-biased, since its positive terminal is now at a lower potential than its negative terminal. With the diode effectively an open circuit, the capacitor discharges slowly towards 0 V through the load resistance RL, and this continues for the remaining half of the first positive cycle and for the entire duration of the first negative cycle of vam(t). During the second positive cycle of vam(t), the diode becomes forward-biased again at the instant that vam(t) exceeds the (remaining) capacitor voltage. The capacitor then rapidly charges towards Vam(5/(4fc)). At the instant t = 5/(4fc), the input vam(t) begins to drop below Vam(5/(4fc)), the diode becomes reverse-biased, and the capacitor again begins to discharge slowly towards 0 V. The process just described is repeated over and over, and this gives rise to the output signal vo(t) shown in Figure 7.17d.
The output signal contains two large frequency components, one at DC and the other at the message frequency fm, and several small (unwanted) components or ripples at frequencies f = nfc and nfc ± fm, where n = 1, 2, 3, … The DC component is often used for automatic gain control. It can be removed by passing vo(t) through a large DC-blocking capacitor. The ripples are usually ignored, being outside the audible range of frequencies, but can be easily removed by lowpass filtering. If both DC and ripples are removed, it leaves a smooth message signal vm(t), as shown in Figure 7.17e.
For the diode demodulator to work properly as described above, several important conditions must be satisfied. Before stating these conditions it is important to note that, when a capacitor C charges or discharges through a resistor R from an initial voltage Vi (at t = 0) towards a final voltage Vf (at t = ∞), the voltage drop vC across the capacitor changes exponentially with time t according to the expression
$$v_C(t) = V_f - (V_f - V_i)\exp\left(-\frac{t}{RC}\right) \tag{7.41}$$
The capacitor is charging if Vf > Vi and discharging if Vf < Vi. Figure 7.18 shows the voltage across a charging capacitor. The maximum change in capacitor voltage from its initial value at t = 0 to its final value at t = ∞ is Vf − Vi. After a time t = RC, the voltage has changed to
$$v_C(RC) = V_f - (V_f - V_i)\exp(-1) = V_f - 0.368(V_f - V_i)$$
and the amount of voltage change in the time from t = 0 to t = RC is
$$v_C(RC) - v_C(0) = V_f - 0.368(V_f - V_i) - V_i = 0.632(V_f - V_i)$$
This is 63.2% of the maximum change. This time interval is called the time constant of the series RC circuit. That is, the capacitor voltage undergoes 63.2% of its total change within the initial time of one time constant. The rate of voltage change slows down continuously with time, so that the capacitor approaches but never actually reaches 100% of its maximum possible change in any finite time; it does so only in the limit t → ∞. The smaller the time constant, the more rapidly the capacitor charges or discharges, and the larger the time constant, the longer it takes the capacitor to charge or discharge.
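The exponential law of Eq. (7.41) can be checked numerically. The following is a minimal Python sketch; the function names and the component values (R = 10 kΩ, C = 10 nF) are chosen here purely for illustration.

```python
import math

def capacitor_voltage(t, v_initial, v_final, r_ohm, c_farad):
    """Capacitor voltage at time t (seconds), following Eq. (7.41)."""
    return v_final - (v_final - v_initial) * math.exp(-t / (r_ohm * c_farad))

def fraction_of_total_change(t, r_ohm, c_farad):
    """Fraction of the total change (Vf - Vi) completed after time t."""
    return 1.0 - math.exp(-t / (r_ohm * c_farad))

# Illustrative values only (assumed for this sketch).
R, C = 10e3, 10e-9
tau = R * C  # time constant RC in seconds

for n in range(1, 6):
    pct = 100.0 * fraction_of_total_change(n * tau, R, C)
    print(f"t = {n} RC: {pct:.1f}% of the total change completed")
```

Running the loop shows how quickly the remaining change shrinks with each additional time constant, which is why a comparison between RC and the relevant signal period is enough to judge whether the capacitor can follow a signal.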
Figure 7.18 Exponential rise in voltage across a capacitor C charging through resistance R from initial voltage Vi towards a final voltage Vf.
We may now state the conditions that must be satisfied in the design of the envelope demodulator:
● The charging time constant must be short enough to allow the capacitor to charge rapidly and track the AM signal up to its peak value before the onset of the negative-going portion of the oscillation when charging is cut off by the diode. Since charging is done through resistance R = Rf + Rs, and the AM signal oscillates with period Tc = 1/fc, this requires that
$$(R_f + R_s)C \ll \frac{1}{f_c} \tag{7.42}$$
● The discharging time constant must be long compared to Tc so that the capacitor voltage does not fall significantly towards 0 V during the negative-going portion of the AM signal's oscillation. At the same time, the discharging time constant must be short enough for the capacitor voltage to be able to track the fastest-changing component of the AM signal envelope. Knowing that this component is the maximum frequency fm in the message signal, and that the capacitor discharges through the load resistor RL, it follows that the condition that must be satisfied is
$$\frac{1}{f_m} \gg R_L C \gg \frac{1}{f_c} \tag{7.43}$$
● The third condition is implied in Eq. (7.43) but is so important that it is separately stated here for emphasis. The carrier frequency fc must be much larger than the maximum frequency component fm of the message signal. In practice, fc is greater than fm by a factor of about 100 or more.
● In the above discussions, we assumed an ideal diode characteristic (see Figure 7.17b). In practice, a diode has a nonlinear characteristic for small values of forward-bias voltage vD, followed by a linear characteristic (as shown in Figure 7.17b) for large values of vD. The effect of the nonlinear characteristic is that diode resistance Rf becomes a function of bias voltage, and this causes the output vo(t) to be distorted in the region of small carrier amplitudes. To avoid this distortion the modulated carrier amplitude must always be above the nonlinear region, which imposes the additional condition that 100% modulation cannot be employed.
Figures 7.17c–e were obtained using the following values:
fc = 50 kHz (Carrier frequency)
fm = 1 kHz (Message signal frequency)
Rs = 50 Ω (AM signal source resistance)
Rf = 20 Ω (Forward-biased diode resistance)
C = 10 nF (Charging/discharging capacitor)
RL = 10 kΩ (Load resistance)
m = 0.8 (Modulation factor)
(7.44)
Note that all four conditions are satisfied by the above selection of values:
● Condition 1: The charging time constant = (Rf + Rs)C = 0.7 μs. This is much less than the carrier period 1/fc = 20 μs as required.
● Condition 2: The discharging time constant RLC = 100 μs. As required, this is much less than the period of the maximum frequency component in the message signal 1/fm = 1000 μs and much larger (although by a factor of five only) than the carrier period (20 μs). A choice of carrier frequency fc = 100 kHz would have satisfied this condition better, while at the same time still satisfying condition 1. However, fc = 50 kHz was chosen in order to produce clearer illustrative plots for our discussion.
● Condition 3: The carrier frequency fc = 50 kHz is much larger than the maximum frequency component (fm = 1 kHz) in the message signal as required.
● Condition 4: The modulation depth is 80%, which is less than 100% as required.
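The arithmetic behind these checks can be reproduced with a few lines of Python. This is only a sketch: the function name and the dictionary keys are invented for illustration, and the ratios it reports are a convenient way of reading the "much less than" and "much greater than" requirements of Eqs. (7.42) and (7.43), not a formal design rule.

```python
def envelope_detector_time_constants(rs, rf, c, rl, fc, fm):
    """Time constants and margin ratios for the envelope demodulator conditions."""
    tau_charge = (rf + rs) * c      # charging through Rf + Rs, Eq. (7.42)
    tau_discharge = rl * c          # discharging through RL, Eq. (7.43)
    return {
        "tau_charge_s": tau_charge,
        "tau_discharge_s": tau_discharge,
        # Each ratio below should be well above 1 for a good design.
        "carrier_period_over_tau_charge": (1.0 / fc) / tau_charge,
        "tau_discharge_over_carrier_period": tau_discharge * fc,
        "message_period_over_tau_discharge": (1.0 / fm) / tau_discharge,
        "fc_over_fm": fc / fm,
    }

# Component values of Eq. (7.44).
report = envelope_detector_time_constants(
    rs=50.0, rf=20.0, c=10e-9, rl=10e3, fc=50e3, fm=1e3)
for name, value in report.items():
    print(f"{name}: {value:.4g}")
```

Changing only the `fc` argument is a quick way to see how the margins respond to a different carrier frequency.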
Figure 7.19 shows the envelope demodulator output using the same values in Eq. (7.44), except that the carrier frequency is changed to f c = 4 kHz. Observe that the fluctuations in the output vo (t) are now more than just small ripples. You should work out which of the above four conditions have been flouted and explain why the capacitor discharges significantly towards zero before being recharged.
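Readers who wish to experiment can reproduce waveforms of this kind with a small simulation. The sketch below is not the method used to produce the figures; it assumes an ideal diode with zero turn-on voltage and constant forward resistance Rf, treats the input as constant over each small time step, and uses the component values of Eq. (7.44) as defaults. The function names and the simple tracking-error measure are introduced here only for illustration.

```python
import math

def simulate_envelope_detector(fc, fm=1e3, m=0.8, vc=1.0, rs=50.0, rf=20.0,
                               c=10e-9, rl=10e3, duration=2e-3, dt=5e-8):
    """Return (times, envelope, output) for an idealised diode demodulator."""
    r_on = rs + rf                       # charging path resistance
    r_th = r_on * rl / (r_on + rl)       # resistance seen by C while the diode conducts
    divider = rl / (r_on + rl)           # C settles towards divider * input while conducting
    k_on = math.exp(-dt / (r_th * c))    # per-step decay factor, diode conducting
    k_off = math.exp(-dt / (rl * c))     # per-step decay factor, diode open
    v_cap = 0.0
    times, envelope, output = [], [], []
    for n in range(int(round(duration / dt))):
        t = n * dt
        env = vc * (1.0 + m * math.sin(2.0 * math.pi * fm * t))
        v_in = env * math.sin(2.0 * math.pi * fc * t)   # AM signal, Eq. (7.40)
        if v_in > v_cap:                 # diode forward-biased: charge towards the input
            target = divider * v_in
            v_cap = target + (v_cap - target) * k_on
        else:                            # diode reverse-biased: discharge through RL
            v_cap *= k_off
        times.append(t)
        envelope.append(env)
        output.append(v_cap)
    return times, envelope, output

def mean_tracking_error(times, envelope, output, t_start=1e-3):
    """Mean absolute gap between envelope and output after t_start seconds."""
    gaps = [abs(e - o) for t, e, o in zip(times, envelope, output) if t >= t_start]
    return sum(gaps) / len(gaps)

for carrier in (50e3, 4e3):
    t, env, out = simulate_envelope_detector(carrier)
    print(f"fc = {carrier / 1e3:.0f} kHz: mean |envelope - output| = "
          f"{mean_tracking_error(t, env, out):.3f} V")
```

The returned lists can be passed to any plotting tool to compare the output against the envelope for different carrier frequencies or component values.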
Figure 7.19 Input AM waveform vam(t) and envelope demodulator output vo(t) when fc = 4 kHz in Eq. (7.44).
7.5.2 Coherent Demodulator
Figure 7.20a shows the block diagram of a coherent demodulator. It consists of a multiplier and a lowpass filter (LPF). This excludes the arrangement for obtaining the carrier frequency, which is discussed later. The multiplier is also called the product modulator, and its implementation is discussed in Section 7.7.
Figure 7.20 (a) Coherent AM demodulator; (b) phase-locked loop (PLL).
Assuming a sinusoidal message signal, the output of the multiplier is given by
$$\begin{aligned}
v_o(t) &= v_{am}(t) \times v_c(t) \\
&= GV_c[1 + m\cos(2\pi f_m t)]\cos(2\pi f_c t) \times \cos(2\pi f_c t) \\
&= GV_c\cos^2(2\pi f_c t) + mGV_c\cos(2\pi f_m t)\cos^2(2\pi f_c t) \\
&= \frac{GV_c}{2} + \frac{GmV_c}{2}\cos(2\pi f_m t) + \frac{GmV_c}{4}\cos[2\pi(2f_c - f_m)t] \\
&\quad + \frac{GV_c}{2}\cos[2\pi(2f_c)t] + \frac{GmV_c}{4}\cos[2\pi(2f_c + f_m)t]
\end{aligned}$$
The constant G is a scaling factor that represents the total gain of the transmission link from transmitter output to demodulator input. The signal vo(t) contains the wanted message signal (at frequency fm) and other (unwanted) components at f = 0, 2fc − fm, 2fc, and 2fc + fm. The DC component (f = 0) is removed by a series capacitor,
and the higher-frequency components are filtered out by an LPF that has a bandwidth just enough to pass the highest-frequency component of the message signal. The technique of coherent demodulation is very different from that of envelope demodulation. You will recall that AM translates the message spectrum from baseband (centred at f = 0) to bandpass (centred at −f c and +f c ), without any further change to the spectrum except for a scaling factor. Coherent demodulation performs a further translation of the (bandpass) AM spectrum by ±f c . When this is done, the band at −f c is translated to locations f = −2f c and 0, whereas the band at +f c is translated to locations f = 0 and +2f c . The band of frequencies at f = 0 has the same shape as the original message spectrum. It has twice the magnitude of the bands at ±2f c , being the superposition of two identical bands originally located at ±f c . This baseband is extracted by an LPF and provides the message signal vm (t) at the output of Figure 7.20. The use of coherent demodulation requires that the receiver have a local carrier that is at the same frequency and phase as the transmitted carrier. Since the received AM signal already contains such a carrier, what is needed is a circuit that can extract this carrier from the AM signal. One way of achieving this is to use the phase-locked loop (PLL), a block diagram of which is shown in Figure 7.20b. This consists of a voltage-controlled oscillator (VCO), which generates a sinusoidal signal that is fed into a phase discriminator along with the incoming AM signal. The VCO is designed to have an output at the carrier frequency f c when its input is zero. The phase discriminator includes an LPF (not explicitly shown). It produces an output voltage vph (t) that is proportional to the phase difference between its two inputs. 
If vph (t) is fed as input to the VCO then it will be proportional to the phase difference between the VCO output signal and the carrier component in the AM signal, and will cause the VCO frequency to change in order to minimise this difference. When vph (t) = 0, the loop is said to be locked and the output of the VCO is the required carrier.
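The component amplitudes in the multiplier output derived earlier in this section can be confirmed numerically. The sketch below uses only the Python standard library and measures the amplitude of individual tones by direct correlation over a whole number of periods. The values G·Vc = 1, m = 0.8, fc = 50 kHz, fm = 1 kHz and the 1 MHz sampling rate are assumptions made for the illustration, as is the helper name `tone_amplitude`.

```python
import cmath
import math

def tone_amplitude(samples, freq, fs):
    """Amplitude of the component at freq (Hz) in samples taken at rate fs (Hz).

    Exact when the record holds a whole number of periods of every tone present.
    """
    acc = sum(x * cmath.exp(-2j * math.pi * freq * n / fs)
              for n, x in enumerate(samples))
    scale = 1.0 if freq == 0 else 2.0   # DC has no mirror-image term to fold in
    return scale * abs(acc) / len(samples)

# Assumed illustrative values: G*Vc = 1, unit-amplitude local carrier.
g_vc, m, fc, fm, fs = 1.0, 0.8, 50e3, 1e3, 1e6
n_samples = 1000                        # 1 ms record: whole periods of all tones

product = []
for n in range(n_samples):
    t = n / fs
    v_am = g_vc * (1.0 + m * math.cos(2 * math.pi * fm * t)) * math.cos(2 * math.pi * fc * t)
    product.append(v_am * math.cos(2 * math.pi * fc * t))   # multiplier output

for f in (0.0, fm, 2 * fc - fm, 2 * fc, 2 * fc + fm):
    print(f"f = {f / 1e3:6.1f} kHz: amplitude = {tone_amplitude(product, f, fs):.3f}")
```

With these values the expansion predicts G·Vc/2 at DC and at 2fc, G·m·Vc/2 at fm, and G·m·Vc/4 at each of 2fc ± fm.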
7.5.3 AM Receivers
A radio receiver performs several important tasks.
● It selects one signal from the many reaching the antenna and rejects all the others. The receiver could be more easily designed to receive only one particular carrier frequency, but in most cases it is required to be able to select any one of several carrier signals (e.g. from different broadcast stations) spaced over a wide frequency range. The receiver must therefore be capable of both tuning (to select any desired signal) and filtering (to exclude the unwanted signals).
● It provides signal amplification at the RF stage, to boost the signal to a level sufficient to operate the demodulator circuit properly, and at the baseband frequency stage, to operate the information sink (a loudspeaker in the case of audio broadcast).
● It extracts the message signal through the process of demodulation, which we have already discussed for the case of AM signals.
7.5.3.1 Tuned Radio Frequency (RF) Receiver
Figure 7.21 shows one simple design, which performs the above tasks. This design was used in the first broadcast AM receivers. It consists of a tuneable RF amplifier that amplifies and passes only the desired carrier and its associated sidebands, an envelope demodulator that extracts the audio signal, and an audio amplifier that drives a loudspeaker. Called the tuned radio frequency (TRF) receiver, this implementation has serious drawbacks. It is difficult to design the RF amplifier with the required level of selectivity, since the bandwidth needs to be as small as 10 kHz to separate adjacent AM broadcast carriers. Several stages of tuned circuits are required, and this poses problems with stray capacitance and inductance, which gives rise to unwanted feedback and possible oscillations. Furthermore, each circuit stage is tuned by a separate variable capacitor, and gang-tuning these stages (i.e. arranging for all the variable capacitors to be adjusted by a single control knob) is no small mechanical challenge.
Figure 7.21 Tuneable radio frequency (TRF) receiver. (Block diagram: antenna, tunable RF amplifier, demodulator, audio amplifier, loudspeaker.)
The bandwidth of a TRF receiver varies depending on the resonance frequency fc to which the above stages have been tuned. The Q of a tuned circuit (i.e. a BPF) is defined as the ratio of its centre frequency fc to its bandwidth B. This ratio remains roughly the same as the capacitance is varied (to change fc). The bandwidth of a TRF receiver is therefore B = fc/Q, which changes roughly proportionately with fc. For example, a receiver that has a bandwidth B = 10 kHz at the centre (1090 kHz) of the medium wave band (540–1640 kHz) will have a narrower bandwidth B = 4.95 kHz when tuned to a carrier frequency at the bottom end of the band. This is a reduction by a factor of 1090/540. If this receiver is tuned to a carrier at the top end of the band, its bandwidth will be about 15.05 kHz, representing an increase by a factor of 1640/1090. Thus, at one end we have impaired fidelity due to reduced bandwidth and at the other end we have increased noise and interference from adjacent channels due to excessive bandwidth.
7.5.3.2 Superheterodyne Receiver
The superheterodyne receiver (or superhet for short) is an elegant solution that performs all the required receiver functions without the drawbacks of the TRF receiver discussed above. The desired RF carrier is translated to a fixed intermediate frequency (IF), which is the same irrespective of the carrier frequency selected. The bulk of required amplification and selectivity is performed at the IF stage. This IF amplifier can be more easily designed since it is required to amplify a fixed band of frequencies. All commercial receivers for analogue radio use superhet. Figure 7.22 shows the block diagram of an AM superhet receiver. The operation of the superhet receiver is as follows.
Figure 7.22 The superheterodyne AM receiver. (Block diagram: tuned RF amplifier, mixer, IF amplifier and BPF with passband fIF ± B/2, demodulator, audio amplifier, loudspeaker; the local oscillator, fLO = fc + fIF, shares common tuning with the RF amplifier. B = 10 kHz for AM radio and 200 kHz for FM radio.)
7.5.3.2.1 Frequency Translation
To select the desired transmission at carrier frequency fc, the local oscillator (LO) is jointly tuned (using the same control knob) with the RF section so that when the RF section is tuned to fc the LO frequency is at the same time set at fc + fIF. The frequency fIF is called the intermediate frequency (IF), being lower than the incoming RF carrier frequency and higher than the message baseband frequencies. For AM radio with RF carrier in the frequency range 540–1640 kHz, fIF = 470 kHz (or 455 kHz in North America), and for FM radio with RF carrier in the range 88–108 MHz, fIF is typically 10.7 MHz. Thus
$$f_{LO} = f_c + f_{IF} \tag{7.45}$$
The required frequency translation could also be achieved with the LO maintained at frequency fIF below the incoming carrier rather than above the carrier frequency as specified in Eq. (7.45). However, this is not done in practice because it would lead to a much larger ratio between the highest oscillator frequency (needed to receive a transmission at the top end of the band) and the lowest oscillator frequency (needed to receive a transmission at the bottom end of the band). Take the example of an AM radio receiver that covers the RF carrier range 540–1640 kHz with fIF = 470 kHz. For fLO above fc
$$f_{LO\,min} = f_{c\,min} + f_{IF} = 540 + 470 = 1010\ \text{kHz}$$
$$f_{LO\,max} = f_{c\,max} + f_{IF} = 1640 + 470 = 2110\ \text{kHz}$$
This gives a ratio fLOmax/fLOmin = 2.1. Whereas for fLO below fc
$$f_{LO\,min} = f_{c\,min} - f_{IF} = 540 - 470 = 70\ \text{kHz}$$
$$f_{LO\,max} = f_{c\,max} - f_{IF} = 1640 - 470 = 1170\ \text{kHz}$$
This is a ratio fLOmax/fLOmin = 16.7, which is more difficult to achieve using a variable capacitor in the LO circuit.
The mixer is a nonlinear device that multiplies the modulated carrier at fc with the sinusoidal signal generated by the LO at frequency fLO. The result is that the message signal, carried by the carrier at fc, is translated to new frequency locations at the sum and difference frequencies
$$f_{LO} \pm f_c = f_c + f_{IF} \pm f_c = 2f_c + f_{IF} \quad \text{and} \quad f_{IF}$$
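The comparison between placing the LO above or below the carrier is easy to repeat for other bands or intermediate frequencies. The following Python sketch does the arithmetic; the function name and its `high_side` flag are invented here for illustration.

```python
def lo_tuning_range(fc_min, fc_max, f_if, high_side=True):
    """Return (f_lo_min, f_lo_max, ratio) for an LO placed above or below the carrier.

    All frequencies must share one unit (kHz in the example below).
    """
    offset = f_if if high_side else -f_if
    f_lo_min = fc_min + offset
    f_lo_max = fc_max + offset
    return f_lo_min, f_lo_max, f_lo_max / f_lo_min

# AM broadcast band example from the text: 540-1640 kHz with fIF = 470 kHz.
for label, high in (("LO above carrier", True), ("LO below carrier", False)):
    lo_min, lo_max, ratio = lo_tuning_range(540, 1640, 470, high_side=high)
    print(f"{label}: {lo_min}-{lo_max} kHz, tuning ratio {ratio:.1f}")
```

The same function can be called with the FM figures quoted above (88–108 MHz, fIF = 10.7 MHz) provided all three arguments are expressed in the same unit.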
7.5.3.2.2 IF Amplifier
The IF stage provides tuned amplification with a pass band of 10 kHz (for AM radio) centred at fIF. This amplifier therefore performs the following functions:
● It rejects the sum-frequency signal at 2fc + fIF since this is outside its pass-band.
● It amplifies the difference-frequency signal at fIF to a level required for the correct operation of the demodulator. Note that this signal carries the exact sidebands of the original RF carrier. The mixer merely translated this carrier along with its sidebands without any distortion.
● It removes any adjacent channels since these also lie outside its pass-band.
7.5.3.2.3 Image Interference
The superhet receiver has the disadvantageous possibility of simultaneous reception of two different transmissions, one at the desired frequency fc and the other at a frequency fi equal to twice the IF above the desired frequency. The frequency fi is referred to as the image frequency of fc. To see how this happens, note that the unwanted transmission at a carrier frequency fi = fc + 2fIF would be translated by the mixer to the sum and difference frequencies
$$f_i \pm f_{LO} = f_c + 2f_{IF} \pm f_{LO} = f_c + 2f_{IF} \pm (f_c + f_{IF}) = 2f_c + 3f_{IF} \quad \text{and} \quad f_{IF}$$
Whereas the sum frequency at 2fc + 3fIF is of course rejected at the IF stage, the difference frequency equals fIF and is not rejected. The transmission at fi is therefore also received. It interferes with the wanted transmission at fc. This problem is known as image interference. A practical solution to this undesirable simultaneous reception of two transmissions is to use selective stages in the RF section that discriminate against the undesired image signal. Since the image frequency is located far away from the wanted carrier (a separation of 2fIF), the pass-band of the RF section does not need to be as narrow as in the case of the TRF receiver, which is required to pass only the wanted carrier and its sidebands.
Worked Example 7.6
(a) Describe how an incoming AM radio transmission at a carrier frequency of 1 MHz would be processed. Assume an IF fIF = 470 kHz.
(b) What is the image frequency of this transmission?
(a) The action of using a control knob to tune to the transmission at fc = 1000 kHz also sets the LO frequency to
$$f_{LO} = f_c + f_{IF} = 1000 + 470 = 1470\ \text{kHz}$$
The RF amplifier is thus adjusted to have a passband centred on fc. It boosts the carrier at fc and its sidebands, but heavily attenuates other transmissions at frequencies far from fc, including the image frequency at fi = fc + 2fIF = 1940 kHz. The amplified carrier and its sidebands are passed to the mixer, which translates fc (along with the message signal carried by fc in its sidebands) to two new frequencies, one at fLO + fc = 2470 kHz and the other at fLO − fc = 470 kHz. The copy of the signal at 2470 kHz is rejected by the IF BPFs, whereas the other copy at the IF frequency of 470 kHz is amplified and applied to an envelope demodulator, which extracts the message signal. The message signal is amplified in a linear audio amplifier, which drives a loudspeaker that converts the message signal to audible sound.
(b) The image frequency fi is given by
$$f_i = f_c + 2f_{IF} = 1000 + 2 \times 470 = 1940\ \text{kHz}$$
To confirm that this is correct, we check that fi is translated by the mixer to a difference frequency equal to the IF
$$f_i - f_{LO} = 1940 - 1470 = 470\ \text{kHz}$$
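The frequency plan of Worked Example 7.6 can be tabulated with a short Python sketch, which also makes it easy to repeat the exercise for other carriers. The function names and dictionary keys are hypothetical, chosen for this illustration; the relations used are Eq. (7.45) and fi = fc + 2fIF.

```python
def superhet_frequency_plan(fc, f_if):
    """Frequencies (same unit as the inputs) for a superhet with the LO above the carrier."""
    f_lo = fc + f_if                      # Eq. (7.45)
    return {
        "f_lo": f_lo,
        "mixer_sum": f_lo + fc,           # rejected by the IF filter
        "mixer_difference": f_lo - fc,    # equals the IF and is amplified
        "image": fc + 2 * f_if,           # unwanted carrier that also mixes to the IF
    }

def mixes_to_if(f_signal, f_lo, f_if):
    """True if a carrier at f_signal produces a difference frequency equal to the IF."""
    return abs(f_signal - f_lo) == f_if

plan = superhet_frequency_plan(fc=1000, f_if=470)   # kHz, as in the worked example
print(plan)
print("wanted carrier reaches IF:", mixes_to_if(1000, plan["f_lo"], 470))
print("image carrier reaches IF:", mixes_to_if(plan["image"], plan["f_lo"], 470))
```

Both checks return True, which is the numerical statement of the image interference problem: the mixer alone cannot tell the two carriers apart, so the RF stage must attenuate the image before mixing.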
7.6 Merits, Demerits, and Application of AM
The most important advantage of AM over other modulation techniques is its circuit simplicity. Specifically:
● AM signals are easy to generate using simple circuits such as a switching or square-law modulator.
● AM signals can also be easily demodulated using the simple envelope demodulator discussed in Section 7.5.1, or a simple square-law demodulator discussed in Question 7.11.
However, AM has the following drawbacks.
● It is wasteful of power. It is shown in Eq. (7.24) that at least two-thirds of the power is in the transmitted carrier, which carries no information. Only a maximum of one-third of the transmitted power is used to support the information signal (in the sidebands). For example, to transmit a total sideband power of 20 kW, one must transmit a total power above 60 kW, since 100% modulation is never used in practice. In this case, more than 40 kW of power is wasted.
● AM is also wasteful of bandwidth. We see in Eq. (7.15) that AM requires twice the bandwidth of the message signal. Two sidebands, one a mirror image of the other, are transmitted even though any one of them is enough to reproduce the message signal at the receiver.
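The power figures quoted above can be explored for other modulation depths. The sketch below assumes the standard result for sinusoidal modulation, namely that the total power is Pc(1 + m²/2) with Pc the carrier power, so that the sidebands hold a fraction m²/(2 + m²) of the total; this is taken here to be what Eq. (7.24) expresses, although that equation is not reproduced in this section. The function names are chosen for illustration.

```python
def sideband_power_fraction(m):
    """Fraction of total AM power carried by the sidebands (sinusoidal modulation assumed)."""
    return m * m / (2.0 + m * m)

def total_power_for_sideband_power(p_sideband, m):
    """Total transmitted power needed to deliver p_sideband in the sidebands."""
    return p_sideband / sideband_power_fraction(m)

for m in (1.0, 0.8, 0.5):
    total = total_power_for_sideband_power(20.0, m)     # sideband power in kW
    print(f"m = {m}: sideband share {100 * sideband_power_fraction(m):.1f}%, "
          f"total {total:.1f} kW, carrier {total - 20.0:.1f} kW")
```

At m = 1 the sketch returns the limiting figures of one-third and 60 kW; any smaller modulation factor pushes the required total power higher.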
AM is suitable for the transmission of low-bandwidth message signals in telecommunication applications that have low-cost receiver implementation as a major design consideration. Its main application is therefore in audio broadcasting where it is economical to have one expensive high-power transmitter and numerous inexpensive receivers.
7.7 Variants of AM
So far, we have discussed what may be fully described as double sideband transmitted carrier amplitude modulation (DSB-TC-AM). You may also wish to describe it as double sideband large carrier amplitude modulation (DSB-LC-AM), to emphasise the presence of a large carrier component in the transmitted signal. Various modifications of this basic AM technique have been devised to save power and/or bandwidth, at the cost of increased transmitter and receiver complexity.
7.7.1 DSB
The double sideband suppressed carrier amplitude modulation (DSB-SC-AM), usually simply abbreviated DSB, is generated by directly multiplying the message signal with the carrier signal using a balanced modulator. The carrier is not transmitted (i.e. the carrier is suppressed), and this saves a lot of power, which can be put into the two sidebands that carry the message signal.
7.7.1.1 Waveform and Spectrum of DSB
The DSB waveform is given by
$$v_{DSB}(t) = v_m(t) \times v_c(t) = v_m(t)V_c\cos(2\pi f_c t) \tag{7.46}$$
For a sinusoidal message signal vm(t) = Vm cos(2πfm t),
$$\begin{aligned}
v_{DSB}(t) &= V_m\cos(2\pi f_m t) \times V_c\cos(2\pi f_c t) \\
&= \underbrace{\tfrac{1}{2}V_mV_c\cos[2\pi(f_c - f_m)t]}_{\text{LSF}} + \underbrace{\tfrac{1}{2}V_mV_c\cos[2\pi(f_c + f_m)t]}_{\text{USF}}
\end{aligned} \tag{7.47}$$
The DSB consists of an LSF of amplitude 0.5V m V c , and a USF of the same amplitude. There is no carrier component. Figure 7.23 shows the waveform of a DSB signal for the case f c = 20f m , and V c = 2 V m . Note that the envelope of the DSB is not the same as the message signal, and therefore a simple envelope demodulator cannot be used to extract the message signal as was done in basic AM. Compared to basic AM, we see that a simple (and hence low-cost) receiver circuit has been traded for a saving in transmitted power.
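The content of Eq. (7.47) can be verified numerically: the product form and the two-side-frequency form agree sample by sample, and a tone measurement at fc finds no carrier. The Python sketch below uses the case fc = 20fm and Vc = 2Vm mentioned above; the specific values fm = 1 kHz, Vm = 1 V, the 1 MHz sampling rate and the helper name `tone_amplitude` are assumptions made for the illustration.

```python
import cmath
import math

def tone_amplitude(samples, freq, fs):
    """Amplitude of the component at freq (Hz); exact over a whole number of periods."""
    acc = sum(x * cmath.exp(-2j * math.pi * freq * n / fs)
              for n, x in enumerate(samples))
    return 2.0 * abs(acc) / len(samples)

# Assumed illustrative values with fc = 20 fm and Vc = 2 Vm.
fm, vm = 1e3, 1.0
fc, vc = 20 * fm, 2 * vm
fs, n_samples = 1e6, 1000               # 1 ms record: one full message period

product_form, side_frequency_form = [], []
for n in range(n_samples):
    t = n / fs
    product_form.append(vm * math.cos(2 * math.pi * fm * t) * vc * math.cos(2 * math.pi * fc * t))
    side_frequency_form.append(
        0.5 * vm * vc * math.cos(2 * math.pi * (fc - fm) * t)
        + 0.5 * vm * vc * math.cos(2 * math.pi * (fc + fm) * t))

max_gap = max(abs(a - b) for a, b in zip(product_form, side_frequency_form))
print(f"largest difference between the two forms: {max_gap:.2e}")
for f in (fc - fm, fc, fc + fm):
    print(f"f = {f / 1e3:.0f} kHz: amplitude = {tone_amplitude(product_form, f, fs):.3f} V")
```

The measured amplitudes at fc ± fm equal 0.5VmVc, while the component at fc is zero to within rounding error.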
Figure 7.23 DSB signal waveform: (a) sinusoidal message; (b) carrier; (c) DSB. (Panel (c) shows the envelope and the phase reversals of the DSB waveform.)
Figure 7.24 DSB spectrum: (a) Sinusoidal message signal; (b) Arbitrary message signal.
Figure 7.24a shows the spectrum of the DSB signal in Eq. (7.47). For an arbitrary message signal consisting of a band of sinusoids in the frequency range from f 1 to f m , each component sinusoid is translated to an LSF and a USF. This results in the LSB and USB shown in Figure 7.24b. DSB therefore has the same bandwidth as AM, which is twice the message bandwidth.
Figure 7.25 Ring or lattice implementation of a balanced modulator.
7.7.1.2 DSB Modulator
A circuit that generates a DSB waveform must be able to prevent the carrier signal from getting to its output while at the same time passing the side frequencies. This cannot be easily achieved by filtering since the side frequencies are very close to the carrier. A balanced modulator can generate a DSB waveform, suppressing the carrier and passing only the side frequencies to the output. It is also called a product modulator. A realisation of balanced modulation using a diode ring is shown in Figure 7.25. The diode ring or lattice modulator consists of an input transformer T1, an output transformer T2, and four diodes connected to form a ring (a–b–c–d–a) in which each diode is traversed in the same direction (from its positive terminal to its negative terminal). The carrier signal vc(t) is connected between the exact centre taps of T1 and T2; the message signal vm(t) is connected to the primary terminals e–e′ of T1; and the output signal vs×m(t) is taken from the secondary terminals f–f′ of T2. It is easy to understand the operation of this popular circuit if we take the discussion in parts. We first look at the effect of the carrier signal and then at how the message signal is transferred from the input terminals e–e′ to the output terminals f–f′.
7.7.1.2.1 Carrier Suppression
Ignore the message signal for the moment and concentrate on the carrier. Let the polarity of the carrier source be as shown in Figure 7.25 during the positive cycle of the carrier. The polarity will be reversed during the negative half of the carrier cycle. Now consider what happens during the positive half of the carrier cycle. There is a positive voltage at node a, which forward-biases diode D1 so that it is effectively a small resistance RfD1, and reverse-biases diode D4 so that it is effectively an open circuit. There is also a positive voltage at node c, which forward-biases diode D3 so that it is effectively a small resistance RfD3, and reverse-biases diode D2 so that it is effectively an open circuit. The circuit (ignoring the message signal) therefore reduces to what is shown in Figure 7.26a. Note that we cut out diodes D2 and D4 (since they are open circuits) and rotated the remaining circuit clockwise through 90° to obtain this figure. Z1 is the impedance of the lower half of the secondary coil of T1; Z2 is the impedance of the upper half of this coil. Similarly, Z3 is the impedance of the lower half of the primary coil of T2, and Z4 is the impedance of the upper half of this primary coil. The carrier voltage produces current It, which divides into Ia that flows through Z2–RfD1–Z4, and Ic that flows through Z1–RfD3–Z3. If the transformers are perfectly balanced and the diodes are identical then
$$Z_1 = Z_2, \qquad Z_3 = Z_4, \qquad R_{fD1} = R_{fD2} = R_{fD3} = R_{fD4}$$
Under this condition, It divides equally into the two paths and
$$I_a = I_c$$
The result is that the carrier causes two equal currents to flow in opposite directions in the primary of T2. These currents induce opposite magnetic fields that cancel out exactly so that no current is induced in the secondary winding of T2. Thus, the carrier has been suppressed or balanced out and does not reach the output terminals f–f′ during the positive half-cycle of the carrier.
Figure 7.26 Currents due to carrier voltage during (a) positive and (b) negative cycles of carrier.
Now consider what happens during the negative half-cycle of the carrier. There is a positive voltage at node b, which forward-biases diode D2 so that it is effectively a small resistance RfD2, and reverse-biases diode D1 so that it is effectively an open circuit. There is also a positive voltage at node d, which forward-biases diode D4 so that it is effectively a small resistance RfD4, and reverse-biases diode D3 so that it is effectively an open circuit. The circuit (again ignoring the message signal) therefore reduces to what is shown in Figure 7.26b. To obtain this figure, we cut out diodes D1 and D3 (since they are open circuits), twist the circuit through 180° so that paths b–c and a–d no longer cross, and rotate the remaining circuit clockwise through 90°. By a similar argument as for the positive cycle, it follows that the carrier is also prevented from reaching the output terminals f–f′ during the negative half of the cycle. Thus, the function of the carrier is to switch the diodes rapidly (at carrier frequency) providing different paths for the message signal to reach the output, as we now discuss, but the carrier itself is prevented from reaching the output. This is a notable achievement.
7.7.1.2.2 Message Signal Transfer
We have seen that during the positive cycle diodes D1 and D3 are forward biased, whereas D2 and D4 are reverse biased. Ignoring the carrier, since it has no further effect on the circuit beyond this switching action, we obtain the circuit of Figure 7.27a for the positive cycle. The message signal is coupled from e–e′ to a–c by transformer T1 action. It is then applied across b–d, with a small drop across RfD1 and RfD3 (which we ignore). The message signal is coupled from b–d to the output terminal f–f′ by transformer T2 action. Thus, during the positive half of a carrier cycle
$$v_{s \times m}(t) = v_m(t)$$
Figure 7.27 Transfer of message signal vm(t) from input terminals e–e′ to output terminals f–f′ during (a) positive and (b) negative cycles of carrier.
Now consider what happens during a negative half of the cycle. The effective circuit is as shown in Figure 7.27b. The message signal is coupled to terminal a–c as before. The paths to terminal b–d have been reversed causing the message signal to be applied to b–d with a reversed polarity. That is, signal −vm(t) is applied to b–d. The small drop across RfD2 and RfD4 is again ignored. From b–d this signal is coupled to the output terminal f–f′. Thus
$$v_{s \times m}(t) = -v_m(t)$$
7.7.1.2.3 DSB Signal
From the foregoing, we see that the signal vs×m(t) is the product of the message signal vm(t) and a bipolar unit-amplitude square wave vs(t) of period equal to the carrier period. This is demonstrated in Figure 7.28 for a sinusoidal message signal. The bipolar square wave has amplitude +1 during the positive half of the carrier cycle and amplitude −1 during the negative half of the cycle. Being a square wave, it contains only odd harmonic frequencies. Furthermore, since its average value is zero, it does not contain a DC component Ao in its Fourier series. So we may write
$$\begin{aligned}
v_{s \times m}(t) &= v_m(t) \times v_s(t) \\
&= v_m(t)\{A_1\cos[2\pi f_c t] + A_3\cos[2\pi(3f_c)t] + \cdots\} \\
&= v_m(t)A_1\cos[2\pi f_c t] + v_m(t)A_3\cos[2\pi(3f_c)t] + \cdots
\end{aligned} \tag{7.48}$$
The first term is the required DSB signal, being a direct multiplication of the message signal and the carrier. Subsequent terms are the sidebands of the message signal centred at odd harmonics of the carrier frequency, namely $3f_c$, $5f_c$, … Therefore, as shown in Figure 7.25, the DSB signal $v_{dsb}(t)$ is obtained by passing $v_{s \times m}(t)$ through a BPF of centre frequency $f_c$ and bandwidth $2f_m$, where $f_m$ is the maximum frequency component of the message signal. For this to work, adjacent sidebands in Eq. (7.48) must not overlap. This requires that the lowest-lying frequency $3f_c - f_m$ in the LSB located just below $3f_c$ must be higher than the highest frequency $f_c + f_m$ in the USB located just above $f_c$. That is

$$3f_c - f_m > f_c + f_m$$

or

$$f_c > f_m \tag{7.49}$$

Figure 7.28 Output $v_{s \times m}(t)$ of a diode ring modulator is the product of the message signal $v_m(t)$ and bipolar unit-amplitude square wave $v_s(t)$. [Panels, top to bottom: $v_m(t)$; $v_s(t)$ with period $T = 1/f_c$; $v_{s \times m}(t)$; each plotted against time $t$.]
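The structure of Eq. (7.48) and the condition of Eq. (7.49) can be checked numerically. The Python sketch below is illustrative only: the function names `square_wave_harmonic` and `sidebands_separable`, the sample count, and the trial frequencies are assumptions introduced here and do not come from the text. It estimates the cosine Fourier coefficients of a bipolar unit-amplitude square wave that switches with the sign of a cosine carrier, and tests whether the sidebands around $f_c$ and $3f_c$ can be separated.

```python
import math

def square_wave_harmonic(n, samples=20000):
    """Estimate the cosine Fourier coefficient A_n of a bipolar unit-amplitude
    square wave that is +1 while cos(theta) >= 0 and -1 otherwise."""
    total = 0.0
    for k in range(samples):
        theta = 2 * math.pi * (k + 0.5) / samples   # carrier phase, mid-point sampling
        s = 1.0 if math.cos(theta) >= 0 else -1.0    # bipolar square wave v_s
        total += s * math.cos(n * theta)
    scale = 1.0 if n == 0 else 2.0                   # A_0 is simply the average value
    return scale * total / samples

def sidebands_separable(fc, fm):
    """Condition of Eq. (7.49): the LSB below 3fc must lie above the USB above fc."""
    return 3 * fc - fm > fc + fm

# Coefficients A_0 ... A_5 (hypothetical range chosen for illustration)
coeffs = {n: square_wave_harmonic(n) for n in range(6)}
```

Under these assumptions $A_0$ and the even-order coefficients vanish to within numerical error, while $A_1$ comes out close to $4/\pi$, which is the known fundamental amplitude of a unit square wave. Calling `sidebands_separable(100e3, 3.4e3)` shows the condition being met with a wide margin for a voice signal on a 100 kHz carrier.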
Subsequently, a balanced modulator or product modulator will be treated as comprising a diode ring modulator and an appropriate BPF. This leads to the block diagram of Figure 7.29 in which the product modulator receives two inputs, namely the message signal and the carrier, and produces the DSB signal as output.

7.7.1.3 DSB Demodulator
The original message signal $v_m(t)$ can be extracted from a received DSB signal $v_{dsb}(t)$ by a product modulator followed by an LPF, as shown in Figure 7.30. However, this requires that the locally generated carrier should have exactly the same frequency and phase as the carrier used at the transmitter, including the phase perturbations imposed by the transmission medium. This demodulation scheme is therefore referred to as coherent (or synchronous) demodulation.

Figure 7.29 The product modulator consists of a ring modulator followed by a bandpass filter (BPF) of centre frequency $f_c$ and bandwidth $2f_m$. [Block diagram: the inputs are the message $v_m(t)$ and the carrier $V_c \cos(2\pi f_c t)$; the output is the DSB signal $v_{dsb}(t) = v_m(t) \times V_c \cos(2\pi f_c t)$.]

Figure 7.30 Coherent demodulation of DSB. [Block diagram: the incoming DSB signal $v_{dsb}(t) = v_m(t)V_c \cos(2\pi f_c t + \phi)$ and the local oscillator output $v_{LO}(t) = V_{cl} \cos(2\pi f_c t)$ feed a product modulator, whose output $v_o(t)$ passes through an LPF to give the demodulated signal $v'_m(t)$.]

7.7.1.3.1 Effect of Phase Error
To see why phase synchronisation is important, let us assume in Figure 7.30 that variations in the transmission medium have perturbed the phase of the missing DSB carrier by $\phi$ relative to the locally generated carrier, which is matched in frequency. The incoming DSB signal $v_{dsb}(t)$ and the locally generated carrier $v_{LO}(t)$ are therefore given by

$$v_{dsb}(t) = v_m(t) \times V_c \cos(2\pi f_c t + \phi)$$

$$v_{LO}(t) = V_{cl} \cos(2\pi f_c t)$$

The output of the product modulator is therefore

$$\begin{aligned} v_o(t) &= v_{dsb}(t) \times v_{LO}(t) \\ &= [v_m(t)V_c \cos(2\pi f_c t + \phi)][V_{cl} \cos(2\pi f_c t)] \\ &= v_m(t)V_c V_{cl} \cos(2\pi f_c t) \cos(2\pi f_c t + \phi) \\ &= K v_m(t) \cos(\phi) + K v_m(t) \cos(4\pi f_c t + \phi) \end{aligned} \tag{7.50}$$

where we have used the trigonometric identity for the product of two cosines and set the constant $K = \tfrac{1}{2} V_c V_{cl}$. The last term in Eq. (7.50) is a bandpass signal centred at $2f_c$, and is rejected by the LPF, which therefore yields the demodulated signal

$$v'_m(t) = K \cos(\phi)\, v_m(t) \tag{7.51}$$
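The scaling by $\cos(\phi)$ in Eq. (7.51) can be reproduced with a few lines of Python. This is a minimal sketch under stated assumptions: the function name `coherent_demod_gain` is hypothetical, the message is treated as constant ($v_m = 1$) over one carrier period, and averaging over exactly one period stands in for the LPF.

```python
import math

def coherent_demod_gain(phi, vc=1.0, vcl=1.0, samples=10000):
    """Average the product of the incoming carrier Vc*cos(wt + phi) and the
    local carrier Vcl*cos(wt) over one carrier period. The average plays the
    role of the LPF output for a message held constant at v_m = 1."""
    total = 0.0
    for k in range(samples):
        wt = 2 * math.pi * k / samples               # carrier phase over one period
        total += vc * math.cos(wt + phi) * vcl * math.cos(wt)
    return total / samples

gain_in_phase = coherent_demod_gain(0.0)             # expected K = 0.5 * Vc * Vcl
gain_60_deg = coherent_demod_gain(math.pi / 3)       # expected K * cos(60 deg)
gain_quadrature = coherent_demod_gain(math.pi / 2)   # expected zero
```

With unit amplitudes the three gains come out as 0.5, 0.25, and essentially zero, which matches $K\cos(\phi)$ with $K = \tfrac{1}{2}V_c V_{cl}$.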
The received message signal is therefore proportional to the cosine of the phase error $\phi$. If $\phi$ has a constant value other than ±90° then the message signal is simply scaled by a constant factor and is not distorted. However, in practice the phase error will vary randomly, leading to random variations (i.e. distortions) in the received signal. The received signal is maximum when $\phi = 0$ and is zero when $\phi$ = ±90°. This means that a DSB signal carried by a cosine carrier cannot be demodulated by a sine carrier, and vice versa. This is the so-called quadrature null effect.

7.7.1.3.2 Phase Synchronisation
The Costas loop shown in Figure 7.31 is an arrangement that keeps the locally generated carrier synchronised in phase and frequency with the missing carrier of the incoming DSB signal. It employs two coherent demodulators. A coherent demodulator consists of a product modulator fed with the incoming DSB and a locally generated carrier and followed by an LPF. Check that you can identify the two coherent demodulators in Figure 7.31. Both demodulators are fed with a carrier generated by the same voltage-controlled oscillator (VCO). However, one of the demodulators, referred to as the quadrature-phase demodulator, has its carrier phase reduced by 90° (to make it a sine carrier), whereas the other (in-phase) demodulator is fed with the VCO-generated cosine carrier that is ideally in phase with the missing carrier of the DSB signal. The demodulated signal is taken at the output of the in-phase demodulator. When there is no phase error, the quadrature-phase demodulator output is zero (recall the quadrature null effect) and the in-phase demodulator provides the correct demodulated signal.

Now consider what happens when there is a phase difference or error $\phi$ between the VCO output and the missing carrier in the DSB signal. Treating this missing carrier as the reference, its initial phase is zero and the incoming DSB signal is

$$v_{dsb}(t) = v_m(t)V_c \cos(2\pi f_c t)$$

The carriers fed to the in-phase and quadrature-phase demodulators are therefore, respectively, $V_{cl} \cos(2\pi f_c t + \phi)$ and $V_{cl} \sin(2\pi f_c t + \phi)$. The in-phase demodulator output is then $K \cos(\phi) v_m(t)$ as earlier derived, whereas the quadrature-phase demodulator output is $K \sin(\phi) v_m(t)$. Thus, a phase error causes the in-phase demodulator output to drop and the quadrature-phase demodulator output to increase from zero.

Phase synchronisation, which maintains the value of the phase error $\phi$ around zero, is achieved by the combined action of the phase discriminator and the VCO. The two demodulator outputs are fed into the phase discriminator, which produces an output voltage $v_{ph}(t)$ proportional to $\phi$ and causes the VCO output frequency to change slightly in such a way that the phase error is reduced towards zero.

Figure 7.31 Demodulation of DSB: Costas loop. [Block diagram: the incoming DSB signal $v_m(t)V_c \cos(2\pi f_c t)$ feeds an in-phase branch (product modulator driven by $V_{cl} \cos(2\pi f_c t + \phi)$ from the VCO, followed by an LPF, output $K \cos(\phi) v_m(t)$, the demodulated signal) and a quadrature-phase branch (product modulator driven through a 90° phase shift by $V_{cl} \sin(2\pi f_c t + \phi)$, followed by an LPF, output $K \sin(\phi) v_m(t)$). Both LPF outputs feed the phase discriminator, whose output $v_{ph}(t)$ controls the VCO.]

7.7.1.4 DSB Applications
The advent of integrated circuits made it possible for the problem of phase synchronisation in DSB reception to be overcome using affordable receivers and paved the way for the use of DSB in several areas.

● The now obsolete analogue television systems NTSC (National Television System Committee) (in North America) and PAL (Phase Alternating Line) (in Europe) employed DSB to transmit colour (or chrominance) information. Two colour-difference signals are transmitted on two subcarriers that have the same frequency but differ in phase by 90°. You may wish to view it this way: one signal is carried on an in-phase cosine carrier of frequency $f_c$, and the other on a (quadrature-phase) sine carrier of the same frequency. It can be shown (Question 7.13) by virtue of the quadrature null effect that the two signals will be separated at the receiver without mutual interference. This modulation strategy may be described in full as double sideband suppressed carrier quadrature amplitude modulation.
● DSB is also used for transmitting stereo information in FM sound broadcast at very high frequency (VHF). Sending two different audio signals $v_L(t)$ and $v_R(t)$, termed the left and right channels, respectively, representing, for example, sound from different directions entering two sufficiently spaced microphones at a live music concert, greatly enriches the reproduction of the concert's sound at a receiver. However, this stereo transmission must be on a single carrier in order not to exceed the bandwidth already allocated to FM. Furthermore, it must be sent in such a way that nonstereo receivers can give normal monaural reproduction. These stringent conditions are satisfied as follows.

At the FM stereo transmitter (Figure 7.32a)

● Sum signal $v_{L+R}(t)$ and difference signal $v_{L-R}(t)$ are generated by, respectively, summing and subtracting the two channels.
● $v_{L-R}(t)$ is DSB modulated using a 38 kHz carrier obtained by doubling the frequency of a 19 kHz crystal oscillator. Let us denote this DSB signal $v_{dsb}(t)$.
● The signals $v_{L+R}(t)$, $v_{dsb}(t)$, and the 19 kHz oscillator frequency are summed to give a composite signal $v_m(t)$. The spectrum $V_m(f)$ of the composite signal $v_m(t)$ is shown in Figure 7.32b. Clearly, there is no mutual interference between the three signals that are summed, since each lies in a separate frequency band: $v_{L+R}(t)$ in the baseband from 0 to 15 kHz, $v_{dsb}(t)$ in the passband from 23 to 53 kHz, and of course the pilot carrier at 19 kHz. This is an application of frequency division multiplexing, which is discussed in detail in Chapter 13.
● The composite signal $v_m(t)$ is transmitted as $v_{fm}(t)$ using frequency modulation, the subject of the next chapter.

At the receiver (Figure 7.32c)

1. $v_m(t)$ is extracted from $v_{fm}(t)$ by frequency demodulation.
2. An LPF of 15 kHz bandwidth extracts $v_{L+R}(t)$ from $v_m(t)$. A nonstereo receiver plays $v_{L+R}(t)$ on a loudspeaker and that completes its signal processing. A stereo receiver, however, goes further through the next steps (3)–(5).
3. The 19 kHz pilot is extracted using a narrow BPF, and $v_{dsb}(t)$ is extracted using a BPF of bandwidth 30 kHz centred on 38 kHz.
4. The 19 kHz pilot is doubled in frequency. This provides a phase-synchronised 38 kHz carrier that is used to demodulate $v_{dsb}(t)$ to yield $v_{L-R}(t)$. In this way, the sum and difference signals have been recovered.
5. The left and right channels are obtained by taking the sum and difference of $v_{L+R}(t)$ and $v_{L-R}(t)$. You may wish to check that this is the case. The two channels can now be played back on separate loudspeakers to give stereo reproduction.
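The matrixing in the transmitter and in receiver step 5 can be checked with a short Python sketch. The function names `stereo_encode` and `stereo_decode`, the `bands` dictionary, and the sample values are hypothetical and chosen only for illustration; the band edges are those quoted in the text. The sketch covers the baseband sum and difference operations only and does not model the DSB or FM stages.

```python
def stereo_encode(left, right):
    """Form the sum signal L+R and the difference signal L-R, sample by sample."""
    sum_sig = [l + r for l, r in zip(left, right)]
    diff_sig = [l - r for l, r in zip(left, right)]
    return sum_sig, diff_sig

def stereo_decode(sum_sig, diff_sig):
    """(L+R) + (L-R) = 2L and (L+R) - (L-R) = 2R; halve to restore the scale."""
    left = [(s + d) / 2 for s, d in zip(sum_sig, diff_sig)]
    right = [(s - d) / 2 for s, d in zip(sum_sig, diff_sig)]
    return left, right

# Frequency bands of the composite signal in kHz, using the values in the text
AUDIO_MAX_KHZ = 15.0
PILOT_KHZ = 19.0
SUBCARRIER_KHZ = 2 * PILOT_KHZ                       # 38 kHz, pilot doubled
bands = {
    "L+R": (0.0, AUDIO_MAX_KHZ),
    "pilot": (PILOT_KHZ, PILOT_KHZ),
    "L-R DSB": (SUBCARRIER_KHZ - AUDIO_MAX_KHZ, SUBCARRIER_KHZ + AUDIO_MAX_KHZ),
}

# Hypothetical sample values for the two channels
left_in = [0.5, -0.25, 1.0]
right_in = [0.25, 0.75, -1.0]
left_out, right_out = stereo_decode(*stereo_encode(left_in, right_in))
```

The decoded channels equal the inputs, and the computed DSB band of 23 to 53 kHz sits clear of both the 0 to 15 kHz baseband and the 19 kHz pilot.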
7.7.2 SSB
DSB provides a power-saving improvement over basic AM. However, it still requires twice the bandwidth of the message signal since both the lower and upper sidebands are transmitted. It is obvious in Figure 7.24b that these sidebands (LSB and USB) are mirror images of each other about the carrier frequency $f_c$. That is, measuring outward from $f_c$, the LSB contains the same frequencies at the same amplitudes as the USB. They therefore represent the same information, namely that contained in the message signal, and it is wasteful of bandwidth to send two copies of the same information. Single sideband suppressed carrier amplitude modulation, abbreviated SSB, transmits only one sideband. As in DSB, the carrier is also not transmitted. Figure 7.33 shows the spectrum of an SSB signal.
Figure 7.32 FM stereo: (a) Transmitter; (b) Spectrum $V_m(f)$ of composite signal $v_m(t)$; (c) Receiver. [(a) Summing devices form $v_{L+R}(t)$ and $v_{L-R}(t)$ from $v_L(t)$ and $v_R(t)$; a crystal oscillator at $f_c$ = 19 kHz is doubled to $2f_c$ to drive the DSB modulator; $v_{L+R}(t)$, $v_{dsb}(t)$, and the pilot are summed to give $v_m(t)$, which feeds the FM modulator and antenna. (b) L+R occupies 0 to 15 kHz, the pilot lies at 19 kHz, and L−R occupies 23 to 53 kHz. (c) FM demodulator, LPF (0 → 15 kHz), narrow BPF (19 kHz) with frequency doubler, BPF (23 → 53 kHz), DSB demodulator, and summing devices giving $2v_L(t)$ and $2v_R(t)$.]

7.7.2.1 Merits and Demerits of SSB
SSB provides numerous benefits.

● The bandwidth of an SSB signal is the same as that of the original message signal, and is therefore half the bandwidth that would be required to transmit the same message by DSB or basic AM. Thus, SSB doubles spectrum utilisation in that it allows twice as many signals to be packed into the same frequency range as could be done with DSB or AM.
● The passband of an SSB receiver is half that of AM and DSB receivers. As a result, noise power, which is proportional to bandwidth, is reduced by a factor of two. This yields a 3 dB improvement in signal-to-noise ratio.
● Power that was spread out over two sidebands (and a carrier in the case of AM) is now concentrated into one sideband. So, for the same output power, the SSB signal can be received with a higher signal power per unit message bandwidth than AM and DSB. For the same reason, SSB transmission can be received at greater distances than AM and DSB transmission of the same power.
● The SSB transmitter produces a nonzero power output only when a message signal is present, unlike an AM transmitter, which continues to radiate a high-power carrier during those time intervals when there is a pause in the message signal. The SSB (and DSB) transmitter is therefore more efficient.
● SSB transmission is less susceptible to the phenomenon of selective fading than AM and DSB. Under selective fading, different frequency components of a signal will arrive at the receiver having undergone different amounts of propagation delay. This may arise in the case of sky wave propagation because these frequencies have been effectively reflected from different layers of the ionosphere. It can be shown (see Question 7.13) that for AM and DSB transmission to be correctly received the LSB, USB, and carrier must have the same initial phases. Selective fading causes the phases of these three components to be altered by amounts that are not proportional to their frequency. As a result, they appear at the receiver to have different initial phases, and this causes distortion in the received signal. Figure 7.34 demonstrates an example of the potential effect of selective fading on AM signals. The (undistorted) AM signal results from 80% modulation of a 1 MHz carrier by a 25 kHz sinusoidal message signal. We assume that selective fading causes the side frequencies to be shifted in phase by 60° relative to the carrier frequency. The received AM signal will have the waveform shown in Figure 7.34a. An envelope demodulator would then produce the output shown in Figure 7.34b, which is compared with the original message signal on the same plot. It is apparent that signal distortion occurs when carrier and side frequencies have unequal initial phases. In exceptional circumstances, complete signal cancellation may occur at certain time instants. SSB transmission has only one sideband, which is demodulated using a locally generated carrier. The effect of selective fading is therefore greatly reduced.
● The SSB technique reduces the effect of amplifier nonlinearity in frequency division multiplex (FDM) systems. We saw in Chapter 4 that the effect of a nonlinear transmission medium is to generate intermodulation products, the amplitudes of which increase with signal power. In FDM systems, where many independent signals (or channels) are transmitted in adjacent frequency bands on the same link, some of these products will fall in bands occupied by other signals, giving rise to crosstalk. Carrier suppression allows the use of a smaller signal power in SSB (and to a lesser extent in DSB), and therefore minimises the effect of nonlinearity.

Figure 7.33 Amplitude spectrum of SSB for (a) sinusoidal message signal and (b) arbitrary message signal. [(a) A message tone of amplitude $A_{mn}$ at $f_m$ gives a single side frequency of amplitude $A_{ssbn}$, either the lower side frequency at $f_c - f_m$ or the upper side frequency at $f_c + f_m$. (b) A message spectrum $|V_m(f)|$ from $f_1$ to $f_m$ gives $|V_{ssb}(f)|$ as either an LSB from $f_c - f_m$ to $f_c - f_1$ or a USB from $f_c + f_1$ to $f_c + f_m$.]

Figure 7.34 Effect of selective fading that shifts the phases of the side frequencies by 60° relative to the carrier. In this plot, the original AM signal is an 80% modulation of a 1 MHz carrier by a 25 kHz sinusoidal message signal: (a) received AM signal and its envelope; (b) original and received message signals compared.

The main disadvantage of SSB is that it requires complex and expensive circuits since a local carrier signal must be generated that is synchronised in frequency and phase with the missing carrier in the incoming SSB signal.

7.7.2.2 SSB Modulators
SSB can be generated in two different ways, namely filtering (i.e. frequency discrimination) and phase discrimination.

7.7.2.2.1 Frequency Discrimination Method
The most obvious method of generating an SSB signal is to first generate a DSB signal using a product modulator and then to filter out one of the sidebands. This filtering method is shown in Figure 7.35. The main difficulty with this method is that the required selectivity of the BPFs may be very high, making the filters expensive. If $f_1$ is the lowest-frequency component of the message signal then the frequency gap between the two sidebands that must be separated by the filter is $2f_1$. In telephony with voice baseband standardised at 300–3400 Hz, this gap is 600 Hz, whereas in TV with $f_1 = 0$, there is no gap between the sidebands. This method therefore cannot be employed with TV signals. Even for voice telephony, a BPF that has a transition width of only about 600 Hz is uneconomical at RF frequencies. For example, if the carrier frequency is, say, 3 MHz, the filter would require a frequency response slope steep enough to go from pass band at 3 000 300 Hz to stop band at 2 999 700 Hz. The transition width in this case would be only about 0.02% of the centre frequency. If, however, a lower-frequency carrier is used, say $f_c$ = 100 kHz, the required transition width becomes 0.6% of the centre frequency and the filter can be realised more cheaply.

The generation of a high-frequency carrier SSB is therefore usually done in two stages. First, a lower-frequency carrier $f_{c1}$ is employed to generate the SSB. The gap between the sidebands is only $2f_1$, where $f_1$ is the lowest-frequency component of the message signal. However, because $f_{c1}$ is low, an affordable BPF can be used to separate the sidebands. The output of this filter enters a second product modulator that operates at a higher carrier frequency $f_{c2}$. This again generates two sidebands. However, the gap between the sidebands is now $2(f_{c1} - f_m)$ when the LSB is kept in the first stage, where $f_m$ is the maximum frequency component of the message signal (keeping the USB instead gives the slightly larger gap $2(f_{c1} + f_1)$). Since $f_{c1}$ is much larger than $f_m$, the gap between the sidebands in this second stage is large enough for an affordable BPF to separate the two sidebands.

The overall result of this two-stage procedure is the generation of an SSB signal at a carrier frequency $f_{c1} + f_{c2}$. To transmit the LSB, which occupies the frequency interval from $f_{c2} + f_{c1} - f_m$ to $f_{c2} + f_{c1} - f_1$, one selects the LSB in the first stage and the USB in the second stage. On the other hand, the USB, which occupies the frequency interval from $f_{c2} + f_{c1} + f_1$ to $f_{c2} + f_{c1} + f_m$, is transmitted by selecting the USB in both stages.

Figure 7.35 SSB generation by filtering method. [Block diagram: a product modulator with inputs message $v_m(t)$ and carrier ($f_c$) produces $v_{dsb}(t)$, which is passed through a BPF with passband $f_c - f_m \to f_c - f_1$ to give $v_{ssb}(t)$ (LSB), or through a BPF with passband $f_c + f_1 \to f_c + f_m$ to give $v_{ssb}(t)$ (USB).]
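The filter arithmetic quoted above is easy to reproduce. The Python sketch below uses hypothetical helper names (`transition_width_percent`, `second_stage_gap`) together with the voice-band and carrier values from the text. The second function assumes that the LSB is the sideband kept in the first stage.

```python
def transition_width_percent(f1_hz, fc_hz):
    """Sideband gap 2*f1 expressed as a percentage of the carrier frequency."""
    return 100.0 * 2 * f1_hz / fc_hz

def second_stage_gap(fc1_hz, fm_hz):
    """Gap between the sidebands produced by the second product modulator,
    assuming the first stage kept the LSB: 2*(fc1 - fm)."""
    return 2 * (fc1_hz - fm_hz)

F1_HZ, FM_HZ = 300, 3400                             # voice baseband limits from the text

width_at_3mhz = transition_width_percent(F1_HZ, 3_000_000)    # about 0.02 %
width_at_100khz = transition_width_percent(F1_HZ, 100_000)    # about 0.6 %
gap_stage_two = second_stage_gap(100_000, FM_HZ)              # in Hz
```

For a first carrier of 100 kHz the second-stage gap works out at 193.2 kHz, several hundred times the 600 Hz gap that the first-stage filter has to resolve, which is why the second filter is comparatively easy to realise.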
Figure 7.36 shows a block diagram of an SSB transmitter that uses the method just described to transmit the LSB. Follow the spectrum sketches in the diagram and you will observe how the LSB is transmitted at carrier frequency $f_{c2} + f_{c1}$ by selecting the LSB in the first DSB modulation, and the USB in the second. Most high-power SSB transmission in the HF band (3–30 MHz) employs the filter method.

Figure 7.36 SSB transmitter. [Block diagram: the message (spectrum from $f_1$ to $f_m$) enters a product modulator fed with the lower-frequency carrier $f_{c1}$ = 100 kHz, followed by a BPF and amplifier; the result enters a second product modulator fed with the higher-frequency carrier $f_{c2}$ (e.g. 10 MHz), followed by a BPF and power amplifier that delivers the high-power SSB signal, occupying $f_{c2} + f_{c1} - f_m$ to $f_{c2} + f_{c1} - f_1$, to the antenna.]

7.7.2.2.2 Phase Discrimination Method
A different scheme for SSB generation is based on phase discrimination. Two signals are summed whose phase relationship is such that one of the sidebands is eliminated. It is easier to demonstrate how this method works by using a sinusoidal message signal

$$v_m(t) = V_m \cos(2\pi f_m t) \tag{7.52}$$

With a carrier of frequency $f_c$, the SSB signal is given by

$$v_{ssb}(t) = \begin{cases} A \cos[2\pi(f_c - f_m)t], & \text{LSB} \\ A \cos[2\pi(f_c + f_m)t], & \text{USB} \end{cases} \tag{7.53}$$

where $A$ represents the amplitude of the SSB signal. Using the trigonometric identities B.3 and B.4 in Appendix B, the expressions for the SSB signals can be expanded as follows

$$v_{ssb}(t) = \begin{cases} A \cos(2\pi f_c t) \cos(2\pi f_m t) + A \sin(2\pi f_c t) \sin(2\pi f_m t), & \text{LSB} \\ A \cos(2\pi f_c t) \cos(2\pi f_m t) - A \sin(2\pi f_c t) \sin(2\pi f_m t), & \text{USB} \end{cases} \tag{7.54}$$

Equation (7.54) is very important. It shows that we may obtain a single sideband signal by adding together two double sideband signals. Consider first the expression for the LSB. The first term is the output of a product modulator that takes the carrier and the message signal as inputs. The second term is the output of another product modulator whose inputs are (i) the same carrier delayed in phase by 90° (to change it from cosine to sine) and (ii) the message signal delayed in phase by 90°. The sum of the outputs of these two product modulators yields the lower sideband SSB signal, whereas the difference yields the upper sideband SSB signal. Eq. (7.54) therefore suggests the block diagram shown in Figure 7.37 for the generation of SSB.

A transformation that changes the phase of every positive frequency component of a signal by −90° and the phase of every negative frequency component of the signal by +90° but does not alter the amplitude of any of these components is known as the Hilbert transform. Thus, the inputs to the second product modulator are the Hilbert transform of the carrier signal, and the Hilbert transform of the message signal. It is easy to obtain the Hilbert transform of the carrier signal, since a circuit that changes the phase of this single frequency by exactly −90° is readily available. However, an accurate hardware implementation of the Hilbert transform of the message signal is more difficult because of the wide range of frequency components that must each be shifted in phase by exactly 90°.
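Eq. (7.54) and the sensitivity to phase-shift accuracy can both be explored with a small simulation. The following Python sketch rests on several assumptions made here for illustration: a single-tone message, carrier and message frequencies of 1000 Hz and 100 Hz, a sampling rate of 8000 Hz over exactly one second, and hypothetical function names `hartley_modulator` and `tone_amplitude`. The amplitude of each side frequency is measured by correlating the output with a cosine and a sine at that frequency.

```python
import math

def tone_amplitude(x, f, fs):
    """Amplitude of the component at frequency f (Hz) in the samples x,
    obtained by correlating with a cosine and a sine at that frequency."""
    n = len(x)
    a = sum(v * math.cos(2 * math.pi * f * k / fs) for k, v in enumerate(x))
    b = sum(v * math.sin(2 * math.pi * f * k / fs) for k, v in enumerate(x))
    return 2 * math.hypot(a, b) / n

def hartley_modulator(fc, fm, fs, n, phase_error_deg=0.0, upper=True):
    """Sum (LSB) or difference (USB) of two DSB signals, as in Eq. (7.54).
    phase_error_deg is the departure of the message phase shift from -90 deg."""
    eps = math.radians(phase_error_deg)
    out = []
    for k in range(n):
        t = k / fs
        carrier = math.cos(2 * math.pi * fc * t)
        carrier_shifted = math.sin(2 * math.pi * fc * t)          # carrier delayed by 90 deg
        message = math.cos(2 * math.pi * fm * t)
        message_shifted = math.cos(2 * math.pi * fm * t - math.pi / 2 + eps)
        dsb_1 = carrier * message
        dsb_2 = carrier_shifted * message_shifted
        out.append(dsb_1 - dsb_2 if upper else dsb_1 + dsb_2)
    return out

FS, N = 8000, 8000            # assumed sampling rate (Hz) and sample count (one second)
FC, FM = 1000, 100            # assumed carrier and message frequencies (Hz)

ideal = hartley_modulator(FC, FM, FS, N)
usb_ideal = tone_amplitude(ideal, FC + FM, FS)
lsb_ideal = tone_amplitude(ideal, FC - FM, FS)

skewed = hartley_modulator(FC, FM, FS, N, phase_error_deg=5.0)
lsb_leak = tone_amplitude(skewed, FC - FM, FS)
```

With an exact 90° shift the output contains only the upper side frequency. For this single-tone model a 5° error in the message phase shift leaves an unwanted lower side frequency of relative amplitude $\sin(2.5°) \approx 0.044$, or roughly 27 dB below the wanted component.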
Figure 7.37 SSB generation by phase discrimination: Hartley modulator. USB output is shown. For an LSB output, change subtraction to addition in the summing device. [Block diagram: the message $v_m(t)$ and the carrier feed one product modulator directly; 90° phase-shifted versions of the message and the carrier feed a second product modulator; the second output is subtracted from the first in the summing device to give the SSB signal $v_{ssb}(t)$.]
The SSB generator based on phase discrimination implemented as shown in Figure 7.37 is known as the Hartley modulator. It has several advantages over the frequency discrimination or filtering technique of Figure 7.36.

● The SSB signal is generated directly at the required RF frequency without the need for an intermediate lower-frequency stage.
● Bulky and expensive BPFs have been eliminated.
● It is very easy to switch from a lower sideband to an upper sideband SSB output. The former is obtained by adding the outputs of the two product modulators, and the latter by subtraction of the two outputs.
The main disadvantage of the Hartley modulator is that it requires the Hilbert transform of the message signal. This transform changes the phase of each positive frequency component in the message signal by exactly −90°. If the wide-band phase shifting network shifts the phase of any frequency component in the message signal by an amount not equal to −90°, it causes a small amplitude of the unwanted side frequency of this component to appear at the output. Complete suppression of the unwanted sideband is achieved only with a phase difference of exactly −90° between corresponding frequency components at the inputs of the two product modulators. In practice, it is easier to achieve this by using two phase-shifting networks. One network shifts the message signal input to one modulator by $\phi_1$, and the other shifts the message signal input to the other modulator by $\phi_2$, with

$$\phi_1 - \phi_2 = 90° \tag{7.55}$$

7.7.2.3 SSB Demodulator
Just as for DSB reception, coherent demodulation is necessary to extract the original message signal from an incoming SSB signal. Figure 7.38 shows the block diagram of an SSB demodulator. To understand the operation of this circuit we assume for simplicity a sinusoidal message signal: Eq. (7.52). Then for USB transmission, the SSB signal at the input of the demodulator is

$$v_{ssb}(t) = V_{ssb} \cos[2\pi(f_c + f_m)t] \tag{7.56}$$
Figure 7.38 SSB demodulator. [Block diagram: the SSB signal $v_{ssb}(t)$ and the carrier ($f_c$) feed a product modulator, whose output $v_o(t)$ passes through an LPF to give the demodulated signal $v'_m(t)$.]
Let us assume that the locally generated carrier $v_c(t)$ has a phase error $\phi$ compared to the missing carrier in $v_{ssb}(t)$. The output of the product modulator is then given by

$$\begin{aligned} v_o(t) &= v_{ssb}(t) \times v_c(t) \\ &= V_{ssb} \cos[2\pi(f_c + f_m)t] \times V_c \cos(2\pi f_c t + \phi) \\ &= K \cos(2\pi f_m t - \phi) + K \cos[2\pi(2f_c + f_m)t + \phi] \end{aligned} \tag{7.57}$$

where we have set $K = 0.5 V_{ssb} V_c$. The last term in Eq. (7.57) is a high-frequency component at $2f_c + f_m$. This component is eliminated by passing $v_o(t)$ through an LPF. The output of this filter gives the demodulated signal

$$v'_m(t) = K \cos(2\pi f_m t - \phi) \tag{7.58}$$
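The result in Eq. (7.58) can be confirmed numerically. In the Python sketch below the function name `ssb_demodulate_tone` and all numerical values (1000 Hz carrier, 100 Hz message tone, 8000 Hz sampling rate over one second, 30° phase error) are assumptions made for illustration. Correlating the product modulator output with a cosine and a sine at $f_m$ over a whole number of cycles plays the role of the LPF, since it rejects the component at $2f_c + f_m$.

```python
import math

def ssb_demodulate_tone(fc, fm, phi, fs, n, vssb=1.0, vc=1.0):
    """Multiply a USB tone by a local carrier with phase error phi, then
    measure the amplitude and phase lag of the component recovered at fm."""
    a = b = 0.0
    for k in range(n):
        t = k / fs
        v_ssb = vssb * math.cos(2 * math.pi * (fc + fm) * t)   # Eq. (7.56)
        v_lo = vc * math.cos(2 * math.pi * fc * t + phi)       # local carrier
        v_o = v_ssb * v_lo                                     # product modulator output
        a += v_o * math.cos(2 * math.pi * fm * t)              # in-phase correlation
        b += v_o * math.sin(2 * math.pi * fm * t)              # quadrature correlation
    a, b = 2 * a / n, 2 * b / n
    return math.hypot(a, b), math.atan2(b, a)

PHI = math.pi / 6                                              # assumed 30 deg phase error
amplitude, phase_lag = ssb_demodulate_tone(1000, 100, PHI, 8000, 8000)
```

The recovered tone has amplitude $K = 0.5$ for unit input amplitudes and lags the original message by the full phase error $\phi$, in agreement with Eq. (7.58).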
Comparing this to the original message signal in Eq. (7.52), we see that, apart from a constant gain factor, which is not a distortion, the demodulated signal differs from the original message signal by a phase distortion equal to the phase error in the LO. The receiver changes the phase of each frequency component in the original message signal by a constant amount $\phi$. From our discussion in Section 4.7, this causes a phase distortion, which is unacceptable in data and video transmission as well as in music, where some harmonic relationships could be destroyed by a small shift in the demodulated frequency components from their original frequency values. However, it may be tolerated in speech transmission because the human ear is relatively insensitive to phase distortion.

Older systems minimised the phase error by inserting a low-level pilot carrier into the transmitted SSB signal, which is then used to periodically lock the LO at the receiver. Modern systems use a crystal-controlled oscillator along with a frequency synthesiser to generate a local carrier with good frequency stability, e.g. one part in $10^6$. The need to generate a carrier of highly stable frequency is the main factor in the complexity and cost of SSB receivers.

7.7.2.4 Applications of SSB
SSB has numerous applications.

● It is employed for two-way radio communication to conserve spectrum and transmitted power requirements. Thus, more users can be accommodated in a given bandwidth, and battery life can be prolonged. SSB is used for marine and military communication, and is very popular with radio amateurs, allowing them to maximise signal range with a minimum of transmitted signal power.
● SSB is universally employed in the implementation of frequency division multiplexing. It allows twice as many independent channels to be packed into a given frequency band as can be done with, say, DSB.
● Although not usually identified as such, SSB is the technique of frequency up-conversion in numerous telecommunication systems. For example, in satellite communications, the message signal modulates a lower-frequency carrier signal using, for example, phase modulation. The phase-modulated carrier is then translated or up-converted to the required uplink frequency using what is essentially the filter method of SSB modulation. At the satellite, the signal is amplified (but not phase demodulated), translated to a different downlink frequency, and transmitted back towards the earth.
● It is important to point out that the ubiquitous nonlinear device, termed a mixer, is realised as the SSB demodulator shown in Figure 7.38. The mixer is found in all receivers based on the superheterodyne principle (see Figure 7.22) and employed for frequency down-conversion to translate a bandpass RF signal to a lower centre frequency.

Figure 7.39 ISB spectrum $V_{isb}(f)$ resulting from two independent input signals with spectra $V_{m1}(f)$ and $V_{m2}(f)$. [Each message spectrum, $|V_{m1}(f)|$ and $|V_{m2}(f)|$, extends from $f_1$ to $f_m$. In $|V_{isb}(f)|$, message 1 occupies the LSB from $f_c - f_m$ to $f_c - f_1$ and message 2 occupies the USB from $f_c + f_1$ to $f_c + f_m$.]
7.7.3 ISB
Independent sideband amplitude modulation, abbreviated ISB, and sometimes called twin sideband suppressed carrier amplitude modulation, is a type of SSB. Two different message signals are transmitted on the two sidebands of a single carrier, with one message carried in the lower sideband and the other carried in the upper sideband. Figure 7.39 shows the spectrum of an ISB signal that carries two arbitrary and independent message signals on a single carrier of frequency $f_c$. Each message signal contains a band of sinusoids (or spectrum) in the frequency range from $f_1$ to $f_m$. The spectrum of one message signal has been translated to form a lower sideband below the carrier frequency, whereas the spectrum of the other signal forms the upper sideband. There are therefore two sidebands around the carrier, but each corresponds to a different message signal.

7.7.3.1 ISB Modulator
Figure 7.40 shows the block diagram of an ISB modulator. It consists of two SSB modulators (filter method) fed with the same carrier signal of frequency $f_c$. One modulator generates a lower sideband SSB signal $v_{lsb}(t)$ of one message signal, whereas the other generates an upper sideband SSB signal $v_{usb}(t)$ of the other message signal. The sum of these two signals gives the ISB signal. A reduced level of the carrier (termed pilot carrier) may be inserted for use in coherent demodulation at the receiver.

7.7.3.2 ISB Demodulator
One way to demodulate an ISB signal is shown in block diagram form in Figure 7.41. The incoming ISB signal (centred on an RF carrier $f_c$) is translated down to a lower IF $f_{IF}$ of, say, 100 kHz, where affordable BPFs can be used to separately extract the two sidebands. The frequency translation is accomplished in a mixer that uses an oscillator frequency $f_{LO} = f_c + f_{IF}$ and generates the sum frequency $f_{LO} + f_c = 2f_c + f_{IF}$, and the difference frequency $f_{LO} - f_c = f_{IF}$.
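The frequency translation described above can be traced with a few lines of Python. The helper `mixer_outputs` and the 10 MHz carrier are assumptions made here for illustration; the 100 kHz IF and the 300 Hz lower band edge follow the values used in the text. The sketch also makes one consequence of placing the local oscillator above the carrier explicit: the difference output $f_{LO} - f$ falls as the input frequency $f$ rises, so a component just below $f_c$ emerges just above $f_{IF}$.

```python
def mixer_outputs(f_in, f_lo):
    """Return the (sum, difference) frequencies produced by an ideal mixer."""
    return f_lo + f_in, abs(f_lo - f_in)

FC = 10_000_000                                      # assumed RF carrier, 10 MHz
F_IF = 100_000                                       # intermediate frequency, 100 kHz
F1 = 300                                             # lowest message frequency, Hz
F_LO = FC + F_IF                                     # oscillator above the carrier

carrier_sum, carrier_diff = mixer_outputs(FC, F_LO)           # 2*fc + f_IF and f_IF
_, lower_edge_diff = mixer_outputs(FC - F1, F_LO)             # component at fc - f1
_, upper_edge_diff = mixer_outputs(FC + F1, F_LO)             # component at fc + f1
```

Here the carrier position maps to 100 kHz, the component at $f_c - f_1$ maps to 100.3 kHz, and the component at $f_c + f_1$ maps to 99.7 kHz, while the sum products near 20.1 MHz lie far outside the passbands of the IF filters.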
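Figure 7.40 ISB modulator. [Block diagram: message $v_{m1}(t)$ and the carrier feed a product modulator followed by an LSB BPF, giving $v_{lsb}(t)$; message $v_{m2}(t)$ and the same carrier feed a second product modulator followed by a USB BPF, giving $v_{usb}(t)$; the two outputs are summed to give the ISB signal $v_{isb}(t)$.]

Figure 7.41 ISB demodulator. [Block diagram: the incoming ISB signal centred on $f_c$ enters a mixer driven by a local oscillator at $f_{LO} = f_c + f_{IF}$, producing ISB spectra centred on $f_{IF}$ and on $2f_c + f_{IF}$; the mixer output feeds two BPFs with passbands $f_{IF} - f_m \to f_{IF} - f_1$ and $f_{IF} + f_1 \to f_{IF} + f_m$; each BPF output enters an SSB demodulator operating at $f_{IF}$, and the two demodulators deliver the baseband messages $v_{m1}(t)$ and $v_{m2}(t)$, each occupying $f_1$ to $f_m$.]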
The output of the mixer is fed into two BPFs with pass bands $f_{IF} - f_m \to f_{IF} - f_1$ and $f_{IF} + f_1 \to f_{IF} + f_m$, respectively. The first BPF passes the lower sideband of the frequency-translated ISB signal. This sideband contains one message signal. The other BPF passes the upper sideband, which contains the other message signal. Both filters reject the sum frequency output of the mixer. The output of each BPF is passed through an SSB demodulator, which extracts the original message signal. The operation of the SSB demodulator is discussed in Section 7.7.2.3 using Figure 7.38. The only point that needs to be added in this case is that two sinusoidal signals are used at the receiver, namely $f_{LO}$ (at the mixer) and $f_{IF}$ (at the SSB demodulator). Since $f_{LO} = f_c + f_{IF}$, it is important for the difference of these two frequencies, $f_{LO} - f_{IF}$, to be matched in frequency and phase with the missing carrier of the incoming ISB signal. A low-level pilot carrier $f_c$ may be inserted at the transmitter and used at the receiver to maintain this synchronisation.
Figure 7.42 Bandwidth required to transmit two voice channels of baseband frequency from $f_1$ = 300 Hz to $f_m$ = 3400 Hz using (a) ISB and (b) SSB at carrier spacing 4 kHz. [Frequency axes in kHz. (a) Bandwidth = 6.8 kHz: message 1 in the LSB from $f_c - 3.4$ to $f_c - 0.3$, message 2 in the USB from $f_c + 0.3$ to $f_c + 3.4$. (b) Bandwidth = 7.1 kHz: message 1 in the LSB of carrier 1 from $f_c - 3.4$ to $f_c - 0.3$, message 2 in the LSB of carrier 2, which lies at $f_c + 4$, starting at $f_c + 0.6$.]

7.7.3.3 ISB Merits, Demerit, and Application
ISB offers all the advantages of SSB. In addition, it allows closer packing of sidebands than is possible with SSB. A more efficient spectrum utilisation can therefore be achieved in transmitting multiple signals. As an example, let us consider analogue speech communication. The baseband frequency is in the range f 1 = 300 Hz to f m = 3400 Hz. The DSB bandwidth is therefore 6800 Hz, and with ISB two independent channels can be transmitted within a bandwidth of 6800 Hz, as shown in Figure 7.42a. The separation between the two sidebands is twice the minimum frequency component of the baseband speech signal, or 600 Hz. In SSB modulation, each channel is carried on a separate subcarrier, and two subcarriers are required to transmit two independent channels. These subcarriers must be allowed enough guard band to avoid mutual interference between the two channels when realisable filters of nonzero transition width are employed to separate them. A carrier spacing of 4 kHz (in FDM telephony) would be a realistic minimum. Thus, as shown in Figure 7.42b, the two channels would occupy a bandwidth of 7.1 kHz in SSB. This is larger than the bandwidth requirement of ISB by 300 Hz. A spectrum saving of 150 Hz/channel is therefore achieved with ISB. It is important to emphasise that this saving in bandwidth has resulted simply from the closer spacing of sidebands that is possible with ISB. It can be seen in Figure 7.42 that the sideband spacing is 600 Hz with ISB and 900 Hz with SSB. The modulated spectrum of each of the two signals remains the same (3.1 kHz) for both SSB and ISB. The main demerit of ISB is that its per-channel circuit requirement for transmission is about the same as that of SSB, but its receiver circuit is more extensive, and therefore more expensive. The ISB technique would only be considered in situations where two or more independent signals must be transmitted on a link of very limited bandwidth. 
ISB has only a few applications, mostly in military communication.
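The bandwidth comparison above can be reproduced with a few lines of arithmetic. The sketch below uses only the figures quoted in the text (300 Hz to 3400 Hz speech, 4 kHz carrier spacing) and assumes, for illustration, that both SSB channels are transmitted as upper sidebands; the variable names are illustrative only.

```python
# Bandwidth for two voice channels: ISB versus SSB (values quoted in the text).
f1 = 300.0        # lowest baseband frequency, Hz
fm = 3400.0       # highest baseband frequency, Hz
spacing = 4000.0  # SSB carrier spacing, Hz

# ISB: one carrier with one channel in each sideband, spanning fc - fm to fc + fm.
isb_bandwidth = 2 * fm
isb_gap = 2 * f1                      # gap between the two sidebands

# SSB (assumed upper sideband on each of two carriers `spacing` apart):
# channel 1 occupies fc + f1 ... fc + fm; channel 2 is the same band shifted by `spacing`.
ssb_bandwidth = (spacing + fm) - f1
ssb_gap = (spacing + f1) - fm         # gap between the adjacent sidebands

saving_per_channel = (ssb_bandwidth - isb_bandwidth) / 2

print(isb_bandwidth, isb_gap)   # 6800.0 600.0
print(ssb_bandwidth, ssb_gap)   # 7100.0 900.0
print(saving_per_channel)       # 150.0
```

The 300 Hz difference comes entirely from the narrower gap between sidebands (600 Hz against 900 Hz); each sideband is 3.1 kHz wide in both schemes.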
7.7.4
VSB
Vestigial sideband modulation, abbreviated VSB, was employed mainly in the now-obsolete analogue television for the transmission of the luminance signal. The baseband TV signal contains frequency components down to
DC. This made SSB impractical since there is no frequency separation between the upper and lower sidebands to allow the use of a realisable filter for separating them. At the same time, the bandwidth requirement of DSB was excessive. For example, the bandwidth of a luminance signal was 4.2 MHz in the NTSC (CCIR M) TV standard used in North America, and 5.5 MHz in the PAL standard used in Western Europe. Double sideband transmission would have required bandwidths of 8.4 MHz and 11 MHz, respectively, for these standards, well above the total RF bandwidth of 6 and 8 MHz, respectively, which were allocated in the two standards for transmitting one complete television channel (including audio and colour signals). VSB provides a compromise in which one almost complete sideband and a portion, or vestige, of the other sideband are transmitted. The bandwidth requirement of VSB is typically about 1.25 times the message signal bandwidth. This is larger than that of SSB, but a significant saving on DSB and AM requirements. Figure 7.43 shows the spectrum of a VSB signal. A representative rectangular message spectrum has been adopted in order to demonstrate the filtering of sidebands more clearly. The USB is retained in full except for a small portion that has been filtered off. A vestige of the LSB equal to an inverted mirror image of the missing USB portion is retained. The width of this vestigial LSB has been denoted Bv . It can be seen, measuring outward from the carrier frequency, that all components in the LSB from Bv to f m have been eliminated. Compare this with SSB, where all LSB components would be removed – not just a portion – and with DSB, where all LSB components would be retained. The bandwidth of a VSB signal is therefore (7.59)
$$\text{VSB bandwidth} = f_m + B_v$$
where $f_m$ is the bandwidth of the message signal and $B_v$ is the width of the vestigial sideband. Note that, although Figure 7.43 shows a linear slope clipping of the sidebands, all that is required is that the vestigial LSB be an inverted mirror image of the missing portion in the USB. This allows a lot of flexibility in the choice of a suitable filter to generate the VSB signal.
7.7.4.1
VSB Modulator
The discussion of VSB thus far suggests the simple modulator shown in Figure 7.44a. A product modulator generates a DSB signal, which is passed through a BPF whose frequency response reduces one of the sidebands to a vestige while passing the other sideband with only a small attenuation over a portion of its band. This filter is known as the VSB filter, and its normalised frequency response (with maximum gain equal to unity) is shown in Figure 7.44b. It is assumed in our discussion that it is the lower sideband that is reduced to a vestige. The frequency response of the VSB filter must be such that it allows the original message signal to be recovered at the receiver using coherent demodulation. A filter with the normalised frequency response $H_{vsb}(f)$ shown in Figure 7.44b satisfies this requirement. The response of this filter is asymmetric about the carrier frequency, in the sense that it has odd symmetry about the point $(f_c, 0.5)$, so that $H_{vsb}(f_c + \Delta f) + H_{vsb}(f_c - \Delta f) = 1$ for $|\Delta f| \le f_m$. However, the filter can have any arbitrary response outside the interval $f = f_c \pm f_m$, where the DSB signal is zero. These ‘don’t care’ regions are shown shaded in Figure 7.44b, and the filter response is continued within these regions in a completely arbitrary manner. The requirement of asymmetry about $f_c$ is satisfied by filters with a variety of response slopes. A linear-slope filter is shown in Figure 7.44b for simplicity, but this is not required. If the width of the vestigial sideband is $B_v$, the filter response must be zero in the frequency range $f_c - f_m \to f_c - B_v$ in order to eliminate this portion of the LSB. It follows from the asymmetry condition that the response must be unity in the interval $f_c + B_v \to f_c + f_m$ so that this portion of the USB is passed with no attenuation. Thus
$$H_{vsb}(f) = \begin{cases} 1, & f_c + B_v \le f \le f_c + f_m \\ 0, & f_c - f_m \le f \le f_c - B_v \\ 0.5, & f = f_c \\ \text{Asymmetric}, & f_c - B_v \le f \le f_c + B_v \\ \text{Arbitrary}, & \text{otherwise} \end{cases} \qquad (7.60)$$
Figure 7.43 VSB spectrum for a message signal of bandwidth $f_m$. $B_v$ is the width of the remaining portion (or vestige) of the LSB.
Figure 7.44 (a) VSB modulator; (b) normalised frequency response of VSB filter.
7.7.4.2
VSB Demodulator
Figure 7.45 Coherent demodulation of VSB.
Figure 7.46 (a) Double-sided spectrum of VSB signal; (b) output of LPF.
If the normalised frequency response of the VSB filter used in the modulator satisfies Eq. (7.60) then the message signal can be recovered from the VSB signal by coherent demodulation, as shown in Figure 7.45. The operation of a coherent demodulator has been discussed in previous sections and will not be repeated here, except to remind the reader that the sum frequency output of the product modulator is rejected by the LPF. We therefore concentrate on the difference frequency, which is the output of the LPF. Our concern here will be to demonstrate how the vestigial LSB compensates perfectly for the attenuated portion of the USB, resulting in the recovery of the original message spectrum. We need to use the double-sided VSB spectrum shown in Figure 7.46a. The difference frequency output is the result of the band of positive frequencies in the VSB signal being translated downwards by $f_c$, and the band of negative frequencies being translated upwards by $f_c$. You will recall from Chapter 4 that the amplitude spectrum
of a real signal has even symmetry. Thus, if a positive frequency component is shifted to the left (i.e. downwards in frequency) then the corresponding negative frequency component is shifted to the right (i.e. upwards in frequency) by the same amount. Figure 7.46b shows the double-sided spectrum of the LPF output. The negative band of frequencies −f c − f m → −f c + Bv in the VSB signal vvsb (t) has been translated to −f m → Bv , whereas the corresponding positive band f c − Bv → f c + f m has been translated to −Bv → f m . These bands overlap as shown and the resultant spectrum, which is the spectrum of the LPF output, is the double-sided equivalent of the original message spectrum shown earlier in Figure 7.43. The condition of asymmetry about f c is required so that the sum of the bands in the region of overlap equals a constant – normalised to unity. In commercial television receivers, where the cost of the receiver is an important design consideration, the use of envelope demodulation (as in AM) is preferred to coherent demodulation. To accomplish this the carrier signal is added to the VSB signal before transmission. A cheap diode demodulator (discussed in Section 7.5.1) can then be used to recover the message signal. However, the recovered signal is distorted in that frequency components (from DC to Bv ) that appear in both sidebands are doubled in amplitude compared to components in only one sideband (from Bv to f m ). To compensate for this the diode demodulator is preceded with a tuned amplifier, the gain of which varies asymmetrically about the carrier frequency. Figure 7.47a shows a basic block diagram of an envelope demodulator for a VSB signal, with the frequency response of the VSB compensation amplifier shown in Figure 7.47b.
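The overlap argument for coherent demodulation can be checked numerically. The sketch below builds a linear-slope filter that satisfies Eq. (7.60) and confirms that the two translated bands add to a constant across the message bandwidth. The function name `h_vsb`, the numerical values of `fc`, `fm` and `bv`, and the choice of zero gain in the 'don't care' regions are all assumptions made for illustration.

```python
def h_vsb(f, fc, fm, bv):
    """Linear-slope VSB filter satisfying Eq. (7.60), positive frequencies only.

    The 'don't care' regions outside fc - fm ... fc + fm are set to zero here,
    which is one arbitrary choice among many.
    """
    d = f - fc                       # offset from the carrier, Hz
    if d < -bv or d > fm:
        return 0.0                   # eliminated LSB portion, or outside the band
    if d >= bv:
        return 1.0                   # USB portion passed with no attenuation
    return 0.5 + d / (2.0 * bv)      # slope region: 0 at fc - bv, 0.5 at fc, 1 at fc + bv


fc, fm, bv = 100_000.0, 4_000.0, 1_000.0     # hypothetical values, Hz
vsb_bandwidth = fm + bv                      # Eq. (7.59)

# After coherent demodulation the component at baseband frequency x is the sum
# of the filter gains at fc + x (from the USB) and fc - x (from the LSB vestige).
offsets = [float(x) for x in range(0, 4001, 250)]
recovered = [h_vsb(fc + x, fc, fm, bv) + h_vsb(fc - x, fc, fm, bv) for x in offsets]
flat = all(abs(g - 1.0) < 1e-12 for g in recovered)

print(vsb_bandwidth)   # 5000.0
print(flat)            # True
```

With these values the VSB bandwidth is 1.25 times the message bandwidth. Any other slope shape would serve equally well, provided the gains at $f_c + x$ and $f_c - x$ still add to unity.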
Figure 7.47 (a) Envelope demodulation of VSB; (b) gain response of VSB compensation tuned amplifier.
7.8 Summary
In this chapter, we have studied amplitude modulation and its four variants in detail. The basic scheme, DSB-TC-AM, is referred to simply as AM. The amplitude of a sinusoidal carrier signal is varied proportionately with the message signal between a nonnegative minimum value and a maximum value. The result is that the AM signal consists of the carrier signal and two sidebands. The carrier signal does not carry any information, but its presence allows a simple envelope demodulation scheme to be employed at the receiver to recover the message signal. The main advantage of AM is the simplicity of the circuits required for transmission and reception. AM can be generated using a nonlinear device followed by a suitable filter. It can be demodulated using an envelope demodulation circuit that consists of a diode, capacitor, and load resistor. However, AM is very wasteful of power. At least two-thirds of the transmitted power is in the carrier. To put, say, 10 kW of power in the information-bearing sidebands, one must generate a total power of at least 30 kW. AM is therefore employed mainly in sound broadcasting, where it is advantageous to have numerous cheap receivers and one expensive high-power transmitter.
To save power the AM carrier may be suppressed, leading to the first variant of AM known as double sideband suppressed carrier amplitude modulation, abbreviated simply as DSB. There is, however, a penalty. The simple envelope demodulator can no longer be used at the receiver because, with the carrier missing, the envelope of the DSB signal no longer corresponds with the message signal. Coherent demodulation is required, which introduces increased circuit complexity. The Costas loop was discussed, which allows the locally generated carrier to be synchronised in phase and frequency with the missing carrier in the incoming DSB signal.
Another way of achieving synchronisation is to insert a low-level pilot carrier at the transmitter, which is extracted at the receiver and scaled in frequency to generate the required carrier. DSB is used in those radio communication applications involving low-bandwidth message signals that must be transmitted over a long range with a minimum of power. It was noted that AM and DSB both transmit two sidebands, one a mirror image of the other, which carry identically the same information. The bandwidth requirement can be halved, with a further reduction in the required transmitted power level, if both the carrier and one sideband are removed. This leads to what is known as single sideband suppressed carrier amplitude modulation, abbreviated simply as SSB. Telecommunication applications
that favour SSB are those with significant bandwidth constraints, limited transmit power capability, and comparable numbers of transmitters and receivers. Independent sideband modulation (ISB) places two message signals onto one carrier, with one message in each of the sidebands. We showed that it makes extra spectrum saving by allowing the two sidebands to be more closely packed than is possible with SSB. However, it requires more extensive circuitry than SSB, and is therefore less commonly used. When a message signal has a combination of a large bandwidth and significant frequency components near DC then neither DSB nor SSB is suitable. DSB would involve excessive bandwidth, and SSB would require impractical filters to separate the sidebands. A compromise technique, called vestigial sideband amplitude modulation (VSB), sends one nearly complete sideband plus a vestige of the other sideband. This is achieved at the transmitter by filtering the DSB signal with a filter whose response is asymmetric about the carrier frequency. The original message signal can be recovered at the receiver by straightforward coherent demodulation. A cheaper envelope demodulator is used in analogue TV receivers. To do this the carrier must be inserted into the VSB signal at the transmitter, and the incoming VSB signal must first be passed through a tuned amplifier that compensates for those frequency components appearing in only one sideband. This completes our study of amplitude modulation applied to analogue message signals. We shall briefly return to it in Chapter 11, when we apply it to digital signals. In the next chapter, we take another important step in our study of communication systems by examining in some detail the techniques of frequency modulation and phase modulation (jointly referred to as angle modulation) as applied to analogue message signals.
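The power figures quoted in this summary follow from the standard result for AM with a sinusoidal message, in which the two sidebands together carry a fraction $m^2/(2 + m^2)$ of the total power. The short sketch below evaluates it; the function names are illustrative, and a single-tone message is assumed.

```python
def sideband_fraction(m):
    """Fraction of total AM power in the two sidebands (sinusoidal message, factor m)."""
    return m ** 2 / (2.0 + m ** 2)


def total_power_for_sidebands(p_sidebands, m):
    """Total transmitted power needed to place p_sidebands watts in the sidebands."""
    return p_sidebands / sideband_fraction(m)


best_case = sideband_fraction(1.0)                   # 100% modulation: one third
total_kw = total_power_for_sidebands(10e3, 1.0) / 1e3

print(round(best_case, 4))   # 0.3333
print(round(total_kw, 1))    # 30.0
```

Any modulation factor below unity makes matters worse: at $m = 0.5$, for example, the sidebands carry only one-ninth of the total power.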
Questions
7.1
A carrier signal $v_c(t) = 5\sin(2\pi \times 10^6 t)$ V is modulated in amplitude by the message signal $v_m(t)$ shown in Figure Q7.1. (a) Sketch the waveform of the AM signal that is produced. (b) What is the modulation factor? (c) Determine the modulation sensitivity that would give 100% modulation.
Figure Q7.1 Question 7.1: message signal $v_m(t)$ in volts against $t$ in μs (waveform not reproduced).
7.2
An oscilloscope display of an AM waveform involving a sinusoidal message signal shows a maximum peak-to-peak value of 10 V and a minimum peak-to-peak value of 2 V. Calculate (a) Carrier signal amplitude (b) Message signal amplitude (c) Modulation index.
7.3
An AM signal is analysed by a spectrum analyser, which shows that there are the following frequency components: ● 998 kHz of amplitude 10 V ● 1 MHz of amplitude 50 V ● 1.002 MHz of amplitude 10 V. (a) Specify the carrier and message signals as functions of time, assuming that the initial phase of each is zero. (b) Determine the depth of modulation. (c) What power would be dissipated by this AM signal in a 50 Ω load?
7.4
An engineering student wishing to improve AM power utilisation generates an AM signal in which the carrier and each sideband have equal power. By calculating the modulation factor, determine whether a conventional (noncoherent) AM receiver would be able to demodulate this AM signal.
7.5
A carrier of amplitude $V_c$ is modulated by a multitone message signal, which is made up of sinusoids of amplitudes $V_1, V_2, V_3, \ldots$ Starting from Eq. (7.25), show that the modulation factor is given by $m = \sqrt{m_1^2 + m_2^2 + m_3^2 + \cdots}$ where $m_1 = V_1/V_c$, $m_2 = V_2/V_c$, $m_3 = V_3/V_c$, …
7.6
The carrier $v_c(t) = 120\sin(4 \times 10^6 \pi t)$ V is amplitude modulated by the message signal $v_m(t) = 80\sin(40 \times 10^3 \pi t) + 50\sin(80 \times 10^3 \pi t) + 30\sin(100 \times 10^3 \pi t)$ V. (a) Sketch the AM signal spectrum. (b) Determine the fraction of power in the sidebands. (c) Determine the modulation index.
7.7
A 75% modulated AM signal has 1 kW of power in the lower sideband. The carrier component is attenuated by 4 dB before transmission, but the sideband components are unchanged. Calculate: (a) The total transmitted power (b) The new modulation index.
7.8
A 1.2 MHz carrier of amplitude 10 V is amplitude modulated by a message signal containing two frequency components at 1 and 3 kHz, each having amplitude 5 V. Determine (a) The bandwidth of the message signal. (b) The width of each of the sidebands of the AM signal. (c) The condition under which the bandwidth of the AM signal is twice the width of each sideband. (d) The modulation index. (e) The total power in the AM signal.
7.9
The output voltage $v_o$ and input voltage $v_i$ of a nonlinear device are related by $v_o = v_i + 0.02 v_i^2$. A series connection of a carrier signal source of amplitude 20 V and frequency 100 kHz, and a message signal source of amplitude 10 V and frequency 5 kHz provides the input to this device. (a) Sketch the amplitude spectrum of the output signal $v_o$. (b) Specify the frequency response of a filter required to extract the AM signal component of $v_o$ without distortion. (c) Determine the modulation index of the AM signal.
7.10
The envelope demodulator in an AM superheterodyne receiver consists of the diode demodulator circuit shown in Figure 7.17a. Determine suitable values of load resistance $R_L$ and capacitance $C$, assuming a forward-bias diode resistance $R_f$ = 20 Ω, and IF amplifier output impedance $R_s$ = 50 Ω. Note that carrier frequency is $f_{IF}$ = 470 kHz and the message signal is audio of frequencies 50 Hz to 5 kHz.
7.11
A square-law demodulator is shown in Figure Q7.11, where the input voltage $v_{am}(t)$ is the incoming AM signal. For small input voltage levels, the diode current is a nonlinear function of $v_{am}(t)$ so that the output voltage $v_o(t)$ is given approximately by $v_o = \alpha_1 v_{am} + \alpha_2 v_{am}^2$ (a) Evaluate the output voltage $v_o(t)$ and show that it contains the original message signal. (b) Discuss how the message signal may be extracted from $v_o(t)$ and the conditions that must be satisfied to minimise distortion.
Figure Q7.11 Question 7.11: diode circuit with input $v_{am}(t)$, diode current $i_D$, load resistance $R_L$ and output $v_o(t)$ (circuit not reproduced).
7.12
(a) Show that the block diagram in Figure Q7.12a generates an AM waveform $v_{am}(t)$. This connection consists of a linear adder, a unity gain multiplier, and two signal generators (Sig. Gen.), one generating a sinusoidal signal of amplitude $A_1$ at frequency $f_m$ and the other generating a sinusoidal carrier signal of amplitude $A_2$ and frequency $f_c \gg f_m$. (b) Determine the amplitude and frequency settings on the signal generators in (a) required to produce the AM waveform sketched in Figure Q7.12b. (c) Sketch the amplitude spectrum of the AM signal of Figure Q7.12b. (d) Determine the rms value of the AM signal of Figure Q7.12b.
Figure Q7.12 Question 7.12: (a) block diagram comprising signal generators ($f_m$, $A_1$) and ($f_c$, $A_2$), a linear adder and a unity gain multiplier, with signals $v_m(t)$, $v_c(t)$ and $v_{am}(t)$; (b) AM waveform carrying the dimension labels 1 μs, 20 μs, 8 V and 72 V (diagram and waveform not reproduced).
7.13
Quadrature amplitude modulation (QAM) is used to transmit two DSB signals within the same bandwidth that would be occupied by one DSB signal. One message signal $v_{m1}(t)$ is carried on a carrier of frequency $f_c$, whereas the other (independent) message signal $v_{m2}(t)$ is carried on a carrier of the same frequency but with a 90° phase difference. The QAM signal is therefore given by $v_{qam}(t) = v_{m1}(t) V_c \cos(2\pi f_c t) + v_{m2}(t) V_c \sin(2\pi f_c t)$ (a) Draw the block diagram of a QAM modulator that generates $v_{qam}(t)$. (b) Show that the message signal $v_{m1}(t)$ can be extracted at the receiver by passing $v_{qam}(t)$ through a coherent demodulator that uses an oscillator generating a synchronised carrier $V_c' \cos(2\pi f_c t)$, and that another coherent demodulator operating with carrier signal $V_c' \sin(2\pi f_c t)$ extracts $v_{m2}(t)$. (c) Draw the block diagram of a QAM demodulator.
7.14
Let a sinusoidal message signal of frequency $f_m$ be transmitted on a carrier of frequency $f_c$. Assume that, due to selective fading, the lower side frequency reaches the receiver with a phase $\phi_1$ relative to the carrier, whereas the upper side frequency has a phase $\phi_2$ relative to the carrier. (a) Obtain an expression for the coherently demodulated signal, given that the transmission is DSB and the local carrier is perfectly synchronised with the missing carrier in the incoming signal. Discuss the distortion effects of $\phi_1$ and $\phi_2$ and specify the condition under which there is complete cancellation so that the demodulated signal is zero. (b) Sketch the AM waveform for $\phi_1$ = 60° and $\phi_2$ = 0°. How does the selective fading affect the output of an envelope demodulator? (c) Use a sketch of the AM waveform to examine the effect on the AM envelope when selective fading attenuates the carrier signal by 3 dB more than the attenuation of the side frequencies. Assume that the side frequencies remain in phase with the carrier.
8 Frequency and Phase Modulation
A journey across the Atlantic is impossible if you swim, dangerous if you canoe, long if you sail, but safe and fun if you fly.
In this Chapter
✓ Basic concepts of frequency modulation (FM) and phase modulation (PM): a simplified and lucid approach to the theory of angle modulation.
✓ FM and PM waveforms: a detailed discussion of all types of angle modulated waveforms. You will be able to sketch these waveforms and to identify them by visual inspection.
✓ Spectrum and power of FM and PM: you will be able to solve various problems on the spectrum, bandwidth, and power of FM and PM signals. Narrowband FM and PM are discussed with comparisons to amplitude modulation (AM).
✓ FM and PM modulators: a detailed discussion of various methods of applying the theory of previous sections to generate FM and PM signals.
✓ FM and PM demodulators: a discussion of how to track the frequency variations in received angle modulated signals and to convert these frequency variations to voltage variations.
✓ FM transmitters and receivers: a discussion of the building blocks and signal processing in a complete FM communication system, including the tasks of pre-emphasis and de-emphasis. A simplified treatment of the trade-off between transmission bandwidth and signal-to-noise power ratio (SNR) is also presented.
✓ Noise effect: a phasor approach in the analysis of noise effect in FM receivers.
✓ Features overview: merits, demerits, and applications of angle modulation.
8.1 Introduction Modulation plays very significant roles in telecommunication, as briefly discussed in Chapter 1. AM, treated in detail in the previous chapter, is the oldest modulation technique and is obtained by varying the amplitude of a high-frequency sinusoidal carrier in sync with the message signal. One of the problems with AM is that additive noise in the transmission medium will also impose variations on the amplitude of the modulated carrier in a manner that is impossible for the receiver to separate from the legitimate variations caused by the message signal. The received message signal will therefore be corrupted to some extent by noise.
Communication Engineering Principles, Second Edition. Ifiok Otung. © 2021 John Wiley & Sons Ltd. Published 2021 by John Wiley & Sons Ltd. Companion Website: www.wiley.com/go/otung
An alternative modulation technique that is less susceptible to additive noise was first proposed in 1931. In this technique, which is given the generic name angle modulation, the amplitude V c of the sinusoidal carrier is kept constant, but the angle 𝜃 c of the carrier is varied in sync with the message signal. The angle 𝜃 c of a sinusoidal carrier at any instant of time t depends on the frequency f c and initial phase 𝜙c of the carrier, according to the relation 𝜃c (t) = 2𝜋fc t + 𝜙c
(8.1)
Thus, the angle of the carrier may be varied in one of two ways: (i) By varying the frequency f c of the carrier, giving rise to what is known as frequency modulation; or (ii) By varying the initial phase 𝜙c of the carrier, resulting in phase modulation. Some textbooks refer to 𝜙c as the phase shift. This chapter is devoted to the study of the related techniques of FM and PM. First, we introduce FM and PM using a simple staircase signal, and define common terms such as frequency sensitivity, phase sensitivity, frequency deviation, and percent modulation. Next, we study the angle modulation of a carrier by a sinusoidal message signal. It is shown that FM and PM are implicitly related in that FM transmission of a message signal is equivalent to the PM transmission of a suitably lowpass filtered version of the message signal. Narrowband and wideband angle modulations are discussed in detail, including two of the most common definitions of the bandwidth of FM and PM. The modulator and demodulator circuits that implement FM and PM schemes are discussed in block diagram format. The presentation also includes FM transmitter and receiver systems, and concludes with a discussion of the applications of angle modulation and its merits and demerits compared to AM.
8.2 Basic Concepts of FM and PM
Angle modulation (FM or PM) represents a message signal using nonlinear variations in the angle of a sinusoidal carrier signal. The amplitude of the carrier signal is not varied, and therefore any noise-imposed fluctuations in carrier amplitude are simply ignored at the receiver. This gives FM and PM a greater robustness against noise and interference than is possible with AM. The information contained in the variations of the message signal is perfectly preserved in the variations of the carrier angle, and this information is recovered at the receiver by tracking the carrier angle. As with AM, there is a great deal of freedom in the choice of the frequency of the unmodulated carrier signal, allowing information transmission through a variety of transmission media and communication systems. The angle $\theta_c$ of an unmodulated sinusoidal carrier of frequency $f_c$ is given by Eq. (8.1), which shows that $\theta_c$ increases linearly with time at the constant rate $2\pi f_c$ rad/s. The amplitude of this carrier remains constant at $V_c$. Figure 8.1 is a plot of both parameters (the angle and amplitude) of an unmodulated carrier as functions of time. We saw in the previous chapter that the effect of AM is to cause the amplitude to deviate from the constant value $V_c$ in accordance with the instantaneous value of the message signal. The angle of an amplitude-modulated carrier is not varied by the message signal, so it continues to be a linear function of time. What happens in the case of angle modulation is that the carrier amplitude remains constant with time, but the carrier angle is caused to deviate from being a linear function of time by variations of the message signal. The carrier angle has two components, namely frequency $f_c$ and initial phase $\phi_c$, and the manner of the deviation depends on which of these two components is varied by the message signal.
Figure 8.1 Amplitude and angle of an unmodulated carrier as functions of time.
As an example, Figure 8.2a shows a simple staircase message signal that has the following values
$$v_m(t) = \begin{cases} 0\ \text{V}, & 0 \le t < 1\ \text{ms} \\ 2\ \text{V}, & 1 \le t < 2\ \text{ms} \\ 1\ \text{V}, & 2 \le t < 3\ \text{ms} \\ -1\ \text{V}, & 3 \le t < 4\ \text{ms} \\ -2\ \text{V}, & 4 \le t < 5\ \text{ms} \\ 0\ \text{V}, & 5 \le t \le 6\ \text{ms} \end{cases} \qquad (8.2)$$
Let this signal modulate the angle of the carrier vc (t) = Vc cos[𝜃c (t)] = Vc cos(2𝜋fc t + 𝜙c )
(8.3)
In the following discussion we use the values amplitude V c = 2 V, frequency f c = 3 kHz, and initial phase 𝜙c = −𝜋/2 rad to demonstrate the angle modulation process. Of course, any other set of values could have been selected. The modulation can be performed by varying either the frequency or the initial phase of the carrier, as discussed below.
8.2.1 Frequency Modulation Concepts In FM, the frequency of the carrier at any given time is changed from its unmodulated value f c by an amount that depends on the value of the message signal at that time. The instantaneous frequency f i of the carrier is given by fi = fc + kf vm (t)
(8.4)
Figure 8.2 (a) Staircase message signal; (b) angle of sinusoidal carrier. (Frequency sensitivity $k_f$ = 1 kHz/V; phase sensitivity $k_p$ = π/2 rad/V.)
where $k_f$ is the change in carrier frequency per volt of modulating signal, called frequency sensitivity and expressed in units of Hz/V. In our example, with $k_f$ = 1 kHz/V and the message and carrier signals specified in Eqs. (8.2) and (8.3), the instantaneous frequency of the frequency modulated carrier is as follows
$$f_i = f_c + k_f v_m(t) = \begin{cases} 3\ \text{kHz}, & 0 \le t < 1\ \text{ms} \\ 5\ \text{kHz}, & 1 \le t < 2\ \text{ms} \\ 4\ \text{kHz}, & 2 \le t < 3\ \text{ms} \\ 2\ \text{kHz}, & 3 \le t < 4\ \text{ms} \\ 1\ \text{kHz}, & 4 \le t < 5\ \text{ms} \\ 3\ \text{kHz}, & 5 \le t \le 6\ \text{ms} \end{cases} \qquad (8.5)$$
In the interval 0 ≤ t < 1 ms the message signal is 0 V, so the carrier frequency is unchanged from its unmodulated value f c = 3 kHz. However, in the interval 1 ≤ t < 2 ms the message signal is 2 V, and the carrier frequency is changed from f c by an amount kf × vm (t) = (1 kHz/V) × 2 V = 2 kHz. The instantaneous carrier frequency during this interval is therefore fi = 5 kHz. The values of fi in other time intervals are similarly determined. Now consider the effect of this variable carrier frequency on the angle 𝜃 c (t) of the carrier. The carrier angle 𝜃 c starts at the initial value 𝜙c (= −𝜋/2 in this example). During the first 1 ms interval when the modulating signal vm (t) = 0 V, f i = f c (= 3 kHz in this example), and 𝜃 c increases at the rate of 2𝜋f i rad/s = 6𝜋 rad/ms, reaching the value 𝜃 c = 11𝜋/2 at t = 1 ms. At this time instant vm (t) increases to 2 V, so the instantaneous carrier frequency
is increased to f i = 5 kHz, as already shown, and the carrier angle then increases at a faster rate 2𝜋f i rad/s or 10𝜋 rad/ms, reaching the value 𝜃 c = 31𝜋/2 at t = 2 ms. During the third 1 ms interval the instantaneous frequency f i = 4 kHz, and 𝜃 c increases from 31𝜋/2 at the rate 8𝜋 rad/ms to reach 𝜃 c = 47𝜋/2 at t = 3 ms. Following the above reasoning, the values of carrier angle at all time instants are obtained. This is plotted in Figure 8.2b. The carrier angle is constantly increasing and the instantaneous value of the carrier changes cyclically (i.e. oscillates), taking on the same value at carrier angles that differ by integer multiples of 2𝜋. However, unlike the unmodulated carrier, the rate of increase of carrier angle (≡ 2𝜋f i rad/s) is not constant but changes in sync with the message signal. The carrier oscillates faster (than f c ) for positive message values, and more slowly for negative message values. If, as in this example, the message signal has zero average value (i.e. equal amounts of negative and positive values) then the carrier oscillation (or increase in angle) is slowed down by as much as it is speeded up at various times. The final value of 𝜃 c is then the same as that of the unmodulated carrier. However, the modulated carrier has increased from its initial angle value (𝜙c ) to its final angle value using a variable rate of increase, which is what conveys information. On the other hand, the unmodulated carrier maintains a constant rate of increase of angle in going between the same endpoints and therefore conveys no information. Let us now introduce several important terms that are associated with FM. We show later that these terms are also applicable to PM. It is obvious from Eq. (8.4) that the instantaneous frequency f i increases as the modulating signal vm (t) increases. 
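Before moving on to these terms, the staircase example above can be checked numerically. The following sketch accumulates the carrier angle interval by interval using the values already quoted ($f_c$ = 3 kHz, $k_f$ = 1 kHz/V, $\phi_c$ = −π/2). Angles are held in units of π as exact fractions; the variable names are illustrative only.

```python
from fractions import Fraction

fc = 3                       # unmodulated carrier frequency, kHz
kf = 1                       # frequency sensitivity, kHz/V
vm = [0, 2, 1, -1, -2, 0]    # message value in each 1 ms interval, V (Eq. 8.2)

# Angles are stored in units of pi rad, starting from the initial phase -pi/2.
angle_fm = [Fraction(-1, 2)]
angle_unmod = [Fraction(-1, 2)]
for v in vm:
    fi = fc + kf * v                              # instantaneous frequency, kHz (Eq. 8.4)
    # A rate of 2*pi*fi rad/ms sustained for 1 ms adds 2*fi (in units of pi).
    angle_fm.append(angle_fm[-1] + 2 * fi)
    angle_unmod.append(angle_unmod[-1] + 2 * fc)

print([str(a) for a in angle_fm])
# ['-1/2', '11/2', '31/2', '47/2', '55/2', '59/2', '71/2']
print([str(a) for a in angle_unmod])
# ['-1/2', '11/2', '23/2', '35/2', '47/2', '59/2', '71/2']
```

The two lists start and finish at the same values (−π/2 and 71π/2) but differ in between, which is the point made above: a zero-mean message leaves the end angle unchanged and conveys its information through the varying rate of increase.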
The minimum instantaneous frequency $f_{i\min}$ occurs at the instant that $v_m(t)$ has its minimum value $V_{\min}$, and the maximum instantaneous frequency $f_{i\max}$ occurs when $v_m(t)$ has its maximum value $V_{\max}$
$$f_{i\min} = f_c + k_f V_{\min}; \qquad f_{i\max} = f_c + k_f V_{\max}$$
(8.6)
The difference between $f_{i\max}$ and $f_{i\min}$ gives the range within which the carrier frequency is varied, and is known as the frequency swing $f_{p-p}$
$$f_{p-p} = f_{i\max} - f_{i\min} = k_f (V_{\max} - V_{\min}) = k_f V_{mp-p}$$
(8.7)
where $V_{mp-p}$ is the peak-to-peak value of the modulating signal $v_m(t)$. The maximum amount by which the carrier frequency deviates from its unmodulated value $f_c$ is known as the frequency deviation $f_d$. It depends on the frequency sensitivity $k_f$ of the modulator, and the maximum absolute value $|v_m(t)|_{\max}$ of the modulating signal. Thus
$$f_d = k_f |v_m(t)|_{\max}$$
(8.8)
A maximum allowed frequency deviation is usually set by the relevant telecommunication regulatory body in order to limit bandwidth utilisation. This maximum allowed frequency deviation is called the rated system deviation $F_D$
$$F_D = \begin{cases} 75\ \text{kHz}, & \text{FM radio (88–108 MHz band)} \\ 25\ \text{kHz}, & \text{TV sound broadcast} \\ 5\ \text{kHz}, & \text{2-way mobile radio (25 kHz bandwidth)} \\ 2.5\ \text{kHz}, & \text{2-way mobile radio (12.5 kHz bandwidth)} \end{cases}$$
(8.9)
The ratio (expressed as a percentage) of actual frequency deviation $f_d$ in an FM implementation to the maximum allowed deviation is known as the percent modulation $m$
$$m = \frac{f_d}{F_D} \times 100\%$$
(8.10)
Note that this concept of percent modulation is different from that associated with AM, where there is no regulatory upper limit on the deviation of the carrier amplitude from its unmodulated value $V_c$. The ratio of frequency deviation $f_d$ of the carrier to the frequency $f_m$ of the modulating signal is known as the modulation index $\beta$:
$$\beta = \frac{f_d}{f_m} \tag{8.11}$$
The modulation index has a single value only in those impractical situations where the modulating signal is a sinusoid, which contains a single frequency $f_m$. For the practical case of an information-bearing message signal, a band of modulating frequencies from $f_1$ to $f_m$ is involved. The modulation index then ranges in value from a minimum $\beta_{\min}$ to a maximum $\beta_{\max}$, where
$$\beta_{\min} = \frac{f_d}{f_m}; \qquad \beta_{\max} = \frac{f_d}{f_1} \tag{8.12}$$
In this case, the more useful design parameter is $\beta_{\min}$, which is also called the deviation ratio $D$. It is the ratio between frequency deviation and the maximum frequency component of the message signal. Deviation ratio replaces modulation index when dealing with nonsinusoidal message signals. It corresponds to a worst-case situation where the maximum frequency component $f_m$ of the message signal has the largest amplitude and therefore causes the maximum deviation of the carrier from its unmodulated frequency $f_c$. Thus, if an arbitrary message signal of maximum frequency component $f_m$ modulates a carrier and causes a frequency deviation $f_d$, then
$$D = \frac{f_d}{f_m} \tag{8.13}$$
Worked Example 8.1
A message signal $v_m(t) = 2\sin(30 \times 10^3 \pi t)$ volt is used to frequency modulate the carrier $v_c(t) = 10\sin(200 \times 10^6 \pi t)$ volt. The frequency sensitivity of the modulating circuit is $k_f = 25$ kHz/V. Determine
(a) The frequency swing of the carrier.
(b) The modulation index.
(c) The percent modulation.
(a) The message signal has amplitude $V_m = 2$ V and peak-to-peak value $V_{mp-p} = 2V_m = 4$ V. From Eq. (8.7), the frequency swing is given by
$$f_{p-p} = k_f V_{mp-p} = (25\ \text{kHz/V}) \times 4\ \text{V} = 100\ \text{kHz}$$
(b) The maximum absolute value $|v_m(t)|_{\max}$ of this (sinusoidal) message signal is its amplitude $V_m = 2$ V. Multiplying this by the frequency sensitivity gives the frequency deviation $f_d = 50$ kHz. From the expression for $v_m(t)$, the modulating signal frequency is $f_m = 15$ kHz. Thus, the modulation index is
$$\beta = \frac{f_d}{f_m} = \frac{50}{15} = 3.33$$
(c) Since the carrier frequency is 100 MHz, we may safely assume that the application is FM radio with a rated system deviation of 75 kHz. The percentage modulation is then
$$m = \frac{f_d}{F_D} \times 100\% = \frac{50}{75} \times 100\% = 66.7\%$$
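The arithmetic of Worked Example 8.1 can be reproduced with a few lines of code. The following is a minimal Python sketch rather than anything taken from the text: the function name `fm_parameters` and its argument names are illustrative assumptions, and the 75 kHz rated deviation is the FM radio value assumed in part (c).

```python
def fm_parameters(kf, v_min, v_max, f_m, rated_deviation):
    """Return the FM quantities of Eqs. (8.7), (8.8), (8.11), and (8.10).

    kf is in Hz/V, voltages are in volts, and frequencies are in Hz.
    (Function and argument names are illustrative, not from the text.)
    """
    swing = kf * (v_max - v_min)                     # Eq. (8.7): frequency swing
    deviation = kf * max(abs(v_min), abs(v_max))     # Eq. (8.8): frequency deviation
    beta = deviation / f_m                           # Eq. (8.11): modulation index
    percent = 100.0 * deviation / rated_deviation    # Eq. (8.10): percent modulation
    return swing, deviation, beta, percent


# Worked Example 8.1: 2 V sinusoid at 15 kHz, kf = 25 kHz/V, FM radio (75 kHz)
swing, deviation, beta, percent = fm_parameters(25e3, -2.0, 2.0, 15e3, 75e3)
print(f"swing = {swing / 1e3:.0f} kHz, deviation = {deviation / 1e3:.0f} kHz")
print(f"beta = {beta:.2f}, percent modulation = {percent:.1f}%")
```

Passing the minimum and maximum message values separately, rather than a single amplitude, lets the same function handle message signals that are not symmetric about zero.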
8.2.2 Phase Modulation Concepts
In PM, the phase of the carrier at any given time is changed from its unmodulated value $\phi_c$ by an amount that depends on the value of the message signal at that time. Recall the expression for the unmodulated carrier given in Eq. (8.3):
$$v_c(t) = V_c \cos[\theta_c(t)] = V_c \cos(2\pi f_c t + \phi_c)$$
With the variations in phase imposed in sympathy with the message signal, we can now refer to the instantaneous phase $\phi_i$ of the carrier, given by
$$\phi_i = \phi_c + k_p v_m(t) \tag{8.14}$$
where $k_p$ is the change in carrier phase per volt of modulating signal, called phase sensitivity and expressed in units of rad/V. The unmodulated carrier phase $\phi_c$ is usually set to zero without any loss of generality, since the message signal is carried in the variation of the carrier phase rather than its absolute magnitude. Then
$$\phi_i = k_p v_m(t) \tag{8.15}$$
Let us now examine the variation of the angle of the carrier in Eq. (8.3) when it is phase modulated by the staircase message signal $v_m(t)$ in Eq. (8.2). We assume a phase modulator sensitivity $k_p = \pi/2$ rad/V. The unmodulated phase of the carrier is $\phi_c = -\pi/2$, and Eq. (8.14) gives the instantaneous phase of the phase modulated carrier:
$$\phi_i = \phi_c + k_p v_m(t) = \begin{cases} -\pi/2\ \text{rad}, & 0 \le t < 1\ \text{ms} \\ \pi/2\ \text{rad}, & 1 \le t < 2\ \text{ms} \\ 0\ \text{rad}, & 2 \le t < 3\ \text{ms} \\ -\pi\ \text{rad}, & 3 \le t < 4\ \text{ms} \\ -3\pi/2\ \text{rad}, & 4 \le t < 5\ \text{ms} \\ -\pi/2\ \text{rad}, & 5 \le t \le 6\ \text{ms} \end{cases} \tag{8.16}$$
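The values in Eq. (8.16), and the FM carrier angles quoted earlier ($31\pi/2$ at 2 ms and $47\pi/2$ at 3 ms), can be checked mechanically. The sketch below is an illustration only. It assumes the staircase levels 0, 2, 1, −1, −2, 0 V for the six 1 ms intervals, which is what the phases in Eq. (8.16) and the quoted instantaneous frequencies imply; the variable names are not from the text. Angles are held as exact fractions of $\pi$ to avoid rounding.

```python
from fractions import Fraction

# Message level (volts) in each 1 ms interval. These levels are inferred from
# the phases listed in Eq. (8.16); they are an assumption about Eq. (8.2).
vm_levels = [0, 2, 1, -1, -2, 0]

fc = 3                    # carrier frequency, kHz
kf = 1                    # frequency sensitivity, kHz/V
kp = Fraction(1, 2)       # phase sensitivity, in units of pi rad/V
phi_c = Fraction(-1, 2)   # unmodulated carrier phase, in units of pi rad

# PM: instantaneous phase in each interval, Eq. (8.14), in units of pi rad.
pm_phase = [phi_c + kp * v for v in vm_levels]

# FM: carrier angle at t = 0, 1, ..., 6 ms, in units of pi rad. With f in kHz
# and a 1 ms interval the angle grows by 2*pi*f_i rad, i.e. 2*f_i units of pi.
fm_angle = [phi_c]
for v in vm_levels:
    f_i = fc + kf * v                         # Eq. (8.4)
    fm_angle.append(fm_angle[-1] + 2 * f_i)

print("PM phase / pi:", [str(p) for p in pm_phase])
print("FM angle / pi:", [str(a) for a in fm_angle])
```

Because the assumed message has zero mean, the final FM angle equals the angle the unmodulated carrier would reach after 6 ms, in line with the earlier discussion.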
It would pay to check that you agree with the above values. The carrier angle $\theta_c$ is given at any time $t$ by the sum of two terms:
$$\theta_c(t) = 2\pi f_c t + \phi_i \tag{8.17}$$
The first term is a component that increases at the constant rate 2𝜋f c rad/s or 6𝜋 rad/ms – since the frequency f c (= 3 kHz) is not altered in PM. It has the values 0, 6𝜋, 12𝜋, 18𝜋, 24𝜋, 30𝜋, and 36𝜋, at t = 0, 1, 2, 3, 4, 5, and 6 ms, respectively. The second term is the instantaneous phase, which varies according to the modulating signal, and has the values given in Eq. (8.16). The sum of these two terms gives the phase modulated angle 𝜃 c (t) plotted against time in Figure 8.2b. Observe that in those intervals where the modulating signal is constant, the phase modulated carrier angle increases at the same constant rate (= 2𝜋f c ) as the unmodulated carrier angle. Then at the instant that vm (t) changes by, say, ΔV volt, the phase modulated angle undergoes a step change equal to kp ΔV. Herein lies an important difference between FM and PM. In PM, the carrier angle deviates from the unmodulated rate of change (= 2𝜋f c ) only when the modulating signal is changing, whereas in FM the carrier angle changes at a different rate than 2𝜋f c whenever the message signal is nonzero. It should be observed that PM does indirectly produce FM. For example, PM may cause the angle of a carrier to jump from a lower to a higher value, which is equivalent to the carrier oscillating at a faster rate, i.e. with a higher frequency. Similarly, a drop in carrier angle is equivalent to a reduction in oscillation frequency. FM and PM are therefore very closely related. In Figure 8.2b, the phase modulated angle deviates by discrete jumps from the constant rate of increase that is followed by unmodulated angles. This is because the modulating signal is staircase and changes in steps. If a smoothly varying (analogue) signal phase modulates a carrier, the deviations of
the rate of change of carrier angle from $2\pi f_c$ will also be smooth, and FM and PM will differ only in the value and time of occurrence of the carrier frequency variations. We explore the relationship between FM and PM further in the next section. We could define the terms phase swing, phase deviation, and so on, for PM in analogy with the corresponding definitions for FM. But these terms are rarely used, since it is conventional to treat PM in terms of frequency variations, making the FM terms applicable to PM. However, the term phase deviation, denoted $\phi_d$, merits some discussion. It is defined as the maximum amount by which the carrier phase (or the carrier angle) deviates from its unmodulated value. Thus
$$\phi_d = k_p |v_m(t)|_{\max} \tag{8.18}$$
Remember that a sinusoidal signal completes one full cycle of oscillation when its angle advances through $2\pi$ rad. Any further advance in angle merely causes the carrier to repeat its cycle. Using the standard convention of specifying positive angles as anticlockwise rotation from the $+x$ axis direction and negative angles as clockwise rotation from the same direction, the range of angles that covers a complete cycle is $-\pi$ to $+\pi$. See Figure 8.3a. Any phase change by an amount $\Phi$ that is outside this range can be shown to be equivalent to some value $\phi$ within this range, where
$$\phi = \Phi + 2n\pi, \qquad n = \ldots, -3, -2, -1, 1, 2, 3, \ldots \tag{8.19}$$
That is, $\phi$ is obtained by adding to or subtracting from $\Phi$ an integer number of $2\pi$'s in order to place it within the interval $-\pi$ to $+\pi$. For example, Figure 8.3b shows that the phase change $\Phi = 3\pi/2$ rad is equivalent to
$$\phi = \Phi - 2\pi = 3\pi/2 - 2\pi = -\pi/2\ \text{rad}$$
And the phase change $\Phi = -7\pi/2$ rad is equivalent to
$$\phi = \Phi + 4\pi = -7\pi/2 + 4\pi = \pi/2\ \text{rad}$$
Figure 8.3 (a) Positive and negative angles; (b) examples of equivalent angles: 3𝜋/2 is equivalent to −𝜋/2, and −7𝜋/2 is equivalent to 𝜋/2.
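The reduction described by Eq. (8.19) is a one-line operation in most languages. As a hedged illustration, the Python sketch below uses the standard library function `math.remainder`, which subtracts the nearest integer multiple of its second argument; the wrapper name `wrap_phase` is an assumption introduced here.

```python
import math


def wrap_phase(big_phi):
    """Map a phase change (rad) to its equivalent in the range -pi to +pi.

    math.remainder subtracts the nearest integer multiple of 2*pi, which is
    the operation described by Eq. (8.19). (Function name is illustrative.)
    """
    return math.remainder(big_phi, 2.0 * math.pi)


# The two examples of Figure 8.3b
for big_phi in (3 * math.pi / 2, -7 * math.pi / 2):
    print(f"{big_phi / math.pi:+.2f} pi  ->  {wrap_phase(big_phi) / math.pi:+.2f} pi")
```

For inputs that sit exactly on an odd multiple of $\pi$, the sign of the result depends on floating-point rounding, which is harmless here because $+\pi$ and $-\pi$ denote the same angle.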
It therefore follows that the maximum possible value of phase deviation $\phi_d$ is $\pi$ rad. A value of $\phi_d$ having absolute value in excess of $\pi$ rad is equivalent to a smaller value in the interval $-\pi$ to $+\pi$, according to Eq. (8.19), and would lead to an error at a PM demodulator that tracks the carrier phase. A large message value (that gives a phase deviation $> \pi$ rad) would be mistaken for a smaller message value that would cause an equivalent phase deviation between $-\pi$ and $\pi$. Note that this problem only applies in direct wideband phase demodulation, which is never used in analogue communication systems. We define phase modulation factor $m$ as the ratio of actual phase deviation in a PM implementation to the maximum possible deviation:
$$m = \frac{\phi_d}{\pi} \tag{8.20}$$
Worked Example 8.2
A message signal $v_m(t) = 5\sin(30 \times 10^3 \pi t)$ volt is used to phase modulate the carrier $v_c(t) = 10\sin(200 \times 10^6 \pi t)$ volt, causing a phase deviation of 2.5 rad. Determine
(a) The phase sensitivity of the modulator.
(b) The phase modulation factor.
(a) The maximum absolute value $|v_m(t)|_{\max}$ of the message signal is its amplitude $V_m = 5$ V. From Eq. (8.18), phase sensitivity is
$$k_p = \frac{\phi_d}{|v_m(t)|_{\max}} = \frac{2.5\ \text{rad}}{5\ \text{V}} = 0.5\ \text{rad/V}$$
(b) Phase modulation factor
$$m = \frac{\phi_d}{\pi} = \frac{2.5}{\pi} = 0.796$$
8.2.3 Relationship Between FM and PM
Let us now obtain a quantitative indication of the relationship between frequency modulation and phase modulation. We first explore the type of carrier frequency variation that occurs in PM and then consider phase variations in FM signals.
8.2.3.1 Frequency Variations in PM
Any message signal $v_m(t)$ can be approximated by a staircase signal $v_{st}(t)$, as shown in Figure 8.4a. Time is divided into small intervals $\Delta t$, and the curve of $v_m(t)$ in each interval is approximated by two line segments. One segment is horizontal and has a length $\Delta t$. The other is vertical and of length $\Delta v_m$, which gives the change undergone by $v_m(t)$ during the time interval $\Delta t$. In the limit when $\Delta t$ is infinitesimally small the approximation becomes exact, i.e. $v_{st}(t)$ and $v_m(t)$ become identical.
Figure 8.4 (a) Staircase approximation of an arbitrary message signal; (b) increase in carrier angle during time interval Δt.
Consider the result of phase modulating a carrier of frequency $f_c$ by the signal $v_{st}(t)$. The variation of the carrier angle $\theta_c$ during one time interval $\Delta t$ is shown in Figure 8.4b. $v_{st}(t)$ has a constant value until just before the end of the interval, so the carrier angle increases at the unmodulated rate of $2\pi f_c$ from point A to point B, an increase of $2\pi f_c \Delta t$. When $v_{st}(t)$ makes a step change of $\Delta v_m$ at the end of the interval, this causes the carrier phase, and hence its angle, to increase by $k_p \Delta v_m$ from point B to point C. The result is that the carrier angle has increased from $\theta_1$ to $\theta_2$ in a time $\Delta t$. The average rate of change of carrier angle is therefore
$$2\pi f_a = \frac{\theta_2 - \theta_1}{\Delta t} = \frac{2\pi f_c \Delta t + k_p \Delta v_m}{\Delta t} = 2\pi f_c + k_p \frac{\Delta v_m}{\Delta t}$$
where $f_a$ is the average carrier frequency during the interval $\Delta t$. By making the time interval $\Delta t$ infinitesimally small, $v_{st}(t)$ becomes the message signal $v_m(t)$, the ratio of $\Delta v_m$ to $\Delta t$ becomes the derivative or slope of the message signal, and $f_a$ becomes the instantaneous carrier frequency $f_i$. Thus
$$f_i = f_c + \frac{1}{2\pi} k_p \frac{dv_m(t)}{dt} \tag{8.21}$$
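Eq. (8.21) can be checked numerically by differentiating the phase modulated carrier angle of Eq. (8.17) directly. The Python sketch below does this with a central difference for a sinusoidal message. The parameter values (10 kHz carrier, 2 V message at 1 kHz, $k_p = 4$ rad/V) are borrowed for illustration, and the function names and step size `h` are assumptions made here, not part of the text.

```python
import math

fc = 10e3    # carrier frequency, Hz
fm = 1e3     # message frequency, Hz
Vm = 2.0     # message amplitude, V
kp = 4.0     # phase sensitivity, rad/V


def vm(t):
    """Sinusoidal message signal (an illustrative choice)."""
    return Vm * math.sin(2 * math.pi * fm * t)


def angle_pm(t):
    """Angle of the phase modulated carrier, Eq. (8.17) with phi_c = 0."""
    return 2 * math.pi * fc * t + kp * vm(t)


def fi_numeric(t, h=1e-8):
    """Instantaneous frequency (1/2pi) d(theta)/dt by central difference."""
    return (angle_pm(t + h) - angle_pm(t - h)) / (2 * h) / (2 * math.pi)


def fi_formula(t):
    """Eq. (8.21) with the derivative of the sinusoidal message written out."""
    dvm_dt = 2 * math.pi * fm * Vm * math.cos(2 * math.pi * fm * t)
    return fc + kp * dvm_dt / (2 * math.pi)


times = [k * 1e-5 for k in range(101)]            # 0 to 1 ms in 10 us steps
worst = max(abs(fi_numeric(t) - fi_formula(t)) for t in times)
peak_dev = max(abs(fi_formula(t) - fc) for t in times)
print(f"largest mismatch = {worst:.3g} Hz, peak deviation = {peak_dev:.0f} Hz")
```

The peak deviation found this way equals $k_p f_m V_m$, so the frequency deviation of a PM carrier grows with message frequency as well as amplitude.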
Equation (8.21) is a remarkable result that reveals how PM varies the carrier frequency. Compare it with Eq. (8.4), repeated below for convenience, which gives the instantaneous frequency of this carrier when frequency modulated by the same message signal $v_m(t)$ in a modulator of frequency sensitivity $k_f$:
$$f_i = f_c + k_f v_m(t)$$
PM and FM are therefore related as follows:
(i) Both PM and FM vary the carrier frequency, but not the carrier amplitude of course. FM's frequency variation is achieved directly, as expressed in Eq. (8.4). However, PM's frequency variation is indirect and occurs because a (direct) change in the carrier phase causes the carrier's angle to change at a different rate. In other words, it alters the carrier's angular frequency, and hence frequency.
(ii) PM varies the carrier frequency from its unmodulated value $f_c$ only at those instants when the message signal is changing. The carrier frequency is unchanged whenever the message signal has a constant value, say $V_1$. During this time, the phase of the carrier is simply held at a level $k_p V_1$ above its unmodulated value $\phi_c$. FM causes the carrier frequency to deviate from its unmodulated value whenever the message signal has a nonzero value, irrespective of whether that value is constant or changing.
(iii) In PM, the maximum instantaneous carrier frequency $f_{i\max}$ occurs at the instant where the message signal is changing most rapidly, i.e. at the point where the derivative of the message signal is at a maximum. An FM carrier has its maximum instantaneous frequency at the instant that the message signal is at a maximum value. This feature provides a sure way of distinguishing between FM and PM waveforms when the message signal is a smoothly varying analogue signal. This is discussed further in Section 8.3.2.
(iv) Frequency deviation $f_d$ in PM depends on both the amplitude and frequency of the message signal, whereas in FM it depends only on the message signal amplitude. To see that this is the case, assume a sinusoidal message signal $v_m(t) = V_m \sin(2\pi f_m t)$ in Eqs. (8.4) and (8.21). It follows that
$$f_i = \begin{cases} f_c + k_f V_m \sin(2\pi f_m t), & \text{FM} \\ f_c + k_p f_m V_m \cos(2\pi f_m t), & \text{PM} \end{cases} \tag{8.22}$$
$$f_{i\max} = \begin{cases} f_c + k_f V_m, & \text{FM} \\ f_c + k_p f_m V_m, & \text{PM} \end{cases}$$
The frequency deviation is therefore
$$f_d = f_{i\max} - f_c = \begin{cases} k_f V_m, & \text{FM} \\ k_p f_m V_m, & \text{PM} \end{cases} \tag{8.23}$$
which agrees with the previous statement. Thus, given two frequency components of the same amplitude in a modulating signal, the higher frequency component produces a larger carrier frequency deviation than the deviation caused by the lower frequency component if PM is used, whereas in FM both frequency components produce the same frequency deviation. For this reason, FM has a more efficient bandwidth utilisation than PM in analogue communication.
(v) PM can be obtained using an FM modulator. A block diagram of the arrangement is shown in Figure 8.5. The operation of the FM modulator results in the signal $v_o(t)$ frequency-modulating a carrier of frequency $f_c$. This produces an FM signal with instantaneous frequency $f_i$ given by Eq. (8.4) as
$$f_i = f_c + k_f v_o(t)$$
Since $v_o(t)$ is the output of the differentiator circuit whose input is the message signal $v_m(t)$, we have
$$v_o(t) = \frac{dv_m(t)}{dt}$$
Substituting this identity for $v_o(t)$ in the expression for $f_i$, and using a modulator of frequency sensitivity $k_f = k_p/2\pi$, we have the following result for the instantaneous frequency at the output of the FM modulator:
$$f_i = f_c + \frac{k_p}{2\pi} \frac{dv_m(t)}{dt}$$
You will recognise this as Eq. (8.21) – the instantaneous frequency of a PM signal. That is, the message signal has phase modulated the carrier, and this was achieved by passing the message signal through a differentiator and then using the differentiator output to frequency modulate a carrier. The differentiator essentially performs highpass filtering that boosts high frequency components of vm (t) relative to the lower frequency components. When this filtered signal vo (t) is fed into an FM modulator (whose frequency deviation, you will
recall, is proportional to modulating signal amplitude), the (boosted) higher frequency components will produce a larger frequency deviation than the deviation produced by lower frequency components of originally similar amplitude. This is a PM characteristic, and therefore the overall result is that the carrier has been phase modulated by $v_m(t)$.
Figure 8.5 PM generation using FM modulator: the message signal $v_m(t)$ passes through a differentiator ($d/dt$) to give $v_o(t)$, which drives an FM modulator of frequency sensitivity $k_f = k_p/2\pi$ to produce the PM signal.
8.2.3.2 Phase Variations in FM
We have explored in detail the relationship between FM and PM by looking at the frequency variations in both signals. An alternative approach that concentrates on phase variations is also very illuminating. We already know that PM varies the phase of a carrier signal. Let us show that FM also varies the carrier phase, albeit indirectly, and explore the relationships between the phase variations in both techniques. Figure 8.6a shows an arbitrary message signal vm (t) approximated by a staircase signal. We want to determine the instantaneous phase 𝜙i of the carrier signal vc (t) = Vc cos(2𝜋fc t + 𝜙i ) when it is frequency modulated by vm (t). The angle of the modulated carrier at any time is given by Eq. (8.17), where 𝜙i is the instantaneous carrier phase that we are after. We already know from Eq. (8.14) that PM gives 𝜙i = 𝜙c + kp vm (t). To determine 𝜙i in the case of FM, consider how the modulated carrier angle increases in the time from 0 to t. We have divided this duration into N infinitesimal intervals, numbered from 0 to N−1, each of width Δt. The message signal, approximated by the staircase signal vst (t), has a constant value during each interval Δt, which leads to a constant instantaneous frequency of the carrier during that interval. Let us denote the instantaneous frequency of the nth infinitesimal
interval as $f_n$, so that the carrier frequency is $f_0$ during the interval numbered 0, $f_1$ during the interval numbered 1, and so on. The carrier angle increases by $2\pi f_n \Delta t$ inside the $n$th interval, so that the angle of the carrier after time $t = N\Delta t$ is
$$\theta_c(t) = \phi_c + 2\pi f_0 \Delta t + 2\pi f_1 \Delta t + 2\pi f_2 \Delta t + \cdots = \phi_c + 2\pi \sum_{n=0}^{N-1} f_n \Delta t \tag{8.24}$$
Figure 8.6 (a) Message signal $v_m(t)$ and staircase approximation $v_{st}(t)$; (b) increase in angle of carrier, of unmodulated frequency $f_c$, when modulated in frequency by $v_{st}(t)$, with $f_n = f_c + k_f v_m(n\Delta t)$ in the $n$th interval.
Observe in Figure 8.6b that the modulating signal has a constant value $v_m(n\Delta t)$ during the $n$th interval. Thus, Eq. (8.4) gives the instantaneous frequency $f_n$ in this interval as
$$f_n = f_c + k_f v_m(n\Delta t)$$
Substituting this expression in Eq. (8.24), and noting that $N\Delta t = t$, yields
$$\theta_c(t) = \phi_c + 2\pi \sum_{n=0}^{N-1} \left(f_c + k_f v_m(n\Delta t)\right)\Delta t = \phi_c + 2\pi f_c t + 2\pi k_f \sum_{n=0}^{N-1} v_m(n\Delta t)\Delta t \tag{8.25}$$
In the limit $\Delta t \to 0$, the staircase approximation becomes exact, and the above summation becomes an integration from 0 to $t$, so that
$$\theta_c(t) = 2\pi f_c t + \phi_c + 2\pi k_f \int_0^t v_m(t)\,dt \tag{8.26}$$
It follows from Eq. (8.17) that the instantaneous phase of the frequency modulated carrier is
$$\phi_i = \phi_c + 2\pi k_f \int_0^t v_m(t)\,dt \tag{8.27}$$
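The passage from the staircase sum of Eq. (8.25) to the integral of Eq. (8.27) can be watched numerically. The sketch below, offered as an illustration under stated assumptions, evaluates the summation term for a cosine message over a quarter period and compares it with the closed-form value $k_f V_m / f_m$ of the integral term. The parameter values are those of Worked Example 8.1; the function name and the choice of a quarter-period window are assumptions made here.

```python
import math

kf = 25e3    # frequency sensitivity, Hz/V
Vm = 2.0     # message amplitude, V
fm = 15e3    # message frequency, Hz
t_end = 1 / (4 * fm)   # quarter period, where the integral of the cosine peaks


def fm_phase_staircase(n_intervals):
    """Phase shift 2*pi*kf*sum(vm(n*dt)*dt), the summation term in Eq. (8.25)."""
    dt = t_end / n_intervals
    total = sum(Vm * math.cos(2 * math.pi * fm * n * dt) * dt
                for n in range(n_intervals))
    return 2 * math.pi * kf * total


# Closed-form value of the integral term in Eq. (8.27) at t = t_end
exact = kf * Vm / fm

errors = []
for n_intervals in (10, 100, 1000, 10000):
    approx = fm_phase_staircase(n_intervals)
    errors.append(abs(approx - exact))
    print(f"N = {n_intervals:5d}: phase = {approx:.5f} rad, error = {errors[-1]:.2e}")
```

The error shrinks roughly in proportion to $1/N$, which is the expected behaviour of a staircase (left-endpoint) sum as $\Delta t \to 0$.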
Now compare Eq. (8.27) with Eq. (8.14), repeated below for convenience, which gives the instantaneous phase of the same carrier when it is phase modulated by the same message signal $v_m(t)$:
$$\phi_i = \phi_c + k_p v_m(t)$$
Therefore, the relationships between FM and PM stated in terms of carrier phase variation are as follows:
(i) Both PM and FM vary the carrier phase from its unmodulated value $\phi_c$, which is usually set to zero. Phase variation in FM is, however, indirect and results from the fact that a change in the carrier frequency causes the carrier angle to rise to a level that is different from its unmodulated value, and this difference in angle can be accounted as a phase change.
(ii) PM changes the carrier phase anytime that the message signal is nonzero, whereas FM changes the carrier phase at a given instant only if the average of all previous values of the signal is nonzero.
(iii) The maximum instantaneous carrier phase occurs in PM at the instant that the modulating signal has its maximum value, whereas in FM it occurs at the instant that the average of all previous values of the signal is at a maximum.
(iv) Phase deviation $\phi_d$ in FM depends on both the amplitude and frequency of the message signal, increasing with amplitude but decreasing with frequency. In PM, on the other hand, phase deviation depends only on the amplitude of the message signal. It can be shown that this is the case by substituting a sinusoidal message signal $v_m(t) = V_m \cos(2\pi f_m t)$ in Eqs. (8.14) and (8.27). This yields the result
$$\phi_i = \begin{cases} \phi_c + k_p V_m \cos(2\pi f_m t), & \text{PM} \\ \phi_c + k_f \dfrac{V_m}{f_m} \sin(2\pi f_m t), & \text{FM} \end{cases}$$
and hence
$$\phi_d = \begin{cases} k_p V_m, & \text{PM} \\ k_f \dfrac{V_m}{f_m}, & \text{FM} \end{cases} \tag{8.28}$$
Higher frequency components therefore produce smaller phase deviations in FM.
Figure 8.7 FM generation using PM modulator: the message signal $v_m(t)$ passes through an integrator to give $v_o(t) = \int_0^t v_m(t)\,dt$, which drives a PM modulator of phase sensitivity $k_p = 2\pi k_f$ to produce the FM signal.
(v) FM can be generated from a PM modulator using the arrangement shown in Figure 8.7. The PM modulator causes the signal $v_o(t)$ to phase modulate a carrier. This produces a PM signal with instantaneous phase
$$\phi_i = \phi_c + k_p v_o(t)$$
However
$$v_o(t) = \int_0^t v_m(t)\,dt$$
So that
$$\phi_i = \phi_c + k_p \int_0^t v_m(t)\,dt = \phi_c + 2\pi k_f \int_0^t v_m(t)\,dt$$
where we have used a modulator of phase sensitivity $k_p = 2\pi k_f$. You will no doubt recognise this as Eq. (8.27), the instantaneous phase of an FM signal. That is, the overall result of the arrangement in Figure 8.7 is that the message signal $v_m(t)$ has frequency modulated the carrier. FM has been achieved by passing the message signal through an integrator and then using the integrator output to phase modulate a carrier. In more practical terms, the integrator performs lowpass filtering that attenuates high-frequency components of $v_m(t)$ relative to the lower-frequency components. When this filtered signal $v_o(t)$ is fed into a PM modulator (whose phase deviation, you will recall, is proportional to modulating signal amplitude), the (attenuated) higher-frequency components will produce a smaller phase deviation than the deviation produced by lower-frequency components of (originally) similar amplitude. We showed above that this is a feature of FM, and therefore the overall result is that the carrier has been frequency modulated by $v_m(t)$.
Worked Example 8.3
Determine
(a) The phase deviation that arises in the FM implementation in Worked Example 8.1.
(b) The frequency deviation experienced by the PM carrier in Worked Example 8.2.
(a) What is required here is the maximum amount $\phi_d$ by which the phase of the frequency modulated carrier deviates from the phase $\phi_c$ of the unmodulated carrier. By Eq. (8.28)
$$\phi_d = \frac{k_f V_m}{f_m} = \frac{f_d}{f_m} = \beta \tag{8.29}$$
This is an important result, which shows that the phase deviation (in rad) of an FM carrier equals its modulation index. Therefore, from Worked Example 8.1, 𝜙d = 3.33 rad.
Note that in this example $\phi_d$ exceeds $\pi$ rad. There will, however, be no error at the receiver, as discussed earlier for PM, because an FM demodulator will be used, which only responds to the rate of change of carrier phase and not the phase magnitude or amount of phase change.
(b) We wish to determine the maximum amount $f_d$ by which the frequency of the phase modulated carrier deviates from the frequency $f_c$ of the unmodulated carrier. By Eq. (8.23)
$$f_d = k_p V_m f_m = \phi_d f_m = 2.5 \times 15\ \text{kHz} = 37.5\ \text{kHz}$$
We see that the frequency deviation of a PM carrier is given by the product of the carrier phase deviation and the message signal frequency.
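The results of Worked Examples 8.2 and 8.3 follow from four one-line formulas, collected below as a short Python sketch for checking the arithmetic. The variable names are illustrative assumptions; the numbers are those given in the worked examples.

```python
import math

# Worked Example 8.2 (PM): 5 V sinusoid at 15 kHz, phase deviation 2.5 rad
Vm_pm, fm_pm, phi_d_pm = 5.0, 15e3, 2.5
kp = phi_d_pm / Vm_pm          # Eq. (8.18) rearranged: phase sensitivity, rad/V
m_pm = phi_d_pm / math.pi      # Eq. (8.20): phase modulation factor
fd_pm = kp * Vm_pm * fm_pm     # Eq. (8.23): PM frequency deviation, Hz

# Worked Example 8.1 (FM): 2 V sinusoid at 15 kHz, kf = 25 kHz/V
kf, Vm_fm, fm_fm = 25e3, 2.0, 15e3
phi_d_fm = kf * Vm_fm / fm_fm  # Eq. (8.29): FM phase deviation, rad

print(f"kp = {kp} rad/V, m = {m_pm:.3f}, PM fd = {fd_pm / 1e3:.1f} kHz")
print(f"FM phase deviation = {phi_d_fm:.2f} rad")
```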
8.3 FM and PM Waveforms We studied AM waveforms in the previous chapter and saw that it was very easy to sketch them: you simply draw a sinusoid with the same number of cycles per second as the carrier but with its amplitude varied to match the message signal waveform. FM and PM waveforms are not as easy to sketch by hand except for the simple cases involving staircase message signals. However, armed with the basic concepts of angle modulation discussed above you can easily distinguish between oscilloscope displays of FM and PM waveforms. You can also readily plot the FM and PM waveforms for an arbitrary message signal with the aid of a computer.
8.3.1 Sketching Simple Waveforms When the message signal has a staircase shape then the FM and PM waveforms can be easily sketched, following the procedure discussed below through a specific example. It should be emphasised that the following approach is only valid for staircase message waveforms. However, this condition is not as restrictive as may first appear. Many message signals, including all digital signals (binary, ternary, quaternary, etc.) belong in this category. Consider, then, the simple staircase message signal vm (t) and carrier vc (t) = V c cos(2𝜋f c t + 𝜙c ) introduced in Eqs. (8.2) and (8.3). The carrier signal has amplitude V c = 2 V, frequency f c = 3 kHz, and initial phase 𝜙c = −𝜋/2 rad. We wish to sketch the FM and PM waveforms obtained when vm (t) modulates the frequency and phase of vc (t), respectively. We assume frequency sensitivity kf = 1 kHz/V and phase sensitivity kp = 𝜋/2 rad/V. Figure 8.8 shows the waveforms involved. The message signal is sketched at the top in (a). The unmodulated carrier is sketched in (b) over the duration of the message signal from t = 0 to t = 6 ms. This is easily done by noting that a frequency of 3 kHz means that the sinusoid completes three cycles in each 1 ms interval, and that an initial phase of −𝜋/2 rad simply converts the cosine sinusoid to a sine. The FM signal is sketched in (c). To do this, we determine, using Eq. (8.4), the instantaneous frequency f i of the modulated carrier within each interval over which the message signal has a constant value. The simplification made possible by the staircase nature of the message signal is that we can treat the FM signal as having constant amplitude and phase over all intervals and having a single value of instantaneous frequency in each interval. During the first 1 ms, vm (t) = 0 and f i = f c = 3 kHz. So, we sketch three cycles of a cosine sinusoid of amplitude 2 V, and phase −𝜋/2 – the amplitude and phase remain unchanged throughout. 
In the next 1 ms interval, $v_m(t) = 2$ V and
$$f_i = f_c + k_f \times 2\ \text{V} = 3\ \text{kHz} + 1\ \text{kHz/V} \times 2\ \text{V} = 5\ \text{kHz}$$
Therefore, we sketch five cycles of the sinusoid. Proceeding in this manner the sketch shown in (c) is completed. Eq. (8.5) gives a complete list of values of the instantaneous frequency. The staircase nature of the message signal again greatly simplifies the procedure for sketching the PM waveform shown in Figure 8.8d. We treat not only the amplitude but also the frequency of the carrier as constant throughout.
Figure 8.8 (a) Staircase modulating signal; (b) carrier; (c) frequency modulated carrier; (d) phase modulated carrier. (Vertical axes in volts, −2 to 2; time axis 0 to 6 ms.)
So we sketch in each 1 ms interval three cycles (because $f_c = 3$ kHz) of a sinusoid of amplitude 2 V, but the phase $\phi_i$ in each interval is determined by the message signal value during that interval, according to Eq. (8.14). During the first 1 ms, $v_m(t) = 0$ and $\phi_i = \phi_c = -\pi/2$ rad. So, we sketch three cycles of a cosine sinusoid of amplitude 2 V and phase $-\pi/2$ rad. In the next 1 ms interval, $v_m(t) = 2$ V and
$$\phi_i = \phi_c + k_p \times 2\ \text{V} = -\pi/2 + (\pi/2\ \text{rad/V}) \times 2\ \text{V} = \pi/2\ \text{rad}$$
So we sketch the (cosine) sinusoid with a phase $\pi/2$ rad. Proceeding in this manner we complete the sketch shown in (d) using the complete list of values of $\phi_i$ given in Eq. (8.16). You may wish to review much of Section 2.7 if you have any difficulty with sketching sinusoids of different phases.
8.3.2 General Waveform
Let an arbitrary message signal $v_m(t)$ modulate the angle of the sinusoidal carrier, given by Eq. (8.3), of amplitude $V_c$, frequency $f_c$, and initial phase $\phi_c$:
$$v_c(t) = V_c \cos(2\pi f_c t + \phi_c)$$
Based on the concepts discussed earlier, the resulting angle modulated signal is
$$v(t) = V_c \cos(2\pi f_c t + \phi_i) \tag{8.30}$$
where $\phi_i$ is the instantaneous phase given by Eqs. (8.14) and (8.27). Specifically
$$\phi_i = \begin{cases} \phi_c + 2\pi k_f \displaystyle\int_0^t v_m(t)\,dt, & \text{FM} \\ \phi_c + k_p v_m(t), & \text{PM} \end{cases} \tag{8.31}$$
Substituting Eq. (8.31) in Eq. (8.30) yields the following general expressions for the waveforms of a frequency modulated signal $v_{fm}(t)$ and a phase modulated signal $v_{pm}(t)$:
$$\begin{aligned} v_{fm}(t) &= V_c \cos\left[2\pi f_c t + \phi_c + 2\pi k_f \int_0^t v_m(t)\,dt\right] \\ v_{pm}(t) &= V_c \cos\left[2\pi f_c t + \phi_c + k_p v_m(t)\right] \end{aligned} \tag{8.32}$$
Eq. (8.32) gives a complete specification of FM and PM signals in the time domain. It shows that they have the same constant amplitude V c as the unmodulated carrier, but a variable rate of completing each cycle, caused by the dependence of the instantaneous phase on the modulating signal. The unmodulated carrier, on the other hand, completes each cycle at a constant rate f c (cycles per second). Given a specification of the carrier and message signals, a reliable hand sketch of vfm (t) and vpm (t) is generally not possible. However, an accurate waveform can be displayed by using Eq. (8.32) to calculate the values of vfm (t) and vpm (t) at a sufficient number of time instants, and plotting these points on a graph. We will examine the result of this procedure applied to four different message signals, namely bipolar, sinusoidal, triangular, and arbitrary. The same carrier signal is used in the first three examples, with amplitude V c = 2 V, frequency f c = 10 kHz, and initial phase 𝜙c = 0 rad. The bipolar signal has a staircase waveform. It is used here to demonstrate that Eq. (8.32) is applicable to all types of message signals, including the simple staircase waveforms discussed earlier. One of our aims is to show that, although FM and PM both involve carrier frequency variation, the difference between their waveforms can be spotted by visual inspection. Figure 8.9 shows the plots for a bipolar message signal vm (t) of duration 2 ms and amplitude 1 V; using modulations of frequency sensitivity kf = 5 kHz/V and phase sensitivity kp = −𝜋/2 rad/V. A plot of vm (t) computed at a large number N of time instants ranging from t = 0 to 2 ms is shown in (a). The phase modulated signal vpm (t) is easily calculated at each time instant using Eq. (8.32), and this is plotted in (c) with the unmodulated carrier plotted in (b) for comparison. Calculating the frequency modulated signal vfm (t) is a little more involved. 
Specifically, to determine the value of $v_{fm}(t)$ at the time instant $t = \tau$, one numerically integrates $v_m(t)$ from $t = 0$ to $t = \tau$, obtaining a value of, say, $V(\tau)$, and then determines $v_{fm}(\tau)$ as
$$v_{fm}(\tau) = V_c \cos[2\pi f_c \tau + \phi_c + 2\pi k_f V(\tau)]$$
The computation is done for all the $N$ time instants in the range from 0 to 2 ms, where $N$ must be large to make the numerical integration accurate. The signal $v_{fm}(t)$ computed as described here is plotted in Figure 8.9d. A detailed discussion of numerical integration is outside our scope. But if you have access to MATLAB, you can perform the entire computation for $v_{fm}(t)$ using the following single line of code
    vfm = Vc*cos(2*pi*fc*t + phic + 2*pi*kf*cumtrapz(t, vm))        (8.33)
To use Eq. (8.33), you must first create a vector of time instants t and a vector of corresponding message values vm, and assign values to the following variables: carrier amplitude Vc, carrier frequency fc, carrier initial phase phic, and frequency sensitivity kf. A typical code that would precede Eq. (8.33) could be
    N = 4096; t = linspace(0, 2, N)';        % N time instants in ms
    vm = [ones(N/2, 1); -ones(N/2, 1)];      % Values of bipolar signal
    Vc = 2; fc = 10;                         % Carrier amplitude (volt) and freq (kHz)
    phic = 0; kf = 5;                        % Initial phase (rad) and freq sensitivity (kHz/V)
                                                                    (8.34)
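For readers without MATLAB, the same computation can be sketched in Python using only the standard library. This is an illustrative port under stated assumptions, not the author's code: the hand-written `cumtrapz` helper stands in for MATLAB's function of the same name, and the PM line uses the phase sensitivity $k_p = -\pi/2$ rad/V quoted for Figure 8.9.

```python
import math


def cumtrapz(t, y):
    """Cumulative trapezoidal integral of y over t, starting from zero.

    Plays the role of MATLAB's cumtrapz in Eq. (8.33); written out by hand
    so that only the standard library is needed.
    """
    out = [0.0]
    for k in range(1, len(t)):
        out.append(out[-1] + 0.5 * (y[k] + y[k - 1]) * (t[k] - t[k - 1]))
    return out


N = 4096
t = [2.0 * k / (N - 1) for k in range(N)]       # N time instants, ms
vm = [1.0] * (N // 2) + [-1.0] * (N // 2)       # bipolar message signal, V
Vc, fc, phic = 2.0, 10.0, 0.0                   # volt, kHz, rad
kf, kp = 5.0, -math.pi / 2                      # kHz/V, rad/V

V = cumtrapz(t, vm)                             # running integral of vm, V*ms
vfm = [Vc * math.cos(2 * math.pi * fc * tk + phic + 2 * math.pi * kf * Vk)
       for tk, Vk in zip(t, V)]                 # FM signal, Eq. (8.32)
vpm = [Vc * math.cos(2 * math.pi * fc * tk + phic + kp * vk)
       for tk, vk in zip(t, vm)]                # PM signal, Eq. (8.32)

print(f"integral near 1 ms = {max(V):.4f} V ms, at 2 ms = {V[-1]:.2e} V ms")
```

Working in kHz and ms keeps the products $f_c t$ and $k_f V$ dimensionless without any scaling factors, as in the MATLAB snippet. The lists `vfm` and `vpm` can be passed to any plotting tool to reproduce waveforms of the kind shown in Figure 8.9.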
Figure 8.9 Angle modulation by bipolar message signal. Waveforms of (a) message signal; (b) sinusoidal carrier; (c) PM signal; (d) FM signal. (Time axis 0 to 2 ms.)
Observe in Figure 8.9 that the unmodulated carrier completes 10 cycles per ms, as expected of a 10 kHz sinusoidal signal. The difference between the waveforms of the phase modulated signal $v_{pm}(t)$ and the frequency modulated signal $v_{fm}(t)$ is obvious. The frequency of $v_{pm}(t)$ is the same ($f_c = 10$ kHz) in each interval where $v_m(t)$ is constant,
whereas the frequency of v_fm(t) is different from f_c wherever v_m(t) is nonzero, which in this case is at all time instants.

Following the procedure discussed above, angle modulated waveforms are plotted in Figures 8.10–8.12 for different message signals. The carrier is the same as in Figure 8.9b, so its plot is omitted. Figure 8.10 shows a sinusoidal message signal of amplitude V_m = 2 V and frequency f_m = 1 kHz in (a). The PM waveform is shown in (b), for phase sensitivity k_p = 4 rad/V, which according to Eq. (8.23) gives a frequency deviation of 8 kHz. The FM waveform shown in Figure 8.10c was calculated with frequency sensitivity k_f = 4 kHz/V, which also yields a frequency deviation of 8 kHz. These values of k_f and k_p were chosen specifically to achieve a large and equal frequency deviation in both FM and PM. This gives a pronounced variation in instantaneous frequency that aids our discussion and allows us to demonstrate that the two waveforms have distinguishing features even when their frequency deviations are equal. However, the value of k_p here is larger than would be used in practice, since it leads to a phase deviation 𝜙_d = 8 rad, and hence a PM factor m larger than unity (m = 𝜙_d/𝜋 = 2.55). You may wish to verify that, when FM and PM signals are obtained using the same carrier and the same message signal (of frequency f_m), the condition for their frequency deviations to be equal is given by

$$k_f = k_p f_m \tag{8.35}$$
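The equal-deviation condition can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes the peak frequency deviation is k_f V_m for FM and k_p V_m f_m for PM with a sinusoidal message, which is consistent with the 8 kHz values quoted above for Figure 8.10.

```python
import math

def fm_frequency_deviation(kf_hz_per_v, vm_peak):
    """Peak frequency deviation of FM (Hz): kf * Vm."""
    return kf_hz_per_v * vm_peak

def pm_frequency_deviation(kp_rad_per_v, vm_peak, fm_hz):
    """Peak frequency deviation of PM (Hz) for a sinusoidal message: kp * Vm * fm."""
    return kp_rad_per_v * vm_peak * fm_hz

# Parameters quoted for Figure 8.10
Vm, fm = 2.0, 1e3      # message amplitude (V) and frequency (Hz)
kp, kf = 4.0, 4e3      # phase sensitivity (rad/V) and frequency sensitivity (Hz/V)

fd_fm = fm_frequency_deviation(kf, Vm)
fd_pm = pm_frequency_deviation(kp, Vm, fm)
phase_deviation = kp * Vm                 # rad
pm_factor = phase_deviation / math.pi     # PM factor m

print(f"FM deviation: {fd_fm / 1e3:.1f} kHz, PM deviation: {fd_pm / 1e3:.1f} kHz")
print(f"phase deviation: {phase_deviation:.1f} rad, PM factor m: {pm_factor:.2f}")
print("k_f equals k_p * f_m:", math.isclose(kf, kp * fm))
```

With these parameters both deviations come out at 8 kHz, and the PM factor evaluates to about 2.55, matching the figures quoted in the text.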
It is interesting to observe how the instantaneous frequency of both the PM and FM waveforms in Figure 8.10 changes smoothly, because the modulating signal is non-staircase. The question then arises of how to identify which waveform is PM and which is FM. The identification can be done very straightforwardly using any one of several tests, all based on the discussion in Section 8.2.3.1. The simplest of these tests are as follows:

(i) If the instantaneous frequency f_i of the modulated waveform is maximum at the same instants in which the message signal is maximum then it is FM. Otherwise, it is PM. In Figure 8.10, the message signal is maximum at t = 0, 1, and 2 ms. The waveform in (b) is not oscillating most rapidly at these instants, making it PM, whereas the waveform in (c) has a maximum oscillation rate at these instants, making it definitely FM.
(ii) If f_i is minimum wherever the modulating signal v_m(t) is minimum then the waveform is FM; otherwise, it is PM. Looking again at Figure 8.10 we see that v_m(t) is minimum at t = 0.5 and 1.5 ms. The waveform in (c) has the lowest rate of oscillation at these instants, making it FM, whereas waveform (b) does not, and is therefore PM.
(iii) If f_i is maximum wherever v_m(t) is increasing most rapidly then the waveform is PM; otherwise, it is FM. Note that in Figure 8.10 waveform (b) has the largest oscillation rate at the time instants t = 0.75 and 1.75 ms, when the modulating signal is increasing most rapidly.
(iv) Finally, if f_i is minimum wherever v_m(t) is decreasing most rapidly then the waveform is PM; otherwise, it is FM. Again, note that waveform (b) in Figure 8.10 has a minimum oscillation rate at t = 0.25 and 1.25 ms when v_m(t) has the largest negative slope.

Figure 8.10 Angle modulation by sinusoidal message signal. Waveforms of (a) message signal; (b) PM signal; (c) FM signal. [Axes: v_m(t), v_pm(t), and v_fm(t) from −2 to 2 V; t from 0 to 2 ms in steps of 0.25 ms.]

Another interesting observation in Figure 8.10 is that both modulated waveforms complete the same number of cycles as the unmodulated carrier in any 1 ms interval. This is not a coincidence. As a rule, the average frequency of an FM signal is equal to the unmodulated carrier frequency f_c in any time interval over which the modulating signal has a mean value of zero. A PM signal, on the other hand, has an average frequency equal to f_c in any interval in which the modulating signal has an average rate of change (i.e. mean slope) equal to zero.

Figure 8.11 shows the results for a triangular modulating signal. It is recommended that you verify that each of the waveform identification tests discussed above is valid in this case.
Can you explain why the PM waveform has the same number of cycles as the unmodulated carrier over the interval t = 0 to 1 ms, whereas the FM waveform has a larger number of cycles? Figure 8.12 shows the modulated waveforms for an arbitrary and non-staircase modulating signal obtained with modulation parameters kf = 8 kHz/V, kp = 𝜋/2 rad/V, f c = 20 kHz, and 𝜙c = 0. You may wish to verify that the
Figure 8.11 Angle modulation by triangular message signal. Waveforms of (a) message signal, (b) PM signal, (c) FM signal. [Axes: v_m(t) from 0 to 2 V; v_pm(t) and v_fm(t) from −2 to 2 V; t from 0 to 1 ms.]

Figure 8.12 Angle modulation by an arbitrary non-staircase message signal. Waveforms of (a) message signal, (b) PM signal, and (c) FM signal. [Axes: v_m(t), v_pm(t), and v_fm(t) from −2 to 2 V; t from 0 to 1 ms.]
waveform identification tests are again valid. Neither the PM nor FM waveform has the same number of cycles (= 20) as the unmodulated carrier over the time interval from 0 to 1 ms. Can you explain why? The carrier frequency in the above modulation examples was selected to be low enough to allow clear illustrations of the cycles, but high enough to be sufficiently larger than the highest significant frequency component f m of the message signal. The latter condition is important to avoid distortion, as explained in the next section.
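The identification tests and the average-frequency rule described for Figure 8.10 can be checked numerically. The following sketch assumes the Figure 8.10 parameters (f_c = 10 kHz, f_m = 1 kHz, V_m = 2 V, k_f = 4 kHz/V, k_p = 4 rad/V) and a cosine message, and takes the instantaneous frequency to be f_c + k_f v_m(t) for FM and f_c + (k_p/2𝜋) dv_m/dt for PM. The helper names are illustrative, not taken from the text.

```python
import math

fc, fm = 10e3, 1e3            # carrier and message frequencies (Hz)
Vm, kf, kp = 2.0, 4e3, 4.0    # message amplitude (V), Hz/V, rad/V

def vm(t):
    return Vm * math.cos(2 * math.pi * fm * t)

def dvm_dt(t):
    return -2 * math.pi * fm * Vm * math.sin(2 * math.pi * fm * t)

def fi_fm(t):
    # FM: instantaneous frequency follows the message itself
    return fc + kf * vm(t)

def fi_pm(t):
    # PM: instantaneous frequency follows the slope of the message
    return fc + kp * dvm_dt(t) / (2 * math.pi)

N = 4000
ts = [i * 1e-3 / N for i in range(N)]     # one message period, 0 to 1 ms

t_max_fm = max(ts, key=fi_fm)             # instant of fastest FM oscillation
t_max_pm = max(ts, key=fi_pm)             # instant of fastest PM oscillation
avg_fm = sum(fi_fm(t) for t in ts) / N    # average frequency over the period
avg_pm = sum(fi_pm(t) for t in ts) / N

print(f"FM is fastest at t = {t_max_fm * 1e3:.2f} ms (message maximum)")
print(f"PM is fastest at t = {t_max_pm * 1e3:.2f} ms (message rising most rapidly)")
print(f"average frequency: FM {avg_fm / 1e3:.3f} kHz, PM {avg_pm / 1e3:.3f} kHz")
```

Over one full message period the message has zero mean and zero mean slope, so both averages return to the carrier frequency, which is why both waveforms in Figure 8.10 complete the same number of cycles as the unmodulated carrier.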
8.4 Spectrum and Power of FM and PM

We are now able to discuss the important issues of spectrum and power in angle modulated signals, which leads naturally to bandwidth considerations. To simplify the discussion, let us assume a sinusoidal modulating signal v_m(t) = V_m cos(2𝜋f_m t), and a carrier of initial phase 𝜙_c = 0. Substituting these in Eq. (8.32) gives

$$
\begin{aligned}
v_{fm}(t) &= V_c \cos\left[2\pi f_c t + \frac{k_f V_m}{f_m}\sin(2\pi f_m t)\right] \\
v_{pm}(t) &= V_c \cos\left[2\pi f_c t + k_p V_m \cos(2\pi f_m t)\right]
\end{aligned}
$$

But, from Eq. (8.29), the term k_f V_m/f_m is the FM modulation index 𝛽; and, from Eq. (8.28), the term k_p V_m is the PM phase deviation 𝜙_d. Therefore

$$
\begin{aligned}
v_{fm}(t) &= V_c \cos[2\pi f_c t + \beta \sin(2\pi f_m t)] \\
v_{pm}(t) &= V_c \cos[2\pi f_c t + \phi_d \cos(2\pi f_m t)]
\end{aligned}
\tag{8.36}
$$
The time domain relationships between FM and PM have been discussed at length already. This and the striking similarity of the two expressions in Eq. (8.36) suggest that FM and PM will also be strongly related in the frequency domain. Let us explore this further by first assuming that the modulation index 𝛽 and phase deviation 𝜙d are small, which leads us to what is known as narrowband angle modulation. When the restriction on the size of 𝛽 and 𝜙d is removed, we have what is termed wideband angle modulation.
8.4.1 Narrowband FM and PM

8.4.1.1 Frequency Components

Let us expand the right-hand side of Eq. (8.36) using the compound angle identity of Eq. B.3 (Appendix B)

$$
\begin{aligned}
v_{fm}(t) &= V_c \cos(2\pi f_c t)\cos[\beta \sin(2\pi f_m t)] - V_c \sin(2\pi f_c t)\sin[\beta \sin(2\pi f_m t)] \\
v_{pm}(t) &= V_c \cos(2\pi f_c t)\cos[\phi_d \cos(2\pi f_m t)] - V_c \sin(2\pi f_c t)\sin[\phi_d \cos(2\pi f_m t)]
\end{aligned}
\tag{8.37}
$$

For sufficiently small angles 𝜃 (in rad), we may use the approximation

$$
\left.
\begin{aligned}
\cos(\theta) &\approx 1 \\
\sin(\theta) &\approx \theta \\
\tan(\theta) &\approx \theta
\end{aligned}
\right\}, \quad \theta \to 0
\tag{8.38}
$$
Table 8.1 shows the error involved in this approximation for several small angles from 𝜃 = 0 to 𝜃 = 0.4 rad. The error increases as 𝜃 increases, and the cut-off point for using the approximations in Eq. (8.38) is usually taken as 𝜃 = 0.25 rad. At this point the approximation cos(𝜃) ≈ 1 introduces an error of 3.209%, sin(𝜃) ≈ 𝜃 produces a 1.049% error, and tan(𝜃) ≈ 𝜃 causes an error of 2.092%.

Table 8.1 Errors in the approximations cos(𝜃) ≈ 1, sin(𝜃) ≈ 𝜃, and tan(𝜃) ≈ 𝜃, for small values of 𝜃 (rad).

| 𝜃 (rad) | 0 | 0.05 | 0.1 | 0.15 | 0.2 | 0.25 | 0.3 | 0.35 | 0.4 |
|---|---|---|---|---|---|---|---|---|---|
| cos 𝜃 | 1 | 0.999 | 0.995 | 0.989 | 0.980 | 0.969 | 0.955 | 0.939 | 0.921 |
| Approximation (x) | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| % error = 100(x − cos 𝜃)/cos 𝜃 | 0 | 0.125 | 0.502 | 1.136 | 2.034 | 3.209 | 4.675 | 6.454 | 8.570 |
| sin 𝜃 | 0 | 0.050 | 0.100 | 0.149 | 0.199 | 0.247 | 0.296 | 0.343 | 0.389 |
| Approximation (y) | 0 | 0.05 | 0.1 | 0.15 | 0.2 | 0.25 | 0.3 | 0.35 | 0.4 |
| % error = 100(y − sin 𝜃)/sin 𝜃 | 0 | 0.042 | 0.167 | 0.376 | 0.670 | 1.049 | 1.516 | 2.071 | 2.717 |
| tan 𝜃 | 0 | 0.050 | 0.100 | 0.151 | 0.203 | 0.255 | 0.309 | 0.365 | 0.423 |
| Approximation (z) | 0 | 0.05 | 0.1 | 0.15 | 0.2 | 0.25 | 0.3 | 0.35 | 0.4 |
| % error = magnitude of 100(z − tan 𝜃)/tan 𝜃 | 0 | 0.083 | 0.334 | 0.751 | 1.337 | 2.092 | 3.018 | 4.117 | 5.391 |

It follows that in Eq. (8.37) when the conditions

$$
\begin{aligned}
\beta &\le 0.25, && \text{FM} \\
\phi_d &\le 0.25, && \text{PM}
\end{aligned}
\tag{8.39}
$$

are satisfied we may use the substitutions

$$
\begin{aligned}
\cos[\beta \sin(2\pi f_m t)] &\approx 1 \\
\sin[\beta \sin(2\pi f_m t)] &\approx \beta \sin(2\pi f_m t) \\
\cos[\phi_d \cos(2\pi f_m t)] &\approx 1 \\
\sin[\phi_d \cos(2\pi f_m t)] &\approx \phi_d \cos(2\pi f_m t)
\end{aligned}
\tag{8.40}
$$
to obtain

$$v_{fm}(t) \approx V_c \cos(2\pi f_c t) - \beta V_c \sin(2\pi f_c t)\sin(2\pi f_m t) \equiv v_{nbfm}(t) \tag{8.41}$$

$$v_{pm}(t) \approx V_c \cos(2\pi f_c t) - \phi_d V_c \sin(2\pi f_c t)\cos(2\pi f_m t) \equiv v_{nbpm}(t) \tag{8.42}$$
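The size of the error behind the narrowband approximation can be estimated numerically. The sketch below first evaluates the three small-angle errors at the cut-off 𝜃 = 0.25 rad, then compares the exact FM expression of Eq. (8.36) with the narrowband form of Eq. (8.41) over 1 ms. The parameter values (V_c = 1 V, f_c = 40 kHz, f_m = 2 kHz, 𝛽 = 0.25) and the function names are assumptions chosen for illustration.

```python
import math

def percent_error(approx, exact):
    """Magnitude of the percentage error of approx relative to exact."""
    return 100.0 * abs(approx - exact) / abs(exact)

theta = 0.25  # rad, the usual narrowband cut-off
err_cos = percent_error(1.0, math.cos(theta))
err_sin = percent_error(theta, math.sin(theta))
err_tan = percent_error(theta, math.tan(theta))
print(f"cos: {err_cos:.3f}%  sin: {err_sin:.3f}%  tan: {err_tan:.3f}%")

# Exact FM, Eq. (8.36), against the narrowband form, Eq. (8.41)
Vc, fc, fm, beta = 1.0, 40e3, 2e3, 0.25

def v_fm(t):
    return Vc * math.cos(2 * math.pi * fc * t + beta * math.sin(2 * math.pi * fm * t))

def v_nbfm(t):
    return (Vc * math.cos(2 * math.pi * fc * t)
            - beta * Vc * math.sin(2 * math.pi * fc * t) * math.sin(2 * math.pi * fm * t))

ts = [i * 1e-3 / 20000 for i in range(20001)]   # 1 ms in 0.05 microsecond steps
worst = max(abs(v_fm(t) - v_nbfm(t)) for t in ts)
print(f"largest difference between exact and narrowband FM: {worst:.4f} V")
```

At the cut-off value of 𝛽 the two waveforms differ by only a few percent of the carrier amplitude, and most of that difference comes from the cos(𝜃) ≈ 1 substitution.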
Using the trigonometric identities of Appendix B to expand the product terms on the right-hand side of Eqs. (8.41) and (8.42), replacing the sine terms in the resulting expression for v_nbpm(t) with cosine according to the identity sin(𝜃) = cos(𝜃 − 𝜋/2), and then absorbing any negative sign that precedes a cosine term by using the identity −cos(𝜃) = cos(𝜃 + 𝜋), we finally obtain

$$v_{nbfm}(t) = V_c \cos(2\pi f_c t) + \frac{\beta V_c}{2}\cos[2\pi(f_c - f_m)t + \pi] + \frac{\beta V_c}{2}\cos[2\pi(f_c + f_m)t] \tag{8.43}$$

$$v_{nbpm}(t) = V_c \cos(2\pi f_c t) + \frac{\phi_d V_c}{2}\cos[2\pi(f_c - f_m)t + \pi/2] + \frac{\phi_d V_c}{2}\cos[2\pi(f_c + f_m)t + \pi/2] \tag{8.44}$$
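For readers who want the intermediate step, the expansion of the product terms can be written out as follows. This is a sketch that assumes the product identities sin A sin B = ½[cos(A − B) − cos(A + B)] and sin A cos B = ½[sin(A + B) + sin(A − B)] are the ones intended from Appendix B.

```latex
\begin{align*}
-\beta V_c \sin(2\pi f_c t)\sin(2\pi f_m t)
  &= -\tfrac{\beta V_c}{2}\cos[2\pi(f_c - f_m)t] + \tfrac{\beta V_c}{2}\cos[2\pi(f_c + f_m)t] \\
  &= \tfrac{\beta V_c}{2}\cos[2\pi(f_c - f_m)t + \pi] + \tfrac{\beta V_c}{2}\cos[2\pi(f_c + f_m)t] \\[4pt]
-\phi_d V_c \sin(2\pi f_c t)\cos(2\pi f_m t)
  &= -\tfrac{\phi_d V_c}{2}\sin[2\pi(f_c + f_m)t] - \tfrac{\phi_d V_c}{2}\sin[2\pi(f_c - f_m)t] \\
  &= \tfrac{\phi_d V_c}{2}\cos[2\pi(f_c + f_m)t + \pi/2] + \tfrac{\phi_d V_c}{2}\cos[2\pi(f_c - f_m)t + \pi/2]
\end{align*}
```

The last line uses −sin(𝜃) = −cos(𝜃 − 𝜋/2) = cos(𝜃 + 𝜋/2), which is why both NBPM side frequencies end up leading the carrier by 𝜋/2.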
The above signals are referred to as narrowband frequency modulation (NBFM) and narrowband phase modulation (NBPM), and result from angle modulation using a small modulation index 𝛽 (for FM) and a small phase deviation 𝜙_d (for PM) as specified by Eq. (8.39). Under this condition, and for a sinusoidal modulating signal, the resulting angle modulated signal has only three frequency components, namely the carrier at f_c, a lower side frequency (LSF) at f_c − f_m, and an upper side frequency (USF) at f_c + f_m. You will recall from the previous chapter that an AM signal v_am(t) consists precisely of these same components

$$v_{am}(t) = V_c \cos(2\pi f_c t) + \frac{mV_c}{2}\cos[2\pi(f_c - f_m)t] + \frac{mV_c}{2}\cos[2\pi(f_c + f_m)t] \tag{8.45}$$

It is worthwhile to take a moment to examine Eqs. (8.43)–(8.45) in order to have a thorough understanding of narrowband angle modulation and its relationship with AM.

8.4.1.2 Comparing AM, NBFM, and NBPM

8.4.1.2.1 Amplitude Spectra

Figure 8.13 shows the amplitude spectrum of AM, NBFM, and NBPM. We see that they have the same bandwidth

$$B_{AM} = B_{NBFM} = B_{NBPM} = 2f_m \tag{8.46}$$

where f_m is the frequency of the message signal. The amplitude of side frequencies in each spectrum is

$$
V_{SF} =
\begin{cases}
mV_c/2, & \text{AM} \\
\beta V_c/2, & \text{NBFM} \\
\phi_d V_c/2, & \text{NBPM}
\end{cases}
\tag{8.47}
$$

The goal in AM design is to make the modulation factor m close to, but not exceeding, unity. However, the maximum value of modulation index 𝛽 (for NBFM) and phase deviation 𝜙_d (for NBPM) is 0.25. Therefore, the maximum amplitude of each side frequency is

$$
V_{SF\,max} =
\begin{cases}
V_c/2, & \text{AM} \\
V_c/8, & \text{NBFM} \\
V_c/8, & \text{NBPM}
\end{cases}
\tag{8.48}
$$
8.4.1.2.2 Waveforms
The waveforms of AM, NBFM, and NBPM signals obtained with parameter values m = 𝛽 = 𝜙_d = 0.25, V_c = 1 V, f_c = 40 kHz, and f_m = 2 kHz are shown in Figure 8.14. It is interesting to note how very different the AM waveform v_am(t) is from the angle modulated waveforms even though all three waveforms contain the same frequency components of the same amplitude. The amplitude of v_am(t) varies markedly between (1 ± m)V_c, whereas the angle modulated waveforms exhibit only a slight amplitude variation. This peculiar behaviour is due to differences in the phase relationships of the frequency components in each signal, which can be readily understood using phasors.

8.4.1.2.3 Phase Considerations

Equations (8.43)–(8.45) reveal that at the initial time (t = 0) AM has both the LSF and USF in phase with the carrier. NBFM has its USF in phase with the carrier, but its LSF is 180° out of phase. NBPM, on the other hand, has both its LSF and USF leading the carrier by 90°. This information is summarised in the phase spectra of Figure 8.15, which gives the initial phases (i.e. angle at t = 0) of all frequency components in the modulated signals. The implication of these phase relationships on the instantaneous amplitude of each signal is very significant. The best way to illustrate this is by the method of phasors introduced in Chapter 2 (Section 2.7.2).

Figure 8.13 Single-sided amplitude spectrum of (a) AM, (b) NBFM, and (c) NBPM. [Each panel shows a carrier of amplitude V_c at f_c, with an LSF at f_c − f_m and a USF at f_c + f_m; the side-frequency amplitudes are mV_c/2 with m ≤ 1 in (a), 𝛽V_c/2 with 𝛽 ≤ 0.25 in (b), and 𝜙_dV_c/2 with 𝜙_d ≤ 0.25 in (c).]
Figure 8.16 uses the phasor addition of the carrier, LSF, and USF to obtain the AM, NBFM, and NBPM signals in both amplitude and phase at various time instants. The amplitudes of the carrier, LSF, and USF are, respectively, denoted V_c, V_l, and V_u. At t = 0, the LSF and USF of the AM signal both have zero phase, so their phasors point in the 0° direction. In the NBFM signal, the phase of the LSF is 𝜋 rad, and that of the USF is 0 rad, so their phasors point in the 180° and 0° directions, respectively. Similarly, the LSF and USF phasors of the NBPM signal both point in the 90° direction at t = 0. The phasor addition in each diagram involves drawing the phasors of the carrier, LSF, and USF in turn, one phasor starting from the endpoint of the previous one. The resultant signal is the phasor obtained by joining the start point of the first drawn phasor to the endpoint of the last drawn phasor. Note that phasors that would lie on top of each other in Figure 8.16 have been displaced to keep each one distinct and therefore visible in the diagram.

Figure 8.14 1 ms duration of AM, NBFM and NBPM waveforms. Parameters: m = 𝛽 = 𝜙_d = 0.25; f_c = 40 kHz; f_m = 2 kHz; V_c = 1 V. [Amplitude axes: AM from −1.25V_c to 1.25V_c; NBFM and NBPM from −V_c to V_c.]

To better understand Figure 8.16, you may wish to consider a useful analogy of three cars, L (for LSF), C (for carrier), and U (for USF), starting at the same time (t = 0) and travelling anticlockwise on a circular road at different speeds. Speed is specified in units of cycles per second, i.e. how many rounds of the circular road a car completes in 1 s. Assume that car L travels at speed f_c − f_m, car C at speed f_c, and car U at speed f_c + f_m. Let car C be chosen as a reference point from which the other two cars are observed. The result will be that C appears stationary, L appears to be travelling clockwise at speed f_m, and U appears to be travelling anticlockwise at the same speed f_m. That is, L is falling behind at the same speed that U is edging ahead. One final point in our analogy is to imagine that there are three different starting-position scenarios, with car C always starting (at t = 0) from the starting point, called the 0 rad point. In what we may call the AM scenario, all three cars start together from the starting point. In the NBFM scenario, car L starts from a half-cycle ahead, at the 𝜋 rad point; and in the NBPM case, both cars L and U start at the 𝜋/2 rad point, which is one-quarter cycle ahead of the starting point.

Returning then to the three frequency components (LSF, carrier, and USF) of the modulated signals, we choose the carrier (of frequency f_c) as reference, which allows the carrier to be represented as a zero-phase sinusoid always. This means that its phasor will always point horizontally to the right. On the other hand, the phasor of the USF (of frequency f_c + f_m) will change its direction with time as though it were rotating anticlockwise at the rate of f_m cycles per second. The phasor of the LSF (of frequency f_c − f_m) will also change its direction with time, only that the change occurs as though the LSF were rotating clockwise at the rate of f_m cycles per second. It takes the USF a time T_m = 1/f_m to complete one cycle or 2𝜋 rad, relative to the carrier of course. An arbitrary angle 𝜃 is therefore traversed in a time

$$t = \frac{\theta}{2\pi} T_m = \frac{\theta}{2\pi f_m} \tag{8.49}$$
Figure 8.15 Single-sided phase spectra of (a) AM, (b) NBFM, and (c) NBPM. [Phase axis from 0° to 180°; components at f_c − f_m, f_c, and f_c + f_m.]

Figure 8.16 shows the phasor diagrams and resultant amplitudes for the three modulated waveforms at intervals of T_m/4 starting at t = 0. It follows from Eq. (8.49) that the USF advances (anticlockwise) by 𝜋/2 rad relative to the carrier during this interval, whereas the LSF retreats (clockwise) by the same angle. By studying Figure 8.16 carefully, you will see why AM has a significant amplitude variation, whereas both NBFM and NBPM have a much smaller amplitude variation. In AM, the initial phases of the LSF and USF relative to the carrier are such that they add constructively in phase with the carrier at t = nT_m, and in opposition to the carrier at t = (n + 1/2)T_m, for n = 0, 1, 2, …, giving the AM signal a maximum amplitude V_am max and minimum amplitude V_am min, respectively, where

$$
\begin{aligned}
V_{am\,max} &= V_c + V_l + V_u = V_c + \frac{mV_c}{2} + \frac{mV_c}{2} = V_c(1 + m) \\
V_{am\,min} &= V_c - (V_l + V_u) = V_c(1 - m)
\end{aligned}
\tag{8.50}
$$
Furthermore, the AM signal is always in phase with the carrier because the initial phases of the LSF and USF are such that their quadrature components are always equal and opposite.
Figure 8.16 Phasor diagrams at intervals of T_m/4 for (a) AM, (b) NBFM and (c) NBPM signals. Parameters: Message signal frequency f_m (= 1/T_m); carrier amplitude V_c; AM modulation factor m; FM modulation index 𝛽; and PM phase deviation 𝜙_d. [Side-frequency phasor amplitudes: V_l = V_u = mV_c/2 in (a), 𝛽V_c/2 in (b), and 𝜙_dV_c/2 in (c).]
In NBFM and NBPM, on the other hand, whenever one side frequency is in phase with the carrier, the other is exactly in opposition. The minimum amplitude of the resultant signal is therefore equal to the carrier amplitude V_c. The maximum amplitude occurs when the LSF and USF add constructively at 90° to the carrier. Therefore

$$
\begin{aligned}
V_{nbfm\,min} &= V_{nbpm\,min} = V_c \\
V_{nbfm\,max} &= \sqrt{V_c^2 + \left(\frac{\beta V_c}{2} + \frac{\beta V_c}{2}\right)^2} = V_c\sqrt{1 + \beta^2} \\
V_{nbpm\,max} &= V_c\sqrt{1 + \phi_d^2}
\end{aligned}
\tag{8.51}
$$

Note in Figure 8.16 that the NBFM and NBPM waveforms do not attain maximum or minimum together, but at different time instants separated by T_m/4.
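The envelope limits in Eqs. (8.50) and (8.51) can be reproduced by adding the three phasors numerically. In the sketch below the carrier is the reference phasor, the USF is rotated anticlockwise and the LSF clockwise by the same angle 𝜃, and the initial side-frequency phases are those read from Eqs. (8.43)–(8.45). The names `INITIAL_PHASES`, `envelope`, and `envelope_range`, and the choice V_c = 1 with m = 𝛽 = 𝜙_d = 0.25, are illustrative assumptions.

```python
import cmath
import math

INITIAL_PHASES = {            # (LSF, USF) phases at t = 0, in rad
    "AM": (0.0, 0.0),
    "NBFM": (math.pi, 0.0),
    "NBPM": (math.pi / 2, math.pi / 2),
}

def envelope(scheme, theta, Vc=1.0, index=0.25):
    """Magnitude of carrier + LSF + USF after the side frequencies rotate by theta."""
    lsf_phase, usf_phase = INITIAL_PHASES[scheme]
    side = index * Vc / 2                      # amplitude of each side frequency
    lsf = cmath.rect(side, lsf_phase - theta)  # LSF rotates clockwise
    usf = cmath.rect(side, usf_phase + theta)  # USF rotates anticlockwise
    return abs(Vc + lsf + usf)

def envelope_range(scheme, Vc=1.0, index=0.25, steps=3600):
    """Minimum and maximum envelope over one full rotation of the side frequencies."""
    values = [envelope(scheme, 2 * math.pi * k / steps, Vc, index) for k in range(steps)]
    return min(values), max(values)

for scheme in INITIAL_PHASES:
    low, high = envelope_range(scheme)         # Vc = 1, so values are in units of Vc
    print(f"{scheme}: min = {low:.4f}, max = {high:.4f}, "
          f"peak-to-peak = {100 * (high - low):.2f}% of Vc")
```

For an index of 0.25 the AM envelope swings between 0.75V_c and 1.25V_c, while the NBFM and NBPM envelopes stay between V_c and V_c√(1 + 0.25²), a variation of roughly 3% of the carrier amplitude.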
We see that the maximum peak-to-peak amplitude variation in NBFM or NBPM is only about 3% of the carrier amplitude, when the modulation index or phase deviation is at the maximum value of 0.25 allowed by the narrowband criterion of Eq. (8.39). This should be compared with AM, which has a peak-to-peak amplitude variation given by (200 × m)% of the carrier. This is 50% at m = 0.25, and 200% when m has its maximum allowed value m = 1, at which point the modulated carrier amplitude varies between 0 and 2V_c. Thus, the effect of the above differences in phase relationships is to cause the amplitude variation in an AM waveform to be at least 16 times greater than in NBFM and NBPM even when the three signals have identical amplitude spectra.

8.4.1.3 Amplitude Variations in NBFM and NBPM
General expressions for the envelope of the three modulated waveforms can be obtained through phasor addition. Considering Figure 8.17, we see that the resultant phasor representing each modulated waveform at a time t when each side frequency has rotated through angle 𝜃 (relative to the carrier) is given by

$$
\begin{aligned}
V_{am} &= (V_c + 2x)\angle 0 = V_c[1 + m\cos\theta]\angle 0 \\
V_{nbfm} &= \left[\sqrt{V_c^2 + (2y)^2}\right]\angle\phi_{fm} = V_c\left[\sqrt{1 + \beta^2\sin^2\theta}\right]\angle\tan^{-1}(\beta\sin\theta) \\
V_{nbpm} &= \left[\sqrt{V_c^2 + (2z)^2}\right]\angle\phi_{pm} = V_c\left[\sqrt{1 + \phi_d^2\cos^2\theta}\right]\angle\tan^{-1}(\phi_d\cos\theta)
\end{aligned}
$$

Using Eq. (8.49) to express 𝜃 as a function of time, and employing the approximation 𝜃 ≈ tan⁻¹(𝜃) suggested in Eq. (8.38), we obtain

$$
\begin{aligned}
V_{am} &= V_c[1 + m\cos(2\pi f_m t)]\angle 0 \\
V_{nbfm} &= V_c\left[\sqrt{1 + \beta^2\sin^2(2\pi f_m t)}\right]\angle\beta\sin(2\pi f_m t) \\
V_{nbpm} &= V_c\left[\sqrt{1 + \phi_d^2\cos^2(2\pi f_m t)}\right]\angle\phi_d\cos(2\pi f_m t)
\end{aligned}
\tag{8.52}
$$

Figure 8.17 LSF, carrier and USF phasors at t = 1/(8f_m) for (a) AM, (b) NBFM and (c) NBPM. The carrier remains fixed (as a reference) while the LSF and USF phasors rotate at the same speed but in opposite directions as indicated. [Panel annotations, with V the amplitude of each side frequency: (a) x = V cos 𝜃 = (mV_c/2) cos 𝜃; (b) y = V sin 𝜃 = (𝛽V_c/2) sin 𝜃 and 𝜙_fm = tan⁻¹(2y/V_c) = tan⁻¹(𝛽 sin 𝜃); (c) z = V cos 𝜃 = (𝜙_dV_c/2) cos 𝜃 and 𝜙_pm = tan⁻¹(2z/V_c) = tan⁻¹(𝜙_d cos 𝜃).]
Equation (8.52) gives the AM, NBFM, and NBPM signals in phasor form. It states, for example, that v_nbfm(t) has amplitude V_c√(1 + 𝛽² sin²(2𝜋f_m t)), and phase 𝛽 sin(2𝜋f_m t) relative to the carrier signal used as reference in Figure 8.17. We may therefore write

$$
\begin{aligned}
v_{am}(t) &= V_c[1 + m\cos(2\pi f_m t)]\cos(2\pi f_c t) \\
v_{nbfm}(t) &= V_c\left[\sqrt{1 + \beta^2\sin^2(2\pi f_m t)}\right]\cos[2\pi f_c t + \beta\sin(2\pi f_m t)] \\
v_{nbpm}(t) &= V_c\left[\sqrt{1 + \phi_d^2\cos^2(2\pi f_m t)}\right]\cos[2\pi f_c t + \phi_d\cos(2\pi f_m t)]
\end{aligned}
\tag{8.53}
$$
Equation (8.53) reveals that in AM only the carrier amplitude is varied by the modulating signal, with up to 200% peak-to-peak variation possible. NBFM and NBPM involve up to a maximum of about 3% peak-to-peak amplitude variation, as well as angle modulation in which the instantaneous phase 𝜙_i of the carrier is a function of the modulating signal given by

$$
\phi_i =
\begin{cases}
\beta\sin(2\pi f_m t), & \text{NBFM} \\
\phi_d\cos(2\pi f_m t), & \text{NBPM}
\end{cases}
\tag{8.54}
$$

You may wish to verify that Eqs. (8.43)–(8.45) are equivalent to Eq. (8.53) by evaluating them at selected time instants. However, Eq. (8.53) explicitly gives the envelope and phase of the modulated waveforms at any given instant. The AM signal conveys information in its envelope, whereas NBFM and NBPM convey information in the variation of their angles beyond the linear increase of unmodulated carriers. Note that the variation in NBFM and NBPM amplitudes is due to the error introduced by the approximations of Eq. (8.40).
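The suggested verification can be carried out numerically for NBFM. The sketch below evaluates the three-component form of Eq. (8.43) and the envelope-and-phase form over one message period, once with the exact phase tan⁻¹(𝛽 sin 𝜃) and once with the approximated phase 𝛽 sin 𝜃 used in Eq. (8.53). The parameter values (V_c = 1 V, f_c = 40 kHz, f_m = 2 kHz, 𝛽 = 0.25) and the function names are assumptions for illustration.

```python
import math

Vc, fc, fm, beta = 1.0, 40e3, 2e3, 0.25

def nbfm_components(t):
    """Eq. (8.43): carrier plus LSF (phase pi) plus USF."""
    return (Vc * math.cos(2 * math.pi * fc * t)
            + 0.5 * beta * Vc * math.cos(2 * math.pi * (fc - fm) * t + math.pi)
            + 0.5 * beta * Vc * math.cos(2 * math.pi * (fc + fm) * t))

def nbfm_envelope_phase(t, exact_phase):
    """Envelope-and-phase form; exact_phase=True uses atan(x) instead of x."""
    x = beta * math.sin(2 * math.pi * fm * t)
    phase = math.atan(x) if exact_phase else x   # Eq. (8.53) uses phase = x
    return Vc * math.sqrt(1 + x * x) * math.cos(2 * math.pi * fc * t + phase)

ts = [i * 0.5e-3 / 5000 for i in range(5001)]    # one message period (0.5 ms)
err_exact = max(abs(nbfm_components(t) - nbfm_envelope_phase(t, True)) for t in ts)
err_approx = max(abs(nbfm_components(t) - nbfm_envelope_phase(t, False)) for t in ts)

print(f"with exact phase atan(x): max difference = {err_exact:.2e} V")
print(f"with approximated phase x: max difference = {err_approx:.2e} V")
```

With the exact phase the two forms agree to within rounding error, whereas the small-angle phase of Eq. (8.53) leaves a residual of a few millivolts per volt of carrier at 𝛽 = 0.25.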
8.4.2 Wideband FM and PM

When the modulation index 𝛽 (in FM) and phase deviation 𝜙_d (in PM) exceed about 0.25, the narrowband angle modulation approximations are no longer valid. To determine the bandwidth requirements of a carrier that is angle modulated by a sinusoidal message signal, we must know the frequency, amplitude, and phase of each sinusoidal component contained in the angle modulated signals v_fm(t) and v_pm(t) in Eq. (8.36). It turns out by Fourier analysis (see Question 8.9) that v_fm(t) and v_pm(t) can be written as a sum of sinusoids as follows

$$
\begin{aligned}
v_{fm}(t) = {} & V_c J_0(\beta)\cos(2\pi f_c t) \\
& + V_c J_1(\beta)\{\cos[2\pi(f_c + f_m)t] + \cos[2\pi(f_c - f_m)t + \pi]\} \\
& + V_c J_2(\beta)\{\cos[2\pi(f_c + 2f_m)t] + \cos[2\pi(f_c - 2f_m)t]\} \\
& + V_c J_3(\beta)\{\cos[2\pi(f_c + 3f_m)t] + \cos[2\pi(f_c - 3f_m)t + \pi]\} \\
& + V_c J_4(\beta)\{\cos[2\pi(f_c + 4f_m)t] + \cos[2\pi(f_c - 4f_m)t]\} \\
& + V_c J_5(\beta)\{\cos[2\pi(f_c + 5f_m)t] + \cos[2\pi(f_c - 5f_m)t + \pi]\} \\
& + \cdots
\end{aligned}
\tag{8.55}
$$

$$
\begin{aligned}
v_{pm}(t) = {} & V_c J_0(\phi_d)\cos(2\pi f_c t) \\
& + V_c J_1(\phi_d)\{\cos[2\pi(f_c + f_m)t + \pi/2] + \cos[2\pi(f_c - f_m)t + \pi/2]\} \\
& + V_c J_2(\phi_d)\{\cos[2\pi(f_c + 2f_m)t + \pi] + \cos[2\pi(f_c - 2f_m)t + \pi]\} \\
& + V_c J_3(\phi_d)\{\cos[2\pi(f_c + 3f_m)t - \pi/2] + \cos[2\pi(f_c - 3f_m)t - \pi/2]\} \\
& + V_c J_4(\phi_d)\{\cos[2\pi(f_c + 4f_m)t] + \cos[2\pi(f_c - 4f_m)t]\} \\
& + V_c J_5(\phi_d)\{\cos[2\pi(f_c + 5f_m)t + \pi/2] + \cos[2\pi(f_c - 5f_m)t + \pi/2]\} \\
& + \cdots
\end{aligned}
\tag{8.56}
$$
Equations (8.55) and (8.56) are applicable to both narrowband (𝛽 ≤ 0.25, 𝜙_d ≤ 0.25) and wideband angle modulation (𝛽 > 0.25, 𝜙_d > 0.25). They provide a complete insight into the spectrum of FM and PM signals involving a sinusoidal modulating signal. It is worthwhile to examine in some detail their implications.

The expressions for v_fm(t) and v_pm(t) reveal that FM and PM have the same amplitude spectrum, with phase deviation 𝜙_d playing the same role in PM that modulation index 𝛽 does in FM. If PM and FM are implemented with 𝜙_d = 𝛽, and with the same values of carrier amplitude V_c and modulating frequency f_m, then their amplitude spectra are identical. However, PM differs from FM in its phase spectrum, which accounts for the waveform differences discussed earlier.

Ignore for the moment the sign of the amplitude of each frequency component; a negative sign introduces a phase shift of 𝜋 rad. We see from Eq. (8.55) that in FM, the USFs, namely f_c + f_m, f_c + 2f_m, …, and the even-numbered LSFs, e.g. f_c − 2f_m, all have zero initial phase, and only the odd-numbered LSFs differ in phase by 𝜋 rad. In PM, Eq. (8.56) shows that each USF and the corresponding LSF (e.g. f_c + 2f_m and f_c − 2f_m) have the same phase, which increases in steps of 𝜋/2 rad starting from zero phase at the carrier. That is, the phase of the nth pair of side frequencies is n𝜋/2. For example, the phase of the fourth pair of side frequencies is 4 × 𝜋/2 = 2𝜋 ≡ 0 rad, and the phase of the third pair of side frequencies is 3 × 𝜋/2 = 3𝜋/2 ≡ −𝜋/2 rad.

In view of the above similarities, the following discussion is focussed entirely on FM. The results on spectrum, bandwidth, and power are applicable to PM, with phase deviation 𝜙_d replacing modulation index 𝛽.
Recall, however, that, to avoid distortion in phase demodulators that track absolute phase, 𝜙_d is limited to values not exceeding 𝜋 rad, whereas 𝛽 has no theoretical restriction on its values, except that transmission bandwidth requirement increases as 𝛽 is increased. Phase spectrum is not discussed further, but this can be easily obtained where desired, by following the above comments.

8.4.2.1 Spectrum
It follows from Eq. (8.55) that an FM signal vfm (t) contains the carrier frequency f c and an infinite set of side frequencies, which occur in pairs on either side of f c . The spectral components are spaced apart by the modulating signal frequency f m , and the nth pair of side frequencies consists of the nth LSF at f c − nf m and the nth USF at f c + nf m , both of which have the same amplitude. Figure 8.18 gives an example of an FM spectrum, which should be compared with the spectrum in AM. AM contains only the carrier and the first pair of side frequencies, whereas FM contains an infinite set of side frequencies in addition to the carrier. So, does FM have infinite bandwidth? It will become obvious in the following discussion that the amplitude of the side frequencies f c ± nf m at sufficiently high n is negligible, so that FM bandwidth is indeed finite. In fact, an FM signal can be realised with little distortion if only the first few side frequencies are retained. For example, Figure 8.19 shows the synthesis of an FM signal, of modulation index 𝛽 = 3, using only the carrier and the first few side frequencies. Observe that the distortion is significant when only the first two pairs of side frequencies are included, but negligible when up to the first five pairs are retained. The number of side frequencies required to attain negligible distortion in the synthesised waveform increases as 𝛽 is increased.
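The effect of truncating the side frequencies can be explored with a short numerical experiment. The sketch below synthesises an FM waveform from the carrier and the first N pairs of side frequencies, using the amplitudes and phases of Eq. (8.55), and reports the worst-case difference from the exact FM expression. It assumes the parameters quoted for Figure 8.19 (𝛽 = 3, f_c = 10 kHz, f_m = 1 kHz) with V_c = 1 V, and computes J_n(𝛽) from the standard power series; the function names are hypothetical.

```python
import math

def bessel_j(n, x, terms=40):
    """J_n(x) for integer n >= 0 from its power series (adequate for small x)."""
    return sum((-1) ** k * (x / 2) ** (2 * k + n)
               / (math.factorial(k) * math.factorial(n + k))
               for k in range(terms))

def fm_exact(t, Vc, fc, fm, beta):
    return Vc * math.cos(2 * math.pi * fc * t + beta * math.sin(2 * math.pi * fm * t))

def fm_synth(t, Vc, fc, fm, coeffs):
    """Carrier plus len(coeffs) - 1 pairs of side frequencies, phases as in Eq. (8.55)."""
    v = coeffs[0] * math.cos(2 * math.pi * fc * t)
    for n in range(1, len(coeffs)):
        usf = math.cos(2 * math.pi * (fc + n * fm) * t)
        lsf = (-1) ** n * math.cos(2 * math.pi * (fc - n * fm) * t)  # odd LSFs: phase pi
        v += coeffs[n] * (usf + lsf)
    return Vc * v

def worst_error(pairs, Vc=1.0, fc=10e3, fm=1e3, beta=3.0, samples=2000):
    """Largest gap between exact and synthesised FM over one message period."""
    coeffs = [bessel_j(n, beta) for n in range(pairs + 1)]
    ts = [i / (fm * samples) for i in range(samples)]
    return max(abs(fm_exact(t, Vc, fc, fm, beta) - fm_synth(t, Vc, fc, fm, coeffs))
               for t in ts)

for pairs in (2, 5):
    print(f"{pairs} pairs: worst-case error = {worst_error(pairs):.3f} V")
```

With two pairs the synthesised waveform departs from the exact one by a large fraction of the carrier amplitude, while with five pairs the discrepancy drops to a few percent, in line with the behaviour described for Figure 8.19.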
Figure 8.18 Example of single-sided amplitude spectrum of FM signal. [Carrier at f_c with the first to fourth LSFs at f_c − f_m to f_c − 4f_m and the first to fourth USFs at f_c + f_m to f_c + 4f_m.]

Figure 8.19 Synthesis of FM waveform using carrier plus (a) first two pairs of side frequencies, and (b) first five pairs of side frequencies. FM waveform parameters: 𝛽 = 3.0, f_c = 10 kHz, f_m = 1 kHz. [Each panel overlays the FM signal and the synthesised signal.]
The amplitude of the nth pair of side frequencies is given by the unmodulated carrier amplitude V_c scaled by the factor J_n(𝛽), which is the nth order Bessel function of the first kind evaluated with argument 𝛽. That is, an FM signal generated with modulation index 𝛽 will contain several frequency components, namely

(i) The unmodulated carrier frequency f_c, which may be viewed as the 0th (zeroth) side frequency (f_c ± 0f_m), of amplitude V_c J_0(𝛽).
(ii) The first pair of side frequencies (f_c + f_m and f_c − f_m) of amplitude V_c J_1(𝛽).
(iii) The second pair of side frequencies (f_c + 2f_m and f_c − 2f_m) of amplitude V_c J_2(𝛽).
(iv) And so on.
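The list above translates directly into a small routine that tabulates the amplitude spectrum for a given 𝛽. In this sketch J_n(𝛽) is evaluated from Bessel's integral, J_n(𝛽) = (1/𝜋)∫₀^𝜋 cos(n𝜏 − 𝛽 sin 𝜏) d𝜏, using the trapezoidal rule; this is a standard representation, chosen here as an assumption so that the example needs only the standard library. The function names and the parameter values (V_c = 1 V, f_c = 10 kHz, f_m = 1 kHz, 𝛽 = 3) are illustrative.

```python
import math

def bessel_j(n, x, steps=2000):
    """J_n(x) from Bessel's integral, evaluated with the trapezoidal rule."""
    h = math.pi / steps
    total = 0.5 * (1.0 + math.cos(n * math.pi))   # end points tau = 0 and tau = pi
    for k in range(1, steps):
        tau = k * h
        total += math.cos(n * tau - x * math.sin(tau))
    return total * h / math.pi

def fm_amplitude_spectrum(Vc, fc, fm, beta, pairs):
    """(frequency in Hz, amplitude) for the carrier and the first `pairs` pairs."""
    lines = [(fc, abs(Vc * bessel_j(0, beta)))]
    for n in range(1, pairs + 1):
        amplitude = abs(Vc * bessel_j(n, beta))   # sign only affects the phase spectrum
        lines.append((fc - n * fm, amplitude))
        lines.append((fc + n * fm, amplitude))
    return sorted(lines)

spectrum = fm_amplitude_spectrum(Vc=1.0, fc=10e3, fm=1e3, beta=3.0, pairs=5)
for frequency, amplitude in spectrum:
    print(f"{frequency / 1e3:5.1f} kHz  {amplitude:.4f} V")
```

Taking the absolute value discards the sign of J_n(𝛽), so the output is an amplitude spectrum only; a negative J_n(𝛽) would appear as an extra 𝜋 rad in the phase spectrum.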
Figure 8.20 Graph of Bessel functions J_n(𝛽) versus 𝛽 for various values of n. [Curves J_0(𝛽) to J_10(𝛽) plotted for 𝛽 from 0 to 15 on a vertical axis from −0.4 to 1; the zeros of J_0(𝛽) are marked at 𝛽 = 2.4048, 5.5201, 8.6537, 11.7915, and 14.9309.]
It is therefore important to understand the characteristic of the Bessel function. Figure 8.20 shows a plot of J n (𝛽) as a function of 𝛽 for various integer values of n. To understand this diagram, assume a normalised carrier of unit amplitude (V c = 1). Then the curve labelled J 0 (𝛽) gives the amplitude of the carrier component (0th side frequency) in the FM signal as a function of the modulation index 𝛽. If you follow this curve through the plot, you will observe that J 0 (𝛽) = 0 at modulation index 𝛽 = 2.4048, 5.5201, 8.6537, 11.7915, 14.9309, etc. At these values of modulation index the FM spectrum does not contain a component at the unmodulated carrier frequency f c . Energy in the unmodulated carrier has been entirely distributed to the side frequencies. We will have more to say about this later. Note also that J 0 (𝛽) is negative for certain values of modulation index, e.g. for 𝛽 between 2.4048 and 5.5201. The effect of a negative value of J n (𝛽) is that the phase of the nth pair of side frequencies is advanced by 𝜋 rad from that indicated in Eq. (8.55). Therefore, when dealing with the amplitude spectrum, only the absolute value of J n (𝛽) is considered, and its sign is ignored since that only affects the phase spectrum. Considering the other J n (𝛽) curves in Figure 8.20, a trend can be observed that, as n increases, the curves remain near zero until a larger value of 𝛽 is reached. For example, the value of J 4 (𝛽) is negligible until 𝛽 has increased to about 1.1, whereas J 10 (𝛽) is negligible up to 𝛽 ≈ 5.5. This is an important characteristic of J n (𝛽), which signifies that the higher-side frequencies (n large) are insignificant in the FM spectrum when the modulation index is low. As modulation index increases, more side frequencies become significant. In other words, FM bandwidth increases with modulation index 𝛽. In fact, Figure 8.20 shows that when 𝛽 ≤ 0.25 then only J 0 (𝛽) and J 1 (𝛽) have significant values. 
This means that only the carrier and the first pair of side frequencies f_c ± f_m are noticeably present in the spectrum. You should recognise this as the special case of NBFM discussed earlier. Values of J_n(𝛽) can be more easily read from tables such as Table 8.2, which gives values accurate to four decimal places, although an accuracy of two decimal places is sufficient for practical calculations.

8.4.2.2 Power
Equation (8.30) gives the general expression for an FM signal, and shows that it is a sinusoidal signal of constant amplitude V c but variable or modulated instantaneous phase. Bearing in mind that the power in a sinusoidal signal
Table 8.2 Bessel function to four decimal places.

| 𝛽 | J0(𝛽) | J1(𝛽) | J2(𝛽) | J3(𝛽) | J4(𝛽) | J5(𝛽) | J6(𝛽) | J7(𝛽) | J8(𝛽) | J9(𝛽) | J10(𝛽) | J11(𝛽) | J12(𝛽) | J13(𝛽) | J14(𝛽) | J15(𝛽) | J16(𝛽) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1.0000 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.10 | 0.9975 | 0.0499 | 0.0012 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.20 | 0.9900 | 0.0995 | 0.0050 | 0.0002 | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.25 | 0.9844 | 0.1240 | 0.0078 | 0.0003 | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.50 | 0.9385 | 0.2423 | 0.0306 | 0.0026 | 0.0002 | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.0 | 0.7652 | 0.4401 | 0.1149 | 0.0196 | 0.0025 | 0.0002 | - | - | - | - | - | - | - | - | - | - | - |
| 1.5 | 0.5118 | 0.5579 | 0.2321 | 0.0610 | 0.0118 | 0.0018 | 0.0002 | - | - | - | - | - | - | - | - | - | - |
| 2.0 | 0.2239 | 0.5767 | 0.3528 | 0.1289 | 0.0340 | 0.0070 | 0.0012 | 0.0002 | - | - | - | - | - | - | - | - | - |
| 2.4048 | - | 0.5192 | 0.4318 | 0.1990 | 0.0647 | 0.0164 | 0.0034 | 0.0006 | 0.0001 | - | - | - | - | - | - | - | - |
| 3.0 | −0.2601 | 0.3391 | 0.4861 | 0.3091 | 0.1320 | 0.0430 | 0.0114 | 0.0025 | 0.0005 | 0.0001 | - | - | - | - | - | - | - |
| 4.0 | −0.3971 | −0.0660 | 0.3641 | 0.4302 | 0.2811 | 0.1321 | 0.0491 | 0.0152 | 0.0040 | 0.0009 | 0.0002 | - | - | - | - | - | - |
| 5.0 | −0.1776 | −0.3276 | 0.0466 | 0.3648 | 0.3912 | 0.2611 | 0.1310 | 0.0534 | 0.0184 | 0.0055 | 0.0015 | 0.0004 | 0.0001 | - | - | - | - |
| 5.5201 | - | −0.3403 | −0.1233 | 0.2509 | 0.3960 | 0.3230 | 0.1891 | 0.0881 | 0.0344 | 0.0116 | 0.0035 | 0.0009 | 0.0002 | - | - | - | - |
| 6.0 | 0.1506 | −0.2767 | −0.2429 | 0.1148 | 0.3576 | 0.3621 | 0.2458 | 0.1296 | 0.0565 | 0.0212 | 0.0070 | 0.0020 | 0.0005 | 0.0001 | - | - | - |
| 7.0 | 0.3001 | −0.0047 | −0.3014 | −0.1676 | 0.1578 | 0.3479 | 0.3392 | 0.2336 | 0.1280 | 0.0589 | 0.0235 | 0.0083 | 0.0027 | 0.0008 | 0.0002 | 0.0001 | - |
| 8.0 | 0.1717 | 0.2346 | −0.1130 | −0.2911 | −0.1054 | 0.1858 | 0.3376 | 0.3206 | 0.2235 | 0.1263 | 0.0608 | 0.0256 | 0.0096 | 0.0033 | 0.0010 | 0.0003 | 0.0001 |
| 9.0 | −0.0903 | 0.2453 | 0.1448 | −0.1809 | −0.2655 | −0.0550 | 0.2043 | 0.3275 | 0.3051 | 0.2149 | 0.1247 | 0.0622 | 0.0274 | 0.0108 | 0.0039 | 0.0013 | 0.0004 |
| 10 | −0.2459 | 0.0435 | 0.2546 | 0.0584 | −0.2196 | −0.2341 | −0.0145 | 0.2167 | 0.3179 | 0.2919 | 0.2075 | 0.1231 | 0.0634 | 0.0290 | 0.0120 | 0.0045 | 0.0016 |
| 11 | −0.1712 | −0.1768 | 0.1390 | 0.2273 | −0.0150 | −0.2383 | −0.2016 | 0.0184 | 0.2250 | 0.3089 | 0.2804 | 0.2010 | 0.1216 | 0.0643 | 0.0304 | 0.0130 | 0.0051 |
| 12 | 0.0477 | −0.2234 | −0.0849 | 0.1951 | 0.1825 | −0.0735 | −0.2437 | −0.1703 | 0.0451 | 0.2304 | 0.3005 | 0.2704 | 0.1953 | 0.1201 | 0.0650 | 0.0316 | 0.0140 |
| 13 | 0.2069 | −0.0703 | −0.2177 | 0.0033 | 0.2193 | 0.1316 | −0.1180 | −0.2406 | −0.1410 | 0.0670 | 0.2338 | 0.2927 | 0.2615 | 0.1901 | 0.1188 | 0.0656 | 0.0327 |
| 14 | 0.1711 | 0.1334 | −0.1520 | −0.1768 | 0.0762 | 0.2204 | 0.0812 | −0.1508 | −0.2320 | −0.1143 | 0.0850 | 0.2357 | 0.2855 | 0.2536 | 0.1855 | 0.1174 | 0.0661 |
| 15 | −0.0142 | 0.2051 | 0.0416 | −0.1940 | −0.1192 | 0.1305 | 0.2061 | 0.0345 | −0.1740 | −0.2200 | −0.0901 | 0.1000 | 0.2367 | 0.2787 | 0.2464 | 0.1813 | 0.1162 |
| 16 | −0.1749 | 0.0904 | 0.1862 | −0.0438 | −0.2026 | −0.0575 | 0.1667 | 0.1825 | −0.0070 | −0.1895 | −0.2062 | −0.0682 | 0.1124 | 0.2368 | 0.2724 | 0.2399 | 0.1775 |
| 17 | −0.1699 | −0.0977 | 0.1584 | 0.1349 | −0.1107 | −0.1870 | 0.0007 | 0.1875 | 0.1537 | −0.0429 | −0.1991 | −0.1914 | −0.0486 | 0.1228 | 0.2364 | 0.2666 | 0.2340 |
| 18 | −0.0134 | −0.1880 | −0.0075 | 0.1863 | 0.0696 | −0.1554 | −0.1560 | 0.0514 | 0.1959 | 0.1228 | −0.0732 | −0.2041 | −0.1762 | −0.0309 | 0.1316 | 0.2356 | 0.2611 |
| 19 | 0.1466 | −0.1057 | −0.1578 | 0.0725 | 0.1806 | 0.0036 | −0.1788 | −0.1165 | 0.0929 | 0.1947 | 0.0916 | −0.0984 | −0.2055 | −0.1612 | −0.0151 | 0.1389 | 0.2345 |
| 20 | 0.1670 | 0.0668 | −0.1603 | −0.0989 | 0.1307 | 0.1512 | −0.0551 | −0.1842 | −0.0739 | 0.1251 | 0.1865 | 0.0614 | −0.1190 | −0.2041 | −0.1464 | −0.0008 | 0.1452 |
| 21 | 0.0366 | 0.1711 | −0.0203 | −0.1750 | −0.0297 | 0.1637 | 0.1076 | −0.1022 | −0.1757 | −0.0318 | 0.1485 | 0.1732 | 0.0329 | −0.1356 | −0.2008 | −0.1321 | 0.0120 |
| 22 | −0.1207 | 0.1172 | 0.1313 | −0.0933 | −0.1568 | 0.0363 | 0.1733 | 0.0582 | −0.1362 | −0.1573 | 0.0075 | 0.1641 | 0.1566 | 0.0067 | −0.1487 | −0.1959 | −0.1185 |
| 23 | −0.1624 | −0.0395 | 0.1590 | 0.0672 | −0.1415 | −0.1164 | 0.0909 | 0.1638 | 0.0088 | −0.1576 | −0.1322 | 0.0427 | 0.1730 | 0.1379 | −0.0172 | −0.1588 | −0.1899 |
| 24 | −0.0562 | −0.1540 | 0.0434 | 0.1613 | −0.0031 | −0.1623 | −0.0645 | 0.1300 | 0.1404 | −0.0364 | −0.1677 | −0.1033 | 0.0730 | 0.1763 | 0.1180 | −0.0386 | −0.1663 |
| 25 | 0.0963 | −0.1254 | −0.1063 | 0.1083 | 0.1323 | −0.0660 | −0.1587 | −0.0102 | 0.1530 | 0.1081 | −0.0752 | −0.1682 | −0.0729 | 0.0983 | 0.1751 | 0.0978 | −0.0577 |

(Continued)
Table 8.2 (Continued)
For every row from 𝛽 = 0 to 𝛽 = 8.0 (that is, 𝛽 = 0, 0.10, 0.20, 0.25, 0.50, 1.0, 1.5, 2.0, 2.4048, 3.0, 4.0, 5.0, 5.5201, 6.0, 7.0 and 8.0), all entries in columns J17(𝛽) to J30(𝛽) are "—". The remaining rows are as follows.
| 𝛽 | J17(𝛽) | J18(𝛽) | J19(𝛽) | J20(𝛽) | J21(𝛽) | J22(𝛽) | J23(𝛽) | J24(𝛽) | J25(𝛽) | J26(𝛽) | J27(𝛽) | J28(𝛽) | J29(𝛽) | J30(𝛽) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 9.0 | 0.0001 | — | — | — | — | — | — | — | — | — | — | — | — | — |
| 10 | 0.0005 | 0.0002 | — | — | — | — | — | — | — | — | — | — | — | — |
| 11 | 0.0019 | 0.0006 | 0.0002 | 0.0001 | — | — | — | — | — | — | — | — | — | — |
| 12 | 0.0057 | 0.0022 | 0.0008 | 0.0003 | 0.0001 | — | — | — | — | — | — | — | — | — |
| 13 | 0.0149 | 0.0063 | 0.0025 | 0.0009 | 0.0003 | 0.0001 | — | — | — | — | — | — | — | — |
| 14 | 0.0337 | 0.0158 | 0.0068 | 0.0028 | 0.0010 | 0.0004 | 0.0001 | — | — | — | — | — | — | — |
| 15 | 0.0665 | 0.0346 | 0.0166 | 0.0074 | 0.0031 | 0.0012 | 0.0004 | 0.0002 | 0.0001 | — | — | — | — | — |
| 16 | 0.1150 | 0.0668 | 0.0354 | 0.0173 | 0.0079 | 0.0034 | 0.0013 | 0.0005 | 0.0002 | 0.0001 | — | — | — | — |
| 17 | 0.1739 | 0.1138 | 0.0671 | 0.0362 | 0.0180 | 0.0084 | 0.0037 | 0.0015 | 0.0006 | 0.0002 | 0.0001 | — | — | — |
| 18 | 0.2286 | 0.1706 | 0.1127 | 0.0673 | 0.0369 | 0.0187 | 0.0089 | 0.0039 | 0.0017 | 0.0007 | 0.0003 | 0.0001 | — | — |
| 19 | 0.2559 | 0.2235 | 0.1676 | 0.1116 | 0.0675 | 0.0375 | 0.0193 | 0.0093 | 0.0042 | 0.0018 | 0.0007 | 0.0003 | 0.0001 | — |
| 20 | 0.2331 | 0.2511 | 0.2189 | 0.1647 | 0.1106 | 0.0676 | 0.0380 | 0.0199 | 0.0098 | 0.0045 | 0.0020 | 0.0008 | 0.0003 | 0.0001 |
| 21 | 0.1505 | 0.2316 | 0.2465 | 0.2145 | 0.1621 | 0.1097 | 0.0677 | 0.0386 | 0.0205 | 0.0102 | 0.0048 | 0.0021 | 0.0009 | 0.0004 |
| 22 | 0.0236 | 0.1549 | 0.2299 | 0.2422 | 0.2105 | 0.1596 | 0.1087 | 0.0677 | 0.0391 | 0.0210 | 0.0106 | 0.0051 | 0.0023 | 0.0010 |
| 23 | −0.1055 | 0.0340 | 0.1587 | 0.2282 | 0.2381 | 0.2067 | 0.1573 | 0.1078 | 0.0678 | 0.0395 | 0.0215 | 0.0110 | 0.0054 | 0.0025 |
| 24 | −0.1831 | −0.0931 | 0.0435 | 0.1619 | 0.2264 | 0.2343 | 0.2031 | 0.1550 | 0.1070 | 0.0678 | 0.0399 | 0.0220 | 0.0114 | 0.0056 |
| 25 | −0.1717 | −0.1758 | −0.0814 | 0.0520 | 0.1646 | 0.2246 | 0.2306 | 0.1998 | 0.1529 | 0.1061 | 0.0678 | 0.0403 | 0.0225 | 0.0118 |
8.4 Spectrum and Power of FM and PM
depends only on its amplitude, and is independent of frequency and phase, it follows that the power in an FM signal equals the power in the unmodulated carrier. Working with normalised power, we may write
$$P_{fm} = \frac{1}{2}V_c^2 \tag{8.57}$$
Total power in an FM signal can also be obtained by summing the power in all spectral components in the FM spectrum. Thus
$$P_{fm} = \frac{1}{2}V_c^2 J_0^2(\beta) + V_c^2 J_1^2(\beta) + V_c^2 J_2^2(\beta) + V_c^2 J_3^2(\beta) + \cdots = V_c^2\left[\frac{1}{2}J_0^2(\beta) + J_1^2(\beta) + J_2^2(\beta) + J_3^2(\beta) + \cdots\right] \tag{8.58}$$
Obviously, the power in the side frequencies is obtained only at the expense of the carrier power. Equating the two expressions for $P_{fm}$ in Eqs. (8.57) and (8.58), we obtain the mathematical identity
$$J_0^2(\beta) + 2\sum_{n=1}^{\infty} J_n^2(\beta) = 1 \tag{8.59}$$
It follows that the fraction of power $r_N$ contained in the carrier frequency and the first $N$ pairs of side frequencies is given by
$$r_N = J_0^2(\beta) + 2\sum_{n=1}^{N} J_n^2(\beta) \tag{8.60}$$
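Eq. (8.60) is easy to evaluate numerically. The following is a minimal sketch, not the book's method: instead of reading values from Table 8.2, it assumes the standard integral representation $J_n(\beta) = \frac{1}{\pi}\int_0^{\pi}\cos(n\theta - \beta\sin\theta)\,d\theta$ and evaluates it with the trapezoidal rule. The function names `bessel_j` and `power_fraction` and the step count are illustrative choices.

```python
import math


def bessel_j(n: int, beta: float, steps: int = 2000) -> float:
    """Approximate J_n(beta) for integer n.

    Assumption of this sketch: the integral form
    J_n(x) = (1/pi) * integral from 0 to pi of cos(n*t - x*sin(t)) dt
    is evaluated with the trapezoidal rule using `steps` intervals.
    """
    h = math.pi / steps
    # End points t = 0 and t = pi each carry half weight.
    total = 0.5 * (1.0 + math.cos(n * math.pi))
    for k in range(1, steps):
        t = k * h
        total += math.cos(n * t - beta * math.sin(t))
    return total * h / math.pi


def power_fraction(beta: float, n_pairs: int) -> float:
    """Fraction r_N of FM power in the carrier plus N side-frequency pairs, Eq. (8.60)."""
    r = bessel_j(0, beta) ** 2
    for n in range(1, n_pairs + 1):
        r += 2.0 * bessel_j(n, beta) ** 2
    return r


if __name__ == "__main__":
    beta = 5.0
    for n_pairs in (5, 6, 7):  # N = beta, beta + 1, beta + 2
        print(f"beta = {beta}, N = {n_pairs}: r_N = {power_fraction(beta, n_pairs):.4f}")
```

Taking `n_pairs` large makes the result approach 1, which is the identity in Eq. (8.59).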
8.4.2.3 Bandwidth
Two different definitions of FM bandwidth are in common use, namely Carson’s bandwidth and 1% bandwidth. Carson’s bandwidth is an empirical definition, which gives the width of the band of frequencies centred at $f_c$ that contains at least 98% of the power in the FM signal. The 1% bandwidth is defined as the separation between the pair of side frequencies of order $n_{max}$ beyond which there is no spectral component with amplitude up to 1% of the unmodulated carrier amplitude $V_c$.
In Figure 8.21, we employ Eq. (8.60) to show, at various integer values of modulation index $\beta$, the fraction of power $r_N$ carried by spectral components that include $f_c$ and $N$ pairs of side frequencies. It can be observed that with $N = \beta + 1$ the minimum fraction of power is $r_N = 0.9844$. With $N = \beta$, $r_N = 0.9590$ minimum (at $\beta = 6$). With $N = \beta + 2$, $r_N = 0.9905$ (at $\beta = 361$, a point not reached on the plot). Thus, $N = \beta + 1$ is the minimum number of pairs of side frequencies that, along with $f_c$, account for at least 98% of the total FM power. Therefore, Carson’s bandwidth $B_C$ of an FM signal is given by the frequency interval from the $(\beta + 1)$th LSF to the $(\beta + 1)$th USF. Since the spacing of the spectral components is $f_m$, it follows that
$$B_C = 2(\beta + 1)f_m = 2(f_d + f_m) \tag{8.61}$$
where we have used the definition of modulation index given in Eq. (8.11). Note that, in the limit $\beta \ll 1$, Eq. (8.61) yields $B_C = 2f_m$. Thus, the definition of Carson’s bandwidth is consistent with our earlier discussion of NBFM, which also has a bandwidth of $2f_m$.
The 1% bandwidth $B_P$ is given by the width of the frequency interval (centred on the carrier frequency) from the $(n_{max})$th LSF to the $(n_{max})$th USF beyond which the amplitude of every spectral component is less than 1% of the unmodulated carrier amplitude $V_c$. Thus
$$B_P = 2n_{max}f_m \tag{8.62}$$
The value of $n_{max}$ is usually read from a table of Bessel functions such as Table 8.2 by looking along the row specified by the modulation index $\beta$ for the furthest column (or side frequency $n$) whose entry has magnitude ≥ 0.01.
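The two bandwidth definitions in Eqs. (8.61) and (8.62) can be sketched in a few lines of code. This is an illustrative sketch only: it computes $J_n(\beta)$ from the standard integral representation (trapezoidal rule) rather than from Table 8.2, and the search limit for $n_{max}$ is an assumed heuristic that relies on $|J_n(\beta)|$ decaying rapidly once $n$ exceeds $\beta$. All function names are hypothetical.

```python
import math


def bessel_j(n, beta, steps=4000):
    """Approximate J_n(beta) via (1/pi) * integral_0^pi cos(n*t - beta*sin t) dt."""
    h = math.pi / steps
    total = 0.5 * (1.0 + math.cos(n * math.pi))  # half-weight end points
    for k in range(1, steps):
        t = k * h
        total += math.cos(n * t - beta * math.sin(t))
    return total * h / math.pi


def carson_bandwidth(beta, fm):
    """Carson's bandwidth, Eq. (8.61)."""
    return 2.0 * (beta + 1.0) * fm


def one_percent_bandwidth(beta, fm, threshold=0.01):
    """1% bandwidth, Eq. (8.62). Returns (bandwidth, n_max)."""
    n_max = 0
    # Assumed search limit: comfortably above beta, where |J_n| is negligible.
    for n in range(1, int(2 * beta) + 20):
        if abs(bessel_j(n, beta)) >= threshold:
            n_max = n  # remember the furthest side frequency above the threshold
    return 2 * n_max * fm, n_max


if __name__ == "__main__":
    fm = 15e3  # Hz, message frequency used here purely as an example
    for beta in (0.5, 3.0, 5.0):
        bp, n_max = one_percent_bandwidth(beta, fm)
        print(f"beta = {beta}: B_C = {carson_bandwidth(beta, fm) / 1e3:.0f} kHz, "
              f"n_max = {n_max}, B_P = {bp / 1e3:.0f} kHz")
```

For a nonsinusoidal message, the same functions can be called with the deviation ratio $D$ in place of $\beta$.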
8 Frequency and Phase Modulation
Figure 8.21 Fraction $r_N$ (expressed as a percentage) of power in FM signal carried by spectral components at the unmodulated carrier frequency $f_c$ and $N$ pairs of side frequencies, as a function of integer values of modulation index $\beta$. [Plot: $r_N$ (%), from 95.5 to 100, versus modulation index $\beta$, from 0 to 100, with curves for $N = \beta$, $N = \beta + 1$ and $N = \beta + 2$.]
For example, if $\beta = 5.0$ we look in Table 8.2 along the row $\beta = 5.0$ and find that the furthest column with an entry of magnitude ≥ 0.01 is the $J_8(\beta)$ column. Therefore, $n_{max} = 8$ in this case and $B_P = 16f_m$, whereas (for comparison) $B_C = 12f_m$. At practical values of modulation index $\beta \le 100$, $B_P$ accounts for at least 99.97% of the total FM power, compared to a minimum of 98.44% in Carson’s bandwidth $B_C$.
Figure 8.22 shows the normalised (i.e. $f_m = 1$ Hz) Carson’s and 1% bandwidths as functions of modulation index $\beta$. It can be observed that $B_P$ exceeds $B_C$ for $\beta > 0.5$. However, as $\beta \to \infty$, the two bandwidths become equal (at $\beta \approx 78\,474$), and eventually $B_C$ exceeds $B_P$, since the carrier power is increasingly distributed in smaller fractions to a larger number of side frequencies. One disadvantage of the 1% bandwidth definition is that, unlike Carson’s bandwidth, it is not equal to the NBFM bandwidth of $2f_m$ in the limit $\beta \to 0$. Rather, $B_P = 0$ for $\beta \le 0.02$.
Figure 8.22 Normalised bandwidth ($f_m = 1$). [Plot: normalised Carson’s bandwidth and 1% bandwidth, from 0 to 215, versus modulation index $\beta$, from 0 to 100.]
The above discussion of FM bandwidth assumes a sinusoidal modulating signal. However, the bandwidth definitions in Eq. (8.61) and Eq. (8.62) can be applied to the more general case in which the modulating signal consists of frequencies up to a maximum $f_m$. To do this, the modulation index $\beta$ is replaced by the deviation ratio $D$ defined in Eq. (8.13).
Worked Example 8.4
An audio signal that contains frequencies in the range of 30 Hz to 15 kHz is used to frequency modulate a 100 MHz carrier causing a frequency deviation of 45 kHz. Determine the Carson’s and 1% transmission bandwidths.
This is an example of a nonsinusoidal modulating signal, in which deviation ratio $D$ plays the role that modulation index $\beta$ plays when the modulating signal is sinusoidal. The maximum frequency component of the message signal is $f_m = 15$ kHz and the frequency deviation $f_d = 45$ kHz, giving
$$D = f_d/f_m = 3.0$$
Thus, Carson’s bandwidth
$$B_C = 2(D + 1)f_m = 2(4)(15) = 120 \text{ kHz}$$
To determine the 1% bandwidth $B_P$, we look in Table 8.2 along row $\beta = 3.0$ (since $D = 3.0$), where we find that all entries beyond $J_6(\beta)$ are less than 0.01. Thus, $n_{max} = 6$, and
$$B_P = 2n_{max}f_m = 2(6)(15) = 180 \text{ kHz}$$
Observe that $B_P > B_C$ as discussed earlier.
Worked Example 8.5
A sinusoidal signal of frequency 15 kHz modulates the frequency of a 10 V 100 MHz carrier, causing a frequency deviation of 75 kHz.
(a) Sketch the amplitude spectrum of the FM signal, including all spectral components of amplitude larger than 1% of the amplitude of the unmodulated carrier frequency.
(b) Determine the fraction of the total power contained in the frequency band 99.93–100.07 MHz.
(a) Modulation index $\beta = f_d/f_m = 75/15 = 5.0$, $f_c = 100$ MHz, and $V_c = 10$ V. Using row $\beta = 5.0$ in Table 8.2 we obtain the amplitudes of the frequency components in the FM spectrum as follows.
Carrier frequency $f_c = 100$ MHz, of amplitude $V_c|J_0| = 10(0.1776) = 1.78$ V.
First pair of side frequencies $f_c \pm f_m$ = 99 985 and 100 015 kHz, of amplitude $V_c|J_1| = 10(0.3276) = 3.28$ V.
Second pair of side frequencies $f_c \pm 2f_m$ = 99 970 and 100 030 kHz, of amplitude $V_c|J_2| = 10(0.0466) = 0.47$ V.
Third pair of side frequencies $f_c \pm 3f_m$ = 99 955 and 100 045 kHz, of amplitude $V_c|J_3| = 10(0.3648) = 3.65$ V.
Fourth pair of side frequencies $f_c \pm 4f_m$ = 99 940 and 100 060 kHz, of amplitude $V_c|J_4| = 10(0.3912) = 3.91$ V.
Fifth pair of side frequencies $f_c \pm 5f_m$ = 99 925 and 100 075 kHz, of amplitude $V_c|J_5| = 10(0.2611) = 2.61$ V.
Sixth pair of side frequencies $f_c \pm 6f_m$ = 99 910 and 100 090 kHz, of amplitude $V_c|J_6| = 10(0.1310) = 1.31$ V.
Seventh pair of side frequencies $f_c \pm 7f_m$ = 99 895 and 100 105 kHz, of amplitude $V_c|J_7| = 10(0.0534) = 0.53$ V.
Eighth pair of side frequencies $f_c \pm 8f_m$ = 99 880 and 100 120 kHz, of amplitude $V_c|J_8| = 10(0.0184) = 0.18$ V.
Figure 8.23 Single-sided amplitude spectrum of FM signal in Worked Example 8.5. [Plot: amplitude $A_{fm}$ (volt) versus frequency $f$ (MHz), from 99.88 to 100.12 MHz, with the carrier, the 4th LSF and the 4th USF labelled and the band 99.93 to 100.07 MHz marked.]
The ninth and higher side frequencies are ignored since they correspond to table entries […]
[…] > $1/f_c$ ensures that there are zero crossings within the interval $T$, whereas the second condition $T \ll 1/f_m$ avoids excessive averaging of the message signal.
8.6.2 Indirect Demodulator
The indirect method of FM demodulation uses a PLL to match the frequency of a VCO to the frequency variations in the FM signal. When the VCO correctly tracks the FM signal in frequency, it follows that its control voltage corresponds exactly to the original message signal. Figure 8.33a shows a block diagram of the PLL (FM) demodulator.
Figure 8.32 Direct FM demodulator using frequency discriminator: (a) Block diagram; (b)–(e) Waveforms. [Block diagram: $v'_{fm}(t)$ → Limiter → $v_{fm}(t)$ → Frequency discriminator → $v_{afm}(t)$ → Envelope demodulator → $v_m(t)$. Waveform panels: (b) $v'_{fm}(t)$; (c) $v_{fm}(t)$; (d) $v_{afm}(t)$; (e) $v_m(t)$.]
The PLL consists of a phase discriminator with a VCO in its feedback path, which is arranged such that the output $v_v(t)$ of the VCO tends towards the same frequency as the FM signal input to the phase discriminator. The phase discriminator consists of a multiplier or product modulator followed by a lowpass filter (LPF). The LPF rejects the sum-frequency output of the multiplier but passes the difference-frequency output, denoted $v_p(t)$, which serves as the control voltage of the VCO. The VCO is set so that when its input or control voltage is zero its output $v_v(t)$ has the same frequency as the unmodulated carrier, but with a phase difference of $\pi/2$ rad. Thus, if the (normalised) unmodulated carrier is $v_c(t) = \cos(2\pi f_c t)$ then, from Eq. (8.32), the VCO output at any time is given by
$$v_v(t) = \cos(2\pi f_c t + \pi/2 + \phi_v) \tag{8.79}$$
where
$$\phi_v = 2\pi k_v \int_0^t v_p(t)\,dt \tag{8.80}$$
$k_v$ is the frequency sensitivity of the VCO and $\phi_v = 0$ when the VCO is free running.
8.6 FM and PM Demodulators
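Figure 8.33 (a) Indirect frequency demodulator using phase-locked loop (PLL); (b) Staircase approximation of original message signal $v_m(t)$. [Block diagram (a): the FM signal $v_{fm}(t)$ enters a phase discriminator (product modulator followed by LPF) whose output $v_p(t)$ is the message signal; $v_p(t)$ also drives a voltage controlled oscillator whose output $v_v(t)$ is fed back to the product modulator. Plot (b): $v_m(t)$ versus $t$, with steps $v_m(0)$, $v_m(\Delta t)$, $v_m(2\Delta t)$ at $t = 0$, $\Delta t$, $2\Delta t$.]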
8.6.2.1 PLL Demodulation Process
For a simplified view of how the PLL demodulates the input FM signal $v_{fm}(t)$, let us assume a staircase approximation of the original message signal $v_m(t)$ as shown in Figure 8.33b and reckon time from the instant when the VCO is free-running and $v_{fm}$ has frequency $f_c$. That is, the original message signal is zero at this instant ($t = 0$), so we may write $v_m(0) = 0$. Furthermore, let us assume that the VCO output only changes at the discrete time instants $t = \Delta t, 2\Delta t, \ldots$ The phase discriminator output $v_p(t)$ at $t = 0$ is the difference-frequency product of $v_c(t)$ and $v_v(t)$ with $\phi_v = 0$
$$v_p(0) = \cos(\pi/2) = 0$$
Now consider the situation at $t = \Delta t$. The message signal made a step increase to the value $v_m(\Delta t)$ at time $t = 0$, and therefore just before this time ($t = \Delta t$) the phase of $v_{fm}(t)$ has gained on the phase of the VCO output $v_v(t)$, which is still free-running, by a small amount $\phi_{fm1}$ given by
$$\phi_{fm1} = [2\pi k_f v_m(\Delta t)]\Delta t \tag{8.81}$$
At the instant $t = \Delta t$, the VCO re-establishes tracking by making a step change of $\phi_{v1} = \phi_{fm1}$ in its phase. From Eq. (8.80), this requires a control voltage of average value $v_{pa1}$ during the interval $t = 0$ to $\Delta t$ given by the relation
$$\phi_{v1} = 2\pi k_v v_{pa1}\Delta t \tag{8.82}$$
Equating the right-hand side of Eqs. (8.81) and (8.82) yields the result
$$v_{pa1} = \frac{k_f}{k_v}v_m(\Delta t)$$
By the next time instant $t = 2\Delta t$, the FM signal phase has increased to
$$\phi_{fm2} = \phi_{fm1} + [2\pi k_f v_m(2\Delta t)]\Delta t \tag{8.83}$$
and the VCO again makes a step change in the phase of $v_v(t)$ to the new value
$$\phi_{v2} = \phi_{v1} + 2\pi k_v v_{pa2}\Delta t$$
where $v_{pa2}$ is the average value of $v_p(t)$ in the interval $t = \Delta t$ to $2\Delta t$. Equating the expressions for $\phi_{v2}$ and $\phi_{fm2}$, and noting that $\phi_{v1} = \phi_{fm1}$, we obtain
$$v_{pa2} = \frac{k_f}{k_v}v_m(2\Delta t)$$
It follows by a similar argument that the average value of $v_p(t)$ in the $n$th time interval is related to the original message signal by
$$v_{pa,n} = \frac{k_f}{k_v}v_m(n\Delta t), \quad n = 1, 2, 3, \cdots \tag{8.84}$$
In the limit $\Delta t \to 0$, tracking becomes continuous, and the average values $v_{pa1}, v_{pa2}, \ldots$ become the instantaneous values of $v_p(t)$, and we may write
$$v_p(t) = \frac{k_f}{k_v}v_m(t) \tag{8.85}$$
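The staircase argument leading to Eq. (8.84) can be followed step by step in code. The sketch below is an idealised bookkeeping of the phases only (it does not simulate a real loop filter or multiplier), and the function name `staircase_pll` and the numerical values of $k_f$, $k_v$ and $\Delta t$ are assumptions chosen for illustration.

```python
import math


def staircase_pll(vm_steps, kf, kv, dt):
    """Return the average VCO control voltages v_pa,n for a staircase message.

    vm_steps[n-1] is v_m(n*dt); kf and kv are the frequency sensitivities
    (Hz/V) of the FM modulator and of the VCO respectively.
    """
    phi_fm = 0.0  # phase gained by the FM signal relative to the carrier
    phi_v = 0.0   # phase of the VCO relative to its free-running phase
    vpa = []
    for vm in vm_steps:
        # FM phase advances during the interval, as in Eqs. (8.81) and (8.83).
        phi_fm += 2 * math.pi * kf * vm * dt
        # Average control voltage the VCO needs in order to catch up,
        # from the relation in Eq. (8.82).
        v = (phi_fm - phi_v) / (2 * math.pi * kv * dt)
        # VCO re-establishes tracking at the end of the interval.
        phi_v += 2 * math.pi * kv * v * dt
        vpa.append(v)
    return vpa


if __name__ == "__main__":
    kf, kv, dt = 75e3, 50e3, 1e-6          # illustrative values only
    message = [0.2, 0.5, -0.3, 0.1]        # v_m(dt), v_m(2dt), ...
    for vm, v in zip(message, staircase_pll(message, kf, kv, dt)):
        print(f"v_m = {vm:+.2f} V  ->  v_pa = {v:+.4f} V  (kf/kv * v_m = {kf / kv * vm:+.4f} V)")
```

Each recovered value equals the message step scaled by $k_f/k_v$, independent of $\Delta t$, which is why the limit $\Delta t \to 0$ gives Eq. (8.85).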
Equation (8.85) shows that, when the PLL is in the tracking or phase-locked state, its output gives the original message signal $v_m(t)$, with a scaling factor $k_f/k_v$. FM demodulation has therefore been achieved. The LPF is designed with a bandwidth that is just enough to pass the original message signal, e.g. up to 15 kHz for audio signals.
8.6.2.2 PLL States
The PLL can operate in three different states. It is in the free-running state when the VCO control voltage is zero, or when the input and VCO frequencies are too far apart. In this state the VCO output $v_v(t)$ oscillates at a frequency determined by an external timing capacitor (in the usual IC implementation of PLLs). The tracking or locked state is discussed in Section 8.6.2.1. Once locked to an input signal $v_{fm}(t)$, the PLL can vary the frequency of $v_v(t)$ to keep step with the frequency variations of $v_{fm}(t)$ over a wide but finite band, referred to as the lock range. The PLL is said to be in the capture state when it is in the process of continuously changing the frequency of $v_v(t)$ until it is equal to the frequency of the input signal $v_{fm}(t)$, i.e. until lock is achieved and the tracking state ensues. However, lock will not be achieved if the initial separation between the input frequency and the VCO frequency is too large. The band of input signal frequencies over which the PLL will capture the signal is known as the capture range. This is usually much narrower than the lock range. Both the capture and lock ranges are centred at the free-running frequency.
8.6.2.3 PLL Features
The PLL is an extremely versatile device with numerous applications in electronics. It provides excellent FM demodulation with several advantages over discriminator-based direct demodulation. For example, it is not sensitive to amplitude variations in the FM signal, so a limiting circuit is not required. Since its circuit does not include inductors, the PLL is more easily implemented in IC form. It also obviates the need for complicated coil adjustments. The PLL is also very linear, giving accurate recovery of the original message signal. Furthermore, the PLL’s feature of capturing or passing only signals that are within a small band makes it very effective at rejecting noise and interference and gives it a superior signal-to-noise ratio.
8.6.3 Phase Demodulator
A phase demodulator can be easily derived from the frequency demodulators discussed above, based on the theoretical relationships between PM and FM. To see how this may be achieved, consider Eq. (8.21), which gives the instantaneous frequency $f_i$ of a PM signal. Note that the frequency variation in a PM signal is proportional to the derivative of the message signal. We were able to use the above FM demodulator schemes because frequency variation in FM is proportional to the message signal. See Eq. (8.4). Thus, if a PM signal is the input to any of the above FM demodulators, the output signal will be the derivative of the original message signal. The original message signal can then be obtained by following the FM demodulator with an integrator. A PM demodulator is therefore as shown in Figure 8.34.
Figure 8.34 Phase demodulator. [Block diagram: PM signal $v_{pm}(t)$ → Frequency demodulator → $\frac{d}{dt}v_m(t)$ → Integrator → message signal $v_m(t)$.]
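The integrator stage of Figure 8.34 can be illustrated numerically. The sketch below assumes an ideal frequency demodulator whose output is exactly the derivative of the message (all scale factors are taken as 1, which is an assumption of the example), and uses a running trapezoidal sum as the integrator. The function name `pm_demodulate` and the test tone are hypothetical.

```python
import math


def pm_demodulate(fm_demod_output, dt, initial=0.0):
    """Integrate the output of an FM demodulator to recover the message.

    fm_demod_output: uniformly spaced samples of d(v_m)/dt.
    dt: sample spacing in seconds.
    initial: message value at the first sample (the integration constant).
    """
    recovered = [initial]
    for a, b in zip(fm_demod_output, fm_demod_output[1:]):
        # Trapezoidal rule: add the area under one sample interval.
        recovered.append(recovered[-1] + 0.5 * (a + b) * dt)
    return recovered


if __name__ == "__main__":
    f_tone, dt = 1000.0, 1e-6                      # illustrative 1 kHz message
    t = [k * dt for k in range(2001)]
    message = [math.sin(2 * math.pi * f_tone * tk) for tk in t]
    # What an ideal FM demodulator would deliver when fed the PM signal.
    derivative = [2 * math.pi * f_tone * math.cos(2 * math.pi * f_tone * tk) for tk in t]
    recovered = pm_demodulate(derivative, dt, initial=message[0])
    worst = max(abs(r - m) for r, m in zip(recovered, message))
    print(f"largest recovery error over two message cycles: {worst:.2e}")
```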
8.6.4 Frequency Discriminators
Frequency discriminators can be implemented using various circuits. The main requirements are that the frequency response $H(f)$ of the circuit should be linear in the frequency band of interest and should change rapidly with frequency to give acceptable demodulation sensitivity. Thus, any circuit whose amplitude response can be represented by the following linear function of frequency $f$ will function properly as a discriminator
$$|H(f)| = k_1 + k_2 f \tag{8.86}$$
Here $k_1$ is a constant, which may be zero, but $k_2$ is a nonzero constant that determines the sensitivity with which frequency variation in the FM signal at the discriminator input is converted to amplitude variation in the AM/FM hybrid signal at the output. A few examples of discriminators are discussed in the following sections.
8.6.4.1 Differentiators
We want to show that a differentiator satisfies Eq. (8.86), which means that it can be used as a frequency discriminator. Let the input to the differentiator in Figure 8.35 be a sinusoidal signal of frequency $f$
$$x(t) = \cos(2\pi f t)$$
Then the output signal is given by
$$y = \frac{dx}{dt} = -2\pi f\sin(2\pi f t) = 2\pi f\cos(2\pi f t + \pi/2) \equiv |H(f)|\cos[2\pi f t + \phi_H(f)]$$
Note that, as discussed in Section 4.7, the effect of the differentiator is to change the amplitude of the input sinusoid by a factor $|H(f)|$ equal to its amplitude response, and to shift the phase of the input sinusoid by an amount $\phi_H(f)$ equal to its phase response. It follows that the frequency response of a differentiator is given by
$$H(f) \equiv |H(f)|\exp[j\phi_H(f)] = 2\pi f\exp(j\pi/2) = 2\pi f[\cos(\pi/2) + j\sin(\pi/2)] = j2\pi f \tag{8.87}$$
whose amplitude response $|H(f)| = 2\pi f$ satisfies Eq. (8.86) with $k_1 = 0$ and $k_2 = 2\pi$. Thus, FM demodulation can be carried out by differentiating the FM signal followed by envelope demodulation. Question 8.15 addresses this further. Eq. (8.87) shows that, in general, if the transfer function of a circuit in a specified frequency range can be approximated by $H(f) = jKf$, where $K$ is a constant, then that circuit acts as a differentiator of gain $K/2\pi$ in that frequency band. Note that the frequency response of a differentiator (see Figure 8.35b) is linear over the entire frequency axis, not just within a limited band of interest. We will consider two examples of differentiator circuits.
Figure 8.35 Differentiator: (a) block diagram; (b) amplitude response; (c) approximation. [Panels: (a) input $x(t)$ → Differentiator $H(f)$ → output $y(t) = dx/dt$; (b) $|H(f)|$ versus $f$, a straight line of slope $k_2 = 2\pi$; (c) $x(t)$ versus $t$, showing the slope approximated from the samples $x(t)$ and $x(t - \tau)$.]
8.6.4.1.1 Delay-line Differentiator
The output of a differentiator is the derivative of its input signal, which, according to Figure 8.35c, may be approximated by
$$y(t) = \frac{x(t) - x(t - \tau)}{\tau} \tag{8.88}$$
where $\tau$ is the time delay between adjacent sample values, which must be small enough to satisfy the condition
$$\tau \ll \frac{1}{2\pi f_c}$$
Eq. (8.88) suggests a system known as a delay-line differentiator that uses a delay line and a summing device to obtain the difference between adjacent input signal samples, followed by a gain factor of $1/\tau$. Since a delay of $\tau$ (on a signal having frequency $f$) is equivalent to a phase shift of $2\pi f\tau$, the demodulator realised using a delay-line differentiator followed by an envelope demodulator is referred to as a phase shift demodulator. This is shown in Figure 8.36.
Figure 8.36 Phase shift demodulator. [Block diagram: the FM signal $v_{fm}(t)$ and its delayed copy $v_{fm}(t - \tau)$ (from a delay $\tau$) are subtracted in a summing device and scaled by $1/\tau$, forming the delay-line differentiator; this is followed by an envelope demodulator that outputs the message signal $v_m(t)$.]
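How small $\tau$ must be can be seen by comparing the frequency response of Eq. (8.88) with that of the ideal differentiator in Eq. (8.87). Applying Eq. (8.88) to a complex exponential of frequency $f$ gives $H(f) = (1 - e^{-j2\pi f\tau})/\tau$, which is a derived expression not stated in the text. The sketch below evaluates both responses; the function names and the values of $f$ and $\tau$ are illustrative assumptions.

```python
import cmath
import math


def delay_line_response(f, tau):
    """Frequency response of y(t) = [x(t) - x(t - tau)] / tau, from Eq. (8.88)."""
    return (1 - cmath.exp(-1j * 2 * math.pi * f * tau)) / tau


def ideal_differentiator_response(f):
    """Frequency response of an ideal differentiator, Eq. (8.87)."""
    return 1j * 2 * math.pi * f


if __name__ == "__main__":
    f = 10.7e6  # Hz, an illustrative carrier (IF) frequency
    for tau in (1e-9, 5e-9, 20e-9):
        h = delay_line_response(f, tau)
        ratio = abs(h) / abs(ideal_differentiator_response(f))
        print(f"2*pi*f*tau = {2 * math.pi * f * tau:.3f}: "
              f"|H_delay| / |H_ideal| = {ratio:.4f}, "
              f"phase = {math.degrees(cmath.phase(h)):.1f} deg")
```

As $2\pi f\tau$ shrinks, the amplitude ratio tends to 1 and the phase tends to 90°, in line with the condition $\tau \ll 1/(2\pi f_c)$.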
8.6.4.1.2 Resistor-capacitor (RC) Differentiator
The simple highpass RC filter circuit shown in Figure 8.37a has the frequency response
$$H(f) = \frac{R}{R + 1/(j2\pi fC)} = \frac{j2\pi fRC}{1 + j2\pi fRC} \approx j2\pi fRC, \quad \text{for } 2\pi fRC \ll 1 \tag{8.89}$$
It therefore acts as a differentiator of gain $RC$ at low frequencies ($f \ll 1/2\pi RC$), which is the linear portion of the frequency response shown (shaded) in Figure 8.37b. The RC filter will work as a discriminator for FM demodulation if the frequency band $f_c \pm f_d$ falls in this linear region. Assuming $f_c \sim 10$ MHz and $R = 1$ kΩ, this requires a capacitance $C$ of less than about 3.0 pF, which is a small value indeed, comparable to stray capacitances in circuits. An LC BPF is more suitable for linear frequency response approximation at the usual high values of $f_c$, an intermediate frequency (IF) value of 10.7 MHz being common in FM broadcast receivers.
Figure 8.37 (a) Simple RC highpass filter; (b) frequency response. [Panels: (a) circuit with capacitor $C$ and resistor $R$ between input $v_i$ and output $v_o$; (b) gain $v_o/v_i$ versus frequency, with the linear and nonlinear portions marked.]
8.6.4.2 Tuned Circuits
A tuned (RLC) circuit can also be used as a discriminator if it is detuned such that the unmodulated carrier frequency falls in the centre of the linear portion of the circuit’s gain response, as shown in Figure 8.38. The main limitation of this simple circuit is that the linear portion of the gain response is very narrow and not enough for WBFM, which has a large frequency deviation. The linear portion can be extended by using two back-to-back BPFs tuned to two different resonant frequencies spaced equally below and above the carrier frequency. This gives what is referred to as a balanced discriminator circuit, an example of which is shown in Figure 8.39a along with the resulting frequency response in (b). In the circuit, $L$ and $C$ are tuned to frequency $f_c$, $L_1$ and $C_1$ are tuned to frequency $f_1$, and $L_2$ and $C_2$ to frequency $f_2$. The remaining part of the circuit provides envelope demodulation.
Figure 8.38 Frequency response of tuned circuit showing linear portion. [Plot: output voltage versus input frequency, with the linear portion of width $\Delta f$ and height $\Delta V$ centred on $f_c$.]
Figure 8.39 Balanced discriminator: (a) circuit diagram; (b) frequency response. [Panels: (a) circuit from FM in to message out, comprising the balanced discriminator ($L$, $C$, $L_1$, $C_1$, $L_2$, $C_2$) followed by the envelope demodulator; (b) output voltage versus frequency, showing the extended linear region around $f_c$ between $f_1$ and $f_2$.]
8.7 FM Transmitter and Receiver
8.7.1 Transmitter
The basic elements of a broadcast FM transmitter are shown in Figure 8.40. The output of the WBFM modulator, the operation of which is described in connection with Figure 8.26, is amplified into a high-power FM signal and coupled to an antenna for radiation. There is, however, a new processing block, called pre-emphasis, which operates on the message signal before it is applied to angle modulate the carrier. The role of pre-emphasis is to improve the noise performance of the FM communication system, as briefly discussed below.
Figure 8.40 FM transmitter. [Block diagram: Audio input → Pre-emphasis → Wideband FM modulator → RF power amplifier → high-power FM to the antenna; crystal oscillators supply $f_{c1}$ and $f_{LO}$ to the modulator.]
Figure 8.41 Worst-case phase error $\phi_e$ due to noise voltage $V_n$. [Phasor diagram: $V_{fm}$ and $V_n$ at right angles, with resultant $V_{fmn}$ at angle $\phi_e$ to $V_{fm}$.]
The effect of additive noise is to alter both the amplitude and phase of the transmitted FM signal, as illustrated in Figure 8.41. A signal of amplitude $V_{fm}$ is transmitted, but because of additive noise of random amplitude $V_n$, the demodulator receives a signal $V_{fmn}$, which is the sum of $V_{fm}$ and $V_n$. The noise voltage $V_n$ can have any phase relative to the FM signal $V_{fm}$ but is shown in the phasor diagram with a phase of 90° (relative to $V_{fm}$), where the phase error $\phi_e$ is maximum for a given noise amplitude. Since an FM signal carries information in its angle, the amplitude error in $V_{fmn}$ is of no effect and all amplitude variations are removed by a limiting circuit prior to demodulation. However, the phase error $\phi_e$ will corrupt the received message signal since the demodulator has no way of knowing that the additional phase shift comes from noise. The maximum phase error $\phi_e$ increases with noise amplitude up to a value of $\pi/4$ rad at $V_n = V_{fm}$. Using Eq. (8.28), we see that this phase error will lead to an error $f_{dn}$ in the frequency deviation, which translates to an error $V_{nr}$ in the received message signal given by
$$f_{dn} = k_f V_{nr} = \phi_e f_m \tag{8.90}$$
where $f_m$ is the frequency of the message signal. We make two important observations:
(i) The effect of the phase error increases with message signal frequency. That is, the voltage error due to noise is larger for the higher-frequency components of the message signal. The problem is further compounded by the fact that the higher-frequency components of the message signal are usually of small amplitudes. As a result, the ratio s/n between signal and noise voltages decreases significantly with frequency. To reduce this problem, the amplitude of the high-frequency components of the message signal can be artificially boosted prior to modulation. This process is known as pre-emphasis, and the circuit that is used is called a pre-emphasis filter. This introduces a known distortion into the message signal, which is readily removed at the receiver after demodulation by applying a filter whose transfer function is the exact inverse of the pre-emphasis filter. This second operation, performed at the receiver, is known as de-emphasis.
(ii) The effect of noise, as indicated by the error voltage $V_{nr}$ in the demodulated signal, can be reduced arbitrarily by increasing frequency sensitivity $k_f$. Since modulation index $\beta = k_f V_m/f_m$, this amounts to increasing modulation index, and hence transmission bandwidth. Thus, FM provides an effective means of trading bandwidth for reduced noise degradation, or improved noise performance.
8.7.2 SNR and Bandwidth Trade-off
We can obtain a rough calculation of the worst-case improvement in noise performance afforded by FM as follows. From Figure 8.41, the ratio between signal voltage and noise voltage at the input of the demodulator is
$$(s/n)_i = V_{fm}/V_n$$
If $V_n$ is small compared to $V_{fm}$, i.e. $(s/n)_i \gg 1$, then the phase error can be approximated by
$$\tan\phi_e = \frac{V_n}{V_{fm}} = \frac{1}{(s/n)_i} \approx \phi_e \tag{8.91}$$
Note that $\phi_e$ is the largest possible phase error due to noise of amplitude $V_n$ and leads to the worst-case or smallest ratio between signal voltage and noise voltage at the output of the demodulator. Let us denote this ratio as $(s/n)_o$. It is clearly given by the ratio between the frequency deviation $f_d$ due to the message signal and the frequency deviation $f_{dn}$ due to noise, since the demodulator translates frequency variation to voltage variation. Thus
$$(s/n)_o = \frac{f_d}{f_{dn}} = \frac{f_d}{\phi_e f_m} = \frac{\beta}{\phi_e} = \beta(s/n)_i \tag{8.92}$$
where we have used Eq. (8.91). Since the square of voltage gives normalised power, we square both sides of Eq. (8.92) to obtain the relationship between the output signal-to-noise power ratio $\text{SNR}_o$ and the input signal-to-noise power ratio $\text{SNR}_i$
$$\text{SNR}_o = \beta^2(\text{SNR}_i) \tag{8.93}$$
Equation (8.93) is an important result, which shows that FM gives an improvement in output signal-to-noise power ratio that increases as the square of modulation index $\beta$. However, this is the worst-case scenario: the actual improvement is usually better than stated in Eq. (8.93). When account is taken of the random variation of the phase of the noise voltage $V_n$ relative to $V_{fm}$, then the effective phase error is less than $\phi_e$. The exact relationship between $\text{SNR}_o$ and $\text{SNR}_i$ (for large $\text{SNR}_i$) is derived in Section 8.8.
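To put numbers on Eqs. (8.91) and (8.93), the following sketch evaluates the worst-case phase error and the resulting improvement in decibels, where Eq. (8.93) becomes $\text{SNR}_o(\text{dB}) = \text{SNR}_i(\text{dB}) + 20\log_{10}\beta$. The function names and the sample values of $\beta$ and input SNR are illustrative assumptions, and the result is the rough worst-case estimate of this section, not the exact relationship of Section 8.8.

```python
import math


def worst_case_phase_error(sn_voltage_in):
    """Worst-case phase error (rad) for input voltage ratio (s/n)_i = Vfm/Vn, Eq. (8.91)."""
    return math.atan(1.0 / sn_voltage_in)


def output_snr_db(snr_in_db, beta):
    """Worst-case output SNR in dB from Eq. (8.93): SNR_o = beta**2 * SNR_i."""
    return snr_in_db + 20.0 * math.log10(beta)


if __name__ == "__main__":
    # Small-angle check of Eq. (8.91): phi_e is close to 1/(s/n)_i when (s/n)_i >> 1.
    for sn in (1.0, 10.0, 100.0):
        print(f"(s/n)_i = {sn:5.1f}: phi_e = {worst_case_phase_error(sn):.4f} rad, "
              f"1/(s/n)_i = {1.0 / sn:.4f}")
    # Bandwidth traded for SNR: larger beta gives a larger improvement.
    for beta in (1.0, 3.0, 5.0):
        print(f"beta = {beta}: SNR_i = 20 dB -> SNR_o = {output_snr_db(20.0, beta):.2f} dB")
```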
8.7.3 Pre-emphasis and De-emphasis
Simple RC filters are usually employed for pre-emphasis and de-emphasis. Figure 8.42 shows the pre-emphasis and de-emphasis circuits used for standard FM broadcasting. The frequency response of the pre-emphasis filter is given by
$$H_p(f) = \frac{R_2}{R_2 + \dfrac{R_1}{1 + j2\pi fR_1C}} = K\,\frac{1 + jf/f_1}{1 + jf/f_2} \tag{8.94}$$
where
$$K = \frac{R_2}{R_1 + R_2}; \quad f_1 = \frac{1}{2\pi\tau_1} = \frac{1}{2\pi R_1C}; \quad f_2 = \frac{1}{2\pi\tau_2} = \frac{1}{2\pi RC}; \quad R = \frac{R_1R_2}{R_1 + R_2}$$
Similarly, the frequency response of the de-emphasis filter is
$$H_d(f) = \frac{1}{1 + jf/f_1} \tag{8.95}$$
with $f_1$ given as above. The two frequency responses are plotted in Figure 8.43.
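Eqs. (8.94) and (8.95) can be evaluated directly to see how de-emphasis undoes pre-emphasis. In the sketch below the component values ($R_1$ = 75 kΩ, $R_2$ = 5 kΩ, $C$ = 1 nF) are hypothetical, chosen only so that $\tau_1 = R_1C = 75$ μs; they are not taken from the text, and the function names are illustrative.

```python
import math


def corner_frequencies(r1, r2, c):
    """Return (f1, f2) of the pre-emphasis filter, as defined under Eq. (8.94)."""
    r_parallel = r1 * r2 / (r1 + r2)
    f1 = 1.0 / (2 * math.pi * r1 * c)
    f2 = 1.0 / (2 * math.pi * r_parallel * c)
    return f1, f2


def preemphasis_response(f, r1, r2, c):
    """H_p(f) of Eq. (8.94) in its corner-frequency form."""
    f1, f2 = corner_frequencies(r1, r2, c)
    k = r2 / (r1 + r2)
    return k * complex(1.0, f / f1) / complex(1.0, f / f2)


def deemphasis_response(f, r1, c):
    """H_d(f) of Eq. (8.95)."""
    f1 = 1.0 / (2 * math.pi * r1 * c)
    return 1.0 / complex(1.0, f / f1)


if __name__ == "__main__":
    r1, r2, c = 75e3, 5e3, 1e-9  # hypothetical values giving tau_1 = 75 microseconds
    f1, f2 = corner_frequencies(r1, r2, c)
    print(f"f1 = {f1:.0f} Hz, f2 = {f2:.0f} Hz")
    for f in (100.0, 2122.0, 15e3):
        hp = preemphasis_response(f, r1, r2, c)
        hd = deemphasis_response(f, r1, c)
        print(f"f = {f:7.0f} Hz: |Hp| = {abs(hp):.4f}, |Hd| = {abs(hd):.4f}, "
              f"|Hp*Hd| = {abs(hp * hd):.4f}")
```

The product $H_p(f)H_d(f) = K/(1 + jf/f_2)$ stays close to the constant $K$ for $f \ll f_2$, which is why $f_2$ is placed well above the highest message frequency.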
Figure 8.42 (a) Pre-emphasis filter; (b) de-emphasis filter. [Circuit diagrams: (a) components $C$, $R_1$ and $R_2$ between input and output; (b) components $R_1$ and $C$ between input and output.]
Figure 8.43 Frequency response of (a) pre-emphasis filter and (b) de-emphasis filter. [Plots: (a) $\log_{10}|H_p(f)|$ and (b) $\log_{10}|H_d(f)|$ versus $\log_{10}(f)$, with $f_1$ and $f_m$ marked on the frequency axis.]
In FM audio broadcasting, the RC time constant 𝜏 1 = 75 μs, so that the first corner frequency f 1 = 2120 Hz. The second corner frequency f 2 is chosen to be well above the maximum frequency component f m (= 15 kHz) of the audio signal. Observe that the linearly increasing response of the pre-emphasis circuit between f 1 and f m means that the circuit behaves as a differentiator at these frequencies. Refer to Eq. (8.87). It follows from Figure 8.5 that, when the pre-emphasised audio signal frequency modulates the carrier, the result will be FM for frequencies up to about 2120 Hz, and PM for higher frequencies. Thus, FM with pre-emphasis is a combination of FM (at lower message frequencies) and PM (at the higher message frequencies). We have stated earlier that the de-emphasis circuit is used at the receiver after the frequency demodulator to undo the effect of pre-emphasis. However, an illuminating alternative view of the role of de-emphasis can be obtained by observing that the frequency response of the de-emphasis circuit is approximately uniform up to f 1
and linearly decreasing beyond this point. That is, this circuit acts as an integrator for frequencies above f1. Thus, in the light of Figure 8.34, we have FM demodulation for frequencies below f1, and PM demodulation for frequencies above f1, and the original message signal is accurately recovered from the received hybrid FM/PM signal. This is remarkable.

Figure 8.44 FM superheterodyne receiver.
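The pre-emphasis and de-emphasis responses of Eqs. (8.94) and (8.95) can be checked numerically. The following Python sketch is only an illustration: the component values R1, R2, and C are assumptions chosen to give the 75 μs time constant quoted above, and are not taken from Figure 8.42. It evaluates both forms of Eq. (8.94) and the cascade of the two filters.

```python
import math

# Illustrative component values (assumptions, chosen so that R1*C = 75 microseconds)
R1 = 75e3      # ohms
R2 = 5e3       # ohms
C = 1e-9       # farads

K = R2 / (R1 + R2)
R_PAR = R1 * R2 / (R1 + R2)           # R = R1 R2 / (R1 + R2) in Eq. (8.94)
f1 = 1 / (2 * math.pi * R1 * C)       # first corner frequency, Hz
f2 = 1 / (2 * math.pi * R_PAR * C)    # second corner frequency, Hz


def h_pre(f):
    """Pre-emphasis response, Eq. (8.94), in its corner-frequency form."""
    return K * (1 + 1j * f / f1) / (1 + 1j * f / f2)


def h_pre_circuit(f):
    """Pre-emphasis response, Eq. (8.94), directly from the circuit impedances."""
    z1 = R1 / (1 + 1j * 2 * math.pi * f * R1 * C)   # R1 in parallel with C
    return R2 / (R2 + z1)


def h_de(f):
    """De-emphasis response, Eq. (8.95)."""
    return 1 / (1 + 1j * f / f1)


if __name__ == "__main__":
    print(f"f1 = {f1:.0f} Hz, f2 = {f2:.0f} Hz")
    for f in (100.0, 1e3, 5e3, 15e3):
        cascade = abs(h_pre(f) * h_de(f)) / K
        print(f"f = {f:7.0f} Hz  |Hp|/K = {abs(h_pre(f)) / K:6.3f}  "
              f"|Hd| = {abs(h_de(f)):6.3f}  |Hp Hd|/K = {cascade:6.3f}")
```

With these assumed values the normalised cascade |Hp Hd|/K stays close to unity across the audio band, because f2 lies well above fm = 15 kHz; this is the sense in which de-emphasis undoes pre-emphasis.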
8.7.4 Receiver Figure 8.44 shows a block diagram of a complete FM receiver system. It is based on the superheterodyne principle discussed in detail in Chapter 7 for AM reception. See Figure 7.22. Note the following differences in Figure 8.44. The IF is f IF = 10.7 MHz for audio broadcasting, and the bandwidth of the IF amplifier is 200 kHz to accommodate the received FM signal. The demodulator is usually by the direct method of Figure 8.32, consisting of limiter, discriminator, and envelope demodulator, but it may also be by the indirect method of Figure 8.33 involving a PLL. Note that, as discussed above, de-emphasis is carried out after FM demodulation. You may wish at this point to review the discussion of the operation of an FM stereo receiver given in Chapter 7 (Section 7.7.1.4 and Figure 7.32).
8.8 Noise Effect in FM One of the main advantages of FM over all the AM techniques discussed in Chapter 7 is that FM minimises the degrading effect of additive white Gaussian noise on the recovered message signal in a way that can be further enhanced by increasing the modulation index (and hence transmission bandwidth). FM conveys information exclusively through the phase of the carrier, so any noise-induced amplitude variations in the FM signal will be of no consequence. Noise will introduce carrier phase errors which will be translated into errors in the recovered message; however, these errors are small, and the message degradation is negligible, if carrier amplitude (or power) is much greater than noise voltage (or power). To gain a quantitative and more complete understanding of the effect of noise in FM, consider Figure 8.45, which shows the phasor addition of noise voltage r to a transmitted FM signal of amplitude Ac to produce the received FM signal of amplitude Acn . The addition is illustrated for various scenarios of the relative amplitudes of carrier and noise, namely (i) Ac much larger than r, (ii) Ac just larger than r, and (iii) Ac less than r. The noise voltage r has random phase 𝜓 that is uniformly distributed between −𝜋 and 𝜋 rad, so its phasor when added to Ac will start at point D (the endpoint of the transmitted FM signal phasor) and will terminate anywhere on the shaded circle of radius r centred on D. The transmitted FM signal phasor has phase 𝜙 (which contains the transmitted message signal and information), but the received FM signal phasor has phase 𝜙 + 𝜀, which includes a phase error 𝜀 due to the addition of the noise voltage r. We see that when r is much smaller than Ac (as in Figure 8.45a) the maximum
phase error 𝜀max is small, but when r is close in magnitude to Ac (as in Figure 8.45b) then 𝜀max is large, and when r is larger than Ac (as in Figure 8.45c) then 𝜀max = 180°. In order for the transmitted message to be reliably recovered by the FM demodulator, it is necessary that 𝜀max is small; otherwise, Acn will undergo large random changes in phase which will be converted into large signal excursions at the demodulator output, perceived as loud crackling noise in an audio message signal. The analysis that follows is therefore based on Figure 8.45d, which assumes Ac ≫ r.

Figure 8.45 Phasor addition of noise voltage r to FM signal of amplitude Ac. Maximum phase error 𝜀max in the received FM signal of amplitude Acn depends on the magnitude of Ac relative to r. (a) Ac ≫ r: 𝜀max small; (b) Ac ≈ r: 𝜀max large; (c) Ac < r: 𝜀max = 180°; (d) analysis of scenario Ac ≫ r.

In Figure 8.45d, the transmitted FM signal phasor Ac, noise voltage r, and received FM signal phasor Acn have respective phases 𝜙, 𝜓, and 𝜃 = 𝜙 + 𝜀, where 𝜀 is the phase error due to noise. Consider triangle OBC. Let us resolve the noise voltage phasor r into in-phase and quadrature components vnI(t) and vnQ(t) relative to the carrier phasor Ac so that in Figure 8.45d length DC ≡ vnI(t) and length BC ≡ vnQ(t). Note that we could have resolved r relative to the 0° direction so that length DE ≡ vnI(t) and length BE ≡ vnQ(t), but the statistical distributions of vnI(t) and vnQ(t) would be unchanged in either case. This simply exploits the fact that the distribution of sin(𝜓 − 𝜙) is unaffected by the value of 𝜙, allowing sin(𝜓 − 𝜙) to be replaced by sin𝜓. Thus

$$\tan\varepsilon = \frac{BC}{OC} = \frac{v_{nQ}(t)}{A_c + v_{nI}(t)}$$

For Ac ≫ r, 𝜀 → 0 and tan𝜀 ≈ 𝜀, so that

$$\theta(t) = \phi(t) + \varepsilon \approx \phi(t) + \frac{v_{nQ}(t)}{A_c + v_{nI}(t)} \approx \phi(t) + \frac{v_{nQ}(t)}{A_c} \quad (\text{since } A_c \gg v_{nI}(t)) \tag{8.96}$$
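The quality of the small-noise approximation in Eq. (8.96) can be explored with a short calculation. The Python sketch below is an illustration under stated assumptions: the function names and the chosen amplitude ratios are hypothetical, and the noise is resolved relative to the carrier phasor as in Figure 8.45d. It compares the exact phase error, obtained from tan 𝜀 = vnQ/(Ac + vnI), with the approximation 𝜀 ≈ vnQ/Ac.

```python
import math


def phase_error(a_c, r, angle):
    """Exact phase error (rad) when noise of amplitude r is added to a carrier
    of amplitude a_c; angle = psi - phi is the noise phase relative to the carrier."""
    v_ni = r * math.cos(angle)   # in-phase noise component (length DC)
    v_nq = r * math.sin(angle)   # quadrature noise component (length BC)
    return math.atan2(v_nq, a_c + v_ni)


def phase_error_approx(a_c, r, angle):
    """Small-noise approximation of Eq. (8.96): epsilon ~ v_nQ / A_c."""
    return r * math.sin(angle) / a_c


if __name__ == "__main__":
    angle = math.radians(60)          # arbitrary relative noise phase
    for ratio in (100, 10, 3, 1.5):   # illustrative values of A_c / r
        exact = phase_error(ratio, 1.0, angle)
        approx = phase_error_approx(ratio, 1.0, angle)
        print(f"Ac/r = {ratio:5.1f}: exact = {math.degrees(exact):6.3f} deg, "
              f"approx = {math.degrees(approx):6.3f} deg")
```

The two values agree closely when Ac is much larger than r and drift apart as the ratio falls, which is why the analysis is restricted to the scenario of Figure 8.45d.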
Now consider the FM receiver model of Figure 8.46. The BPF blocks all signals outside the range fc ± B/2 and passes the noise corrupted FM signal

$$v_a(t) = v_{fm}(t) + v_n(t) \equiv A_{cn}\cos[2\pi f_c t + \theta(t)]$$

where

$$v_{fm}(t) = A_c\cos[2\pi f_c t + \phi(t)]$$

$$v_n(t) = r(t)\cos[2\pi f_c t + \psi(t)] = v_{nI}(t)\cos(2\pi f_c t) - v_{nQ}(t)\sin(2\pi f_c t)$$

are, respectively, the wanted FM signal and noise voltage. This representation of noise voltage in a bandpass system is derived in Section 6.3.3.

Figure 8.46 FM receiver model (incoming signals plus white noise w(t) → BPF (fc − B/2 → fc + B/2) → va(t), where C/N is defined → FM demodulator comprising limiter, vb(t), and discriminator → vd(t) → LPF (−fm → fm) → message signal + noise, where SNR is defined).

With noise power equal to NoB, and carrier power = Ac²/2, the carrier-to-noise ratio (C/N) at the FM demodulator input is

$$C/N = \frac{A_c^2}{2N_o B} \tag{8.97}$$
Note that this is the signal-to-noise ratio (SNRi) at the input of the demodulator, but since the signal is still on the carrier at this point in the transmission system, the description of this quantity as carrier-to-noise ratio (C/N) is preferred. The discriminator produces an output that is proportional to its input frequency (which is the rate of change of angle divided by 2𝜋). Normalising the constant of proportionality to unity, the discriminator output is given by

$$\begin{aligned} v_d(t) &= \frac{1}{2\pi}\frac{d\theta(t)}{dt} = \frac{1}{2\pi}\frac{d}{dt}\left[\phi(t) + \frac{v_{nQ}(t)}{A_c}\right] \\ &= k_f v_m(t) + \frac{1}{2\pi A_c}\frac{dv_{nQ}(t)}{dt} \\ &\equiv \frac{\beta f_m}{A_m} v_m(t) + v_{nd}(t) \end{aligned}$$

where d𝜙(t)/dt was obtained as 2𝜋kf vm(t) by taking the derivative of Eq. (8.27). The first term of vd(t) is the message signal, which is fully passed by the LPF to produce signal power Ps at demodulator output given by

$$P_s = \beta^2 f_m^2 A_{rms}^2 / A_m^2 = (\beta f_m / R)^2 \tag{8.98}$$

where R = Am/Arms is the peak-to-rms ratio of the message signal. The second term of vd(t) is noise whose power is more readily computed in the frequency domain, noting that only a portion of this noise in the frequency range −fm to +fm is passed by the LPF. The time domain operation of (1/2𝜋Ac)d/dt on vnQ(t) to produce vnd(t) is illustrated in the frequency domain in Figure 8.47 based on the Fourier transform property (Eq. (4.88)) in which differentiation in the time domain corresponds to multiplication by j2𝜋f in the frequency domain. Thus, this operation is equivalent to passing vnQ(t), having a flat baseband equivalent power spectral density (PSD) SnQ(f) of height No in the frequency range −B/2 to B/2 plotted in Figure 8.47, through a linear system of transfer function H(f) = j2𝜋f/(2𝜋Ac) to produce an output vnd(t), which has PSD Snd(f) given by Eq. (4.163) as

$$S_{nd}(f) = S_{nQ}(f)|H(f)|^2 = N_o\frac{f^2}{A_c^2}, \quad -B/2 \le f \le B/2 \tag{8.99}$$

Figure 8.47 Processing of vnQ(t), having baseband equivalent PSD SnQ(f) shown, by discriminator to produce vnd(t) having PSD Snd(f), also shown.
Thus, since fm < B/2, it follows that noise power at the demodulator output is

$$P_n = \int_{-f_m}^{f_m} S_{nd}(f)\,df = \int_{-f_m}^{f_m} \frac{N_o f^2}{A_c^2}\,df = \frac{2N_o f_m^3}{3A_c^2} \tag{8.100}$$
The ratio between Eqs. (8.98) and (8.100) gives a signal-to-noise ratio (SNR) at demodulator output as

$$(\mathrm{SNR})_o = \frac{P_s}{P_n} = \frac{1.5\beta^2 A_c^2}{N_o R^2 f_m} \tag{8.101}$$
The processing gain Gp of an analogue demodulator was introduced in Section 6.5.1 as the amount in dB by which the demodulator raises the SNR at its output above the C/N at its input. Thus, Eqs. (8.97) and (8.101) yield

$$G_p = \frac{(\mathrm{SNR})_o}{C/N} = \frac{1.5\beta^2 A_c^2}{N_o R^2 f_m} \times \frac{2N_o B}{A_c^2} = \frac{3\beta^2 B}{R^2 f_m} \tag{8.102}$$
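Setting B equal to Carson’s bandwidth 2(𝛽 + 1)fm in the above equation yields

$$\begin{aligned} G_p &= 6\beta^2(\beta + 1)/R^2 \\ &= 7.78 + 10\log_{10}(\beta^2(\beta + 1)) - 20\log_{10}R \ \text{dB} \\ &= 4.77 + 10\log_{10}(\beta^2(\beta + 1)) \ \text{dB} \quad (\text{for sinusoid where } R = \sqrt{2}) \end{aligned} \tag{8.103}$$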
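The chain of results from Eq. (8.97) to Eq. (8.103) can be cross-checked numerically. The Python sketch below is a hedged illustration: the default values of fm, Ac, and No are arbitrary assumptions (they cancel in the final ratio), and the noise integral of Eq. (8.100) is evaluated with a simple midpoint rule rather than analytically.

```python
import math


def processing_gain_from_powers(beta, peak_to_rms, f_m=15e3, a_c=1.0,
                                n_o=1e-9, steps=20000):
    """Gp = (SNR)o / (C/N) built up from Eqs. (8.97), (8.98) and (8.100),
    with the noise integral of Eq. (8.100) evaluated numerically."""
    b = 2 * (beta + 1) * f_m                       # Carson's bandwidth
    p_s = (beta * f_m / peak_to_rms) ** 2          # signal power, Eq. (8.98)
    # Midpoint-rule integration of S_nd(f) = No f^2 / Ac^2 over -f_m..f_m
    df = 2 * f_m / steps
    p_n = sum(n_o * (-f_m + (k + 0.5) * df) ** 2 / a_c ** 2
              for k in range(steps)) * df
    c_over_n = a_c ** 2 / (2 * n_o * b)            # Eq. (8.97)
    return (p_s / p_n) / c_over_n


def processing_gain_db(beta, peak_to_rms):
    """Closed form of Eq. (8.103), expressed in dB."""
    return 10 * math.log10(6 * beta ** 2 * (beta + 1) / peak_to_rms ** 2)


if __name__ == "__main__":
    beta, r = 5, math.sqrt(2)          # sinusoidal message signal
    numeric_db = 10 * math.log10(processing_gain_from_powers(beta, r))
    print(f"numerical: {numeric_db:.2f} dB, "
          f"Eq. (8.103): {processing_gain_db(beta, r):.2f} dB")
```

Both routes give the same processing gain, confirming that the result depends only on 𝛽 and R once B is set to Carson's bandwidth.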
We see that, provided C/N is above about 9.5 dB (to justify our assumption that carrier amplitude is much larger than noise voltage), FM provides a mechanism for raising signal quality (i.e. increasing processing gain and hence output SNR) by using a higher modulation index and hence transmission bandwidth. Also, we see in Figure 8.47 and Eq. (8.99) that the noise vnd(t) at the discriminator output is coloured noise in that its PSD Snd(f) is not flat but increases as the square of frequency f. Noise power can therefore be reduced, and hence SNR increased, by using an LPF of transfer function Hd(f) to attenuate the high-frequency components of vnd(t). To avoid distorting the message signal by this action, we boost the high-frequency components of the message signal at the transmitter using a highpass filter (HPF) of transfer function Hp(f) = 1/Hd(f). The filtering operation at the transmitter is known as pre-emphasis and the inverse operation at the receiver is known as de-emphasis, as discussed in Section 8.7.3. This combined technique of pre-emphasis and de-emphasis is often simply referred to as de-emphasis. It delivers an increase in SNR of 5–10 dB for audio transmission and around 9 dB for television (TV) and is therefore always implemented in FM systems. The processing gain of an FM-TV demodulator is given by

$$G_p = 4.77 + 10\log_{10}(\beta^2(\beta + 1)) + P + Q \ \text{dB} \tag{8.104}$$

In the above equation, the first two terms give Gp for a sinusoidal message signal, P ≈ 9 dB gives the improvement due to de-emphasis, and Q ≈ 8 dB is the subjective improvement factor, which arises from a combination of two factors. First, the peak-to-rms ratio R of a video signal is less than the √2 value applicable to a sinusoid. Second, the effect on a display screen of the coloured noise at the FM demodulator output is less annoying than the effect of white noise, so we may reduce coloured noise power by several dB to reflect our higher subjective tolerance of its impact. The processing gain of an FM system carrying a single voice channel is given by

$$G_p = 4.77 + 10\log_{10}(\beta^2(\beta + 1)) + P \ \text{dB}; \quad P \approx 7\ \text{dB} \tag{8.105}$$
Worked Example 8.8 Determine the processing gain Gp of an FM demodulator for the following FM transmissions with modulation index 𝛽 = 5. Assume a de-emphasis improvement factor P = 7 dB where applicable.

(a) Sinusoidal message signal.
(b) Message signal having a uniform probability density function (PDF).
(c) Voice signal having peak-to-rms ratio 20log10 R = 9 dB.

The applicable relationship for the processing gain Gp in all three cases is

$$\begin{aligned} G_p &= P + 7.78 + 10\log_{10}(\beta^2(\beta + 1)) - 20\log_{10}R \\ &= P + 7.78 + 10\log_{10}(5^2(5 + 1)) - 20\log_{10}R \\ &= P + 29.54 - 20\log_{10}R \ \text{dB} \end{aligned}$$

(a) A sinusoidal message signal Am cos(2𝜋fm t) has peak value Ap = Am and rms value Arms = Am/√2, and hence peak-to-rms ratio

$$R = \frac{A_p}{A_{rms}} = \frac{A_m}{A_m/\sqrt{2}} = \sqrt{2}$$

De-emphasis improvement is not possible for a message signal that contains only one frequency component. Thus, P = 0 for a sinusoidal message and hence

$$G_p = P + 29.54 - 20\log_{10}R = 0 + 29.54 - 20\log_{10}(\sqrt{2}) = 29.54 - 10\log_{10}2 = 26.53 \ \text{dB}$$

(b) A signal X having a uniform PDF and taking on continuous values x in the range −Ap to Ap has PDF pX(x) and mean square value A²rms obtained from Eqs. (3.44) and (3.21), respectively, as

$$p_X(x) = \frac{1}{2A_p}; \qquad A_{rms}^2 = \int_{-A_p}^{A_p} x^2 p_X(x)\,dx$$

Thus

$$A_{rms}^2 = \frac{1}{2A_p}\int_{-A_p}^{A_p} x^2\,dx = A_p^2/3$$

Therefore, this signal has peak value Ap and rms value Arms = Ap/√3, and hence peak-to-rms ratio R = √3. With de-emphasis improvement factor P = 7 dB, the FM demodulator processing gain for this signal is thus

$$G_p = P + 29.54 - 20\log_{10}R = 7 + 29.54 - 20\log_{10}(\sqrt{3}) = 36.54 - 10\log_{10}3 = 31.77 \ \text{dB}$$

(c) For the voice signal, we insert the given value of R along with P = 7 dB into the above relation for Gp to obtain

$$G_p = P + 29.54 - 20\log_{10}R = 7 + 29.54 - 9 = 27.54 \ \text{dB}$$

Worked Example 8.9 A TV signal with a baseband video bandwidth of 4.2 MHz is transmitted by FM. Determine the transmission bandwidth that must be used to ensure that an incoming signal with C/N = 9.5 dB at the receiver’s demodulator input is recovered to meet the minimum SNR quality threshold of 45 dB for video at the demodulator output. Assume de-emphasis and subjective improvement factors P = 9 dB, Q = 8 dB, respectively. Comment on the recovered signal quality during a rain event in which the C/N at the receiving station drops to 8 dB.

Since SNR = Gp + C/N (in dB), and we are given SNR = 45 dB, C/N = 9.5 dB, it means that we need to achieve a processing gain Gp = 45 − 9.5 = 35.5 dB. It follows from Eq. (8.104) that

$$10\log_{10}(\beta^2(\beta + 1)) = G_p - 4.77 - (P + Q) = 35.5 - 4.77 - 17 = 13.73$$

$$\beta^2(\beta + 1) = 10^{13.73/10} = 23.6048$$

$$\beta = 2.571$$
The required (Carson’s) transmission bandwidth is therefore B = 2(𝛽 + 1)fm = 2(2.571 + 1) × 4.2 MHz = 30 MHz If C/N of the received signal at demodulator input drops to 8 dB, the message signal will be irrecoverable by the demodulator since this C/N is below the required minimum threshold of C/N = 9.5 dB for FM demodulation. Such a rain event will therefore drive the communication link into outage.
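The cubic equation for 𝛽 in Worked Example 8.9 has no convenient closed-form solution, so it is natural to solve it numerically. The Python sketch below reproduces the example's arithmetic; the bisection routine and its name are illustrative assumptions, since the text does not say how 𝛽 = 2.571 was obtained.

```python
def solve_beta(target, lo=0.0, hi=100.0, tol=1e-9):
    """Solve beta^2 (beta + 1) = target for beta >= 0 by bisection
    (the left-hand side increases monotonically with beta)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mid ** 2 * (mid + 1) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


if __name__ == "__main__":
    snr_db, cn_db = 45.0, 9.5      # required output SNR and available C/N
    p_db, q_db = 9.0, 8.0          # de-emphasis and subjective improvement factors
    f_m = 4.2e6                    # video bandwidth, Hz

    gp_db = snr_db - cn_db                                   # required processing gain
    target = 10 ** ((gp_db - 4.77 - p_db - q_db) / 10)       # from Eq. (8.104)
    beta = solve_beta(target)
    bandwidth = 2 * (beta + 1) * f_m                         # Carson's bandwidth
    print(f"beta = {beta:.3f}, B = {bandwidth / 1e6:.1f} MHz")
```

The same routine can be reused for other quality targets by changing the dB values, provided the resulting 𝛽 lies inside the search interval.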
8.9 Overview of FM and PM Features Let us conclude our study of angle modulation with a discussion of its merits and demerits when compared to amplitude modulation (studied in Chapter 7). The major applications of FM and PM are also presented.
8.9.1 Merits (i) WBFM gives a significant improvement in the signal-to-noise power ratio (SNR) at the output of the receiver. The FM demodulator has a processing gain that increases with modulation index. A further improvement in SNR can be obtained using pre-emphasis (to boost the amplitude of high-frequency components of the modulating signal) and de-emphasis (to remove the pre-emphasis distortion from the demodulated signal). (ii) Angle modulation is resistant to propagation-induced selective fading, since amplitude variations are unimportant and are removed at the receiver using a limiting circuit. (iii) Angle modulation is very effective in rejecting interference in the same manner that it minimises the effect of noise. The receiver locks onto the wanted signal and suppresses the interfering signal, provided it is not nearer in strength to the wanted signal than the capture ratio. A capture ratio of, say, 5 dB means that the receiver suppresses any signal that is weaker than the wanted signal (to which it is tuned) by 5 dB or more. A small capture ratio is desirable. (iv) Angle modulation allows the use of more efficient transmitters. The FM signal is generated with low-level modulation. Highly efficient nonlinear Class C amplifiers are then employed to produce a high-power RF signal for radiation. These amplifiers can be optimally operated since the angle modulated signal is of fixed amplitude. There is no need for the high-power audio amplifiers used in high-level AM transmitters, or the inefficient linear RF amplifiers required to preserve the information-bearing RF envelope in low-level AM transmitters. (v) Angle modulation can handle a greater dynamic range of modulating signal than AM without distortion by using a large enough frequency sensitivity to translate all message signal variations to a proportionate and significant carrier frequency variation. The penalty is an increase in transmission bandwidth.
8.9.2 Demerits (i) The most significant disadvantage of angle modulation is that it requires a transmission bandwidth that is much larger than the message signal bandwidth, depending of course on the modulation index. For example, in FM audio broadcasting, a 15 kHz audio signal requires a bandwidth of about 200 kHz. (ii) The interference rejection advantage of angle modulation can be detrimental in, for example, mobile receivers near the edge of a service area, where the wanted signal may be captured by an unwanted signal or noise voltage. If the two signals are of comparable amplitude, the receiver locks intermittently onto one signal or the other – a problem known as the capture effect. (iii) Angle modulation generally requires more complex and expensive circuits than AM. However, with advances in IC technology this is no longer a very significant demerit.
8.9.3 Applications (i) Audio broadcasting within the VHF band at frequencies from 88 to 108 MHz. The audio signal occupies the frequency band 50 Hz to 15 kHz, and the allowed transmission bandwidth is 200 kHz with a rated system deviation FD = 75 kHz. In the United States, noncommercial stations are assigned carrier frequencies in increments of 200 kHz from 88.1 to 91.9 MHz, and commercial stations from 92.1 to 107.9 MHz. The implementation of FM stereo is discussed in Chapter 7 under applications of double sideband (DSB) (Section 7.7.1.4). (ii) Transmission of accompanying sound in analogue television broadcast using VHF frequencies 54–88 MHz and 174–216 MHz. An audio signal of baseband 50 Hz to 15 kHz frequency modulates a carrier with a rated system deviation FD = 25 kHz. This application is becoming globally obsolete because of the switchover from analogue to digital TV broadcast around the world; the UK, for example, completed that exercise in October 2012. (iii) Two-way mobile radio systems transmitting audio signals of bandwidth 5 kHz, with rated system deviations FD = 2.5 and 5 kHz for channel bandwidths 12.5 and 25 kHz, respectively. Various frequency bands have been assigned to different services, such as amateur bands at 144–148 MHz and 420–450 MHz, and public service bands at 108–174 MHz. (iv) Multichannel telephony systems. Long-distance telephone traffic is carried on analogue point-to-point terrestrial and satellite links using FM. Several voice channels are stacked in frequency to form a composite frequency division multiplex (FDM) signal, which is used to frequency modulate a carrier at an IF of, say, 70 MHz. The resulting FM signal is then up-converted to the appropriate ultra high frequency (UHF) band for terrestrial microwave links, or super high frequency (SHF) band for satellite links. This is an analogue transmission technique which is now largely obsolete, having been gradually replaced starting in the 1980s by digital telephony.
(v) PM and various hybrids are used in modulated digital communication systems, such as the public switched telephone networks and terrestrial wireless and satellite communication systems.
8.10 Summary This completes our study of analogue signal modulation techniques started in the previous chapter. In this chapter, we have covered in detail the principles of phase and frequency modulation and emphasised throughout our discussions the relationships between these two angle modulation techniques. Various angle modulation and demodulation circuits were discussed. FM may be generated using a PM modulator that is preceded by a suitable LPF or integrator and demodulated using a PM demodulator followed by a suitable HPF or differentiator. Similarly, PM may be generated by an FM modulator preceded by a differentiator and demodulated using an FM demodulator followed by an integrator. However, PM is not used for transmitting analogue message signals due to its poor bandwidth efficiency when compared to FM. The use of PM for digital signal transmission is addressed in Chapter 11. In the next chapter, we embark on a study of the issues involved in the sampling of analogue signals, which is an essential step in the process of analogue-to-digital conversion.
Questions

8.1 The staircase waveform shown in Figure Q8.1 modulates the angle of the carrier signal vc(t) = Vc cos(1000𝜋t). Make a sketch of the angle 𝜃c(t) of the modulated carrier in degrees as a function of time in ms, over the entire duration of the modulating signal, for the following cases. (a) Frequency modulation with a frequency sensitivity kf = 0.2 kHz/V. (b) Phase modulation with phase sensitivity kp = 𝜋/4 rad/V.
Figure Q8.1 Question 8.1 (staircase waveform; vertical axis vm(t), volt; horizontal axis t, ms).

Figure Q8.2 Question 8.2 (triangular waveform; vertical axis vm(t), volt; horizontal axis t, ms).
8.2 Repeat Question 8.1 for the case of the triangular modulating signal shown in Figure Q8.2. Compare the final angle (at t = 2 ms) of the unmodulated carrier with the final angle of the modulated carrier and comment on your result.

8.3 The message signal in Figure Q8.1 frequency modulates the carrier vc(t) = 10sin(4000𝜋t) volts using a circuit of frequency sensitivity 500 Hz/V. Sketch the FM signal waveform. Determine the frequency deviation and frequency swing of the modulated carrier.

8.4 Sketch the resulting PM waveform when the message signal in Figure Q8.1 phase modulates the carrier vc(t) = 5sin(4000𝜋t) volts using a circuit of phase sensitivity 45°/V.

8.5 A message signal vm(t) = 2sin(10 × 10³𝜋t) volt is used to frequency modulate the carrier vc(t) = 10sin(180 × 10⁶𝜋t) volt, giving a percent modulation of 60%. Determine (a) The frequency deviation of the modulated carrier. (b) The frequency sensitivity of the modulator circuit. (c) The frequency swing of the modulated carrier. (d) The modulation index. (e) The phase deviation of the modulated carrier.

8.6 A message signal vm(t) = 5sin(20 × 10³𝜋t) volt phase modulates the carrier vc(t) = 10sin(10⁶𝜋t) volt, giving a phase modulation factor m = 1. Determine (a) The phase sensitivity of the modulating circuit. (b) The frequency deviation of the modulated carrier.

8.7 MATLAB Exercises: the triangular waveform shown in Figure Q8.2 modulates the carrier signal vc(t) = 5sin(20 × 10³𝜋t) volts. Following the discussion and equations in Section 8.3.2, make an accurate plot of the modulated carrier waveform for each of the following cases: (a) Frequency modulation with sensitivity 2 kHz/V. (b) Phase modulation with sensitivity 𝜋/4 rad/V.
8.8 The display of an NBFM signal on a spectrum analyser shows three frequency components at 18, 20, and 22 kHz, with respective amplitudes 1, 10, and 1 V. (a) What is the modulation index? (b) Determine the minimum and maximum amplitudes of the modulated carrier and hence the percentage variation in carrier amplitude. (c) Compare the results obtained in (b) with the case of an AM signal that has an identical amplitude spectrum. (d) Draw a phasor diagram, similar to Figure 8.16 but with all sides and angles calculated, of the NBFM signal at each of the time instants t = 0, 62.5, 125, 250, and 437.5 μs. (e) Based on the results in (d), or otherwise, make a sketch of the NBFM waveform, as would be displayed on an oscilloscope. Your sketch must be to scale and with clearly labelled axes.

8.9 Obtain Eqs. (8.55) and (8.56) for the Fourier series of tone modulated FM and PM signals. To do this, you may wish to expand Eq. (8.36) using the relevant trigonometric identity and then apply the following relations

$$\cos[\beta\sin(\omega_m t)] = J_0(\beta) + 2J_2(\beta)\cos(2\omega_m t) + 2J_4(\beta)\cos(4\omega_m t) + \cdots$$

$$\sin[\beta\sin(\omega_m t)] = 2J_1(\beta)\sin(\omega_m t) + 2J_3(\beta)\sin(3\omega_m t) + 2J_5(\beta)\sin(5\omega_m t) + \cdots$$

8.10 A message signal of baseband frequencies 300 Hz → 5 kHz is used to frequency modulate a 60 MHz carrier, giving a frequency deviation of 25 kHz. Determine the Carson’s and 1% transmission bandwidths.

8.11 When the carrier signal vc(t) = 10sin(10⁶𝜋t) volts is frequency modulated by the message signal vm(t) = 2sin(10⁴𝜋t) volts, the carrier frequency varies within ±4% of its unmodulated value. (a) What is the frequency sensitivity of the modulator? (b) What is the modulation index? (c) Determine the Carson’s and 1% transmission bandwidths. (d) Sketch the amplitude spectrum of the FM signal. Include all spectral components with an amplitude larger than 1% of the unmodulated carrier amplitude. (e) Determine the percentage of the total power contained in the frequency band 473 → 526 kHz.

8.12 Determine the percentage of total power contained within the Carson’s and 1% bandwidths of a tone modulated FM signal with the following values of modulation index: (a) 𝛽 = 0.2 (b) 𝛽 = 2 (c) 𝛽 = 20.

8.13 The circuit of Figure 8.25 is employed to generate an NBFM signal of modulation index 𝛽 = 0.2. The message signal is a 2 kHz sinusoid. If the required frequency sensitivity is kf = 5 kHz/V, give a specification of suitable component values R and C and the message signal amplitude Vm.
8.14 Figure Q8.14 is the block diagram of an Armstrong modulator involving two stages of frequency multiplication. The message signal vm(t) contains frequencies in the range 50 Hz to 15 kHz. The WBFM output signal has a carrier frequency fc = 96 MHz and a minimum frequency deviation fd = 75 kHz. The NBFM modulator uses a carrier frequency fc1 = 100 kHz, with a modulation index 𝛽1 = 0.2. Determine the frequency multiplication ratios n1 and n2, which will allow an oscillator frequency fLO = 8.46 MHz to be used in the downconverter, with fLO > n1 fc1.

Figure Q8.14 Question 8.14 (signal chain: message signal vm(t) → NBFM modulator with carrier fc1, giving fc1, 𝛽1 → frequency multiplier (×n1), giving n1 fc1, n1 𝛽1 → down converter with oscillator fLO, giving fc2, n1 𝛽1 → frequency multiplier (×n2), giving wideband FM with fc, 𝛽).

8.15 By taking the derivative of the general expression for an FM signal, Eq. (8.32), show that frequency demodulation can be obtained using a circuit that consists of a differentiator followed by an envelope demodulator. Specify the modification or extra processing block required to use this circuit for phase demodulation.
9 Sampling
Those who write off education as too expensive may be paying more for ignorance.

In this Chapter
✓ Sampling theorem: how to sample a bandlimited analogue signal to ensure distortion-free reconstruction. Both lowpass and bandpass signals are discussed.
✓ Aliasing: the penalty of undersampling.
✓ Anti-alias filter: considerations in the design of a lowpass filter (LPF) to limit the bandwidth of an analogue signal prior to sampling.
✓ Non-instantaneous sampling: the use of a finite-width pulse train as switching signal for flat-top sampling and the resulting aperture effect.
9.1 Introduction This chapter lays an important foundation for the introduction of digital modulation techniques. The subject of sampling is introduced as a nondestructive elimination of the redundancy inherent in analogue signal representations. The sampling theorem is presented and terminologies such as Nyquist frequency, Nyquist interval, and aliasing are discussed. Using sinusoidal signals, we demonstrate how sampling at an adequate rate duplicates (without distortion) the baseband spectrum of the sampled signal. The penalty of undersampling is also illustrated in the time and frequency domains, and the measures usually taken to minimise alias distortion are discussed with an example from telephone speech. Instantaneous sampling is discussed, and it is shown that, although this method is highly desirable, it is impossible to realise in practice. Natural and flat-top sampling using a pulse train that has a nonzero duty cycle are discussed. The distortion due to aperture effect is presented along with practical measures to minimise this distortion.
Communication Engineering Principles, Second Edition. Ifiok Otung. © 2021 John Wiley & Sons Ltd. Published 2021 by John Wiley & Sons Ltd. Companion Website: www.wiley.com/go/otung

9.2 Sampling Theorem Analogue signals are continuous both in time and in amplitude, and they require exclusive use of a communication system resource for the entire duration of transmission. However, the values of an analogue signal at two sufficiently close time instants are usually related in some way and there is therefore inherent redundancy in such signals. For example, the values of a DC signal at all time instants are related as equal, so that only one value is required to recover the entire signal. A signal that is a linear function of time can be fully reconstructed from its values taken at only two time instants. The advantage of storing and transmitting only selected values of one signal is that the system resource can be allocated to other signals during the unused intervals. The process of taking only a few values of an analogue signal g(t) at regular time intervals is referred to as sampling. Thus, sampling converts the continuous-value continuous-time signal g(t) to a continuous-value discrete-time signal g(nTs), where n = 0, 1, 2, 3, …, and Ts is the sampling interval. The surprise is that you can perfectly reconstruct the original signal g(t) from the samples, provided you follow a simple rule, known as the sampling theorem, which may be stated as follows.

A band-limited lowpass signal that has no frequency components above fm (Hz) may be perfectly reconstructed, using an LPF, from its samples taken at regular intervals at the rate Fs ≥ 2fm (samples per second, or Hz).

There are three important points to note about the sampling theorem.

● The analogue signal must have a finite bandwidth fm. A signal of infinite bandwidth cannot be sampled without distortion. In practice, it is necessary to artificially limit the bandwidth of the signal using a suitable LPF.
● At least two samples must be taken during each period Tm of the highest-frequency sinusoid in the signal. That is, the sampling interval Ts must be less than or equal to half Tm, which is the same as stated above that the sampling rate Fs must be at least twice fm.
● When the above two conditions are satisfied then the original signal g(t) can be recovered from these samples, with absolutely no degradation, by passing the sampled sequence g[nTs] = {g(0), g(Ts), g(2Ts), g(3Ts), …} through an LPF. The filter used in this way is referred to as a reconstruction filter.
The minimum sampling rate Fsmin specified by the sampling theorem, equal to twice the bandwidth of the analogue signal, is called the Nyquist rate or Nyquist frequency. The reciprocal of the Nyquist rate is referred to as the Nyquist sampling interval, which represents the maximum sampling interval Tsmax allowed by the sampling theorem.

$$\text{Nyquist rate} = F_{s\min} = 2f_m$$

$$\text{Nyquist interval} = T_{s\max} = \frac{1}{\text{Nyquist rate}} = \frac{1}{2f_m} \tag{9.1}$$
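Equation (9.1) translates directly into a small helper for checking a proposed sampling rate. The Python sketch below is illustrative only: the function names are hypothetical, and the 15 kHz bandwidth and the candidate sampling rates in the usage example are assumed values chosen for demonstration.

```python
def nyquist_rate(f_m):
    """Minimum sampling rate (Hz) for a lowpass signal band-limited to f_m (Hz), Eq. (9.1)."""
    return 2 * f_m


def nyquist_interval(f_m):
    """Maximum sampling interval (s) allowed by the sampling theorem, Eq. (9.1)."""
    return 1 / nyquist_rate(f_m)


def satisfies_sampling_theorem(f_s, f_m):
    """True if the sampling rate f_s meets the condition Fs >= 2 fm."""
    return f_s >= nyquist_rate(f_m)


if __name__ == "__main__":
    f_m = 15e3                                   # assumed signal bandwidth, Hz
    print(f"Nyquist rate     = {nyquist_rate(f_m) / 1e3:.1f} kHz")
    print(f"Nyquist interval = {nyquist_interval(f_m) * 1e6:.2f} microseconds")
    for f_s in (32e3, 25e3):                     # illustrative sampling rates
        print(f"Fs = {f_s / 1e3:.0f} kHz adequate? {satisfies_sampling_theorem(f_s, f_m)}")
```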
9.3 Proof of Sampling Theorem Rather than attempt a formal mathematical proof of the sampling theorem, we will follow an intuitive discussion that highlights the conditions for distortion-free sampling stated in the theorem. The sampling of an arbitrary analogue signal g(t) can be obtained as shown in Figure 9.1a. The switching signal 𝛿 Ts (t) is an impulse train of period T s , which causes the electronic switch to close and open instantaneously at intervals of T s . The result is a sampled signal g𝛿 (t) that is equal to the analogue signal g(t) at the instants t = nT s , n = 0, 1, 2, 3, …, when the switch is closed and is zero everywhere else. This type of sampling is therefore referred to as instantaneous or ideal sampling. The waveforms g(t), 𝛿 Ts (t), and g𝛿 (t) are shown in Figure 9.1b–d. We see that the sampled signal may be expressed as the product of the continuous-time analogue signal g(t) and a switching signal 𝛿 Ts (t) g𝛿 (t) = g(t) × 𝛿Ts (t)
(9.2)
Figure 9.1 Instantaneous sampling of analogue signal: (a) switch, in which the analogue input g(t) passes through an electronic switch driven by the switching signal 𝛿Ts(t) = Σn 𝛿(t − nTs), an impulse train of period Ts = 1/Fs, to give the sampled output g𝛿(t); (b) analogue signal; (c) switching impulse train; (d) instantaneously sampled signal.
Note that the switching signal 𝛿 Ts (t) is a periodic rectangular pulse train of infinitesimally small duty cycle d, and of period T s . It can therefore be expressed in terms of the Fourier series
\[ \delta_{T_s}(t) = A_o + A_1\cos[2\pi F_s t] + A_2\cos[2\pi(2F_s)t] + A_3\cos[2\pi(3F_s)t] + A_4\cos[2\pi(4F_s)t] + \cdots \tag{9.3} \]
where F s = 1/T s is the sampling frequency. From the discussion of the amplitude spectrum of a rectangular pulse train in Section 4.2 (see, for example, Worked Example 4.1), we recall that the amplitudes of the harmonic frequency components of 𝛿 Ts (t) are given by
\[ A_o = Ad; \qquad A_n = 2Ad\,\mathrm{sinc}(nd), \quad n = 1, 2, 3, \cdots \tag{9.4} \]
where A is the amplitude of the pulse train, and the first null in the spectrum occurs at frequency f = F s /d. Thus, as d → 0, the sinc envelope flattens out and the first null occurs at f → ∞, so that the amplitudes A n of the harmonic frequency components become
\[ A_o = Ad; \qquad A_n = 2Ad, \quad n = 1, 2, 3, \cdots \tag{9.5} \]
Normalising the factor Ad to unity, and substituting Eq. (9.5) in (9.3), we obtain the normalised Fourier series of an impulse train as
\[ \delta_{T_s}(t) = 1 + 2\cos[2\pi F_s t] + 2\cos[2\pi(2F_s)t] + 2\cos[2\pi(3F_s)t] + 2\cos[2\pi(4F_s)t] + \cdots \tag{9.6} \]
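The flattening of the sinc envelope as d → 0 can be checked numerically. The sketch below is only illustrative: it assumes the normalised definition sinc(x) = sin(𝜋x)/(𝜋x), which is consistent with a first null at f = F s /d, and it holds the product Ad at unity through a hypothetical parameter named area.

```python
import math


def sinc(x: float) -> float:
    """Normalised sinc, sin(pi*x)/(pi*x), with sinc(0) = 1 (assumed definition)."""
    if x == 0.0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)


def harmonic_amplitudes(duty_cycle: float, n_max: int, area: float = 1.0) -> list[float]:
    """Return [A_o, A_1, ..., A_n_max] from Eq. (9.4), with the product A*d held at `area`."""
    amplitudes = [area]  # A_o = A*d
    amplitudes += [2.0 * area * sinc(n * duty_cycle) for n in range(1, n_max + 1)]
    return amplitudes


# As the duty cycle shrinks, A_1..A_4 all approach 2*A*d, as in Eq. (9.5)
for d in (0.25, 0.01, 1e-6):
    print(d, [round(a, 4) for a in harmonic_amplitudes(d, 4)])
```

With d = 0.25 the fourth harmonic sits on the first null of the envelope, whereas for very small d every harmonic has essentially the same amplitude.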
Substituting in Eq. (9.2) yields an alternative expression for the instantaneously sampled signal g𝛿 (t)
\[ \begin{aligned} g_\delta(t) &= g(t)\{1 + 2\cos[2\pi F_s t] + 2\cos[2\pi(2F_s)t] + \cdots\} \\ &= g(t) + 2g(t)\cos[2\pi F_s t] + 2g(t)\cos[2\pi(2F_s)t] + \cdots \\ &= g(t) + 2\sum_{n=1}^{\infty} g(t)\cos(2\pi nF_s t) \end{aligned} \tag{9.7} \]
Equation (9.7) is an important result, which shows that the instantaneously sampled signal g𝛿 (t) is the sum of the original signal g(t) and the product of 2 g(t) and an infinite array of sinusoids of frequencies F s , 2F s , 3F s , … Recall that the double-sided spectrum of 2g(t) cos(2𝜋nF s t) is merely the spectrum G(f ) of g(t) shifted without modification from the location f = 0 along the frequency axis to the locations f = ±nF s .
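The replication described above can be made explicit by transforming Eq. (9.7) term by term. The following is a sketch, assuming G(f ) denotes the Fourier transform of g(t) and using the standard frequency-shifting property; the symbol G𝛿 (f ) for the spectrum of g𝛿 (t) follows the usage in the figures.

```latex
\begin{align*}
2g(t)\cos(2\pi nF_s t) &= g(t)\,e^{j2\pi nF_s t} + g(t)\,e^{-j2\pi nF_s t}
  \;\longleftrightarrow\; G(f - nF_s) + G(f + nF_s) \\
G_\delta(f) &= G(f) + \sum_{n=1}^{\infty}\left[G(f - nF_s) + G(f + nF_s)\right]
  = \sum_{n=-\infty}^{\infty} G(f - nF_s)
\end{align*}
```

The n = 0 term is the baseband, and each remaining term is an unmodified copy of G(f ) centred on a multiple of F s .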
9.3.1 Lowpass Signals Let the original signal g(t) have the spectrum G(f ) shown in Figure 9.2a. The shape of G(f ) shown is merely representative and is unimportant. What matters is that g(t) is bandlimited with bandwidth f m . Therefore, the double-sided spectrum of g𝛿 (t) is as shown in Figure 9.2c. The normalised spectrum of the switching impulse train is also shown in Figure 9.2b.
Figure 9.2 Instantaneous sampling of analogue signal. Spectra of the following signals: (a) analogue signal; (b) switching impulse train; (c) sampled signal.
Note that (instantaneous) sampling does not distort the spectrum G(f ) of a bandlimited signal g(t). Rather, it replicates G(f ) at intervals of the sampling frequency F s . We see from Figure 9.2c that the lowest frequency component in the first band replicated at F s is F s – f m , whereas the highest-frequency component in the baseband is f m . So, if F s − f m ≥ f m (i.e. F s ≥ 2f m ) then the first replicated band does not overlap into the baseband, and neither does any of the other replicated bands overlap with another. Thus, any one of these bands, and hence the original signal g(t), can be recovered without distortion by employing a realisable filter. An LPF with a cut-off frequency f m will recover g(t) by passing the baseband (or zeroth band) in G𝛿 (f ), corresponding to the first term in the right-hand side of Eq. (9.7), and blocking all other replicated bands. We see therefore that a bandlimited signal of bandwidth f m can be recovered without distortion from its samples taken at a rate F s ≥ 2f m . This is a statement of the sampling theorem.
9.3.2 Bandpass Signals A lowpass signal has been assumed so far. However, the sampling theorem is also applicable to a bandpass signal gbp (t) of bandwidth B that contains positive frequencies centred on f c ≫ 0, as shown in Figure 9.3a. The highest-frequency component of gbp (t) is f m and the lowest-frequency component is f L . According to the sampling theorem, the spectrum |Gbp𝛿 (f )| of the sampled signal results from replications (at regular frequency spacing F s ) of the spectrum |Gbp (f )| of gbp (t). Figure 9.3a shows that the spectrum of gbp (t) lies in the positive frequency band (fL , fm ) and corresponding negative frequency band (−fm , −fL ). The nth replicated band within Gbp𝛿 (f ) is at locations
\[ \text{Band}_n = (nF_s - f_m,\; nF_s - f_L),\; (nF_s + f_L,\; nF_s + f_m), \quad n = \pm 1, \pm 2, \pm 3, \cdots \tag{9.8} \]
where positive n specifies the positive band and negative n specifies the corresponding negative band. If we apply the sampling theorem as earlier stipulated for lowpass signals then the original bandpass signal can obviously be recovered from samples of gbp (t) taken at the rate F s ≥ 2f m . The resulting spectrum of the sampled signal is shown in Figure 9.3b for F s = 2f m . Only the first replicated band is shown in Figure 9.3b, where the positive bands (F s − f m , F s − f L ) and (F s + f L , F s + f m ) have been labelled 1 and 1′ , respectively, and their corresponding negative bands are labelled −1 and −1′ , respectively. Band 1 at (F s − f m , F s − f L ) is the lowest replicated positive band. Substituting F s = 2f m , we see that this band is at location (2f m − f m , 2f m − f L ) = (f m , f m + B), which just avoids any overlap into the original spectrum Gbp (f ) at (f L , f m ) shown in Figure 9.3a. A bandpass filter (BPF) of bandwidth B centred at f c (having gain response as shown in dotted outline in Figure 9.3b) will therefore recover Gbp (f ) and hence the bandpass signal gbp (t) from the sampled sequence. It is worth noting that here the required reconstruction filter is a BPF, whereas the filter employed to reconstruct baseband signals is lowpass, as discussed in the previous section. The biggest drawback of sampling a bandpass signal at the rate F s = 2f m discussed above is that this rate can be prohibitively high since f c is often quite large, sometimes many orders of magnitude larger than the signal bandwidth B. Observing that large swathes of Gbp (f ) are empty, with occupancy only within the two bands (f L , f m ) and (−f m , −f L ), distortion-free reconstruction ought to be possible at a much lower sampling rate if we can find one that simply ensures that none of the replicated bands of Eq. (9.8) overlaps into any of these two original bands. 
This is indeed the case for a sampling rate F s in the range
\[ \frac{2f_m}{p} \le F_s \le \frac{2f_L}{p-1}, \qquad 1 \le p \le \left\lfloor \frac{f_m}{B} \right\rfloor \tag{9.9} \]
where p is an integer and ⌊x⌋ denotes the integer part of a positive real number x. Let us elaborate on two special cases of Eq. (9.9) to shed more light on its interpretation:
Figure 9.3 Instantaneous sampling of bandpass signal gbp (t) of bandwidth B centred on f c and located at (f L , f m ), where f L = 3B: (a) representative spectrum of gbp (t); (b)–(d) spectra of sampled signal using sampling rates F s = 2f m , F s = 2B, and F s = 2.8B, respectively.
● If f m = B, meaning that the signal is a baseband (i.e. lowpass) signal, which by definition has bandwidth B equal to its highest-frequency component f m , then p = 1 and the applicable range of sampling rates is
\[ \frac{2f_m}{1} \le F_s \le \frac{2f_L}{1-1}; \quad\Rightarrow\quad 2f_m \le F_s \le \infty; \quad\Rightarrow\quad F_s \ge 2f_m \]
which is simply the sampling theorem stated earlier for baseband signals and agrees with a minimum sampling rate F smin = 2f m as expected.
● If f L = qB, where q is an integer, it means that the positive frequency range f L to f m of gbp (t) starts at a location which is an integer multiple of bandwidth B. Note that q = 0 corresponds to a start at the origin, which is the baseband case discussed above, so here we consider only the bandpass cases where q ≥ 1. Substituting f L = qB and f m = qB + B = (q + 1)B into Eq. (9.9) yields the applicable range of sampling rates as
\[ \frac{2(q+1)B}{p} \le F_s \le \frac{2qB}{p-1}, \qquad 1 \le p \le q+1 \]
Let us take a moment to examine the possibilities for F s . First, when q = 1, there are only two possible values for p, namely p = 1 and p = 2, which respectively yield the following ranges for F s
\[ \begin{aligned} q = 1,\ p = 1: &\quad \frac{4B}{1} \le F_s \le \frac{2B}{1-1}; \quad\Rightarrow\quad F_s \ge 4B \\ q = 1,\ p = 2: &\quad \frac{4B}{2} \le F_s \le \frac{2B}{2-1}; \quad\Rightarrow\quad F_s = 2B \end{aligned} \]
Thus when q = 1 (which means that f L = B and f m = 2B) then alias-free sampling is possible at the specific rate F s = 2B or at any rate greater than or equal to 4B. The latter option (F s ≥ 4B) corresponds to sampling at a rate that is at least twice the maximum frequency component of the bandpass signal. This was explored earlier and is illustrated in Figure 9.3b. So it is the first option (F s = 2B) that delivers a saving in sampling rate by taking advantage of the unoccupied frequency region f < f L to allow sampling to replicate the band (f L , f m ) at a regular spacing F s < 2f m without any overlap into ±(f L , f m ).
Next, examining the situation for q = 2 (when p can take on values 1, 2, and 3), q = 3 (when p can be 1, 2, 3, 4) and so on, we find that all cases of integer band positioning (i.e. f L = qB, q = 0, 1, 2, 3, …) allow alias-free sampling (i.e. no overlap between image and original bands) at a minimum rate F s = 2B. In addition, there are windows of other nonminimum rates F s > 2B that support alias-free sampling. For example, for a bandpass signal in which the lowest-frequency component f L is three times its bandwidth (i.e. q = 3 and f m = 4B), there are four increasingly narrower windows from which F s can be selected for alias-free sampling, namely
\[ \begin{aligned} \text{Window 1:} &\quad F_s \ge 8B \\ \text{Window 2:} &\quad 4B \le F_s \le 6B \\ \text{Window 3:} &\quad \tfrac{8}{3}B \le F_s \le 3B \\ \text{Window 4:} &\quad F_s = 2B \end{aligned} \]
In general, for all values of q, the smallest value of p (= 1) gives the sampling rates window F s ≥ 2f m , the largest value of p (= q + 1) gives the minimum sampling rate F s = 2B, and the values of p in between give a total of q − 1 windows of allowed sampling rates between 2B and 2f m .
Figure 9.3c and d illustrate alias-free sampling at a rate of F s = 2B and F s = 2.8B, respectively, for a bandpass signal in which f L = 3B. The pair of image bands replicated at nF s is labelled n and n′ in the figures. Notice in Figure 9.3c how the first three replicated bands fall below f L , whereas the fourth (and higher) replicated bands fall above f m , and none of them overlaps into the original band (f L , f m ). This original band can therefore be extracted using a BPF to reconstruct the original signal gbp (t) without any distortion. A nonminimum rate (such as F s = 2.8B in this example) allows a realisable BPF to be used as illustrated in Figure 9.3d.
It is important to recognise the fact that allowed sampling rates for a bandpass signal fall in disjoint windows. This means that one cannot increase the sampling rate willy-nilly above an allowed minimum value in an attempt, for example, to insert or increase a gap (called guard band) between the original and the replicated bands to permit the use of realisable reconstruction filters. The following general guidelines should be borne in mind when sampling a bandpass signal of bandwidth B and frequency band (f L , f m ).
● You cannot sample at any rate less than 2B, otherwise there will be alias distortion.
● You can sample at any rate ≥ 2f m without alias distortion.
● You can use the theoretical minimum rate F s = 2B only if the passband is located at an integer number of bandwidths from the origin, i.e. if f m = (q + 1)B, where q is a positive integer.
● There are some rates between 2B and 2f m which will allow alias-free sampling, but these rates must satisfy Eq. (9.9). Any rate not satisfying this equation will result in an overlap of a replicated band into the original band thereby causing alias distortion.
● To provide a transition band Δf on either side of (f L , f m ) for a realisable reconstruction filter, apply Eq. (9.9) to the augmented band (f L − Δf /2, f m + Δf /2) and determine the minimum allowed sampling rate for this augmented band. When the true signal of band (f L , f m ) is sampled at the minimum rate so determined, there will be a guaranteed guard band ≥ Δf on either side of (f L , f m ) separating it from adjacent image bands.
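The windows of allowed rates given by Eq. (9.9) are easy to enumerate by machine. The sketch below is a minimal illustration under stated assumptions: the function names are hypothetical, frequencies are expressed in units of B so that the ratio f m /B is computed exactly, and the upper limit for p = 1 is taken as infinity.

```python
import math


def bandpass_sampling_windows(f_low: float, f_high: float) -> list[tuple[int, float, float]]:
    """Allowed uniform-sampling windows (p, Fs_min, Fs_max) from Eq. (9.9).

    p runs from 1 to floor(f_high / B), where B = f_high - f_low.
    For p = 1 the upper limit 2*f_low/(p - 1) is taken as infinity.
    Note: with floating-point inputs, a ratio f_high/B that should be an
    integer may need a small tolerance before flooring.
    """
    if not 0.0 <= f_low < f_high:
        raise ValueError("require 0 <= f_low < f_high")
    bandwidth = f_high - f_low
    p_max = math.floor(f_high / bandwidth)
    windows = []
    for p in range(1, p_max + 1):
        fs_min = 2.0 * f_high / p
        fs_max = math.inf if p == 1 else 2.0 * f_low / (p - 1)
        windows.append((p, fs_min, fs_max))
    return windows


def is_alias_free(fs: float, f_low: float, f_high: float) -> bool:
    """True if fs lies inside any of the windows allowed by Eq. (9.9)."""
    return any(lo <= fs <= hi for _, lo, hi in bandpass_sampling_windows(f_low, f_high))


# Band located at (3B, 4B), with frequencies expressed in units of B
for p, lo, hi in bandpass_sampling_windows(3.0, 4.0):
    print(p, lo, hi)
print(is_alias_free(2.8, 3.0, 4.0))  # True: 2.8B lies in the window for p = 3
print(is_alias_free(3.5, 3.0, 4.0))  # False: 3.5B falls between two windows
```

For the band (3B, 4B) this reproduces the four windows listed above, and it confirms that a rate such as 3.5B, although above the minimum 2B, is not allowed.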
The method of bandpass sampling described above is known as uniform bandpass sampling. Another method of bandpass sampling, known as quadrature sampling, allows the use of any rate F s ≥ 2B. For a more detailed discussion of bandpass sampling supported by worked examples, the reader is referred to Chapter 4 of [1]. In the light of Figure 9.3, it should be emphasised that the required minimum sampling rate or Nyquist frequency F smin is twice the bandwidth of the analogue signal, and not necessarily twice the maximum frequency component of the signal. In lowpass signals, bandwidth and maximum frequency component are equal, and F smin may be correctly expressed as twice the maximum frequency component. However, the bandwidth of bandpass signals is typically much less than the maximum frequency component of the signal, and F smin in this case must be expressed as twice the bandwidth. In the rest of this chapter it will be assumed that the analogue signal g(t) is a lowpass signal. The discussion that follows may be applied to a bandpass signal gbp (t) of centre frequency f c if the signal is first transformed into a lowpass signal by a frequency translation of f c .
9.3.3 Sampling at Nyquist Rate Figure 9.4 shows the result of sampling an analogue signal g(t) of bandwidth f m at the Nyquist rate 2f m . Note that the replicated bands in the spectrum of the sampled signal touch but do not overlap. In order to recover the original signal an LPF is required that passes the baseband spectrum containing frequencies f = 0 to f m , and completely rejects the replicated bands at frequencies f = f m to ∞. The filter must have the ideal brickwall frequency response of zero transition width, as shown by the dotted line in Figure 9.4c. Such a response is unrealisable in real time.
Figure 9.4 Sampling at Nyquist rate F s = 2f m . Spectra of the following signals: (a) analogue signal g(t); (b) switching impulse train; (c) sampled signal g𝛿 (t).
Figure 9.5 Sampling at a rate F s larger than the Nyquist rate allows use of a realisable reconstruction filter.
Therefore, in practice a sampling rate higher than the Nyquist frequency is employed to allow the use of a realisable reconstruction filter having a finite transition width. The frequency response of the reconstruction filter is shown in dotted lines in Figure 9.5. It has the following specifications
\[ \begin{aligned} \text{Pass band:} &\quad 0 \le f \le f_m \\ \text{Transition band:} &\quad f_m \le f \le F_s - f_m \\ \text{Stop band:} &\quad F_s - f_m \le f < \infty \end{aligned} \tag{9.10} \]
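The band edges of Eq. (9.10) follow directly from F s and f m . The following minimal sketch (hypothetical function name, frequencies in kHz) returns the three bands and the available transition width F s − 2f m . The example values f m = 3.4 kHz and F s = 8 kHz are the telephone-speech figures quoted in this chapter.

```python
def reconstruction_filter_bands(fs: float, fm: float) -> dict[str, tuple[float, float]]:
    """Band edges of the lowpass reconstruction filter specified by Eq. (9.10)."""
    if fs < 2.0 * fm:
        raise ValueError("F_s < 2 f_m: replicated bands overlap the baseband")
    return {
        "pass": (0.0, fm),
        "transition": (fm, fs - fm),
        "stop": (fs - fm, float("inf")),
    }


# Telephone speech: f_m = 3.4 kHz sampled at F_s = 8 kHz
bands = reconstruction_filter_bands(fs=8.0, fm=3.4)
transition_width = bands["transition"][1] - bands["transition"][0]
print(bands)
print(transition_width)  # approximately 1.2 kHz, i.e. F_s - 2 f_m
```

At the Nyquist rate itself the transition width collapses to zero, which is the brickwall case.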
Let us now consider what happens if the sampling theorem is flouted by reducing the sampling frequency below the Nyquist rate.
9.4 Aliasing An analogue signal g(t) sampled at less than the Nyquist rate is said to be undersampled. The replicated bands in the spectrum G𝛿 (f ) of the sampled signal overlap, as shown in Figure 9.6, giving a resultant spectrum that is no longer an exact replica of the original spectrum G(f ). The baseband spectrum in G𝛿 (f ), the shaded region of Figure 9.6, is clearly distorted. The original signal g(t) can no longer be recovered from the sampled signal g𝛿 (t) even with an ideal LPF. This distortion resulting from undersampling is known as alias distortion because every frequency component f h in the original signal that is higher than half the sampling frequency F s appears in the sampled signal at a false or alias frequency
\[ f_a = |F_s - f_h| \tag{9.11} \]
Figure 9.6 Aliasing distortion due to undersampling at a rate F s less than the Nyquist rate.
To understand how the alias frequency is produced, let us consider the sampling of a sinusoidal signal g(t) of frequency f m = 4 kHz. Figure 9.7a shows samples of the sinusoid taken at a rate F s = 12 kHz, which satisfies the sampling theorem. You may observe in Figure 9.7b,c that this sequence of samples, denoted g𝛿 (t), not only fits the original sinusoid f m but also exactly fits and replicates an infinite set of sinusoids at frequencies
\[ F_s \pm f_m,\quad 2F_s \pm f_m,\quad 3F_s \pm f_m,\quad \cdots \]
This means that g𝛿 (t) contains the set of frequency components
\[ f_m,\quad F_s - f_m,\quad F_s + f_m,\quad 2F_s - f_m,\quad 2F_s + f_m,\quad 3F_s - f_m,\quad 3F_s + f_m,\quad \cdots \]
Thus, when g𝛿 (t) is passed through a lowpass reconstruction filter as specified by Eq. (9.10), the lowest-frequency component f m is extracted, thereby recovering the original sinusoid without distortion. Figure 9.8 gives a plot of the spectrum of g𝛿 (t) showing the above frequency components and the response of a reconstruction filter that would extract the original sinusoid from the sampled sequence. It is worth pointing out that for this arrangement to work without any distortion the reconstruction LPF must pass only frequency f m and block all the other frequency components in the above set, starting from F s − f m . This requires that F s − f m must be no lower than f m . That is
\[ F_s - f_m \ge f_m; \quad\Rightarrow\quad F_s \ge 2f_m \tag{9.12} \]
which is a statement of the sampling theorem given earlier.
Figure 9.7 The samples g𝛿 (t) of a sinusoidal signal g(t) of frequency f m taken at a rate F s larger than the Nyquist rate will fit (i.e. replicate) an infinite set of sinusoids f m , F s ± f m , 2F s ± f m , 3F s ± f m , … The case shown is f m = 4 kHz and F s = 12 kHz, with fitted sinusoids at F s − f m = 8 kHz, F s + f m = 16 kHz, 2F s − f m = 20 kHz, and 2F s + f m = 28 kHz.
Figure 9.8 Amplitude spectra of (a) sinusoid g(t) of frequency f m and (b) the sinusoid sampled at a rate F s .
Figure 9.9 Sampling a sinusoid of frequency f m = 4 kHz at a rate F s = 6 kHz (less than the Nyquist rate). The samples also fit (i.e. replicate) an infinite set of sinusoids having frequencies nF s ± f m , n = 1, 2, 3, … In this case, the lowest replicated frequency F s − f m = 2 kHz is lower than f m and therefore cannot be blocked by an LPF designed to pass f m , so it will constitute an alias frequency, denoted f a .
Let us consider what happens when the sinusoid is sampled, as shown in Figure 9.9, at a rate of F s = 6 kHz, which is less than the Nyquist rate (= 8 kHz). Now the sequence of samples is so widely spaced that it also exactly fits a lower-frequency sinusoid f a = F s − f m = 2 kHz, which is shown in Figure 9.9. The sequence will of course also fit higher-frequency sinusoids at F s + f m and nF s ± f m , n = 2, 3, 4, …, which are not shown in Figure 9.9 to avoid clutter. Thus, in this case of undersampling, the sampled signal contains the frequency components
\[ f_a,\quad f_m,\quad F_s + f_m,\quad 2F_s - f_m,\quad 2F_s + f_m,\quad 3F_s - f_m,\quad 3F_s + f_m,\quad \cdots \]
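That the same set of samples fits both the 4 kHz sinusoid and its 2 kHz alias can be confirmed numerically. The sketch below is illustrative only and assumes a zero-phase cosine; for a sinusoid of phase 𝜙 the alias fits the same samples with phase −𝜙. The helper name sample_cosine is hypothetical.

```python
import math


def sample_cosine(freq_hz: float, fs_hz: float, n_samples: int) -> list[float]:
    """Return cos(2*pi*f*n/Fs) for n = 0 .. n_samples - 1."""
    return [math.cos(2.0 * math.pi * freq_hz * n / fs_hz) for n in range(n_samples)]


fs = 6_000.0                                # sampling rate, below the Nyquist rate of 8 kHz
original = sample_cosine(4_000.0, fs, 12)   # f_m = 4 kHz
alias = sample_cosine(2_000.0, fs, 12)      # f_a = F_s - f_m = 2 kHz

# The two sample sequences agree to within floating-point rounding
max_difference = max(abs(a - b) for a, b in zip(original, alias))
print(max_difference)
```

Since the samples are indistinguishable, no processing applied after sampling can tell whether they came from the 4 kHz sinusoid or from the 2 kHz one.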
The lowpass reconstruction filter of Eq. (9.10) will therefore pass two sinusoids, namely the original sinusoid f m , and an alias sinusoid f a , which causes distortion. This can be seen in Figure 9.10b, which shows the spectrum of the undersampled sinusoid. For clarity, the replicated bands have been sketched using different line patterns: solid for the baseband, dotted for the first replicated band F s ± f m , and dashed for the second replicated band 2F s ± f m . Note that the alias sinusoid of frequency f a arises from the first replicated band overlapping into the baseband. The overlapping of replicated bands occurs because they have been duplicated too close to each other along the frequency axis at intervals of (a small value of) F s .
Figure 9.10 Amplitude spectrum |G𝛿 (f )| of a sinusoid of frequency f m that is sampled at a rate F s less than the Nyquist rate. Spectra of (a) the sinusoid and (b) the undersampled sinusoid.
It must be emphasised that only frequency components higher than half the sampling frequency produce an alias. When the sampling frequency is chosen to be at least twice the maximum frequency component of the (lowpass) analogue signal as required by the sampling theorem then no such high-frequency components exist in the signal, and aliasing does not occur.
Figure 9.10 shows aliasing caused by the overlap of only the first replicated band into the baseband of the sampled signal’s spectrum. In general, depending on how small F s is relative to f m , aliasing may be caused by the overlap of one or more of the replicated bands into the baseband. That is, if kF s − f m < f m (i.e. kF s < 2f m ), it means that the kth replicated band overlaps into the baseband (0, f m ), where f m is the maximum frequency component of the lowpass signal. As a result of this overlap, a frequency component f in the lowpass signal in the range
\[ kF_s - f_m \le f \le f_m, \quad k = 1, 2, 3, 4, \cdots \tag{9.13} \]
will present itself within the sampled signal sequence at an alias frequency
\[ f_{a,k} = |kF_s - f| \tag{9.14} \]
so that a kth band of alias frequencies is produced in the range
\[ \text{Band}_{a,k} = |kF_s - f_m| \to f_m \tag{9.15} \]
inside the signal reconstructed (by the LPF) from the sampled sequence.
Figure 9.11 shows the result of sampling a 5 kHz sinusoid using sampling rate F s = 3 kHz, which clearly violates the sampling theorem. The message is a sinusoid, so it contains only one frequency component f which is also the maximum frequency f m . Substituting F s = 3, f m = 5 into the condition stated in Eq. (9.13), we see that it is satisfied for k = 1, 2, and 3, since, for k = 1, 2, and 3, kF s − f m = −2, 1, and 4 kHz, respectively, each of which is less than f m . Thus, the first three replicated bands overlap into the baseband and produce the respective alias components labelled in Figure 9.11b and obtained using Eq. (9.14) as
\[ \begin{aligned} f_{a,1} &= |F_s - f_m| = |3 - 5| = 2\ \text{kHz} \\ f_{a,2} &= |2F_s - f_m| = |6 - 5| = 1\ \text{kHz} \\ f_{a,3} &= |3F_s - f_m| = |9 - 5| = 4\ \text{kHz} \end{aligned} \]
Figure 9.11 Spectra of (a) sinusoid of frequency f m = 5 kHz and (b) its sampled sequence using sampling rate F s = 3 kHz, with alias components at 1, 2, and 4 kHz.
Figure 9.12 Steps for extracting all alias frequency components in a sampled sequence obtained by sampling an analogue signal of maximum frequency f m at a sampling rate F s .
Figure 9.12 provides steps in the form of a flowchart which may be followed to systematically check for and extract all alias components in a sampled sequence. Starting with k = 1, one tests if kF s < 2f m . If the sampling theorem was obeyed in the sampling then the test outcome at k = 1 will be negative (i.e. ‘No’) and the process terminates with no alias components. However, if the test outcome is positive (i.e. ‘Yes’), this indicates the presence of an alias component (or band) located at frequency f a,k = |kF s − f m |. The value of k is incremented by 1 and the steps are repeated. This continues until a value of k is reached at which the test yields a negative outcome, indicating that there are no further alias components, and the process terminates. Aliasing is further explored in the following worked example.
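The flowchart procedure of Figure 9.12 translates into a short loop. The following is a minimal sketch, not a definitive implementation: the function name alias_frequencies is an illustrative choice, and both arguments are assumed to be in the same units (kHz here).

```python
def alias_frequencies(fs: float, fm: float) -> list[float]:
    """Alias frequencies f_a,k = |k*F_s - f_m| for every k with k*F_s < 2*f_m."""
    if fs <= 0 or fm <= 0:
        raise ValueError("fs and fm must be positive")
    aliases = []
    k = 1
    while k * fs < 2.0 * fm:              # the test box of the flowchart
        aliases.append(abs(k * fs - fm))  # Eq. (9.14) evaluated at f = f_m
        k += 1
    return aliases


print(alias_frequencies(fs=3.0, fm=5.0))   # [2.0, 1.0, 4.0], the case of Figure 9.11
print(alias_frequencies(fs=12.0, fm=4.0))  # [] because the sampling theorem is obeyed
```

For a signal occupying a band rather than a single frequency, each returned value is the lower edge of the corresponding alias band of Eq. (9.15).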
Worked Example 9.1 An information-bearing signal g(t) has the spectrum shown in Figure 9.13a.
(a) What is the minimum sampling frequency F smin to avoid aliasing?
(b) Sketch the spectrum of the sampled signal (in the interval ± 28 kHz) when sampled at F smin . Why would F smin not be used in practice?
(c) Sketch the spectrum of the sampled signal over the interval ± 16 kHz when g(t) is sampled at a rate F s = 6 kHz and determine the band of alias frequencies.

(a) Minimum sampling frequency F smin required to avoid aliasing is twice the maximum frequency component f m of the analogue signal. From Figure 9.13a, f m = 4 kHz, so that
\[ F_{s\min} = 2f_m = 8\ \text{kHz} \]
(b) Figure 9.13b shows a sketch of the spectrum of the information signal when sampled at the above rate. This minimum sampling rate is not used in practice because it would necessitate the use of an ideal brickwall reconstruction filter to recover the original signal g(t) at the receiver. A higher sampling rate is used, which creates a frequency gap between the replicated spectra. This gap is needed by a realisable filter to make a transition from passband to stopband.
(c) The spectrum of the sampled signal with F s = 6 kHz is shown in Figure 9.13c. Only the first replicated band overlaps into the baseband to create a band of alias frequencies given by Eq. (9.15), with k = 1, as
\[ \text{Band}_{a,1} = |F_s - f_m| \to f_m = 2 \to 4\ \text{kHz} \]
Figure 9.13 Worked Example 9.1: (a) spectrum |G(f )| of g(t); (b) spectrum |G𝛿 (f )| of the sampled signal for f m = 4 kHz, F s = 8 kHz; (c) distorted spectrum |G𝛿 (f )| of the sampled signal for f m = 4 kHz, F s = 6 kHz.
A lowpass reconstruction filter will recover the spectrum in the shaded region of Figure 9.13c (from −4 to 4 kHz). It is this alias band that is responsible for the distortion in this recovered spectrum, altering its shape from a linear slope to a flat level between 2 and 4 kHz. Two steps are usually employed to minimise alias distortion.
● Prior to sampling, the signal bandwidth is limited to a small value f m that gives acceptable quality. The LPF employed for this purpose is therefore called an anti-alias filter.
● The filtered signal is sampled at a rate F s > 2f m . For example, in standard telephone speech transmission using pulse code modulation (PCM), f m = 3.4 kHz and F s = 8 kHz.
9.5 Anti-alias Filter As stated above, the anti-alias circuit is an LPF used to remove nonessential or insignificant high-frequency components in the message signal in order, ultimately, to reduce the required transmission bandwidth. The application and desired fidelity determine the extent to which high-frequency components are discarded. For example, in high-fidelity compact disc audio, all audible frequencies must be faithfully recorded. This means that frequency components up to f m = 20 kHz are retained, and a sampling rate F s > 2f m is employed. Usually F s = 44.1 kHz. However, in telephone speech the requirements for fidelity are much less stringent and the bandwidth of the transmission medium is very limited. Although the low frequencies (50–200 Hz) in speech signals enhance speaker recognition and naturalness, and the high frequencies (3.5–7 kHz) enhance intelligibility, good subjective speech quality is still possible in telephone systems with the baseband frequency limited to 300–3400 Hz. This frequency range has been adopted as the standard telephone speech baseband. In television signals, the eye is largely insensitive to the high-frequency components of the colour signal, so these frequencies may be suppressed without a noticeable degradation in colour quality. An anti-alias filter is required to pass the baseband signal frequencies up to f m , and ideally to present infinite attenuation to higher frequencies. In this way there is no overlapping of replicated bands when the filtered signal is sampled at a rate of F s > 2f m . In practice, we can achieve neither infinite attenuation nor brickwall (zero-transition-width) performance but can only design a filter that has a specified minimum attenuation in the stopband and has a small but non-negligible transition width from passband to stopband. 
Let us examine the issues involved in the specification of an anti-alias filter using the worst-case situation shown in Figure 9.14 where a baseband information signal gi (t) has a uniform spectrum and is to be limited in bandwidth to f m before sampling. The spectrum G(f ) of the filtered signal is shown in Figure 9.14a along with the spectrum G𝛿 (f ) of the signal obtained by sampling g(t) at a rate of F s . Figure 9.14b shows a more detailed view of G𝛿 (f ). Observe that frequency components beyond F sb in the first replicated band cross over into the message baseband (0 → f m ) and will cause aliasing. Thus, we must have the following specifications for the anti-alias filter.
● The stopband of the filter must start at frequency F sb given by
\[ F_{sb} = F_s - f_m \tag{9.16} \]
● In most cases, sampling is followed by quantisation to restrict the signal to a discrete set of values, as discussed in Chapter 10. The process of quantisation inevitably introduces error, and it is therefore enough to ensure that aliasing error in the passband is maintained at a level just below the inevitable quantisation noise. So the minimum stopband attenuation Amin must be as indicated in Figure 9.14b
\[ A_{\min} = \text{SQNR} + 3\ \text{dB} \tag{9.17} \]
where SQNR is the signal-to-quantisation-noise-power ratio.
Figure 9.14 Specification of an anti-alias filter: (a) the input signal gi (t) is passed through the anti-alias LPF to give the filtered signal g(t), which is sampled at rate F s to give the sampled signal g𝛿 (t), shown with the spectra |Gi (f )|, |G(f )|, and |G𝛿 (f )|; (b) detailed view of |G𝛿 (f )| in dB, indicating SQNR, Amin , f m , and F sb .
Let the anti-alias filter be an nth order lowpass Butterworth filter. The attenuation A of this filter as a function of frequency f is given by
\[ A = 10\log_{10}\left[1 + \left(\frac{f}{f_m}\right)^{2n}\right]\ \text{dB} \tag{9.18} \]
where f m is the maximum passband frequency (i.e. the 3 dB point, which is the frequency at which the gain of the filter is down by 3 dB from its maximum value). Since A = Amin at f = F sb , we may write
\[ A_{\min} = 10\log_{10}\left[1 + (F_{sb}/f_m)^{2n}\right] \]
and solve for F sb to obtain
\[ F_{sb} = f_m\left[10^{(A_{\min}/10)} - 1\right]^{1/2n} \]
Finally, using Eq. (9.16), we obtain an expression for the required sampling frequency F s
\[ F_s = f_m\left\{1 + \left[10^{(A_{\min}/10)} - 1\right]^{1/2n}\right\} \tag{9.19} \]
The above equation is an important result, which gives the minimum sampling frequency F s required to maintain alias frequencies at a level at least Amin (dB) below the desired baseband frequency components.
Worked Example 9.2 The analogue input signal to a 12-bit uniform quantisation PCM system has a uniform spectrum that is required to be limited to the band 0 → 4 kHz using a sixth-order Butterworth anti-alias filter. The input signal fully loads the quantiser. In designing the filter, the goal is to maintain aliasing error in the passband at a level just below quantisation noise.
(a) Determine the minimum stopband attenuation Amin of the anti-alias filter.
(b) Calculate the minimum sampling frequency F s .
(c) How does F s compare with the standard sampling rate of 8 kHz used for speech?

(a) Amin is given by Eq. (9.17). It is shown in Chapter 10 that
\[ \text{SQNR} = 1.76 + 6.02k\ \text{dB} \tag{9.20} \]
where k (= 12 in this case) is the number of bits needed to identify each of the discrete signal levels in the quantiser. Thus
\[ A_{\min} = 1.76 + 6.02 \times 12 + 3 = 77\ \text{dB} \]
(b) Substituting the values f m = 4 kHz, n = 6, and Amin = 77 dB in Eq. (9.19) gives the desired minimum sampling frequency
\[ F_s = 4\left\{1 + \left[10^{(77/10)} - 1\right]^{1/12}\right\} = 21.53\ \text{kHz} \]
(c) The required sampling frequency, F s = 21.53 kHz, is rather high compared to a Nyquist rate of 8 kHz. To see why, note that a worst-case scenario was assumed in which the input analogue signal has a uniform amplitude spectrum that needs to be limited to the band 0 → f m using a realisable filter. The anti-alias filter requires enough transition width to reduce the signal’s spectral components within the stopband by the required amount Amin . In practice, for example with speech signals, the amplitude spectrum Gi (f ) is not uniform, and the frequency components above f m are of very small amplitudes, requiring only a little extra attenuation by a filter to bring them down to the level of the quantisation noise voltage. Even for signals of uniform spectrum, the sampling frequency required to maintain aliasing error below quantisation noise can be reduced by using a higher-order (i.e. a steeper roll-off) filter. For example, if in the above problem we use a Butterworth filter of order n = 12 then F s would be only 12.37 kHz. Note, however, that in many cases filters with a steeper roll-off introduce more phase distortion. Question 9.6 deals with the design of an anti-alias filter for a signal of nonuniform spectrum.
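The arithmetic of Worked Example 9.2 can be reproduced with a few lines of Python. This is a minimal sketch of Eqs. (9.17)–(9.20) under the same worst-case assumptions as the example; the function names are illustrative and frequencies are in kHz.

```python
import math


def sqnr_db(bits: int) -> float:
    """Eq. (9.20): SQNR = 1.76 + 6.02 k, in dB, for a fully loaded k-bit quantiser."""
    return 1.76 + 6.02 * bits


def butterworth_attenuation_db(f: float, fm: float, order: int) -> float:
    """Eq. (9.18): attenuation of an nth order Butterworth LPF with 3 dB point fm."""
    return 10.0 * math.log10(1.0 + (f / fm) ** (2 * order))


def min_sampling_rate(fm: float, order: int, a_min_db: float) -> float:
    """Eq. (9.19): F_s that keeps aliasing at least a_min_db below the baseband."""
    ratio = (10.0 ** (a_min_db / 10.0) - 1.0) ** (1.0 / (2 * order))
    return fm * (1.0 + ratio)


a_min = sqnr_db(12) + 3.0                            # Eq. (9.17)
print(round(a_min, 2))                               # 77.0 dB
print(round(min_sampling_rate(4.0, 6, a_min), 2))    # 21.53 kHz for n = 6
print(round(min_sampling_rate(4.0, 12, a_min), 2))   # 12.37 kHz for n = 12
```

Evaluating butterworth_attenuation_db at F sb = F s − f m returns Amin again, which is a convenient consistency check on Eq. (9.19).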
9.6 Non-instantaneous Sampling
We have so far discussed instantaneous sampling, which is not feasible in practice for the following reasons:
● Sampling pulses must have infinite amplitude in order to carry enough signal energy. This can be seen by examining Eq. (9.4) and noting that the pulse amplitude A must be infinite if the factor Ad, and hence the spectral components of the impulse train, are to be nonzero as d → 0.
● A system must have infinite bandwidth in order to sustain instantaneously sampled pulses without introducing some distortion.
● Instantaneous switching is required, which is not attainable using physically realisable electronic devices with their inherent capacitive and inductive elements that tend to slow down the rates of change of voltage and current, respectively.
Let us therefore consider the practical implementation of sampling using pulses of finite width, known as non-instantaneous sampling. We will discuss its effect on the spectrum of the sampled signal, and the distortion on the reconstructed signal. There are two types of non-instantaneous sampling, namely natural and flat-top sampling.
9.6.1 Natural Sampling If we replace the impulse train switching signal 𝛿 Ts (t) in Figure 9.1a with the rectangular pulse train rectTs (t/𝜏) of nonzero duty cycle d = 𝜏/T s shown in Figure 9.15b, we obtain the sampled signal gΠn (t) shown in Figure 9.15c. Note how the top of each pulse follows the variation of the original analogue signal g(t). This scheme, in which an input analogue signal is simply switched through to the output by a rectangular pulse train, is known as natural sampling. The electronic switch in this case is realisable using, for example, CMOS hardware. The output or sampled signal follows the variations of the analogue input during the fraction d of time that the switch is closed and is zero during the fraction (1 − d) of time that the switch is open. The question now is whether the original signal g(t) can be recovered from this non-instantaneously sampled signal. This question is explored in the frequency domain below, but first, for the sake of completeness, let us draw on the work of Chapter 2 to give a mathematical
definition of the rectangular pulse train of Figure 9.15b as

\[ \mathrm{rect}_{T_s}\!\left(\frac{t}{\tau}\right) = \sum_{n} \mathrm{rect}\!\left(\frac{t - nT_s}{\tau}\right), \quad \tau = dT_s \]

\[ \mathrm{rect}\!\left(\frac{t}{\tau}\right) = \begin{cases} 1, & -\tau/2 \le t \le \tau/2 \\ 0, & \text{otherwise} \end{cases} \tag{9.21} \]

Figure 9.15 Natural sampling: (a) analogue signal g(t); (b) switching rectangular pulse train rectTs(t/𝜏) of duty cycle d = 𝜏/Ts; (c) sampled signal gΠn(t).
Let us assume that g(t) is of bandwidth fm and that it has the amplitude spectrum shown in Figure 9.16a. The specific shape of the spectrum is irrelevant; however, we have chosen a rectangular shape because it makes spectral distortion more easily observable. The same arguments leading to Eq. (9.2) and Eq. (9.4) apply. However, in this case d > 0 so that the simplification of Eq. (9.5) does not apply. Using Eq. (4.17) in Chapter 4 (with fo ≡ Fs = 1/Ts), the normalised (Ad = 1) Fourier series of the rectangular pulse train rectTs(t/𝜏) (of duty cycle d = 𝜏/Ts and amplitude A) and that of the naturally sampled signal gΠn(t) are given by

\[ \mathrm{rect}_{T_s}\!\left(\frac{t}{\tau}\right) = 1 + 2\sum_{n=1}^{\infty} \mathrm{sinc}(nd)\cos(2\pi n F_s t) \tag{9.22} \]

\[ g_{\Pi n}(t) = g(t) \times \mathrm{rect}_{T_s}\!\left(\frac{t}{\tau}\right) = g(t) + 2\sum_{n=1}^{\infty} g(t)\,\mathrm{sinc}(nd)\cos(2\pi n F_s t) \tag{9.23} \]
Figure 9.16 Natural sampling with rectangular pulse train of duty cycle 1/4 (Fs = 2.5fm). Spectrum of (a) analogue signal, |G(f)|; (b) switching rectangular pulse train, RTs(f); and (c) sampled signal, GΠn(f).
We see that the naturally sampled signal gΠn(t) is the sum of the original signal g(t) and the product of 2g(t) and an infinite array of sinusoids of frequencies nFs, n = 1, 2, 3, …, each product scaled by the factor sinc(nd). This is an interesting result, which shows that the only difference between natural sampling and instantaneous sampling is that the nth replicated band in the sampled spectrum is reduced in size (but not distorted) by the factor sinc(nd). The spectrum of gΠn(t) is shown in Figure 9.16c for d = 1/4. You will recall from Section 2.6.8 that sinc(nd) = 0 whenever nd is a nonzero integer (±1, ±2, ±3, …). This means that the following replicated bands will be scaled by a zero factor and therefore will be missing from the spectrum of gΠn(t)

\[ n = \frac{1}{d},\ \frac{2}{d},\ \frac{3}{d},\ \frac{4}{d},\ \cdots \]

In Figure 9.16 with d = 1/4, the missing replicated bands are the 4th, 8th, 12th, … The original signal g(t) can be recovered from gΠn(t) using an LPF as discussed earlier, provided the sampling frequency Fs ≥ 2fm, as specified by the sampling theorem. Natural sampling is rarely used in practice because it places a severe limitation on the maximum frequency fm that can be accurately digitised after sampling. It can be shown that if the quantiser (that follows the natural sampler) has a conversion time 𝜏, and the desired conversion accuracy is half the quantiser step size then the highest frequency that can be digitised is

\[ f_m = \frac{1}{2^k \pi \tau} \tag{9.24} \]

where k is the number of bits per sample of the quantiser. Note that there is a limitation on fm whenever the conversion time 𝜏 is nonzero, as is usually the case. For example, with 𝜏 = 0.1 μs and k = 12, we have fm = 777 Hz.
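As a quick numerical check of the two results above, the following Python sketch lists the replicated bands that vanish for a given duty cycle and evaluates Eq. (9.24). The function names and the tolerance used to test for a zero of sinc(nd) are assumptions made for illustration.

```python
import math

def sinc(x):
    """Normalised sinc: sin(pi*x)/(pi*x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def missing_bands(d, n_max, tol=1e-9):
    """Indices n (1..n_max) of replicated bands scaled by sinc(n*d) = 0."""
    return [n for n in range(1, n_max + 1) if abs(sinc(n * d)) < tol]

def max_digitisable_frequency(k_bits, tau):
    """Eq. (9.24): f_m = 1 / (2**k * pi * tau), with tau in seconds."""
    return 1.0 / (2 ** k_bits * math.pi * tau)

print(missing_bands(0.25, 12))                            # bands 4, 8 and 12
print(round(max_digitisable_frequency(12, 0.1e-6)))       # about 777 Hz
```

The second result shows how restrictive the limit is: even a conversion time of only 0.1 μs confines a 12-bit system to signals below about 777 Hz.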
9.6.2 Flat-top Sampling If the top of each pulse is maintained constant or flat with a height equal to the instantaneous value of the analogue signal at the beginning of the pulse then we have what is known as flat-top sampling. This is the usual method of sampling implemented by an integrated circuit referred to as sample-and-hold, the operation of which is shown in Figure 9.17 in terms of a block diagram in (a), circuit implementation in (b), and illustrative input and output waveforms in (c). In Figure 9.17b, switch S1 is normally open and S2 is normally closed. A rectangular pulse train rectTs (t/𝜏) of switching period T s , pulse duration 𝜏, and duty cycle d = 𝜏/T s is used as the switching signal to actuate the two switches. S1 is turned on momentarily by the positive-going edge of the pulse, while S2 is turned off at the same instant and held off for the entire duration 𝜏 of the pulse. The capacitor C charges rapidly (through the negligible resistance path of ‘S1 on’) to the value of g(t) at the instant that S1 is turned on and holds that voltage for the entire pulse duration 𝜏 (during which S2 is off) since it has negligible discharge through the high-resistance path of ‘S2 off’. At the end of the pulse, S2 returns to its normally closed state and C discharges rapidly through S2 to zero and remains there until the next pulse causes S1 to turn momentarily on thereby charging C to the voltage level of g(t) at this next sampling instant. This carries on repeatedly so that the capacitor voltage gives a flat-top-sampled version of g(t). The input signal g(t) is usually connected to S1 through an operational amplifier (opamp, not shown) and the output gΠ (t) is taken from across the holding capacitor C through another opamp (also not shown). Figure 9.17c shows an illustrative analogue waveform and the resulting flat-top-sampled output for d = 1/4. Again, we must ask whether the original signal g(t) can be extracted from this flat-top-sampled signal gΠ (t). 
To examine this, we note that the instantaneously sampled signal g𝛿(t) in Figure 9.1d is a train of impulse or Dirac delta functions each weighted by g(nTs), the value of the analogue signal at the sampling instant. Figure 9.18 shows that the flat-top-sampled signal gΠ(t) consists of a rectangular pulse rect(t/𝜏) replicated at the locations of the impulse functions that constitute g𝛿(t). The height of each replicated pulse is equal to the weight of the impulse at that location. This can be recognised as a convolution process. That is, gΠ(t) is obtained by convolving g𝛿(t) with rect(t/𝜏), which is written as follows

\[ g_{\Pi}(t) = g_{\delta}(t) * \mathrm{rect}(t/\tau) \tag{9.25} \]

Figure 9.17 Sample-and-hold operation. (a) Block diagram; (b) circuit implementation; (c) analogue and sampled waveforms (d = 𝜏/Ts = 1/4).

Figure 9.18 Flat-top sampling as a convolution between instantaneous sampling and a rectangular pulse.
Noting that convolution in the time domain translates into multiplication in the frequency domain, it follows that the spectrum GΠ(f) of the flat-top-sampled signal is given by the product of the spectrum G𝛿(f) of the instantaneously sampled signal g𝛿(t) and the spectrum RTs(f) of the rectangular pulse

\[ G_{\Pi}(f) = G_{\delta}(f)\,R_{T_s}(f) \tag{9.26} \]
With the spectrum G𝛿 (f ) being a replication of G(f ) at intervals F s along the frequency axis, and RTs (f ) being a sinc envelope (Figure 4.29b), it follows that GΠ (f ) is as shown in Figure 9.19b. We see that the spectrum of gΠ (t) is distorted by the sinc envelope of RTs (f ) – the spectrum of the finite-width rectangular pulse. This distortion is called the aperture effect and is like the distortion observed in television and facsimile arising from a finite scanning aperture size. Note, however, that the distortion to the baseband spectrum is very small, depending on the duty cycle of the sampling pulse. An LPF can therefore be used to recover the spectrum G(f ) and hence the original signal g(t) from the flat-top-sampled signal, with compensation made for aperture effect, as discussed after the worked example. Again, we require that F s ≥ 2f m .
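In the time domain, the difference between the two non-instantaneous sampling schemes of this section can be sketched in a few lines of Python. This is only an illustrative model: it assumes that each sampling pulse starts at t = nTs (as in the sample-and-hold description, rather than being centred on nTs as in Eq. (9.21)), and the function names and the 1 Hz test signal are hypothetical.

```python
import math

def natural_sample(g, t, ts, tau):
    """Natural sampling: the output follows g(t) while the switch is closed."""
    n = math.floor(t / ts)                 # index of the current sampling period
    return g(t) if (t - n * ts) < tau else 0.0

def flat_top_sample(g, t, ts, tau):
    """Flat-top sampling: the output holds g(n*ts) for the pulse duration tau."""
    n = math.floor(t / ts)
    return g(n * ts) if (t - n * ts) < tau else 0.0

def g(t):
    """Illustrative analogue signal: a 1 Hz sinusoid (an assumed test input)."""
    return math.sin(2.0 * math.pi * t)

ts, tau = 0.1, 0.025                       # duty cycle d = tau/ts = 1/4
for t in (0.11, 0.12, 0.15):
    print(t, natural_sample(g, t, ts, tau), flat_top_sample(g, t, ts, tau))
```

Within a pulse the natural-sampling output changes with t, whereas the flat-top output stays at the value captured at the start of the pulse; both are zero between pulses.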
Figure 9.19 Flat-top sampling using rectangular pulse train of duty cycle 1/4 as switching signal (Fs = 3fm). Spectrum of (a) analogue signal, |G(f)|, and (b) sampled signal, |GΠ(f)|.
Worked Example 9.3 Obtain expressions for the following spectra in terms of the spectrum G(f) of the original analogue signal g(t): (a) G𝛿(f), the spectrum of the instantaneously sampled signal. (b) GΠn(f), the spectrum resulting from natural sampling. (c) GΠ(f), the spectrum of the flat-top-sampled signal.

(a) G𝛿(f) is obtained by taking the Fourier transform of both sides of Eq. (9.7). Before doing this, we first re-introduce in the right-hand side of (9.7) the factor Ad that was normalised to unity, and note that Ad = A𝜏/Ts = 1/Ts since A𝜏 is the area under the impulse function and we are dealing with a unit impulse train. Therefore

\[ G_{\delta}(f) = \frac{1}{T_s}G(f) + F\!\left[\frac{2}{T_s}\sum_{n=1}^{\infty} g(t)\cos(2\pi n F_s t)\right] = \frac{1}{T_s}\sum_{n=-\infty}^{\infty} G(f - nF_s) \tag{9.27} \]
where we have used the fact that the spectrum of 2g(t)cos(2𝜋nFs t) is G(f ± nFs), which means the spectrum G(f) shifted to the locations −nFs and +nFs along the frequency axis. Eq. (9.27) states that G𝛿(f) is given by exact duplications (except for a scaling factor 1/Ts) of G(f) at intervals of Fs along the frequency axis. This spectrum is shown in Figure 9.2c for a representative G(f).

(b) Let us denormalise the right-hand side of Eq. (9.23) by re-introducing the factor Ad, and note that usually the rectangular pulse train is of unit amplitude, so that A = 1. Taking the Fourier transform of Eq. (9.23) after this change yields

\[ G_{\Pi n}(f) = dG(f) + F\!\left[d\sum_{n=1}^{\infty} 2g(t)\,\mathrm{sinc}(nd)\cos(2\pi n F_s t)\right] = d\sum_{n=-\infty}^{\infty} \mathrm{sinc}(nd)\,G(f - nF_s) \tag{9.28} \]
Equation (9.28) states that GΠn(f) is obtained by replicating G(f) without distortion at intervals of Fs along the frequency axis; however, the duplicates located at ±nFs are scaled down by the factor d sinc(nd), where d is the duty cycle of the rectangular pulse train employed in sampling. The spectrum GΠn(f) is shown in Figure 9.16c for a representative G(f).

(c) GΠ(f) is given by Eq. (9.26), with G𝛿(f) given by Eq. (9.27) and the spectrum RTs(f) of a rectangular pulse given by entry 8 of Table 4.5 as RTs(f) = 𝜏 sinc(f𝜏). Thus

\[ G_{\Pi}(f) = \tau\,\mathrm{sinc}(f\tau) \times \frac{1}{T_s}\sum_{n=-\infty}^{\infty} G(f - nF_s) = d\,\mathrm{sinc}(df/F_s)\sum_{n=-\infty}^{\infty} G(f - nF_s) \tag{9.29} \]
where we have substituted d for the factor 𝜏/T s and d/F s for 𝜏 in the argument of the sinc function. Eq. (9.29) states that GΠ (f ) is obtained by duplicating G(f ) at intervals of F s along the frequency axis, and modifying the frequency component f within each duplicate by the factor dsinc(df/F s ). A plot of GΠ (f ) is shown in Figure 9.19b for a representative G(f ).
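The contrast between Eqs. (9.28) and (9.29) can be made concrete with a small Python sketch. The function names are illustrative assumptions; the point is simply that natural sampling applies one constant factor d sinc(nd) to the whole of the nth replica, whereas flat-top sampling applies a frequency-dependent factor d sinc(df/Fs).

```python
import math

def sinc(x):
    """Normalised sinc: sin(pi*x)/(pi*x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def natural_band_gain(n, d):
    """Eq. (9.28): every frequency in the nth replica is scaled by d*sinc(n*d)."""
    return d * sinc(n * d)

def flat_top_gain(f, d, fs):
    """Eq. (9.29): the component at frequency f is scaled by d*sinc(d*f/fs)."""
    return d * sinc(d * f / fs)

d, fm = 0.25, 1.0            # duty cycle 1/4, bandwidth normalised to 1
fs = 3.0 * fm                # sampling frequency, as in Figure 9.19

print("natural, baseband (n = 0):", natural_band_gain(0, d))
print("natural, first replica   :", natural_band_gain(1, d))
print("flat-top at f = 0        :", flat_top_gain(0.0, d, fs))
print("flat-top at f = fm       :", flat_top_gain(fm, d, fs))
```

The baseband of the naturally sampled signal is scaled uniformly by d, while the flat-top gain already differs between f = 0 and f = fm, which is the aperture effect.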
9.6.3 Aperture Effect
We observe above that neither instantaneous nor natural sampling is suitable for practical applications, although they provide samples from which the original signal can be recovered without distortion. However, the more feasible technique of flat-top sampling introduces a distortion of the baseband spectrum known as the aperture effect, which increases with duty cycle d, as shown in Figure 9.20 for d = 0.1, 0.5, and 1, and Fs = 3fm. A measure of aperture effect is given by the attenuation Aa (in dB) of a frequency f in the baseband of GΠ(f) relative to the attenuation at f = 0. Ignoring the factor d in Eq. (9.29) since it represents a constant scaling factor, we see that Aa is given by

\[ A_a = -20\log_{10}\left[\mathrm{sinc}(df/F_s)\right], \quad 0 \le f \le F_s/2 \tag{9.30} \]
where the negative sign is required to make Aa positive. The indicated frequency range applies since the maximum frequency component fm of the original signal is at most half the sampling frequency. The maximum attenuation due to the aperture effect occurs at the maximum frequency component fm. In the worst case, fm = Fs/2, giving

\[ A_{a\max} = -20\log_{10}\left[\mathrm{sinc}(d/2)\right] \tag{9.31} \]
Equation (9.31) yields a maximum distortion of 0.04 dB, 0.9 dB, and 3.9 dB at duty cycles d = 0.1, 0.5, and 1.0, respectively. There is, therefore, negligible distortion for duty cycles d ≤ 0.1, which is evident in Figure 9.20.
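The figures quoted for Eq. (9.31) can be reproduced with the following Python sketch; the function names are illustrative assumptions and the frequencies are normalised so that Fs = 1.

```python
import math

def sinc(x):
    """Normalised sinc: sin(pi*x)/(pi*x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def aperture_attenuation_db(f, d, fs):
    """Eq. (9.30): attenuation at frequency f relative to f = 0, in dB."""
    return -20.0 * math.log10(sinc(d * f / fs))

fs = 1.0
for d in (0.1, 0.5, 1.0):
    worst = aperture_attenuation_db(fs / 2.0, d, fs)   # Eq. (9.31), f_m = F_s/2
    print(f"d = {d}: A_amax = {worst:.2f} dB")
```

Evaluating the function at other frequencies in the range 0 to Fs/2 shows that the attenuation grows monotonically with f, so the band edge is always the worst case.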
Figure 9.20 Spectrum |GΠ(f)| of flat-top-sampled signal at various values of duty cycle d of sampling pulses: (a) d = 0.1; (b) d = 0.5; (c) d = 1. Based on a representative rectangular shaped spectrum |G(f)| of the original analogue signal g(t).
There are three different ways to minimise the aperture effect.
● By using sampling pulses of small duty cycle d. As noted above, the distortion is negligible when d is less than about 10%.
● By oversampling with Fs ≫ fm. If this condition is satisfied then the term f/Fs in Eq. (9.30) is much less than unity for all frequency components of the original signal, which leads to very small values of attenuation Aa. For example, if Fs = 20fm, then maximum aperture distortion in the worst-case scenario (d = 1) is

\[ A_{a\max} = A_a\big|_{d=1,\ f=f_m} = -20\log_{10}\left[\mathrm{sinc}\!\left(\frac{df}{F_s}\right)\right]\bigg|_{d=1,\ f=f_m} = -20\log_{10}\left[\mathrm{sinc}\!\left(\frac{f_m}{20f_m}\right)\right] = -20\log_{10}\left[\mathrm{sinc}(0.05)\right] = 0.04\ \mathrm{dB} \]

However, there is a penalty of an increased bandwidth requirement as the sampling frequency is increased.
● By using a compensated reconstruction LPF. At large values of duty cycle and a moderate sampling frequency Fs satisfying the sampling theorem, the amplitude distortion due to aperture effect is significant and must be compensated for. Since the functional form of the distortion is known, it is a straightforward matter to replace the ordinary LPF having a uniform passband response with a lowpass equaliser whose (normalised) response in the passband (f = 0 → fm) is given by

\[ |H_e(f)| = \frac{1}{\mathrm{sinc}(df/F_s)} \tag{9.32} \]

In practice, the distortion can be treated as a linear effect, allowing the use of an equaliser whose gain increases linearly with frequency in the range 0 to fm.
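The compensating response of Eq. (9.32) can be sketched in Python as follows. The function names are hypothetical, frequencies are normalised, and the example takes the worst case d = 1 with Fs = 2fm as an assumed operating point.

```python
import math

def sinc(x):
    """Normalised sinc: sin(pi*x)/(pi*x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def equaliser_gain(f, d, fs):
    """Eq. (9.32): normalised passband gain that cancels the sinc envelope."""
    return 1.0 / sinc(d * f / fs)

d, fm = 1.0, 1.0
fs = 2.0 * fm                              # sampling at exactly twice f_m

for f in (0.0, 0.5 * fm, fm):
    gain = equaliser_gain(f, d, fs)
    residual = gain * sinc(d * f / fs)     # overall response after equalisation
    print(f"f = {f:.1f}: gain = {20 * math.log10(gain):.2f} dB, "
          f"compensated response = {residual:.3f}")
```

The product of the aperture envelope and the equaliser gain is unity across the passband, and the gain needed at the band edge equals the maximum aperture attenuation for that duty cycle.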
9.7 Summary In this chapter, we have studied in some detail the three types of sampling, namely instantaneous, natural, and flat-top. It was noted that the first two do not introduce any distortion, provided the sampling frequency is larger than twice the analogue signal bandwidth. However, instantaneous sampling is not realisable using physical circuits, and natural sampling places a limitation on the maximum frequency that can be digitised when the conversion time is nonzero. The more practical technique of flat-top sampling introduces a distortion known as the aperture effect, which is, however, not a serious drawback since it can be readily minimised. The problem of aliasing, which arises when a signal is sampled at a rate of less than twice its bandwidth, was examined in detail, both in the time and in the frequency domains. Several measures for minimising alias distortion were also discussed, including the specifications of an anti-alias filter. The samples of an analogue signal have a continuum of values, which may be transmitted by analogue means using the so-called pulse amplitude modulation (PAM), pulse duration modulation (PDM), or pulse position modulation (PPM). These techniques are discussed in Chapter 1, where their demerits are highlighted. Signal processing that goes beyond sampling is required in order to exploit the numerous advantages of digital communications. In the next chapter, we will study the processes of quantisation and encoding, which complete the transformation of a signal from analogue to digital. Our focus will be on the techniques of PCM and its variants.
Questions
9.1 Determine the Nyquist rate and Nyquist sampling interval for the following signals: (a) 5cos(200𝜋t) volts (b) 20 − sin²(10⁴𝜋t) volts (c) 10rect(2 × 10³t) volts (d) 20rect(10⁴t)cos(10⁶𝜋t) (e) 5sin(10⁵𝜋t)trian(500t) (f) 10sinc(400t)sin(2 × 10⁶𝜋t). (Note: the rectangular (rect) and triangular (trian) pulse functions are defined in Section 2.6.5.)
9.2 A sinusoidal voltage signal v(t) = 20sin(2𝜋 × 10⁴t) volts is sampled using an impulse train 𝛿Ts(t) of period Ts = 40 μs. Sketch the waveform and corresponding double-sided amplitude spectrum of the following signals: (a) v(t) (b) 𝛿Ts(t) (c) Sampled signal v𝛿(t). (Note: the sketched spectrum of v𝛿(t) should extend over three replicated bands.)
9.3 Repeat Question 9.2 with a sampling interval Ts = 66⅔ μs. Can v(t) be recovered from v𝛿(t) in this case? Explain.
9.4 Figure Q9.4 shows the single-sided spectrum of a signal g(t). Sketch the spectrum of the instantaneously sampled signal g𝛿(t) over the frequency range ±4Fs, for the following selections of sampling frequency Fs: (a) Fs = Nyquist rate (i.e. Fsmin) (b) Fs = 2Fsmin (c) Fs = (2/3)Fsmin. (d) Determine the band of alias frequencies in (c).
Figure Q9.4 Question 9.4. Single-sided spectrum |G(f)| of amplitude A against f (kHz), with the frequency axis marked at 0 and 6 kHz.
9.5 An AM signal g(t) lies in the frequency band 40–50 kHz. Assuming a triangular spectrum for g(t), sketch the spectrum of the instantaneously sampled signal g𝛿(t) obtained by sampling g(t) at three times the Nyquist rate. Your sketch should extend over three replicated bands. How would g(t) be recovered from g𝛿(t)?
9.6 Let us assume that the spectrum G(f) of a speech signal can be approximated as shown in Figure Q9.6, where the spectrum is constant up to 500 Hz and then decreases linearly to −52 dB at 7 kHz. The signal is to be sampled at Fs = 8 kHz and digitised in a uniform analogue-to-digital conversion (ADC) using k = 8 bits/sample. (a) Determine the order n of an anti-alias Butterworth filter of cut-off frequency fm = 3.4 kHz that is required to maintain aliasing error in the passband at a level just below quantisation noise. (b) Repeat (a) for a signal that has a uniform spectrum and compare the two results.
Figure Q9.6 Question 9.6. Spectrum |G(f)| in dB (from 0 dB down to −52 dB) against f (kHz), with the frequency axis marked at 0.5 and 7.0 kHz.
9.7 A sinusoidal voltage signal v(t) = 20sin(2𝜋 × 10⁴t) volts is sampled using a rectangular pulse train rectTs(t/𝜏) of period Ts = 40 μs and duty cycle d = 0.5 as the switching signal. Sketch the waveform and corresponding double-sided amplitude spectrum of the following signals: (i) v(t), (ii) rectTs(t/𝜏), and (iii) the sampled signal vΠ(t), assuming (a) Natural sampling (b) Flat-top sampling. (Note: the spectrum of the sampled signal should extend over three replicated bands.)
9.8 Starting with a sinusoidal message signal of frequency fm, show that if the quantiser has a conversion time 𝜏, and the desired conversion accuracy is half the quantiser step size then the highest frequency that can be digitised using natural sampling is given by Eq. (9.24).
9.9 A speech signal of baseband frequencies 300 Hz to 3400 Hz is sampled in a sample-and-hold circuit that uses a rectangular pulse train of duty cycle d and sampling frequency Fs. Determine the maximum distortion (in dB) due to aperture effect for the following cases: (a) Fs = 6.8 kHz, d = 0.8 (b) Fs = 8 kHz, d = 0.8 (c) Fs = 40 kHz, d = 0.8 (d) Fs = 8 kHz, d = 0.1. Comment on the trends in your results.
10 Digital Baseband Coding
Have a goal! Erect it before the start of play. Without goalposts, even the most exciting football game becomes boring and aimless.
In this Chapter
✓ Types of quantisation: midrise, mid-step, rounding, and truncation.
✓ Uniform quantisation: all the design parameters, trade-offs, and limitations.
✓ Nonuniform quantisation: a detailed discussion of the quantisation and encoding processes in 𝜇-law and A-law pulse code modulation (PCM) and improvements in signal-to-quantisation-noise ratio (SQNR).
✓ Differential pulse code modulation (DPCM) and low bit rate (LBR) speech coding: introduction to a wide range of techniques to digitally represent analogue signals, especially speech, with as few bits as possible.
✓ Line coding: how bit streams are electrically represented in communication systems.
10.1 Introduction The four steps involved in converting analogue signals to digital are introduced in Chapter 1 and the first two steps of lowpass filtering and sampling are discussed in detail in Chapter 9. This chapter focuses on the remaining steps involving quantisation and encoding. Quantisation converts a sampled analogue signal g(nT s ) to digital form by approximating each sample to the nearest of a set of discrete values. The result is a discrete-value discrete-time signal gq (nT s ), which can be conveyed accurately in the presence of channel noise that is less than half the spacing of quantisation levels. However, further robustness to noise can be achieved if the N quantisation levels are numbered from 0 to N − 1, and each level is expressed as a binary number consisting of k binary digits (or bits), where k = log2 N. This encoding process in which gq (nT s ) is converted to a string of binary 0’s and 1’s has been traditionally called pulse code modulation (PCM). Note, however, that the use of the word modulation in this context is inappropriate, in view of our discussions in Chapters 7 and 8. The resulting bit stream is electrically represented as voltage values by using a suitable line code, e.g. +12 V for binary 0 and −12 V for binary 1. Binary coding gives maximum robustness against noise and is easy to regenerate. The concern at the receiver is not with the exact voltage level, but with whether the received voltage level falls in the range that represents a binary 0 or 1. Thus, the noise level must be large in order to cause any error. The technique of line coding (first introduced in Chapter 1) is discussed further in this chapter. Communication Engineering Principles, Second Edition. Ifiok Otung. © 2021 John Wiley & Sons Ltd. Published 2021 by John Wiley & Sons Ltd. Companion Website: www.wiley.com/go/otung
In discussing PCM, we concentrate first on uniform quantisation, which is also called linear ADC (analogue-to-digital conversion), in order to discuss in quantitative terms the problem of quantisation noise and various design considerations, including the noise-bandwidth trade-off in PCM. The demerits of uniform quantisation that make it unsuitable for bandwidth-limited applications are highlighted. Attention then shifts to the more practical nonuniform quantisation techniques. Encoding and decoding procedures are explained in detail for the standard A-law and 𝜇-law PCM. Various modifications to PCM, mainly aimed at improved spectrum efficiency, are discussed. These include differential PCM, with delta modulation (DM) and adaptive differential pulse code modulation (ADPCM) introduced as special cases. LBR coding techniques for speech are then briefly introduced.
10.2 Concept and Classes of Quantisation
The sampling process converts an analogue signal g(t) into a discrete sequence of sample values g(nTs), n = 0, 1, 2, 3, …, spaced at the sampling interval Ts. As these samples are obtained from the instantaneous values of an analogue (continuous value) signal, they can take on any value in a range from the smallest to the largest sample value. The sampled signal g(nTs), although discrete in time, is still continuous in value and is thus not a digital signal. To convert g(t) into a digital signal, the next step after sampling must be to constrain the sequence of values to a finite set of possible values. This process is known as quantisation. We saw earlier that sampling a signal at the correct rate does not introduce any errors. That is, sampling is a reversible process. The original signal can be accurately recovered from the samples. By contrast, quantisation introduces irrecoverable errors, known as quantisation noise. We can make this noise as small as we wish, but at a price! Quantisation may be classified as:
● Uniform or nonuniform
● Midrise or mid-step (also called mid-tread)
● Rounding or truncation.
In uniform quantisation the quantisation levels to which sample values are approximated are equally spaced using a constant step size over the entire input range of the quantiser. Nonuniform quantisation, on the other hand, employs unequal spacing of quantisation levels, with step sizes that are smaller near zero and progressively larger away from the origin. We will find in Section 10.4 that these step sizes are usually designed to increase exponentially away from the origin. Uniform and nonuniform quantisation are illustrated in Figure 10.1a for N = 8 intervals. The quantiser has input range 2C, from −C to +C, which is divided into N equal intervals. The dashed line bisecting each interval is the quantiser output for all inputs in that interval, i.e. it is the discrete level to which all sample values falling in the interval are approximated. Each output level of the quantiser has been assigned a positive integer quantisation index L(x) by counting the output levels from 0 to N − 1, going away from the origin in either direction towards ±C. The sign of the quantiser input x is usually identified by the most significant bit (MSB) in the encoding process, as discussed later. This allows each output level to be represented by a unique binary codeword. Midrise quantisation has the value zero as a boundary between two quantisation intervals so that zero is a point of transition where you ‘rise’ from one quantisation level to another. The transfer characteristic (i.e. graph of output versus input) of an N-level midrise quantiser therefore has exactly N/2 output levels on either side (i.e. left and right) of the y axis. In mid-step quantisation on the other hand, there is no transition between quantisation levels at zero; and zero is one of the quantisation levels or steps or treads but not a boundary between two quantisation levels. 
The mid-step quantiser therefore has N/2 − 1 output levels below zero and N/2 − 1 output levels above zero in addition to the zero-output level, making a total of N − 1 output levels. The description 'mid' is a reference to the fact that zero is the midpoint of the quantiser input range. One beneficial feature of mid-step (also referred to as mid-tread) quantisation is that it removes noise chatter on idle lines since all small voltage fluctuations about zero are set to zero at the quantiser output. Such noise on an idle line will cause a midrise quantiser output to fluctuate between the two output levels nearest to zero. Mid-step quantisation is employed in 𝜇-law PCM, whereas A-law PCM uses midrise quantisation. A-law and 𝜇-law PCM are, however, based on nonuniform quantisation, which is discussed in detail in Section 10.4.

In rounding quantisation, the quantiser output is generated by approximating all input values within each quantisation interval to the midpoint of the interval, whereas in truncation quantisation all input values within each interval are approximated to the boundary point of the interval. Thus, the maximum quantisation error in a rounding quantiser is half the quantisation step size, whereas that of a truncating quantiser is the full step size. Each of these four classes of quantisation can be implemented in the uniform or nonuniform class. Figure 10.1b shows their transfer characteristics for the uniform class of implementation. The top left graph is for a midrise rounding quantiser, the top right is for a mid-step rounding quantiser, bottom left is for a midrise truncating quantiser and bottom right is for a mid-step truncating quantiser. Notice that the mid-step truncating quantiser has an enlarged dead zone (i.e. input range quantised to zero) equal to two step sizes; the mid-step rounding and midrise truncating quantisers have a dead zone equal to one step size, but the midrise rounding quantiser has no dead zone. Also, all the quantisers have a step size of 2C/N except the mid-step rounding quantiser which has a step size of 2C/(N − 1). The mid-step rounding quantiser uses a larger step size than the midrise rounding quantiser and therefore incurs slightly larger approximation errors or quantisation noise.

Figure 10.1 (a) Classes of quantisation: (i) uniform, (ii) nonuniform; (b) Classes of quantisation: Left column = Midrise; Right column = Mid-step or Mid-tread; Top row = Rounding; Bottom row = Truncation.
The difference in step size is 2C/(N − 1) − 2C/N = 2C/[N(N − 1)], which is negligibly small for large N, e.g. N = 256, and typically low values of C. Each of the above quantisers, when implemented as a uniform quantiser having a constant step size Δ, maps an input sample of value x to an output quantisation index L(x), which is a non-negative integer given by
$$
L(x) = \begin{cases}
\left\lfloor \dfrac{|x|}{\Delta} + \dfrac{1}{2} \right\rfloor, & \text{Mid-step rounding} \\
\lfloor |x|/\Delta \rfloor, & \text{Mid-step truncation} \\
\lfloor |x|/\Delta \rfloor, & \text{Midrise rounding} \\
\lfloor |x|/\Delta \rfloor, & \text{Midrise truncation}
\end{cases}
\tag{10.1}
$$
where |x| denotes the magnitude or absolute value of x and the notation ⌊ ⌋ is the floor operator. That is, ⌊z⌋ means the largest integer less than or equal to the real number z. For example, ⌊2.01⌋ = 2; ⌊8.99⌋ = 8; ⌊−3.2⌋ = −4.
An N-level quantiser requires a k-bit source encoder (where N is an integer power of 2 and k = log2 N) to convert the quantised output into a binary codeword by coding the above index L(x) using k − 1 bits along with an extra bit for the sign of x. This sign of x is specified below using the signum function sgn(x), which is +1 for x positive (or zero) and is −1 for negative x. A suitable uniform or linear decoder will read or extract L(x) and sgn(x) from this codeword and then convert them into a reconstructed sample y(x) given by
$$
y(x) = \operatorname{sgn}(x) \times \begin{cases}
\Delta \cdot L(x), & \text{Mid-step rounding} \\
\Delta \cdot L(x), & \text{Mid-step truncation} \\
\Delta \cdot \left( L(x) + \dfrac{1}{2} \right), & \text{Midrise rounding} \\
\Delta \cdot \left( L(x) + \dfrac{1 - \operatorname{sgn}(x)}{2} \right), & \text{Midrise truncation}
\end{cases}
\tag{10.2}
$$
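Eqs. (10.1) and (10.2) translate almost line for line into code. The following is a minimal Python sketch of the four uniform quantiser classes; the function names and the string labels used to select the quantiser class are illustrative choices, not part of the text, and the sketch assumes the input does not overload the quantiser (no clipping of the index is performed).

```python
import math

# Labels for the four quantiser classes (hypothetical names chosen for this sketch).
KINDS = ("midstep_rounding", "midstep_truncation",
         "midrise_rounding", "midrise_truncation")

def sgn(x):
    """Signum as used in the text: +1 for x >= 0, -1 for x < 0."""
    return 1 if x >= 0 else -1

def quantisation_index(x, step, kind):
    """Index L(x) of Eq. (10.1) for a uniform quantiser of step size `step`."""
    if kind not in KINDS:
        raise ValueError(f"unknown quantiser kind: {kind}")
    ratio = abs(x) / step
    if kind == "midstep_rounding":
        return math.floor(ratio + 0.5)
    return math.floor(ratio)

def reconstruct(index, sign, step, kind):
    """Reconstructed sample y(x) of Eq. (10.2) from the index and the sign."""
    if kind not in KINDS:
        raise ValueError(f"unknown quantiser kind: {kind}")
    if kind in ("midstep_rounding", "midstep_truncation"):
        magnitude = step * index
    elif kind == "midrise_rounding":
        magnitude = step * (index + 0.5)
    else:  # midrise truncation
        magnitude = step * (index + (1 - sign) / 2)
    return sign * magnitude

def quantise(x, step, kind):
    """Quantise and reconstruct one sample (no overload clipping)."""
    return reconstruct(quantisation_index(x, step, kind), sgn(x), step, kind)

if __name__ == "__main__":
    step = 3 / 8  # example step size in volts
    for kind in KINDS:
        print(kind, quantise(6.6407, step, kind), quantise(-0.5025, step, kind))
```

Keeping the index and the reconstruction as separate functions mirrors the split between encoder and decoder: only the index and the sign need to be conveyed in the codeword.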
There are various source encoding standards for constructing binary codewords to represent the quantised outputs. These include A-law and 𝜇-law PCM (discussed in Section 10.4), fixed-point number representation using signed magnitude, one’s complement and two’s complement formats (introduced in Worked Example 10.1) and floating-point number representation (not discussed).
Worked Example 10.1
This is an important worked example in which we introduce three formats for fixed-point binary representation of quantiser outputs and compare the quantisation noise of midrise rounding and midrise truncating quantisers.
An analogue signal x(t) = 12 sin(200πt + 30°) V is sampled at five times the minimum rate stipulated by the sampling theorem, starting at time t = 0.1 ms. The sampled signal is then quantised using:
(a) a uniform midrise rounding quantiser
(b) a uniform midrise truncating quantiser
each having a quantiser input range of ±12 V. The quantised samples are coded using a 6-bit linear encoder. Determine the following for each quantiser:
(i) Binary representation of the first 10 samples in signed magnitude format.
(ii) Binary representation of the first 10 samples in one’s complement format.
(iii) Binary representation of the first 10 samples in two’s complement format.
(iv) The mean square quantisation error (MSQE) of the midrise rounding quantiser based on the first 10 samples.
(v) The MSQE of the midrise truncating quantiser based on the first 10 samples.
(vi) Compare the MSQE of the two quantisers and comment on your result.
We will show detailed calculations for only two samples and then present the rest of the results for the first 10 samples in two tables, one for each quantiser. The input signal x(t) is a sinusoid of frequency f_m = 100 Hz. As discussed in Chapter 9, the minimum sampling rate is 200 Hz, so at five times this minimum, the sampling rate is F_s = 1 kHz, giving a sampling interval T_s = 1/F_s = 1 ms. Sampling starts at 0.1 ms and is spaced T_s apart, so the sampling instants, numbered from n = 0 to n = 9, are t(0) = 0.1 ms, t(1) = 1.1 ms, t(2) = 2.1 ms, t(3) = 3.1 ms, …, t(9) = 9.1 ms. The samples at these respective sampling instants are denoted x(0), x(1), …, x(9) and obtained as
$$
\begin{aligned}
x(0) &= 12\sin(200\pi \times 0.1 \times 10^{-3} + \pi/6) = 6.6407\ \text{V} \\
x(1) &= 12\sin(200\pi \times 1.1 \times 10^{-3} + \pi/6) = 11.2474\ \text{V} \\
&\ \,\vdots \\
x(9) &= 12\sin(200\pi \times 9.1 \times 10^{-3} + \pi/6) = -0.5025\ \text{V}
\end{aligned}
$$
Given that the quantiser has input range ±C = ±12 V and that the number of bits/sample is k = 6, the number of quantisation levels is $N = 2^k = 64$ and the step size Δ of the (specified midrise) quantisers is
$$
\Delta = 2C/N = 2 \times 12/64 = 3/8\ \text{V}
$$
Using Eq. (10.1), we obtain the output quantisation index, denoted $L_r(x)$ for the rounding quantiser and $L_t(x)$ for the truncating quantiser, for each input sample x(n) as
$$
\begin{aligned}
L_r(x(0)) &= \left\lfloor \frac{|x(0)|}{\Delta} \right\rfloor = \left\lfloor \frac{6.6407}{3/8} \right\rfloor = \lfloor 17.7085 \rfloor = 17 \\
L_r(x(1)) &= \left\lfloor \frac{|x(1)|}{\Delta} \right\rfloor = \left\lfloor \frac{11.2474}{3/8} \right\rfloor = \lfloor 29.9930 \rfloor = 29 \\
&\ \,\vdots \\
L_r(x(9)) &= \left\lfloor \frac{|x(9)|}{\Delta} \right\rfloor = \left\lfloor \frac{0.5025}{3/8} \right\rfloor = \lfloor 1.34 \rfloor = 1 \\
L_t(x(0)) &= 17; \quad L_t(x(1)) = 29; \quad \cdots; \quad L_t(x(9)) = 1
\end{aligned}
$$
Now to define the three formats for binary representation B(z) of a number z using k bits. In signed magnitude, k − 1 bits are used to represent the magnitude of z as a binary number. An MSB is then prepended (i.e. inserted in front), set to 0 if z is positive and 1 if z is negative. In one’s complement, if z is positive then B(z) is as for signed magnitude. But if z is negative then B(z) is obtained by flipping (i.e. negating) every bit of the signed magnitude representation of |z|. Two’s complement format is as above (i.e. the same as signed magnitude) when z is positive. But when z is negative, 1 is added to the one’s complement representation of z to obtain its two’s complement representation.
For example, with k = 6 and k − 1 = 5, the 5-bit binary number for $L_r(x(0))$ = 17 is 10001 (since 17 = 16 + 1). Because x(0) is positive, we prepend MSB = 0 to this 5-bit number to obtain the 6-bit signed magnitude codeword 010001, which is also the one’s complement and two’s complement representation. Similarly, the 5-bit binary number for $L_r(x(9))$ = 1 is 00001. Because x(9) is negative, we prepend MSB = 1 to this 5-bit number to obtain the 6-bit signed magnitude codeword 100001. The one’s complement codeword must convey the information that x(9) is negative and this is done by flipping every bit of 000001 (which is the 6-bit signed magnitude representation of $|L_r(x(9))|$) to obtain 111110. The two’s complement codeword is obtained by adding 1 to the one’s complement codeword (since in this case the number is negative), giving 111110 + 1 = 111111.
The complete set of results is listed in Table 10.1 for the midrise rounding quantiser. Codewords for the midrise truncating quantiser outputs are identical to the list in Table 10.1.
To compute MSQE, we must first determine the quantisation error $e_q(n)$ in each quantised sample. This is given by
$$
e_q(n) = x(n) - y(x(n)) \tag{10.3}
$$
where x(n) is the input sample and y(x(n)) is the quantised sample given by Eq. (10.2) for all the uniform quantisers.
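Table 10.1 Worked Example 10.1: Results for midrise rounding quantiser.

| n | t(n), ms | x(n), V | L(x(n)) | sgn(x(n)) | Codeword: signed magnitude | Codeword: one’s complement | Codeword: two’s complement | y(x(n)), V | e_qr(n), mV |
|---|----------|---------|---------|-----------|----------------------------|----------------------------|----------------------------|------------|-------------|
| 0 | 0.1 | 6.6407 | 17 | 1 | 010001 | 010001 | 010001 | 6.5625 | 78.20 |
| 1 | 1.1 | 11.2474 | 29 | 1 | 011101 | 011101 | 011101 | 11.0625 | 184.88 |
| 2 | 2.1 | 11.5580 | 30 | 1 | 011110 | 011110 | 011110 | 11.4375 | 120.45 |
| 3 | 3.1 | 7.4538 | 19 | 1 | 010011 | 010011 | 010011 | 7.3125 | 141.27 |
| 4 | 4.1 | 0.5025 | 1 | 1 | 000001 | 000001 | 000001 | 0.5625 | −59.99 |
| 5 | 5.1 | −6.6407 | 17 | −1 | 110001 | 101110 | 101111 | −6.5625 | −78.20 |
| 6 | 6.1 | −11.2474 | 29 | −1 | 111101 | 100010 | 100011 | −11.0625 | −184.88 |
| 7 | 7.1 | −11.5580 | 30 | −1 | 111110 | 100001 | 100010 | −11.4375 | −120.45 |
| 8 | 8.1 | −7.4538 | 19 | −1 | 110011 | 101100 | 101101 | −7.3125 | −141.27 |
| 9 | 9.1 | −0.5025 | 1 | −1 | 100001 | 111110 | 111111 | −0.5625 | 59.99 |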
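The three codeword columns of Table 10.1 follow mechanically from the index L(x(n)) and the sign of the sample. The sketch below is one possible Python rendering of the three formats as defined in this worked example; the function names are hypothetical, and each function takes the index, the sign (+1 or −1) and the total number of bits k.

```python
def _check(index, k):
    # The magnitude must fit in the k - 1 bits that follow the sign bit.
    if not 0 <= index < 2 ** (k - 1):
        raise ValueError("index does not fit in k - 1 bits")

def signed_magnitude(index, sign, k):
    """Sign bit (0 for positive, 1 for negative) followed by a (k-1)-bit magnitude."""
    _check(index, k)
    return ("0" if sign >= 0 else "1") + format(index, f"0{k - 1}b")

def ones_complement(index, sign, k):
    """Same as signed magnitude if positive; otherwise flip every bit of |z|."""
    _check(index, k)
    positive = "0" + format(index, f"0{k - 1}b")
    if sign >= 0:
        return positive
    return "".join("1" if bit == "0" else "0" for bit in positive)

def twos_complement(index, sign, k):
    """Same as signed magnitude if positive; otherwise one's complement plus 1."""
    if sign >= 0:
        return ones_complement(index, sign, k)
    value = int(ones_complement(index, sign, k), 2) + 1
    return format(value % (1 << k), f"0{k}b")  # wrap to k bits

if __name__ == "__main__":
    for index, sign in [(17, 1), (17, -1), (1, -1)]:
        print(index, sign,
              signed_magnitude(index, sign, 6),
              ones_complement(index, sign, 6),
              twos_complement(index, sign, 6))
```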
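Table 10.2 Worked Example 10.1: Results for midrise truncating quantiser.

| n | y(x(n)), V | e_qt(n), mV |
|---|------------|-------------|
| 0 | 6.375 | 265.70 |
| 1 | 10.875 | 372.38 |
| 2 | 11.25 | 307.95 |
| 3 | 7.125 | 328.77 |
| 4 | 0.375 | 127.51 |
| 5 | −6.75 | 109.30 |
| 6 | −11.25 | 2.62 |
| 7 | −11.625 | 67.05 |
| 8 | −7.5 | 46.23 |
| 9 | −0.75 | 247.49 |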
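The quantisation error $e_{qr}(0)$ of the first sample of the midrise rounding quantiser is
$$
\begin{aligned}
e_{qr}(0) &= x(0) - \operatorname{sgn}(x(0)) \times \Delta \times \left[ L(x(0)) + \frac{1}{2} \right] \\
&= 6.6407 - (+1) \times \frac{3}{8} \times \left[ 17 + \frac{1}{2} \right] \\
&= 78.20\ \text{mV}
\end{aligned}
$$
The quantisation error $e_{qt}(9)$ of the 10th sample of the midrise truncating quantiser is
$$
\begin{aligned}
e_{qt}(9) &= x(9) - \operatorname{sgn}(x(9)) \times \Delta \times \left[ L(x(9)) + \frac{1 - \operatorname{sgn}(x(9))}{2} \right] \\
&= -0.5025 - (-1) \times \frac{3}{8} \times \left[ 1 + \frac{1 - (-1)}{2} \right] = -0.5025 + \frac{3}{4} \\
&= 247.49\ \text{mV}
\end{aligned}
$$
And so on for all the results listed in Tables 10.1 and 10.2. The MSQE is obtained by averaging as follows over the quantisation errors in the 10 outputs of each quantiser
$$
\text{MSQE} = \frac{1}{10} \sum_{n=0}^{9} e_q^2(n) \tag{10.4}
$$
Using the values listed in Table 10.1 the MSQE of this midrise rounding quantiser operating on this input signal is
$$
\begin{aligned}
\text{MSQE}_r &= \frac{1}{10} \sum_{n=0}^{9} e_{qr}^2(n) \\
&= \frac{1}{10} \left[ 0.07820^2 + 0.18488^2 + \cdots + 0.05999^2 \right] \\
&= 15.6725\ \text{mW}
\end{aligned}
$$
Similarly, from the values in Table 10.2 we obtain the MSQE of the midrise truncating quantiser operating on the same input signal as
$$
\begin{aligned}
\text{MSQE}_t &= \frac{1}{10} \sum_{n=0}^{9} e_{qt}^2(n) \\
&= \frac{1}{10} \left[ 0.26570^2 + 0.37238^2 + \cdots + 0.24749^2 \right] \\
&= 50.8288\ \text{mW}
\end{aligned}
$$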
MSQE gives a measure of quantisation noise power. Comparing the MSQE of both quantisers, we see that the quantisation noise power of the truncating quantiser exceeds that of the rounding quantiser by 10log10 (MSQEt ∕MSQEr ) = 5.11 dB
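The whole of Worked Example 10.1 can be reproduced numerically. The following Python sketch regenerates the 10 samples, applies the midrise form of Eqs. (10.1) and (10.2), and evaluates Eq. (10.4) for both quantisers. Variable and function names are illustrative, and the printed MSQE values are in normalised milliwatts (V² × 1000).

```python
import math

def midrise_quantise(x, step, rounding=True):
    """Midrise uniform quantiser output per Eqs. (10.1) and (10.2)."""
    index = math.floor(abs(x) / step)
    sign = 1 if x >= 0 else -1
    offset = 0.5 if rounding else (1 - sign) / 2
    return sign * step * (index + offset)

def msqe(samples, step, rounding):
    """Mean square quantisation error, Eq. (10.4)."""
    errors = [x - midrise_quantise(x, step, rounding) for x in samples]
    return sum(e * e for e in errors) / len(errors)

C, k = 12.0, 6                    # quantiser range +/-C volts, bits per sample
step = 2 * C / 2 ** k             # 3/8 V
fs, t0 = 1000.0, 0.1e-3           # sampling rate (Hz) and first sampling instant (s)
samples = [12 * math.sin(200 * math.pi * (t0 + n / fs) + math.pi / 6)
           for n in range(10)]

msqe_r = msqe(samples, step, rounding=True)
msqe_t = msqe(samples, step, rounding=False)
ratio_db = 10 * math.log10(msqe_t / msqe_r)

print(f"MSQE rounding   = {1000 * msqe_r:.4f} mW")
print(f"MSQE truncating = {1000 * msqe_t:.4f} mW")
print(f"ratio           = {ratio_db:.2f} dB")
```

Changing `fs` and the number of samples in the list comprehension is a quick way to explore how the estimate depends on the number of samples averaged.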
In general, the quantisation noise power of truncating quantisers is larger than that of rounding quantisers. It is shown in the next section and in Section 10.5.2 that
$$
\text{MSQE} = \begin{cases}
\dfrac{\Delta^2}{12}, & \text{Rounding Quantiser} \\
\dfrac{\Delta^2}{3}, & \text{Truncating Quantiser}
\end{cases}
\tag{10.5}
$$
So, the exact ratio between the two quantisation noise powers is 10log10(4) = 6 dB, rather than the 5.11 dB obtained above. Using a much larger number of samples than just the 10 in this worked example yields a closer estimate to the exact theoretical result. For example, choosing a sampling rate of 100 kHz (to collect 1000 samples of the signal over one cycle) yields a ratio of 5.83 dB between the two quantisation noise powers.
We now focus our discussion on uniform quantisation based exclusively on the midrise rounding implementation in order to introduce key design parameters.
10.3 Uniform Quantisation
The simplest method of quantisation is to divide the full range of input signal values, from −C to +C, into N equal intervals, as shown in Figure 10.2. Note that there are N/2 intervals (indexed L = 0, 1, 2, …, N/2 − 1) in the positive range and N/2 intervals in the negative range (also indexed L = 0, 1, 2, …, N/2 − 1) and zero is an interval boundary, so this is a midrise quantisation scheme. Furthermore, all input samples falling within a given interval are approximated or quantised to the midpoint of that interval, making this a rounding quantiser. Figure 10.2 thus represents a midrise rounding uniform quantisation scheme.
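Figure 10.2 Midrise rounding uniform quantisation. [Figure not reproduced: the input x, from −C to +C, is divided into intervals of width Δ with quantisation index L = 0, 1, 2, …, N/2 − 1 in each polarity; the quantised outputs are ±y_L, with y_0 = Δ/2, y_1 = 3Δ/2, y_j = (2j + 1)Δ/2 and y_{N/2−1} = (N − 1)Δ/2; an enlarged view of the interval 0 to Δ shows the error e_q for a sample s.]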
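Table 10.3 Choosing N = 8 intervals uses all 3-bit binary codewords.

| Quantisation interval | Binary codeword | Remark |
|-----------------------|-----------------|--------|
| 0 | 000 | |
| 1 | 001 | |
| 2 | 010 | |
| 3 | 011 | |
| 4 | 100 | |
| 5 | 101 | Not used if N = 5 |
| 6 | 110 | Not used if N = 5 |
| 7 | 111 | Not used if N = 5 |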
The number of intervals is usually chosen to be an integer power of 2
$$
N = 2^k \quad \Rightarrow \quad k = \log_2 N \tag{10.6}
$$
This choice allows the k-bit binary codes to be fully utilised at the coding stage of the ADC process to represent the N intervals. For example, if we choose N = 5 then we require k = 3 bits to represent these five intervals. See Table 10.3, where, for convenience, the intervals have been unidirectionally indexed from 0 to N − 1 (instead of the usual bidirectional indexing from 0 to N/2 − 1 for the positive range and 0 to N/2 − 1 for the negative range). Notice how the codes 101, 110, and 111 are surplus to requirement if N = 5. It is shown below that quantisation noise is reduced as N increases. In this example therefore, we can reduce quantisation noise at no extra cost (i.e. without increasing the number of bits k required to represent each quantisation interval) simply by increasing N from 5 to $2^3 = 8$.
10.3.1 Quantisation Noise
In a uniform or linear quantisation, all intervals are of an equal size Δ, called the quantiser step size. And in a rounding quantiser, the quantised value of each interval is set to the average of all input samples that fall in that interval. If the samples in each interval are equally likely to be located anywhere within the interval, this average and hence quantised value will be the midpoint of the interval. Each sample is therefore approximated to the midpoint of the interval in which it falls. For example, any sample of value in the range 0 to Δ (which is the zeroth positive quantisation interval with index L = 0) is quantised to y_0 = Δ/2; any sample in the range 0 to −Δ (which is the zeroth negative quantisation interval) is quantised to −y_0; any sample in the topmost interval (C − Δ, C), of index L = N/2 − 1, is quantised to (N − 1)Δ/2; and in general any sample in the jth positive quantisation interval is quantised to y_j = (2j + 1)Δ/2. Therefore, for each sample of value s in the jth interval, quantisation produces an error given by
$$
e_q = |s - y_j| \tag{10.7}
$$
This error has a maximum value $e_{q\,\max}$ when s is at either the top or the bottom of the interval, i.e. s = y_j ± Δ/2. Thus
$$
e_{q\,\max} = |y_j \pm \Delta/2 - y_j| = \Delta/2 \tag{10.8}
$$
The maximum possible error incurred in each quantisation is therefore half the quantiser step size. As discussed in Section 3.5.2, the mean square value of a signal gives its (normalised) power. Therefore, to determine quantisation noise power, we need to calculate the MSQE across all the samples of the signal. However, if the input signal lies entirely within the quantiser range, i.e. it does not overload the quantiser, the error statistics are the same across all the uniform quantiser intervals spanned by the input. Therefore, we may focus on the bottom positive interval (shown enlarged on the right-hand side of Figure 10.2) and sum over the entire interval (0 → Δ) the product of $e_q^2 = (s - \Delta/2)^2$ and the probability ds/Δ that the sample s lies in an infinitesimal interval ds around s. This gives
$$
\begin{aligned}
\text{MSQE} &= \int_0^{\Delta} e_q^2 \, ds/\Delta = \frac{1}{\Delta} \int_0^{\Delta} (s - \Delta/2)^2 \, ds = \frac{1}{\Delta} \int_0^{\Delta} (s^2 - s\Delta + \Delta^2/4) \, ds \\
&= \frac{1}{\Delta} \left[ \frac{s^3}{3} - \frac{s^2 \Delta}{2} + \frac{s \Delta^2}{4} \right]_0^{\Delta} = \frac{1}{\Delta} \left( \frac{\Delta^3}{3} - \frac{\Delta^3}{2} + \frac{\Delta^3}{4} \right) \\
&= \frac{4\Delta^2}{12} - \frac{6\Delta^2}{12} + \frac{3\Delta^2}{12}
\end{aligned}
$$
Therefore
$$
\text{MSQE} = \frac{\Delta^2}{12} \tag{10.9}
$$
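The integral leading to Eq. (10.9) can be checked numerically by averaging the squared error over samples spread evenly across one interval. The Python sketch below does this for a rounding quantiser (output at the interval midpoint) and, as an assumption consistent with Eq. (10.2) for positive inputs, for a truncating quantiser whose output sits at the lower boundary of the interval. The function name and the number of grid points are arbitrary choices.

```python
def mean_square_error(step, rounding=True, points=100_000):
    """Average e_q^2 over samples spread evenly across the interval (0, step)."""
    output = step / 2 if rounding else 0.0   # midpoint versus lower boundary
    total = 0.0
    for i in range(points):
        s = (i + 0.5) * step / points        # evenly spaced sample inside the interval
        total += (s - output) ** 2
    return total / points

if __name__ == "__main__":
    step = 3 / 8
    print(mean_square_error(step, rounding=True), step ** 2 / 12)
    print(mean_square_error(step, rounding=False), step ** 2 / 3)
```

The two printed pairs should agree closely, which illustrates the Δ²/12 and Δ²/3 results and hence the factor of 4 between truncating and rounding noise powers.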
We see that the MSQE depends only on the step size Δ. This error will appear as noise associated with the signal. The main advantage of this quantisation process is that if we can somehow convey the information about each quantised level without error then we will introduce no further degradation to the signal. We can therefore make this quantisation noise as small as we wish by sufficiently reducing Δ. However, there is a price to pay for this improved noise performance. From Figure 10.2 and Eq. (10.6) it follows that
$$
\Delta = \frac{2C}{N} = \frac{2C}{2^k} \tag{10.10}
$$
Here, k is the number of bits required to represent each quantised level, i.e. the number of bits per sample. Thus, Δ and hence quantisation noise can be reduced by increasing the number of bits per sample. This increases the bit rate and hence the bandwidth required for transmission.
10.3.2 Dynamic Range of a Quantiser
A sinusoidal input signal of amplitude V_max = C will fully load the quantiser in Figure 10.2, since it has values that cover the entire range from −C to +C. Sinusoids of a larger amplitude would cause clipping and distortion. On the other hand, the variations of a sinusoidal input signal of amplitude V_min = Δ/2 are confined to a single interval of the quantiser. In other words, variations in this sinusoid will go undetected at the quantiser output. The ratio of the largest amplitude V_max of a sinusoid that avoids clipping to the largest amplitude V_min of a sinusoid whose variations go undetected is called the dynamic range of the quantiser
$$
\text{Dynamic Range} = \frac{V_{\max}}{V_{\min}} = \frac{C}{\Delta/2} = 2^k = 6.02k\ \text{dB} \tag{10.11}
$$
The dynamic range therefore depends on the number of bits per sample. It increases by 6 dB for each extra bit available for representing each sample.
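As a quick numerical illustration of Eq. (10.11), the sketch below (with a hypothetical function name) evaluates the dynamic range in dB for a given number of bits per sample.

```python
import math

def dynamic_range_db(k):
    """Dynamic range of a k-bit uniform quantiser, Eq. (10.11), in dB."""
    return 20 * math.log10(2 ** k)

if __name__ == "__main__":
    for k in (6, 8, 12, 16):
        print(k, round(dynamic_range_db(k), 2))
```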
10.3.3 Signal-to-quantisation-noise Ratio (SQNR)
An important parameter for assessing the performance of the quantiser is the ratio of signal power to quantisation noise power, called the signal-to-quantisation-noise ratio (SQNR). Let us define a parameter R known as the peak-to-rms ratio of the signal to be quantised. For a signal of rms value σ and peak value V_p
$$
R = \frac{V_p}{\sigma}; \qquad \text{Signal Power} = \sigma^2 = \frac{V_p^2}{R^2}
$$
$$
\begin{aligned}
\text{SQNR} &= \frac{\text{Signal Power}}{\text{MSQE}} = \frac{V_p^2/R^2}{\Delta^2/12} = \frac{12 \times 2^{2k} V_p^2}{4C^2R^2} \\
&= \frac{3V_p^2}{C^2R^2}\, 2^{2k}
\end{aligned}
\tag{10.12}
$$
where we have made use of the expression for Δ in Eq. (10.10). Expressing Eq. (10.12) in dB
$$
\begin{aligned}
\text{SQNR} &= 10\log_{10}\left[ \frac{3V_p^2}{C^2R^2}\, 2^{2k} \right] \\
&= 10\log_{10}(3) + 10\log_{10}(2^{2k}) + 10\log_{10}(R^{-2}) + 10\log_{10}(V_p^2/C^2) \\
&= 4.77 + 6.02k - 20\log_{10}(R) + 20\log_{10}(V_p/C)\ \text{dB}
\end{aligned}
\tag{10.13}
$$
If the signal fully loads the quantiser, V_p = C, and the SQNR improves to
$$
\text{SQNR} = 4.77 + 6.02k - 20\log_{10}(R)\ \text{dB} \tag{10.14}
$$
Worked Example 10.2
Determine the SQNR as a function of number of bits/sample for each of the following signals:
(a) Sinusoidal signal.
(b) Signal with a uniform probability density function (PDF).
(c) Speech signal.

(a) If a sinusoidal signal g(t) fully loads the quantiser then its amplitude V_p = C, and we may write g(t) = C sin(ωt). The signal g(t) has a period T = 2π/ω, and a mean square value
$$
\begin{aligned}
\sigma^2 &= \frac{1}{T} \int_0^T g^2(t)\,dt = \frac{1}{T} \int_0^T C^2 \sin^2(\omega t)\,dt \\
&= \frac{C^2}{2T} \int_0^T (1 - \cos 2\omega t)\,dt \\
&= \frac{C^2}{2}
\end{aligned}
$$
$$
R = \frac{\text{Peak value}}{\sigma} = \frac{C}{C/\sqrt{2}} = \sqrt{2}
$$
It follows from Eq. (10.14) that
$$
\text{SQNR} = 4.77 + 6.02k - 20\log_{10}(\sqrt{2}) = 1.76 + 6.02k\ \text{dB} \tag{10.15}
$$
(b) The samples of this signal can take on any value between a minimum −C and a maximum +C with equal probability. It is assumed that the signal fully loads the quantiser. The probability that a sample of the signal lies between s − ds/2 and s + ds/2 is given by the shaded area p ds in Figure 10.3. Since each sample must lie somewhere between −C and +C, we must have p × 2C = 1, or p = 1/(2C). The mean square value σ² of the signal is obtained by summing (over the entire signal range −C to +C) the product of the square of the sample value s and the probability p ds that a sample lies within the infinitesimal interval centred on s
$$
\sigma^2 = \int_{-C}^{+C} s^2 p\,ds = \frac{1}{2C} \int_{-C}^{+C} s^2\,ds = \frac{C^2}{3}
$$
Figure 10.3 Worked Example 10.2: uniform probability density function (PDF). [Figure not reproduced: a constant PDF of height p over sample values s from −C to +C, with a shaded strip of width ds.]
Thus
$$
R = \frac{\text{Peak}}{\sigma} = \frac{C}{C/\sqrt{3}} = \sqrt{3}
$$
Eq. (10.14) then yields
$$
\text{SQNR} = 6.02k\ \text{dB} \tag{10.16}
$$
(c) Measurements show that speech signals have on average 20log10(R) = 9 dB. Thus, if the speech signal fully loads the quantiser (i.e. peak value V_p = C) then it follows from Eq. (10.14) that
$$
\text{SQNR} = 6.02k - 4.23\ \text{dB} \tag{10.17}
$$
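The sinusoid result of Eq. (10.15) can be checked empirically. The Python sketch below quantises one cycle of a full-load sinusoid with a k-bit midrise rounding quantiser and measures the ratio of signal power to quantisation error power. The function names, the number of sample points and the normalisation C = 1 are assumptions made for illustration. A small departure from the formula is to be expected, because the error in the outermost intervals is not uniformly distributed for a sinusoid.

```python
import math

def measured_sqnr_db(k, points=100_000, C=1.0):
    """Measured SQNR (dB) of a full-load sinusoid in a k-bit midrise rounding quantiser."""
    levels = 2 ** k
    step = 2 * C / levels
    signal_power = 0.0
    noise_power = 0.0
    for i in range(points):
        x = C * math.sin(2 * math.pi * (i + 0.5) / points)  # one full cycle
        # Clip the index so that a sample at the very peak stays in the top interval.
        index = min(math.floor(abs(x) / step), levels // 2 - 1)
        y = math.copysign(step * (index + 0.5), x)           # interval midpoint
        signal_power += x * x
        noise_power += (x - y) ** 2
    return 10 * math.log10(signal_power / noise_power)

def predicted_sqnr_db(k):
    """Eq. (10.15) for a sinusoid that fully loads the quantiser."""
    return 1.76 + 6.02 * k

if __name__ == "__main__":
    for k in (6, 8, 10):
        print(k, round(measured_sqnr_db(k), 2), round(predicted_sqnr_db(k), 2))
```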
Worked Example 10.3
A speech signal is to be transmitted by PCM with an output SQNR of 55 dB.
(a) What is the minimum number of bits per sample that must be used to achieve this performance, if the speech signal fully loads the quantiser?
(b) If the quantiser is only half-loaded by the speech signal, what is the resulting output SQNR for the same number of bits/sample as above?

(a) The required bits/sample is obtained by rewriting Eq. (10.17) to make k the subject
$$
k = \frac{\text{SQNR} + 4.23}{6.02} = \frac{55 + 4.23}{6.02} = 9.84
$$
The smallest integer larger than or equal to the above result gives the minimum number of bits/sample: k = 10.
(b) The full expression for SQNR in Eq. (10.13) must be used in this case, with 20log10(R) = 9 and V_p/C = 1/2
$$
\begin{aligned}
\text{SQNR} &= 4.77 + 6.02k - 9 + 20\log_{10}\left( \frac{1}{2} \right) \\
&= 4.77 + 6.02 \times 10 - 9 - 6.02 \\
&\approx 50\ \text{dB}
\end{aligned}
$$
Note how the SQNR degrades when the quantiser is underloaded by a small input signal. Overloading, on the other hand, leads to clipping. Optimum performance is obtained by scaling the signal prior to quantisation to ensure that it just fully loads the quantiser.
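The calculations of Worked Example 10.3 amount to evaluating Eq. (10.13) and inverting it for k. A minimal Python sketch follows; the function names are hypothetical, the peak-to-rms ratio is passed in dB, i.e. as 20log10(R), and `loading` stands for the ratio V_p/C.

```python
import math

CONST_DB = 10 * math.log10(3)     # the 4.77 dB term of Eq. (10.13)
PER_BIT_DB = 20 * math.log10(2)   # the 6.02 dB per bit term

def sqnr_db(k, peak_to_rms_db, loading=1.0):
    """SQNR of Eq. (10.13) in dB; loading = Vp/C, with 0 < loading <= 1."""
    if not 0 < loading <= 1:
        raise ValueError("loading must lie in (0, 1]")
    return CONST_DB + PER_BIT_DB * k - peak_to_rms_db + 20 * math.log10(loading)

def min_bits(target_sqnr_db, peak_to_rms_db, loading=1.0):
    """Smallest integer k for which Eq. (10.13) meets the target SQNR."""
    k = (target_sqnr_db - CONST_DB + peak_to_rms_db
         - 20 * math.log10(loading)) / PER_BIT_DB
    return math.ceil(k)

if __name__ == "__main__":
    k = min_bits(55, 9)                      # speech, fully loaded quantiser
    print(k, round(sqnr_db(k, 9, loading=0.5), 2))
```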
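Figure 10.4 Eight-level uniform quantisation. [Figure not reproduced: the input analogue signal and the staircase quantised output, held over each sampling interval Ts = 1/Fs, with the output levels labelled by the codewords 011, 010, 001, 000, 100, 101, 110, 111; the bottom plot shows the quantisation error.]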
A demonstration of a combined process of sampling and uniform quantisation of a sinusoidal signal is shown in Figure 10.4. There are eight quantiser output levels, requiring k = 3 bits to represent each output level. The input signal is scaled prior to sampling in order to fully load the quantiser. The bottom plot of Figure 10.4 shows the quantisation error, which is the difference between the quantised output and the analogue input signal. At each sampling instant the value of the analogue signal sample is approximated to the midpoint of the quantisation interval in which it lies. This quantised value is then held until the next sampling instant. The result is a staircase output signal.
10.3.4 Design Considerations
Several important PCM system design considerations may be deduced from the relationships between the design parameters in Eqs. (10.12) and (10.13).
● SQNR increases exponentially with the number of bits per sample k, which is itself directly proportional to the bandwidth B required to transmit the PCM signal. Thus, PCM provides an exponential increase of SQNR with bandwidth. This is a better trade-off of bandwidth for noise improvement than that offered by frequency modulation (FM), where signal-to-noise ratio (SNR) increases roughly as the square of bandwidth. That is, if the SNR is $\text{SNR}_1$ at bandwidth $B_1$, and the bandwidth is increased by a factor n to $nB_1$, then the SNR increases to $n^2 \times \text{SNR}_1$ in FM, but more dramatically to $\text{SNR}_1^n$ in PCM. More simply put, you make a gain of 6.02 dB per extra bit used for coding each sample in PCM, but you generate more bits per second as a result, and therefore require a larger transmission bandwidth.
● The number of bits/sample required for a desired SQNR can be read from Figure 10.5 for the three signals discussed in Worked Example 10.2.
● SQNR decreases as the square of the quantiser range 2C needed to accommodate the input signals without clipping. An improvement in SQNR can therefore be realised by reducing the range of input signal values. Some differential quantisers achieve such gains by quantising the difference between adjacent samples, rather than the samples themselves. If the sampling rate is sufficiently high, adjacent samples are strongly correlated and the difference between them is very small, resulting in a reduced range of quantiser input values.
● A large segment of the quantisation error signal resembles a sawtooth waveform with a fundamental frequency that increases with the sampling frequency F_s. Thus, oversampling an analogue signal (i.e. choosing a much higher value of F_s than required by the sampling theorem) will have the effect of spreading out the quantisation noise power over a wider frequency band. As a result, only a significantly reduced fraction of the noise lies within the signal band at the reconstruction filter.
● When an input signal underloads the quantiser, SQNR decreases by 20log10(r) dB, where r is the ratio between the quantiser range 2C and the peak-to-peak value of the input signal. More simply put, a signal whose level is a given number of dB below the level that fully loads the quantiser will have an SQNR that is worse, by the same number of dB, than the values obtained from Figure 10.5. In speech communication, for example, this would mean a good SQNR for the loudest speakers and a significant degradation for soft speakers.
Figure 10.5 SQNR of a uniform quantiser as a function of number of bits/sample for various signal types. [Figure not reproduced: SQNR (dB), from 25 to 75, plotted against number of bits/sample k, from 5 to 12, with one line each for a sinusoid, a uniform-PDF signal and speech.]
10.3.5 Demerits of Uniform Quantisation
We have observed that the quantisation error associated with each quantised sample can range from 0 to a maximum of Δ/2, where Δ is the quantiser step size. This error must be kept small compared to the sample value. To faithfully quantise small samples, one must therefore choose a very small step size, which requires the number of quantisation intervals and hence number of bits per sample k to be large. For example, speech signals are characterised by a nonuniform PDF with a preponderance of low values. To maintain fidelity, these low values must be faithfully transmitted as they mostly represent the consonants that carry intelligibility. The typical dynamic range of a speech signal is 60 dB, which means a ratio of highest to lowest sample magnitude given by
$$
60 = 20\log_{10}\left( \frac{V_H}{V_L} \right)
$$
Thus
$$
\frac{V_H}{V_L} = 10^{60/20} = 1000
$$
That is, if the peak value allowed when digitising a speech signal is V_H = 1 V, then the weakest passage may be as low as V_L = 1 mV. A step size Δ < V_L is required to faithfully quantise the smallest samples. Choosing Δ = V_L results in 1000 intervals for the positive samples, and another 1000 for the negative samples, or 2000 intervals in total. The number of bits required to code up to 2000 levels is given by
$$
k = \lceil \log_2(2000) \rceil = 11\ \text{bits}
$$
where ⌈x⌉ denotes the smallest integer larger than or equal to x. With a sampling frequency F_s = 8 kHz, the bit rate that must be transmitted is 8000 samples/s × 11 bits/sample = 88 kb/s. Such a high bit rate translates to an unacceptably large transmission bandwidth requirement.
Another problem with uniform quantisation may be observed by noting that the quantisation error magnitude is constant across all intervals. The SQNR is therefore much smaller at the lower intervals (corresponding to small signal values) than nearer the top since
$$
V_L^2 / e_q^2 \ll V_H^2 / e_q^2
$$
In telephony, it is desirable to have a constant SQNR over a wide range of input signal values so that the service quality is maintained at the same level for both quiet and loud talkers. The above problems of large bit rate and nonconstant SQNR can be alleviated by using a nonuniform quantiser in which the step size is a function of input signal value. Large input samples are coarsely quantised using larger step sizes, whereas the smaller input samples are more finely quantised using smaller step sizes.
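The bit-rate estimate above for uniform quantisation of speech can be reproduced in a few lines. This is only a sketch of the arithmetic in this section, with hypothetical function names; it assumes, as in the text, a step size equal to the smallest sample magnitude V_L and an equal number of intervals for positive and negative samples.

```python
import math

def uniform_bits(dynamic_range_db):
    """Bits/sample needed when the step size equals the smallest magnitude V_L."""
    ratio = 10 ** (dynamic_range_db / 20)   # V_H / V_L
    levels = 2 * ratio                      # positive plus negative intervals
    return math.ceil(math.log2(levels))

def bit_rate(bits_per_sample, sampling_rate_hz):
    """Bit rate in b/s for a given sampling rate and number of bits per sample."""
    return bits_per_sample * sampling_rate_hz

if __name__ == "__main__":
    k = uniform_bits(60)
    print(k, bit_rate(k, 8000))
```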
10.4 Nonuniform Quantisation
Nonuniform quantisation may be achieved through either of the schemes shown in Figure 10.6. In Figure 10.6a, the analogue signal is first compressed before being quantised in a uniform quantiser. At the receiver, the decoded PCM signal is expanded in a way that reverses the effect of the compression process. The combined process of compression at the transmitter and expansion at the receiver is called companding. Note that companding does not introduce any distortion, and quantisation error remains the only source of distortion in the ADC process.
Figure 10.6 Nonuniform quantisation schemes. [Figure not reproduced: (a) Transmitter: analogue input → compressor → uniform ADC → PCM output; receiver: PCM input → uniform DAC → expander → analogue output. (b) Transmitter: analogue input → fine-grain uniform ADC (n bits/sample) → digital translator → PCM output (m bits/sample, n > m); receiver: PCM input → digital translator → fine-grain uniform DAC (n bits/sample) → analogue output.]
The scheme of Figure 10.6b first quantises the input signal using a fine-grain uniform quantiser of, say, n = 13 bits/sample, corresponding to $2^{13} = 8192$ levels. A digital translator is then used to reduce the number of transmitted bits/sample to, say, m = 8, corresponding to $2^8 = 256$ levels. The reduction is achieved in compressor fashion by mapping an increasing number of fine-grain levels to the same output level as you go from the low to high range of signal values. For example, the translation may progress from 2-to-1 near 0, to 128-to-1 near the maximum signal value. A 128-to-1 translation means that 128 fine-grain levels are equated to 1 output level, usually the midpoint of the 128 levels. At the receiver, a reverse translation is performed that converts from m to n bits/sample. The overall effect of this scheme is again finer quantisation of small signal samples and coarser quantisation of the larger samples.
The quantisation scheme of Figure 10.6b is in practice the preferred approach, since it can be implemented using low-cost digital signal processors (DSPs). However, to understand the required compressor characteristic, we first discuss the system of Figure 10.6a and then present the implementation of Figure 10.6b as a piece-wise linear approximation of the nonlinear compression function in the system of Figure 10.6a.
10.4.1 Compressor Characteristic
Figure 10.7 shows the input–output characteristic of the compressor. The input is normalised to the range −1 to +1. The output (y axis) is divided into N uniform intervals (illustrated in Figure 10.7 for N = 8), corresponding to uniform quantisation with step size Δ = 2/N. Note that the uniform intervals of the compressor output y correspond to differently sized intervals of the compressor input signal x, which are small near the origin and increase steadily for larger magnitudes of x. This arrangement achieves finer quantisation of the smaller samples of x and coarser quantisation of the larger samples. In the example shown in Figure 10.7, all input samples in the second positive interval are quantised to output $y_1$ with maximum error $\Delta_1/2$, whereas the input samples in the fourth positive interval are quantised to level $y_3$ with a bigger maximum error of $\Delta_3/2$.
Recall that one of the goals of nonuniform quantisation is to achieve a constant SQNR throughout the entire range of input signal x. Let us obtain the SQNR for a signal that is compressed and uniformly quantised as illustrated in Figure 10.7. Our aim is to determine the shape of the compression curve that gives an SQNR that is independent of x.
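Figure 10.7 Compressor characteristic. [Figure not reproduced: compressor output y, from −1 to +1, divided into uniform intervals dy = 2/N with levels ±y_0, ±y_1, ±y_2, ±y_3, plotted against input x, from −1 to +1, whose corresponding intervals dx = Δ_j (Δ_0, Δ_1, Δ_2, Δ_3) widen with increasing |x|.]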
When the step size varies from interval to interval as in Figure 10.7, the overall MSQE of the quantiser is obtained by averaging Eq. (10.9) over all the intervals as follows
$$
\text{MSQE} = \sum_j P_j \frac{\Delta_j^2}{12} \tag{10.18}
$$
where $\Delta_j$ is the step size of the jth interval and $P_j$ is the probability that a sample of the input signal will have a value lying within the range of the jth interval. Assuming sufficiently small intervals, the slope of the compressor curve is constant within the space of one interval, and (from Figure 10.7) is given for interval j by
$$
\frac{dy}{dx} = \frac{2/N}{\Delta_j}
$$
It follows that the jth step size is given by
$$
\Delta_j = \frac{2}{N} \frac{dx}{dy} \tag{10.19}
$$
The probability of the input signal value falling in the jth interval (of width dx) is
$$
P_j = p_X(x)\,dx \tag{10.20}
$$
where $p_X(x)$ is the PDF of the input signal x. In the limit of a large number of quantisation intervals, the summation for MSQE in Eq. (10.18) becomes an integration operation over the normalised input signal range −1 to +1. Thus
$$
\begin{aligned}
\text{MSQE} &= \frac{1}{12} \sum_j P_j \Delta_j^2 = \frac{1}{3N^2} \sum_{\text{All intervals}} \left( \frac{dx}{dy} \right)^2 p_X(x)\,dx \\
&= \frac{1}{3N^2} \int_{-1}^{+1} \left( \frac{dx}{dy} \right)^2 p_X(x)\,dx
\end{aligned}
\tag{10.21}
$$
where we have used Eqs. (10.19) and (10.20) for $\Delta_j$ and $P_j$. But the mean square value operation – see Eq. (3.21) – gives input signal power as
$$
\text{Signal Power} = \int_{-1}^{+1} x^2 p_X(x)\,dx \tag{10.22}
$$
The desired SQNR is given by the ratio between Eqs. (10.22) and (10.21)
$$
\text{SQNR} = \frac{\text{Signal Power}}{\text{MSQE}} = 3N^2 \frac{\int_{-1}^{+1} x^2 p_X(x)\,dx}{\int_{-1}^{+1} \left( \dfrac{dx}{dy} \right)^2 p_X(x)\,dx} \tag{10.23}
$$
The right-hand side of Eq. (10.23) must be independent of x if SQNR is to be independent of x as desired. By examining the above equation, it is easy to see that this can be achieved by setting
$$
\frac{dx}{dy} = Kx \tag{10.24}
$$
where K is a constant. This leads to the result
$$
\text{SQNR} = 3N^2 \frac{\int_{-1}^{+1} x^2 p_X(x)\,dx}{\int_{-1}^{+1} K^2 x^2 p_X(x)\,dx} = \frac{3N^2}{K^2} \tag{10.25}
$$
which is independent of input signal level x as desired. Thus, the correct compression characteristic is one that satisfies Eq. (10.24), or
$$
\frac{dy}{dx} = \frac{1}{Kx}
$$
Integrating
$$
y = \frac{1}{K} \ln(x) + D
$$
where D is a constant that we choose in order to make (x, y) = (1, 1) a point on the curve, since the normalised maximum input is compressed to the normalised maximum output. Thus D = 1, and the desired compressor characteristic is
$$
y = \frac{1}{K} \ln(x) + 1 \tag{10.26}
$$
which has a slope
$$
\frac{dy}{dx} = \frac{1}{Kx} \tag{10.27}
$$
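One way to see what the logarithmic characteristic of Eq. (10.26) achieves is to map the uniform output levels back to the input axis and compare each input interval width with the input value at its lower edge. The Python sketch below does this for the positive range; the values N = 16 and K = 5 are arbitrary illustrative choices, and the function name is hypothetical. Inverting Eq. (10.26) gives x = exp(K(y − 1)), so the relative interval width should be the same for every interval.

```python
import math

def log_compressor_edges(N, K):
    """Input-side edges of the positive output intervals for y = 1 + ln(x)/K."""
    step = 2.0 / N                              # uniform step on the output (y) axis
    # Invert Eq. (10.26): x = exp(K (y - 1)), at output levels y = step, 2*step, ..., 1
    return [math.exp(K * (j * step - 1.0)) for j in range(1, N // 2 + 1)]

edges = log_compressor_edges(N=16, K=5.0)
# Width of each input interval relative to the input value at its lower edge.
ratios = [(hi - lo) / lo for lo, hi in zip(edges, edges[1:])]

if __name__ == "__main__":
    print([round(r, 6) for r in ratios])
    print(round(math.exp(2 * 5.0 / 16) - 1, 6))   # predicted constant ratio
```

Every entry of `ratios` equals exp(2K/N) − 1, i.e. the step size grows in proportion to the input value. Note also that the lowest output levels map to inputs near exp(−K) rather than to zero, which is consistent with the difficulty of a purely logarithmic curve near x = 0.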
So, what have we achieved? We now have in Eq. (10.26) the full specification of a compression curve that can be used to compress the input signal x to give an output y, which when uniformly quantised produces a constant SQNR across the entire input signal range. The result of these two steps (of compression using the curve specified in Eq. (10.26) followed by uniform quantisation) is fine quantisation of small input values and coarse quantisation of larger input values. However, there is a practical problem with the compression function of Eq. (10.26). The slope of the curve (see Eq. (10.27)) is infinite at x = 0, implying infinitesimally small quantiser steps as x → 0. To circumvent this problem, the logarithmic function in Eq. (10.26) is replaced by a linear function in the region x → 0. The ITU-T (International Telecommunication Union – Telecommunication) has standardised two such compressor characteristics, the A-law in Europe and the 𝜇-law in North America and Japan.
10.4.2 A-law Companding
This companding law follows from Eq. (10.26) by setting the constant
$$
K = 1 + \ln(A) \tag{10.28}
$$
where A is a positive constant (usually A = 87.6). This defines the logarithmic portion of the characteristic
$$
\begin{aligned}
y_{\log} &= \frac{1}{K} \ln(x) + 1 = \frac{\ln(x) + K}{K} \\
&= \frac{\ln(x) + 1 + \ln(A)}{1 + \ln(A)} \\
&= \frac{1 + \ln(Ax)}{1 + \ln(A)}
\end{aligned}
\tag{10.29}
$$
A linear function ylin is used in the region |x| ≤ 1/A. This is the region x → 0 referred to above. The linear function

$$y_{\text{lin}} = mx + c \tag{10.30}$$

where m and c are constants, is determined by satisfying two conditions:

● That ylin passes through the origin, so that x = 0 is compressed to y = 0. This means that in Eq. (10.30), ylin = 0 when x = 0, so that the constant c = 0.

● That, for continuity, the linear and logarithmic functions have the same value at x = 1/A. Since

$$y_{\text{lin}}\big|_{x=1/A} = \frac{m}{A} \quad \text{and} \quad y_{\log}\big|_{x=1/A} = \frac{1 + \ln\left(A \times \frac{1}{A}\right)}{1 + \ln(A)} = \frac{1}{1 + \ln(A)}$$

it follows by equating both expressions that

$$m = \frac{A}{1 + \ln(A)}$$

Thus

$$y_{\text{lin}} = mx + c = \frac{Ax}{1 + \ln(A)} \tag{10.31}$$
To summarise, the A-law compression curve is defined by the following equations

$$y = \begin{cases} \dfrac{Ax}{1 + \ln(A)}, & 0 \le x \le 1/A \\[2mm] \dfrac{1 + \ln(Ax)}{1 + \ln(A)}, & 1/A \le x \le 1 \\[2mm] -y(|x|), & -1 \le x \le 0 \end{cases} \tag{10.32}$$
The last expression in Eq. (10.32) indicates that the compression curve y has odd symmetry, so that a negative input value, say, −X, where X is positive, is compressed to give a negative output that has the same magnitude as the output corresponding to input X. For later use, the gradient of the A-law compression curve at x = 0 is

$$\frac{dy}{dx}\bigg|_{x=0} = \frac{d}{dx}\left[\frac{Ax}{1 + \ln(A)}\right]\bigg|_{x=0} = \frac{A}{1 + \ln A} \tag{10.33}$$

and the gradient at x = 1 is

$$\frac{dy}{dx}\bigg|_{x=1} = \frac{d}{dx}\left[\frac{1 + \ln(Ax)}{1 + \ln(A)}\right]\bigg|_{x=1} = \frac{1/x}{1 + \ln A}\bigg|_{x=1} = \frac{1}{1 + \ln A} \tag{10.34}$$
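The A-law curve of Eq. (10.32) and its end-point gradients can be checked numerically. The following is a minimal Python sketch; the function names `a_law_compress` and `a_law_slope` are illustrative choices and do not come from any standard library.

```python
import math


def a_law_compress(x, A=87.6):
    """Compress a normalised input x in [-1, 1] using the curve of Eq. (10.32)."""
    if not -1.0 <= x <= 1.0:
        raise ValueError("x must lie in the normalised range [-1, 1]")
    K = 1.0 + math.log(A)                  # constant K of Eq. (10.28)
    mag = abs(x)
    if mag <= 1.0 / A:
        y = A * mag / K                    # linear region, Eq. (10.31)
    else:
        y = (1.0 + math.log(A * mag)) / K  # logarithmic region, Eq. (10.29)
    return math.copysign(y, x)             # odd symmetry for negative inputs


def a_law_slope(x, A=87.6):
    """Gradient dy/dx of the A-law curve at input x."""
    K = 1.0 + math.log(A)
    mag = abs(x)
    if mag <= 1.0 / A:
        return A / K                       # constant slope in the linear region
    return 1.0 / (K * mag)                 # slope of the logarithmic region


if __name__ == "__main__":
    A = 87.6
    for x in (0.0, 1.0 / A, 0.1, 0.5, 1.0):
        print(f"x = {x:.5f}  y = {a_law_compress(x, A):.5f}  dy/dx = {a_law_slope(x, A):.4f}")
```

Evaluating both branches at x = 1/A gives the same output 1/(1 + ln A), which is the continuity condition used above to fix the constant m.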
10.4.3 𝝁-law Companding

This companding function is obtained from Eq. (10.26) in two steps:

1. Set the constant

$$K = \ln(1 + \mu) \tag{10.35}$$

where 𝜇 is a positive constant (usually 𝜇 = 255), so that

$$y = \frac{\ln(x)}{\ln(1 + \mu)} + 1 = \frac{\ln(x) + \ln(1 + \mu)}{\ln(1 + \mu)} = \frac{\ln(x + \mu x)}{\ln(1 + \mu)}$$

2. Modify the above result by replacing x + 𝜇x in the numerator with 1 + 𝜇x. This gives the 𝜇-law compression function

$$y = \begin{cases} \dfrac{\ln(1 + \mu x)}{\ln(1 + \mu)}, & x \ge 0 \\[2mm] -y(|x|), & -1 \le x \le 0 \end{cases} \tag{10.36}$$
The modification in the second step is necessary in order to satisfy the requirement for linear compression in the region x → 0. To see that this is the case, note that ln(1 + z) ≈ z for z → 0, which means that the 𝜇-law compression curve reduces to the following linear relation near the origin

$$y \equiv y_{\text{lin}} = \frac{\mu x}{\ln(1 + \mu)}, \quad x \to 0 \tag{10.37}$$
Furthermore, if 𝜇 ≫ 1 then 1 + 𝜇x ≈ 𝜇x for x → 1. Thus, the compression is logarithmic as required for large input values

$$y \equiv y_{\log} = \frac{\ln(\mu x)}{\ln(1 + \mu)}, \quad x \to 1 \tag{10.38}$$

The 𝜇-law function has derivative

$$\frac{dy}{dx} = \frac{\mu}{1 + \mu x} \cdot \frac{1}{\ln(1 + \mu)}$$

and thus its slope at x = 0 is

$$\frac{dy}{dx}\bigg|_{x=0} = \frac{\mu}{\ln(1 + \mu)} \tag{10.39}$$

and at x = 1 its slope is

$$\frac{dy}{dx}\bigg|_{x=1} = \frac{\mu}{(1 + \mu)\ln(1 + \mu)} \tag{10.40}$$
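The 𝜇-law function of Eq. (10.36) and the slopes of Eqs. (10.39) and (10.40) may be evaluated in the same way. This is only an illustrative sketch; `mu_law_compress` and `mu_law_slope` are hypothetical names chosen here, and `math.log1p(z)` is used as a numerically careful form of ln(1 + z).

```python
import math


def mu_law_compress(x, mu=255.0):
    """Compress a normalised input x in [-1, 1] using the curve of Eq. (10.36)."""
    if not -1.0 <= x <= 1.0:
        raise ValueError("x must lie in the normalised range [-1, 1]")
    y = math.log1p(mu * abs(x)) / math.log1p(mu)   # ln(1 + mu|x|) / ln(1 + mu)
    return math.copysign(y, x)                     # odd symmetry for negative inputs


def mu_law_slope(x, mu=255.0):
    """Gradient dy/dx of the mu-law curve at input x."""
    return mu / ((1.0 + mu * abs(x)) * math.log1p(mu))


if __name__ == "__main__":
    mu = 255.0
    x_small = 1e-6
    # Near the origin the curve approaches the linear relation of Eq. (10.37)
    print(mu_law_compress(x_small, mu), mu * x_small / math.log1p(mu))
    # Slopes at the two ends of the input range, Eqs. (10.39) and (10.40)
    print(mu_law_slope(0.0, mu), mu_law_slope(1.0, mu))
```

The two values printed on the first line agree closely, which illustrates the small-signal linear behaviour that motivated replacing x + 𝜇x with 1 + 𝜇x.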
Worked Example 10.4

Discuss how the values of A and 𝜇 affect the relative sizes of the quantisation steps in A-law and 𝜇-law companding.

The maximum step size occurs at x = 1 (normalised), and the minimum at x = 0. Let us denote these step sizes as Δmax and Δmin, respectively. Applying Eq. (10.19)

$$\Delta_{\min} = \frac{2}{N}\,\frac{dx}{dy}\bigg|_{x=0} = \frac{2/N}{(dy/dx)|_{x=0}}$$

$$\Delta_{\max} = \frac{2}{N}\,\frac{dx}{dy}\bigg|_{x=1} = \frac{2/N}{(dy/dx)|_{x=1}}$$

Thus

$$\frac{dy}{dx}\bigg|_{x=0} = \frac{2/N}{\Delta_{\min}}; \qquad \frac{dy}{dx}\bigg|_{x=1} = \frac{2/N}{\Delta_{\max}} \tag{10.41}$$
and the ratio of maximum step size to minimum step size is the ratio between the gradients of the compressor curve at x = 0 and x = 1

$$\frac{\Delta_{\max}}{\Delta_{\min}} = \frac{(dy/dx)|_{x=0}}{(dy/dx)|_{x=1}}$$

For A-law, we use the expressions for the above derivatives given in Eqs. (10.33) and (10.34) to obtain

$$\frac{\Delta_{\max}}{\Delta_{\min}} = A \tag{10.42}$$

For 𝜇-law, the above derivatives are given in Eqs. (10.39) and (10.40), from which we obtain

$$\frac{\Delta_{\max}}{\Delta_{\min}} = 1 + \mu \quad \text{or} \quad \mu = \frac{\Delta_{\max}}{\Delta_{\min}} - 1 \tag{10.43}$$
We see therefore that the constant A sets the ratio between the maximum and minimum step size in A-law compression. If A = 1, then Δmax = Δmin and the step sizes are all equal. This is the special case of uniform quantisation. A significant compression is achieved by choosing A ≫ 1. In the case of 𝜇-law compression, 𝜇 = 0 gives Δmax = Δmin and corresponds to uniform quantisation. The required compression characteristic, Eq. (10.38), is obtained only by choosing a large value for the constant 𝜇, usually 𝜇 = 255. Figure 10.8 shows the A-law and 𝜇-law characteristics for various values of A and 𝜇.
Figure 10.8 Worked Example 10.4: A-law and 𝜇-law compression characteristics. Compressed output y versus normalised input x, each over the range −1.0 to 1.0, with curves for (A = 1; 𝜇 → 0), (A = 87.6; 𝜇 = 255), (A = 1000; 𝜇 = 2000) and (A = ∞; 𝜇 = ∞).

10.4.4 Companding Gain and Penalty

Companding gain Gc is defined as the improvement in SQNR for small input values in a nonuniform quantiser compared with the SQNR of the same signal type when using a uniform quantiser of the same number of bits/sample. It follows that Gc is the square of the ratio between the step size of a uniform quantiser and the smallest step size of a nonuniform quantiser of the same bits/sample and input range. Noting from Figure 10.7 that dy is the step size of a uniform quantiser and dx is the corresponding step size of a nonuniform quantiser, it follows that

$$G_c = 10\log_{10}\left[(dy/dx)^2\big|_{x=0}\right] = 20\log_{10}\left[(dy/dx)\big|_{x=0}\right]\ \text{dB} \tag{10.44}$$
Using Eqs. (10.31) and (10.37), we obtain the companding gain for A-law and 𝜇-law companding as follows

$$\begin{aligned} G_{c,\ \text{A-law}} &= 20\log_{10}\left[\frac{A}{1 + \ln(A)}\right]\ \text{dB} \\ G_{c,\ \mu\text{-law}} &= 20\log_{10}\left[\frac{\mu}{\ln(1 + \mu)}\right]\ \text{dB} \end{aligned} \tag{10.45}$$

Thus, if A = 87.6, A-law nonuniform quantisation gives a gain of 24 dB over a uniform quantisation that uses the same number of bits/sample. A gain of 33 dB is realised with 𝜇-law for 𝜇 = 255. We will see later that, in practice, a piecewise linear approximation is adopted, resulting in a slightly lower companding gain of 30 dB for 𝜇-law.

Recall that an improvement of 6 dB in SQNR is provided by each extra bit used for coding the quantised samples. A gain of 24 dB is therefore equivalent to four extra coding bits. This means that a uniform quantiser would require 12 bits/sample in order to have the same performance as A-law with 8 bits/sample. By using A-law nonuniform quantisation, we have reduced the required number of bits/sample from 12 to 8, representing a saving of 33% in bit rate and hence bandwidth. In the case of 𝜇-law that achieves 30 dB of companding gain, the bit rate reduction is 5 bits, from 13 to 8, a saving in bandwidth of 38.5%. Note that these figures apply only to the bandwidth required for transmitting information bits. In practice, overhead (noninformation) bits must be added, as discussed in Chapter 13, leading to a lower saving in bandwidth.

Companding penalty Lc is defined as the ratio between the SQNR of an input signal that fully loads a uniform quantiser and the SQNR of the same signal type when using a nonuniform quantiser of the same number of bits/sample. We noted earlier that the SQNR of a uniform (or linear) quantiser decreases with peak input signal level. SQNR in the nonuniform (or log-) quantiser discussed above does not vary significantly over the entire input signal range. However, we fail to realise an ideal nonlinear quantisation in which SQNR is strictly constant.
This is because, for practical implementation, we had to replace the logarithmic curve of Eq. (10.26) with a linear curve for |x| ≤ 1/A in the A-law. In the case of 𝜇-law, an approximate curve was employed that becomes linear as x → 0 and logarithmic as x → 1 (normalised).

Consider a log-quantiser and a linear quantiser employing the same number of bits/sample and let their SQNR be respectively denoted SQNRlog and SQNRlin. We will work with normalised input values, so the quantiser input range is ±1 and the peak value of the input signal is Vp ≤ 1, with Vp = 1 (= 0 dB relative to quantiser limit C) if the input signal fully loads the quantiser. At the top end of the quantiser input range (i.e. x ≈ 1) the step size of the linear quantiser (= 2/N) is smaller than the step size of the log-quantiser (= Δmax), whereas at the bottom end (x ≈ 0, or x ≪ 1) the linear quantiser step size (still 2/N) is much larger than the log-quantiser step size (which in this region is Δmin). Thus, for small input signals (having peak value Vp ≪ 1), SQNRlog > SQNRlin and the difference SQNRlog − SQNRlin in dB is the companding gain Gc discussed above and given by

$$G_c \equiv (\text{SQNR}_{\log} - \text{SQNR}_{\text{lin}})\big|_{x \approx 0} = 20\log_{10}\left(\frac{2/N}{\Delta_{\min}}\right)\ \text{dB} = 20\log_{10}\left(\frac{dy}{dx}\bigg|_{x=0}\right)\ \text{dB}$$

where we have used Eq. (10.41) for the last line, which you will recognise to be Eq. (10.44) for the companding gain of a compressor of transfer characteristic specified by output y as a function of input x.
On the other hand, for large input signals (with peak value Vp ≈ 1), SQNRlin > SQNRlog and their difference in dB is known as companding penalty Lc given by

$$L_c \equiv (\text{SQNR}_{\text{lin}} - \text{SQNR}_{\log})\big|_{x \approx 1} = 20\log_{10}\left(\frac{\Delta_{\max}}{2/N}\right)\ \text{dB} = -20\log_{10}\left(\frac{dy}{dx}\bigg|_{x=1}\right)\ \text{dB}$$

where we have again used Eq. (10.41) for the last line. Substituting Eqs. (10.34) and (10.40) for the slopes of the A-law and 𝜇-law functions at x = 1, we obtain

$$L_c = \begin{cases} 20\log_{10}(1 + \ln A)\ \text{dB}, & \text{A-law} \\[2mm] 20\log_{10}\left[\left(1 + \dfrac{1}{\mu}\right)\ln(1 + \mu)\right]\ \text{dB}, & \mu\text{-law} \end{cases} \tag{10.46}$$
Putting A = 87.6 and 𝜇 = 255 in the above equations yields companding penalty Lc = 14.8 dB for A-law and Lc = 14.9 dB for 𝜇-law. What this means is that the SQNR improvement (by 24 dB in A-law and 30 dB in 𝜇-law) achieved by a log-quantiser for small input signals is through sacrificing signal quality at the top end of input (by ∼15 dB in both A-law and 𝜇-law). This results in a lower but consistent SQNR across the entire input range which is more satisfying in communication services than the situation that obtains in uniform quantisation where SQNR is high at the top end and poor at the bottom end. Log-quantisation is truly an ingenious way of imitating the familiar societal practice of taking a little more from the rich or strong (≡ companding penalty) to benefit the poor or weak (≡ companding gain).

SQNRlin (in dB) decreases proportionately with peak input signal level Vp (in dB), whereas SQNRlog remains roughly constant as Vp decreases until Vp ≪ 1 (or more precisely for A-law, Vp ≡ Vth = 1/A) when log-quantisation is abandoned and linear quantisation is adopted with a fixed step size Δmin, so that SQNRlog also begins to decrease in step with Vp from this point. Figure 10.9 illustrates this variation of SQNR with Vp in a linear quantiser and also in a log-quantiser that utilises log-quantisation up to the point Vp = Vth and linear quantisation below Vth. Both quantisers operate with the same number of quantisation levels N, and therefore the same number of bits per sample k. Note in this graph that Vp decreases from left to right along the x axis so that Vlog < Vth < 0.

Figure 10.9 SQNR versus peak input level for linear and log-quantisers. Lc is the companding penalty and Gc the companding gain. Vertical axis: SQNR (dB), with levels SQNRlinmax, SQNRlogmax and S marked; horizontal axis: peak input level Vp (dB relative to quantiser limit C), with points 0, Vth and Vlog marked.

The linear and log-quantisers have respective maximum SQNR denoted SQNRlinmax and SQNRlogmax. The difference between these two values is the companding penalty Lc defined earlier

$$L_c = \text{SQNR}_{\text{linmax}} - \text{SQNR}_{\text{logmax}}\ \text{dB}$$
At Vp = Vth, the SQNR of the log-quantiser is still SQNRlogmax but the SQNR of the linear quantiser has dropped to S. The difference between these two SQNR values is the companding gain Gc earlier defined

$$G_c = \text{SQNR}_{\text{logmax}} - S\ \text{dB} \quad \Rightarrow \quad S = \text{SQNR}_{\text{logmax}} - G_c\ \text{dB}$$

Beyond Vth, the log-quantiser's SQNR decreases linearly in step with Vp, reaching the value S at Vp = Vlog. Thus

$$S = \text{SQNR}_{\text{logmax}} + V_{\log} - V_{\text{th}}\ \text{dB}$$

Equating the right-hand side of the last two equations yields

$$G_c = V_{\text{th}} - V_{\log}\ \text{dB} \tag{10.47}$$
Equation (10.47) suggests the following equivalent definition of companding gain: to achieve the same SQNR for small input signals, a linear quantiser requires a higher peak input signal level than a log-quantiser operating with the same number of bits/sample. The dB difference between the peak input signal level of the linear quantiser and the peak input signal level of the log-quantiser when their SQNR are equal is the companding gain of the log-quantiser.
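The companding gain and penalty figures quoted in this section follow directly from Eqs. (10.45) and (10.46). A short Python sketch is given below as a numerical check; the function names are illustrative assumptions rather than part of any library.

```python
import math


def gain_a_law_db(A):
    """Companding gain of the exact A-law curve, Eq. (10.45)."""
    return 20.0 * math.log10(A / (1.0 + math.log(A)))


def gain_mu_law_db(mu):
    """Companding gain of the exact mu-law curve, Eq. (10.45)."""
    return 20.0 * math.log10(mu / math.log(1.0 + mu))


def penalty_a_law_db(A):
    """Companding penalty of the exact A-law curve, Eq. (10.46)."""
    return 20.0 * math.log10(1.0 + math.log(A))


def penalty_mu_law_db(mu):
    """Companding penalty of the exact mu-law curve, Eq. (10.46)."""
    return 20.0 * math.log10((1.0 + 1.0 / mu) * math.log(1.0 + mu))


if __name__ == "__main__":
    print(f"A-law  (A = 87.6): Gc = {gain_a_law_db(87.6):.1f} dB, Lc = {penalty_a_law_db(87.6):.1f} dB")
    print(f"mu-law (mu = 255): Gc = {gain_mu_law_db(255.0):.1f} dB, Lc = {penalty_mu_law_db(255.0):.1f} dB")
```

These values apply to the exact continuous curves; the piecewise linear schemes used in practice give somewhat different figures.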
10.4.5 Practical Nonlinear PCM

In practice, nonlinear A-law and 𝜇-law companding is implemented using the piecewise linear approximation shown in Figure 10.10a,b for A-law and 𝜇-law, respectively. Details of this piecewise companding scheme are presented in Tables 10.4 and 10.5 for the two laws.

Consider first the A-law scheme in Figure 10.10a and Table 10.4. The input signal is normalised to the range ±C, where C = 4096. This is simply to scale all step sizes into integer numbers. The magnitude of negative and positive input signal values is processed in the same way, and in the output code the MSB is set to 1 for positive values and 0 for negative. Figure 10.10a, Table 10.4 and the following discussion show how the positive input values (0 → 4096) are handled.

The A-law curve is approximated using a series of eight straight-line segments covering the input intervals (0 → 32), (32 → 64), (64 → 128), (128 → 256), (256 → 512), (512 → 1024), (1024 → 2048), (2048 → 4096). These segments are numbered from 0 to 7 in the first column of Table 10.4. Each segment is divided into 16 equal intervals, numbered from 0 to 15 in column four of the table. The first two segments (s = 0, 1) have equal intervals or step sizes, normalised to 2, whereas in the remaining six segments (s = 2 to 7) the step size in one segment is double that of the previous segment. The compression line for segments 0 and 1 is a single straight line that straddles the origin and covers the input range −64 to +64. Thus, there are 13 unique line segments in all.

The range of input values that constitutes a quantisation interval in each segment is given in the third column of Table 10.4, which also shows the 128 output levels available to the positive input signal, numbered from 0 to 127 in column five. There are a further 128 output levels, not shown, for negative input signals. This gives 256 output levels in total, which can therefore be represented using 8 bits.
Figure 10.10 (a) A-law piecewise companding (A = 87.6): output y (0 to 128) versus input x (0 to 4096), showing the A-law function and its piecewise linear approximation through segments s0 to s7, with segment boundaries at x = 32, 64, 128, 256, 512, 1024, 2048 and 4096. (b) 𝜇-law piecewise companding (𝜇 = 255): output y (−128 to 128) versus input x (−8160 to 8160), with segment boundaries at x = ±32, ±96, ±224, ±480, ±992, ±2016, ±4064 and ±8160.

Table 10.4 A-law coding and decoding specifications.

| Segment (s) | Step size (Δ) | Input range (X) | Interval (l) | Output, 256 levels (Yq) | Output code (8-bit) | Receiver output (Xq) |
|---|---|---|---|---|---|---|
| 0 | 2 | 0–2 | 0 | 0 | 1 000 0000 | 1 |
| | | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| | | 30–32 | 15 | 15 | 1 000 1111 | 31 |
| 1 | 2 | 32–34 | 0 | 16 | 1 001 0000 | 33 |
| | | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| | | 62–64 | 15 | 31 | 1 001 1111 | 63 |
| 2 | 4 | 64–68 | 0 | 32 | 1 010 0000 | 66 |
| | | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| | | 124–128 | 15 | 47 | 1 010 1111 | 126 |
| 3 | 8 | 128–136 | 0 | 48 | 1 011 0000 | 132 |
| | | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| | | 248–256 | 15 | 63 | 1 011 1111 | 252 |
| 4 | 16 | 256–272 | 0 | 64 | 1 100 0000 | 264 |
| | | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| | | 496–512 | 15 | 79 | 1 100 1111 | 504 |
| 5 | 32 | 512–544 | 0 | 80 | 1 101 0000 | 528 |
| | | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| | | 992–1024 | 15 | 95 | 1 101 1111 | 1008 |
| 6 | 64 | 1024–1088 | 0 | 96 | 1 110 0000 | 1056 |
| | | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| | | 1984–2048 | 15 | 111 | 1 110 1111 | 2016 |
| 7 | 128 | 2048–2176 | 0 | 112 | 1 111 0000 | 2112 |
| | | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| | | 3968–4096 | 15 | 127 | 1 111 1111 | 4032 |

Observe that segments s = 0 and 1 have the smallest step size Δmin = 2. If the entire input range (±C) were quantised uniformly using this step size, the number of quantisation levels would be

$$N = \frac{2C}{\Delta_{\min}} = \frac{2 \times 4096}{2} = 2^{12}$$

which would require 12 bits to represent. Thus, the small input values have an equivalent of k = 12 bits/sample linear quantisation, which we may express as

$$k_{\max} = \lceil \log_2(2C/\Delta_{\min}) \rceil \tag{10.48}$$

Segment 7, on the other hand, has the maximum step size Δmax = 128. If the entire input range were uniformly quantised at this spacing, the number of output levels would be

$$N = \frac{2C}{\Delta_{\max}} = \frac{2 \times 4096}{128} = 64 = 2^{6}$$

Thus, the large input values are coarsely quantised at a resolution equivalent to k = 6 bits/sample linear quantisation, or

$$k_{\min} = \lceil \log_2(2C/\Delta_{\max}) \rceil \tag{10.49}$$
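Equations (10.48) and (10.49) are simple enough to evaluate by hand, but a small helper makes the equivalent linear resolution easy to check for any segment. The sketch below is illustrative only; `equivalent_bits` is a hypothetical name, and the values passed to it are the A-law figures quoted above.

```python
import math


def equivalent_bits(c_limit, step):
    """Bits/sample of a uniform quantiser of range ±c_limit with the given step size,
    following the form of Eqs. (10.48) and (10.49)."""
    return math.ceil(math.log2(2 * c_limit / step))


if __name__ == "__main__":
    C = 4096                              # A-law quantiser limit
    print(equivalent_bits(C, 2))          # finest step (segments 0 and 1)
    print(equivalent_bits(C, 128))        # coarsest step (segment 7)
```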
The binary code b7 b6 b5 b4 b3 b2 b1 b0 for each of the 256 output levels is given in column six of Table 10.4. This code is determined as follows.

1. The MSB b7 is set to 1 for a positive input value and 0 for a negative input value.
2. b6 b5 b4 is the binary number equivalent of segment s. For example, if s = 6 then b6 b5 b4 = 110; and if s = 3 then b6 b5 b4 = 011.
3. b3 b2 b1 b0 is the binary number equivalent of the interval l within a segment. For example, if l = 13 then b3 b2 b1 b0 = 1101; and if l = 2 then b3 b2 b1 b0 = 0010.

At the receiver, this code is converted to the value shown in the last column of Table 10.4, which is the midpoint of the interval (column three) in which the original input sample falls.

Before moving on to 𝜇-law coding, it is worth highlighting again that in A-law PCM the largest input samples are coarsely quantised at a resolution of 6 bits/sample, whereas the smallest input samples are finely quantised at a resolution of 12 bits/sample. This means that, compared to 8-bit linear ADC, A-law PCM delivers a 4-bit improvement for small inputs at the bottom end, which, since each extra bit yields a 6 dB increase in SQNR, corresponds to a companding gain of 24 dB, as discussed in Section 10.4.4. Notice, however, that this improvement is achieved at the expense of a 2-bit shortage for large input signals at the top end, which corresponds to a companding penalty of 12 dB. This companding penalty is less than the 15 dB calculated in Section 10.4.4 because the piecewise linear approximation to the compression curve leads to a smaller maximum step size than in the exact curve.

𝜇-law piecewise companding (Figure 10.10b and Table 10.5) follows a similar strategy to the A-law, but with several important differences. Note that the input signal is normalised to the range ±8160 to allow the use of integer numbers for the step sizes. Of the 16 segments, the two that straddle the origin are co-linear, giving 15 unique segments. Starting from segment 0 (for the positive signal range), the step size of each segment is double that of the previous one. Following Eqs. (10.48) and (10.49), we see that the 𝜇-law scheme is equivalent to a fine linear quantisation of the smallest input values (in segment s = 0) using

$$k_{\max} = \lceil \log_2(2C/\Delta_{\min}) \rceil = \left\lceil \log_2\left(\frac{2 \times 8160}{2}\right) \right\rceil = 13\ \text{bits/sample}$$

and a coarse linear quantisation of the largest input values (in segment s = 7) using

$$k_{\min} = \lceil \log_2(2C/\Delta_{\max}) \rceil = \left\lceil \log_2\left(\frac{2 \times 8160}{256}\right) \right\rceil = 6\ \text{bits/sample}$$

Table 10.5 𝜇-law coding and decoding specifications.

| Segment (s) | Step size (Δ) | Input range (X) | Interval (l) | Output, 256 levels (Yq) | Output code (8-bit) | Receiver output (Xq) |
|---|---|---|---|---|---|---|
| 0 | 2 | 0–2 | 0 | 0 | 1 000 0000 | 1 |
| | | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| | | 30–32 | 15 | 15 | 1 000 1111 | 31 |
| 1 | 4 | 32–36 | 0 | 16 | 1 001 0000 | 34 |
| | | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| | | 92–96 | 15 | 31 | 1 001 1111 | 94 |
| 2 | 8 | 96–104 | 0 | 32 | 1 010 0000 | 100 |
| | | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| | | 216–224 | 15 | 47 | 1 010 1111 | 220 |
| 3 | 16 | 224–240 | 0 | 48 | 1 011 0000 | 232 |
| | | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| | | 464–480 | 15 | 63 | 1 011 1111 | 472 |
| 4 | 32 | 480–512 | 0 | 64 | 1 100 0000 | 496 |
| | | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| | | 960–992 | 15 | 79 | 1 100 1111 | 976 |
| 5 | 64 | 992–1056 | 0 | 80 | 1 101 0000 | 1024 |
| | | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| | | 1952–2016 | 15 | 95 | 1 101 1111 | 1984 |
| 6 | 128 | 2016–2144 | 0 | 96 | 1 110 0000 | 2080 |
| | | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| | | 3936–4064 | 15 | 111 | 1 110 1111 | 4000 |
| 7 | 256 | 4064–4320 | 0 | 112 | 1 111 0000 | 4192 |
| | | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| | | 7904–8160 | 15 | 127 | 1 111 1111 | 8032 |

An important summary of the segments and step sizes used in A-law and 𝜇-law PCM is provided in Table 10.6. In Worked Example 10.5, we show how the information in this table is used for PCM coding and decoding.

Piecewise linear companding may be viewed as a two-step process (Figure 10.6b) of fine uniform quantisation and digital translation. This view is demonstrated in Figure 10.11 for A-law piecewise companding. Note that the results of Figure 10.11 and Table 10.4 are the same.

Figure 10.12 provides a summary of the steps involved in converting an analogue sample to a PCM code. In the coding process, the analogue sample x(n) is first scaled by a factor of F for the input signal x(t) to fully load the quantiser, which has an input range of −C to +C. In our case, in order to use Table 10.6, the value of C is 4096 for A-law PCM and 8160 for 𝜇-law.
The scaled sample, denoted X(n), is converted in the quantiser to a value Xq(n) equal to the midpoint of the quantisation interval in which X(n) lies. The encoder then converts Xq(n) to a binary code according to the procedure outlined above. It is noteworthy that at every stage of Figure 10.12 the signal is processed in a reversible manner, except at the quantiser. The original input sample x(n) may be obtained from X(n), and likewise Xq(n) from the binary PCM code b7 b6 b5 b4 b3 b2 b1 b0. But once the exact sample X(n) has been converted to the approximate value Xq(n), knowledge of X(n) is lost for ever and the incurred quantisation error is a permanent degradation of the transmitted signal. It is important to keep this error to a minimum.

Table 10.6 Segments and step sizes in A-law and 𝜇-law PCM.

| Segment (s) | A-law step size (Δ) | A-law Xs min | A-law Xs max | 𝜇-law step size (Δ) | 𝜇-law Xs min | 𝜇-law Xs max |
|---|---|---|---|---|---|---|
| 0 | 2 | 0 | 32 | 2 | 0 | 32 |
| 1 | 2 | 32 | 64 | 4 | 32 | 96 |
| 2 | 4 | 64 | 128 | 8 | 96 | 224 |
| 3 | 8 | 128 | 256 | 16 | 224 | 480 |
| 4 | 16 | 256 | 512 | 32 | 480 | 992 |
| 5 | 32 | 512 | 1024 | 64 | 992 | 2016 |
| 6 | 64 | 1024 | 2048 | 128 | 2016 | 4064 |
| 7 | 128 | 2048 | 4096 | 256 | 4064 | 8160 |
Figure 10.11 Digital translation view of A-law piecewise companding (Tx ≡ transmitter; Rx ≡ receiver). Processing chain: analogue input → fine-grain linear ADC (k = 13) → Tx digital translator (13 → 8) → A-law PCM (k = 8) → Rx digital translator (8 → 13). In segments s0 to s7 the Tx translator maps 32, 32, 64, 128, 256, 512, 1024 and 2048 fine levels onto 16 levels each, at ratios of 2, 2, 4, 8, 16, 32, 64 and 128 to 1, respectively; the Rx translator maps each 8-bit code 1-to-1 onto a 13-bit value.

Figure 10.12 Converting analogue sample to PCM code. Processing chain: analogue sample x(n) → scaling by F → X(n) → quantiser → Xq(n) → encoder → 8-bit PCM code (b7b6b5b4b3b2b1b0 ≡ Ssssllll). In the PCM code, S represents a sign bit, sss represents one of 8 segments, and llll represents one of 16 quantisation intervals within each segment.
At the receiver, an incoming PCM code is converted to a quantised value Xq(n), which is the midpoint of interval l within segment s, the values of both l and s being indicated by the code. Xq(n) is de-scaled by the factor 1/F to yield a received signal sample xq(n), which, barring channel-induced errors in the received PCM code, differs from the original sample x(n) by a quantisation error given by

$$e_q = |x_q(n) - x(n)| = \left|\frac{X_q(n) - X(n)}{F}\right| \tag{10.50}$$

The maximum value of eq, denoted eq max, is half the step size of the segment s in which X(n) lies. It follows from Tables 10.4 and 10.5 that for A-law PCM

$$e_{q\,\max} = \begin{cases} 1/F, & s = 0 \\ 2^{(s-1)}/F, & s = 1, 2, \cdots, 7 \end{cases} \tag{10.51}$$

And for 𝜇-law PCM

$$e_{q\,\max} = 2^{s}/F \tag{10.52}$$
Since samples of larger magnitude fall in higher segments s, we see that the maximum quantisation error increases as the sample value increases. This maintains the ratio of sample value to quantisation error and hence SQNR approximately constant over the entire input signal range.

All the above coding and decoding processes, including the anti-alias filtering and sample-and-hold functions discussed in Chapter 9, are usually performed by a single large-scale IC unit known as a codec. On the transmit side the codec takes in an analogue signal and generates a serial bit stream output, being the A-law or 𝜇-law PCM representation of the analogue signal. On the receive side it takes an incoming PCM serial bit stream and recovers the original analogue signal, subject of course to quantisation noise and any channel-induced errors in the received PCM code.

Worked Example 10.5

An analogue signal of values in the range −Vp to +Vp is to be digitised. Let Vp = 5 V and assume that the signal fully loads the quantiser. Determine

(a) The A-law PCM code for the sample value −4 V
(b) The voltage value to which the A-law code 1 101 1100 is converted at the receiver
(c) The quantisation error incurred in transmitting the sample value 3.3 V using 𝜇-law.

(a) In order to use Table 10.6, we scale the input signal from the range ±Vp to the range ±C, where C = 4096 (for A-law). This requires that every input sample x(n) is multiplied by a factor of

$$F = \begin{cases} 4096/V_p, & \text{A-law} \\ 8160/V_p, & \mu\text{-law} \end{cases} \tag{10.53}$$

In this case, with Vp = 5, F = 819.2. Ignoring for the moment the sign of the input sample x(n) = −4 V, since that only affects the sign bit in the PCM code, and following Figure 10.12

$$X(n) = F|x(n)| = 819.2 \times 4 = 3276.8$$

Table 10.6 shows that X(n) lies in the segment s = 7. The interval l in which X(n) lies within this segment is given by the number of steps Δ (= 128 in segment 7) required to reach a value just below or equal to X(n) starting from the lower limit (Xmin = 2048) of the segment

$$l = \left\lfloor \frac{X(n) - X_{\min}}{\Delta} \right\rfloor \tag{10.54}$$

where ⌊y⌋ is the floor operation introduced in Eq. (10.1). Using the above values in Eq. (10.54) gives l = 9. Thus, the (negative) sample lies in the interval s = 7; l = 9; and is therefore assigned the 8-bit PCM code (= Ssssllll)

0 111 1001

where the MSB = 0 for a negative sample. Note that the segment s must be expressed as a 3-bit binary number sss, and interval l as a 4-bit binary number llll, using leading zeros if necessary. This means, for example, that s = 0 would be expressed as sss = 000; and l = 2 as llll = 0010.

(b) The A-law code 1 101 1100 represents a positive sample X(n) in the interval l = 1100₂ = 12 and within segment s = 101₂ = 5, where the subscript 2 denotes a binary number equal to the indicated decimal equivalent. The receiver converts this sample to the mid-point Xq(n) of the interval

$$X_q(n) = X_{\min} + l\Delta + \Delta/2 \tag{10.55}$$

where Xmin is the lower limit and Δ the step size of segment s (= 5 in this case). From Table 10.6, Xmin = 512 and Δ = 32, giving

$$X_q(n) = 512 + 12 \times 32 + 32/2 = 912$$

De-scaling yields the received sample

$$x_q(n) = X_q(n)/F = 912/819.2 = 1.11\ \text{V}$$

(c) Proceeding in a similar manner as in (a) for this 𝜇-law case, we obtain the scaling factor F = 1632, and a scaled sample

$$X(n) = Fx(n) = (1632)(3.3) = 5385.6$$

We see from Table 10.6 that X(n) lies in segment s = 7, which has Xmin = 4064 and Δ = 256. Substituting in Eq. (10.54) yields the interval l = 5. The mid-point Xq(n) of this interval to which X(n) is quantised is given by Eq. (10.55)

$$X_q(n) = 4064 + 5 \times 256 + 256/2 = 5472$$

Thus the quantisation error incurred is, from Eq. (10.50)

$$e_q = \left|\frac{5472 - 5385.6}{1632}\right| = 52.9\ \text{mV}$$

You may prefer to use the following direct formulas that give the sample value xq(n) to which any received PCM code b7 b6 b5 b4 b3 b2 b1 b0 is decoded

$$s = 4b_6 + 2b_5 + b_4; \qquad l = 8b_3 + 4b_2 + 2b_1 + b_0$$

$$x_q(n) = \frac{(-1)^{b_7 + 1}}{F} \times \begin{cases} 2(l + 0.5), & \text{A-law, } s = 0 \\ 2^{s}(l + 16.5), & \text{A-law, } s = 1, 2, \cdots, 7 \\ 2^{s+1}(l + 16.5) - 32, & \mu\text{-law} \end{cases} \tag{10.56}$$

where F is the scaling factor given by Eq. (10.53). You should verify that Eq. (10.56) yields the expected result for the A-law PCM code 1 101 1100 of Worked Example 10.5(b).
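The coding and decoding steps of Worked Example 10.5 can be sketched in Python as follows. The segment data are transcribed from Table 10.6; the names `SEGMENTS`, `C_LIMIT`, `pcm_encode` and `pcm_decode` are hypothetical and introduced here purely for illustration. The decoder uses the table look-up form of Eq. (10.55) rather than the closed form of Eq. (10.56).

```python
# (step size, lower limit) of segments s = 0..7, transcribed from Table 10.6
SEGMENTS = {
    "A": [(2, 0), (2, 32), (4, 64), (8, 128), (16, 256), (32, 512), (64, 1024), (128, 2048)],
    "mu": [(2, 0), (4, 32), (8, 96), (16, 224), (32, 480), (64, 992), (128, 2016), (256, 4064)],
}
C_LIMIT = {"A": 4096, "mu": 8160}   # quantiser input limit C for each law


def pcm_encode(x, v_peak, law="A"):
    """Return the 8-bit PCM code (Ssssllll) of sample x for a signal of peak value v_peak."""
    F = C_LIMIT[law] / v_peak                       # scaling factor, Eq. (10.53)
    X = min(abs(x) * F, C_LIMIT[law])               # scaled magnitude, clipped at C
    table = SEGMENTS[law]
    # Highest segment whose lower limit does not exceed X
    s = max(i for i, (_, x_min) in enumerate(table) if x_min <= X)
    step, x_min = table[s]
    l = min(int((X - x_min) // step), 15)           # Eq. (10.54); clamp covers X = C
    sign = 1 if x >= 0 else 0                       # MSB: 1 for positive, 0 for negative
    return f"{sign}{s:03b}{l:04b}"


def pcm_decode(code, v_peak, law="A"):
    """Return the sample value to which an 8-bit PCM code is converted at the receiver."""
    bits = code.replace(" ", "")
    F = C_LIMIT[law] / v_peak
    sign = 1.0 if bits[0] == "1" else -1.0
    s = int(bits[1:4], 2)                           # segment number from b6 b5 b4
    l = int(bits[4:], 2)                            # interval number from b3 b2 b1 b0
    step, x_min = SEGMENTS[law][s]
    Xq = x_min + l * step + step / 2                # interval midpoint, Eq. (10.55)
    return sign * Xq / F


if __name__ == "__main__":
    print(pcm_encode(-4.0, 5.0, "A"))                           # part (a)
    print(round(pcm_decode("1 101 1100", 5.0, "A"), 2))         # part (b)
    xq = pcm_decode(pcm_encode(3.3, 5.0, "mu"), 5.0, "mu")
    print(round(abs(xq - 3.3) * 1000, 1), "mV")                 # part (c)
```

Run as a script, the three printed lines correspond to parts (a), (b) and (c) of the worked example.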
10.4.6 SQNR of Practical Nonlinear PCM

Let us consider the SQNR characteristic of A-law and 𝜇-law PCM as a function of the peak input signal level and compare this with that of a linear ADC of the same number of bits per sample (k = 8). This important exercise allows us to examine the extent to which A-law and 𝜇-law PCM achieve the ideal of a constant SQNR independent of the peak input signal level. We noted earlier that the SQNR of a linear ADC (or uniform quantisation) decreases with the peak value of the input signal. This follows from Eq. (10.13), the last term of which gives the peak value Vp of the input signal expressed in dB relative to the maximum level C of the quantiser. It is shown in Worked Example 10.2 that R = √3 for a uniform PDF input signal. Thus, the SQNR of an 8-bit linear ADC decreases exactly as the peak input signal level (dB) decreases, the maximum SQNR being 48.2 dB for an input signal whose samples are uniformly distributed in the range −C to C.

Usually, the input signal is scaled before quantisation in order that the overall peak value of the input signal just fully loads the quantiser. There will, however, be a large variation in the short-term peak value of the input signal. For example, the voice level in one conversation may vary between a whisper (with a short-term peak value much less than C) and a shout when the peak value equals C. The SQNR will be unacceptably low during the 'whisper' intervals if 8-bit linear ADC is employed. To overcome this problem we derived a nonlinear, more specifically logarithmic, PCM scheme that has a constant SQNR at all signal levels, given by Eq. (10.25). Implementing this ideal log-PCM scheme, with N = 2^k = 256 and K given by Eq. (10.28) for A-law and Eq. (10.35) for 𝜇-law gives a constant SQNR

$$\text{SQNR} = 10\log_{10}\left(\frac{3N^2}{K^2}\right) = \begin{cases} 38.2\ \text{dB}, & \text{for } K = 1 + \ln(A),\ A = 87.6 \\ 38.1\ \text{dB}, & \text{for } K = \ln(1 + \mu),\ \mu = 255 \end{cases}$$
However, for practical implementation it was necessary to modify the ideal log-PCM in two ways, resulting in the A-law and 𝜇-law PCM schemes. First of all, small input values were uniformly quantised using a small step size, which is equivalent to using a linear compression curve in the region where the input value x → 0, as discussed earlier. Secondly, the step size Δ does not decrease continuously with x as required by the ideal log-PCM. Rather, Δ only changes in discrete steps and is held constant within specified segments of the input signal, listed in Table 10.6. This corresponds to a piecewise linear approximation. How is SQNR affected by these modifications? Let the input signal samples be uniformly distributed in the range −Vp to Vp. Then, from Worked Example 10.2(b), the signal power is

$$\text{Signal Power} = \frac{V_p^2}{3} \tag{10.57}$$
Figure 10.13 shows the first four segments s = 0, 1, 2, 3 of the quantiser, with Vp falling in the top segment. Each segment is of size equal to 16 times its step size. The probability Pj that a sample of the input signal falls in any of the lower segments j = 0, 1, 2 is simply the ratio between the segment size and Vp, whereas the probability P3 that a sample will fall in the top segment of Figure 10.13 is the ratio between (Vp − X3min) and Vp, where X3min is the lower limit of segment 3. That is

$$P_j = \frac{16\Delta_j}{V_p},\quad j = 0, 1, 2; \qquad P_3 = \frac{V_p - X_{3\min}}{V_p}$$

Therefore, in the general case, if Vp lies in segment s, then it follows from Eq. (10.18) that the MSQE is given by

$$\begin{aligned} \text{MSQE} &= \sum_j P_j\,\frac{\Delta_j^2}{12} = \frac{1}{12}\sum_{j=0}^{s-1}\Delta_j^2\left(\frac{16\Delta_j}{V_p}\right) + \frac{\Delta_s^2}{12}\left(\frac{V_p - X_{s\min}}{V_p}\right) \\ &= \frac{4}{3V_p}\sum_{j=0}^{s-1}\Delta_j^3 + \frac{\Delta_s^2}{12}\left(\frac{V_p - X_{s\min}}{V_p}\right) \end{aligned}$$
Figure 10.13 First four segments of a non-uniform quantiser. An input signal with uniformly distributed samples in the range (−Vp to Vp) has its peak value in segment s = 3. Segments 0 to 3 have sizes 16Δ0, 16Δ1, 16Δ2 and 16Δ3, with upper limits X0max to X3max and lower limits X1min to X3min, mirrored for negative inputs.
The ratio between signal power in Eq. (10.57) and this MSQE, expressed in dB, gives the SQNR of the piecewise linear log-quantiser as

$$\text{SQNR}_{\log} = 30\log_{10}(V_p) - 10\log_{10}\left[\frac{\Delta_s^2}{4}(V_p - X_{s\min}) + 4\sum_{j=0}^{s-1}\Delta_j^3\right]\ \text{dB} \tag{10.58}$$
Equation (10.58) applies to both A-law and 𝜇-law PCM. It gives the SQNR of a signal of peak value Vp, which falls in segment s having lower limit Xsmin and step size Δs. Note that this result is not affected by normalisation. For example, we presented the A-law and 𝜇-law schemes (see Table 10.6) based on a quantiser range of ±4096 for A-law and ±8160 for 𝜇-law. The above expression for SQNR holds even when the quantiser range is arbitrarily scaled to a different limit (such as ±1), provided all step sizes are also scaled by the same factor to retain the shape of the compression curve. For comparison, the SQNR of a k-bit linear quantiser of input range ±C with a uniform-PDF input signal of peak value Vp follows from Eq. (10.13) as

$$\text{SQNR}_{\text{lin}} = 6.02k + 20\log_{10}(V_p/C)\ \text{dB} \tag{10.59}$$
As an example, let us apply Eq. (10.58) to obtain the SQNR of a signal that fully loads the quantiser (i.e. the signal has 0 dB peak value relative to quantiser input limit $C$). For A-law: $V_p = C = 4096$ and lies in segment $s = 7$ with step size $\Delta_s = 128$ and lower limit $X_{s\min} = 2048$. Equation (10.58) yields

$$\mathrm{SQNR}_{\text{A-law}} = 30\log_{10}(4096) - 10\log_{10}\left[\frac{128^2}{4}(4096 - 2048) + 4(2^3 + 2^3 + 4^3 + 8^3 + 16^3 + 32^3 + 64^3)\right] = 108.37 - 69.82 = 38.55\ \text{dB}$$

For μ-law: $V_p = C = 8160$ and lies in segment $s = 7$ with step size $\Delta_s = 256$ and lower limit $X_{s\min} = 4064$. Equation (10.58) yields

$$\mathrm{SQNR}_{\mu\text{-law}} = 30\log_{10}(8160) - 10\log_{10}\left[\frac{256^2}{4}(8160 - 4064) + 4(2^3 + 4^3 + 8^3 + 16^3 + 32^3 + 64^3 + 128^3)\right] = 117.35 - 78.85 = 38.50\ \text{dB}$$

For the linear quantiser with $V_p = C$ and $k = 8$, Eq. (10.59) yields $\mathrm{SQNR}_{\text{lin}} = 48.16$ dB.

As a second example, consider a small input signal of peak value 20 dB below quantiser limit. This means that $20\log_{10}(V_p/C) = -20$ dB, or $V_p = C/10 = 409.6$ for A-law (which places it in segment $s = 4$ where $\Delta_s = 16$ and $X_{s\min} = 256$); and $V_p = 816$ for μ-law (which places it in segment $s = 4$ where $\Delta_s = 32$ and $X_{s\min} = 480$). The SQNR of this small signal when quantised in each quantiser is therefore

$$\mathrm{SQNR}_{\text{A-law}} = 30\log_{10}(409.6) - 10\log_{10}\left[\frac{16^2}{4}(409.6 - 256) + 4(2^3 + 2^3 + 4^3 + 8^3)\right] = 37.51\ \text{dB}$$

$$\mathrm{SQNR}_{\mu\text{-law}} = 30\log_{10}(816) - 10\log_{10}\left[\frac{32^2}{4}(816 - 480) + 4(2^3 + 4^3 + 8^3 + 16^3)\right] = 37.15\ \text{dB}$$

$$\mathrm{SQNR}_{\text{lin}} = 48.16 + 20\log_{10}(V_p/C) = 48.16 - 20 = 28.16\ \text{dB}$$

We see that (as a result of the piecewise linear approximation) the SQNR of A-law and μ-law PCM is not quite constant, but in this case, it is around 1 dB lower for a small input signal whose peak value is 20 dB below quantiser input limit than for a large signal that fully loads the quantiser. The linear quantiser's SQNR has, however, dropped significantly by 20 dB for the small input signal. The results of Eqs. (10.58) and (10.59) are presented in Figure 10.14 for A-law and μ-law PCM schemes and for $k$-bit linear ADC at $k = 8, 12, 13$, from which we make the following important observations.
● A-law PCM maintains a near constant SQNR of about 38 dB over a wide range of peak input levels. It is interesting to observe that the SQNR of A-law PCM at a peak input level $V_p = -36$ dB (which corresponds to $V_p = X_{1\max} = 64$ in the ±4096 normalised range) is 36 dB, a drop of only 2 dB from 38 dB. On the other hand, the SQNR of an 8-bit linear ADC at this input level has fallen to an unacceptable value of 12 dB.
● The input level $X_{1\max}$ marks the beginning of the linear portion of the A-law compression curve. Equation (10.32) shows that the A-law curve is linear in the region $x \le 1/A$ or −38.9 dB, which corresponds to input values $X \le 46.76$ when the normalised range ±4096 is used, with $A = 87.6$. However, in the practical implementation of this curve using a piecewise linear approximation, the terminal linear portion starts slightly earlier at $X = 64$, which is −36.12 dB relative to 4096. The SQNR of (8-bit) A-law PCM at input levels below −36 dB is the same as that of a 12-bit linear ADC. This agrees with our earlier observation that small input values are finely quantised at the equivalent of 12 bits/sample. Inputs to the A-law quantiser at levels below −36 dB effectively see a linear ADC of (normalised) step size $\Delta = 2$, and therefore the SQNR decreases linearly with the peak input level in step with that of a 12-bit linear ADC.
Figure 10.14 SQNR versus peak input signal level for A-law, μ-law, and linear ADC. The input signal is assumed to have uniformly distributed samples. (Plot of SQNR in dB against peak input level $V_p$ in dB relative to quantiser limit $C$, with curves for A-law PCM, μ-law PCM, and linear ADC at $k = 8, 12, 13$; the companding gains $G_{c\,\text{A-law}} = 24$ dB and $G_{c\,\mu\text{-law}} = 30$ dB are marked.)
● In the case of μ-law PCM, small input levels below −48 dB have the same SQNR as a 13-bit linear ADC, which again confirms our earlier observation that small input samples are quantised in μ-law at the resolution of a 13-bit linear ADC.
● The companding gains of A-law and μ-law PCM schemes are indicated in Figure 10.14 as $G_{c\,\text{A-law}} = 24$ dB and $G_{c\,\mu\text{-law}} = 30$ dB. As discussed earlier, every 6 dB of companding gain is exchanged for a saving of 1 bit/sample. This is the reason why, at small input levels, μ-law PCM delivers the SQNR of a 13-bits/sample linear ADC using only 8 bits/sample, a saving of 5 bits/sample. Similarly, A-law PCM achieves the SQNR of a 12-bits/sample linear ADC using only 8 bits/sample, a saving of 4 bits/sample.
● Improvements in the SQNR of A-law and μ-law PCM at low input levels have been achieved at a price. The larger input levels are more coarsely quantised in log-PCM, compared to linear ADC. The effect of this can be seen in Figure 10.14, which shows that, for a large input signal that fully loads the quantiser, the SQNR of an 8-bit linear ADC is better by about 10 dB than that of log-PCM. This is the companding penalty earlier discussed. However, log-PCM gives a subjectively more satisfying result, maintaining a near-constant SQNR over a wide input range.
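The worked SQNR values above can be reproduced with a short script. The following is a minimal sketch of Eqs. (10.58) and (10.59), assuming the segment step sizes quoted in the examples and the stated rule that each segment spans 16 steps; the function and variable names are illustrative and do not come from the text.

```python
import math

# Step sizes of segments 0..7 in the normalised ranges used in the text
# (A-law range 4096, mu-law range 8160). Each segment spans 16 steps, so the
# segment limits follow by accumulation.
A_LAW_STEPS = [2, 2, 4, 8, 16, 32, 64, 128]
MU_LAW_STEPS = [2, 4, 8, 16, 32, 64, 128, 256]


def sqnr_log_db(vp, steps):
    """SQNR in dB from Eq. (10.58) for a uniform-PDF input of peak value vp."""
    x_min = 0.0      # lower limit of the segment being examined
    cube_sum = 0.0   # running sum of step**3 over the segments below it
    for step in steps:
        x_max = x_min + 16 * step
        if vp <= x_max:  # vp lies in this segment
            noise = step ** 2 * (vp - x_min) / 4 + 4 * cube_sum
            return 30 * math.log10(vp) - 10 * math.log10(noise)
        cube_sum += step ** 3
        x_min = x_max
    raise ValueError("vp exceeds the quantiser range")


def sqnr_lin_db(vp, c, k):
    """SQNR in dB from Eq. (10.59) for a k-bit linear quantiser of range c."""
    return 6.02 * k + 20 * math.log10(vp / c)


if __name__ == "__main__":
    for name, steps, c in (("A-law", A_LAW_STEPS, 4096), ("mu-law", MU_LAW_STEPS, 8160)):
        for level_db in (0, -20):
            vp = c * 10 ** (level_db / 20)
            print(f"{name:6s} {level_db:4d} dB: {sqnr_log_db(vp, steps):.2f} dB")
    print(f"linear   0 dB: {sqnr_lin_db(4096, 4096, 8):.2f} dB")
    print(f"linear -20 dB: {sqnr_lin_db(409.6, 4096, 8):.2f} dB")
```

Sweeping `vp` over a range of levels with the same two functions gives the kind of comparison plotted in Figure 10.14.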
10.5 Differential PCM (DPCM)

It was pointed out earlier from Eq. (10.12) that SQNR is inversely proportional to the square of the quantiser range. This means that an improvement in SQNR can be realised by reducing the range of sample values presented to the quantiser. Each 6 dB improvement in SQNR can be exchanged for a unit reduction in the number of bits per sample so that the same SQNR as in a PCM system is achieved using fewer bits/sample and hence a smaller transmission bandwidth. In fact, the underlying principle in all differential pulse code modulation (DPCM), low bit rate (LBR) speech coding, and data compression techniques is to avoid the transmission of redundant information. The original signal is processed in some way at the transmitter to obtain a signal of reduced information content that requires a proportionately lower transmission bandwidth. An acceptably close copy of the original signal is obtained at the receiver by processing the received signal to recover the information.

In DPCM the required quantiser range is reduced by encoding the difference $e(nT_s)$ between the actual signal sample $s(nT_s)$ and a predicted value $\hat{s}(nT_s)$ generated by a predictor circuit

$$e(nT_s) = s(nT_s) - \hat{s}(nT_s) \tag{10.60}$$

Figure 10.15 DPCM system: (a) transmitter; (b) receiver. As a special case, the predictor in each may be an integrator.
In a properly designed predictor, the error $e(nT_s)$ is small, allowing the number of quantiser levels $2^k$ and hence the number of bits/sample $k$ to be significantly reduced. The sampling rate $f_s$ may be chosen not just to satisfy the sampling theorem but also to be above the rate that would be used in ordinary PCM. This maintains a high correlation between adjacent samples and improves prediction accuracy, thereby keeping $e(nT_s)$ small. However, $k$ is reduced by a larger factor than $f_s$ is increased, so that the bit rate $R_b = kf_s$ is lower than that of a PCM system of the same SQNR.

Figure 10.15 shows a block diagram of a DPCM system. The lowpass filter (LPF) serves to minimise aliasing by limiting the bandwidth of the input signal before it is sampled at intervals $T_s$ in a sample-and-hold circuit. A summing device produces the difference or error $e(nT_s)$ between the sampled signal $s(nT_s)$ and the output of a predictor. It is this error signal that is quantised as $e_q(nT_s)$ and encoded to produce an output DPCM bit stream. If the analogue signal $s(t)$ changes too rapidly, the predictor will be unable to track the sequence of samples $s(nT_s)$ and the error signal $e(nT_s)$ will exceed the range expected at the quantiser input, resulting in clipping. This type of distortion is known as slope overload.

The variance of the error signal $e(nT_s)$ is much smaller than that of $s(nT_s)$. Thus, $e(nT_s)$ can be more accurately represented than $s(nT_s)$ using the same number of quantisation levels. This implies improved SQNR for the same bit rate. Alternatively, $e(nT_s)$ can be represented with the same accuracy as $s(nT_s)$ using fewer quantisation levels, which implies the same SQNR at a reduced bit rate. Note from the block diagram that the input of the predictor at the transmitter is the original sample plus a small quantisation error

$$\begin{aligned} s_q(nT_s) &= \hat{s}(nT_s) + e_q(nT_s) \\ &= \hat{s}(nT_s) + e(nT_s) + q(nT_s) \\ &= \hat{s}(nT_s) + s(nT_s) - \hat{s}(nT_s) + q(nT_s) \\ &= s(nT_s) + q(nT_s) \end{aligned} \tag{10.61}$$

where $q(nT_s)$ is the quantisation error associated with the $n$th sample. Identical predictors are used at both the transmitter and the receiver. At the receiver, the DPCM bit stream is passed through a decoder to recover the quantised error sequence $e_q(nT_s)$. This is added to the output of a local predictor to give a sequence of samples $s_q(nT_s)$, according to the first line of Eq. (10.61), which when passed through a lowpass reconstruction filter yields the original analogue signal $s(t)$, degraded only slightly by quantisation noise.

The predictor is a tapped-delay-line filter, as shown in Figure 10.16. Using the quantised samples $s_q(nT_s)$, rather than the unquantised samples $s(nT_s)$, as the predictor input is important to avoid the accumulation of quantisation errors. The predictor provides an estimate or prediction $\hat{s}(nT_s)$ of the $n$th sample from a linear combination of $p$ past values of $s_q(nT_s)$. It is therefore referred to as a linear prediction filter of order $p$

$$\hat{s}(nT_s) = a_1 s_q(nT_s - T_s) + a_2 s_q(nT_s - 2T_s) + \cdots = \sum_{j=1}^{p} a_j s_q(nT_s - jT_s) \tag{10.62}$$

Figure 10.16 Predictor realised using a tapped-delay-line filter.
Note that, taking $t = nT_s$ as the current sampling instant, $s(nT_s)$ denotes the current sample of signal $s(t)$, $s(nT_s - T_s)$ denotes the previous sample, at time $t = (n - 1)T_s$, and $s(nT_s - jT_s)$ denotes a past sample at time $t = (n - j)T_s$. The optimum values for the coefficients $a_j$ (also called tap gains) depend on the input signal. In the simplest case, all the coefficients are equal, i.e. $a_1 = a_2 = \ldots = a_p$, and the predicted value is a scaled sum or integration of the last $p$ samples. The DPCM can then be implemented using an integrator connected as indicated in the shaded block in Figure 10.15.

Except for delta modulators discussed later, DPCM systems are in general more complex than PCM since they require a prediction filter in addition to all the other components of a PCM system. Furthermore, they are subject to quantisation noise as in PCM, and to slope overload distortion, which is not present in PCM. However, DPCM offers an important advantage over PCM in that it requires a lower bit rate than a PCM system of comparable SQNR.

The ITU-T has adopted a 32 kb/s DPCM standard (identified as G.726) for voice telephony. This corresponds to using $k = 4$ bits/sample at a sampling rate of $f_s = 8$ kHz. The standard provides for operations at other bit rates, namely 16, 24, and 40 kb/s (corresponding, respectively, to $k = 2$, 3, and 5 bits/sample at the same sampling rate). This allows transmission to be adapted to available channel capacity. Another ITU-T standard (G.722) specifies the use of DPCM to transmit wideband audio (of bandwidth 7 kHz) at the same bit rate (64 kb/s) employed by standard PCM for standard telephony speech (of 3.4 kHz bandwidth). In this case, $k = 4$ and $f_s = 16$ kHz. Wideband audio gives a significant improvement in the fidelity of the received sound for applications such as audio conferences and loud speaking telephones.

There are two special cases of DPCM worth considering further, one known as delta modulation, in which $k = 1$, and the other known as adaptive DPCM, in which the step size is adjusted depending on the difference signal $e(nT_s)$, in order to minimise noise arising from slope overload distortion and quantisation error.
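The transmitter and receiver loops of Figure 10.15, together with the predictor of Eq. (10.62), can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a standardised codec: the quantiser is a fixed uniform mid-rise quantiser, the mapping of quantiser outputs to $k$-bit codewords is omitted, and the function names, the test tone, and the step size are all chosen purely for illustration.

```python
import math


def quantise(e, step, k):
    """Uniform mid-rise quantiser with 2**k levels (assumed quantiser type).

    Inputs beyond the quantiser range are clipped, which is how slope
    overload would show up in this sketch.
    """
    half = 2 ** k // 2
    idx = math.floor(e / step)            # index of the interval containing e
    idx = max(-half, min(half - 1, idx))  # clip to the available levels
    return (idx + 0.5) * step             # output level at the interval centre


def predict(history, taps):
    """Eq. (10.62): weighted sum of the p most recent reconstructed samples."""
    return sum(a * sq for a, sq in zip(taps, reversed(history)))


def dpcm_encode(samples, taps, step, k):
    """Return the quantised error sequence e_q and the transmitter's s_q."""
    history = [0.0] * len(taps)           # predictor memory, initially empty
    eq_seq = []
    for s in samples:
        s_hat = predict(history, taps)
        eq = quantise(s - s_hat, step, k)  # quantised version of Eq. (10.60)
        eq_seq.append(eq)
        history.append(s_hat + eq)         # s_q, first line of Eq. (10.61)
    return eq_seq, history[len(taps):]


def dpcm_decode(eq_seq, taps):
    """Receiver loop: identical predictor driven by the received e_q."""
    history = [0.0] * len(taps)
    for eq in eq_seq:
        history.append(predict(history, taps) + eq)
    return history[len(taps):]


if __name__ == "__main__":
    fs, fm, step, k = 8000, 200, 0.025, 4  # illustrative values only
    samples = [math.sin(2 * math.pi * fm * n / fs) for n in range(200)]
    eq_seq, sq_tx = dpcm_encode(samples, [1.0], step, k)
    sq_rx = dpcm_decode(eq_seq, [1.0])
    max_err = max(abs(s - sq) for s, sq in zip(samples, sq_rx))
    print("receiver tracks transmitter:", sq_rx == sq_tx)
    print("max reconstruction error:", max_err, "step/2:", step / 2)
```

Because both predictors are driven by the same quantised samples, the receiver output matches the transmitter's local reconstruction, and in the absence of clipping the reconstruction error is just the quantisation error of the current sample, as Eq. (10.61) indicates.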
10.5.1 Adaptive Differential Pulse Code Modulation (ADPCM)

In the ITU-T standards referred to above, adaptive quantisation and adaptive prediction are employed to reduce the required number of bits per sample from $k = 8$ for standard PCM to about $k = 4$. Such a system is referred to as ADPCM. The speech quality of 32 kb/s ADPCM is about the same as 64 kb/s PCM. Ordinary DPCM (with $k = 4$, $f_s = 8$ kHz) delivers a poorer quality because the fixed step size $\Delta$ would have to be large to avoid overload distortion, or with a small step size the prediction error $e(nT_s)$ would frequently exceed the quantiser range. It is for this reason that ordinary DPCM is not used in practice.

Adaptive quantisation uses a time-varying step size $\Delta$ in the quantiser. The value of $\Delta$ is changed in proportion to the variance of the prediction error, which is the input signal of the quantiser. In this way the ratio between signal power and quantisation noise power in the adaptive quantiser is maintained at an acceptable and approximately constant value.

Adaptive prediction employs samples of the quantiser output as well as past prediction errors to compute optimum values for the predictor coefficients $a_j$. This leads to a significant improvement in prediction accuracy and allows fewer bits to be used to code the small prediction errors involved. ITU-T standard G.721 specifies a transcoding algorithm in which the coder takes an A-law or μ-law PCM bit stream as input and yields an ADPCM output bit stream. At the receiver, a decoder accepts the ADPCM format and converts it back to PCM.
10.5.2 Delta Modulation

Figure 10.17 shows the block diagrams of the DM transmitter and receiver. By comparing Figure 10.17 with the DPCM block diagram in Figure 10.15, we see that DM is a special case of DPCM with several distinctive features. There are only two quantisation levels (i.e. $N = 2$) in the quantiser, labelled as binary 0 and 1 and separated by the step size $\Delta$, as shown in Figure 10.18. With $N = 2$, we know that $k = 1$ bit/sample. The difference signal $e(nT_s)$ is approximated to the nearer of the two levels, and at each sampling instant the encoder output is a single bit, which is a binary 1 if $e(nT_s)$ is positive and a binary 0 if $e(nT_s)$ is negative. Let us examine the quantisation error and important design parameters of the DM scheme.

10.5.2.1 Quantisation Error
To address this issue, we assume that there is no overload distortion, so that the difference signal $e$ lies within the range $\pm\Delta$. Let $P_1$ be the probability that the difference signal is positive and $P_2$ the probability that it is negative. Then

$$P_1 + P_2 = 1$$

Consider first the values of $e$ in the positive range 0 to $\Delta$. These are all quantised to output level $\Delta$ (represented as binary 1) with a quantisation error $e - \Delta$. The maximum quantisation error is $\Delta$, which is twice that of PCM. Following the same arguments leading to Eq. (10.18), we obtain the MSQE for positive differences

$$\mathrm{MSQE}_1 = \int_0^{\Delta} (e - \Delta)^2 P_1 \, de/\Delta$$
Figure 10.17 Delta modulation (DM) system: (a) transmitter; (b) receiver.

Figure 10.18 Quantisation levels in DM. Inputs $e(nT_s)$ between 0 and $+\Delta$ are assigned output binary 1, and inputs between $-\Delta$ and 0 are assigned output binary 0.
Replacing $e - \Delta$ by $\varepsilon$, then $de = d\varepsilon$ and the limits $e = (0, \Delta)$ become $\varepsilon = (-\Delta, 0)$, so that the integration simplifies to

$$\mathrm{MSQE}_1 = \frac{P_1}{\Delta} \int_{-\Delta}^{0} \varepsilon^2 \, d\varepsilon = \frac{P_1}{\Delta} \left( \left. \frac{\varepsilon^3}{3} \right|_{-\Delta}^{0} \right) = \frac{P_1 \Delta^2}{3}$$

Similarly, the MSQE for negative differences is $\mathrm{MSQE}_2 = P_2 \Delta^2/3$. The total quantisation noise power is therefore

$$\mathrm{MSQE} = \mathrm{MSQE}_1 + \mathrm{MSQE}_2 = \frac{P_1 \Delta^2}{3} + \frac{P_2 \Delta^2}{3} = \frac{\Delta^2}{3}(P_1 + P_2) = \frac{\Delta^2}{3} \tag{10.63}$$

The scheme illustrated in Figure 10.18 and analysed above performs quantisation by truncation. It is stated in Eq. (10.5) that the quantisation noise power of a truncating quantiser is four times that of a rounding quantiser, the latter being the approach used in the linear ADC and PCM systems earlier discussed. See, for example, Eq. (10.9).
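The result in Eq. (10.63), and the factor of four relative to a rounding quantiser, can be checked numerically. The sketch below is only a sanity check under the same assumption used in the derivation, namely an input uniformly distributed over one quantisation interval of width $\Delta$; the function name and the midpoint-rule integration are illustrative choices.

```python
def msqe_numeric(delta, truncating, n=100_000):
    """Midpoint-rule estimate of the mean square quantisation error for an
    input uniformly distributed over one interval (0, delta).

    truncating=True  : every input maps to the level delta (the DM case).
    truncating=False : every input maps to the interval centre (rounding).
    """
    out = delta if truncating else delta / 2
    total = 0.0
    for i in range(n):
        e = (i + 0.5) * delta / n   # midpoint of the i-th sub-interval
        total += (e - out) ** 2
    return total / n


if __name__ == "__main__":
    delta = 1.0
    dm = msqe_numeric(delta, truncating=True)
    pcm = msqe_numeric(delta, truncating=False)
    print("truncating:", dm, "expected", delta ** 2 / 3)
    print("rounding  :", pcm, "expected", delta ** 2 / 12)
    print("ratio     :", dm / pcm)
```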
Note, however, that the noise power in a DM system is distributed over a frequency range from 0 to $f_s$, which is much wider than the receiver bandwidth $B$. The effective noise power is therefore a fraction $B/f_s$ of the MSQE. Using the minimum allowed step size obtained below in Eq. (10.66) for a sinusoidal signal of peak value $V_m$ and frequency $f_m$, we obtain

$$\mathrm{SQNR} = \frac{\text{Signal power}}{\text{Effective noise power}} = \frac{V_m^2/2}{\dfrac{\Delta^2}{3} \cdot \dfrac{B}{f_s}} = \frac{3 f_s V_m^2}{2B} \times \left( \frac{f_s}{2\pi f_m V_m} \right)^2 = \frac{3 f_s^3}{8\pi^2 B f_m^2} \tag{10.64}$$
It can be seen that the SQNR of a DM system increases as the cube of $f_s$, so that doubling the sampling frequency yields a 9 dB improvement in SQNR.

10.5.2.2 Prediction Filter

A DM system uses a prediction filter of order $p = 1$, with tap gain $a_1 = 1$. It follows from Eqs. (10.62) and (10.61) that the predicted sample is simply the previous sample plus a small quantisation noise

$$\hat{s}(nT_s) = s_q(nT_s - T_s) = s(nT_s - T_s) + q(nT_s - T_s) \tag{10.65}$$

The predictor is therefore a delay line, with delay equal to the sampling interval $T_s$.

10.5.2.3 Design Parameters
A sampling rate $f_s$, much higher than the Nyquist rate, is employed in DM to reduce the difference $e(nT_s)$ between adjacent samples, thereby avoiding overload distortion and minimising quantisation noise. To see the factors that influence the choice of $f_s$, consider a sinusoidal input signal

$$s(t) = V_m \sin(2\pi f_m t)$$

The maximum change in $s(t)$ during one sampling interval $T_s$ is given by

$$\left| \frac{ds(t)}{dt} \right|_{\max} \times T_s = 2\pi f_m V_m T_s = \frac{2\pi f_m V_m}{f_s}$$

To avoid overload distortion, this change must not exceed the step size $\Delta$. Thus

$$\frac{2\pi f_m V_m}{f_s} \le \Delta \tag{10.66}$$

Equation (10.66) is an important result, which provides the interrelationship that must be satisfied by the parameters of a DM system, namely sampling frequency, step size, message signal frequency, and message signal amplitude. It is apparent that, irrespective of the values of the other parameters, Eq. (10.66) can be satisfied by making $f_s$ sufficiently large, the penalty being increased bit rate and hence transmission bandwidth.

10.5.2.4 Merits and Demerits of DM
DM has two important advantages over standard PCM.

● It can be realised using a very simple codec that does not require the type of quantiser and encoder found in PCM. It is therefore more reliable and more economical. The quantiser is a simple pulse generator (or comparator) that gives an output $+V$ volts (or binary 1) when its input $e(nT_s)$ is positive and $-V$ volts (or binary 0) when $e(nT_s)$ is negative. The predictor uses only a single tap gain and may be replaced by an integrator. A simple analogue RC LPF could be used. The output voltage of the integrator rises or falls by one step $\Delta$ in response to each pulse input. This enables the integrator to track the input voltage. The receiver then uses a similar integrator whose output rises by $\Delta$ when its input is a binary 1 and falls by the same amount when its input is binary 0. In this way both integrators produce the same staircase waveform $s_q(nT_s)$, and a lowpass reconstruction filter will easily smooth out the integrator output at the receiver to yield the original analogue signal. Figure 10.19 shows a block diagram of the simplified DM codec, and the associated waveforms including the generated DM bit stream for an arbitrary input signal $s(t)$.
● DM is more robust to transmission errors than PCM, and intelligibility can be maintained at bit error rates as high as one in a hundred (i.e. $10^{-2}$). Every transmitted bit has the same weight and the maximum error due to one bit being received in error is equal to a quantiser step size $\Delta$. In PCM, on the other hand, an error in the MSB can have a significant effect on the received signal, causing an error ranging in magnitude from $\Delta$ to $2^k\Delta$. This robustness to noise makes DM the preferred technique in certain military applications. The US military and NATO have operational DM systems at 16 and 32 kb/s.

Figure 10.19 Simplified DM system and waveforms: (a) transmitter; (b) receiver; (c) waveforms, showing the staircase approximation $s_q(nT_s)$ of step size $\Delta$, the DM bit stream, and regions of granular noise and overload distortion.

DM has, however, the following disadvantages:
● Oversampling with $f_s \gg 2f_m$ is required to allow the use of a small step size $\Delta$ while ensuring that the difference between adjacent samples rarely exceeds $\Delta$. With 1 bit/sample and $f_s$ samples per second, the bit rate of DM is $f_s$ bits/s, which may in some cases exceed that of standard PCM and therefore require a larger bandwidth.
● With a fixed step size, the maximum error per sample is constant at $\pm\Delta$ for all samples. However, while the quantisation noise power is fixed, the signal power varies across the range of input levels. Thus, SQNR varies with input level and may fall to unacceptably small values at low signal levels. The useful dynamic range of input signals over which the SQNR does not change significantly is therefore severely limited.
● When there is a roughly constant portion of the input signal $s(t)$, the staircase approximation $s_q(nT_s)$ hunts about this level, as shown in Figure 10.19c. This gives rise to a type of noise known as granular noise, which can only be minimised by making $\Delta$ as small as possible, subject to Eq. (10.66). However, a small $\Delta$ makes the system prone to overload distortion, as discussed below. Granular noise manifests itself in the DM bit stream as a string of alternating 0's and 1's.
● Figure 10.19c also shows an incidence of overload distortion in which the input waveform $s(t)$ changes too quickly to be tracked by $s_q(nT_s)$. To avoid overload distortion, Eq. (10.66) must be satisfied in the choice of step size and sampling frequency. This requires one of the following options:
  – Make $f_s$ sufficiently large, so that $f_s \ge 2\pi f_m V_m/\Delta$. This increases transmission bandwidth requirements.
  – Limit the message signal amplitude to $V_m \le \Delta f_s/2\pi f_m$, which limits the allowable dynamic range of the input signal. Note that the allowable peak value $V_m$ decreases with the frequency $f_m$ of the input signal.
  – Increase the step size to $\Delta \ge 2\pi f_m V_m/f_s$. This, however, increases granular noise.
  – Limit the input signal frequencies to $f_m \le \Delta f_s/2\pi V_m$. The problem here is that the range of $f_m$ depends on the type of information signal and cannot be significantly reduced without compromising fidelity.
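The interplay between Eq. (10.66), slope overload, and the SQNR of Eq. (10.64) can be explored with a small simulation of the DM loop in Figure 10.17. This is a minimal sketch under stated assumptions: the input is a sampled sinusoid, a zero difference is treated as positive (the text leaves that case open), and all function names and numerical values are illustrative.

```python
import math


def min_step(fm, vm, fs):
    """Smallest step size permitted by Eq. (10.66) for a sinusoid."""
    return 2 * math.pi * fm * vm / fs


def dm_sqnr_db(fs, fm, b):
    """Eq. (10.64) in dB: SQNR of DM operated at the minimum step size."""
    return 10 * math.log10(3 * fs ** 3 / (8 * math.pi ** 2 * b * fm ** 2))


def dm_encode(samples, step):
    """One bit per sample; the staircase s_q moves by +/-step each sample."""
    bits, staircase, sq = [], [], 0.0
    for s in samples:
        bit = 1 if s - sq >= 0 else 0   # e = 0 treated as positive (assumption)
        sq += step if bit else -step
        bits.append(bit)
        staircase.append(sq)
    return bits, staircase


def dm_decode(bits, step):
    """Receiver integrator: rebuilds the same staircase from the bit stream."""
    sq, out = 0.0, []
    for bit in bits:
        sq += step if bit else -step
        out.append(sq)
    return out


def max_tracking_error(fm, vm, fs, step, n=640):
    """Largest gap between a sampled sinusoid and its DM staircase."""
    samples = [vm * math.sin(2 * math.pi * fm * i / fs) for i in range(n)]
    _, staircase = dm_encode(samples, step)
    return max(abs(s - q) for s, q in zip(samples, staircase))


if __name__ == "__main__":
    fm, vm, fs, b = 1000.0, 1.0, 64000.0, 4000.0   # illustrative values only
    step = min_step(fm, vm, fs)
    print("minimum step size:", step)
    print("max error at minimum step:", max_tracking_error(fm, vm, fs, step))
    print("max error at a quarter of it:", max_tracking_error(fm, vm, fs, step / 4))
    gain = dm_sqnr_db(2 * fs, fm, b) - dm_sqnr_db(fs, fm, b)
    print("SQNR gain from doubling fs (dB):", gain)
```

With the step size set by Eq. (10.66) the staircase stays within one step of the input, whereas a step four times smaller cannot keep up with the steepest part of the sinusoid and the tracking error grows to many steps, which is the overload distortion of Figure 10.19c.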
10.5.2.5 Adaptive Delta Modulation (ADM)
We have seen the conflicting requirement of a large step size $\Delta$ to combat overload distortion, and a small step size to minimise granular noise. Adaptive delta modulation (ADM) provides for the control of both types of noise by varying the step size. This technique belongs in the general class of adaptive differential pulse code modulation (ADPCM). It is introduced here to show how both overload distortion and granular noise may be reduced.

In basic ADM, the step size is changed in discrete amounts. To start with, the step size $\Delta$ is set to a very small value $\delta$ to minimise granular noise. This (the smallest) step size is maintained for as long as the DM bit stream consists of a string of alternating 1's and 0's, or two successive 1's, or two successive 0's. After encountering three successive 1's or 0's the step size is increased to $\Delta = 2\delta$, and $\Delta$ is increased even further to $4\delta$ when four consecutive 1's or 0's occur. The larger step sizes allow the integrator output, which is the staircase approximation $s_q(nT_s)$, to catch up more quickly with the sequence of input samples $s(nT_s)$. ADM has also been implemented with the step size continuously varied in proportion to the input signal. This variant of ADM is referred to as continuously variable slope delta modulation (CVSDM).

10.5.2.6 Delta Sigma Modulation

It was noted earlier that the amplitude of the higher-frequency components in the input signal must be minimised to avoid overload distortion. This can be done by first passing the input signal through an LPF whose gain decreases in inverse proportion to frequency. Recall that such an LPF is actually an integrator. At the receiver, a highpass filter (HPF) having a gain response that increases linearly with frequency must be employed to remove the spectral distortion imposed at the transmitter. Note that this HPF performs the operation of differentiation. However, a DM receiver already contains an integrator. The resulting receiver will consist of an integrator followed by a differentiator, and the two components can therefore be eliminated altogether. This leaves a lowpass reconstruction filter as the only component required at the receiver. The scheme described above is shown in Figure 10.20 and is traditionally known as delta sigma modulation (DSM), but should more appropriately be called sigma delta modulation (SDM). SDM allows the use of a greatly simplified receiver, and the transmission of signals with allowable peak levels that are independent of frequency.
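Returning to the basic ADM rule of Section 10.5.2.5, the step-size logic is simple enough to sketch directly. The following is an illustration under two assumptions that the text does not settle: the step applied for a given bit is chosen from the run length that includes that bit, and the step size stays at $4\delta$ for runs longer than four. Function names are illustrative.

```python
def step_for_run(run, delta_min):
    """Step size for a run of `run` identical bits (basic ADM rule)."""
    if run >= 4:
        return 4 * delta_min   # assumed to stay at 4*delta for longer runs
    if run == 3:
        return 2 * delta_min
    return delta_min           # alternating bits or a run of two


def adm_encode(samples, delta_min):
    """Delta modulator whose step size follows the run-length rule."""
    bits, staircase = [], []
    sq, run, prev = 0.0, 0, None
    for s in samples:
        bit = 1 if s - sq >= 0 else 0      # zero difference treated as positive
        run = run + 1 if bit == prev else 1
        prev = bit
        step = step_for_run(run, delta_min)
        sq += step if bit else -step
        bits.append(bit)
        staircase.append(sq)
    return bits, staircase


def adm_decode(bits, delta_min):
    """Receiver: derives the same step sizes from the received bits alone."""
    out, sq, run, prev = [], 0.0, 0, None
    for bit in bits:
        run = run + 1 if bit == prev else 1
        prev = bit
        step = step_for_run(run, delta_min)
        sq += step if bit else -step
        out.append(sq)
    return out


if __name__ == "__main__":
    bits, staircase = adm_encode([1.0] * 6, 0.1)   # abrupt jump to a constant level
    print(bits)
    print(staircase)
    print(adm_decode(bits, 0.1) == staircase)
```

Since the step size depends only on the transmitted bits, the receiver can track the transmitter's step changes without any side information.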
Figure 10.20 Sigma delta modulation (SDM) system: (a) transmitter, consisting of an integrator followed by a delta modulator; (b) receiver, consisting of a reconstruction LPF.

10.6 Low Bit Rate Speech Coding

The purpose of LBR speech coding is to achieve a digital representation of speech signals with acceptable fidelity using as few bits as possible. This is essential in bandwidth- and/or power-limited voice communication services such as mobile communications and Internet telephony, and in voice storage services such as voicemail, store-and-forward messaging, and automatic voice response systems. Transmission bandwidth and storage capacity requirements increase proportionately with bit rate. For example, standard PCM with a bit rate of 64 kb/s requires a minimum transmission bandwidth of 32 kHz, and a storage capacity of 1.44 MB for a call of duration of three minutes. If we somehow manage to reduce the bit rate to, say, 6.4 kb/s, then the bandwidth and storage capacity requirements fall proportionately to 3.2 kHz (minimum) and 144 kB, respectively. Figure 10.21 shows the trade-offs involved in LBR speech coding. The main characteristics against which the performance of a speech codec is judged include:
● Quality: in general, the achievable speech quality falls as the bit rate is reduced. How much degradation there is depends on the type of algorithm used to reduce the bit rate. Obviously, if a bit rate of 32 kb/s were achieved simply by reducing the number of bits per sample from $k = 8$ to $k = 4$ in standard PCM, there would be a drop in SQNR of 24 dB and hence a significant degradation in quality. However, 32 kb/s ADPCM and even some 16 kb/s codecs provide speech quality that is practically indistinguishable from that of 64 kb/s PCM. Speech quality is usually specified using a subjective mean opinion score (MOS), which is obtained by averaging the judgement of many listeners expressed in the form of a score on a scale from 1 to 5. Table 10.7 provides a classification of the scores. Note, however, that the MOS recorded for a speech codec can vary significantly with tests and language. An MOS of < 3.0 corresponds to speech of synthetic quality, which may have high intelligibility (i.e. the speech is understandable) but sounds unnatural and lacks the attributes that allow recognition of the speaker and their emotion. Communication quality speech of an MOS of between 3.5 and 4.0 contains perceptible, but not annoying, distortion, while speech with perceptible and slightly annoying distortion could be classed as professional quality, with an MOS of between 3.0 and 3.5. MOS values in excess of 4.0 indicate high-quality and natural-sounding speech with practically imperceptible distortion. A-law and μ-law PCM produce high-quality speech with an MOS of about 4.2. MOS scores of 4.0 and above are regarded as toll quality. A vocoder (discussed below) produces synthetic speech with MOS of about 2.2.

Figure 10.21 Trade-offs in low bit rate speech coding between standard PCM (64 kb/s) and a vocoder (2.4 kb/s): bit rate, bandwidth, speech quality, transparency, computational complexity, sophistication, and processing delay.

Table 10.7 Speech quality using mean opinion score (MOS).

MOS   Quality description   Quality class
1     Bad, unacceptable     Synthetic
2     Poor                  Synthetic
3     Fair                  Professional (MOS 3.0 to 3.5); communication (MOS 3.5 to 4.0)
4     Good                  Near transparent, toll quality
5     Excellent             Near transparent, toll quality

● Transparency: significant reductions in bit rate are achieved by exploiting signal characteristics that are specific to speech. Such codecs will cause excessive distortion to nonvoice signals, such as voiceband data modem signals. It then becomes essential that nonvoice signals are recognised and diverted from the codec for processing using a different algorithm. In general, a codec's transparency (its ability to handle both speech and nonspeech signals) improves with bit rate.
● Computational complexity: the amount of computation required to maintain acceptable quality in LBR coded speech increases significantly as the bit rate is reduced. Sophisticated DSP hardware must be employed to minimise processing delay, and this increases both codec cost and operational power consumption. The latter is of concern especially in battery-operated portable units. However, due to advances in integrated circuit technology, DSP units capable of increasingly complex operations are now available in the market at very low costs.
● Robustness to transmission errors: as the original speech signal is represented using fewer and fewer bits, the role of each transmitted bit becomes more crucial, and therefore the impact of transmission errors is more drastic. For example, a single bit error in a vocoder could render a 20 ms speech segment unintelligible. By contrast, the impact of a single bit error in standard PCM is restricted to one sample and will be virtually imperceptible if it involves the least significant bit (LSB). Error protection of at least some of the more crucial bits is therefore required in LBR codecs. It is for this reason that the full-rate GSM (i.e. the Global System for Mobile Communications, the Pan-European digital cellular mobile system) speech codec operating at 13 kb/s devotes an additional 3 kb/s for error protection.
● Processing delay: most LBR codecs must accumulate a statistically significant block of samples (typically over 10–25 ms) in a storage buffer, from which certain parameters are extracted. To this buffering or frame delay is added the time it takes to complete the processing of the accumulated samples. The total processing delay may be up to three times the frame delay, depending on computational complexity. An isolated delay of this magnitude may contribute to an increase in sidetone but is otherwise imperceptible and perfectly acceptable to a user. However, if such a codec is connected to the public telephone network, or several of them are present in a long-distance link, then the total delay may lead to annoying echo, necessitating the use of echo control circuits (e.g. echo suppressors or cancellers).

In addition to the above characteristics, other important considerations include robustness to background noise and the effect of tandem operation with other codecs. The latter arises because, in many cases, a speech codec must operate on the output of coders located in other parts of the network. The distortion resulting from such multiple coding/decoding must not be excessive.

The priority assigned to each of these characteristics is determined by the application. For example, processing delay is not crucial in applications such as voice messaging and videoconferencing. In the latter, the accompanying video signal usually has a much larger processing delay than that of speech. Extra delay would normally be inserted in the speech processing to synchronise with the processed video. The three broad classes of coders, namely waveform coders, vocoders, and hybrid coders, are briefly introduced in the following sections.
10.6.1 Waveform Coders A waveform coder attempts an accurate digital representation of the speech waveform. If this is done directly in the time domain, the coder is described as a predictive coder. It is also possible to first divide the speech signal into contiguous frequency bands using bandpass filters, before coding each sub-band (SB) using a separate waveform coder. The coder is then said to be a frequency domain coder. The main advantage of such an SB coding approach is that the number of bits per sample assigned to each SB can be varied according to the information content of the band. Those SBs with negligible information content or energy can be skipped altogether. The result is that a speech quality comparable to that of predictive waveform coders can be achieved at a lower bit rate. Waveform coders are generally the least complicated of the three classes of coders. They introduce the lowest processing delays, and produce speech of high quality but require high bit rates ≥16 kb/s. The distortions at lower bit rates are excessive, and other techniques must be adopted in order to reduce the bit rate any further while maintaining acceptable quality. The most important examples of predictive waveform coders include DM, PCM, and ADPCM, which have been discussed earlier. The ITU-T G.726 standard, which effectively superseded the G.721 (32 kb/s ADPCM) standard, provides ADPCM speech coding at variable rates, namely 40, 32, 24, and 16 kb/s. This allows the flexibility of sacrificing speech quality in order to maintain transmission under heavy traffic conditions. The ITU-T G.727 provides for coding at the same variable bit rates as in G.726, but uses a technique known as embedded ADPCM. This involves 32, 16, 8, or 4 uniform quantisation levels, the levels for coarse quantisation being a subset of levels in finer quantisation. Using all 32 levels, requiring 5 bits/sample, gives the best speech quality at a bit rate of 40 kb/s. 
As traffic load increases, the network can adopt lower bit rates by simply discarding one, two, or three of the less significant bits. Discarding the LSB, for example, effects coarser quantisation using only 16 of the 32 possible levels, with the step size doubled. The ITU-T G.722 standard for coding wideband (7 kHz) speech at rates of 64, 56, and 48 kb/s is a good example of an SB waveform coder. The input speech signal, sampled at 16 kHz with a resolution of 14 bits/sample, is split into two 4 kHz bands. The lower band, which contains most of the information, is sampled at 8 kHz and coded using embedded ADPCM at 6, 5, or 4 bits/sample. This yields lower-band coding bit rates of 48, 40, or 32 kb/s. ADPCM at a resolution of 2 bits/sample is used for the upper band, which gives a bit rate of 16 kb/s. Thus, a total bit rate of 64, 56, or 48 kb/s is achieved.
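The bit rates quoted above follow from multiplying the number of bits per sample by the sampling rate. The short sketch below reproduces that arithmetic. It assumes an 8 kHz sampling rate for narrowband speech and for each G.722 sub-band, and it assumes that quantiser levels are indexed in natural binary order so that dropping the LSB merges adjacent levels. The function names are illustrative only and are not taken from any ITU-T recommendation.

```python
# Sketch only: bit-rate bookkeeping for the ADPCM schemes described above.
# The 8 kHz rate and the natural-binary level indexing are assumptions.

SAMPLE_RATE_HZ = 8000  # assumed sampling rate (narrowband speech, each G.722 sub-band)


def bit_rate(bits_per_sample, sample_rate_hz=SAMPLE_RATE_HZ):
    """Bit rate in b/s of a coder that emits a fixed number of bits per sample."""
    return bits_per_sample * sample_rate_hz


def drop_lsbs(codeword, n_dropped):
    """Embedded coding: discard the n least significant bits of a codeword."""
    return codeword >> n_dropped


# Embedded ADPCM at 5, 4, 3, or 2 bits/sample (32, 16, 8, or 4 levels).
embedded_rates = [bit_rate(b) for b in (5, 4, 3, 2)]

# G.722: lower band at 6, 5, or 4 bits/sample plus upper band at 2 bits/sample.
g722_rates = [bit_rate(b) + bit_rate(2) for b in (6, 5, 4)]

# Dropping the LSB maps each pair of adjacent 5-bit codes onto one 4-bit code,
# which is equivalent to doubling the step size.
coarse_codes = sorted({drop_lsbs(code, 1) for code in range(32)})

print(embedded_rates)
print(g722_rates)
print(len(coarse_codes))
```

Because the coarse codes are obtained by simple truncation, a network node can lower the bit rate without decoding and re-encoding the speech.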
10.6.2 Vocoders A vocoder, or vocal tract coder, does not code the speech waveform at all. Rather, it explicitly models the vocal tract as a filter in order to represent the speech production mechanism. At the receiver, the representation is animated
to reproduce a close version of the original speech signal. The main source of degradation in reproduced speech quality is not quantisation error but the use of a model that cannot accurately represent the speech production mechanism of the vocal tract. For example, each segment of speech (of duration, say, 20 ms) is classified as either voiced or unvoiced and reproduced at the receiver by exciting the filter with a pulse train or random noise, respectively. In practice, the speech segment will be partially voiced and unvoiced. This ‘black or white’ classification alone gives a vocoder a synthetic speech quality. Vocoders do give good intelligibility at very low bit rates (e.g. 2.4 and 4.15 kb/s) and are therefore widely used in applications that require intelligibility without a strong need for speaker identity/emotion recognition. However, they involve complex processing and introduce delays above about 20 ms. Three types of vocoders are briefly discussed below.
10.6.2.1 IMBE
The improved multiband excited (IMBE) vocoder employs filters to break the speech signal into narrow frequency bands, each spanning about three harmonics. The energy in each band is measured and a voicing decision is made that identifies each band as voiced or unvoiced. This information (namely the energy and voice flag of each band) is coded and transmitted. At the receiver, each filter is excited at its identified energy level using a sinusoidal oscillator for a voiced band and a noise source for an unvoiced band. The synthesised speech signal is obtained as the sum of the output of all the filters.
10.6.2.2 LPC
The linear predictive coding (LPC) vocoder was developed by the US Department of Defense (DoD) and adopted as Federal Standard (FS) 1015. Figure 10.22 shows a block diagram of the speech synthesis model. LPC-10 uses a predictor (i.e. filter) of order p = 10 for voiced speech and p = 4 for unvoiced speech. At the transmitter, each 22.5 ms segment of the speech signal is analysed to obtain the following parameters:
● A voiced/unvoiced flag, which is coded with 1 bit.
● A gain factor G to model the energy of the segment, which is quantised using 5 bits.
● A pitch period, corresponding to a pitch frequency in the range 51.3–400 Hz, which is quantised using 6 bits.
● Predictor coefficients a1 to a10. The first four coefficients, a1 to a4, are quantised using 5 bits each. If it is a voiced segment then a further 21 bits are used to code six more coefficients, with 4 bits used for each of a5 to a8, 3 bits used for a9, and 2 bits for a10. Higher coefficients are not used in the case of an unvoiced segment, and the 21 bits are devoted to error protection.
One bit, alternating between 1 and 0, is added to each segment for synchronisation at the receiver. Thus, we have 54 bits per 22.5 ms, or a bit rate of 2400 b/s. The above parameters are used at the receiver to synthesise the speech signal, as shown in Figure 10.22.
Figure 10.22 LPC speech synthesis model. [Block diagram: a voiced/unvoiced switch selects either a pulse generator (controlled by the pitch period) or a white noise generator; the selected excitation is scaled by the gain G and applied to the vocal tract filter (set by the filter coefficients) to give ŝ(n), which a DAC converts to the synthesised speech ŝ(t).]
10.6.2.3 MELP
A 2.4 kb/s vocoder was adopted in 1996 to replace FS 1015. It achieves professional speech quality by using mixed excitation linear prediction (MELP). The excitation consists of a mixture of periodic pulses and white noise, which varies across the frequency band according to an estimate of the voiced speech component within the bands.
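The LPC-10 frame structure and synthesis model of Section 10.6.2.2 can be illustrated with a minimal sketch. The first part adds up the bit allocation listed in the text. The second part excites an all-pole filter with a pulse train (voiced) or white noise (unvoiced). The difference equation, the sign convention of the coefficients, and all names below are assumptions made for illustration and are not taken from FS 1015.

```python
import random

# Bit allocation per 22.5 ms LPC-10 frame, as listed in the text.
FRAME_BITS = {
    "voiced/unvoiced flag": 1,
    "gain G": 5,
    "pitch": 6,
    "a1..a4 (5 bits each)": 4 * 5,
    "a5..a10 or error protection": 4 * 4 + 3 + 2,
    "synchronisation": 1,
}
FRAME_DURATION_S = 22.5e-3


def synthesise_frame(voiced, gain, pitch_period, coeffs, n_samples, seed=0):
    """Excite an all-pole filter with a pulse train or with white noise.

    Assumed difference equation: s(n) = G*e(n) + sum_k a_k * s(n - k).
    pitch_period is in samples; it is ignored for an unvoiced frame.
    """
    rng = random.Random(seed)
    if voiced:
        excitation = [1.0 if n % pitch_period == 0 else 0.0 for n in range(n_samples)]
    else:
        excitation = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    speech = []
    for n in range(n_samples):
        sample = gain * excitation[n]
        for k, a_k in enumerate(coeffs, start=1):
            if n - k >= 0:
                sample += a_k * speech[n - k]
        speech.append(sample)
    return speech


total_bits = sum(FRAME_BITS.values())
print(total_bits, round(total_bits / FRAME_DURATION_S))

# Toy first-order example (a real LPC-10 frame would use 10 or 4 coefficients).
print(synthesise_frame(True, 2.0, 4, [0.5], 6))
```

The toy call uses a single coefficient so that the output can be checked by hand; it is not representative of real speech.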
10.6.3 Hybrid Coders This class of LBR coders is so named because it combines the advantages of waveform coders and vocoders to achieve good reproduction of speech at LBRs. Hybrid coders use a technique called analysis-by-synthesis. At the transmitter various combinations of the model parameters are used to synthesise the speech as would be done at the receiver, and the combination yielding the best perceptual match to the original speech signal is selected. Hybrid coders are highly complex and introduce processing delays at least of the same order as vocoders. The distinction amongst various implementations of hybrid coders lies mainly in the excitation signal pattern, the analysis procedure, and the type of information transmitted to the receiver.
10.6.3.1 APC
Adaptive predictive coding (APC) is used in the toll-quality 16 kb/s Inmarsat-B standard codec. Here segments of the speech signal are processed to remove voice model information leaving a residual signal, a quantised version of which is scaled by various trial gains and used as the excitation signal to synthesise the speech segment. The scaled residual yielding the best perceptual match to the original speech segment is chosen and is encoded along with the voice model parameters and transmitted. At the receiver, a good copy of the original speech signal is reproduced using the scaled residual signal as the excitation signal of a filter constituted from the voice model parameters.
10.6.3.2 MPE-LPC
Multipulse excited linear predictive coding (MPE-LPC) employs perceptual weighting to obtain an improved modelling of the excitation. This signal is constituted as a sequence of pulses with amplitude, polarity, and location determined to minimise the total weighted squared error between synthesised and original speech signals. The 9.6 kb/s Skyphone codec, developed by BT (formerly British Telecom), is based on this algorithm. It was adopted by Inmarsat and the Airlines Electronic Engineering Committee (AEEC) for aeronautical satellite communications. If in MPE the (unequal amplitude) pulses are regularly spaced within the excitation vector, we have what is termed regular pulse excitation (RPE). Regular pulse spacing shortens the search procedure and hence reduces processing delay. RPE, combined with long-term prediction (LTP), is the basis of the 13 kb/s speech codec which was standardised for GSM.
10.6.3.3 CELP
Code excited linear prediction (CELP) allows the use of an MPE with lower bit rates. The excitation vector that gives the best perceptual match between synthesised and original speech signals is selected from a set or codebook of vectors. The excitation signal is usually a weighted sum of contributions from two codebooks, one codebook being adaptive and constantly updated with past excitation vectors and the other consisting of zero-mean unit-variance random sequences. The saving in bit rate comes about because fewer bits can be used to code the addresses of the
selected vectors, allowing the excitation signal to be reconstituted at the receiver from local copies of the codebooks. An example of a CELP-based codec is the ITU-T G.729 conjugate structure algebraic (CSA) CELP codec, which delivers toll-quality speech at only 8 kb/s.
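The codebook search at the heart of analysis-by-synthesis can be sketched in a few lines. The example below is a deliberately simplified illustration: it omits the synthesis filter and the perceptual weighting used by real CELP codecs, and simply picks the codebook vector and gain that minimise the squared error against a target vector. Only the index and gain would then need to be transmitted. All names are hypothetical.

```python
def best_codebook_entry(target, codebook):
    """Return (index, gain, squared_error) of the best-matching codebook vector.

    For each candidate vector c, the gain that minimises |target - g*c|^2
    is g = <target, c> / <c, c>. No synthesis filter or perceptual
    weighting is applied here (a simplifying assumption of this sketch).
    """
    best = None
    for index, vector in enumerate(codebook):
        energy = sum(c * c for c in vector)
        if energy == 0:
            continue  # an all-zero vector cannot be scaled to match anything
        gain = sum(t * c for t, c in zip(target, vector)) / energy
        error = sum((t - gain * c) ** 2 for t, c in zip(target, vector))
        if best is None or error < best[2]:
            best = (index, gain, error)
    return best


# Toy codebook shared by transmitter and receiver.
codebook = [
    [1, 1, 1, 1],
    [1, -1, 1, -1],
    [1, 0, 0, 0],
]
target = [2, -2, 2, -2]

index, gain, error = best_codebook_entry(target, codebook)
print(index, gain, error)

# The receiver rebuilds the excitation from its local copy of the codebook.
reconstructed = [gain * c for c in codebook[index]]
print(reconstructed)
```

With a codebook of 2^b vectors, only b bits are needed for the index, which is where the bit-rate saving described above comes from.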
10.7 Line Codes Line codes are introduced in Chapter 1 as a means of electrically representing a bit stream in a digital baseband communication system. In this section we want to familiarise ourselves with simple line codes and to develop an appreciation of their merits and demerits in view of the desirable characteristics of line codes. You may wish to refer to Chapter 1 for a discussion of these characteristics. A more detailed treatment of line codes, including their bit error ratios and spectral analysis, is provided in Chapter 7 of [1]. Figure 10.23 shows waveforms of the line codes discussed below, under various classes, for a representative bit stream.
10.7.1 NRZ Codes Non-return-to-zero-level (NRZ-L): there are two types of NRZ-L, namely unipolar and bipolar NRZ-L. In unipolar NRZ-L, binary 1 is represented by a positive voltage pulse +V lasting for the entire bit interval 𝜏, and a binary 0 is represented by no pulse (0 V) during the bit interval. For this reason, unipolar NRZ-L is also referred to as on–off signalling. Bipolar NRZ-L represents binary 1 with a positive voltage pulse +V, and binary 0 with a negative voltage pulse −V, each pulse lasting for the entire bit interval. Note that NRZ codes are so called because the code level remains constant within a bit interval and does not return to zero.
Figure 10.23 Common line codes. [Waveforms, for a representative bit stream, of unipolar NRZ-L (on–off signalling), bipolar NRZ-L, NRZ-M, NRZ-S, RZ, AMI, biphase-L (Manchester), biphase-M, biphase-S, CMI, HDB3 (with V- and B-pulses labelled), and 3B4B.]
By considering the desirable characteristics of a good line code, you can readily convince yourself that although NRZ-L has 100% efficiency it nevertheless exhibits numerous unsatisfactory features. It is usually seen as a basic data format and referred to as uncoded data. It is used only for transmission over a very short distance, such as within one piece of equipment. A separate clock signal must be used since the NRZ-L, NRZ-M, and NRZ-S codes have no guaranteed clock content. Another of the many undesirable features of this code is that a level inversion during transmission causes all symbols to be incorrectly received. A differential NRZ code overcomes this problem. Differential NRZ: bits are coded using voltage transitions, rather than actual voltage levels as in NRZ-L. There are two types of differential NRZ. In NRZ-mark (NRZ-M), there is a transition at the beginning of a bit interval if the bit is 1 (called mark in the days of telegraphy), and no transition if the bit is 0. The other type is NRZ-space (NRZ-S), which codes a binary 0, formerly called space, with a transition at the beginning of the bit interval, and a binary 1 with no transition. Therefore, denoting the current input bit as x(n), the current output of the coder as y(n), and the previous coder output as y(n − 1), we may summarise the coding rule of differential NRZ as follows:

Input x(n)    Output y(n)
              NRZ-M                    NRZ-S
0             $y(n-1)$                 $\overline{y(n-1)}$
1             $\overline{y(n-1)}$      $y(n-1)$

where the overbar denotes a complement operation (i.e. change of state), so that $\overline{V} = 0$ and $\overline{0} = V$. Note that x(n) is binary, having two possible values: 0 and 1, and the code y(n) is also binary, with two possible voltage levels: 0 and V. In Figure 10.23 it is assumed that the output is initially high, that is y = V before the first input bit. You may wish to verify that if y is initially at a low level (0 V) then the resulting code waveform for NRZ-M and NRZ-S is the complement of the corresponding waveforms shown in Figure 10.23. A major handicap with NRZ codes is that they have very poor and nonguaranteed timing content. A long run of the same bit in NRZ-L, 0’s in NRZ-M and 1’s in NRZ-S, gives rise to a transmitted waveform that is void of level transitions. It is then impossible to extract the clock signal from the received waveform at the receiver. A small improvement in timing content is provided by the return-to-zero (RZ) code, although at the cost of doubling the bandwidth requirement compared to NRZ.
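The NRZ coding rules described above lend themselves to a short sketch. In the code below, level 1 stands for V and level 0 for 0 V (or −1 for −V in the bipolar case), with one output level per bit interval. The function names and the default initial state y = V are illustrative assumptions, the latter matching the assumption stated for Figure 10.23.

```python
def nrz_l(bits, unipolar=True):
    """NRZ-L: binary 1 -> +V; binary 0 -> 0 V (unipolar) or -V (bipolar)."""
    low = 0 if unipolar else -1
    return [1 if bit else low for bit in bits]


def differential_nrz(bits, toggle_on, initial=1):
    """Differential NRZ: the output changes state whenever the input equals toggle_on.

    initial is the coder output y before the first bit (1 means V, 0 means 0 V).
    """
    y = initial
    out = []
    for x in bits:
        if x == toggle_on:
            y = 1 - y  # complement of the previous output
        out.append(y)
    return out


def nrz_m(bits, initial=1):
    """NRZ-M: transition at the start of the bit interval for a binary 1."""
    return differential_nrz(bits, 1, initial)


def nrz_s(bits, initial=1):
    """NRZ-S: transition at the start of the bit interval for a binary 0."""
    return differential_nrz(bits, 0, initial)


bits = [1, 0, 1, 1, 0]
print(nrz_l(bits, unipolar=False))
print(nrz_m(bits))
print(nrz_s(bits))

# Starting from the opposite initial level yields the complementary waveform.
print(nrz_m(bits, initial=0))
```

Because the information is carried by the presence or absence of a transition, complementing every level of an NRZ-M or NRZ-S waveform leaves the decoded bits unchanged, which is why differential NRZ is immune to level inversion.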
10.7.2 RZ Codes Return-to-zero (RZ): the RZ code represents binary 1 using a voltage pulse of amplitude +V during the first half of the bit interval followed by 0 V (no pulse) in the remaining half of the bit interval. Binary 0 is represented as no pulse (0 V) for the entire bit interval. Observe that a long run of 0’s in the RZ code will still cause timing problems at the receiver. Furthermore, the code exhibits another serious handicap in that it has a DC component (average value) that depends on the fraction of 1’s in the transmitted data. Most links for long-distance data transmission incorporate transformers and series capacitors, which will effectively block any DC component in a message signal. A long run of 1’s in RZ code introduces a DC offset that may result in droop and baseline wander due to charging and discharging capacitors. Another code, the alternate mark inversion (AMI) eliminates the DC offset problem, but at the expense of increased codec complexity. AMI: this code is obtained from RZ by reversing the polarity of alternate binary 1’s. As a result, three voltage levels (−V, 0, and +V) are employed, making this a three-level, or ternary, code. AMI eliminates droop and base-line wander while keeping the bandwidth requirement the same as in RZ. However, a train of data 0’s still results in a transmitted waveform without transitions, which causes timing problems
when the receiver tries to establish bit synchronisation. The problem of a lack of transitions in some bit sequences is overcome in biphase codes, which contain a transition in each symbol whether it represents a binary 1 or 0.
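The difference in DC content between RZ and AMI can be checked numerically. The sketch below represents each bit interval by two half-interval levels (in units of V) and computes the average level of the waveform. The choice of a positive first AMI pulse and the function names are assumptions made for illustration.

```python
def rz(bits):
    """RZ: binary 1 -> +V for the first half of the bit interval, then 0 V; binary 0 -> 0 V."""
    out = []
    for bit in bits:
        out += [1, 0] if bit else [0, 0]
    return out


def ami(bits, first_polarity=1):
    """AMI: as RZ, but the polarity of successive binary 1 pulses alternates."""
    out = []
    polarity = first_polarity
    for bit in bits:
        if bit:
            out += [polarity, 0]
            polarity = -polarity
        else:
            out += [0, 0]
    return out


def dc_level(levels):
    """Average value of the waveform in units of V."""
    return sum(levels) / len(levels)


bits = [1, 1, 0, 1, 1]
print(rz(bits), dc_level(rz(bits)))
print(ami(bits), dc_level(ami(bits)))
```

For this bit pattern the RZ waveform has an average of 0.4 V whereas the AMI waveform averages to zero, at the price of a third voltage level.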
10.7.3 Biphase Codes There are three types of biphase codes, namely biphase-L, which is also called Manchester code, biphase-M, and biphase-S. In biphase-L, a binary 1 is represented by a transition from high (+V) to low (−V) at the middle of the bit interval, whereas a binary 0 is represented by a transition from low (−V) to high (+V) at the middle of the bit interval. That is, to represent a binary 1, a positive pulse +V is transmitted for the first half of the bit interval followed by a negative pulse −V for the remaining half of the bit interval. A symbol of opposite polarity is used to represent a binary 0. Manchester code finds application in the Ethernet standard IEEE 802.3 for local area networks (LANs). It has good timing content but suffers the same polarity inversion problem as NRZ-L, a handicap eliminated in the other biphase code types. In biphase-M, there is always a transition at the beginning of each bit interval. Binary 1 is represented with an additional transition at the middle of the bit interval, whereas binary 0 has no additional transition. Biphase-S similarly always has a transition at the beginning of each bit interval but uses an additional transition at the middle of the bit interval for a binary 0 and no additional transition for a binary 1.
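The three biphase rules can be sketched as follows, again using two half-interval levels per bit with +1 for +V and −1 for −V. The level assumed to precede the first bit (`initial`) and the function names are illustrative assumptions. A simple biphase-M decoder is included to show that a polarity inversion does not affect the decoded bits.

```python
def biphase_l(bits):
    """Biphase-L (Manchester): 1 -> high then low; 0 -> low then high."""
    out = []
    for bit in bits:
        out += [1, -1] if bit else [-1, 1]
    return out


def biphase_m_or_s(bits, mid_transition_on, initial=-1):
    """Transition at the start of every bit; extra mid-bit transition for one bit value."""
    level = initial  # assumed level just before the first bit interval
    out = []
    for bit in bits:
        level = -level  # transition at the beginning of the bit interval
        out.append(level)
        if bit == mid_transition_on:
            level = -level  # additional transition at the middle
        out.append(level)
    return out


def biphase_m(bits, initial=-1):
    return biphase_m_or_s(bits, 1, initial)


def biphase_s(bits, initial=-1):
    return biphase_m_or_s(bits, 0, initial)


def decode_biphase_m(levels):
    """A mid-bit transition (first half differs from second half) means binary 1."""
    return [1 if levels[i] != levels[i + 1] else 0 for i in range(0, len(levels), 2)]


bits = [1, 0, 0]
print(biphase_l(bits))
print(biphase_m(bits))
print(biphase_s(bits))

# Inverting the polarity of a biphase-M waveform leaves the decoded bits unchanged.
inverted = [-level for level in biphase_m(bits)]
print(decode_biphase_m(inverted))
```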
10.7.4 RLL Codes Run-length-limited (RLL) codes limit the length of a run of voltage levels void of transition. One type of implementation is called bipolar with n zeros substituted (BnZS), e.g. B3ZS, B4ZS, B6ZS, and B8ZS. The case of n = 4 (i.e. B4ZS) is also called high-density bipolar with 3 zero maximum (HDB3). Coded mark inversion (CMI): this is the simplest type of RLL code. It is a binary code in which binary 1 is represented by full width alternating polarity pulses, and binary 0 by −V volts for the first half of the bit interval followed by +V volts for the remaining half. CMI is preferred to HDB3 in high bit rate systems because of its simplicity and efficiency. HDB3: this is an AMI code in which the number of successive 0’s transmitted is limited to three. The fourth of four adjacent 0’s is represented using a violation (V) pulse. That is, it is represented as a 1 that violates the alternating polarity rule. In this way, a level transition is introduced, and the receiver will recognise the violating pulse as representing a binary 0. A little thought will show, however, that violation pulses can cause a build-up of DC offset. To avoid this, whenever necessary, a balancing (B) pulse is used for the first of four adjacent zeros in order to prevent two successive violation pulses having the same polarity. To identify a B-pulse, note that whenever a V-pulse occurs after only two zeros, the previous pulse is a B-pulse, and therefore represents a binary 0. These pulses are labelled in Figure 10.23. Let us take a moment to understand HDB3 coding. Figure 10.24 shows the same bit stream as before, with the bit positions numbered for easy reference. The AMI waveform for this bit stream is repeated in Figure 10.24a. Note how binary 1’s are represented using pulses of alternating polarity, so that the average value of the waveform is always approximately zero. Thus, unlike RZ code, there is no DC build-up in the waveform as binary 1’s are coded. 
However, there is a string of 0’s from bit 4 to bit 8, and the AMI waveform is void of transition during the entire interval. To overcome this problem of poor timing content, HDB3 limits the maximum run of 0’s to three by changing the fourth zero in a four-zero-string to binary 1 and representing it with a V-pulse to enable the receiver to spot the substitution. The implementation of this idea is shown in Figure 10.24b. Note how positions 7, 14, and 20 are coded. These are V-pulses because each of them has the same polarity as the previous pulse, whereas the rule stipulates an alternating polarity. Now we have solved the problem of poor timing content and can guarantee that in a four-bit interval there will be at least one transition in the code waveform. However, there is a new problem.
Figure 10.24 Steps leading up to HDB3 code: (a) AMI; (b) AMI with fourth zero in a four-zero-string represented with a violation (V) pulse; (c) balancing (B) pulse introduced to complete HDB3 coding. [Waveforms for the bit stream 1 0 1 0 0 0 0 0 1 1 0 0 0 0 1 1 0 0 0 0 0, with bit positions numbered 1 to 21; V-pulses are labelled in (b) and (c), and B-pulses in (c).]
The code waveform contains three positive and six negative pulses, and so contains an undesirable DC component. Observe that this has arisen because successive V-pulses have the same polarity. To eliminate this build-up of DC offset in the waveform, we make the following modification that prevents successive V-pulses from having the same polarity: before inserting a new V-pulse we check to see if it would have the same polarity as the previous V-pulse. If so, we change the first zero in the four-zero-string to a 1; represent this 1 with a pulse that obeys the alternating polarity rule and call it the B-pulse, and then insert the new V-pulse to violate this B-pulse. Figure 10.24c shows the implementation of the above modification to obtain the desired HDB3 waveform. Let’s walk through this waveform. The first V-pulse at No. 7 does not have the same polarity as the previous V-pulse (in this case simply because there is no previous V-pulse), so no modification is necessary. Next, there is a four-zero-string from No. 11–14, and a V-pulse is therefore required at No. 14. This would have the same polarity as the previous V-pulse at No. 7. A modification is needed. Therefore, insert a B-pulse at No. 11 and then insert the V-pulse at No. 14 to violate this B-pulse. After this, there is a four-zero-string from No. 17–20, meaning that a V-pulse is needed at No. 20. We see that this V-pulse would have to be positive (in order to violate the pulse at No. 16), and therefore would have the same polarity as the previous V-pulse at No. 14. Therefore, insert a B-pulse as shown at No. 17 before inserting the V-pulse at No. 20. Note that at the receiver it is a straightforward matter for the decoder to correctly interpret the code waveform as follows:
● Every 0 V (for the entire bit interval) represents binary 0.
● Every V-pulse represents binary 0.
● Every pulse that is followed by a V-pulse after only two bit intervals is a B-pulse and hence represents binary 0.
● Every other pulse represents binary 1.
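The HDB3 coding and decoding rules walked through above can be sketched as follows. Each bit interval is represented by a single symbol: +1, −1, or 0. The sketch assumes that the first binary 1 is sent as a positive pulse and that there is no V-pulse before the start of the bit stream, which matches the walk-through. The function names and the single-symbol representation are illustrative choices.

```python
def hdb3_encode(bits, last_pulse=-1, last_violation=None):
    """HDB3 encoder following the V-pulse and B-pulse rules described in the text."""
    out = []
    zero_run = 0
    for bit in bits:
        if bit:
            last_pulse = -last_pulse  # normal alternating (AMI) pulse
            out.append(last_pulse)
            zero_run = 0
            continue
        out.append(0)
        zero_run += 1
        if zero_run == 4:
            violation = last_pulse  # same polarity as the previous pulse
            if violation == last_violation:
                # Successive V-pulses would match: insert a B-pulse for the
                # first zero, then make the V-pulse violate that B-pulse.
                balancing = -last_pulse
                out[-4] = balancing
                violation = balancing
            out[-1] = violation
            last_pulse = last_violation = violation
            zero_run = 0
    return out


def hdb3_decode(symbols, last_pulse=-1):
    """HDB3 decoder: V-pulses and B-pulses are mapped back to binary 0."""
    bits = []
    for i, symbol in enumerate(symbols):
        if symbol == 0:
            bits.append(0)
        elif symbol == last_pulse:
            bits.append(0)  # V-pulse: same polarity as the previous pulse
            if i >= 3:
                bits[i - 3] = 0  # a pulse three intervals earlier was a B-pulse
        else:
            bits.append(1)
            last_pulse = symbol
    return bits


# Bit stream of Figure 10.24 (bit positions 1 to 21).
bits = [1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0]
code = hdb3_encode(bits)
print(code)
print(hdb3_decode(code) == bits)
```

For this bit stream the sketch places V-pulses at positions 7, 14, and 20 and B-pulses at positions 11 and 17, in agreement with the walk-through of Figure 10.24c.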
10.7.5 Block Codes We have so far discussed line codes that work on 1 bit at a time to select the code symbols. In block codes, on the other hand, blocks of input bits are coded into code symbols according to rules defined in a coding table. As shown in Figure 10.25, a ternary block code uses m ternary symbols to represent n bits, and is designated nBmT. A binary block code uses m binary symbols to represent n bits, and is designated nBmB. Usually, not all the possible symbol combinations or codewords are used in the block code output. This allows the flexibility of choosing those codewords with the desired characteristics. For example, balanced codewords containing equal amounts of positive (+) and negative (−) pulses, and hence no DC offset, are preferred to unbalanced codewords. Furthermore, codewords with more frequent transitions, such as − + − + − +, are preferred to those with fewer transitions, such as − − − + + +.
Figure 10.25 Block coding. [A binary block code maps n bits to m binary symbols; a ternary block code maps n bits to m ternary symbols.]
From the definition of code efficiency in Chapter 1, it follows that the efficiencies of nBmT and nBmB codes are given, respectively, by
$$\eta_{nBmT} = \frac{n}{m\log_2(3)} \times 100\%$$
$$\eta_{nBmB} = \frac{n}{m} \times 100\% \qquad (10.67)$$
Equation (10.67) is obtained by noting that, in the nBmT code, n bits of information are carried using m ternary (i.e. three-level) symbols, each of which has a potential information content of log2 (3) bits. Similarly, in the nBmB code, m binary symbols (which have a potential information content equal to m bits) are employed to carry only n bits of information, yielding the efficiency stated above. For a given m, efficiency is maximised in a binary block code by choosing n = m − 1. With this relation, the coding efficiency increases with m, but at the expense of increased codec complexity, increased packetisation delay (to assemble up to m − 1 bits before coding) and an increased likelihood of a long run of like pulses. Table 10.8 shows the coding table of three simple block codes. The last column gives the disparity, which is the digital sum of each codeword, a digital sum being a count of the imbalance between negative and positive pulses in a sequence of symbols. Observe that CMI and Manchester codes can be treated as a 1B2B block code, which takes one bit at a time and represents it using two binary symbols.
For example, the coding table for CMI shows that binary 0 is represented by a negative pulse followed by a positive pulse. This is denoted − + and was earlier presented as two half-width pulses. Binary 1 is represented using either two negative or two positive pulses, denoted − − and + +, respectively, and introduced earlier as a single full-width pulse. The waveform of a 3B4B code obtained using the coding table in Table 10.8 is shown at the bottom of Figure 10.23. To see how the 3B4B code waveform was obtained, note that the input bit stream is taken in blocks of three bits. See the demarcation of the bit stream listed below the 3B4B waveform. The encoder maintains a running digital sum (RDS), which is the cumulative sum of the disparity of each transmitted codeword. To code an input block that has two codeword options, the negative-disparity option is selected if RDS is positive or zero, and the positive-disparity option is chosen if RDS is negative. Let us assume that initially RDS = 0. Then, with the bit stream as in Figure 10.23, the first input block is 101, which is represented with the codeword + − + −, according to Table 10.8. The 3B4B waveform during this interval corresponds to this code. The RDS stays at zero. The next input block is 000, which (since RDS = 0) is represented using its negative-disparity code option − − + −, according to Table 10.8. RDS is updated to −2, by adding the disparity of the new codeword. Again, the portion of the 3B4B waveform in Figure 10.23 during this interval corresponds to the new codeword. The coding continues in this manner with the RDS being updated after each codeword until the last input block 000, which is represented with its positive-disparity code option + + − + to raise the RDS from −2 to 0.

Table 10.8 Coding table for 3B4B and 1B2B codes.
Type                   Input   Output codeword                        Disparity
CMI (= 1B2B)           0       −+                                     0
                       1       −− (negative) or ++ (positive)         ±2
Manchester (= 1B2B)    0       −+                                     0
                       1       +−                                     0
3B4B                   001     −−++                                   0
                       010     −+−+                                   0
                       011     −++−                                   0
                       100     +−−+                                   0
                       101     +−+−                                   0
                       110     ++−−                                   0
                       000     −−+− (negative) or ++−+ (positive)     ±2
                       111     −+−− (negative) or +−++ (positive)     ±2

Block codes have several advantageous features.
● Good timing content: codewords are selected that have sufficient transitions.
● No baseline wander: DC offset is eliminated by using balanced codewords whenever possible. When an input block must be represented using unbalanced codewords then two options that balance out each other are provided for the block.
● Greater efficiency: this means that a lower signalling rate can be used to provide the desired bit rate, which allows a greater spacing between repeaters without excessive degradation of transmitted symbols.
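The running digital sum rule described earlier for 3B4B coding can be sketched as follows. The coding table is copied from Table 10.8, with ASCII `+` and `-` standing for positive and negative pulses. The function names and the use of strings for bits and codewords are illustrative choices, and the initial RDS of zero is the same assumption made in the text.

```python
# 3B4B coding table from Table 10.8. Unbalanced entries list the
# negative-disparity option first and the positive-disparity option second.
TABLE_3B4B = {
    "001": ("--++",), "010": ("-+-+",), "011": ("-++-",),
    "100": ("+--+",), "101": ("+-+-",), "110": ("++--",),
    "000": ("--+-", "++-+"),
    "111": ("-+--", "+-++"),
}


def disparity(codeword):
    """Digital sum of a codeword: positive pulses minus negative pulses."""
    return codeword.count("+") - codeword.count("-")


def encode_3b4b(bits, rds=0):
    """Encode a bit string in blocks of three; return (codewords, final RDS)."""
    if len(bits) % 3:
        raise ValueError("bit stream length must be a multiple of 3")
    codewords = []
    for i in range(0, len(bits), 3):
        options = TABLE_3B4B[bits[i:i + 3]]
        if len(options) == 1:
            codeword = options[0]
        else:
            # Negative option if RDS is positive or zero, positive option otherwise.
            codeword = options[0] if rds >= 0 else options[1]
        rds += disparity(codeword)
        codewords.append(codeword)
    return codewords, rds


# Bit stream of Figure 10.23, taken as blocks 101 000 001 100 001 100 000.
codewords, final_rds = encode_3b4b("101000001100001100000")
print(codewords)
print(final_rds)
```

The second codeword lowers the RDS to −2, where it remains until the final block 000 selects its positive-disparity option and restores the RDS to zero.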
However, block codes have the demerit of increased codec complexity, which translates to higher costs. They are typically used on long-distance transmission links where savings in number of repeaters justify the higher complexity.
Worked Example 10.6 Determine the code efficiency, signalling rate, and symbol period for the following baseband transmission systems: (a) 139 264 kb/s transmitted using CMI; (b) 139 264 kb/s transmitted using 6B4T.
(a) CMI is a 1B2B code. Its efficiency therefore follows from Eq. (10.67):
$$\eta_{\mathrm{CMI}} = \frac{1}{2} \times 100\% = 50\%$$
Signalling rate Rs is the number of symbols transmitted each second. In this case (1B2B), two binary symbols are used for each bit, and since 139 264 000 bits are transmitted each second, it follows that
$$R_s = 2 \times 139\,264\,000 = 278\,528\,000\ \text{baud} = 278.53\ \text{MBd}$$
We have used the word baud for symbols/second, as is common practice. The symbol period Ts is the reciprocal of Rs. Thus
$$T_s = \frac{1}{R_s} = \frac{1}{278\,528\,000} = 3.59\ \text{ns}$$
(b) 6B4T is a ternary code with n = 6, m = 4. Eq. (10.67) gives
$$\eta_{6B4T} = \frac{6}{4\log_2(3)} \times 100\% = 94.64\%$$
In this case, four ternary symbols are used to carry 6 bits. With 139 264 000 bits transmitted each second, the number of ternary symbols transmitted per second (i.e. the signalling rate) is
$$R_s = \frac{139\,264\,000}{6} \times 4 = 92.84\ \text{MBd}$$
The symbol period Ts = 1/Rs = 10.77 ns. Observe that the more efficient ternary system uses a lower signalling rate to accommodate the same information transmission rate as the less efficient CMI. However, CMI has the advantage of simplicity. The decision on which code is used will be dictated by the priorities of the application.
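The calculations of Worked Example 10.6 can be reproduced with a few lines of code. The sketch below applies Eq. (10.67) in a single function by taking the number of symbol levels (2 for binary, 3 for ternary) as a parameter. The function names are illustrative.

```python
import math


def code_efficiency(n, m, levels=2):
    """Efficiency (%) of a block code carrying n bits in m symbols with the given number of levels."""
    return 100.0 * n / (m * math.log2(levels))


def signalling_rate(bit_rate, n, m):
    """Symbols per second (baud) needed to carry bit_rate b/s with an nBm code."""
    return bit_rate * m / n


BIT_RATE = 139_264_000  # b/s, as in Worked Example 10.6

for name, n, m, levels in (("CMI (1B2B)", 1, 2, 2), ("6B4T", 6, 4, 3)):
    rs = signalling_rate(BIT_RATE, n, m)
    ts_ns = 1e9 / rs
    print(f"{name}: efficiency = {code_efficiency(n, m, levels):.2f}%, "
          f"Rs = {rs / 1e6:.2f} MBd, Ts = {ts_ns:.2f} ns")
```

The same two functions can be used to compare other candidate codes, for example a 3B4B code with n = 3, m = 4, and two levels.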
10.8 Summary This now completes our study of digital baseband coding. We have acquired a thorough grounding in the principles of uniform and nonuniform quantisation and PCM encoding of analogue signals for digital transmission. The continuous-value sampled signal at the quantiser input may be converted to discrete values at the output by rounding or by truncation. Both approaches inevitably introduce approximation errors which are perceived as noise in the signal. The quantisation noise power when truncation is used is 6 dB higher than that of rounding. The input range ±C of the quantiser may be partitioned into quantisation intervals using either a midrise or mid-step approach. Mid-step partitioning creates a dead zone that enables all small fluctuations about zero at the quantiser input (e.g. due to noise in the absence of a signal) to be mapped to zero at the output. In midrise partitioning, zero is an interval boundary so that there are N/2 quantisation intervals for the positive input range and N/2 intervals for the negative, giving a total of $N = 2^k$ intervals (where k is the number of bits per sample). Mid-step, on the other hand, has N − 1 quantisation intervals, comprising N/2 − 1 intervals for the positive input range, N/2 − 1 intervals for the negative input range, and one interval (= 0 V output) for the dead zone. Quantisation intervals may be equally spaced across the entire quantiser input range leading to uniform quantisation. The approximation errors in each interval depend exclusively on the size (called step size) of the interval. If the input signal is small then these errors may become comparable to input sample values leading to a low ratio between signal power and quantisation noise power and hence an unacceptably poor signal quality at the quantiser output. One solution to this problem is to use finer quantisation involving sufficiently small step sizes,
but this requires large k, which leads to high bit rate and ultimately high storage and transmission bandwidth requirements. A better solution is to use nonuniform quantisation in which small inputs are finely quantised but large inputs are coarsely quantised. With the goal of achieving a constant SQNR across the entire input range, we found that the optimum quantiser transfer characteristic is a logarithmic function which delivers step sizes that increase exponentially away from the origin. Constraints of practical implementation necessitated a piecewise linear approximation of this logarithmic function, leading to two global standards, namely A-law and 𝜇-law PCM. We discussed both standards in detail, introduced the concepts of companding gain and penalty, learnt how to carry out PCM coding and decoding, and examined the extent of variation of SQNR across the quantiser input range in each standard. In seeking to achieve an acceptable SQNR when converting an analogue signal to digital, another approach is to quantise and encode the error between the input sample and a local prediction of the sample rather than coding the actual sample. Prediction errors can be made very small by reducing the time between adjacent samples (i.e. increasing the sampling rate) or by using a more sophisticated and adaptive prediction filter. This means that the quantiser range can be made very small, which reduces the number of intervals required for fine quantisation and therefore reduces the bit rate of the resulting digital signal. This is a class of differential quantisation and includes DPCM and DM as special cases which we briefly discussed. If the analogue signal is speech then further reduction in bit rate (from 64 kb/s for a standard PCM codec through various rates down to less than 1 kb/s for a MELP vocoder) can be achieved using a wide range of LBR speech coding techniques, which we briefly reviewed. 
In general, LBR codecs achieve lower bit rates through lossy compression that sacrifices signal quality and increases processing delay. The final section of the chapter was devoted to line coding to explore how the above digitised signals, and indeed all types of digital data, are electrically represented and transmitted as voltage waveforms in digital baseband systems. We reviewed a range of basic line codes and then devoted some time to learning, using HDB3 and block codes, how line codes are designed to satisfy the important requirements of a guaranteed clock content and zero DC content. The importance of a high line code efficiency in reducing transmission symbol rate was also explored through a worked example. Baseband transmission is limited to fixed wire-line channels. To exploit the huge bandwidth of optical fibre in addition to the broadcast and mobility capabilities of radio channels (albeit with smaller bandwidths), we must modulate a suitable carrier with the digital data before transmission. The next chapter therefore deals with digital modulation, which allows us to apply the principles learnt in Chapters 7 and 8 to digital signals, and to quantify the impact of additive white Gaussian noise on such digital modulated systems.
Reference 1 Otung, I. (2014). Digital Communications: Principles & Systems. London: Institution of Engineering and Technology (IET). ISBN: 978-1849196116.
Questions
1 The output of a midrise rounding uniform quantiser of input range ±5 V is coded using 8 bits/sample. Determine:
(a) The maximum quantisation error.
(b) The quantisation error associated with an input sample of value −2.3 V.
(c) The quantisation noise power.
(d) The dynamic range of the system.
(e) The SQNR when the input is a speech signal that fully loads the quantiser.
(f) The SQNR during a weak passage when the peak value of the input speech signal is only 0.5 V.
2 Determine the SQNR of a linear ADC as a function of number of bits per sample for an input signal that has a Gaussian PDF with zero mean. Ignore sample values that occur on average less than 0.1% of the time. How does your result compare with Eq. (10.17), which gives the SQNR for a speech input signal?
3 Determine the values of A and 𝜇 that yield a companding gain of 48 dB in A-law and 𝜇-law PCM, respectively. Why aren’t such (or even larger) values of A and 𝜇 used in practical PCM systems to realise more companding gain and hence greater savings in bandwidth?
4 Produce a diagram like Figure 10.11 for a digital translation view of 𝜇-law PCM.
5 An analogue input signal of values in the range ±2 V fully loads the quantiser of a 𝜇-law PCM system. Determine:
(a) The quantisation error incurred in transmitting the sample value −1.13 V.
(b) The PCM code for the sample value −1.9 V.
(c) The voltage value to which the code 10011110 is converted at the receiver.
(d) The maximum quantisation error in the recovered sample in (c).
(e) The minimum and maximum quantisation errors of the whole process.
6 Determine the maximum SQNR in the following nonlinear PCM systems, and comment on the trend of your results.
(a) A-law with A = 1 and k = 8 bits/sample.
(b) A-law with A = 100 and k = 8 bits/sample.
(c) A-law with A = 1000 and k = 8 bits/sample.
(d) A-law with A = 100 and k = 6 bits/sample.
(e) A-law with A = 100 and k = 10 bits/sample.
7 The message signal $v_m(t) = 5\cos(2000\pi t)$ is coded using DM and a sampling frequency of 10 kHz. Determine:
(a) The minimum step size to avoid overload distortion.
(b) The quantisation noise power when the minimum step size is used.
(c) The SQNR. How does this compare with the maximum SQNR realisable using a linear ADC of the same bit rate?
8 Sketch the HDB3 and 3B4B waveforms for the following bit stream: 111011000100000110000111
9 Determine the code efficiency, signalling rate, and symbol period of a baseband transmission operating at 140 Mb/s and employing the 4B3T line code.
11 Digital Modulated Transmission
Is your goal big enough to excite you, rewarding enough to motivate you, challenging enough to dare you, and precise enough to direct you?
In this Chapter
✓ Why digital modulated transmission?
✓ Two views of digital modulated transmission: (i) frequency, phase, or amplitude modulation of a sinusoidal carrier with a digital signal; (ii) coding of binary data using sinusoidal (i.e. bandpass) pulses.
✓ Signal space: a simple yet powerful tool for representing and analysing both baseband and modulated digital systems.
✓ Noise effects and bit error ratio. You will be able to evaluate the impact of noise on all types of binary modulated and baseband systems. You will also have the foundation needed later to extend the analysis to multilevel digital systems.
✓ Binary modulation and coherent detection: how ASK, PSK, and FSK signals are generated at a transmitter and detected at a receiver. You will also learn the effect of frequency spacing on the bandwidth and bit error ratio of FSK.
✓ Noncoherent detection: this avoids the complexity of phase synchronisation in coherent detectors but suffers from an inferior noise performance.
✓ M-ary transmission: a detailed presentation of the generation, detection, and bit error ratios of multilevel ASK, PSK, FSK, and hybrid systems.
✓ Design considerations: a lucid discussion that gives you an excellent insight into the interplay of various design parameters, namely bandwidth, signal power, bit rate, and bit error ratio.
Communication Engineering Principles, Second Edition. Ifiok Otung. © 2021 John Wiley & Sons Ltd. Published 2021 by John Wiley & Sons Ltd. Companion Website: www.wiley.com/go/otung
11.1 Introduction
Digital baseband coding, discussed at length in Chapter 10, conveys information using symbols (or pulses) that contain significant frequency components down to, or near, DC. This technique is impracticable in several important situations:
● When it is required to confine the transmitted frequencies in a digital system within a passband centred at a frequency $f_c \gg 0$. This situation arises in the exploitation of radio and optical fibre transmission media whose usable passband is naturally limited to frequencies well above DC. Radio is the only type of medium that allows broadcast and mobility capabilities, whereas optical fibre is preferred to coaxial cable for high-capacity fixed telecommunication links. These two media, radio and optical fibre, are involved at some stage in most modern transmissions.
● When the available bandwidth of the transmission medium is insufficient to convey the baseband pulses at the desired symbol rate without significant distortion. An important example here is the global wire-line telephone network, which was optimised in the early days for the transmission of analogue voice signals, containing frequencies between 300 and 3400 Hz. However, with digital transmission becoming the preferred method of communication, a means had to be devised to transmit digital signals over these voice-optimised lines in order to exploit the huge financial investment that they represent. This limited-bandwidth situation also arises in radio where separate frequency bands must be allocated to different users to allow simultaneous transmission by many users on the same link.
The means of utilising the above media for digital communication is by digital modulated transmission in which one or more parameters of a sinusoidal carrier are varied by the information-bearing digital signal. The basic techniques of digital modulated transmission, namely amplitude shift keying (ASK), frequency shift keying (FSK), and phase shift keying (PSK), are presented in Chapter 1, which is worthwhile reviewing at this point. An important clarification of terminology is in order here. Throughout this chapter, we will use ASK, FSK, and PSK to refer to their binary implementation. Multilevel transmission will be explicitly identified with terms such as quadriphase shift keying (QPSK), 4-ASK, 4-FSK, M-ary, etc. Furthermore, ASK is treated as on–off keying (OOK) in which one of the two amplitude levels of the transmitted sinusoid is zero. There is nothing to be gained from making both levels nonzero, except a poorer noise performance (i.e. higher bit error ratio [BER]) for a given transmitted power. We may approach the subject of digital modulated transmission in two ways.
● The more obvious approach is to treat the technique as that of sinusoidal carrier modulation involving digital signals. The theories of amplitude and angle modulations presented in Chapters 7 and 8 can then be applied, but with the modulating signal $v_m(t)$ being digital. Thus, ASK, for example, is obtained using an AM modulator with modulation factor m = 1; and the (digital) message signal is recovered at the receiver by AM demodulation using an envelope detector. The circuit components in this case are very similar to those of analogue modulation, except that the recovered message signal will be applied to a regenerator to obtain a pure digital signal free from accumulated noise effects.
● A less obvious but greatly simplifying approach is to treat the technique as an adaptation of baseband transmission to bandpass channels. The basis of this approach is that the modulated sinusoidal carrier can only take on a discrete number of states in line with the discrete values of the digital modulating signal. Thus, we simply treat the modulated carrier transmitted during each signalling interval as a ‘bandpass’ pulse or symbol. Under this approach, the process of modulation simplifies to symbol generation. More importantly, the receiver does not need to reproduce the transmitted waveform but merely determine which symbol was transmitted during each interval. Demodulation therefore simplifies to symbol detection, and subsequent digital baseband signal regeneration.
Binary ASK, FSK, and PSK each involve the transmission of two distinct symbols, as shown in Figure 11.1, whereas, in general, M-ary modulated transmission involves M distinct bandpass symbols, as shown in Figure 11.2 for M = 4. To understand why these sinusoidal symbols are bandpass, we may consider the case of binary ASK shown in Figure 11.1. Here, a symbol $g(t)$ is transmitted for the bit duration $T_s$ if the bit is a 1, and no symbol is transmitted if the bit is a 0.
We see from Figure 11.3 that the symbol $g(t)$ can be written as the product of a rectangular pulse of duration $T_s$ and a sinusoidal function (of limitless duration). That is
$$g(t) = V\,\mathrm{rect}(t/T_s)\cos(2\pi f_c t) \tag{11.1}$$
Recall (e.g. Figure 4.29) that a rectangular pulse $\mathrm{rect}(t/T_s)$ has a sinc spectrum of null bandwidth $1/T_s$, and that the effect of multiplication by $\cos(2\pi f_c t)$ is to shift this spectrum from its baseband (centred at $f = 0$) to a passband centred at $\pm f_c$. Therefore, the spectrum $G(f)$ of $g(t)$ is as shown in Figure 11.3, which is clearly a bandpass spectrum, of bandwidth $2/T_s$, having significant (positive) frequencies centred around $f_c$
$$G(f) = \frac{VT_s}{2}\{\mathrm{sinc}[(f - f_c)T_s] + \mathrm{sinc}[(f + f_c)T_s]\} \tag{11.2}$$
Figure 11.1 Waveforms of ASK, FSK, and PSK (bit stream 10111001).
Figure 11.2 Waveforms of 4-ASK, 4-FSK, and 4-PSK (bit stream 10011100).
Figure 11.3 Bandpass symbol $g(t)$ and its amplitude spectrum $|G(f)|$.
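As a quick numerical check of Eq. (11.2), the following Python sketch compares the closed-form spectrum with a direct numerical Fourier integral of the pulse in Eq. (11.1). It is illustrative only: the values of V, Ts, and fc, the function names, and the use of the midpoint rule are assumptions made for this example, and sinc is taken in its normalised form sin(πx)/(πx).

```python
import cmath
import math

def sinc(x):
    """Normalised sinc, sin(pi x)/(pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def spectrum_formula(f, V, Ts, fc):
    """Closed-form spectrum of the bandpass pulse, Eq. (11.2)."""
    return 0.5 * V * Ts * (sinc((f - fc) * Ts) + sinc((f + fc) * Ts))

def spectrum_numeric(f, V, Ts, fc, steps=20000):
    """Fourier integral of g(t) = V rect(t/Ts) cos(2 pi fc t) by the midpoint rule."""
    dt = Ts / steps
    total = 0j
    for i in range(steps):
        t = -Ts / 2 + (i + 0.5) * dt          # rect(t/Ts) spans -Ts/2 .. Ts/2
        total += V * math.cos(2 * math.pi * fc * t) * cmath.exp(-2j * math.pi * f * t)
    return total * dt

# Assumed example values: 2 V amplitude, 1 ms symbol, 10 kHz carrier
V, Ts, fc = 2.0, 1e-3, 10e3
peak = spectrum_formula(fc, V, Ts, fc)              # close to V*Ts/2 at the carrier
null = spectrum_formula(fc + 1 / Ts, V, Ts, fc)     # first null, 1/Ts above the carrier
check = spectrum_numeric(fc + 0.3 / Ts, V, Ts, fc)  # numerical value at an arbitrary frequency
```

With these values the main lobe extends from fc − 1/Ts to fc + 1/Ts, which is the null bandwidth of 2/Ts quoted above.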
Our presentation in this chapter follows the second approach. This will allow us to apply the important concepts related to matched filtering (Chapter 12). The matched filter can be realised using a correlation receiver or a coherent detector. The main task at the receiver is to determine (in the presence of noise) which symbol was sent during each interval of duration $T_s$. If a matched filter is employed then what matters is not the symbol shape but the symbol energy E compared to noise power. In the case of sinusoidal symbols used in digital modulated transmission, the symbol energy is given by
$$E_s = \text{Symbol Power} \times \text{Symbol Duration} = A_c^2 T_s/2 \tag{11.3}$$
where Ac is the sinusoidal symbol amplitude, and we have assumed (as is usual) a unit load resistance in the computation of power. Each modulated carrier state or symbol can therefore be fully represented by a point on a signal-space or constellation diagram, the distance of the point from the origin being the square root of the energy of the symbol. In what follows, we first discuss important general concepts that are applicable to both baseband and modulated digital systems. These include signal orthogonality, signal space, digital transmission model, noise effects, and bit error ratio. Working carefully through these sections will give you a sound understanding of vital principles and equip you to easily deal with the special cases of digital modulated transmission discussed in the remaining sections of the chapter. These special cases are briefly discussed under the headings of binary modulation (Section 11.7), coherent binary detection (Section 11.8), noncoherent binary detection (Section 11.9), and M-ary transmission
(Section 11.10). The chapter ends with a brief comparison of various digital modulation techniques in terms of the important system design parameters, namely bit rate, transmission bandwidth, signal power, and BER.
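The symbol energy in Eq. (11.3) can be confirmed by integrating the squared sinusoidal pulse over one symbol duration. The Python sketch below does this numerically; the amplitude, symbol duration, number of carrier cycles, and function name are assumed values chosen only for illustration, and a unit load resistance is assumed as in the text.

```python
import math

def symbol_energy(Ac, fc, Ts, steps=10000):
    """Integrate [Ac cos(2 pi fc t)]^2 over 0..Ts using the midpoint rule."""
    dt = Ts / steps
    return sum((Ac * math.cos(2 * math.pi * fc * (i + 0.5) * dt)) ** 2
               for i in range(steps)) * dt

# Assumed example values: 3 V amplitude, 1 ms symbol, 5 carrier cycles per symbol
Ac, Ts = 3.0, 1e-3
fc = 5 / Ts
numeric = symbol_energy(Ac, fc, Ts)   # numerical integral
formula = Ac ** 2 * Ts / 2            # Eq. (11.3)
distance = math.sqrt(formula)         # distance of this state from the origin in signal space
```

The agreement relies on the pulse containing an integer number of carrier cycles; otherwise the integral of the double-frequency term does not vanish exactly.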
11.2 Orthogonality of Energy Signals
The concepts of signal correlation and orthogonality are discussed at length for various types of random and deterministic signals. Since in digital transmission information is conveyed using a finite set of M distinct symbols $g_0(t), g_1(t), \ldots, g_{M-1}(t)$, where M = 2 for binary, M > 2 for multilevel, or M-ary transmissions, and each symbol $g_k(t)$ is an energy signal of duration $T_s$, we wish in this section to apply these concepts to energy signals. Given two signals $g_1(t)$ and $g_2(t)$, if their product $g_1(t)g_2(t)$ has zero area over an interval of duration, say, $T_s$, whereas the areas of $g_1^2(t)$ and $g_2^2(t)$ are both nonzero over the same interval then the two signals are said to be orthogonal over the interval $T_s$. The principle of orthogonality finds extensive applications in communication systems, as we will learn in this chapter. For example, if a binary transmission system sends bit 1 using signal $g_1(t)$ and bit 0 using $g_2(t)$ then we can determine at the receiver which bit was sent during each interval as follows. Multiply the incoming signal by $g_1(t)$ and compute the area of the resulting product. Note that this computation of area is the mathematical operation of integration which is very easily carried out in real systems using a suitably designed lowpass filter (LPF). If the area is zero (in practice smaller than a set threshold) we conclude that the incoming signal is $g_2(t)$ and hence that bit 0 was sent. But if this area is significant (i.e. larger than the set threshold) then it is concluded that the incoming signal is $g_1(t)$ and hence that bit 1 was sent. Orthogonality can also exist right across a set of signals, with every pair in the set being orthogonal. Energy signals $g_1(t), g_2(t), g_3(t), \ldots, g_N(t)$, each of duration $T_s$, are said to be orthogonal with respect to each other if
$$\int_0^{T_s} g_k(t)g_m(t)\,dt = \begin{cases} 0, & k \ne m \\ E_k, & k = m \end{cases} \tag{11.4}$$
where $E_k$, the energy of $g_k(t)$, is nonzero and positive. When two energy signals $g_1(t)$ and $g_2(t)$ are orthogonal then their energies add independently. That is, the energy E of the sum signal $g(t) = g_1(t) + g_2(t)$ is given by the sum of the energies $E_1$ and $E_2$ of $g_1(t)$ and $g_2(t)$, respectively. You can see that this is the case by observing that
$$\begin{aligned} E &= \int_0^{T_s} g^2(t)\,dt \\ &= \int_0^{T_s} [g_1(t) + g_2(t)]^2\,dt \\ &= \int_0^{T_s} g_1^2(t)\,dt + \int_0^{T_s} g_2^2(t)\,dt + 2\int_0^{T_s} g_1(t)g_2(t)\,dt \\ &= E_1 + E_2 + 0 \end{aligned}$$
If in Eq. (11.4) $E_k = 1$ for $k = 1, 2, 3, \ldots, N$, then the waveforms $g_1(t), g_2(t), g_3(t), \ldots, g_N(t)$ are said to be orthonormal. Thus, orthonormal signals are unit-energy orthogonal signals. Orthogonality can also be defined for periodic power signals. Power signals $g_1(t), g_2(t), g_3(t), \ldots$, of period T are said to be orthogonal with respect to each other if
$$\frac{1}{T}\int_{-T/2}^{T/2} g_k(t)g_m(t)\,dt = \begin{cases} 0, & k \ne m \\ P_k, & k = m \end{cases} \tag{11.5}$$
If the power signal is nonperiodic then Eq. (11.5) is applied in the limit $T \to \infty$. The following examples of orthogonal signal sets are of special interest in digital communications. It is easy to show the orthogonality of each of the sets given below simply by evaluating Eq. (11.4) for any two functions $g_k(t)$
and $g_m(t)$ in the set to show that the integration yields zero in all cases except when $k = m$. A demonstration of this verification is given for the final set in Eq. (11.10) below:
● Harmonically related sinusoidal signals
$$g_k(t) = A_k\cos\left(2\pi\frac{k}{T_s}t + \phi_k\right), \quad k = 1, 2, 3, \cdots \tag{11.6}$$
form an orthogonal set over the interval $0 \le t \le T_s$.
● Nonoverlapping pulses
$$g_k(t) = A_k\,\mathrm{rect}\left(\frac{t - (k + \gamma)\tau}{\tau}\right), \quad k = \cdots, -2, -1, 0, 1, 2, \cdots; \quad 0 \le \gamma \ \ldots$$
Figure 11.4 Orthogonal set of rectangular pulses $g_k(t) = A_k\,\mathrm{rect}([t - k\tau]/\tau)$, $k = \ldots, -2, -1, 0, 1, 2, \ldots$
[…] 3 cannot be visualised or sketched in real-life space, which is limited to three dimensions, but they remain an important mathematical concept. Distances in this space represent the square root of energy. In particular, the distance of a point from the origin gives the square root of the energy of the transmitted state that the point represents. Figure 11.5 shows signal-space examples with N = 1, M = 2 in (a); N = 2, M = 8 in (b); and N = 3, M = 4 in (c).
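Returning briefly to the orthogonality test of Eq. (11.4), the Python sketch below evaluates the relevant integrals numerically for two harmonically related sinusoids of the form in Eq. (11.6), confirms that their energies add, and applies the multiply-and-integrate decision rule described in Section 11.2. The amplitudes, phases, symbol duration, helper names, and the threshold of half the reference energy are all assumptions made for illustration.

```python
import math

def inner(g1, g2, Ts, steps=20000):
    """Approximate the integral of g1(t) g2(t) over 0..Ts (midpoint rule)."""
    dt = Ts / steps
    return sum(g1((i + 0.5) * dt) * g2((i + 0.5) * dt) for i in range(steps)) * dt

def harmonic(k, Ts, A=1.0, phase=0.0):
    """Sinusoid of Eq. (11.6): A cos(2 pi k t / Ts + phase)."""
    return lambda t: A * math.cos(2 * math.pi * k * t / Ts + phase)

def detect(received, reference, Ts, threshold):
    """Return bit 1 if the correlation with the reference exceeds the threshold."""
    return 1 if inner(received, reference, Ts) > threshold else 0

Ts = 1.0                                   # assumed symbol duration (s)
g1 = harmonic(1, Ts, A=2.0, phase=0.4)     # symbol used for bit 1
g2 = harmonic(3, Ts, A=1.5, phase=1.1)     # symbol used for bit 0

cross = inner(g1, g2, Ts)                  # close to zero: the pair is orthogonal
E1 = inner(g1, g1, Ts)                     # A^2 Ts / 2 = 2.0
E2 = inner(g2, g2, Ts)                     # 1.125
g_sum = lambda t: g1(t) + g2(t)
E_sum = inner(g_sum, g_sum, Ts)            # equals E1 + E2 for orthogonal signals

bit_a = detect(g1, g1, Ts, threshold=E1 / 2)   # receiver input is g1
bit_b = detect(g2, g1, Ts, threshold=E1 / 2)   # receiver input is g2
```

In a practical receiver the integration would be performed on a noisy input, so the threshold choice affects the error rate; the noiseless case here only shows the mechanism.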
11.3.1 Interpretation of Signal-space Diagrams
Let us take a moment to understand what signal-space or constellation diagrams tell us about the transmission systems that they represent.
Figure 11.5 Signal-space diagrams: (a) N = 1, M = 2; (b) N = 2, M = 8; (c) N = 3, M = 4.
● Number of transmitted states: information (in the form of a string of 0s and 1s, or bit stream) is conveyed using 2, 8, and 4 distinct symbols or transmitted signal states in (a), (b), and (c), respectively. The assignment of states may be as follows. In Figure 11.5a, state $S_0$ represents bit 0, and $S_1$ represents bit 1. In Figure 11.5b, each state represents a block of three bits; for example, $S_0$ represents 000, $S_1$ represents 001, …, and $S_7$ represents 111. Similarly, each state in Figure 11.5c represents a block of two bits, with $S_0$ representing 00, $S_1$ representing 01, and so on. Generally, the number of states M in a signal-space diagram gives the order of the modulation and each state in the diagram represents $\log_2 M$ bits.
● Type of modulation scheme: the distribution and location of states in the signal-space diagram reveal the type of modulation scheme employed in the transmission system. If all M states lie on a circle, the modulation scheme is PSK. If all M states lie along a unipolar axis, the modulation scheme is ASK. If the M states are distributed in one plane but not all on one circle, the scheme is amplitude and phase shift keying (APSK), sometimes called quadrature amplitude modulation (QAM). If the M states are located at the same distance from the origin, with one state on each mutually orthogonal axis, and each axis representing a sinusoidal basis function having a unique frequency then the scheme is FSK.
● Energy of each transmitted state: the energy of each state equals the square of the distance of that state from the origin. Using arbitrary energy units for the moment, we have in Figure 11.5a
$$E_0 = 0; \quad E_1 = 3^2 = 9$$
In Figure 11.5b
$$E_0 = 1^2 + 1^2 = 2; \quad E_1 = 1^2 = 1, \quad \text{and so on.}$$
And in Figure 11.5c
$$E_1 = 2^2 = 4; \quad E_3 = 3^2 + 3^2 + 2^2 = 22, \quad \text{and so on.}$$
In general, the energy $E_k$ of state k (representing symbol $S_k$) is given by
$$E_k = D_k^2 \tag{11.15}$$
where $D_k$ is the distance of state k from the origin.
● Average energy per symbol $E_s$: the average energy per symbol is an important parameter which is obtained by dividing the total energy in the constellation by the number of symbols
$$E_s = \frac{1}{M}\sum_{k=0}^{M-1} E_k = \frac{1}{M}\sum_{k=0}^{M-1} D_k^2 \tag{11.16}$$
You may wish to verify that the constellation of Figure 11.5b has $E_s = 1.5$.
● Average energy per bit $E_b$: another important parameter which we obtain directly from a signal-space diagram is the average energy per bit. This is the average energy per symbol divided by the number of bits per symbol, which is $\log_2 M$. Thus
$$E_b = \frac{E_s}{\log_2 M} = \frac{1}{M\log_2 M}\sum_{k=0}^{M-1} D_k^2 \tag{11.17}$$
It follows that
$$E_b = \begin{cases} \dfrac{(M-1)(2M-1)d^2}{6\log_2 M}, & M\text{-ary ASK} \\ d^2/\log_2 M, & M\text{-ary PSK, } M\text{-ary FSK} \end{cases} \qquad M = 2^m; \quad m = 1, 2, 3, \cdots \tag{11.18}$$
where, in M-ary ASK, d is the distance between adjacent states, whereas, in M-ary PSK and M-ary FSK, d is the distance of each state from the origin.
● Signal power P: noting that power
$$P = \frac{\text{Energy}}{\text{Second}} \equiv \frac{\text{Energy}}{\text{Symbol}} \times \frac{\text{Symbol}}{\text{Second}} \equiv \frac{\text{Energy}}{\text{Bit}} \times \frac{\text{Bit}}{\text{Second}}$$
it follows that
$$P = E_s R_s = E_s/T_s = E_b R_b = E_b/T_b \tag{11.19}$$
where $R_s$ = symbol rate, $T_s$ = symbol duration, $R_b$ = bit rate, and $T_b$ = bit duration. Thus, if we know the symbol rate of the transmission system then we may obtain the transmitted signal power using Eq. (11.19) along with the average energy per symbol determined from the signal-space diagram.
● Amplitude of each symbol: since, in a digital modulated system, the symbols are sinusoidal, we may determine the amplitude $A_k$ of the kth symbol from its energy $E_k$ as follows
$$E_k = D_k^2 = P_k T_s = \frac{A_k^2}{2}T_s \quad \text{(i.e. Power × Duration)}$$
which yields
$$A_k = D_k\sqrt{\frac{2}{T_s}} = D_k\sqrt{2R_s} = D_k\sqrt{\frac{2R_b}{\log_2 M}} \tag{11.20}$$
● Transmitted symbols: the symbol $g_k(t)$ transmitted for each state $S_k$ follows from Eq. (11.12), with the coefficients equal to the distance moved along (or parallel to) each axis in order to get to point $S_k$ starting from the origin. In general
$$g_k(t) = s_{k0}\alpha_0(t) + s_{k1}\alpha_1(t) + \cdots + s_{k,N-1}\alpha_{N-1}(t)$$
where $s_{kj}$ is the component of state $S_k$ along the jth axis in signal space, with the jth axis representing the basis function $\alpha_j(t)$. Thus, if the signal-space diagrams in Figure 11.5 are based on the sinusoidal basis functions in Eq. (11.9) and (11.10), it follows that in Figure 11.5a
$$g_0(t) = s_{00}\alpha_0(t) = 0$$
$$g_1(t) = s_{10}\alpha_0(t) = 3\alpha_0(t) = 3\sqrt{2/T_s}\cos\left(2\pi\frac{n}{T_s}t\right)\mathrm{rect}(t/T_s)$$
In Figure 11.5b
$$g_0(t) = s_{00}\alpha_0(t) + s_{01}\alpha_1(t) = -\alpha_0(t) + \alpha_1(t) = -\left[\sqrt{2/T_s}\cos\left(2\pi\frac{n}{T_s}t\right) + \sqrt{2/T_s}\sin\left(2\pi\frac{n}{T_s}t\right)\right]\mathrm{rect}(t/T_s)$$
$$g_1(t) = s_{10}\alpha_0(t) + s_{11}\alpha_1(t) = \alpha_0(t) = \sqrt{2/T_s}\cos\left(2\pi\frac{n}{T_s}t\right)\mathrm{rect}(t/T_s)$$
And in Figure 11.5c
$$g_1(t) = 2\alpha_0'(t) = 2\sqrt{2/T_s}\cos\left(2\pi\frac{n}{T_s}t\right)\mathrm{rect}\left(\frac{t - T_s/2}{T_s}\right)$$
$$g_3(t) = 3\alpha_0'(t) + 3\alpha_1'(t) + 2\alpha_2'(t) = 3\sqrt{2/T_s}\left[\cos\left(2\pi\frac{n}{2T_s}t\right) + \cos\left(2\pi\frac{n}{T_s}t\right) + \frac{2}{3}\cos\left(2\pi\frac{3n}{2T_s}t\right)\right]\mathrm{rect}\left(\frac{t - T_s/2}{T_s}\right)$$
Signal-space diagrams are hugely important and very widely used in digital transmission system modelling and analysis. However, a signal-space diagram does not provide any information about symbol duration $T_s$ or the form of the basis functions (whether baseband or bandpass) or the carrier frequency $f_c$ (if bandpass). This information must be obtained from some other specification of the system. Baseband systems use rectangular (or shaped) basis functions, whereas modulated systems use sinusoidal basis functions.
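The quantities read off a signal-space diagram in Eqs. (11.15) to (11.20) lend themselves to a short calculation. The Python sketch below is illustrative only: the function name and the symbol rate are assumptions, and the eight coordinates are an assumed layout (four corner states and four axis states) chosen to be consistent with the energies quoted for Figure 11.5b; only the first two entries follow from the symbol expressions given in the text, and the ordering of the rest is hypothetical.

```python
import math

def constellation_metrics(points, symbol_rate):
    """Energies, average energies, power, and amplitudes of a constellation."""
    M = len(points)
    energies = [sum(c * c for c in p) for p in points]                # Eq. (11.15)
    Es = sum(energies) / M                                            # Eq. (11.16)
    Eb = Es / math.log2(M)                                            # Eq. (11.17)
    power = Es * symbol_rate                                          # Eq. (11.19)
    amplitudes = [math.sqrt(2 * symbol_rate * E) for E in energies]   # Eq. (11.20)
    return energies, Es, Eb, power, amplitudes

# Assumed 8-state layout in arbitrary energy units (hypothetical ordering after S1)
points_8 = [(-1, 1), (1, 0), (1, 1), (0, 1), (-1, -1), (-1, 0), (1, -1), (0, -1)]
energies, Es, Eb, power, amplitudes = constellation_metrics(points_8, symbol_rate=1000)

# 4-ASK with unit spacing d = 1 between adjacent states, for comparison with Eq. (11.18)
ask_points = [(k * 1.0,) for k in range(4)]
_, Es_ask, Eb_ask, _, _ = constellation_metrics(ask_points, symbol_rate=1000)
Eb_ask_formula = (4 - 1) * (2 * 4 - 1) * 1.0 ** 2 / (6 * math.log2(4))
```

For the assumed 8-state layout the average energy per symbol evaluates to 1.5, in agreement with the value quoted for Figure 11.5b, and the 4-ASK result matches the first line of Eq. (11.18).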
11.3.2 Complex Notation for 2D Signal Space
The signal space of M-ary PSK and M-ary APSK (e.g. Figure 11.5b) is two-dimensional (2D), consisting of two basis functions, $\alpha_0(t)$ and $\alpha_1(t)$, which are, respectively, a cosine pulse and negative sine pulse, as defined in Eq. (11.9). M-ary ASK is a special case having only one dimension with just $\alpha_0(t)$ as its basis function. Since $-\sin\theta$ leads $\cos\theta$ by 90° just as the imaginary number $j = \sqrt{-1}$ (introduced in Eq. (2.42)) leads the real number +1 by 90°, it is both mathematically consistent and convenient to adopt a complex notation for representing states in such signal spaces, namely ASK, PSK, and APSK. With this notation, $\alpha_0(t)$ is treated as the reference axis oriented in the positive real or 0° direction and is referred to as the in-phase axis, whereas $\alpha_1(t)$ is treated as the imaginary axis oriented in the 90° direction, referred to as the quadrature axis. Thus, the location of any state $S_k$ in signal space, which is at a distance $D_k$ from the origin and at a phase $\theta_k$ (≡ angle measured counterclockwise from the 0° direction), is given by the complex number
$$S_k = x_k + jy_k, \quad \text{where} \quad x_k = D_k\cos\theta_k; \quad y_k = D_k\sin\theta_k; \quad k = 0, 1, 2, \cdots, M - 1 \tag{11.21}$$
The components $x_k$ and $y_k$ are distances in signal space and therefore have units of the square root of energy, or $\mathrm{J}^{1/2}$. Following the discussion leading to Eq. (11.20), these components may be scaled to amplitudes having a unit of volt by multiplying by the factor $\sqrt{2/T_s}$, where $T_s$ is the duration of each transmitted bandpass symbol. This leads to
$$A_{kI} = x_k\sqrt{2/T_s}; \quad A_{kQ} = y_k\sqrt{2/T_s} \tag{11.22}$$
$A_{kI}$ and $A_{kQ}$ are, respectively, the in-phase and quadrature components of the transmitted bandpass pulse $g_k(t)$ corresponding to state $S_k$. This pulse is sinusoidal and has amplitude $A_k$ given by
$$A_k = \sqrt{A_{kI}^2 + A_{kQ}^2} = \sqrt{\frac{2(x_k^2 + y_k^2)}{T_s}} \tag{11.23}$$
Since all states in M-ary ASK lie along the $\alpha_0$ (i.e. real) axis, there is no quadrature component in ASK and $\theta_k = 0$ for all k. We recall the definition in Section 3.5.6 of the correlation coefficient $\rho$ of two energy signals $g_0(t)$ and $g_1(t)$ of duration $T_s$ and respective energies $E_0$ and $E_1$
$$\rho = \frac{\int_0^{T_s} g_0(t)g_1(t)\,dt}{\text{Average Energy}} = \frac{2\int_0^{T_s} g_0(t)g_1(t)\,dt}{E_0 + E_1} \tag{11.24}$$
and note from this definition and the definition of orthogonality in Eq. (11.4) that the correlation coefficient of two orthogonal energy signals is zero. Furthermore, in this 2D signal space, the correlation coefficient $\rho_{km}$ between two states $S_k$ and $S_m$ having respective components $(x_k, y_k)$ and $(x_m, y_m)$ simplifies to
$$\rho_{km} = \frac{2(x_k x_m + y_k y_m)}{(x_k^2 + y_k^2) + (x_m^2 + y_m^2)} \tag{11.25}$$
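The complex notation maps naturally onto a language with built-in complex numbers. The following Python sketch represents states as complex values and evaluates Eqs. (11.21) to (11.23) and (11.25). The function names, the four unit-distance states at 0°, 90°, 180°, and 270°, and the symbol duration are assumptions made for this example rather than values taken from the text.

```python
import cmath
import math

def state(D, theta_deg):
    """Complex location of a state at distance D and phase theta, Eq. (11.21)."""
    return cmath.rect(D, math.radians(theta_deg))

def iq_amplitudes(s, Ts):
    """In-phase and quadrature amplitudes in volts, Eq. (11.22)."""
    scale = math.sqrt(2 / Ts)
    return s.real * scale, s.imag * scale

def amplitude(s, Ts):
    """Amplitude of the transmitted sinusoidal pulse, Eq. (11.23)."""
    return abs(s) * math.sqrt(2 / Ts)

def correlation(sk, sm):
    """Correlation coefficient of two states, Eq. (11.25); undefined if both lie at the origin."""
    return 2 * (sk.real * sm.real + sk.imag * sm.imag) / (abs(sk) ** 2 + abs(sm) ** 2)

# Assumed example: four unit-distance states spaced 90 degrees apart
states = [state(1.0, angle) for angle in (0, 90, 180, 270)]
rho_adjacent = correlation(states[0], states[1])   # orthogonal pair, close to 0
rho_opposite = correlation(states[0], states[2])   # antipodal pair, close to -1

Ts = 1e-3                                          # assumed symbol duration (s)
A_I, A_Q = iq_amplitudes(state(2.0, 30), Ts)
A = amplitude(state(2.0, 30), Ts)
```

Adjacent states in this assumed layout are uncorrelated, while diametrically opposite states are fully anticorrelated.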
11.3.3 Signal-space Worked Examples
The following worked examples provide a more detailed discussion and further insight into the signal-space concept.
Worked Example 11.1 Baseband System
A baseband transmission system conveys information using the symbols $g_0(t)$, $g_1(t)$, and $g_2(t)$ shown in Figure 11.6a.
(a) Determine a suitable set of two basis functions $\beta_0(t)$ and $\beta_1(t)$.
(b) Sketch the signal-space or constellation diagram for this system.
(a) The form of the transmitted symbols suggests that a suitable set of basis functions would be two half-width rectangular pulses $\beta_0(t)$ and $\beta_1(t)$, as shown in Figure 11.6b, one occupying the interval $0 \le t \le T_s/2$, the other the interval $T_s/2 \le t \le T_s$, and both having amplitude $V_\beta$ that gives unit energy. Their formal mathematical expressions are
$$\beta_0(t) = V_\beta\,\mathrm{rect}\left(\frac{t - T_s/4}{T_s/2}\right); \qquad \beta_1(t) = V_\beta\,\mathrm{rect}\left(\frac{t - 3T_s/4}{T_s/2}\right)$$
Figure 11.6 Worked Example 11.1: (a) transmitted symbols; (b) orthonormal basis functions; (c) sum pulse; (d) signal space diagram.
A general method for selecting basis functions will be given shortly. Since $\beta_0(t)$ and $\beta_1(t)$ are nonoverlapping, their product (and hence its integral) is zero, whereas the area under the square of each pulse is nonzero. Therefore, by the definition of orthogonality given in Eq. (11.4), the two pulses are orthogonal. Note that, as expected of orthogonal signals, the energy of the sum of the two pulses equals the sum of the energy of each pulse. The sum pulse $\beta(t)$ is shown in Figure 11.6c and has energy
$$E_\beta = V_\beta^2 T_s$$
whereas the individual pulses $\beta_0(t)$ and $\beta_1(t)$ have energies
$$E_{\beta 0} = E_{\beta 1} = V_\beta^2 T_s/2 \tag{11.26}$$
which shows that $E_\beta = E_{\beta 0} + E_{\beta 1}$ and confirms that $\beta_0(t)$ and $\beta_1(t)$ are orthogonal. In general, pulses that occupy nonoverlapping time intervals are orthogonal as stated in Eq. (11.7) for rectangular pulses. However, this rule holds regardless of pulse shape (provided of course the pulses are not identically zero). Basis functions in baseband systems therefore consist of fractional-width rectangular pulses, one for each interval between transitions in the voltage level of transmitted symbols. In this example, we have two separate time instants at which voltage level transitions occur, one at $t = T_s/2$ in $g_0(t)$ only, and the other at $t = T_s$ in all three transmitted symbols. Thus, we require two basis functions, as shown in Figure 11.6b. One important exception to this rule is when there are only two opposite polarity (or antipodal) symbols $g_0(t)$ and $g_1(t)$, as in Manchester line code. In this case only one basis function $\beta_0(t)$ is required, which is the symbol $g_0(t)$ appropriately scaled to unit energy. The two symbols are then given by $g_0(t) = K\beta_0(t)$ and $g_1(t) = -K\beta_0(t)$, where K is a constant. Finally, we complete the specification of the basis functions by determining the height $V_\beta$ that gives them unit energy. Thus, from Eq. (11.26)
$$V_\beta = \sqrt{2/T_s} \tag{11.27}$$
(b) The transmitted symbols have amplitude V, arrived at by dividing the basis functions by $V_\beta$ and then multiplying by V. So we may express the transmitted symbols in terms of $\beta_0(t)$ and $\beta_1(t)$ as
$$\begin{aligned} g_0(t) &= -\frac{V}{V_\beta}\beta_0(t) + \frac{V}{V_\beta}\beta_1(t) = V\sqrt{T_s/2}\,[-\beta_0(t) + \beta_1(t)] \\ g_1(t) &= V\sqrt{T_s/2}\,[\beta_0(t) + \beta_1(t)] \\ g_2(t) &= -V\sqrt{T_s/2}\,[\beta_0(t) + \beta_1(t)] \end{aligned} \tag{11.28}$$
The signal-space diagram is therefore as shown in Figure 11.6d. Note that the transmitted symbols have equal energy
$$E_0 = E_1 = E_2 = V^2 T_s \tag{11.29}$$
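The coordinates in Eq. (11.28) and the energies in Eq. (11.29) can be cross-checked by projecting each symbol onto the two basis functions, that is, by integrating the product of symbol and basis function over the symbol duration. The Python sketch below does this numerically; the values of V and Ts, the function names, and the midpoint-rule integration are assumptions chosen for illustration.

```python
import math

V, Ts = 1.5, 2e-3                  # assumed symbol amplitude (V) and duration (s)
V_beta = math.sqrt(2 / Ts)         # Eq. (11.27): height giving unit energy

def beta0(t):
    """Basis pulse occupying the first half of the symbol interval."""
    return V_beta if 0 <= t < Ts / 2 else 0.0

def beta1(t):
    """Basis pulse occupying the second half of the symbol interval."""
    return V_beta if Ts / 2 <= t < Ts else 0.0

def g0(t):
    """Symbol with a transition at Ts/2: -V then +V, per Eq. (11.28)."""
    return -V if t < Ts / 2 else V

def g1(t):
    return V

def g2(t):
    return -V

def project(symbol, basis, steps=10000):
    """Integral of symbol(t) * basis(t) over 0..Ts (midpoint rule)."""
    dt = Ts / steps
    return sum(symbol((i + 0.5) * dt) * basis((i + 0.5) * dt) for i in range(steps)) * dt

coords = [(project(g, beta0), project(g, beta1)) for g in (g0, g1, g2)]
energies = [x * x + y * y for x, y in coords]   # each should equal V^2 Ts
unit = V * math.sqrt(Ts / 2)                    # coordinate magnitude in Eq. (11.28)
```

The three states land at (−unit, +unit), (+unit, +unit), and (−unit, −unit), all at the same distance from the origin.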
If you have studied Chapter 10, you might have observed by now that this is a baseband system that employs coded mark inversion (CMI) line code in which bit 0 is conveyed by $g_0(t)$ and bit 1 is conveyed alternately by $g_1(t)$ and $g_2(t)$.
Worked Example 11.2 Modulated System
In the following problems, a sinusoidal pulse is defined as a signal of duration $T_s$ that has a sinusoidal variation with an integer number of cycles within the interval $0 \le t \le T_s$, and is zero elsewhere.
(a) Show that the sinusoidal pulses $g_0(t) = V_0\cos(2\pi f_c t)$ and $g_1(t) = -V_1\sin(2\pi f_c t)$ are orthogonal, where $f_c = n/T_s$ and $n = 1, 2, 3, \ldots$
(b) Show that the set of sinusoidal pulses $\cos(2\pi f_s t), \cos(2\pi 2f_s t), \ldots, \cos(2\pi nf_s t), \ldots$, where $f_s = 1/T_s$, are mutually orthogonal.
(c) Sketch the signal-space diagrams of 4-ASK, 4-PSK, and 4-FSK.
(a) From Eq. (11.3), the energy of each pulse is given by
$$E_0 = V_0^2 T_s/2; \quad E_1 = V_1^2 T_s/2$$
The sum pulse $g(t) = g_0(t) + g_1(t)$ is also of duration $T_s$ and is given in the interval $0 \le t \le T_s$ by
$$\begin{aligned} g(t) &= V_0\cos(2\pi f_c t) - V_1\sin(2\pi f_c t) \\ &= V_0\cos(2\pi f_c t) + V_1\cos(2\pi f_c t + \pi/2) \\ &= \sqrt{V_0^2 + V_1^2}\,\cos(2\pi f_c t + \varphi) \\ &\equiv V\cos(2\pi f_c t + \varphi) \end{aligned} \tag{11.30}$$
where $\varphi = \arctan(V_1/V_0)$. Note that we applied the technique of sinusoidal signal addition learnt in Chapter 2. The energy of $g(t)$ is
$$\begin{aligned} E &= V^2\frac{T_s}{2} = (V_0^2 + V_1^2)\frac{T_s}{2} \\ &= V_0^2 T_s/2 + V_1^2 T_s/2 \\ &= E_0 + E_1 \end{aligned}$$
We therefore conclude that $g_0(t)$ and $g_1(t)$ are orthogonal since their energies add independently. An alternative and straightforward way of solving this problem would be to show that the integral of the product signal $g_0(t)g_1(t)$ is zero, whereas the integrals of $g_0^2(t)$ and $g_1^2(t)$ are nonzero. Also, note from Eq. (11.30) that $g_0(t)$ and $g_1(t)$ will have unit energy, and hence become a set of two orthonormal basis functions $\phi_0(t)$ and $\phi_1(t)$ if we set their amplitudes to
$$V_0 = V_1 = \sqrt{2/T_s} \tag{11.31}$$
The orthonormal basis functions of ASK, PSK, and APSK modulated systems are always sinusoidal signals of duration $T_s$, with frequency $f_c$ equal to an integer multiple of $1/T_s$ and amplitude given by Eq. (11.31).
(b) There is no general rule for adding sinusoids of different frequencies, so in this case we apply Eq. (11.4) to prove orthogonality. Consider two different functions $g_m(t) = \cos(2\pi mf_s t)$ and $g_n(t) = \cos(2\pi nf_s t)$, in the given set. We have
Ts
gm (t)gn (t)dt =
∫0
=
cos(2𝜋mf s t) cos(2𝜋nf s t)dt
∫0
1 2 ∫0
Ts
1 + 2 ∫0
cos[2𝜋(m + n)fs t]dt (fs = 1∕Ts )
Ts
cos[2𝜋(m − n)fs t]dt
=0 where we expanded the first integrand on the right-hand side using the trigonometric identity for the product of cosines. The resulting integrals evaluate to zero since each is the area under a sinusoidal function over a time interval T s in which the sinusoid completes an integer number of cycles. Over this interval, there is the same amount of positive and negative area, giving a total area (i.e. integral) equal to zero.Now for m = n, we have Ts
Ts
cos2 (2𝜋mf s t)dt ∫0 T = s 2 where we evaluated the integral by noting that it is the energy of a unit-amplitude sinusoidal pulse, which allows us to apply Eq. (11.3) with V = 1. From the foregoing, it follows that { Ts 0, m≠n (11.32) cos(2𝜋mf s t) cos(2𝜋nf s t)dt = ∫0 T ∕2, m = n ∫0
gm (t)gm (t)dt =
s
where f s = 1/T s . Therefore, the set of sinusoidal pulses with frequencies at integer multiples of 1/T s are mutually orthogonal over the interval 0 ≤ t ≤ T s . The set of orthonormal basis functions in FSK modulation consists of these sinusoidal pulses with amplitude given by (11.31). M-ary FSK requires M basis functions giving rise to an M-dimensional signal space. On the contrary, M-ary ASK is one-dimensional (1D), requiring only one basis function, whereas M-ary PSK is 2D,
requiring two basis functions, albeit of the same frequency. Hybrid modulation techniques, which combine ASK and PSK, are also 2D and are realised using a linear combination of the same two basis functions, as in PSK.

4-ASK: this has a 1D signal space with four states: $S_0$, $S_1$, $S_2$, and $S_3$. Only one basis function is involved

$$\alpha_0(t) = \begin{cases} \sqrt{2/T_s}\,\cos(2\pi n f_s t), & 0 \le t \le T_s \\ 0, & \text{elsewhere} \end{cases} \tag{11.33}$$

where $n$ is an integer and $f_s = 1/T_s$. Transmitted symbols differ only in amplitude, which, if equally spaced in the range from 0 to $V$, leads to the following symbols

$$\begin{aligned}
g_0(t) &= 0 \\
g_1(t) &= \frac{V}{3}\sqrt{\frac{T_s}{2}}\,\alpha_0(t) \\
g_3(t) &= \frac{2V}{3}\sqrt{\frac{T_s}{2}}\,\alpha_0(t) \\
g_2(t) &= V\sqrt{\frac{T_s}{2}}\,\alpha_0(t)
\end{aligned} \tag{11.34}$$

4-PSK: the signal space is 2D with the four states represented using a linear combination of two basis functions

$$\begin{aligned}
\alpha_0(t) &= \begin{cases} \sqrt{2/T_s}\,\cos(2\pi n f_s t), & 0 \le t \le T_s \\ 0, & \text{elsewhere} \end{cases} \\
\alpha_1(t) &= \begin{cases} -\sqrt{2/T_s}\,\sin(2\pi n f_s t), & 0 \le t \le T_s \\ 0, & \text{elsewhere} \end{cases} \\
f_s &= 1/T_s
\end{aligned} \tag{11.35}$$

Transmitted symbols have the same amplitude $V$ and frequency $nf_s$, differing only in phase. One implementation that places states $S_0$, $S_1$, $S_2$, and $S_3$ (representing di-bits 00, 01, 10, and 11, respectively) at respective phases $-135^\circ$, $135^\circ$, $-45^\circ$, and $45^\circ$ is

$$\begin{aligned}
g_0(t) &= -V\sqrt{T_s/4}\,[\alpha_0(t) + \alpha_1(t)] \\
g_1(t) &= V\sqrt{T_s/4}\,[-\alpha_0(t) + \alpha_1(t)] \\
g_2(t) &= V\sqrt{T_s/4}\,[\alpha_0(t) - \alpha_1(t)] \\
g_3(t) &= V\sqrt{T_s/4}\,[\alpha_0(t) + \alpha_1(t)]
\end{aligned} \tag{11.36}$$

where symbol $g_k(t)$ corresponds to state $S_k$, $k = 0, 1, 2, 3$.

4-FSK: four basis functions are needed, one for each of the four states $S_0$, $S_1$, $S_2$, and $S_3$

$$\alpha'_k(t) = \begin{cases} \sqrt{2/T_s}\,\cos[\pi(n + mk)R_s t], & 0 \le t \le T_s \\ 0, & \text{elsewhere} \end{cases} \qquad k = 0, 1, 2, 3, \ldots;\quad R_s = 1/T_s \tag{11.37}$$

where $n, m \ge 1$ are integers and $m$ determines the spacing of the frequencies of the transmitted sinusoidal symbols $g_k(t)$, which are of equal amplitude $V$, but different frequencies. The smallest spacing is $R_s/2$ when $m = 1$. The transmitted symbols are therefore

$$g_k(t) = V\sqrt{\frac{T_s}{2}}\,\alpha'_k(t), \qquad k = 0, 1, 2, 3 \tag{11.38}$$
Figure 11.7 Worked Example 11.2: signal-space diagrams of (a) 4-ASK; (b) 4-PSK.
Figure 11.7 shows the signal-space diagrams of the 4-ASK and 4-PSK discussed above. The orientation of the 𝛼 1 and 𝛼 0 axes in Figure 11.7b is consistent with our definition of 𝛼 1 (t) in Eq. (11.35) as a negative sine pulse and 𝛼 0 (t) as a cosine pulse, and the fact that the cosine function leads the sine function by 90∘ and therefore lags the negative sine function by 90∘ . We have omitted the signal-space diagram of 4-FSK because it is hardly insightful and could be confusing to sketch four mutually perpendicular axes.
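The orthonormality of the 4-PSK basis functions in Eq. (11.35) and the phases implied by Eq. (11.36) can be checked numerically. The following is a minimal sketch, not part of the original example: the values of `Ts`, `n`, and `V`, the sample count, and the helper name `inner` are illustrative assumptions, and the integrals are approximated by a midpoint sum.

```python
import numpy as np

# Assumed illustrative parameters (not taken from the text)
Ts = 1e-3          # symbol duration, s
n = 4              # integer number of carrier cycles per symbol
V = 2.0            # symbol amplitude, V
fs = 1.0 / Ts

# Midpoint samples of the interval 0 <= t <= Ts
N_SAMPLES = 20000
dt = Ts / N_SAMPLES
t = (np.arange(N_SAMPLES) + 0.5) * dt

# Basis functions of Eq. (11.35)
alpha0 = np.sqrt(2 / Ts) * np.cos(2 * np.pi * n * fs * t)
alpha1 = -np.sqrt(2 / Ts) * np.sin(2 * np.pi * n * fs * t)


def inner(x, y):
    """Approximate the integral of x(t) * y(t) over one symbol interval."""
    return float(np.sum(x * y) * dt)


# Signal vectors (s_k0, s_k1) read off Eq. (11.36)
c = V * np.sqrt(Ts / 4)
signal_vectors = {0: (-c, -c), 1: (-c, c), 2: (c, -c), 3: (c, c)}

print("energy of alpha0 :", inner(alpha0, alpha0))
print("energy of alpha1 :", inner(alpha1, alpha1))
print("cross-correlation:", inner(alpha0, alpha1))
for k, (s0, s1) in signal_vectors.items():
    phase_deg = np.degrees(np.arctan2(s1, s0))
    energy = s0**2 + s1**2
    print(f"S{k}: phase = {phase_deg:.0f} deg, energy = {energy:.3e} J")
```

Each state should show energy V²Ts/2, which is the energy of a sinusoidal pulse of amplitude V, and the phases listed in the text.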
11.4 Digital Transmission Model

The discussion so far leads us to adopt the simplified model for digital transmission shown in Figure 11.8. It consists of a symbol generator capable of generating M distinct transmitted symbols $\{g_k(t),\ k = 0, 1, \ldots, M-1\}$ each of duration $T_s$; a transmission medium that accounts for noise $w(t)$, including contributions from the receiver; and a symbol detector that determines which of the M symbols is most likely to have been sent given the (noise-corrupted) received signal $r(t)$. The transmission system is fully described by a set of N orthonormal basis functions $\alpha_0(t), \alpha_1(t), \ldots, \alpha_{N-1}(t)$, and its signal-space diagram. The signal-space diagram provides a lucid summary of the adopted coding procedure that maps each distinct input block of $\log_2 M$ bits to a distinct point in N-dimensional space. There are M such input blocks, and therefore M (message) points $S_0, S_1, \ldots, S_{M-1}$ in the signal-space diagram, which correspond respectively to the M transmitted symbols $g_0(t), g_1(t), \ldots, g_{M-1}(t)$. Point $S_k$ is identified by a vector $\mathbf{s}_k$ called the signal vector with N elements $s_{k0}, s_{k1}, \ldots, s_{k,N-1}$, which are the components of the point along the N mutually perpendicular axes. In a commonly used arrangement, known as Gray coding, adjacent message points represent blocks of bits that differ in only one bit position. Figure 11.5b is an example of Gray coding. Unless otherwise stated, we employ a notation in which $S_k$ represents a block of bits whose decimal equivalent is k. For example, with M = 8, $S_0$ represents 000, $S_5$ represents 101, etc.
Figure 11.8 (a) Digital transmission model; (b) symbol generation; (c) symbol detection; (d) matched filter equivalent of jth correlator.
We may summarise the processes performed at the transmit end of the simplified model in Figure 11.8 as follows:
● During each symbol interval, a block of $\log_2 M$ input bits is read and mapped to point $S_k$ according to the agreed coding procedure. Equivalently, we say that signal vector $\mathbf{s}_k$ is generated.
● Using the basis functions (which characterise the transmission system), and the signal vector $\mathbf{s}_k$, the transmitted symbol $g_k(t)$ is generated according to the block diagram shown in Figure 11.8b. Note that this diagram is an implementation of Eq. (11.12).
Ignoring noise effects for the moment, the receiver has the task of extracting the signal vector $\mathbf{s}_k$ from the symbol $g_k(t)$ received during each symbol interval. Once $\mathbf{s}_k$ has been extracted, the corresponding bit block is obtained by mapping from message point $S_k$ to a block of $\log_2 M$ bits according to the coding procedure used at the transmitter. To see how this symbol detection may be carried out, consider the result of the following integration of the product of the received symbol $g_k(t)$ and the jth basis function $\alpha_j(t)$. This is a correlation operation performed in a correlation receiver when it is fed with inputs $g_k(t)$ and $\alpha_j(t)$. The correlation receiver is discussed further in Chapter 12.

$$\begin{aligned}
\int_0^{T_s} g_k(t)\alpha_j(t)\,dt &= \int_0^{T_s} [s_{k0}\alpha_0(t) + s_{k1}\alpha_1(t) + \cdots + s_{kj}\alpha_j(t) + \cdots + s_{k,N-1}\alpha_{N-1}(t)]\,\alpha_j(t)\,dt \\
&= s_{kj}\int_0^{T_s} \alpha_j^2(t)\,dt + s_{k0}\int_0^{T_s} \alpha_0(t)\alpha_j(t)\,dt + \cdots + s_{k,j-1}\int_0^{T_s} \alpha_{j-1}(t)\alpha_j(t)\,dt \\
&\quad + s_{k,j+1}\int_0^{T_s} \alpha_{j+1}(t)\alpha_j(t)\,dt + \cdots + s_{k,N-1}\int_0^{T_s} \alpha_{N-1}(t)\alpha_j(t)\,dt \\
&= s_{kj}
\end{aligned} \tag{11.39}$$
In the above, we obtain the second line by expanding $g_k(t)$ according to Eq. (11.12), and the last line by invoking the orthonormality property of the basis functions, Eq. (11.13). We see that the operation yields the jth element $s_{kj}$ of the desired vector $\mathbf{s}_k$. Thus, to determine the entire vector $\mathbf{s}_k$, we feed the received symbol $g_k(t)$ as a common input to a bank of N correlators each of which is supplied with its own basis function. The N outputs of this arrangement, which is shown in Figure 11.8c, are the N elements of the vector $\mathbf{s}_k$. We find in Chapter 12 that the correlator supplied with basis function $\alpha_j(t)$ is equivalent to a matched filter that gives optimum detection (in the presence of white noise) of a symbol of the same waveform as $\alpha_j(t)$. Thus, each correlator in the bank may be replaced by the matched filter equivalent shown in Figure 11.8d.
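The symbol generation and correlator-bank detection just described can be sketched numerically as follows. This is an illustrative sketch only: the three cosine basis functions, the normalised symbol duration, the test vector `s_k`, and the function names `generate_symbol` and `correlator_bank` are assumptions chosen for the demonstration, and noise is ignored as in the text.

```python
import numpy as np

Ts = 1.0                      # assumed (normalised) symbol duration
N_SAMPLES = 10000
dt = Ts / N_SAMPLES
t = (np.arange(N_SAMPLES) + 0.5) * dt   # midpoint samples of 0 <= t <= Ts


def basis(j):
    """Assumed orthonormal basis: unit-energy cosine pulse at (j + 1) / Ts."""
    return np.sqrt(2 / Ts) * np.cos(2 * np.pi * (j + 1) * t / Ts)


# Rows are alpha_0(t), alpha_1(t), alpha_2(t)
alphas = np.array([basis(j) for j in range(3)])


def generate_symbol(s_k):
    """Symbol generator: g_k(t) = sum over j of s_kj * alpha_j(t)."""
    return s_k @ alphas


def correlator_bank(received):
    """Bank of N correlators: integrate received(t) * alpha_j(t) over Ts."""
    return alphas @ received * dt


s_k = np.array([0.7, -1.2, 0.4])   # hypothetical signal vector
g_k = generate_symbol(s_k)
recovered = correlator_bank(g_k)
print("sent     :", s_k)
print("recovered:", recovered)
```

The recovered vector matches the transmitted one to within numerical integration error, which is the content of Eq. (11.39).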
11.5 Noise Effects

In practice, the transmitted symbol is corrupted by noise before it reaches the detection point at the receiver, and the input to the bank of correlators discussed above is the signal

$$r(t) = g_k(t) + w(t) \tag{11.40}$$

In most practical situations it is adequate to assume that $w(t)$ is AWGN. The output of the jth correlator is

$$\begin{aligned}
\int_0^{T_s} r(t)\alpha_j(t)\,dt &= \int_0^{T_s} [g_k(t) + w(t)]\,\alpha_j(t)\,dt \\
&= \int_0^{T_s} g_k(t)\alpha_j(t)\,dt + \int_0^{T_s} w(t)\alpha_j(t)\,dt \\
&= s_{kj} + w_j
\end{aligned} \tag{11.41}$$
Comparing Eqs. (11.39) and (11.41), we see that the effect of noise is to shift the output of the jth correlator by a random amount

$$w_j = \int_0^{T_s} w(t)\alpha_j(t)\,dt \tag{11.42}$$

In other words, rather than the desired vector $\mathbf{s}_k$ (which corresponds to a precise message point $S_k$), we now have at the output of Figure 11.8c an output vector

$$\mathbf{r} = \mathbf{s}_k + \mathbf{w} \tag{11.43}$$
where $\mathbf{w}$ is a random vector with components $w_0, w_1, \ldots, w_{N-1}$, given by Eq. (11.42) with $j = 0, 1, \ldots, N-1$, respectively. The received vector $\mathbf{r}$ corresponds to a received signal point R in signal space. This point is displaced from the message point $S_k$ by a distance $|\mathbf{w}|$. The displacement can be in any direction with equal likelihood, but smaller displacements are more likely than large ones, in line with the probability density function (pdf) of a Gaussian random variable. The picture is as shown in Figure 11.9 for two adjacent message points $S_1$ and $S_2$ separated by a distance $\sqrt{E}$. The level of shading at a point gives an indication of the likelihood of the received signal point R to be around that point.

Two important tasks must now be performed at the receiver in order to recover an output bit stream from the received signal $r(t)$. First, a bank of correlators (as in Figure 11.8c) is used to extract the received vector $\mathbf{r}$. Next, given vector $\mathbf{r}$, or equivalently point R, which in general does not coincide exactly with any of the message points $\{S_k,\ k = 0, 1, \ldots, M-1\}$, the receiver has to make a decision on which message point is most likely to have been transmitted. On the assumption that all M message points are transmitted with equal probability, the maximum likelihood rule leads to a decision in favour of the message point that is closest to the received signal point. The decision boundary for message points $S_1$ and $S_2$ (in Figure 11.9) is the perpendicular bisector of the line joining $S_1$ and $S_2$. Therefore, a symbol error will occur whenever noise effects shift the received point R across this decision boundary. Clearly, the likelihood of such an error increases as the spacing $\sqrt{E}$ between signal points is reduced. Because of the random nature of noise, we can only talk of the probability $P_e$ that a symbol error will occur but cannot predict with certainty the interval when this error will occur.
For example, a probability of symbol error $P_e = 0.01$ means that on average one symbol in a hundred will be incorrectly received. Note that this does not imply that there will always be one error in every 100 transmitted symbols. In fact, there may well be some periods of time during which a thousand or more symbols are received without a single error, and others in which there are two or more errors in 100 symbols. What this statement means is that if we observe the transmission over a sufficiently long time then we will find that the ratio of the number of symbols in error to the total number of symbols transmitted is $P_e = 0.01$. This probability is therefore also referred to as symbol error ratio, which we determine below for a binary transmission system.

Figure 11.9 Effect of noise.
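The minimum-distance decision described above, in which the receiver chooses the message point closest to the received point R, can be illustrated with a short sketch. The constellation coordinates, the noise standard deviation, and the function name `nearest_message_point` are hypothetical choices made for the illustration and are not taken from the text.

```python
import numpy as np


def nearest_message_point(r, message_points):
    """Return the index k of the message point S_k closest to received vector r."""
    distances = np.linalg.norm(message_points - r, axis=1)
    return int(np.argmin(distances))


# Hypothetical 2D constellation: row k holds the signal vector s_k
message_points = np.array([[-1.0, -1.0],
                           [-1.0,  1.0],
                           [ 1.0, -1.0],
                           [ 1.0,  1.0]])

rng = np.random.default_rng(0)
k_sent = 2
w = rng.normal(0.0, 0.3, size=2)        # assumed Gaussian noise vector
r = message_points[k_sent] + w          # received vector, as in r = s_k + w
k_hat = nearest_message_point(r, message_points)
print(f"sent S{k_sent}, received point {r}, decided S{k_hat}")
```

A symbol error corresponds to `k_hat` differing from `k_sent`, which happens only when the noise vector carries R across a decision boundary.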
11.6 Symbol and Bit Error Ratios

To obtain a more quantitative expression of the effect of noise, consider a binary system with two message points $S_0$ and $S_1$ that are separated by distance $\sqrt{E}$ in signal space as shown in Figure 11.10a. We wish to derive an expression for the probability of symbol error in this binary transmission system when incoming symbols are coherently detected in the presence of AWGN using the correlation receiver discussed in Figure 11.8c. If the system transmits at a symbol rate $R_s = 1/T_s$, where $T_s$ equals symbol duration, then its minimum baseband noise equivalent bandwidth is

$$B = \frac{R_s}{2} = \frac{1}{2T_s}$$

and, at the detection point, the noise power $P_n$ (which equals the variance $\sigma^2$ of the Gaussian noise $v_n(t)$ since it has zero mean) is

$$P_n = \sigma^2 = N_o B = N_o/2T_s \tag{11.44}$$

where $N_o$ is the noise power per unit bandwidth. A sample of this noise has in-phase and quadrature components $v_{nI}$ and $v_{nQ}$ (as discussed in Section 6.3) with respective energies $E_{nI}$ and $E_{nQ}$ over one symbol duration given by

$$E_{nI} = v_{nI}^2 T_s; \qquad E_{nQ} = v_{nQ}^2 T_s$$
As a result of the addition of this noise, a transmitted state $S_0$ will be received at point $S_0'$ having been displaced through distance a along the in-phase axis $\alpha$ and distance b along the quadrature axis $\beta$, as shown in Figure 11.10a.

Figure 11.10 (a) Two-dimensional signal space with two states $S_0$ and $S_1$ separated by distance $\sqrt{E}$; (b) shaded area gives the probability that $S_0$ is received in error.
Since distance in signal space is the square root of energy, it follows that

$$\begin{aligned}
a &= \sqrt{E_{nI}} = v_{nI}\sqrt{T_s} \\
b &= \sqrt{E_{nQ}} = v_{nQ}\sqrt{T_s}
\end{aligned} \tag{11.45}$$

There will be symbol error if $S_0'$ lies to the right of the decision boundary, which will be the case if $a > d$, i.e. if

$$v_{nI}\sqrt{T_s} > \sqrt{E}/2; \quad \Rightarrow \quad v_{nI} > \frac{1}{2}\sqrt{\frac{E}{T_s}} \equiv z \tag{11.46}$$

Since $v_{nI}$ is a Gaussian random variable of zero mean and variance $\sigma^2$ given by Eq. (11.44), it follows that the probability $P_{e0}$ of symbol error given that $S_0$ is sent is the shaded area of Figure 11.10b, which we evaluate in Eq. (6.16) as

$$\begin{aligned}
\Pr(v_{nI} > z) &= \Pr[S_1 \mid S_0] \equiv P_{e0} \\
&= Q\left(\frac{z}{\sigma}\right) = \frac{1}{2}\operatorname{erfc}\left(\frac{z}{\sigma\sqrt{2}}\right)
\end{aligned} \tag{11.47}$$
This equation involves the complementary error function erfc(x) and Q-function Q(x) discussed in Section 3.4.1 and tabulated in Appendix C. From the symmetry of the problem in Figure 11.10a, the probability $P_{e1}$ of an error occurring when $S_1$ is sent is the same as $P_{e0}$. A transmission channel that satisfies this condition, $P_{e1} = P_{e0}$, is referred to as a binary symmetric channel. Therefore the probability $P_e$ of an error occurring in the detection of any symbol is given by Eq. (11.47), which when we substitute the expressions for z and $\sigma$ (given in Eqs. (11.46) and (11.44)) yields

$$P_e = \frac{1}{2}\operatorname{erfc}\left(\frac{1}{2}\sqrt{\frac{E}{N_o}}\right) = Q\left(\sqrt{\frac{E}{2N_o}}\right) \tag{11.48}$$

To reiterate, Eq. (11.48) gives the symbol error ratio (SER) in a binary transmission system where (i) bandlimited white Gaussian noise of power per unit bandwidth $N_o$ is the only source of degradation and (ii) the two transmitted states are separated by a distance $\sqrt{E}$ in signal space. We show in Section 11.6.1 that Eq. (11.48) is directly applicable to the following digital transmission systems:
● Unipolar baseband (UBB) systems, e.g. all variants of the non-return-to-zero (NRZ) line code in Figure 10.23 (except bipolar) and the RZ code.
● Bipolar baseband (BBB) systems, e.g. bipolar NRZ and Manchester codes.
● All binary modulated systems, including ASK, PSK, and FSK.
We will also show with a specific example of 4-PSK how Eq. (11.48) may be applied to obtain the BER of M-ary systems. In the above basic analysis, the transmitted states were located along a single axis in a signal space. In Section 11.6.2 we extend the analysis to obtain the BER of all binary transmission systems in terms of the received average energy per bit $E_b$. All our results are expressed exclusively in terms of erfc. If required, these may be converted into equivalent Q-function expressions using the relation

$$\operatorname{erfc}(x) = 2Q(\sqrt{2}\,x) \tag{11.49}$$

The following two characteristics of the complementary error function have implications on the interplay between the carrier-to-noise ratio (C/N) at the receiver's detection point and the BER of the transmission system:
● erfc(x) decreases monotonically as x increases.
● The rate of decrease of erfc(x) is higher at large x.
The first characteristic means that we can always improve (i.e. reduce) BER by increasing C/N, whereas the second implies that the improvement in BER per dB increase in C/N is larger at higher values of C/N.
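Equation (11.48) and the relation in Eq. (11.49) can be checked numerically. The sketch below evaluates both forms of Eq. (11.48) and compares them with a simple Monte Carlo estimate. The values of `E` and `No`, the trial count, the random seed, and the function names are assumptions for illustration; the simulation also assumes, consistent with Eqs. (11.44) and (11.45), that the in-phase noise displacement is zero-mean Gaussian with variance No/2.

```python
import math
import numpy as np


def q_function(x):
    """Gaussian tail probability Q(x), written in terms of erfc."""
    return 0.5 * math.erfc(x / math.sqrt(2))


def symbol_error_probability(E, No):
    """Eq. (11.48): two states separated by sqrt(E) in signal space."""
    return 0.5 * math.erfc(0.5 * math.sqrt(E / No))


E, No = 8.0, 1.0                       # assumed values, E/No of about 9 dB
pe_erfc = symbol_error_probability(E, No)
pe_q = q_function(math.sqrt(E / (2 * No)))

# Monte Carlo estimate: S0 sits a distance d = sqrt(E)/2 from the decision
# boundary, and an error occurs when the in-phase displacement exceeds d.
rng = np.random.default_rng(1)
trials = 200_000
d = math.sqrt(E) / 2
displacement = rng.normal(0.0, math.sqrt(No / 2), size=trials)
pe_sim = float(np.mean(displacement > d))

print(f"erfc form : {pe_erfc:.5f}")
print(f"Q form    : {pe_q:.5f}")
print(f"simulated : {pe_sim:.5f}")
```

The simulated value fluctuates from run to run with the seed and trial count, so only approximate agreement with the analytical value should be expected.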
11.6.1 Special Cases

Worked Example 11.3

Apply Eq. (11.48) to obtain expressions for the BER of the following systems in terms of the average energy per bit $E_b$ and the noise power per unit bandwidth $N_o$:
(a) UBB transmission and ASK
(b) BBB transmission and PSK
(c) FSK
(d) 4-PSK.
Symbol error ratio (SER) and BER are identical in binary systems, since each transmitted symbol conveys one bit. Eq. (11.48) will therefore give the desired expression when E is expressed in terms of $E_b$. In the following we sketch the signal-space diagram of each system with their two states $S_0$ and $S_1$ separated by distance $\sqrt{E}$, write out the implied relationship between E and $E_b$, and substitute this relationship into Eq. (11.48) to obtain BER.

(a) Unipolar baseband (UBB) transmission and ASK have the 1D signal space shown in Figure 11.11a. Binary ASK is always implemented as on–off keying (OOK), so state $S_0$ representing bit 0 has zero energy, which means that there is no signalling during the symbol interval $T_s$. State $S_1$, representing bit 1, has energy $E_s$. Clearly, $E_s = E$ and therefore Eq. (11.48) yields

$$\text{BER} = \frac{1}{2}\operatorname{erfc}\left(\frac{1}{2}\sqrt{\frac{E_s}{N_o}}\right)$$

$E_s$ is the peak symbol energy, which is contained in state $S_1$, whereas state $S_0$ contains zero energy. If both states are equally likely then the average energy per symbol is $E_{sav} = E_s/2$. In fact, $E_{sav}$ is also the average energy per bit $E_b$ since each symbol conveys one bit. Thus, substituting $2E_b$ for $E_s$ in the above expression yields the desired formula

$$\text{BER} = \frac{1}{2}\operatorname{erfc}\left(\sqrt{\frac{E_b}{2N_o}}\right) \qquad \text{(UBB and 2-ASK)} \tag{11.50}$$

(b) Figure 11.11b shows the signal-space diagram of bipolar baseband (BBB) and (binary) PSK systems. $S_0$ and $S_1$ have equal energy $E_s$, but opposite polarity. Clearly, $\sqrt{E} = 2\sqrt{E_s}$, and in this case $E_s$ is also the average energy per bit $E_b$. Thus, Eq. (11.48) becomes

$$\text{BER} = \frac{1}{2}\operatorname{erfc}\left(\sqrt{\frac{E_b}{N_o}}\right) \qquad \text{(BBB and 2-PSK)} \tag{11.51}$$

(c) The signal-space diagram of a (binary) FSK system is 2D, as shown in Figure 11.11c. The two states $S_0$ and $S_1$ have equal energy $E_s$ (which is also the average energy per bit $E_b$), but are transmitted at different frequencies represented by the orthogonal basis functions $\alpha_0'(t)$ and $\alpha_1'(t)$. These states are shown separated by distance $\sqrt{E}$ as required for using Eq. (11.48).
Applying Pythagoras's rule to the diagram, we see that $E = 2E_s\ (= 2E_b)$. Eq. (11.48) therefore yields

$$\text{BER} = \frac{1}{2}\operatorname{erfc}\left(\sqrt{\frac{E_b}{2N_o}}\right) \qquad \text{(FSK)} \tag{11.52}$$

Figure 11.11 Signal-space diagrams for Worked Example 11.3. (a) unipolar baseband systems and ASK (usually OOK); (b) bipolar baseband and PSK; (c) FSK.

(d) 4-PSK, also referred to as quadriphase shift keying (QPSK), has a 2D signal-space diagram, which was sketched in Figure 11.7b, but is repeated in Figure 11.12a using a labelling that is appropriate to the following discussion. There are four states, each with energy $E_s$. By Pythagoras's rule, the distance between $S_0$ and $S_1$ is $\sqrt{2E_s}$, and so is the distance between $S_0$ and $S_2$. The separation between $S_0$ and $S_3$ is obviously $2\sqrt{E_s}$. Let us determine the probability $P_{e0}$ that there is an error given that $S_0$ was transmitted. Clearly, an error will occur if $S_0$ is sent whereas the received point R lies in quadrant 2, 3, or 4 (the shaded region in Figure 11.12b). In order to apply Eq. (11.48), we take two states at a time as shown in Figure 11.12c. Consider c(i). All points in the shaded region are nearer to $S_1$ than $S_0$, and the receiver will therefore decide in favour of $S_1$ whenever the received point lies in this region. So, an error occurs if $S_0$ is sent but the received state lies in the shaded area. The probability of this error, denoted $\Pr(S_1 \mid S_0)$ and read 'probability $S_1$ received given $S_0$ sent', is given by Eq. (11.48) with $\sqrt{E} = \sqrt{2E_s}$

$$\Pr(S_1 \mid S_0) = \frac{1}{2}\operatorname{erfc}\left(\sqrt{\frac{E_s}{2N_o}}\right)$$

Similarly, in c(ii) and c(iii)

$$\begin{aligned}
\Pr(S_2 \mid S_0) &= \frac{1}{2}\operatorname{erfc}\left(\sqrt{\frac{E_s}{2N_o}}\right) \\
\Pr(S_3 \mid S_0) &= \frac{1}{2}\operatorname{erfc}\left(\sqrt{\frac{E_s}{N_o}}\right)
\end{aligned}$$
Figure 11.12 QPSK: (a) signal space diagram; (b) error occurs if $S_0$ is sent but the received signal lies in shaded region; (c) taking two states at a time, error occurs if the received signal falls in the shaded region.
Observe that the shaded area of Figure 11.12b is given by the sum of the shaded areas in Figure 11.12c(i) and (ii) less half the shaded area in c(iii), to correct for quadrant 3 being included twice in the summation. Therefore, the probability $P_{e0}$ that the received state lies in the shaded region of Figure 11.12b is given by

$$\begin{aligned}
P_{e0} &= \Pr(S_1 \mid S_0) + \Pr(S_2 \mid S_0) - \frac{1}{2}\Pr(S_3 \mid S_0) \\
&= \operatorname{erfc}\left(\sqrt{\frac{E_s}{2N_o}}\right) - \frac{1}{4}\operatorname{erfc}\left(\sqrt{\frac{E_s}{N_o}}\right) \\
&\simeq \operatorname{erfc}\left(\sqrt{\frac{E_s}{2N_o}}\right)
\end{aligned}$$

where the approximation holds because the second term in the previous line is negligible compared to the first term for practical values of $E_s/N_o$. For example, the ratio between the two terms is 25 at $E_s/N_o = 5$ dB, increasing rapidly to 809 at $E_s/N_o = 10$ dB. An important implication of this observation is that when $E_s/N_o$ is large then $\Pr(S_3 \mid S_0)$ is small compared to $\Pr(S_1 \mid S_0)$ and $\Pr(S_2 \mid S_0)$. In other words, it can be assumed that errors involve the mistaking of one symbol for its nearest neighbours only. From the symmetry of the signal-space diagram, the probability of error in any of the other symbols is the same as obtained above for $S_0$. Thus, the probability of error in any symbol is

$$P_{es} \equiv \text{SER} = \operatorname{erfc}\left(\sqrt{\frac{E_s}{2N_o}}\right) = \operatorname{erfc}\left(\sqrt{\frac{E_b}{N_o}}\right) \tag{11.53}$$

since each symbol conveys two bits, so that $E_s = 2E_b$. Finally, to obtain BER, we observe that, in M-ary transmission with Gray coding, neighbouring states differ in only one bit position. An error in one symbol, which represents $\log_2 M$ bits, gives rise to one bit error. Thus

$$\text{BER} = \frac{\text{SER}}{\log_2 M} \tag{11.54}$$

In this case, with M = 4 and SER given by Eq. (11.53), we obtain

$$\text{BER} = \frac{1}{2}\operatorname{erfc}\left(\sqrt{\frac{E_b}{N_o}}\right) \qquad \text{(QPSK)} \tag{11.55}$$

Figure 11.13 Bit error ratio (BER) versus $E_b/N_o$ (in dB) of selected digital transmission systems. (BBB = bipolar baseband; UBB = unipolar baseband.)
From Worked Example 11.3, we summarise the following important results, which are also plotted in Figure 11.13 with $E_b/N_o$ expressed in dB. An important word of caution is in order here: before using any of the formulas for BER presented in this chapter, the quantity $E_b/N_o$ must first be computed as the (non-dB) ratio between $E_b$ in joules (≡ watt-second) and $N_o$ in watt/hertz (≡ watt-second).

$$\text{BER} = \begin{cases} \dfrac{1}{2}\operatorname{erfc}\left(\sqrt{\dfrac{E_b}{2N_o}}\right), & \text{(2-ASK, 2-FSK \& UBB)} \\[2ex] \dfrac{1}{2}\operatorname{erfc}\left(\sqrt{\dfrac{E_b}{N_o}}\right), & \text{(2-PSK, QPSK \& BBB)} \end{cases} \tag{11.56}$$
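The caution about converting $E_b/N_o$ from dB to a plain ratio before applying Eq. (11.56) can be made concrete with a small helper. This is a sketch under stated assumptions: the function name `ber_from_ebno_db` and the scheme labels are hypothetical, chosen to mirror the groupings in Eq. (11.56).

```python
import math


def ber_from_ebno_db(ebno_db, scheme):
    """Evaluate Eq. (11.56) for Eb/No given in dB.

    The dB value is first converted to a plain (non-dB) ratio.
    """
    ebno = 10 ** (ebno_db / 10)
    if scheme in ("2-ASK", "2-FSK", "UBB"):
        return 0.5 * math.erfc(math.sqrt(ebno / 2))
    if scheme in ("2-PSK", "QPSK", "BBB"):
        return 0.5 * math.erfc(math.sqrt(ebno))
    raise ValueError(f"unknown scheme: {scheme}")


for ebno_db in (2, 5, 12, 15):
    ask = ber_from_ebno_db(ebno_db, "2-ASK")
    psk = ber_from_ebno_db(ebno_db, "2-PSK")
    print(f"Eb/No = {ebno_db:2d} dB: 2-ASK/2-FSK/UBB {ask:.2e}, 2-PSK/QPSK/BBB {psk:.2e}")
```

Because the two expressions differ only by a factor of two inside the square root, the first group needs twice the $E_b/N_o$ (about 3 dB more) to match the BER of the second.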
11.6.2 Arbitrary Binary Transmission

Equation (11.48) may be expressed in a very useful form that gives the BER of any type of binary transmission system directly in terms of the average energy per bit $E_b$. To do so, consider the general signal-space diagram shown in Figure 11.14. This diagram applies to all binary transmission systems with an appropriate choice of coefficients ($s_{00}$, $s_{01}$, $s_{10}$, $s_{11}$) and basis functions $\alpha_0(t)$ and $\alpha_1(t)$. We also know that the BER is given by Eq. (11.48) with $\sqrt{E}$ the distance between states $S_0$ and $S_1$, and that this system employs two symbols

$$\begin{aligned}
g_0(t) &= s_{00}\alpha_0(t) + s_{01}\alpha_1(t), \qquad \text{Binary 0} \\
g_1(t) &= s_{10}\alpha_0(t) + s_{11}\alpha_1(t), \qquad \text{Binary 1}
\end{aligned}$$

Figure 11.14 Signal-space diagram of an arbitrary binary transmission system.

The respective energies of the symbols are

$$\begin{aligned}
E_0 &= s_{00}^2 + s_{01}^2 \\
E_1 &= s_{10}^2 + s_{11}^2
\end{aligned} \tag{11.57}$$

The average energy per bit $E_b$ is

$$E_b = \frac{E_0 + E_1}{2} \tag{11.58}$$

The correlation coefficient of the two symbols follows from Eq. (11.24)

$$\begin{aligned}
\rho &= \frac{1}{E_b}\int_0^{T_s} g_0(t)g_1(t)\,dt \\
&= \frac{1}{E_b}\int_0^{T_s} [s_{00}\alpha_0(t) + s_{01}\alpha_1(t)][s_{10}\alpha_0(t) + s_{11}\alpha_1(t)]\,dt \\
&= \frac{1}{E_b}\left\{ s_{00}s_{10}\int_0^{T_s}\alpha_0^2(t)\,dt + s_{01}s_{11}\int_0^{T_s}\alpha_1^2(t)\,dt \right\} \\
&= \frac{s_{00}s_{10} + s_{01}s_{11}}{E_b}
\end{aligned} \tag{11.59}$$

In the above, we obtain the third line by ignoring the integrals involving the product $\alpha_0(t)\alpha_1(t)$ since they evaluate to zero (in view of the orthogonality property), and the last line by noting that $\alpha_0(t)$ and $\alpha_1(t)$ are unit-energy basis functions. Finally, applying Pythagoras's rule in Figure 11.14 allows us to express the energy E in terms of $E_b$ and $\rho$ as follows

$$\begin{aligned}
(\sqrt{E})^2 &= (s_{01} - s_{11})^2 + (s_{10} - s_{00})^2 \\
&= (s_{00}^2 + s_{01}^2) + (s_{10}^2 + s_{11}^2) - 2(s_{00}s_{10} + s_{01}s_{11})
\end{aligned}$$
Replacing each term on the right-hand side with its equivalent from Eqs. (11.57–11.59) yields the following important relation

$$E = E_0 + E_1 - 2\rho E_b = 2E_b - 2\rho E_b = 2E_b(1 - \rho) \tag{11.60}$$
Substituting this relation in Eq. (11.48) gives the BER of any binary transmission system (assumed to have binary symmetry)

$$\text{BER} = \frac{1}{2}\operatorname{erfc}\left(\sqrt{\frac{E_b(1 - \rho)}{2N_o}}\right) \tag{11.61}$$

It is worth emphasising that Eq. (11.61) applies to all binary symmetric transmission systems, whether modulated or baseband. A few special cases will help to demonstrate the utility of this important equation.
● Identical symbols: if $g_0(t) = g_1(t)$, then $\rho = 1$, and Eq. (11.61) gives BER = 0.5erfc(0) = 0.5. It would be ridiculous to use the same symbol to convey both binary 1 and 0. The BER is the same as would be obtained by basing each decision entirely on the result of flipping a fair coin. The receiver does not gain any information from detecting the incoming symbols and should not even bother.
● PSK and BBB: two antipodal symbols are used. With $g_0(t) = -g_1(t)$, we obtain $\rho = -1$. Equation (11.61) then reduces to Eq. (11.56).
● FSK: two orthogonal symbols are used, giving $\rho = 0$. Equation (11.61) then reduces to (11.56).
● ASK and UBB: two symbols are used that differ only in their amplitudes $A_0$ and $A_1$, which are of course positive numbers. You may wish to verify that in this case $\rho \ge 0$. Specifically

$$\rho = \frac{2A_0A_1}{A_0^2 + A_1^2} \tag{11.62}$$

● We see from Eq. (11.61) that, for a given $E_b$, the lowest BER is obtained when $A_0 = 0$, giving $\rho = 0$. For all other values of $A_0$, the correlation coefficient $\rho$ has a positive value between 0 and unity. This reduces the argument of the complementary error function and leads to a larger BER. Setting $A_0 = 0$ gives what is known as on–off keying (OOK). It is therefore clear that OOK gives ASK its best (i.e. lowest) possible BER. Assigning nonzero values to both $A_0$ and $A_1$ always results in a higher BER compared to an OOK of the same energy per bit. Note that setting $A_0$ and $A_1$ to the same nonzero value yields $\rho = 1$, and BER = 0.5. This, and the case of $A_0 = A_1 = 0$, corresponds to the identical-symbol system discussed above.
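The special cases listed above can be reproduced from the signal vectors alone using Eqs. (11.57) to (11.61). The following sketch is illustrative: the function names, the example signal vectors, and the chosen $E_b/N_o$ value are assumptions, not values from the text.

```python
import math


def correlation_coefficient(s0, s1):
    """Eq. (11.59): rho from signal vectors (s00, s01) and (s10, s11)."""
    e0 = s0[0] ** 2 + s0[1] ** 2           # Eq. (11.57)
    e1 = s1[0] ** 2 + s1[1] ** 2
    eb = (e0 + e1) / 2                     # Eq. (11.58)
    return (s0[0] * s1[0] + s0[1] * s1[1]) / eb


def ber_binary(ebno, rho):
    """Eq. (11.61), with ebno given as a plain (non-dB) ratio."""
    return 0.5 * math.erfc(math.sqrt(ebno * (1 - rho) / 2))


# Hypothetical signal vectors for each special case
cases = {
    "identical symbols": ((1.0, 0.0), (1.0, 0.0)),
    "PSK / BBB (antipodal)": ((-1.0, 0.0), (1.0, 0.0)),
    "FSK (orthogonal)": ((1.0, 0.0), (0.0, 1.0)),
    "OOK (A0 = 0)": ((0.0, 0.0), (1.0, 0.0)),
    "ASK with A0 = 1, A1 = 2": ((1.0, 0.0), (2.0, 0.0)),
}

ebno = 10 ** (10 / 10)   # assumed Eb/No of 10 dB
for name, (s0, s1) in cases.items():
    rho = correlation_coefficient(s0, s1)
    print(f"{name:26s} rho = {rho:+.2f}, BER = {ber_binary(ebno, rho):.3e}")
```

The last case gives $\rho = 0.8$, in agreement with Eq. (11.62), and a higher BER than OOK at the same $E_b/N_o$.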
From the foregoing we have a very clear picture of the BER performance of the three types of binary modulation. We see that PSK gives a lower BER than either ASK or FSK for a given received signal power, which is measured in terms of the average energy per bit $E_b$. To achieve the same error ratios in the three systems, twice as much symbol energy (i.e. 3 dB increase) is required in ASK and FSK. The increase is more worthwhile at higher values of $E_b/N_o$. For example, when $E_b/N_o$ is 12 dB then a 3 dB increase in $E_b$ improves the BER of ASK and FSK dramatically from $3.4 \times 10^{-5}$ to $9 \times 10^{-9}$. On the other hand, at $E_b/N_o = 2$ dB, a similar increase in $E_b$ only yields a modest improvement in BER, from 0.1 to 0.04.

An important clarification is necessary about the BER of QPSK when compared to binary phase shift keying (BPSK). For the same symbol rate and hence bandwidth, QPSK allows transmission at twice the bit rate of BPSK. However, Figure 11.13 shows that QPSK and BPSK have the same BER at the same $E_b/N_o$ and this could be erroneously interpreted to mean that QPSK has somehow achieved a doubling of bandwidth efficiency (over BPSK) at the same signal quality without requiring additional signal power. But $E_b$ is energy per bit and not a direct measure of signal power. If $N_o$ is equal in both systems then a direct comparison of their signal power is given by the difference between the quantity $C/N_o$ (expressed in dB) in each system. This ratio between carrier power and $N_o$ may be expressed in terms of $E_b/N_o$ as

$$\frac{C}{N_o} = \frac{E_b}{N_o}R_b \tag{11.63}$$
where we have made use of Eq. (11.19) for signal power in terms of $E_b$ and bit rate $R_b$. Therefore, to have the same $E_b/N_o$ as BPSK (as in Figure 11.13), the QPSK signal power must be a factor of two (i.e. 3 dB) higher than that of BPSK since its bit rate is twice that of BPSK.

Worked Example 11.4

A binary PSK system transmits at 140 Mbit/s. The noise power per unit bandwidth at the detection point of the receiver is $5 \times 10^{-21}$ W/Hz and the received signal power is −82 dBm at the same point.
(a) Determine the BER.
(b) Show how BER may be improved to $1 \times 10^{-8}$ if the modulation technique and noise level remain unchanged.

(a) To determine BER we need the average energy per bit $E_b$, which equals the product of received power and bit duration. The received power P (in watts) is

$$P = [10^{(-82/10)}] \times 10^{-3} = 6.31 \times 10^{-12}\ \text{W} = 6.31\ \text{pW}$$

The duration of one bit is given by

$$T_s = \frac{1}{\text{Bit Rate}} = \frac{1}{140 \times 10^6} = 7.143 \times 10^{-9}\ \text{s}$$

Therefore, energy per bit is

$$E_b = PT_s = 4.507 \times 10^{-20}\ \text{J}$$

With $N_o = 5 \times 10^{-21}$, the ratio $E_b/N_o = 9.0137$ or 9.55 dB. There are now several options for obtaining the BER. From the PSK curve in Figure 11.13, we see that at $E_b/N_o = 9.55$ dB, the BER is $1.1 \times 10^{-5}$. Alternatively, using Eq. (11.56), with $E_b/N_o = 9.0137$ we obtain

$$\text{BER} = 0.5\operatorname{erfc}(\sqrt{9.0137}) = 0.5\operatorname{erfc}(3)$$

Now we may read the value $\operatorname{erfc}(3) = 2.2 \times 10^{-5}$ from the tables in Appendix C, or directly calculate it using the formula provided in Eq. (3.39). Whichever way, the result is BER $= 1.1 \times 10^{-5}$.

(b) Figure 11.13 shows that in binary PSK a BER of $1 \times 10^{-8}$ requires $E_b/N_o = 12$ dB. Therefore, with $N_o$ unchanged, we must increase $E_b$ by 2.45 dB (i.e. 12 − 9.55) or a factor of 1.758 to achieve this lower BER. Since $E_b = PT_s$, it means that $E_b$ may be increased either by increasing the transmitted power (and hence received power P) or increasing the symbol duration $T_s$. Note that increasing $T_s$ is equivalent to reducing bit rate by the same factor. Therefore, we may maintain the bit rate at 140 Mbit/s but raise the transmitted power to give a received power level of 6.31 × 1.758 = 11.1 pW. Alternatively, we maintain the previous power level but reduce the bit rate to 140/1.758 = 79.64 Mbit/s.

The transmitted power level is often restricted in order to minimise interference to other systems, or to reduce radiation hazards to users of handheld transmitters, or to prolong battery life in portable systems. However, with $E_b$ determined by both transmitted power level and bit rate, appropriate values of these two parameters may often be found to match the noisiness of the transmission system and achieve a desired BER.
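The arithmetic of Worked Example 11.4 can be reproduced with a few lines of code. This is a sketch rather than part of the original solution: the variable names are arbitrary, and the 12 dB target for part (b) is the value read from Figure 11.13 in the text, taken here as given.

```python
import math

# Given data from Worked Example 11.4
bit_rate = 140e6          # bit/s
No = 5e-21                # noise power per unit bandwidth, W/Hz
p_dbm = -82.0             # received signal power, dBm

# (a) BER of binary PSK, Eq. (11.56)
p_watts = 10 ** (p_dbm / 10) * 1e-3      # dBm -> W
Tb = 1 / bit_rate                        # bit duration, s
Eb = p_watts * Tb                        # energy per bit, J
ebno = Eb / No                           # plain ratio, not dB
ebno_db = 10 * math.log10(ebno)
ber = 0.5 * math.erfc(math.sqrt(ebno))

print(f"P     = {p_watts:.3e} W")
print(f"Eb    = {Eb:.3e} J")
print(f"Eb/No = {ebno:.4f} ({ebno_db:.2f} dB)")
print(f"BER   = {ber:.2e}")

# (b) Required increase in Eb for the target Eb/No (value read from Figure 11.13)
target_ebno_db = 12.0
factor = 10 ** ((target_ebno_db - ebno_db) / 10)
new_power_pw = p_watts * factor * 1e12   # option 1: raise received power
new_rate_mbps = bit_rate / factor / 1e6  # option 2: lower the bit rate

print(f"factor        = {factor:.3f}")
print(f"new power     = {new_power_pw:.1f} pW at 140 Mbit/s")
print(f"new bit rate  = {new_rate_mbps:.1f} Mbit/s at the original power")
```

Small differences from the figures quoted in the text can arise from rounding of intermediate values.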
11.7 Binary Modulation
In binary modulation one symbol is transmitted to represent each bit in the information-bearing bit stream. There are therefore two distinct symbols, namely g0(t) representing binary 0 and g1(t) representing binary 1. Binary modulation is the simplest special case of our discussion in previous sections with
$$\begin{aligned}
\text{Number of states in signal space } M &= 2\\
\text{Symbol duration } T_s &= \text{Bit duration } T_b\\
\text{Symbol rate } R_s &= \text{Bit rate } R_b\\
\text{Symbol energy } E_s &= \text{Bit energy } E_b
\end{aligned} \tag{11.64}$$
Binary modulation has already been discussed to some extent in this chapter, especially in the worked examples. This section is devoted to a brief discussion of the generation and bandwidth of the three types of binary modulated signals, namely ASK, FSK, and PSK. Each generator in the following discussion involves a product modulator, which is represented as a multiplier. A detailed description of the operation of a product modulator is presented in Section 7.7.1.2.
11.7.1 ASK
An ASK signal g(t) may be generated using the circuit shown in block diagram form in Figure 11.15a. The bit stream is first represented as a unipolar non-return-to-zero (UNRZ) waveform. At the output of the UNRZ coder, binary 1 is represented by a pulse of height +1 (normalised) and duration spanning the entire bit interval Tb, and binary 0 by the absence of a pulse in the bit interval. The pulse shape is shown as rectangular but may be shaped (using, for example, the root raised cosine filter (defined in Eq. (4.173) and further discussed in Chapter 12)) in order to reduce the frequency components outside the main lobe of Figure 11.3. The UNRZ waveform and a sinusoidal carrier signal Ac cos(2πfct) (which is a constant multiple K of the basis function α0(t) of the system) are applied to a product modulator. The resulting output is the desired ASK signal. This consists of a sinusoidal pulse during the bit interval for a binary 1 and no pulse for a binary 0
$$v_{ask}(t) = \begin{cases} g_1(t), & \text{Binary 1}\\ 0, & \text{Binary 0} \end{cases}$$
$$g_1(t) = \begin{cases} A_c\cos(2\pi f_c t), & 0 \le t \le T_b\\ 0, & \text{elsewhere} \end{cases}$$
$$f_c = n/T_b = nR_b, \quad n = 1, 2, 3, \cdots \tag{11.65}$$
The frequency fc of the sinusoidal symbol is an integer multiple of the bit rate Rb, and the amplitude Ac has a value that gives the required average energy per bit Eb. If binary 1 and 0 are equally likely in the input bit stream, it follows from Eq. (11.3) that
$$E_b = \frac{1}{2}\left(\frac{A_c^2 T_b}{2} + 0\right) = \frac{A_c^2 T_b}{4}$$
or
$$A_c = 2\sqrt{E_b/T_b} \tag{11.66}$$
To examine the spectrum of ASK, let us rewrite vask(t) in the form
$$v_{ask}(t) = \frac{A_c}{2}(1 \pm m)\cos(2\pi f_c t), \quad m = 1 \tag{11.67}$$
where the positive sign holds during an interval of binary 1 and the negative sign during binary 0. In this form, we see that ASK is a double sideband transmitted carrier amplitude modulation signal, with modulation factor m = 1. A little thought and, if need be, reference to Chapter 7 will show that in this view, the modulating signal vm(t) is a bipolar NRZ waveform of value Ac/2 during binary 1, and −Ac/2 during binary 0, and the unmodulated carrier amplitude is Ac/2. The spectrum of the ASK signal then consists of an upper sideband, a lower sideband (LSB), and an impulse (of weight Ac/2) at the carrier frequency, as shown in Figure 11.15b for a symbolic spectrum of vm(t). The bandwidth of ASK is therefore twice the bandwidth of the baseband bipolar NRZ waveform.

We have assumed that the input bit stream is completely random, 1’s and 0’s occurring with equal likelihood. Under this condition the power spectral density SB(f) of the bipolar NRZ waveform equals the square of the amplitude spectrum of the constituent pulse. Thus, assuming a rectangular pulse shape
$$S_B(f) = \text{sinc}^2(fT_b), \quad \text{normalised} \tag{11.68}$$
Figure 11.15c shows a plot of SB(f), which decreases rapidly as the inverse square of frequency, and has a null bandwidth equal to the bit rate Rb (= 1/Tb). Setting the baseband signal bandwidth equal to this null bandwidth, it follows that the bandwidth of ASK is given by
$$B_{ask} = 2R_b \tag{11.69}$$

Figure 11.15 ASK: (a) modulator; (b) single-sided amplitude spectrum |Vask(f)|, based on a triangular shape for the spectrum |Vunrz(f)| of a unipolar NRZ waveform; (c) single-sided power spectral density (PSD), assuming a random bit stream.
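The product-modulator view of ASK and the amplitude relation of Eq. (11.66) can be illustrated numerically. The sketch below is not taken from the text: the function name `ask_waveform`, the normalised values Tb = 1 and Eb = 1, the choice fc = 4/Tb, and the short bit pattern are all assumptions made for illustration.

```python
import math

def ask_waveform(bits, ac, fc, tb, samples_per_bit=1000):
    """UNRZ level (1 or 0) multiplied by the carrier Ac*cos(2*pi*fc*t)."""
    dt = tb / samples_per_bit
    wave = []
    for k, bit in enumerate(bits):
        for i in range(samples_per_bit):
            t = (k * samples_per_bit + i + 0.5) * dt   # midpoint of each sample interval
            wave.append(bit * ac * math.cos(2 * math.pi * fc * t))
    return wave, dt

tb = 1.0                          # bit duration (normalised, assumed)
eb = 1.0                          # target average energy per bit (assumed)
ac = 2 * math.sqrt(eb / tb)       # Eq. (11.66)
bits = [1, 0, 1, 1, 0, 0, 1, 0]   # equal numbers of 1s and 0s
wave, dt = ask_waveform(bits, ac, fc=4 / tb, tb=tb)

# Average energy per bit: integrate v^2 over the whole sequence, divide by the bit count
avg_energy_per_bit = sum(v * v for v in wave) * dt / len(bits)
print(avg_energy_per_bit)

# Null-to-null bandwidth from Eq. (11.69)
b_ask = 2 * (1 / tb)
```

With equally many 1s and 0s, the measured average energy per bit should come out at the target Eb, which is the content of Eq. (11.66).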
11.7.2 PSK
A block diagram for the generation of PSK is shown in Figure 11.16. The bit stream is first coded as a bipolar non-return-to-zero (BNRZ) waveform. At the output of the BNRZ coder, binary 1 is represented by a pulse of height +1 and duration spanning the entire bit interval Tb, and binary 0 is represented by a similar pulse but of opposite polarity. Pulse shaping may be included prior to modulation by following the coder with a suitable filter or making the filter an integral part of the coder. The BNRZ waveform is applied to a product modulator along with a sinusoidal signal Kα0(t), a constant multiple of the basis function of the system. The resulting output is the desired PSK signal. This consists of two distinct sinusoidal pulses of duration Tb that have the same frequency fc and amplitude Ac but differ in phase by 180°. The generation process shown in Figure 11.16a leads to a sinusoidal pulse with 0° phase during intervals of binary 1, and an opposite polarity pulse (i.e. 180° phase) during intervals of binary 0. That is
$$v_{psk}(t) = \begin{cases} g_1(t), & \text{Binary 1}\\ -g_1(t), & \text{Binary 0} \end{cases}$$
$$g_1(t) = \begin{cases} A_c\cos(2\pi f_c t), & 0 \le t \le T_b\\ 0, & \text{elsewhere} \end{cases}$$
$$f_c = n/T_b = nR_b, \quad n = 1, 2, 3, \cdots \tag{11.70}$$

Figure 11.16 PSK: (a) modulator; (b) amplitude spectrum |Vpsk(f)|, based on a triangular shape for the spectrum |Vbnrz(f)| of a BNRZ waveform; (c) single-sided PSD, assuming a random bit stream.
Both transmitted symbols have the same energy. In this case the energy per bit Eb is given by
$$E_b = A_c^2 T_b/2$$
or
$$A_c = \sqrt{2E_b/T_b} \tag{11.71}$$
We may obtain the spectrum of PSK by noting that Figure 11.16a represents a double sideband suppressed carrier amplitude modulation. PSK will therefore have a spectrum like that of ASK, except that there is no impulse at the frequency point fc, which reflects the absence or suppression of the carrier. Figure 11.16b and c show, respectively, a representative amplitude spectrum of PSK and the power spectral density of a PSK signal when the input bit stream is completely random. Clearly, PSK has the same bandwidth as ASK, which is twice the bandwidth of the baseband bipolar waveform
$$B_{psk} = 2R_b \tag{11.72}$$
11.7.3 FSK
11.7.3.1 Generation
In FSK, two orthogonal sinusoidal symbols are employed, one of frequency f1 to represent binary 1 and the other of frequency f0 to represent binary 0. FSK can therefore be generated by combining (i.e. interleaving) two ASK signals as shown in Figure 11.17a. The bit stream is first represented using a UNRZ waveform. The output of the UNRZ coder is fed directly into the top product modulator, which is also supplied with a sinusoidal signal of frequency f1. The output of the top modulator is therefore an ASK signal that has a sinusoidal pulse of frequency f1 during intervals of binary 1, and no pulse during intervals of binary 0. The UNRZ coder output is also fed into the lower product modulator but is first passed through an inverter. The lower modulator is supplied with another sinusoidal signal of frequency f0. The inverter produces a UNRZ waveform, which has a value +V during intervals of binary 0, and a value 0 for binary 1. As a result, the output of the lower modulator is another ASK signal, but one which contains a sinusoidal pulse of frequency f0 during intervals of binary 0 and no pulse during intervals of binary 1. It is easy to see that by combining the outputs of the two modulators in a summing device, we obtain a signal that contains a sinusoidal pulse of frequency f1 for binary 1, and another sinusoidal pulse of frequency f0 for binary 0. This is the desired FSK signal, and we may write
$$v_{fsk}(t) = \begin{cases} g_1(t), & \text{Binary 1}\\ g_0(t), & \text{Binary 0} \end{cases}$$
$$g_1(t) = \begin{cases} A_c\cos(2\pi f_1 t), & 0 \le t \le T_b\\ 0, & \text{elsewhere} \end{cases}$$
$$g_0(t) = \begin{cases} A_c\cos(2\pi f_0 t), & 0 \le t \le T_b\\ 0, & \text{elsewhere} \end{cases}$$
$$f_1 = n_1/T_b; \quad f_0 = n_0/T_b; \quad n_1 \ne n_0 = 1, 2, 3, \cdots \tag{11.73}$$

Figure 11.17 FSK: (a) modulator; (b) single-sided PSD, assuming a random bit stream.
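The two-modulator structure described above can be sketched directly: one ASK branch driven by the UNRZ waveform, one driven by its inverse, and a summer. The function name `fsk_waveform`, the normalised values (Tb = 1, Ac = 1) and the frequencies f1 = 4/Tb, f0 = 3/Tb are illustrative assumptions, not values from the text.

```python
import math

def fsk_waveform(bits, ac, f1, f0, tb, n=400):
    """Sum of two interleaved ASK signals, following the structure of Eq. (11.73)."""
    dt = tb / n
    wave = []
    for k, bit in enumerate(bits):
        for i in range(n):
            t = (k * n + i + 0.5) * dt
            upper = bit * ac * math.cos(2 * math.pi * f1 * t)        # ASK at f1 (UNRZ input)
            lower = (1 - bit) * ac * math.cos(2 * math.pi * f0 * t)  # ASK at f0 (inverted UNRZ)
            wave.append(upper + lower)
    return wave, dt

tb, ac = 1.0, 1.0
f1, f0 = 4 / tb, 3 / tb           # integer multiples of 1/Tb, with n1 != n0
wave, dt = fsk_waveform([1, 0, 1], ac, f1, f0, tb)
energy_per_bit = sum(v * v for v in wave) * dt / 3

# Inner product of g1(t) and g0(t) over one bit interval (zero if orthogonal)
m = 4000
cross = sum(
    ac * math.cos(2 * math.pi * f1 * (i + 0.5) * tb / m)
    * ac * math.cos(2 * math.pi * f0 * (i + 0.5) * tb / m)
    for i in range(m)
) * tb / m
print(energy_per_bit, cross)
```

Both symbols carry the same energy Ac²Tb/2, and the inner product vanishes for this choice of frequencies.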
It is important that the two transmitted symbols g1(t) and g0(t) are orthogonal. This requires that the sinusoidal signals supplied to the pair of modulators in Figure 11.17a should always have the same phase. There is an implicit assumption of this phase synchronisation in Figure 11.17a, where the sinusoidal signals both have the same phase (0°). Phase synchronisation coupled with the use of sinusoids whose frequencies are integer multiples of the bit rate ensures that there is phase continuity between symbols. The FSK is then described as continuous phase frequency shift keying (CPFSK). Note that both symbols have the same energy. The average energy per bit is the same as in PSK and is given by Eq. (11.71).

11.7.3.2 Spectrum
The view of FSK as two interleaved ASK signals leads logically to the power spectral density of FSK shown in Figure 11.17b. Each constituent ASK power spectrum has an impulse of weight Ac²/4 at its respective carrier frequency, as explained earlier. It follows from this power spectrum that the bandwidth of FSK is given by
$$B_{fsk} = (f_1 - f_0) + 2R_b \tag{11.74}$$
The bandwidth increases with the frequency spacing f1 − f0 between the two orthogonal symbols g1(t) and g0(t). In Eq. (11.73), f1 and f0 are expressed as integer multiples of 1/Tb. This means that they are selected from the set of orthogonal sinusoidal pulses discussed in Worked Example 11.2b. The minimum frequency spacing in this set is f1 − f0 = 1/Tb, which is obtained by setting n1 = n0 + 1 in Eq. (11.73).

11.7.3.3 Frequency Spacing and MSK
You may wonder whether it is possible to have two orthogonal sinusoidal pulses that are more closely spaced than 1/Tb. We can gain excellent insight into this issue by considering two unit-energy pulses g1(t) and g0(t) with a frequency separation Δf
$$g_0(t) = \begin{cases} \sqrt{2/T_b}\,\cos(2\pi f_0 t), & 0 \le t \le T_b\\ 0, & \text{elsewhere} \end{cases}$$
$$g_1(t) = \begin{cases} \sqrt{2/T_b}\,\cos[2\pi(f_0 + \Delta f)t], & 0 \le t \le T_b\\ 0, & \text{elsewhere} \end{cases}$$
From Eq. (11.24), the correlation coefficient of the two pulses is given by
$$\begin{aligned}
\rho &= \frac{2}{T_b}\int_0^{T_b} \cos(2\pi f_0 t)\cos[2\pi(f_0 + \Delta f)t]\,dt\\
&= \frac{1}{T_b}\int_0^{T_b} \cos(2\pi\Delta f t)\,dt + \frac{1}{T_b}\int_0^{T_b} \cos[2\pi(2f_0 + \Delta f)t]\,dt\\
&= \frac{\sin(2\pi\Delta f T_b)}{2\pi T_b \Delta f} + \frac{\sin[2\pi(2f_0 + \Delta f)T_b]}{2\pi T_b(2f_0 + \Delta f)}\\
&= \text{sinc}(2\Delta f T_b) + \text{sinc}[2(2f_0 + \Delta f)T_b]
\end{aligned} \tag{11.75}$$
Don’t worry much about the derivation of this equation but concentrate rather on the simplicity of its graphical presentation in Figure 11.18. There are several important observations based on this graph.
● The correlation coefficient ρ = 0 at integer multiples of bit rate Rb (= 1/Tb). That is, the two sinusoidal pulses of duration Tb are orthogonal when their frequencies differ by an integer multiple of Rb. This means that both pulses complete an integer number of cycles in each bit interval Tb, a finding that agrees with the result obtained in Worked Example 11.2b.
● The correlation coefficient is in fact zero at all integer multiples of half the bit rate, e.g. 1 × 0.5Rb, 2 × 0.5Rb, 3 × 0.5Rb, etc. We see that two sinusoidal pulses are orthogonal when they differ in frequency by only half the bit rate. This is the smallest frequency separation at which two sinusoidal pulses can be orthogonal and means that both pulses complete an integer number of half-cycles in the interval Tb. Below this minimum frequency spacing of Rb/2, the pulses become increasingly positively correlated. An FSK scheme that uses this minimum frequency separation as well as having continuous phase is given the special name minimum shift keying (MSK). It follows from Eq. (11.74) that the bandwidth of MSK is
$$B_{msk} = 2.5R_b \tag{11.76}$$
● ρ has a minimum value of −0.24 at Δf = 0.718Rb. In view of Eq. (11.61), this frequency separation gives FSK the largest possible immunity to noise. To appreciate this, recall that PSK has the lowest BER for a given Eb, because it uses symbols that have a correlation coefficient ρ = −1. And it follows from Eq. (11.61) that the ‘effective’ Eb is increased by 100%. In the case of FSK that employs two sinusoidal pulses separated in frequency by 0.718Rb, the effective Eb is increased by about 25% compared to that of orthogonal-symbol FSK. It is for this reason that FSK modems operate with a frequency separation that lies in the first region of negative correlation, between Rb/2 and Rb. A commonly used separation is two-thirds the bit rate.
● The specific value of minimum correlation coefficient ρmin and the frequency spacing Δf at which it occurs depend on the integer multiple of Rb that f0 assumes. Figure 11.18 is plotted with f0 = 3Rb and the values of ρmin = −0.24 at Δf = 0.718Rb quoted above are for f0 = 3Rb. However, ρmin will always occur at Δf between Rb/2 and Rb. That is, the above observations hold for other values of f0 with only a minor variation in the minimum value of ρ. At one extreme when f0 → ∞, the second term of Eq. (11.75) becomes negligible, and we have ρmin = −0.217 at Δf = 0.715Rb. At the other extreme, when f0 = Rb, we have ρmin = −0.275 at Δf = 0.721Rb.

Figure 11.18 Correlation coefficient ρ of two sinusoidal pulses of frequencies f0 and f0 + Δf, as a function of their frequency separation Δf. Here f0 = 3Rb.
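The observations in the list above can be reproduced by evaluating Eq. (11.75) directly. In this sketch the helper names `sinc` and `rho`, the normalisation Rb = 1 and the 0.001Rb search grid are assumptions for illustration; f0 = 3Rb matches the value used for Figure 11.18.

```python
import math

def sinc(x):
    """sinc(x) = sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def rho(delta_f, f0, tb=1.0):
    """Correlation coefficient of the two pulses, Eq. (11.75)."""
    return sinc(2 * delta_f * tb) + sinc(2 * (2 * f0 + delta_f) * tb)

rb = 1.0
f0 = 3 * rb

# Zeros at half the bit rate (MSK spacing) and at the bit rate
print(rho(0.5 * rb, f0), rho(1.0 * rb, f0))

# Search the first region of negative correlation, 0.5Rb to Rb, in steps of 0.001Rb
grid = [i / 1000 for i in range(500, 1001)]
df_min = min(grid, key=lambda df: rho(df, f0))
rho_min = rho(df_min, f0)
print(df_min, rho_min)
```

Changing `f0` in the sketch shows how the location and depth of the minimum shift slightly with the pulse frequency, as noted in the last bullet.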
11.7.4 Minimum Transmission Bandwidth
It is important to bear in mind the assumptions involved in obtaining the bandwidths of ASK, PSK, FSK, and MSK given by Eqs. (11.69), (11.72), (11.74), and (11.76):
● A completely random bit stream with equally likely 1’s and 0’s.
● A baseband waveform representation of the bit stream that is either rectangular or shaped using a realisable raised-cosine filter with roll-off factor α = 1.
These assumptions lead to a baseband waveform (which serves as the modulating signal) that has bandwidth B = Rb, and hence the above results for Bask, Bpsk, and Bfsk. The minimum transmission bandwidth requirement of these binary modulation systems may be specified using two alternative arguments that lead to the same result. First of all, if the rectangular pulses of the baseband waveform are shaped using an ideal Nyquist filter then (as is discussed further in Chapter 12) the baseband waveform will have a bandwidth B = Rb/2. Secondly, and for now, we can use a simpler argument in which we consider the fastest-changing bit sequence 101010… of the message signal. The corresponding baseband waveform is a periodic signal of period 2Tb, and fundamental frequency f0 = 1/(2Tb) = Rb/2. Figure 11.19 shows how this sequence is clearly identifiable from a waveform that contains only the fundamental frequency Rb/2. The minimum bandwidth required to convey this sequence so that it is detectable at the destination is therefore one that is large enough to pass this fundamental frequency. All other sequences will change more slowly and hence have a lower fundamental frequency. So, the minimum bandwidth of the modulating baseband waveform is Rb/2. Correspondingly, the minimum transmission bandwidth of a binary modulated system is given by
$$\begin{aligned}
B_{ask\,min} = B_{psk\,min} &= R_b\\
B_{fsk\,min} &= (f_1 - f_0) + R_b = \Delta f + R_b\\
B_{msk\,min} &= 1.5R_b
\end{aligned} \tag{11.77}$$

Figure 11.19 Bipolar waveform and fundamental frequency sinusoid (f0 = 1/(2Tb) = Rb/2) for the most rapidly changing bit sequence.
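The bandwidth results of this section are simple enough to collect in one helper. The sketch below is illustrative only: the function name `binary_bandwidths` and the example bit rate of 1 Mbit/s are assumptions, and the formulas are Eqs. (11.69), (11.72), (11.74), (11.76), and (11.77).

```python
def binary_bandwidths(rb, delta_f):
    """Return {scheme: (null bandwidth, minimum bandwidth)} in Hz.

    rb      : bit rate in bit/s
    delta_f : FSK frequency spacing f1 - f0 in Hz
    """
    return {
        "ASK": (2 * rb, rb),
        "PSK": (2 * rb, rb),
        "FSK": (delta_f + 2 * rb, delta_f + rb),
        "MSK": (2.5 * rb, 1.5 * rb),   # FSK with delta_f = rb / 2
    }

# Example (assumed values): 1 Mbit/s with orthogonal FSK spacing of Rb
table = binary_bandwidths(1e6, 1e6)
for scheme, (b_null, b_min) in table.items():
    print(f"{scheme}: {b_null / 1e6:.1f} MHz (Rb-wide baseband), {b_min / 1e6:.1f} MHz (minimum)")
```

Note that the MSK row is just the FSK row evaluated at a spacing of Rb/2.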
11.8 Coherent Binary Detection
Coherent demodulation is discussed at length in Chapter 7 as a technique for receiving analogue signals transmitted by amplitude modulation of a sinusoidal carrier. The matched filter, for the detection of rectangular or shaped pulses in digital baseband systems and sinusoidal pulses in digital modulated systems, is briefly introduced in Chapter 1 and further discussed in Chapter 12. And in Section 11.4 and Figure 11.8c we briefly discuss the correlation receiver, which consists of a product modulator followed by an integrator. All three processors, namely coherent demodulator, correlation receiver, and matched filter, are in fact equivalent. Note, however, that coherent demodulators used in analogue systems differ in a subtle way from those in digital systems. In the former, we want the output signal to be a faithful reproduction of the original modulating (i.e. message) signal. In the latter, the main aim is that the output has a form that gives the most reliable indication whether a matched pulse is present at a given decision instant. It is then more appropriate to use the term coherent demodulation when referring to analogue receivers and coherent detection in reference to digital receivers.

With this clarification in mind, we may now briefly discuss the coherent detection of ASK, PSK, and FSK signals. We present the receiver as a matched filter of impulse response h(t) followed by a decision device. The output y(t) of the matched filter is given by the convolution of the received input signal r(t) and h(t). See Section 3.6.2 for a detailed discussion of the computation of this convolution. For simplicity and clarity, we will concentrate on a graphical presentation of the output signal for various input signals. This will serve to clarify the criteria used by the decision device to decide in favour of a binary 1 or 0 in each interval of duration Tb.
Equation (11.61) gives the probability of error or BER in each of the coherent detectors discussed below. You may wish to refer to the important discussion accompanying that equation. The BER may also be conveniently read from the graph of Figure 11.13.
11.8.1 ASK Detector
Figure 11.20 shows the block diagram of a coherent ASK receiver. The filter is matched to cos(2πfct). The output of the filter for each of the two symbols g1(t) and g0(t) transmitted for binary 1 and 0, respectively, is shown in Figure 11.21. It can be seen that the output y(t), taken at t = Tb and denoted y(Tb), equals 2Eb for binary 1 and 0 for binary 0, where Eb is related to the amplitude Ac of the sinusoidal pulse by Eq. (11.66). Figure 11.21 represents an ideal situation of noiseless transmission in which y(Tb) is exactly 2Eb for binary 1 and is 0 for binary 0. In practice, the symbols are corrupted by noise, as shown in Figure 11.22. The inputs r1(t) and r0(t) in this diagram result from adding Gaussian noise w(t) of variance 5Eb to g1(t) and g0(t). Note that the exact shape of the output y1(t) and y0(t) will depend on the sequence of values in w(t), which is in general not identical for two noise functions, even if both have the same variance. To demonstrate this, Figure 11.23 shows the outputs y1(t) and y0(t) in six different observation intervals with the same noise variance (= 10Eb).

It is interesting to observe the similarity between the output y1(t) in Figures 11.21 and 11.22. The matched filter effectively pulls the sinusoidal pulse g1(t) (to which it is matched) out of noise. However, in the presence of noise, y1(Tb) ≠ 2Eb, and y0(Tb) ≠ 0. The decision threshold is therefore set halfway between 2Eb and 0, and binary 1 is chosen if y(Tb) > Eb, and binary 0 if y(Tb) < Eb. In the rare event of y(Tb) being exactly equal to Eb, a random choice is made between 1 and 0. Therefore, in the fifth observation interval of Figure 11.23, binary 1 would be erroneously detected as binary 0.

Figure 11.20 Coherent ASK receiver: matched filter h(t), matched to cos(2πfct), sampled at t = Tb and followed by a decision device that chooses binary 1 if y(Tb) > Eb and binary 0 if y(Tb) < Eb, where Eb = Ac²Tb/4.
Figure 11.21 Matched filter output for noise-free inputs in ASK detector.
Figure 11.22 Matched filter output for noisy inputs in ASK detector.
Figure 11.23 Matched filter output in six different observation intervals with Gaussian noise of the same variance.
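The noise-free values y(Tb) = 2Eb and 0, and the threshold rule, can be sketched with a correlation receiver, which the text notes is equivalent to the matched filter. Everything named here (`correlate`, `detect_ask`, the normalised Tb = Eb = 1, fc = 3/Tb) is an illustrative assumption; the reference pulse is scaled as Ac cos(2πfct) so that the sampled output for binary 1 equals 2Eb.

```python
import math

def correlate(r, ref, dt):
    """Matched-filter output sampled at t = Tb, computed as a correlation integral."""
    return sum(a * b for a, b in zip(r, ref)) * dt

def detect_ask(r, ref, dt, eb):
    """Threshold decision at Eb. A tie (y == Eb) is resolved as 0 here for
    simplicity; the text describes a random choice in that case."""
    y = correlate(r, ref, dt)
    return (1 if y > eb else 0), y

tb, eb, n = 1.0, 1.0, 2000
ac = 2 * math.sqrt(eb / tb)                      # Eq. (11.66)
fc = 3 / tb
dt = tb / n
t = [(i + 0.5) * dt for i in range(n)]
g1 = [ac * math.cos(2 * math.pi * fc * ti) for ti in t]   # symbol for binary 1
g0 = [0.0] * n                                            # symbol for binary 0 (no pulse)

bit1, y1 = detect_ask(g1, g1, dt, eb)
bit0, y0 = detect_ask(g0, g1, dt, eb)
print(bit1, y1, bit0, y0)
```

Adding a noise sequence to `g1` or `g0` before calling `detect_ask` would move the sampled outputs away from these nominal values, which is the situation depicted in Figures 11.22 and 11.23.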
11.8.2 PSK Detector
A coherent PSK detector is shown in Figure 11.24a. The filter is matched to the pulse g1(t) = Ac cos(2πfct) that represents binary 1. Recall that binary 0 is represented using a pulse of opposite polarity, g0(t) = −g1(t). The output of the filter with g1(t) and g0(t) as input is shown in Figure 11.24b. The output is sampled at t = Tb. It can be seen that y(Tb) = Eb for binary 1, and −Eb for binary 0, where Eb is related to the pulse amplitude Ac by Eq. (11.71). The effect of noise is to shift y(Tb) from +Eb during the interval of binary 1, and from −Eb during binary 0. See the discussion of Figures 11.22 and 11.23. The decision threshold is set halfway between +Eb and −Eb, so that the decision device chooses binary 1 if y(Tb) > 0, and binary 0 if y(Tb) < 0. In the rare event of y(Tb) = 0, the receiver makes a random guess of 1 or 0.
11.8.3 FSK Detector
Figure 11.25a shows the block diagram of a coherent FSK detector. Two matched filters are employed. The upper filter has an impulse response hU(t) which is matched to the sinusoidal pulse g1(t) = cos(2πf1t) that represents binary 1, whereas the lower filter’s impulse response hL(t) is matched to the sinusoidal pulse g0(t) = cos(2πf0t) that represents binary 0. Figure 11.25b shows the output of the two filters, denoted yU(t) and yL(t), respectively, when the received input signal r(t) (under noise-free conditions) is either g1(t) or g0(t). The outputs of both filters are sampled at t = Tb and subtracted in a summing device to give a value y(Tb) that is fed into the decision device. It can be seen that when the input is the binary 1 pulse g1(t), then yU(Tb) = Eb and yL(Tb) = 0, so that y(Tb) = Eb. On the other hand, when the input is the binary 0 pulse g0(t), then yU(Tb) = 0 and yL(Tb) = Eb, giving y(Tb) = −Eb.
Figure 11.24 Coherent PSK detector: (a) receiver; (b) input and output of matched filter in noise-free conditions.
With y(Tb) obtained in this manner, the situation in the decision device is like that of PSK detection. Therefore, the decision threshold is set halfway between +Eb and −Eb, and the decision device chooses binary 1 if y(Tb) > 0, and binary 0 if y(Tb) < 0, and a random guess of 1 or 0 if y(Tb) = 0.

The values of y(Tb) quoted above apply only when the symbols g1(t) and g0(t) are orthogonal. Specifically, Figure 11.25b is based on the frequency values f0 = 3Rb and f1 = f0 + Rb, so that Δf = f1 − f0 = Rb. If the frequency spacing Δf = f1 − f0 has a value between 0.5Rb and Rb then, as discussed in Section 11.7.3.3, the two symbols are negatively correlated. In this case, y(Tb) is greater than +Eb for binary 1 and less than −Eb for binary 0. The decision threshold is still at the halfway point at y(Tb) = 0 but there is increased immunity to noise due to the wider gap between the nominal values of y(Tb) corresponding to binary 1 and binary 0. If, on the other hand, Δf is less than 0.5Rb, or between Rb and 1.5Rb, etc. then the two symbols are positively correlated and y(Tb) is less than +Eb for binary 1 and greater than −Eb for binary 0. The gap between the nominal outputs for binary 1 and binary 0 is therefore narrower, and this increases the susceptibility of the system to noise.

To further demonstrate the importance of frequency spacing, we show in Figure 11.26 the difference signal y(t) = yU(t) − yL(t) of the two matched filters for three different values of frequency spacing Δf = Rb, 0.718Rb and 0.25Rb. With Δf = Rb, we have the orthogonal-pulses scenario plotted earlier in Figure 11.25, and you can see that y(Tb) = ±Eb for binary 1 and binary 0, respectively, giving a gap of 2Eb between these nominal values of the difference signal. The case Δf = 0.718Rb corresponds to the most negative correlation possible between the two pulses, and we have an increased gap of 2.49Eb. Finally, the pulses separated in frequency by Δf = 0.25Rb are positively correlated, giving a reduced gap of 0.68Eb. In the extreme case of Δf = 0, both matched filters are identical, and so are the transmitted symbols g1(t) and g0(t). The difference signal y(t) is then always zero and the decision device chooses between 1 and 0 each time by a random guess. This is the identical-symbol system discussed earlier in connection with Eq. (11.61). You can see that the probability of a wrong guess is 0.5, which agrees with the BER obtained earlier for such a system.

Figure 11.25 Coherent FSK: (a) block diagram of detector; (b) response of the matched filters to transmitted symbols g0(t) and g1(t).
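The gaps quoted for the three frequency spacings can be checked by numerically evaluating the two matched-filter samples and taking their difference. This is a sketch under stated assumptions: the function name `fsk_gap`, the normalisation Tb = Eb = 1, and the use of filters matched to the full pulses Ac cos(2πf1t) and Ac cos(2πf0t) (so that the orthogonal case gives ±Eb) are illustrative choices; f0 = 3Rb follows the value quoted for the figures.

```python
import math

def fsk_gap(delta_f, f0=3.0, tb=1.0, eb=1.0, n=20000):
    """Gap between the nominal y(Tb) for binary 1 and binary 0, in units of Eb."""
    ac = math.sqrt(2 * eb / tb)          # Eq. (11.71)
    f1 = f0 + delta_f
    dt = tb / n
    e1 = e0 = cross = 0.0
    for i in range(n):
        t = (i + 0.5) * dt
        g1 = ac * math.cos(2 * math.pi * f1 * t)
        g0 = ac * math.cos(2 * math.pi * f0 * t)
        e1 += g1 * g1 * dt               # upper filter output when g1 is received
        e0 += g0 * g0 * dt               # lower filter output when g0 is received
        cross += g1 * g0 * dt            # output of the "wrong" filter in either case
    y_for_1 = e1 - cross                 # yU(Tb) - yL(Tb) for binary 1
    y_for_0 = cross - e0                 # yU(Tb) - yL(Tb) for binary 0
    return (y_for_1 - y_for_0) / eb

gap_orth = fsk_gap(1.0)     # orthogonal pulses
gap_neg = fsk_gap(0.718)    # most negatively correlated pulses
gap_pos = fsk_gap(0.25)     # positively correlated pulses
print(gap_orth, gap_neg, gap_pos)
```

A wider gap means a larger noise margin on either side of the threshold at y(Tb) = 0, which is why spacings in the negative-correlation region are attractive.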
11.9 Noncoherent Binary Detection
Coherent detection gives a high immunity to noise but poses two major challenges in its implementation.
● We must have a complete knowledge of the phase of the incoming pulses in order to have a perfectly matched filter. Put another way, the locally generated pulse used in the correlation receiver must match the incoming pulse exactly in phase. In practice, variations in the transmission medium will cause the incoming pulse to arrive with a variable phase. This gives rise to a nonzero and variable phase difference between the incoming pulse and a local pulse generated with fixed initial phase. This phase error will significantly degrade the detection process and increase the probability of error. Figure 11.27 shows the effect of phase errors of 45° and 120° in a coherent ASK detector. The filter is matched to the expected symbol for binary 1. Note that, in the former, the output of the filter at t = Tb is y(Tb) = 1.414Eb, rather than the value 2Eb that is obtained in the absence of phase error. Since the decision threshold is at y(Tb) = Eb, it means that the noise margin has been lowered by 58.6% because of this phase error. In the second case, the phase error leads to y(Tb) = −Eb which causes outright error since this output falls below the decision threshold and the decision device would therefore decide in favour of binary 0.
● We must sample the output of the matched filter at the correct instants t = Tb. By examining, for example, Figure 11.24b you can see that the filter output drops very rapidly away from the sampling instant t = Tb. There is therefore little tolerance for timing error, the effect of which is to reduce the noise margin or (beyond a small fraction) give rise to outright error. This problem is less crucial in baseband transmission systems where a matched filter output decreases more slowly away from the sampling instant, as discussed further in Chapter 12.

Figure 11.26 Difference signal y(t) = yU(t) − yL(t) in a coherent FSK detector using orthogonal (Δf = Rb, gap = 2Eb), negatively-correlated (Δf = 0.718Rb, gap = 2.49Eb), and positively-correlated (Δf = 0.25Rb, gap = 0.68Eb) pulses.
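The effect of a phase error on the sampled output of a coherent ASK detector can be reproduced with a short numerical sketch. The function name `ask_output_with_phase_error` and the normalised values (Tb = Eb = 1, fc = 3/Tb) are assumptions for illustration; the local reference is taken as Ac cos(2πfct) with fixed zero phase.

```python
import math

def ask_output_with_phase_error(phase_deg, eb=1.0, tb=1.0, fc=3.0, n=20000):
    """y(Tb) of a correlator whose reference has zero phase while the
    incoming binary-1 pulse arrives with the given phase error."""
    ac = 2 * math.sqrt(eb / tb)          # Eq. (11.66)
    phi = math.radians(phase_deg)
    dt = tb / n
    y = 0.0
    for i in range(n):
        t = (i + 0.5) * dt
        r = ac * math.cos(2 * math.pi * fc * t + phi)   # incoming pulse
        h = ac * math.cos(2 * math.pi * fc * t)         # local reference
        y += r * h * dt
    return y

eb = 1.0
y0 = ask_output_with_phase_error(0)      # no phase error: 2Eb
y45 = ask_output_with_phase_error(45)    # about 1.414Eb
y120 = ask_output_with_phase_error(120)  # about -Eb, below the threshold at Eb

# Reduction in noise margin (distance above the threshold Eb) for the 45 degree case
margin_loss = 1 - (y45 - eb) / (y0 - eb)
print(y0, y45, y120, margin_loss)
```

The sampled output follows 2Eb cos(φ) for a phase error φ, so any error beyond 60° pushes a binary 1 below the threshold.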
We can minimise the first problem by extracting the desired pulse frequency from the incoming signal and using this to provide the locally generated pulse. ASK and FSK contain impulses at the pulse frequency, which can therefore be extracted using a bandpass filter (BPF) or, more accurately, a phase-locked loop. See Section 7.5.2 for a discussion of this carrier extraction process. A PSK signal does not, however, contain an impulse at the carrier frequency fc but this may be obtained by a full-wave rectification of the PSK signal to create a component at 2fc, extracting this component using a BPF and dividing by 2 to obtain the desired pulse frequency. This process is elaborated in Figure 11.28. There is a phase uncertainty of 180° in the generated carrier, depending on the phase of the division. This makes it impossible to be certain that the receiver is matched to g1(t), the symbol for binary 1, and not to g0(t), the symbol for binary 0, which would cause the output bit stream to be an inverted copy of the transmitted bit stream. To resolve this phase ambiguity of the locally generated carrier, a known training sequence of bits, or preamble, is first transmitted to the receiver. By comparing the receiver output to the expected output, the carrier phase is correctly set.

The arrangement discussed above achieves phase synchronisation at the cost of increased complexity of the coherent binary detector. The effect of phase errors can be completely eradicated if we ignore phase information in the incoming binary modulated signal. The receiver is then described as a noncoherent detector. Obviously, this method is not applicable to PSK since it is the phase that conveys information. But the need for generating a phase-synchronised carrier at the receiver can also be eliminated in PSK by using a variant technique known as differential phase shift keying (DPSK). These noncoherent receivers are briefly discussed in the following sections.

Figure 11.27 Effect of phase error in coherent ASK. The filter is matched to Ac cos(2πfct). Graphs show output, i.e. difference signal y(t), when the input is (a) Ac cos(2πfct + 45°), giving y(Tb) = 1.414Eb; (b) Ac cos(2πfct + 120°), giving y(Tb) = −Eb.
Figure 11.28 Generating a carrier signal from an incoming PSK signal: a full-wave rectifier (output period 1/(2fc), fundamental frequency 2fc), followed by a BPF at 2fc and a divide-by-2 stage that delivers the carrier at fc.
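Why full-wave rectification helps can be seen numerically: the rectified PSK pulse has no component at fc, but it has a strong component at 2fc that is the same for both symbols. The sketch below uses a simple Fourier projection; the helper name `component_amplitude`, the unit amplitude, and fc = 4/Tb are illustrative assumptions, and the BPF and divide-by-2 stages are not modelled.

```python
import math

def component_amplitude(signal, freq, t):
    """Amplitude of the sinusoidal component of `signal` at `freq`,
    estimated by projecting onto cos and sin over the observation window."""
    n = len(signal)
    c = sum(s * math.cos(2 * math.pi * freq * ti) for s, ti in zip(signal, t)) * 2 / n
    q = sum(s * math.sin(2 * math.pi * freq * ti) for s, ti in zip(signal, t)) * 2 / n
    return math.hypot(c, q)

fc, tb, n = 4.0, 1.0, 8000
t = [(i + 0.5) * tb / n for i in range(n)]

results = {}
for bit, sign in ((1, +1.0), (0, -1.0)):
    psk = [sign * math.cos(2 * math.pi * fc * ti) for ti in t]   # symbol for this bit
    rectified = [abs(v) for v in psk]                            # full-wave rectifier
    results[bit] = (
        component_amplitude(rectified, fc, t),       # residual at fc
        component_amplitude(rectified, 2 * fc, t),   # component at 2fc
    )
print(results)
```

Because the rectifier output is identical for the two symbols, the recovered carrier cannot reveal which symbol has 0° phase; this is the 180° ambiguity that the preamble resolves.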
11.9.1 Noncoherent ASK Detector
Figure 11.29 shows the outputs of a matched filter in a coherent ASK receiver with inputs of phase error 0°, 90°, and 180°. Note that all three outputs have a common envelope, which is shown in dotted lines. It is true in general that the envelope of the output of a matched filter is independent of the phase of the input signal. Therefore, we may follow a matched filter with an envelope detector. When the output of the envelope detector is sampled at t = Tb, it will have a value y(Tb) = 2Eb for binary 1 irrespective of the phase error in the incoming sinusoidal pulse g1(t). For binary 0 the sample value will of course be y(Tb) = 0. For a discussion of the operation and design of envelope detectors (also known as diode demodulators), see Section 7.5.1. The receiver just described is a noncoherent ASK detector and is shown in Figure 11.30. Note that the matched filter is just a (special response) BPF centred at fc and is frequently indicated as such in some literature.
Figure 11.29 Output of a matched filter in a coherent ASK receiver for inputs with various phase errors. (Plot: y(t) between −2Eb and 2Eb over 0 ≤ t ≤ 2Tb for phase errors of 0°, 90°, and 180°, with the common envelope shown.)
Figure 11.30 Noncoherent ASK detector. (Block diagram: ASK signal r(t) → matched filter h(t), matched to Ac cos(2πfct) → envelope detector → y(t), sampled at t = Tb → decision device: binary 1 if y(Tb) > Eb; binary 0 if y(Tb) < Eb, where Eb = Ac²Tb/4.)
Noncoherent detection has two main advantages over coherent detection.
● The output is independent of phase error, making it unnecessary to provide phase synchronisation at the receiver. This greatly simplifies receiver design.
● The envelope does not decrease rapidly away from the sampling instant Tb. Therefore, the system is much more tolerant of timing errors than is a coherent detector.
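The phase independence of the envelope can be checked numerically. The sketch below is an illustration only, not the receiver of Figure 11.30: it assumes that, at the sampling instant t = Tb, the matched filter followed by an envelope detector can be modelled by a quadrature correlator (correlation with cosine and sine references, then root-sum-square). The function name and the values Ac = Tb = 1 and ten carrier cycles per bit are hypothetical choices.

```python
import math

def quadrature_envelope(phase_deg, Ac=1.0, Tb=1.0, cycles=10, steps=20000):
    """Return (coherent sample, envelope sample) at t = Tb for the input
    Ac*cos(2*pi*fc*t + phase). Assumption: a quadrature correlator stands in
    for the matched filter plus envelope detector at the sampling instant."""
    fc = cycles / Tb                      # integer number of carrier cycles per bit
    phase = math.radians(phase_deg)
    dt = Tb / steps
    i_sum = q_sum = 0.0
    for n in range(steps):
        t = (n + 0.5) * dt                # midpoint rule for the integrals
        r = Ac * math.cos(2 * math.pi * fc * t + phase)
        i_sum += r * Ac * math.cos(2 * math.pi * fc * t) * dt
        q_sum += r * Ac * math.sin(2 * math.pi * fc * t) * dt
    return i_sum, math.hypot(i_sum, q_sum)

Eb = 1.0 ** 2 * 1.0 / 4                   # Eb = Ac^2 * Tb / 4 with Ac = Tb = 1
for phase_deg in (0, 45, 90, 120, 180):
    coherent, envelope = quadrature_envelope(phase_deg)
    print(f"{phase_deg:3d} deg: coherent = {coherent / Eb:+.3f} Eb, "
          f"envelope = {envelope / Eb:.3f} Eb")
```

Under these assumptions the coherent sample varies as 2Eb cos(phase error), whereas the envelope sample stays at 2Eb for every phase error.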
The main drawback of a noncoherent receiver is that it is more susceptible to noise. By ignoring phase information, the receiver inadvertently admits contributions from noise of all phases. A coherent receiver, on the other hand, is not affected by noise components that are 90° out of phase with the incoming signal to which the receiver is matched. Eqs. (11.56) and (11.61) therefore do not apply to noncoherent reception. We will not derive this here, but assuming that (i) Eb ≫ No and (ii) the bandwidth of the BPF (shown as a matched filter in Figure 11.30) is the minimum required to pass the ASK signal (see Eq. (11.77)), then the probability of error of a noncoherent ASK receiver is given by the expression
$$ \mathrm{BER} = \frac{1}{4}\,\mathrm{erfc}\left(\sqrt{\frac{E_b}{2N_o}}\right) + \frac{1}{2}\exp\left(-\frac{E_b}{2N_o}\right) \tag{11.78} $$
where Eb is the average energy per bit given by Eq. (11.66) and No is the noise power per unit bandwidth. Note that the BER expression in Eq. (11.78) is dominated by the second term on the right-hand side.
11.9.2 Noncoherent FSK Detector
From the foregoing discussion, a noncoherent FSK detector is obtained by inserting an envelope detector after each of the matched filters in a coherent FSK detector. Figure 11.31 is a block diagram of the resulting noncoherent FSK detector. In this diagram, it is helpful to view the upper matched filter as a BPF that passes the pulse of frequency f1 while completely rejecting the other pulse of frequency f0. The lower matched filter similarly serves as a BPF that passes f0 and rejects f1. To help you understand the operation of this receiver, the waveforms at various points of the circuit are plotted in Figure 11.32. The upper half of Figure 11.32 gives the outputs when the incoming pulse is g1(t) = Ac cos(2πf1t + φe), which represents a binary 1, whereas the lower half of the figure gives outputs for input g0(t) = Ac cos(2πf0t + φe), which represents binary 0. The parameter φe is a phase error, the precise value of which does not change the output at the decision instant t = Tb and only marginally affects the output at other time instants between 0 and 2Tb. Figure 11.32 was produced with φe set to 90°. The difference signal is the input to the decision device and, regardless of phase error, is y(Tb) = Eb for binary 1 and y(Tb) = −Eb for binary 0. Therefore, the decision threshold is set halfway at y(Tb) = 0. Figure 11.32 assumes that g1(t) and g0(t) are orthogonal pulses. For correlated pulses, the value of y(Tb) changes in the manner discussed in Section 11.8.3. On the same assumptions as in the previous section, we find that the BER of a noncoherent FSK receiver is given by
$$ \mathrm{BER} = \frac{1}{2}\exp\left(-\frac{E_b}{2N_o}\right) \tag{11.79} $$
In this case, the energy per bit Eb is calculated from the received pulse amplitude Ac and pulse duration Tb using Eq. (11.71). By comparing Eqs. (11.78) and (11.79), you can see that, for the same average Eb/No ratio, noncoherent FSK has a slightly better noise performance (in terms of lower BER) than noncoherent ASK. However, the ASK system would have double the peak energy of FSK since it is only on for half the time (during binary 1) and off during every interval of binary 0.
Figure 11.31 Noncoherent FSK detector. (Block diagram: the FSK signal r(t) feeds an upper matched filter hU(t), matched to Ac cos(2πf1t), and a lower matched filter hL(t), matched to Ac cos(2πf0t). Each filter is followed by an envelope detector whose output, yU(t) or yL(t), is sampled at t = Tb. The difference y(Tb) = yU(Tb) − yL(Tb) goes to the decision device: binary 1 if y(Tb) > 0; binary 0 if y(Tb) < 0.)
Figure 11.32 Output waveforms in a noncoherent FSK detector. (Panels: upper (U) and lower (L) branch outputs yU(t) and yL(t), and the difference signal y(t) = yU(t) − yL(t), for binary 1 and for binary 0, over 0 ≤ t ≤ 2Tb.)
11.9.3 DPSK
We have noted that it is impossible to distinguish the two pulses used in PSK by observing the envelope of an incoming PSK signal. Therefore, strictly speaking, incoherent PSK detection does not exist. However, we can obviate the
need for phase synchronisation at the receiver if information (i.e. binary 1 and 0) is coded at the transmitter as changes in phase, rather than as absolute phase values. This means that we keep the sinusoidal pulse amplitude and frequency constant at Ac and fc, respectively, but we transmit the pulse with its phase incremented by, say, 180° to represent binary 0, and the phase left unchanged to represent binary 1. This technique is known as differential phase shift keying (DPSK). We may write
$$ v_{\mathrm{dpsk}}(t) = \begin{cases} g_1(t), & \text{Binary 1} \\ g_0(t), & \text{Binary 0} \end{cases} $$
$$ g_1(t) = \begin{cases} A_c\cos(2\pi f_c t + \phi_{n-1}), & 0 \le t \le T_b \\ 0, & \text{elsewhere} \end{cases} $$
$$ g_0(t) = \begin{cases} A_c\cos(2\pi f_c t + \phi_{n-1} + \pi), & 0 \le t \le T_b \\ 0, & \text{elsewhere} \end{cases} \tag{11.80} $$
In the above, fc is an integer multiple of bit rate, and φn−1 is the phase of the sinusoidal pulse transmitted during the previous bit interval, which will always be either 0 or π radians. Note that φn−1 = 0 in the first bit interval. Figure 11.33a shows the block diagram of a DPSK modulator. Comparing this with the block diagram of a PSK modulator in Figure 11.16a, we see that the only difference is in the type of baseband coder that precedes the product modulator. A bipolar NRZ-S coder is used in a DPSK modulator, whereas a PSK modulator uses a bipolar NRZ coder. These line codes are discussed in Section 10.7.1. For a random bit stream consisting of equally likely 1's and 0's, DPSK and PSK will therefore have identical power spectra, and hence bandwidth.
The bipolar NRZ-S coder is implemented using an inverted exclusive-OR (XNOR) gate and a one-bit delay device (e.g. a clocked flip-flop) connected as shown in Figure 11.33b. Note the accompanying truth table, where bn denotes current input, cn denotes current output, and cn−1 denotes previous output. What is entered as binary 1 in the truth table is represented electrically in the circuit as +1 V (normalised) and binary 0 as −1 V. The output of the XNOR gate is the bipolar NRZ-S waveform cn, which is shown in Figure 11.34 for a selected input bit stream bn. The easiest way to remember the coding strategy of NRZ-S is that the waveform makes a transition between ±1 at the beginning of the bit interval if the bit is binary 0, and makes no transition if the bit is binary 1. Note that the cn waveform shown corresponds to an initially high (+1) state of the XNOR gate output. When this NRZ-S waveform is multiplied by a carrier Ac cos(2πfct) in a product modulator, the result is the DPSK waveform given by Eq. (11.80) and also shown in Figure 11.34.
Figure 11.33 DPSK: (a) modulator; (b) NRZ-S coder and truth table of XNOR gate; (c) detector. (Block diagrams: (a) input bit stream bn → bipolar NRZ-S coder → cn (±1) → product modulator with carrier Ac cos(2πfct) → DPSK signal vdpsk(t); (b) XNOR gate with inputs bn and cn−1, the output cn being fed back through a delay Tb; (c) DPSK signal → matched BPF → current pulse multiplied by the previous pulse (delay Tb) → x(t) → LPF → yn, the output bit stream (BNRZ).)
| bn | cn−1 | cn |
| 0 | 0 | 1 |
| 0 | 1 | 0 |
| 1 | 0 | 0 |
| 1 | 1 | 1 |
Detection is performed at the receiver by comparing the phase of the current pulse to that of the previous pulse. A significant phase difference indicates that the current interval is binary 0, whereas a negligible difference indicates binary 1. It is assumed that phase variations due to the transmission medium are negligible over the short period of one bit interval Tb. That is, the only significant change in phase from one bit interval to the next is due entirely to the action of the DPSK modulator. To make this phase comparison, the current and previous pulses are multiplied together, which yields the signal x(t) shown in Figure 11.34. Note that in the very first bit interval the previous pulse used has zero phase. It is easy to see that when the two pulses are antipodal then their product is negative, whereas when the pulses have the same phase then the product is positive. Passing x(t) through an LPF gives an output voltage yn, which is positive for binary 1 and negative for binary 0, as shown in Figure 11.34. You will no doubt recognise yn as a bipolar NRZ representation of the transmitted bit stream bn.
Thus, the DPSK signal has been successfully demodulated. A block diagram of a DPSK detector that operates as described above is shown in Figure 11.33c. For optimum detection, the BPF is matched to the sinusoidal pulse Ac cos(2πfct) of duration Tb. The BER of this optimum DPSK detector is given by the expression
$$ \mathrm{BER} = \frac{1}{2}\exp\left(-\frac{E_b}{N_o}\right) \tag{11.81} $$
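The NRZ-S coding rule and the product detection described above can be sketched at the level of pulse polarities (±1) rather than waveforms. This is a minimal illustration under stated assumptions: the function names are hypothetical, the bit pattern is only an example, and the carrier, matched BPF, and LPF are abstracted away so that each pulse is represented by its sign.

```python
def nrz_s_encode(bits, initial=1):
    """Differential (NRZ-S) coding with c_n = XNOR(b_n, c_{n-1}).
    Returns polarities: +1 for logic 1, -1 for logic 0."""
    levels, prev = [], initial            # prev holds the logic state c_{n-1}
    for b in bits:
        prev = 1 if b == prev else 0      # XNOR: output is 1 when inputs are equal
        levels.append(+1 if prev else -1)
    return levels

def dpsk_detect(levels, reference=+1):
    """Multiply each pulse polarity by the previous one.
    A positive product means no phase change (binary 1)."""
    bits, prev = [], reference            # reference plays the zero-phase first pulse
    for c in levels:
        bits.append(1 if c * prev > 0 else 0)
        prev = c
    return bits

bits = [1, 1, 0, 1, 0, 0, 0]              # illustrative input pattern
tx = nrz_s_encode(bits)
print("NRZ-S levels:", tx)
print("Recovered   :", dpsk_detect(tx))

# A 180-degree carrier phase ambiguity inverts every received pulse.
# Only the first decision can be affected; later bits are still correct.
inverted = dpsk_detect([-c for c in tx])
print("Inverted rx :", inverted)
```

The last lines illustrate why no phase-synchronised carrier is needed: a constant phase inversion of the received signal cancels in the product of consecutive pulses.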
Figure 11.34 Waveforms in DPSK modulator and detector. (Transmitter panels: transmitted bit stream bn = 1 1 0 1 0 0 0, NRZ-S waveform cn (±1), and DPSK waveform vdpsk(t) (±Ac). Receiver panels: product signal x(t) and recovered BNRZ waveform yn (±1) with recovered bit stream 1 1 0 1 0 0 0. Time axis from 0 to 7Tb.)
You can see that DPSK – our ‘noncoherent PSK’ – has a better noise performance than either noncoherent ASK or noncoherent FSK but is inferior in this respect to coherent PSK. However, the main advantage of DPSK compared to PSK is the simplicity of its receiver circuit, which does not require phase synchronisation with the transmitter.
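The ranking of the binary schemes can be checked by evaluating Eqs. (11.78), (11.79), and (11.81) at a few values of Eb/No. In the sketch below the function names are hypothetical, and the coherent PSK expression BER = ½ erfc(√(Eb/No)) is the standard textbook result, assumed here to match the coherent PSK formula derived earlier in the chapter.

```python
import math

def ber_noncoherent_ask(ebno):
    """Eq. (11.78); ebno is the linear (not dB) ratio Eb/No."""
    return 0.25 * math.erfc(math.sqrt(ebno / 2)) + 0.5 * math.exp(-ebno / 2)

def ber_noncoherent_fsk(ebno):
    """Eq. (11.79)."""
    return 0.5 * math.exp(-ebno / 2)

def ber_dpsk(ebno):
    """Eq. (11.81)."""
    return 0.5 * math.exp(-ebno)

def ber_coherent_psk(ebno):
    """Standard coherent binary PSK result (assumed, see lead-in)."""
    return 0.5 * math.erfc(math.sqrt(ebno))

for ebno_db in (8, 10, 12):
    ebno = 10 ** (ebno_db / 10)
    print(f"{ebno_db:2d} dB: ASK {ber_noncoherent_ask(ebno):.2e}  "
          f"FSK {ber_noncoherent_fsk(ebno):.2e}  "
          f"DPSK {ber_dpsk(ebno):.2e}  "
          f"PSK {ber_coherent_psk(ebno):.2e}")
```

At each Eb/No the printed values fall in the order stated in the text: coherent PSK lowest, then DPSK, then noncoherent FSK, with noncoherent ASK slightly worse than FSK.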
11.10 M-ary Transmission
It is only in a limited number of situations, e.g. optical fibre communication which employs (binary) ASK, that the available channel bandwidth is enough to allow the use of binary modulated transmission to achieve the required bit rate. In communication systems involving radio and copper-wired transmission media, bandwidth efficiency is an important design consideration which makes M-ary transmission (M > 2) necessary.
11.10.1 Bandwidth Efficiency
Bandwidth efficiency η is defined as the ratio Rb/Bocc between message bit rate Rb and transmission (or occupied) bandwidth Bocc and is expressed in bits per second per hertz (b/s/Hz). In Section 6.5.2, we discussed at length the bandwidth efficiency of M-ary transmission systems that employ raised cosine filtering and plotted η versus M in Figure 6.19. To summarise these results, M-ary ASK, PSK, APSK, and FSK systems use M unique symbols to convey log2 M bits per symbol so that bit rate Rb and symbol rate Rs are related by
$$ R_b = R_s \log_2 M \tag{11.82} $$
All digital transmission systems employ a raised cosine filter of roll-off factor α (0 ≤ α ≤ 1), introduced in Section 4.7.3.5 (Worked Example 4.16) and discussed further in Chapter 12. In M-ary ASK, PSK, APSK systems, each transmitted symbol is a sinusoidal pulse of the same (carrier) frequency fc and duration Ts = 1/Rs, leading to occupied bandwidth
$$ B_{occ} = R_s(1 + \alpha), \quad \text{M-ary ASK, PSK and APSK} \tag{11.83} $$
Figure 11.35 Spectrum of M-ary FSK signal employing orthogonal sinusoidal pulses of duration Ts = 1/Rs and frequencies f0, f1, f2, …, fM−1 having minimum spacing Δf = Rs/2. (Each of the M spectral lobes extends ±Bb about its centre frequency, where Bb = (1 + α)Rs/2, so that Bocc spans from f0 − Bb to fM−1 + Bb.)
In the case of M-ary FSK, the M symbols are orthogonal sinusoidal pulses of duration Ts = 1/Rs at respective frequencies f0, f1, f2, …, fM−1. The amplitude spectrum of an M-ary FSK signal is therefore as shown in Figure 11.35, where each pulse spectrum extends by Bb = (1 + α)Rs/2 on either side of its centre frequency. Using the minimum frequency spacing Δf = Rs/2 required for the pulses to be mutually orthogonal, occupied bandwidth is therefore
$$ \begin{aligned} B_{occ} &= (M - 1)\Delta f + 2B_b \\ &= R_s\left(\frac{M + 1}{2} + \alpha\right), \quad \text{M-ary FSK} \end{aligned} \tag{11.84} $$
Most digital transmission systems also incorporate error control coding in which redundant bits are systematically inserted to aid error detection and/or correction at the receiver. Thus, only a fraction r (0 < r ≤ 1) of the transmitted bits are message bits, r being referred to as the code rate. Error control coding reduces the message bit rate Rb from the value given by Eq. (11.82) to
$$ R_b = rR_s \log_2 M \tag{11.85} $$
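so that, in view of Eqs. (11.83) and (11.84), the bandwidth efficiency of modulated M-ary transmission is given by
$$ \eta = \frac{R_b}{B_{occ}} = \frac{rR_s \log_2 M}{B_{occ}} = \begin{cases} \dfrac{r\log_2 M}{1 + \alpha}, & \text{ASK, PSK, APSK} \\[2ex] \dfrac{r\log_2 M}{\alpha + (M + 1)/2}, & \text{FSK} \end{cases} \tag{11.86} $$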
where
α ≡ roll-off factor of raised cosine filter. α = 0 for an ideal Nyquist channel or brickwall filter, which is not realisable in real-time, so α typically exceeds 0.05.
r ≡ code rate. r = 1 for a system operating without error control coding.
Maximum bandwidth efficiency ηmax is obtained when α = 0 and r = 1, leading to
$$ \eta_{\max} = \begin{cases} \log_2 M, & \text{ASK, PSK, APSK} \\[1ex] \dfrac{2\log_2 M}{M + 1}, & \text{FSK} \end{cases} \tag{11.87} $$
It is this maximum bandwidth efficiency that is plotted in Figure 6.19 from which we see that beyond M = 4 the maximum bandwidth efficiency of M-ary FSK decreases steadily with M and is, for example, 𝜂 max = 0.8 at
M = 4 and 𝜂 max = 0.02 at M = 1024. On the other hand, the bandwidth efficiency of M-ary ASK, PSK, and APSK increases steadily with M from 𝜂 max = 1 at M = 2, reaching 𝜂 max = 10 at M = 1024. ASK, PSK, and APSK systems therefore have a significantly superior spectral efficiency compared to FSK. However, as we will see shortly, what FSK lacks in spectral efficiency it makes up for in noise immunity. M-ary FSK is therefore the preferred modulation technique in applications, such as deep space communication, that involve very weak received signals, making noise immunity a prime design consideration.
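The figures quoted above follow directly from Eqs. (11.86) and (11.87). The short sketch below evaluates them; the function name and the scheme labels are hypothetical conveniences, not notation from the text.

```python
import math

def bandwidth_efficiency(M, scheme, alpha=0.0, r=1.0):
    """Bandwidth efficiency in b/s/Hz from Eq. (11.86).
    scheme: 'fsk' for M-ary FSK; anything else is treated as ASK/PSK/APSK.
    alpha: raised cosine roll-off factor; r: code rate."""
    bits_per_symbol = math.log2(M)
    if scheme == "fsk":
        return r * bits_per_symbol / (alpha + (M + 1) / 2)
    return r * bits_per_symbol / (1 + alpha)

for M in (2, 4, 16, 64, 1024):
    print(f"M = {M:4d}: eta_max(ASK/PSK/APSK) = {bandwidth_efficiency(M, 'psk'):5.2f}, "
          f"eta_max(FSK) = {bandwidth_efficiency(M, 'fsk'):.3f}")

# A practical case with roll-off 0.25 and a rate-3/4 code (illustrative values)
print(bandwidth_efficiency(16, "psk", alpha=0.25, r=0.75))
```

With the defaults α = 0 and r = 1 the function returns ηmax of Eq. (11.87); the final line shows how a realistic roll-off and code rate reduce the efficiency.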
11.10.2 M-ary ASK
The general principles of M-ary modulation and detection are presented in Section 11.4. We now apply those ideas to M-ary ASK.
11.10.2.1 M-ary ASK Modulator
Figure 11.36a shows a block diagram of an M-ary ASK modulator. A serial-to-parallel converter, implemented using a shift register, converts the serial input bit stream to a parallel block Bin consisting of log2 M bits. A Gray code converter (GCC) then maps each of the M distinct states Bin to a unique state Bout with the same number of bits. The binary sequences Bin and Bout have respective decimal values k (in column 3 of Table 11.1) and i (in column 4). Table 11.1 also shows the mapping rule for M = 16.
Figure 11.36 M-ary ASK: (a) modulator; (b) detector. (Block diagrams: (a) input bit stream → serial-to-parallel converter → Bin → Gray code converter → Bout → log2 M-bit to M-level converter → cn → product modulator with carrier cos(2πfct) → M-ary ASK signal; (b) M-ary ASK signal → matched filter h(t), matched to Ac cos(2πfct) → y(t), sampled at t = Ts → ADC → Gray coder → parallel-to-serial converter → output bit stream.)
Table 11.1 Step-by-step mapping of a 4-bit input sequence Bin into a transmitted symbol gk(t) in 16-ASK modulator.
| Bin (Gray code) | Bout | Decimal value, Bin ⇒ k | Decimal value, Bout ⇒ i | Normalised DAC output, cn | Transmitted symbol, gk(t) |
| 0000 | 0000 | 0 | 0 | 0 | g0(t) |
| 0001 | 0001 | 1 | 1 | 1/15 | g1(t) |
| 0011 | 0010 | 3 | 2 | 2/15 | g3(t) |
| 0010 | 0011 | 2 | 3 | 3/15 | g2(t) |
| 0110 | 0100 | 6 | 4 | 4/15 | g6(t) |
| 0111 | 0101 | 7 | 5 | 5/15 | g7(t) |
| 0101 | 0110 | 5 | 6 | 6/15 | g5(t) |
| 0100 | 0111 | 4 | 7 | 7/15 | g4(t) |
| 1100 | 1000 | 12 | 8 | 8/15 | g12(t) |
| 1101 | 1001 | 13 | 9 | 9/15 | g13(t) |
| 1111 | 1010 | 15 | 10 | 10/15 | g15(t) |
| 1110 | 1011 | 14 | 11 | 11/15 | g14(t) |
| 1010 | 1100 | 10 | 12 | 12/15 | g10(t) |
| 1011 | 1101 | 11 | 13 | 13/15 | g11(t) |
| 1001 | 1110 | 9 | 14 | 14/15 | g9(t) |
| 1000 | 1111 | 8 | 15 | 1 | g8(t) |
Column 2 is the 4-bit output of the GCC and is the binary equivalent of the decimal numbers or state index i from 0 to 15 (column 4). Notice the pattern in this binary number sequence. Going down column 2 and considering the first (i.e. rightmost) bit (or LSB) position only, we see that it consists of the two-element repeating pattern 01, which we will call the fundamental pattern. Similarly, the second bit position consists of the 4-element repeating pattern 0011, obtained by repeating each element of the fundamental pattern. The third bit position consists of the pattern 00001111 obtained by using each element of the fundamental pattern four times. In this way, the kth bit position of a binary number sequence (representing the decimal numbers 0, 1, 2, 3, …) is generated from a pattern obtained by using each element of the fundamental pattern 2^(k−1) times. There is also a pattern in the Gray code Bin of column 1. The fundamental pattern in this case is, however, 0110. Thus, the first bit position of the Gray code consists of the sequence 0110… From the above discussion, we therefore expect the second bit position to consist of the sequence 00111100…, the third bit position to consist of the sequence 0000111111110000…, and so on. The importance of Gray coding lies in the fact that adjacent codewords (Bin in column 1) differ in only one bit position. That is, the ith codeword differs from the (i−1)th and (i + 1)th codewords by only one bit, for i = 1, 2, 3, …, M − 2, where Bin is the Gray code of the decimal number or state index i. For example, Table 11.1 shows that the Gray code for i = 2 is 0011, rather than the usual binary
equivalent 0010. The latter requires a change in two bit positions from the adjacent code word 0001, which violates the Gray code rule. The GCC in Figure 11.36a is a logic circuit that maps column 1 in Table 11.1 to column 2. So, for example, the 4-bit sequence 1010 is converted to 1100. The output of the converter is then processed in a log2 M-bit to M-level converter to yield the normalised output shown in column 5 of Table 11.1. This output consists of M distinct and uniformly spaced levels from 0 to 1 and is multiplied by a sinusoidal carrier of frequency fc to produce the desired M-ary ASK signal. Make sure you can see that the overall result of the whole process is that adjacent states i in the signal-space diagram of this M-ary ASK represent a block of log2 M input bits that differ in only one bit position. This means that Eq. (11.54) applies in relating BER and SER. Refer to Figure 11.7a for the signal-space diagram of 4-ASK. Numbering the states with index i, starting from i = 0 for the state at the origin to i = M − 1 for the state furthest from the origin, the M-ary ASK signal consists of one of the following M sinusoidal pulses in each symbol interval of duration Ts = Tb log2 M
$$ g'_i(t) \equiv g_k(t) = \frac{iA_c}{M - 1}\cos(2\pi f_c t)\,\mathrm{rect}\left(\frac{t - T_s/2}{T_s}\right); \quad i = 0, 1, 2, \cdots, M - 1; \quad k = \mathrm{GCC}[i] \tag{11.88} $$
where GCC[] denotes Gray code conversion and the rectangular function rect() serves the simple purpose of constraining the sinusoidal function to a pulse in the interval (0, Ts). For example, from Table 11.1, GCC[10] = 15, so (in this example where M = 16) the symbol g15(t) has amplitude 10Ac/15 and corresponds to state number 10 (counting from the origin). Notice that the pulses differ only in amplitude, which is determined by the output of the M-level converter in each log2 M-bit interval.
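The mapping k = GCC[i] of Eq. (11.88) and the rows of Table 11.1 can be reproduced with a few lines of code. The sketch below uses the well-known binary-reflected Gray code identity k = i XOR (i >> 1); the text does not state how the GCC logic circuit is implemented, so this formula and the function names are assumptions made for illustration.

```python
def gcc(i):
    """Gray code value k of state index i (binary-reflected Gray code)."""
    return i ^ (i >> 1)

def gcc_inverse(k):
    """State index i whose Gray code is k, i.e. the Bin -> Bout mapping
    performed by the modulator (column 1 to column 2 of Table 11.1)."""
    i = 0
    while k:
        i ^= k
        k >>= 1
    return i

M = 16
for i in range(M):
    k = gcc(i)
    print(f"Bin = {k:04b}  Bout = {i:04b}  k = {k:2d}  i = {i:2d}  "
          f"cn = {i}/{M - 1}  symbol g{k}(t)")

# Adjacent states differ in exactly one bit of Bin
hamming = [bin(gcc(i) ^ gcc(i + 1)).count("1") for i in range(M - 1)]
print(hamming)
```

For example, gcc(10) returns 15, in agreement with GCC[10] = 15, and gcc_inverse(0b1010) returns 12 (binary 1100), the conversion of 1010 to 1100 mentioned above.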
11.10.2.2 M-ary ASK Detector
An M-ary ASK detector operates on the principle shown in block diagram form in Figure 11.36b. The incoming signal is passed through a BPF that is matched to the sinusoidal pulse cos(2πfct) of duration Ts. The output of the filter is sampled at t = Ts and fed into an analogue-to-digital converter (ADC), which quantises the sample y(Ts) to the nearest of M levels and provides a binary number representation of the level at its output. The M levels are
$$ 0,\ \frac{E_{s\max}}{M - 1},\ \frac{2E_{s\max}}{M - 1},\ \frac{3E_{s\max}}{M - 1},\ \cdots,\ E_{s\max} $$
where Esmax is the maximum pulse energy given by
$$ E_{s\max} = \frac{A_c^2 T_s}{2} \tag{11.89} $$
Note that the output of the ADC is given by column 2 in Table 11.1. To recover the original bit stream, the ADC output is fed into a Gray coder, which maps from column 2 to column 1.
11.10.2.3 BER of M-ary ASK
The main handicap of M-ary ASK is its very poor BER caused by the fact that it uses strongly correlated sinusoidal pulses. You will have the opportunity in Question 11.16 to show that the correlation coefficient of adjacent states with indices i and i + 1 in Eq. (11.88) is given by
$$ \rho_{i,i+1} = \frac{2i(i + 1)}{2i(i + 1) + 1}; \quad i = 0, 1, 2, \cdots, M - 2 \tag{11.90} $$
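Eq. (11.90) is easily tabulated. The sketch below simply evaluates the formula with exact fractions for the seven adjacent pairs in 8-ASK; the function name is hypothetical.

```python
from fractions import Fraction

def rho_adjacent(i):
    """Correlation coefficient of states i and i + 1 from Eq. (11.90)."""
    return Fraction(2 * i * (i + 1), 2 * i * (i + 1) + 1)

M = 8
for i in range(M - 1):
    rho = rho_adjacent(i)
    print(f"i = {i}: rho = {rho} = {float(rho):.4f}")
```

The i = 0 entry is zero because state 0 has zero amplitude, and the i = 6 entry is 84/85, which rounds to 0.99.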
This result is shown in Figure 11.37 for M = 8. It shows that only the zeroth state is orthogonal to the other states. The rest of the pairs of adjacent states are strongly correlated, with ρ increasing rapidly towards unity as i increases. For example, the correlation coefficient between the sixth and seventh states (i.e. i = 6 in Eq. (11.90)) is 0.99. In view of Eq. (11.61), the consequence of such high correlation coefficients is a high BER, which we now determine.
Figure 11.37 Correlation coefficient of adjacent states (i and i + 1) in 8-ASK. (Plot: correlation coefficient ρi,i+1, from 0 to 1, versus state index i, from 0 to 6.)
Figure 11.38 Signal space diagram of M-ary ASK. (States S0, S1, S2, …, Si−1, Si, Si+1, …, SM−1 along the α0 axis.)
Figure 11.38 shows the signal-space diagram of an M-ary ASK with message states Si corresponding to the transmitted symbols g′i(t) given by Eq. (11.88). Consider the probability of error Pesi in state Si. In Worked Example 11.3
we justify the fact that Pesi is dominated by the event of Si being mistaken either for Si−1 or for Si+1. Denoting the probability that Si is mistaken for Si−1 as Pesi−, and the probability that it is mistaken for Si+1 as Pesi+, we may write
$$ P_{esi} = P_{esi-} + P_{esi+} \tag{11.91} $$
Pesi+ follows from Eq. (11.61) with Eb replaced by Esi+, the average energy of Si and Si+1, and with ρ given by Eq. (11.90). Using the pulse amplitudes given by Eq. (11.88) and the expression for pulse energy given by Eq. (11.3) we obtain
$$ \begin{aligned} E_{si+} &= \frac{1}{2}\left[\frac{A_c^2 i^2 T_s}{2(M - 1)^2} + \frac{A_c^2 (i + 1)^2 T_s}{2(M - 1)^2}\right] \\ &= \frac{A_c^2 T_s [2i(i + 1) + 1]}{4(M - 1)^2} \end{aligned} \tag{11.92} $$
Furthermore
$$ 1 - \rho = 1 - \frac{2i(i + 1)}{2i(i + 1) + 1} = \frac{1}{2i(i + 1) + 1} $$
So that Eq. (11.61) yields
$$ \begin{aligned} P_{esi+} &= \frac{1}{2}\,\mathrm{erfc}\left(\sqrt{\frac{E_{si+}(1 - \rho)}{2N_o}}\right) \\ &= \frac{1}{2}\,\mathrm{erfc}\left(\sqrt{\frac{A_c^2 T_s}{8(M - 1)^2 N_o}}\right) \equiv P_{esa} \end{aligned} \tag{11.93} $$
We see that the result is independent of i, and therefore is the probability (denoted Pesa) of mistaking any symbol for an adjacent symbol. It follows from Eq. (11.91) that the probability of error in Si is Pesi = 2Pesa, and this applies for all i = 1, 2, 3, …, M − 2. Symbols S0 and SM−1, however, have only one immediate neighbour, and hence Pes0 = PesM−1 = Pesa. The desired probability of symbol error Pes in the M-ary ASK detector is obtained by averaging these errors over all symbols. Thus
$$ \begin{aligned} P_{es} &= \frac{1}{M}\left[P_{esa} + \sum_{i=1}^{M-2} 2P_{esa} + P_{esa}\right] \\ &= \frac{2(M - 1)}{M}P_{esa} \\ &= \frac{M - 1}{M}\,\mathrm{erfc}\left(\sqrt{\frac{A_c^2 T_s}{8(M - 1)^2 N_o}}\right) \end{aligned} \tag{11.94} $$
It is more useful to express this equation in terms of the average energy per bit Eb in the M-ary system. Now the average energy per symbol is
$$ \begin{aligned} E_s &= \frac{1}{M}\sum_{i=0}^{M-1}\left[\frac{A_c^2 i^2 T_s}{2(M - 1)^2}\right] \\ &= \frac{A_c^2 T_s}{2M(M - 1)^2}\sum_{i=0}^{M-1} i^2 \\ &= \frac{A_c^2 T_s}{2M(M - 1)^2}\cdot\frac{(M - 1)M(2M - 1)}{6} \\ &= \frac{A_c^2 T_s (2M - 1)}{12(M - 1)} \end{aligned} $$
We obtained the third line by using the standard expression for the sum of squares
$$ 1^2 + 2^2 + 3^2 + \cdots + n^2 = \frac{1}{6}n(n + 1)(2n + 1) \tag{11.95} $$
Since each symbol represents log2 M bits, the average energy per bit is therefore
$$ E_b = \frac{E_s}{\log_2 M} = \frac{A_c^2 T_s (2M - 1)}{12(M - 1)\log_2 M} \tag{11.96} $$
We can now use this relationship to eliminate Ac²Ts from Eq. (11.94). Thus
$$ P_{es} = \frac{M - 1}{M}\,\mathrm{erfc}\left(\sqrt{\frac{3E_b \log_2 M}{2N_o(2M - 1)(M - 1)}}\right) $$
Finally, with Gray coding, Eq. (11.54) applies, and we obtain the BER of M-ary ASK as
$$ \mathrm{BER} = \frac{M - 1}{M\log_2 M}\,\mathrm{erfc}\left(\sqrt{\frac{3E_b \log_2 M}{2N_o(2M - 1)(M - 1)}}\right) \tag{11.97} $$
This is a remarkable equation. It gives the BER of M-ary ASK explicitly in terms of M and our now familiar Eb/No. Note that when M = 2 this equation reduces nicely to Eq. (11.56) for (binary) ASK, as expected. Figure 11.39 shows a plot of BER against Eb/No for various values of M. We see that the BER increases rapidly with M. For example, at Eb/No = 14 dB, the BER is 2.7 × 10⁻⁷ for M = 2 but increases dramatically to 2.8 × 10⁻³ for M = 4.
Figure 11.39 BER of coherent M-ary ASK, for M = 2, 4, …, 256. (Plot: bit error ratio (BER) from 0.1 down to 10⁻⁹ on a logarithmic scale versus Eb/No from 0 to 40 dB, with one curve for each of M = 2, 4, 8, 16, 32, 64, 128, 256.)
To put it in another way, we need to increase transmitted signal power very significantly in order to obtain the same BER in multilevel ASK as in binary ASK. For example, a BER of 1 × 10⁻⁷ is achieved in binary ASK and 32-ASK at respective Eb/No values of 14.3 and 35.2 dB. Using Eq. (11.63) and the fact that the bit rate of 32-ASK is five times that of binary ASK when using the same bandwidth, it follows that for both systems to operate at the same BER of 1 × 10⁻⁷ the transmitted signal power in a 32-ASK system must be raised above that of binary ASK by
$$ \Delta P = 35.2 - 14.3 + 10\log_{10}(5) = 27.89\ \text{dB} $$
This represents a very large signal power increase by a factor of 615. This finding is in line with Shannon's information capacity theorem (Chapter 12), which stipulates that signal power must be increased if one wishes to transmit at a higher bit rate using the same bandwidth and maintaining the same level of acceptably low BER. Therefore, despite the excellent bandwidth efficiency of M-ary ASK, its extremely poor power efficiency makes multilevel ASK unsuitable for most applications. Binary ASK is, however, widely used, especially on optical fibre links. Note
that Eq. (11.97) is the optimum noise performance attainable with a coherent detector. A poorer noise performance will be obtained if an envelope (noncoherent) detector is used at the receiver.
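The numerical values quoted in connection with Eq. (11.97) can be reproduced with the short sketch below. The function name is hypothetical; the formula is Eq. (11.97) evaluated with the standard library complementary error function.

```python
import math

def ber_mary_ask(M, ebno_db):
    """BER of coherent M-ary ASK from Eq. (11.97); Eb/No is given in dB."""
    ebno = 10 ** (ebno_db / 10)           # convert dB to a linear ratio
    k = math.log2(M)                      # bits per symbol
    arg = math.sqrt(3 * ebno * k / (2 * (2 * M - 1) * (M - 1)))
    return (M - 1) / (M * k) * math.erfc(arg)

# BER at Eb/No = 14 dB for binary and 4-level ASK
print(f"M = 2 : {ber_mary_ask(2, 14):.2e}")
print(f"M = 4 : {ber_mary_ask(4, 14):.2e}")

# Operating points quoted for a BER of about 1e-7
print(f"M = 2  at 14.3 dB: {ber_mary_ask(2, 14.3):.2e}")
print(f"M = 32 at 35.2 dB: {ber_mary_ask(32, 35.2):.2e}")

# Extra transmitted power needed by 32-ASK (five times the bit rate)
delta_p_db = 35.2 - 14.3 + 10 * math.log10(5)
print(f"Power increase: {delta_p_db:.2f} dB, factor {10 ** (delta_p_db / 10):.0f}")
```

Sweeping ebno_db over a range for each M would give curves of the kind plotted in Figure 11.39.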
11.10.3 M-ary PSK
An M-ary PSK signal is made up of one of the following M sinusoidal pulses in each symbol interval of duration Ts. The pulses have the same amplitude and frequency, differing only in phase.
$$ g'_i(t) \equiv g_k(t) = A_c\cos(2\pi f_c t + 2\pi i/M + \theta_0)\,\mathrm{rect}\left(\frac{t - T_s/2}{T_s}\right); \quad i = 0, 1, 2, \cdots, M - 1; \quad f_c = n/T_s; \quad k = \mathrm{GCC}[i] \tag{11.98} $$
The constant θ0 is an angular offset, which specifies the phase of state i = 0 but has no effect on the BER performance of the PSK system, so it can be set to any value, including zero. The signal-space diagram of M-ary PSK consists of message points uniformly spaced at an angular spacing 2π/M along a circle that is centred at the origin. An example is shown in Figure 11.40 for 8-PSK. Because M-ary PSK has a circular constellation, the numbering or index i of the states may start at any angle and may proceed either clockwise or counterclockwise. Furthermore (and this applies to ASK and APSK as well), the bits of each state may be written left to right (i.e. MSB first) or right to left (i.e. MSB last). In Figure 11.40, θ0 = 45°, numbering starts at angle θ0 and proceeds counterclockwise, and the bits are written MSB first. For example, the state at angle −135° is state number i = 4. Eq. (11.98) gives the phase of this state as 2π × 4/8 + π/4 rad = 225° (≡ −135°). Converting i = 4 to Gray code k (using Table 11.1) yields k = 6, so this state must represent the bits 110, as shown in Figure 11.40.
Figure 11.40 Signal space diagram of 8-PSK. (Eight states on a circle in the I (≡ α0), Q (≡ α1) plane. Starting at 45° and proceeding counterclockwise in 45° steps, the states are 000, 001, 011, 010, 110, 111, 101, 100.)
A special feature of the Gray code arrangement of states in M-ary PSK constellation is that, starting at the all-binary-zero state (e.g. 000 in 8-PSK), the Gray code sequence progresses in one direction through one semicircle until all states with MSB = 0 are exhausted, and then the sequence is repeated in the other semicircle by starting at the state adjacent to the all-zero-state and flipping only the MSB and progressing in the opposite direction. The
terminal states of both semicircles are adjacent to each other in the circular constellation and are guaranteed to differ in only one bit position, provided the number of states M is an integer power of 2 (as is usually the case). In Figure 11.40, for example, the Gray code sequence starts at 000 going counterclockwise for one-half of the circle and then starts at 100 going clockwise for the remaining half of the circle. Binary PSK (for which M = 2) has already been discussed at length, and the case M = 4 (QPSK) was introduced in Worked Examples 11.2 and 11.3. We will now briefly consider the generation and detection of M-ary PSK, treating QPSK (with M = 4) first and separately, before presenting a general discussion of BER in M-ary PSK systems.
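The 8-PSK example can be reproduced by combining the phase term of Eq. (11.98) with a Gray code conversion. In this sketch the function names are hypothetical, and the Gray code is computed with the binary-reflected identity k = i XOR (i >> 1), which is assumed to agree with the mapping of Table 11.1.

```python
def psk_states(M, theta0_deg=0.0):
    """List (state index i, phase in degrees, Gray-coded bits) for M-ary PSK.
    Phases follow Eq. (11.98): 360*i/M + theta0, reduced to [0, 360)."""
    n_bits = M.bit_length() - 1           # log2(M) when M is a power of 2
    states = []
    for i in range(M):
        phase = (360.0 * i / M + theta0_deg) % 360.0
        k = i ^ (i >> 1)                  # Gray code of the state index
        states.append((i, phase, format(k, f"0{n_bits}b")))
    return states

def hamming(a, b):
    """Number of bit positions in which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))

states = psk_states(8, theta0_deg=45.0)
for i, phase, bits in states:
    print(f"i = {i}: phase = {phase:5.1f} deg, bits = {bits}")

# Every pair of neighbours on the circle, including the wrap-around pair
# (last state, first state), should differ in exactly one bit.
distances = [hamming(states[i][2], states[(i + 1) % 8][2]) for i in range(8)]
print(distances)
```

State i = 4 comes out at 225° with bits 110, as in the worked example, and the wrap-around pair 100 and 000 confirms that the terminal states of the two semicircles differ in only one bit.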
11.10.3.1 QPSK Modulator and Detector
Figure 11.41a shows the block diagram of a QPSK modulator. The serial-to-parallel converter is a 2-bit shift register that takes in two serial bits b1b0 during one symbol interval Ts = 2Tb and makes b1 available to the upper modulator (labelled I (in-phase) channel) and b0 to the lower modulator (labelled Q (quadrature) channel). The sinusoidal carrier supplied to each of the two product modulators comes from a common source, but the carrier is passed through a 90° phase shifting network before being fed to the lower modulator. The carrier signal of the in-phase channel is therefore cos(2πfct), whereas the carrier signal of the quadrature channel is −sin(2πfct), which leads the cosine carrier by 90°. Bits b1 and b0 are represented in the circuit as bipolar voltages of normalised value +1 V (for binary 1) and −1 V (for binary 0). The QPSK signal is the sum of the I and Q channel outputs. Thus, when the input is b1b0 = 00 the (normalised) QPSK pulse is obtained as follows by straightforward phasor addition
$$ g_{00}(t) = -\cos(2\pi f_c t) + \sin(2\pi f_c t) = \sqrt{2}\cos(2\pi f_c t - 135^\circ) \tag{11.99} $$
Figure 11.41 (a) QPSK modulator and waveforms; (b) QPSK constellation as the sum of two BPSK constellations; (c) QPSK detector. (Constellation: states 01, 11, 10, and 00 at phases 135°, 45°, −45°, and −135°, respectively, in the plane with axes cos (≡ α0) and −sin (≡ α1), where Es = Ac²Ts/2. Detector: the QPSK signal r(t) feeds an upper matched filter hU(t), matched to cos(2πfct), and a lower matched filter hL(t), matched to −sin(2πfct). Each output is sampled at t = Ts and passed to a decision device, which gives binary 1 if the sample is > 0 and binary 0 if it is < 0, followed by a parallel-to-serial converter that delivers the output bit stream.)
Similarly, for the remaining 2-bit inputs 01, 10, and 11, we obtain
$$ \begin{aligned} g_{01}(t) &= -\cos(2\pi f_c t) - \sin(2\pi f_c t) = \sqrt{2}\cos(2\pi f_c t + 135^\circ) \\ g_{10}(t) &= \cos(2\pi f_c t) + \sin(2\pi f_c t) = \sqrt{2}\cos(2\pi f_c t - 45^\circ) \\ g_{11}(t) &= \cos(2\pi f_c t) - \sin(2\pi f_c t) = \sqrt{2}\cos(2\pi f_c t + 45^\circ) \end{aligned} \tag{11.100} $$
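The phasor additions in Eqs. (11.99) and (11.100) can be verified numerically. The sketch below represents cos(2πfct) by the phasor 1 and −sin(2πfct), which leads the cosine by 90°, by the phasor j; the function name and the test frequency are hypothetical.

```python
import cmath
import math

def qpsk_phasor(b1, b0):
    """Amplitude and phase (degrees) of the normalised QPSK pulse
    I*cos(wt) + Q*(-sin(wt)), with I, Q = +1 for binary 1 and -1 for binary 0."""
    i_level = 1 if b1 else -1
    q_level = 1 if b0 else -1
    phasor = complex(i_level, q_level)    # cos -> real axis, -sin -> imaginary axis
    return abs(phasor), math.degrees(cmath.phase(phasor))

for b1, b0 in ((0, 0), (0, 1), (1, 0), (1, 1)):
    amplitude, phase = qpsk_phasor(b1, b0)
    print(f"b1b0 = {b1}{b0}: amplitude = {amplitude:.4f}, phase = {phase:+.1f} deg")

# Spot check in the time domain for b1b0 = 00 at an arbitrary instant
fc, t = 5.0, 0.0123                       # illustrative carrier frequency and time
wt = 2 * math.pi * fc * t
direct = -math.cos(wt) + math.sin(wt)
via_phasor = math.sqrt(2) * math.cos(wt + math.radians(-135))
print(direct, via_phasor)
```

All four pulses have amplitude √2, and the phases −135°, +135°, −45°, and +45° match those in the two equations.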
The signal-space diagram of the generated QPSK signal and the signal waveforms at every stage of the QPSK modulation process are shown in Figure 11.41a. Comparing with Eq. (11.98), we see that this modulator has angular offset θ0 = −3π/4 rad, and the states are numbered clockwise from this point, with the bits written left to right. From the block diagram, we also see that QPSK consists of the simultaneous transmission of two BPSK signals, one on a cosine carrier and the other on an orthogonal sine carrier of the same frequency. This means that the QPSK signal-space diagram can be realised by combining two BPSK constellations, as shown in Figure 11.41b.
QPSK detection is accomplished using two matched filters, as shown in Figure 11.41c. The task is to detect the two bits b1b0 transmitted during each symbol interval Ts. The upper filter is matched to the cosine pulse and detects b1, whereas the lower filter is matched to the sine pulse and detects b0. It is clear from Eqs. (11.99) and (11.100) that the cosine component in the transmitted pulse is positive for b1 = 1 and negative for b1 = 0. Similarly, the sine component is positive for b0 = 1 and negative for b0 = 0. Thus, yU(Ts), the output of the upper filter sampled at t = Ts, equals +Es/2 for b1 = 1, and −Es/2 for b1 = 0. In the same way, yL(Ts) = +Es/2 for b0 = 1, and yL(Ts) = −Es/2 for b0 = 0. So, the sampled output of each filter is passed through a decision device with a threshold of zero, which gives a binary 1 output when its input is positive and a binary 0 output for a negative input. You may wish to review our detailed discussion of matched filter operation in Section 11.8 if you are in any doubt. The parallel-to-serial converter is a 2-bit shift register that takes the bits generated by the two decision devices in each interval Ts and clocks them out serially, b1 first followed by b0.
11.10.3.2 M-ary PSK Modulator and Detector
M-ary PSK modulation (for M > 4) may be achieved using the block diagram shown in Figure 11.42a. The system involves two branches of modulation that are combined to yield the M-ary PSK signal. During each symbol interval, the serial-to-parallel converter takes $k$ ($= \log_2 M$) bits from the input bit stream and presents this block of bits to both branches. Each branch uses a nonbinary multilevel bipolar NRZ coder (described in some literature as a digital-to-analogue converter (DAC)) to map each group of $k$ bits to a discrete output level, denoted $A_I$ in the upper branch and $A_Q$ in the lower branch. The assignment of an output level to a bit group in each branch is done in such a way as to lead to a generated M-ary PSK signal having a Gray code arrangement of states, all lying on one circle centred at the origin.

The multilevel pulse sequence $A_I$ of the upper branch is multiplied by a cosine carrier of frequency $f_c$. This branch is therefore referred to as the in-phase channel. In the lower branch, the pulse sequence $A_Q$ is multiplied by a negative sine carrier of the same frequency $f_c$. This lower branch is therefore called the quadrature channel since the negative sine carrier leads the cosine carrier by 90° in phase. Adding the outputs of both product modulators yields an M-ary PSK pulse in each symbol interval $(0, T_s)$ as

$$
g_{psk}(t) = [A_I\cos(2\pi f_c t) - A_Q\sin(2\pi f_c t)]\,\mathrm{rect}\!\left(\frac{t - T_s/2}{T_s}\right)
\tag{11.101}
$$

which (by straightforward phasor addition) has amplitude $A$ and phase $\phi$ given by

$$
\begin{aligned}
A &= \sqrt{A_I^2 + A_Q^2} \equiv \text{Constant for all bit groups} \\
\phi &= \tan^{-1}(A_Q/A_I) \equiv \text{Unique to each bit group}
\end{aligned}
\tag{11.102}
$$
As discussed in Section 11.3.2, $A_I$ and $A_Q$ are the in-phase and quadrature components of each bandpass symbol. The real and imaginary parts $x$ and $y$ of the state representing each symbol in signal space are obtained by multiplying $A_I$ and $A_Q$, respectively, by $\sqrt{T_s/2}$. Table 11.2 shows the mapping of input bits to discrete output levels in Figure 11.42 that is required to produce the 8-PSK constellation discussed in Figure 11.40. The signal-space components of each state are also shown in the last two columns as $x$ and $y$.

Figure 11.42 M-ary PSK system: (a) modulator; (b) detector.

Table 11.2 Required mapping of input bits to discrete output levels $A_I$ and $A_Q$ (normalised) in the multilevel bipolar NRZ coders of Figure 11.42 to produce the 8-PSK constellation of Figure 11.40.

| Input bits | $A_I$ | $A_Q$ | 8-PSK phase | Real part ($x$) | Imag. part ($y$) |
|---|---|---|---|---|---|
| 000 | $1/\sqrt{2}$ | $1/\sqrt{2}$ | 45° | $\sqrt{T_s}/2$ | $\sqrt{T_s}/2$ |
| 001 | 0 | 1 | 90° | 0 | $\sqrt{T_s/2}$ |
| 010 | −1 | 0 | 180° | $-\sqrt{T_s/2}$ | 0 |
| 011 | $-1/\sqrt{2}$ | $1/\sqrt{2}$ | 135° | $-\sqrt{T_s}/2$ | $\sqrt{T_s}/2$ |
| 100 | 1 | 0 | 0° | $\sqrt{T_s/2}$ | 0 |
| 101 | $1/\sqrt{2}$ | $-1/\sqrt{2}$ | −45° | $\sqrt{T_s}/2$ | $-\sqrt{T_s}/2$ |
| 110 | $-1/\sqrt{2}$ | $-1/\sqrt{2}$ | −135° | $-\sqrt{T_s}/2$ | $-\sqrt{T_s}/2$ |
| 111 | 0 | −1 | −90° | 0 | $-\sqrt{T_s/2}$ |

In general, if we set the M-ary PSK constellation angular offset $\theta_0$ of Eq. (11.98) to

$$
\theta_0 = \pi/M \ \text{rad}
\tag{11.103}
$$
then the output levels of the bipolar NRZ coders required for state $i$, counting from $i = 0$ at the all-binary-zero state and going counterclockwise to cover the entire upper semicircle of the constellation (with terminal state index $i = M/2 - 1$), are

$$
\begin{aligned}
A_I(i) &= A_I(M - 1 - i) = \cos\!\left[\frac{\pi}{M}(1 + 2i)\right] \\
A_Q(i) &= -A_Q(M - 1 - i) = \sin\!\left[\frac{\pi}{M}(1 + 2i)\right] \\
i &= 0, 1, 2, \cdots, M/2 - 1
\end{aligned}
\tag{11.104}
$$

The assignment of bits to the above states follows a Gray code sequence as earlier discussed, starting at the rightmost point of the upper semicircle with 00…0 and going counterclockwise to cover the upper semicircle, and then from the rightmost point of the lower semicircle with 10…0 and going clockwise to cover the lower semicircle. For example, this scheme produces the 8-PSK constellation shown in Figure 11.43 along with tabulated in-phase and quadrature components $A_I$ and $A_Q$ for each state.

To devise a process for coherent detection of the above M-ary PSK signal at the receiver, let us multiply the PSK signal $g_{psk}(t)$ in Eq. (11.101) by $2\cos(2\pi f_c t)$ and use trigonometric identities to simplify this product signal, denoted $x_I(t)$

$$
\begin{aligned}
x_I(t) &= g_{psk}(t) \times 2\cos(2\pi f_c t) \\
&= [A_I\cos(2\pi f_c t) - A_Q\sin(2\pi f_c t)]\,\mathrm{rect}\!\left(\frac{t - T_s/2}{T_s}\right) \times 2\cos(2\pi f_c t) \\
&= [2A_I\cos^2(2\pi f_c t) - 2A_Q\sin(2\pi f_c t)\cos(2\pi f_c t)]\,\mathrm{rect}\!\left(\frac{t - T_s/2}{T_s}\right) \\
&= [A_I + A_I\cos(4\pi f_c t) - A_Q\sin(4\pi f_c t)]\,\mathrm{rect}\!\left(\frac{t - T_s/2}{T_s}\right)
\end{aligned}
\tag{11.105}
$$

Figure 11.43 8-PSK constellation with angular offset $\theta_0$ = 22.5° and its table of in-phase and quadrature components. (Axes: $\alpha_0 \equiv I$ and $\alpha_1 \equiv Q$.)

| Input bits | $A_I$ | $A_Q$ | 8-PSK phase |
|---|---|---|---|
| 000 | 0.9239 | 0.3827 | 22.5° |
| 001 | 0.3827 | 0.9239 | 67.5° |
| 010 | −0.9239 | 0.3827 | 157.5° |
| 011 | −0.3827 | 0.9239 | 112.5° |
| 100 | 0.9239 | −0.3827 | −22.5° |
| 101 | 0.3827 | −0.9239 | −67.5° |
| 110 | −0.9239 | −0.3827 | −157.5° |
| 111 | −0.3827 | −0.9239 | −112.5° |
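The level assignment of Eq. (11.104) can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `mpsk_levels` is hypothetical, the Gray code for the upper semicircle is generated with the common `i ^ (i >> 1)` construction (an assumption that happens to reproduce the bit labels of Figure 11.43), and M is assumed to be a power of 2 with M ≥ 4.

```python
import math

def mpsk_levels(M):
    """Return {bit_string: (A_I, A_Q)} for Gray-coded M-ary PSK with offset pi/M.

    Hypothetical helper following Eq. (11.104); assumes M is a power of 2, M >= 4.
    """
    k = M.bit_length() - 1          # bits per symbol, k = log2(M)
    table = {}
    for i in range(M // 2):         # states on the upper semicircle
        angle = math.pi * (1 + 2 * i) / M
        a_i, a_q = math.cos(angle), math.sin(angle)
        gray = i ^ (i >> 1)         # (k-1)-bit Gray code of the state index
        low_bits = format(gray, f"0{k - 1}b")
        table["0" + low_bits] = (a_i, a_q)    # upper semicircle: leading bit 0
        table["1" + low_bits] = (a_i, -a_q)   # mirror state M-1-i: leading bit 1
    return table

for bits, (a_i, a_q) in sorted(mpsk_levels(8).items()):
    phase = math.degrees(math.atan2(a_q, a_i))
    print(bits, round(a_i, 4), round(a_q, 4), round(phase, 1))
```

For M = 8 the printed levels and phases can be compared row by row with the table accompanying Figure 11.43.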
Next, noting that the sinusoidal pulse completes an integer number $n$ of cycles within a symbol duration $T_s$, so that $f_c = n/T_s$, we integrate $x_I(t)$ over one symbol duration to obtain an output $y_I(T_s)$ as

$$
\begin{aligned}
y_I(T_s) &= \int_0^{T_s} [A_I + A_I\cos(4\pi f_c t) - A_Q\sin(4\pi f_c t)]\,dt \\
&= \left[A_I t + \frac{A_I\sin(4\pi f_c t)}{4\pi f_c} + \frac{A_Q\cos(4\pi f_c t)}{4\pi f_c}\right]_0^{T_s} \\
&= A_I T_s + \frac{1}{4\pi f_c}\left[A_I\sin(4\pi n) + A_Q(\cos(4\pi n) - \cos(0))\right] \\
&= A_I T_s \equiv A_I \ \text{(Normalised)}
\end{aligned}
\tag{11.106}
$$
Thus, the process of multiplying the incoming M-ary PSK signal by $2\cos(2\pi f_c t)$ before integrating over the symbol duration yields an estimate of the in-phase component $A_I$ of the signal. An equivalent view of this process is that when the product signal $x_I(t)$ in Eq. (11.105) is passed through an LPF then the components at frequency $2f_c$ are blocked and only the DC component $A_I$ is passed. This view is entirely consistent with the fact that an integrator is of course an LPF. Similarly, if we multiply $g_{psk}(t)$ by $-2\sin(2\pi f_c t)$ to obtain $x_Q(t)$ and then integrate $x_Q(t)$ over one symbol duration to obtain an output $y_Q(T_s)$, we find that

$$
y_Q(T_s) = A_Q \ \text{(Normalised)}
\tag{11.107}
$$
In this way, the pair of values $(A_I, A_Q)$ used at the transmitter is recovered from the incoming M-ary PSK signal for each symbol interval. This pair may then be fed into a decision device (or special ADC) which maps or converts the pair into a group of $k$ bits according to the bits-assignment rules followed at the transmitter. The decisions take account of the effect of noise and are based on the magnitude of the ratio $|A_Q/A_I|$ and the signs of $A_I$ and $A_Q$. For example, in Figure 11.43, the sectors of the 8-PSK states have been alternately shaded to show the angular range (or decision boundaries) which the detector uses for each state. The angular range 0°–45° belongs to state 000, the range 45°–90° belongs to state 001, the range 90°–135° belongs to state 011, and so on. Therefore, for this 8-PSK constellation, decisions will be made as follows:
● If $0 < |A_Q/A_I| < 1$ and $A_I, A_Q > 0$, then output bits = 000.
● If $|A_Q/A_I| > 1$ and $A_I, A_Q > 0$, then output bits = 001.
● If $|A_Q/A_I| > 1$ and $A_I < 0$, $A_Q > 0$, then output bits = 011.
● And so on.

In the unlikely event of the above conditions falling exactly on a decision boundary, one of the two sectors is chosen purely at random. Figure 11.42b shows a block diagram of the M-ary PSK coherent detection process described above.
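The multiply-and-integrate detection of Eqs. (11.105) to (11.107), followed by the sector decision for the 8-PSK constellation of Figure 11.43, can be illustrated numerically. The sketch below is not the book's implementation: the function names are hypothetical, the symbol duration is normalised to $T_s = 1$, the integral is approximated by a midpoint sum, and the decision uses `atan2` to find the 45° sector, which is assumed here to be equivalent to testing $|A_Q/A_I|$ together with the signs of $A_I$ and $A_Q$.

```python
import math

def detect_components(a_i, a_q, cycles=4, samples=4000):
    """Recover (A_I, A_Q) from one noiseless PSK pulse, with Ts normalised to 1."""
    fc = cycles                      # fc = n / Ts, an integer number of cycles
    dt = 1.0 / samples
    y_i = y_q = 0.0
    for m in range(samples):
        t = (m + 0.5) * dt           # midpoint rule
        c = math.cos(2 * math.pi * fc * t)
        s = math.sin(2 * math.pi * fc * t)
        g = a_i * c - a_q * s        # received pulse, Eq. (11.101)
        y_i += g * 2 * c * dt        # multiply by 2cos and integrate
        y_q += g * (-2) * s * dt     # multiply by -2sin and integrate
    return y_i, y_q

def decide_8psk(y_i, y_q):
    """Map detector outputs to 3 bits using the 45-degree sectors of Figure 11.43."""
    sector_bits = ["000", "001", "011", "010", "110", "111", "101", "100"]
    angle = math.atan2(y_q, y_i) % (2 * math.pi)   # counterclockwise from 0
    return sector_bits[int(angle // (math.pi / 4)) % 8]

y_i, y_q = detect_components(-0.3827, 0.9239)      # state 011 at 112.5 degrees
print(round(y_i, 4), round(y_q, 4), decide_8psk(y_i, y_q))
```

With noise added to the received pulse, the same decision function returns the state whose sector contains the noisy point.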
11.10.3.3 BER of M-ary PSK
The task of determining the BER of M-ary PSK is simplified by the symmetry in the signal-space diagram, e.g. see Figure 11.43. Every transmitted state has the same probability of error, which is therefore the probability of symbol error $P_{es}$ of the system. We will continue with our justified assumption that symbol errors arise from the mistaking of a message state for its immediate neighbour. In Figure 11.44, the probability $P_{esi+}$ of mistaking a message state $S_i$ for its immediate counterclockwise neighbour $S_{i+1}$ is given by Eq. (11.61), with $E_b$ replaced by $E_s$ ($= A_c^2 T_s/2$), and $\rho$ the correlation coefficient of the pulses $g'_i(t)$ and $g'_{i+1}(t)$ in Eq. (11.98). By definition, see Eq. (11.24)

$$
\begin{aligned}
\rho &= \frac{\int_0^{T_s} g_i(t)\,g_{i+1}(t)\,dt}{E_s} \\
&= \frac{A_c^2\int_0^{T_s} \cos[2\pi f_c t + 2\pi i/M + \theta_0]\cos[2\pi f_c t + 2\pi(i+1)/M + \theta_0]\,dt}{A_c^2 T_s/2} \\
&= \frac{1}{T_s}\left\{\int_0^{T_s}\cos(2\pi/M)\,dt + \int_0^{T_s}\cos[4\pi f_c t + 2\pi(2i+1)/M + 2\theta_0]\,dt\right\} \\
&= \cos(2\pi/M) \equiv \cos\phi
\end{aligned}
\tag{11.108}
$$

Figure 11.44 Computation of BER in M-ary PSK. (Axes: cos and −sin; adjacent states $S_{i-1}$, $S_i$, and $S_{i+1}$ are separated by angle $\phi = 2\pi/M$.)
Note that the second integral in the third line is zero, being the integration of a sinusoidal function of frequency $2f_c$ ($\equiv 2n/T_s$) over an interval $T_s$ that is an integer multiple of its period. Eq. (11.108) is an interesting result which confirms what we already know in the following special cases.
(i) Two states separated by $\phi$ = 90° are orthogonal ($\rho = \cos 90^\circ = 0$), e.g. QPSK.
(ii) Two states separated by $\phi$ = 180° are antipodal ($\rho = \cos 180^\circ = -1$), e.g. BPSK. Note, therefore, that the two message states in a BPSK system do not have to lie on (opposite sides of) the cosine axis. All that is required is that they differ in phase by 180°.
(iii) Two (equal energy) states separated by $\phi$ = 0° are identical ($\rho = \cos 0^\circ = 1$).

Armed with this result, we return to the problem at hand and obtain

$$
\begin{aligned}
P_{esi+} &= \frac{1}{2}\,\mathrm{erfc}\!\left(\sqrt{\frac{E_s(1 - \cos\phi)}{2N_o}}\right) \\
&= \frac{1}{2}\,\mathrm{erfc}\!\left(\sqrt{\frac{E_b(1 - \cos\phi)\log_2 M}{2N_o}}\right)
\end{aligned}
$$

where $E_b$ is the energy per bit. $P_{esi+}$ obtained above would be the probability of symbol error in an M-ary system with M = 2. You may wish to verify that this result agrees with Eq. (11.56). However, for M > 2, there is also an immediate clockwise neighbour $S_{i-1}$ for which $S_i$ can be mistaken with equal probability. Thus, the probability of symbol error $P_{es} = 2P_{esi+}$, and BER = $P_{es}/\log_2 M$. It follows that the BER of M-ary PSK (with coherent detection and M > 2) is given by

$$
\mathrm{BER} = \frac{1}{\log_2 M}\,\mathrm{erfc}\!\left(\sqrt{\frac{E_b[1 - \cos(2\pi/M)]\log_2 M}{2N_o}}\right)
\tag{11.109}
$$

Figure 11.45 BER of coherent M-ary PSK (solid curve) with comparison to M-ary ASK (dashed curve) for M = 2, 4, …, 256. (Axes: bit error ratio (BER) versus $E_b/N_o$ in dB.)
As expected, when M = 4 this equation reduces to Eq. (11.56) for the BER of a QPSK system. It should be emphasised that to apply Eq. (11.109) to the binary case (M = 2) a factor of one-half is required to account for the absence of a second immediate neighbour in the signal-space diagram. Figure 11.45 gives a plot of Eq. (11.109) for various values of M. The BER of M-ary ASK is also shown in dashed lines for comparison. We may make the following observations from this plot:
● Signal power $P$ ($= E_b R_b$) in a multilevel PSK system (M > 2) must be increased in order to maintain the error ratio obtainable with BPSK (M = 2). If noise power per unit bandwidth ($N_o$) is the same for all M, then the required increase in signal power when going from $M_1$-ary PSK to $M_2$-ary PSK, where $M_2 > M_1$ and both are integer powers of 2, is

$$
\Delta P = \left.\frac{E_b}{N_o}\right|_{M_2\text{-PSK}} - \left.\frac{E_b}{N_o}\right|_{M_1\text{-PSK}} + 10\log_{10}\!\left(\frac{\log_2 M_2}{\log_2 M_1}\right) \ \text{dB}
\tag{11.110}
$$

where the first two terms on the right-hand side are the values of $E_b/N_o$ in dB required to achieve a specified BER in each system (as read from Figure 11.45). For example, we read from Figure 11.45 that to achieve BER = $10^{-7}$, 8-PSK requires $E_b/N_o$ = 14.75 dB and 64-PSK requires $E_b/N_o$ = 29.35 dB. Therefore, to switch from 8-PSK to 64-PSK while maintaining the same signal transmission quality of BER = $10^{-7}$ and using the same bandwidth, Eq. (11.110) stipulates that signal power must be increased by

$$
\Delta P = 29.35 - 14.75 + 10\log_{10}(6/3) = 17.6 \ \text{dB}
$$

You may recall (see Figure 6.19) that the bandwidth efficiency of M-ary PSK increases logarithmically with M, so this observation is simply highlighting the trading of signal power for improved bandwidth efficiency.
● The noise performance of M-ary PSK is clearly superior to that of M-ary ASK. In fact, 8-PSK has about the same BER as a binary ASK (i.e. OOK) system that uses the same average energy per bit. To further illustrate, achieving the same BER = $10^{-7}$ in a 4-ASK system as in a 4-PSK system requires the signal power of the 4-ASK system to be increased by a factor of 6.9 ($\equiv$ 8.4 dB) above that of the 4-PSK system. Since M-ary ASK and M-ary PSK require the same bandwidth, there is no reason to choose multilevel ASK, except for the simplicity of its modulation and detection circuits.
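Rather than reading values off Figure 11.45, Eqs. (11.109) and (11.110) can be evaluated numerically. The following is a minimal sketch under stated assumptions: the function names are hypothetical, and the required $E_b/N_o$ is found by a simple bisection search (an illustrative choice, not a method taken from the text) over an assumed range of 0 to 60 dB.

```python
import math

def ber_mpsk(M, ebno_db):
    """BER of coherent M-ary PSK from Eq. (11.109), valid for M > 2."""
    ebno = 10 ** (ebno_db / 10)
    k = math.log2(M)
    arg = ebno * (1 - math.cos(2 * math.pi / M)) * k / 2
    return math.erfc(math.sqrt(arg)) / k

def required_ebno_db(M, target_ber, lo=0.0, hi=60.0):
    """Bisection search for the Eb/No (dB) at which ber_mpsk equals target_ber."""
    for _ in range(60):
        mid = (lo + hi) / 2
        if ber_mpsk(M, mid) > target_ber:
            lo = mid                 # BER still too high: need more Eb/No
        else:
            hi = mid
    return (lo + hi) / 2

def power_increase_db(M1, M2, target_ber):
    """Extra signal power (dB) from Eq. (11.110) when switching M1-PSK to M2-PSK."""
    return (required_ebno_db(M2, target_ber) - required_ebno_db(M1, target_ber)
            + 10 * math.log10(math.log2(M2) / math.log2(M1)))

print(round(required_ebno_db(8, 1e-7), 2))
print(round(required_ebno_db(64, 1e-7), 2))
print(round(power_increase_db(8, 64, 1e-7), 2))
```

The computed values should land close to the 14.75 dB, 29.35 dB, and 17.6 dB quoted above, with small differences attributable to graph-reading precision.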
M-ary PSK detection can be greatly simplified if the modulator represents message state $i$ by a shift $\alpha_i$ in the carrier phase of the previous interval, rather than by an absolute phase value. The receiver then performs detection by comparing the phase of the sinusoidal pulse received in the current interval to that of the previous interval. This implementation is known as M-ary DPSK, the binary case of which is discussed in Section 11.9.3. M-ary DPSK, however, has the disadvantage of an inferior noise performance compared to the coherent M-ary PSK discussed in this section.
11.10.4 M-ary FSK
An M-ary FSK signal consists of one of the following M orthogonal sinusoidal pulses in each symbol interval of duration $T_s$. The pulses differ only in frequency, and the frequencies are spaced at half the symbol rate to achieve mutual orthogonality of the transmitted symbols with a minimum required bandwidth.

$$
\begin{aligned}
g_i(t) &= A_c\cos[2\pi(f_0 + i\Delta f)t]\,\mathrm{rect}\!\left(\frac{t - T_s/2}{T_s}\right) \\
i &= 0, 1, 2, \cdots, M - 1; \qquad \Delta f = \frac{1}{2T_s} = \frac{R_s}{2}
\end{aligned}
\tag{11.111}
$$
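The mutual orthogonality of the pulses in Eq. (11.111) can be checked numerically. The sketch below is illustrative only: the function name is hypothetical, $T_s$ and $A_c$ are normalised to 1, the integral is approximated by a midpoint sum, and $f_0 = 10/T_s$ is an assumed value chosen as a multiple of $1/(4T_s)$ so that the sum-frequency term also integrates to zero over one symbol.

```python
import math

def correlation(i, j, f0_cycles=10.0, samples=20000):
    """Correlation coefficient of FSK pulses i and j over one symbol (Ts = 1)."""
    delta_f = 0.5                    # tone spacing 1/(2 Ts) with Ts = 1
    dt = 1.0 / samples
    total = 0.0
    for m in range(samples):
        t = (m + 0.5) * dt           # midpoint rule
        gi = math.cos(2 * math.pi * (f0_cycles + i * delta_f) * t)
        gj = math.cos(2 * math.pi * (f0_cycles + j * delta_f) * t)
        total += gi * gj * dt
    return total / 0.5               # normalise by pulse energy Ac^2 Ts / 2

for i, j in [(0, 0), (0, 1), (1, 3), (2, 3)]:
    print(i, j, round(correlation(i, j), 6))
```

Distinct tone indices give a correlation coefficient of essentially zero, whereas a pulse correlated with itself gives unity.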
The frequency $f_0$ is chosen to place the transmission in the desired or allocated frequency band.

11.10.4.1 M-ary FSK Modulator and Detector
A simple arrangement for generating M-ary FSK is shown in Figure 11.46a. During each symbol interval (of duration $T_s$) the $\log_2 M$-bit to M-level converter takes $\log_2 M$ bits from the input bit stream and converts them to a normalised output equal to their decimal equivalent, 0, 1, 2, …, M − 1. The output of the M-level converter drives a voltage-controlled oscillator (VCO). If the normalised frequency sensitivity of the VCO is $\Delta f$ and its free-running frequency is $f_0$, then its output is the M-ary FSK signal given by Eq. (11.111).

Detection of M-ary FSK requires M matched filters or, equivalently, correlation receivers connected as shown in Figure 11.46b. Only one orthogonal pulse is present in the received signal during each symbol interval. Therefore, during each interval only one of the M branches (the one matched to the transmitted pulse during that interval) will have a significant output whereas the output of the other branches will be negligible. So, the outputs of all M branches after each integration cycle are fed into a decision device, which generates a normalised output equal to the index number (0, 1, 2, …, M − 1) of the input port with the maximum input. A binary encoder then produces the corresponding group of $\log_2 M$ bits, which may be clocked out in a serial fashion using a parallel-to-serial converter.

11.10.4.2 BER of M-ary FSK
We can obtain an upper bound, referred to as the union bound, for the BER of M-ary FSK in a straightforward manner if we make three important observations.
● M-ary FSK has an M-dimensional signal space with all message points at the same distance from each other. Thus, every message point has M − 1 adjacent (or nearest) neighbours. This observation is illustrated in Figure 11.47 for M = 3. In fact, the distance between any pair of message points in the signal space of M-ary FSK (for all M) is $\sqrt{2E_s}$, where $E_s$ is the energy per symbol.
● All message points are mutually orthogonal, which yields $\rho = 0$ in Eq. (11.61).
● Gray coding is not applicable. By averaging the number of bit errors incurred when a message state is mistaken for each of its M − 1 neighbours we find that

$$
\mathrm{BER} = \frac{M/2}{M - 1} \times \text{Symbol Error Rate}
\tag{11.112}
$$

Figure 11.46 M-ary FSK: (a) modulator; (b) coherent detector.

Figure 11.47 Distances in M-ary FSK signal space, M = 3. (States $S_0$, $S_1$, and $S_2$ lie on mutually orthogonal axes $f_0$, $f_0 + R_s/2$, and $f_0 + R_s$.)
With these observations, the probability $P_{es1}$ of mistaking a message point for one other message point follows from Eq. (11.61) with $E_b$ replaced by $E_s$ (the energy per symbol), and $\rho = 0$. Thus

$$
\begin{aligned}
P_{es1} &= \frac{1}{2}\,\mathrm{erfc}\!\left(\sqrt{\frac{E_s}{2N_o}}\right) \\
&= \frac{1}{2}\,\mathrm{erfc}\!\left(\sqrt{\frac{E_b\log_2 M}{2N_o}}\right)
\end{aligned}
\tag{11.113}
$$

where $E_b$ is the energy per bit. The maximum probability of symbol error $P_{esmax}$ is the sum of the probabilities (all equal to $P_{es1}$) of mistaking a message state for each of its M − 1 neighbours. That is

$$
P_{esmax} = (M - 1)P_{es1}
\tag{11.114}
$$

This is an upper bound on $P_{es}$, because summing the individual probabilities implicitly assumes that the error events are mutually exclusive. That is, it assumes that when an error occurs the received state is nearer to only one other state than to the transmitted state. However, there will be some situations in which a received state is nearer to two or more other states than the transmitted state. In this case, summing the probabilities needlessly increases the overall probability of error by including regions of intersection more than once. It follows from Eqs. (11.112)–(11.114) that the upper bound on BER in an M-ary FSK system (with coherent detection) is given by

$$
\mathrm{BER} \le \frac{M}{4}\,\mathrm{erfc}\!\left(\sqrt{\frac{E_b\log_2 M}{2N_o}}\right)
\tag{11.115}
$$

For M = 2 the bound becomes an equality and Eq. (11.115) reduces to the result obtained earlier for binary FSK.

11.10.4.3 Noise-bandwidth Trade-off in M-ary FSK
Equation (11.115) is plotted in Figure 11.48 for selected values of M. Note that, in contrast to M-ary ASK and M-ary PSK, the BER performance improves as M increases. It is very important that you understand why this is the case. M-ary ASK and M-ary PSK transmit different amplitudes and phases of the same carrier frequency. In so doing bandwidth is conserved, but the transmitted sinusoidal pulses become increasingly positively correlated as M is increased. As a result, it becomes more and more difficult to distinguish the pulses at the receiver and the system becomes increasingly vulnerable to noise. In M-ary FSK, on the other hand, mutually orthogonal pulses are transmitted at all values of M, making them perfectly distinguishable at the receiver. However, this is achieved at the expense of bandwidth since each of the M pulses must have a different frequency, with a separation equal to half the symbol rate or $1/(2T_s)$. Increasing M increases $T_s$ and allows closer spacing of the frequencies. But the spacing reduces only in inverse proportion to $\log_2 M$, whereas the number of frequencies required increases with M. So, overall the bandwidth increases roughly in proportion to $M/\log_2 M$. As M is increased, the symbol interval $T_s$ ($= T_b\log_2 M$) increases. Since the receiver performs detection by integrating over an interval $T_s$, the contribution of random noise is significantly reduced as the integration interval increases, leading to improved BER performance. You can see how an increase in $T_s$ reduces noise effects by noting that if random noise is observed over a sufficiently long time then the positive and negative samples tend to balance, giving a sum close to zero. The action of integration is equivalent to summing these samples and including a scaling factor.
Of course, the effect of noise is also reduced in M-ary ASK and M-ary PSK as M and hence $T_s$ increases, but the benefit to the receiver of this reduction is more than cancelled out by the increased correlation of the transmitted pulses. Therefore, M-ary FSK gives us the ability to trade bandwidth for an improved noise performance in a way that is not possible with M-ary PSK and M-ary ASK.
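The trade-off described above can be made concrete by evaluating the union bound of Eq. (11.115) alongside a rough bandwidth estimate. This is a sketch under stated assumptions: the function names are hypothetical, and the bandwidth figure is simply M tones spaced $R_s/2$ apart, expressed per unit bit rate as $M/(2\log_2 M)$, which ignores filter roll-off.

```python
import math

def ber_mfsk_bound(M, ebno_db):
    """Union bound on the BER of coherent M-ary FSK, Eq. (11.115)."""
    ebno = 10 ** (ebno_db / 10)
    return (M / 4) * math.erfc(math.sqrt(ebno * math.log2(M) / 2))

def relative_bandwidth(M):
    """Rough bandwidth per unit bit rate: M tones spaced Rs/2, with Rs = Rb / log2(M)."""
    return M / (2 * math.log2(M))

# At a fixed Eb/No the bound falls as M grows, while the bandwidth estimate rises.
for M in (2, 4, 16, 64):
    print(M, ber_mfsk_bound(M, 10.0), relative_bandwidth(M))
```

The example uses $E_b/N_o$ = 10 dB purely for illustration; any value within the range of Figure 11.48 shows the same trend.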
Figure 11.48 Upper bound of BER of M-ary FSK for M = 2, 4, 16, 64, 256, 1024. (Axes: bit error ratio (BER) versus $E_b/N_o$ in dB.)
11.10.5 M-ary APSK
M-ary PSK constrains all M message states to a circle centred about the origin in signal space. BER performance improvement can be achieved if the message states are spaced more freely and widely in the 2D signal space. This requires combining two quadrature (sine and cosine) carriers that are modulated both in amplitude and phase. This is therefore a hybrid modulation technique known as M-ary amplitude and phase shift keying (APSK), which in some literature is also described as quadrature amplitude modulation (QAM). There are different implementations of APSK, leading to square, circular, and star constellations, as shown in Figure 11.49 for M = 16. The star APSK for M = 16 has a minimum phase difference of 90° between message points of the same energy and is therefore thought to perform better than square APSK in transmission media with a predominance of phase distortion. Our discussion focuses on square APSK and its BER performance, with 16-APSK modulation and detection presented in detail. The same principles can be applied to other types of APSK. In general, APSK involves sinusoidal pulses of duration $T_s$ and of the form

$$
g_i(t) = A_i\cos(2\pi f_c t + \phi_i)\,\mathrm{rect}\!\left(\frac{t - T_s/2}{T_s}\right)
\tag{11.116}
$$

The amplitude $A_i$ and phase $\phi_i$ take on discrete sets of values, which depend on the implementation.

11.10.5.1 16-APSK
Figure 11.49 Various APSK constellations: square APSK, circular APSK, and star APSK.

An illustration of the combination of two quadrature carriers to form M-ary APSK is shown in Figure 11.50 for M = 16. A sine carrier (of frequency $f_c$ and duration $T_s$) codes 2 bits using two phases (0 and 180°) and two (normalised) amplitudes 1 and 3. An orthogonal cosine carrier also codes 2 bits in a similar manner. The two pulses are added, with the result that 4 bits are conveyed in each symbol interval using a sinusoidal pulse (of frequency $f_c$ and duration $T_s$) that can take on three different amplitudes (indicated by the dotted circles) and a number of phase angles. Thus, whereas all transmitted pulses in 16-PSK have the same energy $E_s$, in this square 16-APSK four symbols are transmitted with minimum energy $E_0$, eight symbols with energy $E_1 = 5E_0$, and four with peak energy $E_2 = 9E_0$. The average energy per symbol in 16-APSK is therefore

$$
E_s = \frac{4E_0 + 8(5E_0) + 4(9E_0)}{16} = 5E_0 = 5A_c^2 T_s
\tag{11.117}
$$
where $A_c$ is the amplitude of each quadrature carrier. Figure 11.51a shows the block diagram of a 16-APSK modulator. The 2-bit Gray code converter (GCC) performs the following conversions

$$
00 \to -3, \qquad 01 \to -1, \qquad 11 \to +1, \qquad 10 \to +3
\tag{11.118}
$$

During each symbol interval $T_s = 4T_b$, the serial-to-parallel converter takes 4 bits $b_3 b_2 b_1 b_0$ of the input bit stream and presents $b_1 b_0$ to the lower GCC and $b_2 b_3$ to the upper GCC. Note the flipping of bit order in the latter. The output of the upper GCC multiplies a carrier $A_c\sin(2\pi f_c t)$ to give the quadrature component of the 16-APSK signal, whereas the output of the lower GCC multiplies a carrier $A_c\cos(2\pi f_c t)$ to give the in-phase component. Both carriers are derived from a common source and have the same amplitude $A_c$ and frequency $f_c$ but differ in phase by 90°. The desired 16-APSK signal is obtained by summing the in-phase and quadrature channels.

Figure 11.50 Square M-ary APSK constellation, M = 16. (Axes: cos and −sin; adjacent states are spaced $2d$ apart, with state energies $E_0 = 2d^2$, $E_1 = 10d^2 = 5E_0$, and $E_2 = 18d^2 = 9E_0$. Rows of the constellation from top to bottom, each read left to right: 0000 0001 0011 0010; 1000 1001 1011 1010; 1100 1101 1111 1110; 0100 0101 0111 0110.)

As an example, assume an input $b_3 b_2 b_1 b_0$ = 0111. Following the Gray code conversion in Eq. (11.118), the in-phase channel output is

$$
y_{i11}(t) = A_c\cos(2\pi f_c t)
$$

and the quadrature channel output is

$$
y_{q01}(t) = 3A_c\sin(2\pi f_c t)
$$

Note that the swapping of bits at the input to the upper GCC causes $b_3 b_2$ to be received as $b_2 b_3$, and hence the above result for $b_3 b_2$ = 01. The resulting 16-APSK pulse is therefore

$$
\begin{aligned}
g_{0111}(t) &= A_c\cos(2\pi f_c t) + 3A_c\sin(2\pi f_c t) \\
&= \sqrt{10}\,A_c\cos(2\pi f_c t - 71.6^\circ)
\end{aligned}
\tag{11.119}
$$

Note that this pulse (of duration $T_s$) has energy $5A_c^2 T_s = E_1$, and phase −71.6°, which is in agreement with the location of 0111 on the 16-APSK constellation diagram of Figure 11.50. By proceeding in the same manner, you
may verify that the block diagram of Figure 11.51a does generate the array of message points in Figure 11.50. It is noteworthy that all adjacent points on the 16-APSK constellation differ in only one bit position, even for points (like 1111) with up to four immediate neighbours. This is important for BER minimisation, as discussed earlier, and results from the use of a Gray code in each channel.

Figure 11.51 Square 16-APSK: (a) modulator; (b) detector.

A 16-APSK detector is shown in Figure 11.51b. The 4-level quantiser approximates each matched filter output to the nearest of 4 levels $-3E_0/2$, $-E_0/2$, $+E_0/2$, and $+3E_0/2$. A 4-level to 2-bit GCC then generates the bits in each channel according to the conversions

$$
-3E_0/2 \to 00, \qquad -E_0/2 \to 01, \qquad E_0/2 \to 11, \qquad 3E_0/2 \to 10
\tag{11.120}
$$
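The bit-to-level mapping of Eq. (11.118), the worked example for input 0111, and the inverse mapping of Eq. (11.120) can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, carrier amplitude is normalised to $A_c = 1$, and the detector levels are normalised to ±1 and ±3 rather than $\pm E_0/2$ and $\pm 3E_0/2$.

```python
import math

# Gray code conversions of Eq. (11.118) and their inverse, Eq. (11.120) normalised
GRAY_TO_LEVEL = {"00": -3, "01": -1, "11": +1, "10": +3}
LEVEL_TO_GRAY = {level: bits for bits, level in GRAY_TO_LEVEL.items()}

def modulate_16apsk(bits):
    """Map b3 b2 b1 b0 to (in-phase level, quadrature level)."""
    b3, b2, b1, b0 = bits
    in_phase = GRAY_TO_LEVEL[b1 + b0]      # lower GCC, cosine carrier
    quadrature = GRAY_TO_LEVEL[b2 + b3]    # upper GCC, note the flipped bit order
    return in_phase, quadrature

def amplitude_phase(in_phase, quadrature):
    """I cos(x) + Q sin(x) = A cos(x + phase), with the phase returned in degrees."""
    amplitude = math.hypot(in_phase, quadrature)
    return amplitude, -math.degrees(math.atan2(quadrature, in_phase))

def detect_16apsk(y_i, y_q):
    """Quantise each channel to the nearest level, then recover b3 b2 b1 b0."""
    def quantise(y):
        return min((-3, -1, 1, 3), key=lambda level: abs(y - level))
    lower = LEVEL_TO_GRAY[quantise(y_i)]   # b1 b0
    upper = LEVEL_TO_GRAY[quantise(y_q)]   # b2 b3
    return upper[1] + upper[0] + lower

i_level, q_level = modulate_16apsk("0111")
print(i_level, q_level, amplitude_phase(i_level, q_level))
print(detect_16apsk(0.8, 3.3))             # noisy version of (1, 3)
```

Averaging $I^2 + Q^2$ over all 16 inputs gives 10, which corresponds to $E_s = 10 \times A_c^2 T_s/2 = 5A_c^2 T_s$, in agreement with Eq. (11.117).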
The 2 bits from each channel are combined as shown in a parallel-to-serial converter to produce a serial output bit stream.

11.10.5.2 BER of Square M-ary APSK
A square M-ary APSK constellation is one in which the signal states form a regular 2D grid with equal spacing in both horizontal and vertical directions centred at the origin in signal space. An example is shown in Figure 11.50 for M = 16. The system conveys $k = \log_2 M$ bits per symbol interval using two orthogonal and hence independent channels. If M is an even integer power of 2 (i.e. M = 4, 16, 64, 256, 1024, …) then the bits are shared equally among the two channels, with the in-phase and quadrature channels each conveying $k/2 = \log_2\sqrt{M}$ bits per symbol interval. But if M is an odd integer power of 2 (i.e. M = 8, 32, 128, 512, …) then the two channels will be unequally loaded with one conveying $(k + 1)/2 = \log_2\sqrt{2M}$ bits per symbol interval and the other conveying $(k - 1)/2 = \log_2\sqrt{M/2}$ bits per symbol interval. It may be shown (see pages 382–385 of [1]) that the BER of such M-ary APSK is given by the expression

$$
\mathrm{BER} = \frac{2}{\log_2 M}\left[\frac{\sqrt{M} - u}{\sqrt{M}}\right]\mathrm{erfc}\!\left(\sqrt{\frac{3\log_2 M}{2(aM - 1)}\,\frac{E_b}{N_o}}\right)
\tag{11.121}
$$

where

$$
\begin{aligned}
a &= u = 1, && \text{for } M \text{ an even integer power of 2} \\
a &= 1.25;\ u = \frac{3\sqrt{2}}{4}, && \text{for } M \text{ an odd integer power of 2}
\end{aligned}
$$

Note that when M = 4 this equation reduces to Eq. (11.56) for the BER of a QPSK system. This is to be expected since 4-APSK and QPSK are of course identical. Figure 11.52 shows the BER of square M-ary APSK versus $E_b/N_o$ (expressed in dB) for various values of M. For comparison, the BER of the corresponding M-ary PSK is also shown in dashed lines. We can see that M-ary APSK provides a significant improvement in BER and allows a lower value of $E_b/N_o$ to be used for a given error performance. For example, there is a saving of 9.8 dB in the signal power required by 64-APSK for a BER of $10^{-7}$ compared to a 64-PSK system of the same BER. However, unlike M-ary PSK, the performance of M-ary APSK is sensitive to channel nonlinearity. The superior noise performance evident in Figure 11.52 assumes that there are no amplitude distortions in the transmission system. We also see that the noise performance of 8-APSK is only marginally better than that of 8-PSK (and this is because errors in 8-APSK are dominated by the orthogonal channel that carries 2 bits per symbol), so, given the sensitivity of 8-APSK to channel nonlinearity, 8-PSK will be preferred to 8-APSK in all applications. Furthermore, Figure 11.52 shows the noise performance of 64-APSK as being very close to that of 16-PSK. However, in view of Eq. (11.110) we know that an extra signal power of 2.1 dB is required to increase bit rate by a factor of 3/2 (using the same bandwidth and maintaining the same low BER) when one switches from 16-PSK to 64-APSK.

Figure 11.52 BER of square M-ary APSK, with comparison to M-ary PSK. (Axes: bit error ratio (BER) versus $E_b/N_o$ in dB.)
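Equation (11.121) is straightforward to evaluate, which allows the comparison with M-ary PSK to be checked without reading Figure 11.52. The sketch below uses hypothetical function names, assumes M is a power of 2 with M ≥ 4, and repeats Eq. (11.109) for M-ary PSK so that the block is self-contained.

```python
import math

def ber_square_apsk(M, ebno_db):
    """BER of square M-ary APSK from Eq. (11.121); M a power of 2, M >= 4."""
    k = round(math.log2(M))
    if k % 2 == 0:
        a, u = 1.0, 1.0                        # M an even integer power of 2
    else:
        a, u = 1.25, 3 * math.sqrt(2) / 4      # M an odd integer power of 2
    ebno = 10 ** (ebno_db / 10)
    root_m = math.sqrt(M)
    arg = 3 * k / (2 * (a * M - 1)) * ebno
    return (2 / k) * ((root_m - u) / root_m) * math.erfc(math.sqrt(arg))

def ber_mpsk(M, ebno_db):
    """BER of coherent M-ary PSK from Eq. (11.109), valid for M > 2."""
    ebno = 10 ** (ebno_db / 10)
    k = math.log2(M)
    arg = ebno * (1 - math.cos(2 * math.pi / M)) * k / 2
    return math.erfc(math.sqrt(arg)) / k

# M = 4: both expressions should coincide (4-APSK and QPSK are identical)
print(ber_square_apsk(4, 10.0), ber_mpsk(4, 10.0))
# M = 64 at the same Eb/No: APSK is far better than PSK
print(ber_square_apsk(64, 19.6), ber_mpsk(64, 19.6))
```

At $E_b/N_o$ = 19.6 dB the 64-APSK result is close to $10^{-7}$, roughly 9.8 dB below the 29.35 dB that 64-PSK needs for the same BER.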
11.11 Design Parameters A designer of a digital transmission system has the following important parameters at their disposal. ●
Transmission bandwidth: this is a scarce resource that must be used judiciously to maximise the number of users or services. All digital transmission systems employ raised cosine filters (introduced in Worked Example 4.16) for which the transmission bandwidth (also called occupied bandwidth) Bocc in Hz when transmitting at symbol rate Rs in baud (i.e. symbols per second) is Bocc
⎧ ⎪Rs (1 + 𝛼), ) = ⎨ (M + 1 + 𝛼 , R ⎪ s 2 ⎩
M-ary ASK, PSK, APSK M-ary FSK
(11.122)
and the noise equivalent bandwidth B (discussed in Section 4.7.3.5) is B = Rs (1 − 𝛼∕4)
(11.123)
where 𝛼 is the roll-off factor of the raised cosine filter. Eq. (11.123) applies to all M-ary modulation schemes (including FSK) since they employ a bank of correlation receivers, each matched to one symbol of duration T s = 1/Rs . ●
●
Signal power: this is an expensive resource that has a direct bearing on component sizes, battery life, radiation safety, and potential interference to other systems. The signal power Ps at the reference point of the receiver input may be determined in various transmission media, as discussed extensively in Chapter 5. Noise power: the presence of noise places a fundamental limit on the ability of the receiver circuit to distinguish between each of the unique symbols used by the transmitter to convey different bit groups. Every instance of symbol misidentification produces one or more errors in the recovered bit stream. The noise power Pn at the reference point of the receiver input is given by the product of the noise power per unit bandwidth N o and the noise equivalent bandwidth B of Eq. (11.123): Pn = No B = kT sys B
(11.124)
where k = 1.38 × 10−23 J/K is Boltzmann’s constant and T sys is the noise temperature of the receiver, as discussed in Chapter 6. ●
Carrier-to-noise ratio (C/N): the performance quality of a transmission system is determined not just by signal power but by the ratio between signal power Ps and noise power Pn (called the carrier-to-noise ratio and denoted C/N) at the reference point of the receiver input. Another parameter that is widely used in digital transmission systems to compare signal and noise powers is the ratio Eb /N o between the average energy per bit Eb and the noise power per unit bandwidth N o at the receiver input. Using Eq. (11.123) and (11.19), the two parameters are related as follows Rs log2 M ER E C = b b = b × N No B No Rs (1 − 𝛼∕4) log2 M Eb (11.125) × = No (1 − 𝛼∕4)
11.11 Design Parameters
Here, Rb is the coded bit rate which includes any redundant bits inserted for the purpose of error control. The system designer must think beyond simply increasing the transmitted signal power and explore ways of reducing overall signal loss as well as the noisiness of the transmission medium and the receiver.
● Bit rate: some services, such as interactive digital television, require a minimum transmission bit rate for proper operation. The time taken to transmit a given amount of data over a communication network is inversely proportional to bit rate, and this will have implications on service cost if charged according to service duration, as in plain old telephone service (POTS). Bit rate Rb is directly related to symbol rate Rs through the order M of the M-ary modulation scheme and is only indirectly related to transmission bandwidth Bocc through Rs (as stated in Eq. (11.122))
$$R_b = R_s \log_2 M \qquad (11.126)$$
● Bit error ratio: most services specify a maximum BER that can be tolerated, for example $10^{-4}$ for voice and $10^{-7}$ for data. The transmission link must therefore be designed to deliver at the reference point of the receiver input a signal that has a C/N or Eb/No which satisfies the required BER specification upon demodulation and error control decoding.
● Modcod: the combination of modulation and error control coding schemes is often referred to as modcod. All the M-ary modulation schemes discussed in this chapter enable log2 M bits to be conveyed by each transmitted symbol. In the schemes of M-ary ASK, PSK, and APSK the transmitted symbols all have the same frequency, so bandwidth is used more efficiently by a factor of log2 M compared to binary transmission in which M = 2, but signal power must be increased to maintain the same BER. In the case of M-ary FSK, the symbols have different orthogonal frequencies, so bandwidth is used less efficiently as M increases, but a lower signal power is required to maintain the same BER. In all cases, redundant bits may be systematically introduced prior to modulation to enable error detection/correction at the receiver which allows the use of a lower signal power to maintain the same BER. When error control coding is used in this way, the ratio between message bits and message plus redundant bits is known as the code rate r and the dB reduction in the required value of Eb/No is known as coding gain Gc. The threshold value of Eb/No required to not exceed a specified maximum BER on the link is then given by
$$(E_b/N_o)_{\text{threshold}} = (E_b/N_o)_{\text{theoretical}} - G_c + L_{mil} \qquad (11.127)$$
where (Eb /N o )theoretical is the theoretical value of Eb /N o at the specified BER read from graphs such as Figure 11.45, 11.48 and 11.52 for the modulation scheme, and Lmil is the modem implementation loss, which indicates that a practical modem will not be as efficient as suggested by the theoretical BER curves and will therefore require an extra signal power given by Lmil in dB. Let us bring together in one place the expressions that relate the above design parameters in each of the digital modulated systems that we have studied. Message bit rate Rb (which includes all bits except those inserted for the sole purpose of error control at code rate r), transmission or occupied bandwidth Bocc , modulation order M, bandwidth efficiency 𝜂, and raised cosine filter’s roll-off factor 𝛼 are related as follows
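$$\eta = \frac{R_b}{B_{occ}} = \begin{cases} \dfrac{r\log_2 M}{1 + \alpha}, & \text{ASK, PSK, APSK} \\[2ex] \dfrac{r\log_2 M}{\alpha + (M + 1)/2}, & \text{FSK} \end{cases} \qquad (11.128)$$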
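The relations of Eqs. (11.124) to (11.128) are simple enough to collect into a few helper functions. The following is a minimal sketch rather than a definitive design tool: the function names are our own, and the 290 K noise temperature used in the usage lines is an assumed illustrative value that does not come from the text.

```python
import math

BOLTZMANN = 1.38e-23  # J/K, the value quoted with Eq. (11.124)


def noise_power(t_sys_kelvin, bandwidth_hz):
    """Eq. (11.124): Pn = k * Tsys * B, in watts."""
    return BOLTZMANN * t_sys_kelvin * bandwidth_hz


def cn_to_ebno_db(cn_db, m, alpha):
    """Eq. (11.125) in dB: Eb/No = C/N - 10log10(log2 M / (1 - alpha/4))."""
    return cn_db - 10 * math.log10(math.log2(m) / (1 - alpha / 4))


def bit_rate(symbol_rate, m):
    """Eq. (11.126): Rb = Rs log2 M."""
    return symbol_rate * math.log2(m)


def ebno_threshold_db(theoretical_db, coding_gain_db, impl_loss_db):
    """Eq. (11.127): all quantities in dB."""
    return theoretical_db - coding_gain_db + impl_loss_db


def bandwidth_efficiency(m, alpha, code_rate=1.0, fsk=False):
    """Eq. (11.128): eta = Rb / Bocc for ASK/PSK/APSK, or for FSK."""
    bits_per_symbol = code_rate * math.log2(m)
    if fsk:
        return bits_per_symbol / (alpha + (m + 1) / 2)
    return bits_per_symbol / (1 + alpha)


# Usage with assumed values: Tsys = 290 K (hypothetical), B = 57 MHz
print(noise_power(290, 57e6))                        # noise power in watts
print(cn_to_ebno_db(19, 8, 0.2))                     # Eb/No in dB for 8-PSK
print([bandwidth_efficiency(m, 0.0, fsk=True) for m in (2, 4, 8, 16)])
```

The last line lets you see how the FSK efficiency varies with M for a chosen roll-off factor, which is a quick way to explore the bandwidth penalty of high-order FSK.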
BER and Eb/No are related as follows in the various transmission systems
$$\text{BER} = \begin{cases} \dfrac{M - 1}{M\log_2 M}\,\text{erfc}\left(\sqrt{\dfrac{3E_b\log_2 M}{2N_o(2M - 1)(M - 1)}}\right), & \text{ASK} \\[3ex] \dfrac{1}{\log_2 M}\,\text{erfc}\left(\sqrt{\dfrac{E_b[1 - \cos(2\pi/M)]\log_2 M}{2N_o}}\right), & \text{PSK } (M > 2) \\[3ex] \dfrac{2}{\log_2 M}\left[\dfrac{\sqrt{M} - u}{\sqrt{M}}\right]\text{erfc}\left(\sqrt{\dfrac{3\log_2 M}{2(aM - 1)}\,\dfrac{E_b}{N_o}}\right), & \text{APSK} \\[3ex] \le \dfrac{M}{4}\,\text{erfc}\left(\sqrt{\dfrac{E_b\log_2 M}{2N_o}}\right), & \text{FSK} \end{cases} \qquad (11.129)$$
where a = u = 1 for M an even integer power of 2, and a = 1.25, $u = 3\sqrt{2}/4$ for M an odd integer power of 2. Note that the APSK constellation to which the above BER expression applies is a square APSK having states arranged in a uniform 2D grid. Also, you will recall that the BER for BPSK (M = 2) is the same as for QPSK (M = 4) and is given as half the above PSK expression with M = 2. Furthermore, for binary FSK the BER is exactly equal to the upper bound given above.
Worked Example 11.5
An on-board-processing (OBP) geostationary orbit (GEO) satellite system provides broadband communication to 20 VSAT terminals. Transmission on the inbound link from each VSAT to the satellite is at 12 Mb/s using 8-PSK modulation. The satellite employs 16-APSK modulation to transmit on the outbound link at 240 Mb/s to the VSATs, and this transmission consists of packets addressed to each of the 20 VSATs. Each VSAT receives the 240 Mb/s bit stream from the satellite and processes this to extract the information addressed to it. All links employ raised cosine filters with roll-off factor 𝛼 = 0.2. Determine:
(a) The noise equivalent bandwidth of the VSAT receiver.
(b) The occupied bandwidth of the transmission from VSAT to satellite.
(c) The BER at the satellite receiver if the 8-PSK demodulator on the satellite has a 1 dB implementation loss and the inbound link from VSAT to satellite has C/N = 20 dB.
(d) The fade margin of the inbound link in (c) if threshold BER (prior to error correction in the onboard decoder) is set at $10^{-3}$.
(e) The BER at the VSAT receiver if the 16-APSK demodulator in the VSAT receiver has a 1.5 dB implementation loss and the outbound link from satellite to VSAT has C/N = 22 dB.
(f) The fade margin of the outbound link in (e) if threshold BER (prior to error correction in the VSAT decoder) is set at $3 \times 10^{-4}$.
(a) Each VSAT receiver (on downlink) is required to receive at bit rate Rb = 240 Mb/s transmitted via 16-APSK by the satellite.
Thus, received symbol rate Rs = Rb/log2 M = 240/log2 16 = 240/4 = 60 MBd. Using Eq. (11.123), the VSAT receiver’s noise equivalent bandwidth is
$$B = R_s(1 - \alpha/4) = 60(1 - 0.2/4) = 57\ \text{MHz}$$
(b) Transmission from VSAT to satellite is at Rb = 12 Mb/s using 8-PSK. Thus, symbol rate Rs = Rb/log2 M = 12/log2 8 = 12/3 = 4 MBd, and occupied bandwidth is
$$B_{occ} = R_s(1 + \alpha) = 4(1 + 0.2) = 4.8\ \text{MHz}$$
(c) A practical modem with implementation loss Lmil = 1 dB has the same BER at C/N = 20 dB as a theoretically ideal modem has at C/N = 20 − Lmil = 19 dB. Since the BER curve for 8-PSK in Figure 11.45 is plotted as a function of Eb/No, we use Eq. (11.125) to convert C/N to Eb/No, obtaining
$$\frac{E_b}{N_o} = \frac{C}{N} - 10\log_{10}\left(\frac{\log_2 M}{1 - \alpha/4}\right) = 19 - 10\log_{10}\left(\frac{\log_2 8}{1 - 0.2/4}\right) = 19 - 5 = 14\ \text{dB}$$
Reading the 8-PSK BER curve of Figure 11.45 at Eb/No = 14 dB yields the BER in the satellite receiver as BER = $8.8 \times 10^{-7}$.
(d) From the BER vs Eb/No curve of Figure 11.45 for 8-PSK, a threshold BER of $10^{-3}$ corresponds to a threshold Eb/No of 10 dB. Converting to C/N
$$\frac{C}{N} = \frac{E_b}{N_o} + 10\log_{10}\left(\frac{\log_2 M}{1 - \alpha/4}\right) = 10 + 5 = 15\ \text{dB}$$
This is the C/N level required to achieve the threshold BER of $10^{-3}$ in a theoretical 8-PSK modem. A practical modem will need something higher by Lmil = 1 dB. Thus, the threshold C/N is 15 + 1 dB = 16 dB. Fade margin is the amount by which actual link C/N exceeds the threshold value. Thus, the fade margin of the inbound link is 20 − 16 = 4 dB.
(e) This practical modem with modem implementation loss Lmil = 1.5 dB will have the same BER at 22 dB as a theoretical modem has at C/N = 22 − 1.5 = 20.5 dB. We will assume that the 16-APSK scheme has a square constellation so that the BER curves of Figure 11.52 are applicable. Converting C/N to Eb/No yields
$$\frac{E_b}{N_o} = \frac{C}{N} - 10\log_{10}\left(\frac{\log_2 M}{1 - \alpha/4}\right) = 20.5 - 10\log_{10}\left(\frac{\log_2 16}{1 - 0.2/4}\right) = 20.5 - 6.24 = 14.26\ \text{dB}$$
Reading the 16-APSK BER curve of Figure 11.52 at Eb/No = 14.26 dB yields the BER in the VSAT receiver as BER = $1.5 \times 10^{-6}$.
(f) From the 16-APSK BER curve of Figure 11.52, Eb/No = 11.5 dB at BER = $3 \times 10^{-4}$. The corresponding C/N is C/N = Eb/No + 6.24 = 11.5 + 6.24 = 17.74 dB. The practical C/N is 1.5 dB higher due to modem implementation loss. Thus, threshold C/N = 19.24 dB, and since the link’s actual C/N is 22 dB, the fade margin is 22 − 19.24 = 2.76 dB.
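The graph readings used in Worked Example 11.5 can be cross-checked against the closed-form PSK and square APSK expressions of Eq. (11.129). The sketch below is one possible way to do this and is not part of the original solution: the function names and the bisection helper are our own, and the thresholds are computed from the formulas rather than read from Figures 11.45 and 11.52, so small differences from the graph-based values are to be expected.

```python
import math


def ber_psk(ebno_db, m):
    """M-ary PSK (M > 2) row of Eq. (11.129)."""
    ebno = 10 ** (ebno_db / 10)
    k = math.log2(m)
    arg = math.sqrt(ebno * (1 - math.cos(2 * math.pi / m)) * k / 2)
    return math.erfc(arg) / k


def ber_square_apsk(ebno_db, m):
    """Square M-ary APSK row of Eq. (11.129)."""
    ebno = 10 ** (ebno_db / 10)
    k = math.log2(m)
    # a = u = 1 for M an even power of 2; a = 1.25, u = 3*sqrt(2)/4 otherwise
    if round(k) % 2 == 0:
        a, u = 1.0, 1.0
    else:
        a, u = 1.25, 3 * math.sqrt(2) / 4
    root_m = math.sqrt(m)
    arg = math.sqrt(3 * k / (2 * (a * m - 1)) * ebno)
    return (2 / k) * (root_m - u) / root_m * math.erfc(arg)


def cn_minus_ebno_db(m, alpha):
    """Conversion term of Eq. (11.125): 10log10(log2 M / (1 - alpha/4))."""
    return 10 * math.log10(math.log2(m) / (1 - alpha / 4))


def threshold_ebno_db(ber_func, m, target_ber, lo=0.0, hi=30.0):
    """Bisection search (illustrative helper) for the Eb/No in dB that gives
    target_ber, assuming BER falls monotonically between lo and hi."""
    for _ in range(60):
        mid = (lo + hi) / 2
        if ber_func(mid, m) > target_ber:
            lo = mid  # BER still too high, so more Eb/No is needed
        else:
            hi = mid
    return (lo + hi) / 2


def link_check(ber_func, m, cn_db, impl_loss_db, target_ber, alpha=0.2):
    """Return (BER at the given link C/N, fade margin in dB)."""
    offset = cn_minus_ebno_db(m, alpha)
    ber = ber_func(cn_db - impl_loss_db - offset, m)
    threshold_cn = threshold_ebno_db(ber_func, m, target_ber) + offset + impl_loss_db
    return ber, cn_db - threshold_cn


ber_in, margin_in = link_check(ber_psk, 8, 20.0, 1.0, 1e-3)
ber_out, margin_out = link_check(ber_square_apsk, 16, 22.0, 1.5, 3e-4)
print(f"inbound : BER = {ber_in:.2e}, fade margin = {margin_in:.2f} dB")
print(f"outbound: BER = {ber_out:.2e}, fade margin = {margin_out:.2f} dB")
```

Each fade margin is the link C/N minus the sum of threshold Eb/No, the C/N conversion term, and the implementation loss, which mirrors the steps of the worked solution.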
11.12 Summary
We have now come to the end of our study of digital modulated transmission, which featured an in-depth instruction in the analysis, design, and operation of the major digital modulation schemes. It must be emphasised that our discussion was not exhaustive. However, the solid foundation and thorough understanding which you have now acquired in the principles should give you plenty of confidence and all the tools required for dealing with the many applications and variant techniques in this rapidly developing field. Binary and M-ary ASK offer the advantages of bandwidth efficiency and simple modulation and demodulation circuits. However, this class of techniques suffers from a poor BER and requires, comparatively, the largest signal power for an acceptable BER. Binary and M-ary PSK have the same high bandwidth efficiency as the corresponding ASK system as well as the added advantage of good (i.e. low) BERs. However, they require complex modulation and demodulation circuits
and, beyond M = 4, they are significantly inferior to M-ary APSK in BER performance when the transmission system does not introduce amplitude distortions. Binary and M-ary FSK have the poorest bandwidth efficiency of all the digital modulation techniques, with a peak at M = 4. The BER performance is the same as in ASK for M = 2, but M-ary FSK allows a unique and subtle exchange between bandwidth, signal power, and BER, as has been discussed. The circuit complexity of FSK systems lies somewhere between that of ASK and PSK. M-ary APSK, with its less restricted distribution of signal states in a 2D signal space, provides good efficiency in signal power consumption. It has the same bandwidth efficiency as the corresponding ASK and PSK systems but a better BER than both systems. It, however, has the drawbacks of circuit complexity (comparable to PSK) and susceptibility to channel nonlinearity.
Historically, early low-speed modems used for data transmission over the public switched telephone network (PSTN) employed FSK. Many International Telecommunication Union (ITU) standards were specified. For example, V.21 specified a full duplex modem operating at 300 b/s. Sinusoids at frequencies f0 = 980 Hz and f1 = 1180 Hz were used in the forward direction, and f0 = 1650 Hz and f1 = 1850 Hz in the return direction. The V.23 modem provided a half-duplex operation at 1.2 kb/s using frequencies f0 = 1300 Hz and f1 = 2100 Hz. There was provision in this standard for a 75 b/s back channel that used tones at 390 and 450 Hz. FSK was also employed for teletype transmission via HF and VHF radio.
PSK was used in voice-band full-duplex synchronous modems for data transmission over the PSTN. For example, QPSK was used in the ITU-T V.22 and V.26 modems, which, respectively, operated at carrier frequencies of 1.2 kHz and 1.8 kHz, and bit rates of 1.2 kb/s and 2.4 kb/s. The V.29 and V.32 modems used 16-APSK to achieve a bit rate of 9.6 kb/s.
The V.32 can also operate with a 32-state signal space (or 32-APSK), where extra bits are included for forward error correction, using a technique known as trellis coding. This allows the same BER to be achieved with 4 dB less signal power than in 16-APSK. Such a saving in signal power at the same BER and transmission bandwidth is referred to as a coding gain. There is of course necessarily a higher circuit complexity. The V.33 modem operated at a carrier frequency of 1.8 kHz and delivered a maximum bit rate of 14.4 kb/s using 128-APSK, and 12 kb/s using 64-APSK. Both included trellis coding. The V.34 modem delivered various bit rates up to a maximum of 33.6 kb/s using trellis coding and subsets of a 1664-APSK constellation. The V.90 modem also employed APSK, with a constellation in excess of 1024 states, to achieve a bit rate of 56 kb/s. Modern applications of the digital modulation principles discussed in this chapter are much more varied and sophisticated. Binary ASK, implemented as OOK and in many cases combined with wavelength division multiplexing, is widely employed for transmission in optical fibre links. High-order M-ary FSK is employed in deep space communications where its unique features can be exploited to enhance the reliable detection of extremely weak signals. BPSK, QPSK, and 8-PSK are employed in radio frequency identification (RFID), wireless personal area networks (such as Bluetooth and ZigBee), wireless local area networks (such as Wi-Fi), various generations of terrestrial mobile communication systems and satellite communications. High-order M-ary APSK (for M ≥ 16) is increasingly widely employed in terrestrial and satellite communication systems to support an adaptive modulation strategy which strives always to use the most bandwidth-efficient modulation and coding combination permitted by the prevailing transmission channel conditions.
Reference 1 Otung, I. (2014). Digital Communications: Principles & Systems. London: Institution of Engineering and Technology (IET). ISBN: 978-1849196116.
Questions
1
(a) Determine the duration, energy, centre frequency, and bandwidth of the bandpass pulse $v(t) = 20\,\text{rect}(2 \times 10^3 t)\sin(4\pi \times 10^4 t)$ V (b) Sketch the waveform of the above pulse.
2
Figure Q11.2 shows the orthonormal basis functions and constellation diagram of a baseband transmission system. (a) Determine the amplitude V 𝜙 of the basis functions. (b) Sketch a clearly labelled waveform of each transmitted symbol. (c) Calculate the energy of each transmitted symbol.
3
A transmission system conveys information using the symbols shown in Figure Q11.3. (a) Determine and sketch the orthonormal basis functions of the system. (b) Calculate the energy of each of the transmitted symbols. (c) Express each symbol as a linear combination of the basis functions. (d) Sketch the constellation diagram of the transmission system.
4
Sketch the constellation diagram of a transmission system that employs the symbols g0 (t) to g7 (t) shown in Figure Q11.4. Calculate the energy of each symbol.
5
Determine and sketch the orthonormal basis function(s) of a transmission system that uses the following pulses: (a) g2(t) and g5(t) in Figure Q11.4. (b) g0(t) and g1(t) in Figure Q11.3. (c) g3(t) and g8(t) in Figure Q11.3.
6
A digital modulated system transmits the following symbols
$$g_k(t) = \begin{cases} 10(1 + j)\cos\left[8000\pi t + \dfrac{\pi}{4}(2i + 1)\right], & 0 \le t \le 1\ \text{ms} \\ 0, & \text{elsewhere} \end{cases}$$
where i = 0, 1, 2, 3; j = 0, 1; k = i + 4j.
(a) What is the dimension N of the system’s signal space? (b) Sketch the constellation diagram. What are the values of symbol energies used? (c) Determine the system’s orthonormal basis functions. (d) Express each transmitted symbol as a linear combination of the orthonormal basis functions. (e) What is the name of the modulation technique employed?
Figure Q11.2 Question 11.2. (Basis functions α0(t) and α1(t) of amplitude Vϕ plotted against t in μs; constellation diagram with states S0, S1, S2, S3 on axes α0 and α1.)
7
Calculate the correlation coefficient of the following energy signals: (a) g4 (t) and g5 (t) in Figure Q11.4 (b) g1 (t) and g2 (t) in Figure Q11.4 (c) g4 (t) and g6 (t) in Figure Q11.4 (d) g1 (t) and g4 (t) in Figure Q11.3 (e) g3 (t) and g8 (t) in Figure Q11.3.
8
A binary ASK system uses two sinusoidal pulses of duration Ts and amplitudes Ao and A1 = 𝛼Ao, where Ao > 0 and 𝛼 ≥ 0. Determine: (a) An expression for the average energy per bit Eb. (b) An expression for the BER of the system in terms of 𝛼 and Eb/No only, where No is the noise power per unit bandwidth. (c) The value of 𝛼 that yields the lowest BER for a given Eb.
Figure Q11.3 Question 11.3. (Waveforms g0(t) to g8(t), amplitude in V plotted against t in μs.)
9
A binary PSK system uses two sinusoidal pulses of phases 𝜃 o and 𝜃 1 = 𝜃 o + 𝜑. Determine: (a) An expression for the BER of the system in terms of 𝜑 and Eb /N o , where N o is the noise power per unit bandwidth and Eb is the energy per bit. (b) The value of 𝜑 that yields the lowest BER for a given Eb /No . (c) Comment on the significance of 𝜃 o and 𝜑 on BER.
10
A binary modulated system transmits at 8448 kbit/s. The noise power per unit bandwidth at the receiver input is $10^{-19}$ W/Hz and each received sinusoidal pulse has amplitude Ac = 5.2 μV. What are the transmission bandwidth and BER when the following schemes are used? (a) Coherent PSK. (b) Coherent FSK. (c) Coherent OOK, if 1’s and 0’s are equally likely.
11
A binary baseband system transmits the following pulses, shown in Figure Q11.4: (a) g5(t) and g1(t) for binary 0 and binary 1, respectively. (b) g5(t) and g2(t) for binary 0 and binary 1, respectively. The pulses experience an attenuation of 140 dB up to the detection point where the noise power per unit bandwidth is $10^{-17}$ W/Hz. Calculate the bit rate and BER of each transmission. Comment on the difference in BER.
12
A data transmission system operates at a bit rate of 4 kb/s using an 11 kHz tone to convey binary 1 and an 8 kHz tone for binary 0. The noise power per unit bandwidth at the receiver is 83.33 nW/Hz. The transmitted tones are of amplitude 1 V and they suffer a net loss of 20 dB along the path leading to the detection point. Determine: (a) The transmission bandwidth (b) The bit error ratio (BER).
Figure Q11.4 Question 11.4. (Waveforms g0(t) to g7(t), amplitude in V plotted against t in μs.)
13
The noise power per unit bandwidth at the receiver of a 140 Mb/s data transmission system is $10^{-19}$ W/Hz. Determine the minimum received average signal power in dBm that is required to achieve a maximum BER of $10^{-7}$ using the following modulation techniques: (a) Coherent BASK (b) QPSK (c) DPSK (d) Noncoherent BFSK.
14
A transmission system is to have a maximum BER of $10^{-4}$. The average received signal power is −60 dBm and the noise power per unit bandwidth is $4.2 \times 10^{-18}$ W/Hz. Determine the maximum bit rate that is possible with the following modulation schemes: (a) Coherent BASK (b) BPSK (c) QPSK (d) DPSK.
15
Derive an expression for the BER of the following systems when there is a phase error 𝜑 in the incoming signal: (a) BPSK (b) Coherent BASK. Hence, determine the extra signal power required to make up for a phase error of 10° in each system and prevent an increase in BER.
16
By making use of Eq. (11.62), show that the correlation coefficient of adjacent pulses gi (t) and gi + 1 (t) in Eq. (11.88) is as given in Eq. (11.90).
17
Repeat Question 11.13 for the following modulation schemes: (a) 16-FSK (b) 16-PSK (c) 16-ASK (d) 16-APSK.
18
A transmission system is to have a maximum BER of $10^{-7}$. The average received signal power is −60 dBm and the noise power per unit bandwidth is $4.2 \times 10^{-18}$ W/Hz. Determine the maximum bit rate that is possible with the following modulation schemes: (a) 64-ASK (b) 64-PSK (c) 64-FSK (d) 64-APSK.
12 Pulse Shaping and Detection
Live optimistically and hope for the best, but design deliberately in expectation of the worst. If a bad scenario can arise, assume in your design that it will.
In this Chapter ✓ Pulse shaping to eliminate intersymbol interference (ISI): a clear discussion of various anti-ISI filtering techniques, including Nyquist, raised cosine, root raised cosine, and duobinary. ✓ Information capacity law: a nonmathematical introduction to the Shannon–Hartley law followed by a detailed analysis of its implications. ✓ Digital receiver: a brief discussion of the core functions of a digital communication receiver, including a detailed treatment of the matched filter for optimum pulse detection in the presence of additive white Gaussian noise. ✓ Worked examples: a mix of graphical, heuristic, and mathematical approaches to further develop your understanding and hone your problem-solving skills.
12.1 Introduction
So far, we have mostly represented transmitted symbols as rectangular pulses in baseband systems (Chapter 10) and as sinusoidal pulses (being the product of a sinusoidal signal and a rectangular window function) in bandpass systems (Chapter 11). However, the rectangular function, because of its sharp transitions, contains high-frequency components, which are attenuated in practical finite-bandwidth transmission systems. This causes the transmitted pulse to spread out beyond its symbol interval, so that it contributes significant energy into one or more adjacent symbol intervals. This phenomenon is known as intersymbol interference (ISI) and, if not addressed, will contribute to symbol errors, as the receiver cannot discriminate between energy in the current symbol and leftover energy from previous symbols. In Worked Example 4.9 of Chapter 4, we explore the limiting effect of ISI on symbol rate and transmission system capacity. It would be helpful to review this worked example before working through the rest of this chapter, which is devoted to (i) exploring various techniques of filtering and pulse shaping to minimise ISI; (ii) gaining a deeper insight into the interplay among various system parameters, particularly the constraints placed on bit rate and system capacity by ISI, bandwidth, signal power, and channel noise; and (iii) understanding the specifications of a matched filter to optimise signal detection in the presence of noise. We will carefully quantify the challenges at hand and then explore practical measures that can be taken at the transmitter to minimise ISI and at the receiver to enhance the correct detection of incoming symbols in the presence of noise.
Communication Engineering Principles, Second Edition. Ifiok Otung. © 2021 John Wiley & Sons Ltd. Published 2021 by John Wiley & Sons Ltd. Companion Website: www.wiley.com/go/otung
Figure 12.1 Transmission of rectangular pulse x(t) of duration Ts through ideal lowpass LTI channel of bandwidth B. Symbol rate Rs = 1/Ts; output = y(t). (Panels show y(t) for B = Rs/4, B = Rs/2, B = Rs, and B = 4Rs across past, current, and future symbol intervals.)
Figure 12.1 illustrates the transmission of a rectangular pulse x(t) (also called symbol) of duration Ts through an ideal lowpass linear time invariant (LTI) channel (i.e. filter; note that every channel or system is a filter of some sort). At symbol duration Ts, the number of symbols transmitted each second, or symbol rate (denoted Rs), is
$$R_s = \frac{1}{T_s} \quad (\text{symbols per second} \equiv \text{baud}) \qquad (12.1)$$
The output y(t) of the channel is shown for various values of channel bandwidth from B = Rs /4 to B = 4Rs . We see the occurrence of varying degrees of ISI, which may contribute to symbol detection errors at the receiver due to the identity of the current symbol being blurred by contributions from previous symbols. From Figure 12.1, one obvious solution to ISI would be to increase channel bandwidth B, since the trend in the plots (for B = Rs /4 to B = 4Rs ) shows that pulse spreading reduces as B is increased. This behaviour is to be expected from the inverse relationship between time and frequency discussed in Section 4.6: the channel narrows the bandwidth of the input pulse x(t) to B at the output, so the duration of the output pulse y(t) broadens beyond T s in response. The amount of pulse broadening decreases as bandwidth narrowing lessens (i.e. as B becomes larger), and vice versa. However, attempting to solve the ISI problem by increasing transmission bandwidth is a very expensive approach, which may even be unfeasible if it requires bandwidth to be increased beyond what is physically possible in the transmission medium. In Section 12.2 we explore effective and economic solutions through filtering that controls pulse broadening to ensure that the received pulse has zero value at the decision instants of adjacent intervals. Whatever the anti-ISI filtering measure employed, the channel bandwidth must be at least wide enough to pass the fundamental frequency f 0 of the fastest-changing sequence of transmitted pulses, as discussed in Figure 11.19
in connection with the fastest changing bit sequence 101010… If the channel has enough bandwidth to pass the fundamental sinusoidal waveform then sampling this waveform at the midpoint of each symbol interval enables the receiver to completely identify the pulse sequence. However, if the channel bandwidth is smaller than f0, the fundamental sinusoidal waveform will be blocked, making it impossible for the receiver to detect this fastest-changing pulse sequence. All other pulse sequences will change more slowly and will therefore have a lower fundamental frequency and hence require less bandwidth, so a bandwidth that passes f0 is adequate for detecting all possible pulse sequences having pulse duration Ts. For baseband transmission this means that the minimum passband must be from DC to f0, whereas for bandpass transmission obtained by multiplying the baseband message signal by a sinusoidal carrier signal of frequency fc, it means that minimum passband must be from fc − f0 to fc + f0. Since (from Figure 11.19) f0 = 1/2Ts = Rs/2, it follows that a baseband system can support transmission at a symbol rate that is at most twice the bandwidth of the system, whereas a bandpass system can support a symbol rate that is at most equal to its bandwidth. That is
$$R_{s\max} = \begin{cases} 2B, & \text{Baseband system} \\ B, & \text{Bandpass system} \end{cases} \qquad (12.2)$$
Furthermore, in order to convey bits, there needs to be M unique symbols each of which is used to represent a unique combination of k = log2 M bits, the assignment of which is agreed between transmitter and receiver. In general, the transmission system is described as M-ary, but in the special cases M = 2, 3, and 4 it is described as binary, ternary, and quaternary, respectively. Beyond M = 3, the number of unique symbols M is always chosen to be an integer power of 2 to ensure full utilisation of all possible bit combinations. The task of the receiver is to identify each received symbol and then to locally generate the group of k bits represented by that symbol. One or more bit errors occur whenever there is symbol misidentification.
Each transmit symbol may be described as possessing one of M distinct states that differ in amplitude, phase, frequency, or a combination of these three parameters. Since each received symbol delivers log2 M bits, the maximum bit rate follows from Eq. (12.2) as
$$R_{b\max} = R_{s\max}\log_2 M = 2B\log_2 M \qquad (12.3)$$
Can we therefore indefinitely increase the bit rate through a channel of bandwidth B simply by increasing M? In Section 12.3 we look at the constraint placed by noise on M and hence on the maximum bit rate that is possible for error-free transmission using bandwidth B. We then develop in Section 12.4 the specification of a filter, known as a matched filter, that enables the receiver to optimise its ability to correctly detect a known symbol in the presence of noise.
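To get a feel for Eqs. (12.2) and (12.3), the short sketch below tabulates the ISI-limited maximum bit rate for a few values of M. The function names and the 4 kHz bandwidth are assumed purely for illustration, and the figures ignore noise, which is the constraint taken up in Section 12.3.

```python
import math


def max_symbol_rate(bandwidth_hz, bandpass=False):
    """Eq. (12.2): Rsmax = 2B for a baseband system, B for a bandpass system."""
    return bandwidth_hz if bandpass else 2 * bandwidth_hz


def max_bit_rate(bandwidth_hz, m, bandpass=False):
    """Eq. (12.3): Rbmax = Rsmax * log2 M."""
    return max_symbol_rate(bandwidth_hz, bandpass) * math.log2(m)


B = 4000.0  # Hz, a hypothetical channel bandwidth
for m in (2, 4, 16, 256):
    print(m, max_bit_rate(B, m), max_bit_rate(B, m, bandpass=True))
```

Doubling the number of bits per symbol requires squaring M, so the bit rate grows only logarithmically with the number of symbol states.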
12.2 Anti-ISI Filtering
If we disregard the approach of attempting to provide enough transmission bandwidth to prevent any pulse spreading then an outline of the required shape of the spread pulse p(t) at the receiver’s decision point is as shown in Figure 12.2 for the nth pulse. That is, at the decision point in the receiver where incoming pulses are sampled at a regular interval Ts, the waveform of p(t) should pass through the points marked with an asterisk on the diagram, having a maximum absolute value at the decision instant of its own interval and zero value at the decision instants of adjacent intervals. Provided the receiver follows the correct timing, we don’t care much about the values of p(t) at nonsampling time instants.
Figure 12.2 Outline of pulse shape to avoid ISI: (a) positive symbol; (b) negative symbol. (Pulses p+(t − nTs) and p−(t − nTs) on a vertical scale from −1 to +1, with sampling instants (n − 2)Ts, (n − 1)Ts, nTs, (n + 1)Ts, and (n + 2)Ts marked on the t axis.)
To explore this concept of controlled pulse spreading in more depth, consider Figure 12.3, which shows a baseband transmission system in which a baseband signal x(t) produced by a symbol generator (in this case a line coder) at point (a) in a transmitter is decided upon at point (b) in a receiver. The baseband signal consists of a sequence of pulses spaced Ts seconds apart, which means that the symbol rate is Rs = 1/Ts. The worst-case situation for pulse spreading occurs when the pulses have very narrow widths so that the nth pulse can be approximated by an impulse of weight bn which carries the identity of the nth bit. For example, bn = 1 for binary 1 and bn = −1 for binary 0. Note that the discussion that follows is equally applicable to M-ary transmission, in which case bn takes on values drawn from a discrete set of M amplitudes. Thus
$$x(t) = \sum_n b_n \delta(t - nT_s)$$
Figure 12.3 Zero-ISI pulse spreading: The sequence of narrow pulses at point (a) in the transmitter is spread in the transmission system to become a sequence of sinc pulses at point (b) in the receiver. (Block diagram: data in 1101… → line coder → (a) x(t) → baseband transmission system → (b) y(t) → sample at t = nTs → decision device → data out.)
Due to the finite bandwidth of the transmission system from symbol generator output at point (a) to decision point (b) at the receiver, the baseband signal y(t) arriving at point (b) will be a sequence of spread pulses. Normalising the transmission path to unit gain and zero delay, we may write
$$y(t) = \sum_n b_n h(t - nT_s)$$
Since h(t) is the output pulse in response to an input impulse 𝛿(t), it represents the impulse response of the entire transmission path from point (a) to point (b). At the detection point, y(t) is sampled at a regular interval Ts in step with the transmitter and the sample value y(nTs) is passed to a decision device which compares it to one threshold level (in the case of binary transmission) or multiple levels (for M > 2) to reach a decision as to which bit(s) were transmitted in that interval. Thus, to avoid ISI it is necessary and sufficient for the nth pulse h(t − nTs) at the detection point to have (normalised) unit value at its own sampling instant t = nTs, and a value of zero at all other sampling instants …, (n − 2)Ts, (n − 1)Ts, (n + 1)Ts, (n + 2)Ts, … This means that we require that at the sampling instant t = mTs the contribution bn h(mTs − nTs) from h(t − nTs) to the sample output y(mTs) should be zero in all cases except m = n when the contribution equals bn. Stated mathematically
$$h(mT_s - nT_s) = \begin{cases} 1 & \text{for } m = n \\ 0 & \text{otherwise} \end{cases} \qquad (12.4)$$
So long as there is correct timing at the receiver, the values and hence waveform of h(t − nTs) between sampling instants are immaterial. Equation (12.4) is the Nyquist criterion for zero ISI which allows a number of solutions, as discussed in the following subsections.
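12.2.1 Nyquist Filtering
A sinc impulse response
$$h(t) = \text{sinc}(t/T_s) \qquad (12.5)$$
satisfies Eq. (12.4) since
$$\text{sinc}\left(\frac{mT_s - nT_s}{T_s}\right) = \text{sinc}(m - n) = \begin{cases} 1 & \text{for } m = n \\ 0 & \text{otherwise} \end{cases} \qquad (12.6)$$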
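Equation (12.6) can be confirmed numerically by building the received signal as a sum of weighted sinc pulses and sampling it at t = mTs. The sketch below is illustrative only: the function names are our own, Ts is normalised to 1, and the weights follow the convention bn = +1 for binary 1 and bn = −1 for binary 0 applied to the bit sequence 1101.

```python
import math


def sinc(x):
    """Normalised sinc: sin(pi x)/(pi x), with sinc(0) = 1."""
    if x == 0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)


def received_sample(weights, m, ts=1.0, offset=0.0):
    """y(t) at t = m*Ts + offset, where y(t) = sum_n b_n sinc((t - n*Ts)/Ts)."""
    t = m * ts + offset
    return sum(b * sinc((t - n * ts) / ts) for n, b in enumerate(weights))


bits = [1, 1, 0, 1]
weights = [1 if b else -1 for b in bits]

# Sampling exactly at t = m*Ts recovers each weight with no ISI,
# apart from floating-point rounding in sin(pi*k).
samples = [received_sample(weights, m) for m in range(len(weights))]
print(samples)

# Sampling between the decision instants mixes in the neighbouring pulses.
print(received_sample(weights, 2, offset=0.25))
```

Only the pulse belonging to the current interval contributes at a correctly timed sampling instant, whereas a mistimed sample contains contributions from every other pulse in the sequence.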
This is illustrated in Figure 12.3 for a binary impulse sequence in (a) corresponding to a bit sequence 1101… We see that among the received spread pulses arriving at (b) only the current pulse has a nonzero value at the current sampling instant. This means that if the impulse response of the transmission path from symbol output point at transmitter to symbol sampling point at receiver is as given by Eq. (12.5) then ISI-free operation is guaranteed in a properly synchronised receiver. From row 9 of Table 4.5, the Fourier transform (FT) of Eq. (12.5) yields the transfer function of an ISI-free channel as
$$H(f) = T_s\,\text{rect}(fT_s) = \frac{1}{R_s}\,\text{rect}\left(\frac{f}{R_s}\right) \qquad (12.7)$$
Figure 12.4 (a) Impulse response h(t) = sinc(t/Ts) and (b) transfer function H(f) = Ts rect(f/Rs) of ideal Nyquist channel for zero-ISI.
The impulse response h(t) and transfer function H(f) are plotted in Figure 12.4, which makes clear that the channel specified by H(f) is an ideal lowpass filter (LPF) of bandwidth
$$B = \frac{R_s}{2} \qquad (12.8)$$
known as the Nyquist bandwidth for ISI-free transmission at symbol rate Rs. Such a channel is called the ideal Nyquist channel and allows transmission at a symbol rate, called the Nyquist rate, which is twice the available bandwidth. This is the very best that we can do without incurring ISI. That is

$$R_{s\max} = 2B$$

which you may recognise as Eq. (12.2) in the introduction. The passband gain of 1/Rs is not significant to the scheme of ISI elimination and may be normalised to unit gain. It is important to emphasise that Eq. (12.7) does not necessarily prescribe a single filter for the transmission system; rather, it lays down the filtering characteristic which must be satisfied by the entire transmission path from symbol output point at the transmitter to the sampling point at the receiver. Typically this path will consist of several filters, in which case Eq. (12.7) specifies what the product of their transfer functions must be for ISI-free operation. There are, however, two practical problems with this solution to ISI, namely:

● It requires an overall filtering action that produces constant gain up to frequency f = Rs/2, and infinite attenuation beyond this cut-off frequency. Such a sharp transition cannot be achieved in real time because it requires noncausality (see Section 2.10.3) whereby future inputs contribute to current output. Thus, our ideal scenario of ISI-free transmission at a symbol rate that is twice the channel bandwidth is practically unrealisable in real time.
● The envelope of the sinc pulse sinc(t/Ts) decays very slowly with t as 1/|t| which leaves significant residual energy in adjacent intervals, a contribution that is only avoided if we sample at precisely the right timings t = 0, Ts, 2Ts, 3Ts, … The Nyquist channel therefore imposes a stringent requirement on timing accuracy. Any timing error leads to significant contribution to each sampled output from previous and future (yes, future!) sinc pulses at each mistimed sampling instant. That the future affects the present only serves to highlight that a sinc pulse is not of this world (i.e. it is not possible to implement it in real time)!
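The timing-sensitivity problem can be put into numbers with a short sketch. It assumes unit-weight sinc pulses in every neighbouring interval and a sampling instant that is late by a fraction `eps` of Ts; the function name `worst_case_isi` and the parameters `eps` and `n_max` are illustrative and do not come from the text. The sum of the magnitudes of the neighbouring pulses at the mistimed instant is a worst-case ISI figure: it vanishes for perfect timing, and otherwise keeps growing as more neighbours are included, which is what the 1/|t| envelope implies.

```python
import math

def sinc(x):
    """Normalised sinc, sin(pi x)/(pi x), as used in the text."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def worst_case_isi(eps, n_max):
    """Sum of |sinc(n + eps)| over the n_max neighbours on each side (n != 0).

    eps is the timing error as a fraction of Ts; all pulses have unit weight.
    """
    return sum(abs(sinc(n + eps)) + abs(sinc(-n + eps))
               for n in range(1, n_max + 1))

for n_max in (10, 100, 1000):
    # Perfect timing (eps = 0) versus a 5% timing error (an assumed value)
    print(n_max, worst_case_isi(0.0, n_max), worst_case_isi(0.05, n_max))
```

The slow growth with `n_max` is roughly logarithmic, so no finite number of neighbours bounds the worst case for a pure sinc pulse.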
12.2.2 Raised Cosine Filtering

We may address the problem of residual energy highlighted above by introducing a factor which forces a more rapid decay of the impulse response envelope. Now consider the set of impulse response functions

$$h(t) = \left[\frac{\cos(\pi\alpha t/T_s)}{1 - 4(\alpha t/T_s)^2}\right]\mathrm{sinc}(t/T_s) \tag{12.9}$$

where 0 ≤ 𝛼 ≤ 1 is a dimensionless parameter. These all satisfy the Nyquist criterion for zero ISI stated in Eq. (12.4) since

$$\begin{aligned} h(mT_s - nT_s) &= \frac{\cos\left(\pi\alpha\dfrac{mT_s - nT_s}{T_s}\right)}{1 - 4\alpha^2\left(\dfrac{mT_s - nT_s}{T_s}\right)^2}\,\mathrm{sinc}\left(\frac{mT_s - nT_s}{T_s}\right) \\ &= \frac{\cos[(m-n)\pi\alpha]}{1 - 4\alpha^2(m-n)^2}\,\mathrm{sinc}(m-n) = \begin{cases} 1, & m = n \\ 0, & \text{otherwise} \end{cases} \end{aligned}$$
This can be readily seen in Figure 12.5a, where the impulse responses, plotted for three values of the parameter 𝛼 = 0, 0.5, 1, have zero crossings at all nonzero integer multiples of Ts. Notice that, when 𝛼 = 0, Eq. (12.9) reduces to the sinc impulse response of Eq. (12.5) and the rate of decay of the tails of the function is left unchanged. At the other end of the scale, 𝛼 = 1 gives the fastest rate of decay of the tails of the impulse response function so that it contains negligible energy outside its main lobe and therefore contributes negligible ISI at slightly mistimed sampling instants. By now we have come to expect that this gradual narrowing of the duration of the impulse function as 𝛼 is increased from 0 to 1 must come at the price of bandwidth broadening. Taking the FT of Eq. (12.9) yields the corresponding transfer function H(f) plotted in Figure 12.5b. The analytic form of this transfer function is

$$H(f) = \frac{1}{R_s}\times\begin{cases} 1, & |f| \le f_1 \\ \dfrac{1}{2}\left[1 + \cos\left(\pi\dfrac{|f| - f_1}{f_2 - f_1}\right)\right], & f_1 \le |f| \le f_2 \\ 0, & |f| \ge f_2 \end{cases}$$

$$f_1 = (1-\alpha)R_s/2; \quad f_2 = (1+\alpha)R_s/2; \quad 0 \le \alpha \le 1 \tag{12.10}$$
Figure 12.5 (a) Impulse response h(t) and (b) transfer function H(f) of raised cosine filter for zero-ISI, at various values of roll-off factor 𝛼.

For 𝛼 > 0, this spectrum represents an LPF having a constant gain portion up to f = f1, followed by a portion in which the gain decreases gradually until it reaches zero at f = f2. The decrease in gain with frequency in the interval f1 ≤ f ≤ f2 follows a cosine function to which the number 1 is added to limit the result to nonnegative values. This filter is therefore called a raised cosine filter. Its null bandwidth (≡ occupied bandwidth) is obtained in Worked Example 4.16 as

$$B_{occ} = \begin{cases} (1+\alpha)\dfrac{R_s}{2}, & \text{Baseband} \\ (1+\alpha)R_s, & \text{Bandpass} \end{cases} \tag{12.11}$$
and it can be seen that at 𝛼 = 1 the (baseband) raised cosine filter has occupied bandwidth Bocc = Rs which is double the bandwidth of the unrealisable Nyquist channel, so that as expected the reduction in the tails of the filter’s impulse response (to make the filter realisable and to improve its tolerance to timing error) is achieved at the expense of an increase in transmission bandwidth. Other noteworthy features of the raised cosine filter include:

● The parameter 𝛼 is the roll-off factor which controls the size of the frequency interval f2 − f1 = 𝛼Rs over which the filter transitions from maximum normalised unit gain in the passband to zero gain in the stopband. 𝛼 = 0 corresponds to the special case of the Nyquist filter having no transition band at all, whereas 𝛼 = 1 gives the full-cosine roll-off filter having the most gradual roll-off or transition that starts with unit normalised gain at f = 0 and reaches zero gain at f = Rs.
● The raised cosine filter gain response exhibits antisymmetry about the vertical at the Nyquist bandwidth Rs/2. This means that starting at f = Rs/2 the gain response increases by the same amount when one moves to the left along the frequency axis by, say, Δf as it decreases when one moves by Δf to the right.
● Strictly speaking, the raised cosine filter is unrealisable because it exhibits zero gain or infinite attenuation at f = f2 and beyond. However, its gradual roll-off makes the raised cosine filter characteristic easier to approximate than the ideal Nyquist filter using a realisable tapped delay line (also known as finite-duration impulse response (FIR)) filter. The required number of taps and hence filter complexity increase as 𝛼 decreases.
● Putting 𝛼 = 1 in Eq. (12.9) and simplifying yields the impulse response of the full-cosine roll-off filter as

$$h(t) = \frac{\sin(2\pi t/T_s)}{(2\pi t/T_s)\,[1 - 4(t/T_s)^2]} \tag{12.12}$$

This has zero crossings at

$$2\pi t/T_s = n\pi, \quad n = \pm 2, \pm 3, \cdots \quad \text{or} \quad t = n\frac{T_s}{2}, \quad n = \pm 2, \pm 3, \cdots$$

which means that, beyond the main lobe, this pulse has double the zero-crossing rate of the sinc pulse, a feature which is very useful in extracting timing information on Ts for synchronisation.
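The properties listed above can be checked numerically with a minimal sketch of Eq. (12.9). The function name `raised_cosine`, the normalisation Ts = 1, and the choice of t = 2.25Ts as a sample "mistimed" instant are assumptions made for illustration. The only subtlety is the removable singularity at |t| = Ts/(2𝛼), where the bracketed factor of Eq. (12.9) tends to 𝜋/4.

```python
import math

def sinc(x):
    """Normalised sinc, sin(pi x)/(pi x), as used in the text."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def raised_cosine(t, alpha, Ts=1.0):
    """Raised cosine impulse response, Eq. (12.9)."""
    x = t / Ts
    denom = 1.0 - 4.0 * (alpha * x) ** 2
    if abs(denom) < 1e-12:
        # Removable singularity at |t| = Ts/(2 alpha): the factor
        # cos(pi alpha x)/(1 - 4 alpha^2 x^2) tends to pi/4 there.
        return (math.pi / 4.0) * sinc(x)
    return math.cos(math.pi * alpha * x) / denom * sinc(x)

for alpha in (0.0, 0.5, 1.0):
    at_symbols = [raised_cosine(n, alpha) for n in range(-3, 4)]
    off_centre = raised_cosine(2.25, alpha)   # a mistimed instant in the tail
    print(alpha, [round(v, 12) for v in at_symbols], round(off_centre, 4))

# Full-cosine roll-off (alpha = 1): samples at t = n Ts/2 for n = 1..5
print([round(raised_cosine(n / 2, 1.0), 12) for n in (1, 2, 3, 4, 5)])
```

For every 𝛼 the samples at nonzero multiples of Ts come out as zero to within rounding, while the tail value at the mistimed instant shrinks sharply as 𝛼 goes from 0 to 1.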
12.2.3 Square Root Raised Cosine Filtering

A practical arrangement for zero-ISI transmission is as shown in Figure 12.6a. We assume that equalisation has been employed to eliminate channel distortion as illustrated in Figure 4.59 and discussed in Section 4.7.4. The channel is therefore shown as distortionless. Figure 12.6a is applicable to both baseband and bandpass transmission systems. In a bandpass system the line coder and line decoder blocks are replaced by a modulator and a demodulator, respectively, as shown.

Figure 12.6 Arrangement for zero-ISI transmission in baseband or bandpass (BP) systems: (a) with transmit filter hx(t) ⇔ Hx(f) and receive filter hy(t) ⇔ Hy(f); (b) with a pair of RRC filters.

The transmit and receive filters serve fundamentally different roles, the former being required for shaping the pulses generated by the line coder or modulator in order to reduce ISI, whereas the latter helps to reduce the impact of noise on symbol detection. Assuming (as before) for convenience that the line coder produces a weighted impulse in each signalling interval, the output of the transmit filter in each interval is the impulse response hx(t) also proportionately weighted. This pulse passes through the distortionless channel and arrives along with additive white Gaussian noise (AWGN) w(t) at the input of the receive filter. We show in Section 12.4 that the optimum detection of hx(t) in the presence of w(t) is achieved when the receive filter is a matched filter having an impulse response that is a time-reversed and delayed version of hx(t). That is

$$h_y(t) = h_x(T_s - t) \tag{12.13}$$
Taking the FT of both sides, noting (from Eq. (4.86)) that time-reversing a real signal has the effect of complex-conjugating its FT and (from Eq. (4.80)) that a delay of Ts in the time domain corresponds to a factor exp(−j2𝜋fTs) in the frequency domain, we obtain the gain response of the receive filter as

$$|H_y(f)| = |H_x^*(f)\exp(-j2\pi fT_s)| = |H_x^*(f)||\exp(-j2\pi fT_s)| = |H_x(f)| \tag{12.14}$$

Normalising the transmission path to unit gain and zero delay and ignoring w(t), since our focus in this section is solely on ISI, the output pulse y(t) at the sampling point of the receiver in response to an impulse 𝛿(t) generated by the line coder is

$$y(t) = \delta(t) * h_x(t) * h_y(t) = h_x(t) * h_x(T_s - t) \equiv h(t) \tag{12.15}$$

where we have used the fact (see Eq. (4.92)) that convolving a signal with 𝛿(t) leaves the signal unchanged, and that for zero ISI, y(t) must be the raised cosine filter impulse response h(t) given in Eq. (12.9) with transfer function H(f) given in Eq. (12.10). Taking the FT of Eq. (12.15) and retaining the magnitude yields

$$|H(f)| = |F[h_x(t) * h_x(T_s - t)]| = |H_x(f)H_x^*(f)\exp(-j2\pi fT_s)| = |H_x(f)|^2 \tag{12.16}$$

Thus, in view of Eq. (12.14), we may state that

$$|H_x(f)| = |H_y(f)| = \sqrt{|H(f)|} \equiv \sqrt{\text{Raised cosine filter gain response}} \equiv |H_{RRC}(f)| \tag{12.17}$$
That is, the requirements of zero ISI and optimum detection in the presence of white noise may be jointly satisfied by using a pair of identical square root raised cosine (RRC) filters, one at the transmitter to shape the output pulses of the line coder and the other as the matched filter at the receiver. Figure 12.6b shows the transmission block diagram based on this remarkable result. Using Eq. (12.10) in Eq. (12.17) yields the transfer function HRRC(f) of the RRC filter as

$$H_{RRC}(f) = \frac{1}{\sqrt{R_s}}\times\begin{cases} 1, & |f| \le f_1 \\ \cos\left(\dfrac{\pi}{2}\,\dfrac{|f| - f_1}{f_2 - f_1}\right), & f_1 \le |f| \le f_2 \\ 0, & |f| \ge f_2 \end{cases}$$

$$f_1 = (1-\alpha)R_s/2; \quad f_2 = (1+\alpha)R_s/2; \quad 0 \le \alpha \le 1 \tag{12.18}$$

The inverse FT of this expression gives the impulse response hRRC(t) as follows

$$h_{RRC}(t) = \frac{1/\sqrt{T_s}}{1 - (4\alpha t/T_s)^2}\left[(1-\alpha)\,\mathrm{sinc}\left((1-\alpha)\frac{t}{T_s}\right) + \frac{4\alpha}{\pi}\cos\left(\pi(1+\alpha)\frac{t}{T_s}\right)\right] \tag{12.19}$$
The RRC filter impulse response and transfer function are plotted in Figure 12.7, and, like the raised cosine type, may be closely approximated using a tapped delay line. Notice that, except for the case 𝛼 = 0, the zero crossings of the RRC impulse response are not at integer multiples of Ts. Therefore, a single RRC filter cannot eliminate ISI, although it has the same bandwidth as its raised cosine counterpart. RRC filters must be used in pairs, as shown in the block diagram of Figure 12.6b so that after passing through both filters a transmitted pulse will arrive at the sampling point of the receiver having been filtered by the equivalent of one raised cosine filter causing it to have zero crossings at nonzero integer multiples of the sampling interval Ts, which therefore averts ISI.

Figure 12.7 (a) Impulse response hRRC(t) and (b) transfer function HRRC(f) of root raised cosine filter at various values of roll-off factor 𝛼.

A few words are in order on how the channel equaliser He(f) shown in Figure 12.6b might influence the specification of the transmit and receive filters. Channel attenuation tends to increase with frequency, so the equaliser would need to have a gain that increases with frequency in order to compensate. If that is the case then the noise reaching the receive filter, having passed through the equaliser, will no longer be white but will be ‘coloured’, having an amplitude spectrum that increases with frequency. Under this condition, a matched filter is one that attenuates the higher-frequency components more drastically in order to reduce noise, maximise SNR, and optimise symbol detection. The RRC receive filter must therefore be modified accordingly. However, this will also attenuate the desired pulse energy at these frequencies, so the RRC transmit filter must be modified to proportionately boost the high-frequency components of the pulse in preparation for their increased attenuation in the receive filter. In this way the combination of transmit and receive filters still yields a raised cosine filter and hence ISI-free pulses at the decision point, and there is also optimum performance in the presence of noise, but the transmit and receive filters are no longer identical RRC filters, as specified in Eq. (12.18).
It is worth noting that the well-known scheme of pre-emphasis and de-emphasis employed in analogue frequency modulation (FM) transmission (Chapter 8) is based on the above principle. Noise at the output of the frequency discriminator is coloured, increasing as the square of frequency. So, an LPF (called a de-emphasis filter) is placed at the output of the FM demodulator to attenuate high-frequency components more drastically in order to reduce noise. And in preparation for this ‘controlled distortion’ of the message signal, a highpass filter (called
a pre-emphasis filter) is used at the transmit end to proportionately boost the high-frequency components of the message.
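Returning to the identical RRC pair of Eqs. (12.17) to (12.19), the claim that one RRC filter leaves ISI while two in cascade remove it can be checked with a rough numerical sketch. It assumes Ts = 1, a matched-filter delay normalised to zero, and a simple rectangle-rule approximation of the convolution truncated to |𝜏| ≤ 40Ts; the names `rrc_impulse` and `cascade_sample` are illustrative, and the closed-form value used at the removable singularity |t| = Ts/(4𝛼) is a standard limit that is not stated in the text.

```python
import math

def sinc(x):
    """Normalised sinc, sin(pi x)/(pi x)."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def rrc_impulse(t, alpha, Ts=1.0):
    """RRC impulse response, Eq. (12.19)."""
    x = t / Ts
    denom = 1.0 - (4.0 * alpha * x) ** 2
    if abs(denom) < 1e-8:
        # Removable singularity at |t| = Ts/(4 alpha); limiting value
        # (assumed standard result, not given in the text)
        th = math.pi / (4.0 * alpha)
        return (alpha / math.sqrt(2.0 * Ts)) * (
            (1 + 2 / math.pi) * math.sin(th) + (1 - 2 / math.pi) * math.cos(th))
    bracket = ((1 - alpha) * sinc((1 - alpha) * x)
               + (4 * alpha / math.pi) * math.cos(math.pi * (1 + alpha) * x))
    return bracket / (math.sqrt(Ts) * denom)

def cascade_sample(n, alpha, dt=1.0 / 64, span=40.0):
    """Approximate y(n Ts) = integral of h_RRC(tau) h_RRC(n Ts - tau) d tau, Ts = 1.

    Rectangle-rule sum over |tau| <= span, i.e. the output of two identical
    RRC filters in cascade sampled at t = n Ts.
    """
    K = int(span / dt)
    return dt * sum(rrc_impulse(k * dt, alpha) * rrc_impulse(n - k * dt, alpha)
                    for k in range(-K, K + 1))

# One RRC filter: samples at t = 0, Ts, 2Ts, 3Ts are not all zero for n != 0
print([round(rrc_impulse(n, 0.5), 4) for n in range(4)])
# Two RRC filters in cascade: samples at the same instants
print([round(cascade_sample(n, 0.5), 4) for n in range(4)])
```

With 𝛼 = 0.5 the single-filter samples at nonzero multiples of Ts are clearly nonzero, whereas the cascade returns approximately 1 at n = 0 and approximately 0 elsewhere, as expected of a raised cosine response.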
12.2.4 Duobinary Signalling

Note from the outset that this is an entirely different technique that should not be confused with the raised cosine filtering technique discussed above. A raised cosine filter averts ISI by operating on each pulse individually and independently to shape it into having zero-crossings at nonzero integer multiples of sampling interval at the detection point in the receiver. The duobinary signalling technique, on the other hand, averts ISI at the receiver by combining two or more adjacent pulses at the transmitter, thereby introducing a controlled amount of ISI. To borrow from medicine, the technique cures the ISI disease by vaccinating each transmitted pulse with a controlled amount of ISI. The technique of zero-ISI transmission discussed here is variously described in the literature as partial response signalling or correlative coding, and was first introduced by Adam Lender in 1963 [1] as duobinary signalling, the prefix duo- indicating a doubling in bit rate (or operation at half the bandwidth) when compared to binary transmission using a raised cosine filter of roll-off 𝛼 = 1. Our discussion will assume all filtering operation to be based at the transmitter, but actual systems share filtering between the transmitter and the receiver [2]. In this case, a procedure like our presentation on the square root raised cosine (RRC) filter is followed to derive filter specifications for the transmitter and receiver necessary to achieve the required overall filter characteristic. Different extensions and modifications to the basic duobinary signalling scheme are possible, all based on the same principle of introducing a controlled ISI into the transmitted pulses in order to gain various advantages [3].
We will discuss in detail the basic duobinary implementation which leads to the cosine filter, introduce a variant called modified duobinary that leads to the sine filter, comment briefly on polybinary signalling, and discuss the trade-off involved in this class of zero-ISI techniques.

12.2.4.1 Cosine Filter

Consider the arrangement shown in Figure 12.8a. We wish to determine the impulse response h(t) of the filter comprising the highlighted portion of the block diagram. The filtering process involves summing the pulses in the current and previous signalling intervals and then passing this combined pulse through an ideal LPF. But don’t worry about the ideal LPF because we don’t intend to build it. What we will build is the overall highlighted filter having impulse response h(t) and transfer function H(f) to be determined by making the input an impulse 𝛿(t) as shown. The input to the ideal LPF is

$$x(t) = \delta(t) + \delta(t - T_s)$$

Taking the FT of this equation gives the signal spectrum at the input of the ideal LPF as

$$X(f) = 1 + \exp(-j2\pi fT_s)$$

Multiplying this by the transfer function of the ideal LPF (which is Ts rect(fTs); see Eq. (12.7) and Figure 12.4b) yields the spectrum at the output of the ideal LPF as

$$H(f) = [1 + \exp(-j2\pi fT_s)]\,T_s\,\mathrm{rect}(fT_s) = T_s\,\mathrm{rect}(fT_s) + T_s\,\mathrm{rect}(fT_s)\exp(-j2\pi fT_s) \tag{12.20}$$

Noting that H(f) consists of one rectangular function and another that is scaled by the factor exp(−j2𝜋fTs) which corresponds to a delay Ts in the time domain, we see that the inverse FT, which gives h(t), will comprise a sinc function and a delayed sinc function. That is

$$h(t) = \mathrm{sinc}\left(\frac{t}{T_s}\right) + \mathrm{sinc}\left(\frac{t - T_s}{T_s}\right) \tag{12.21}$$
Figure 12.8 The cosine filter: (a) block diagram; (b) impulse response; (c) transfer function.
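We may manipulate Eq. (12.20) into the following more familiar form for the gain response of the filter

$$\begin{aligned} |H(f)| &= |[e^{j\pi fT_s} + e^{-j\pi fT_s}]\,e^{-j\pi fT_s}\,T_s\,\mathrm{rect}(fT_s)| \\ &= |2T_s\cos(\pi fT_s)\,\mathrm{rect}(fT_s)\,e^{-j\pi fT_s}| = 2T_s\cos(\pi fT_s)\,\mathrm{rect}(fT_s) \\ &= \begin{cases} \dfrac{2}{R_s}\cos(\pi f/R_s), & -\dfrac{R_s}{2} \le f \le \dfrac{R_s}{2} \\ 0, & \text{Otherwise} \end{cases} \end{aligned} \tag{12.22}$$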
These impulse response and gain response functions are plotted in Figure 12.8b,c. The gain of the filter varies according to a cosine function and therefore the filter is known as a cosine filter. It is remarkable that it has exactly the Nyquist bandwidth of Rs/2, with zero gain above this cut-off frequency. However, unlike the Nyquist filter, the cosine filter does not have a sharp transition from passband to stop band so it may be more readily synthesised using a tapped delay line to achieve a very high but noninfinite attenuation beyond f = Rs/2. It can also be seen that, whereas the raised cosine filter impulse response (see Figure 12.5a) is nonzero only at one sampling point nTs = 0, the cosine filter impulse response has nonzero value h(t) = 1 at two sampling points t = nTs for n = 0, 1. This shows that energy in one input pulse has been spread over two signalling intervals, which is indicative of the controlled ISI introduced. Furthermore, unlike the Nyquist filter, the cosine filter impulse response decays rapidly beyond its main lobe, making the transmission system more tolerant to slight errors in the setting of sampling instants.

Figure 12.9 Block diagram of zero-ISI transmission system using a cosine filter. The pre-coder section is essential and must have the structure shown, but the line coder and corresponding decoder can be of a design other than the ones shown here.

We will discuss the use of a cosine filter to facilitate zero-ISI transmission based on the block diagram shown in Figure 12.9. The precoder section in the transmitter, which performs modulo-2 summing of the current bit and the previous output to produce the current output, is essential in every application of the cosine filter in order to
prevent error propagation at the receiver whereby an error in one bit corrupts future decoding decisions. Cosine filters can be used with any M-ary symbol generator, but here we will employ a binary bipolar line coder, which produces an impulse of weight bn = +1 for binary 1 and weight bn = −1 for binary 0. To understand the operation of the system we will examine how the message sequence mn = 100101 is handled, where m1 = 1, m2 = 0, m3 = 0, …, are the message bits in the first, second, third signalling intervals, etc. The processing of this bit sequence by the transmission system is shown in the table of Figure 12.10, where the first column is the signalling interval n; the second column is the message sequence mn; the third column is the previous output dn−1 of the precoder; the fourth column is the current output dn of the precoder, which is also the line coder input; the fifth column is the line coder output and cosine filter input bn; y(t) is the waveform at the output of the cosine filter (plotted in the graph of Figure 12.10), which is sampled at t = nTs to obtain the sample yn listed in the seventh column; and the last column is the recovered message sequence m̂n. For comparison, we also show the waveform z(t) that would reach the sampling point if the message sequence was transmitted through a raised cosine filter of roll-off factor 𝛼 = 1 instead, first by applying it directly to the line coder (a raised cosine filter does not require precoding) and then passing the line coder output through the raised cosine filter. This waveform z(t) is sampled at t = nTs to obtain the sequence zn listed in the sixth column of the table. The decision device that converts zn into the recovered message sequence (not shown in Figure 12.10) is a comparator that decides in favour of binary 1 if zn > 0 and in favour of binary 0 otherwise.

n    mn              dn−1           dn    bn    zn    yn    m̂n
0    0 (start-up)    0 (pre-set)    0     −1
1    1               0              1     1     1     0     1
2    0               1              1     1     −1    2     0
3    0               1              1     1     −1    2     0
4    1               1              0     −1    1     0     1
5    0               0              0     −1    −1    −2    0
6    1               0              1     1     1     0     1

Figure 12.10 Operation of block diagram of Figure 12.9 for input data = 100101. The waveform reaching the sampling point is y(t) for a cosine filter and z(t) for a raised cosine filter.

Still on Figure 12.10, at start-up (signalling interval n = 0), there is no previous output bit d−1, so we define a start-up bit m0 = 0, which is not part of the message and combine this with d−1 = 0 to obtain d0 = 0. The rest of the precoding then proceeds with message bits mn combined with the previous output (column 3), which is obtained by translating from column 4, as shown by the arrows in the table. The line coder converts dn to normalised impulses −𝛿(t) for dn = 0, and 𝛿(t) for dn = 1. Thus, the impulses have (normalised) weights ±1 given by bn in the table. This sequence of impulses is processed by the cosine filter to produce the waveform y(t) at the sampling point. Note that the channel does add noise to y(t), which has been ignored in Figure 12.10 since we are here primarily concerned with ISI. Sampling y(t) at sampling instants nTs, n = 1, 2, …, 6, yields yn, which has one of three possible values: −2, 0, +2. The correct decoding decision rule is that if yn = ±2 then m̂n = binary 0, but if yn = 0 then m̂n = binary 1. In practice, noise will make yn vary somewhat from these precise values, so the decoder is implemented as a rectifier followed by a comparator that compares |yn| to a (normalised) threshold level of +1 and outputs binary 0 if |yn| > 1, and binary 1 otherwise. In this way the original message sequence is correctly recovered.

12.2.4.2 Signal Power Trade-off
We found earlier that we could avert ISI by using a realisable raised cosine filter of roll-off factor 𝛼 > 0. The price we paid then was in using a larger bandwidth B = (1 + 𝛼)Rs /2 than the minimum that is theoretically possible, namely the Nyquist bandwidth Rs /2. However, by using a cosine filter we can operate at Nyquist bandwidth using a realisable filter free from ISI-induced error. There is always a price or trade-off in engineering, so what is the cosine filter trade-off? To understand this trade-off, look again at Figure 12.10 and notice that the raised cosine filter converts a binary input ±1 from a line coder into output z(t) from which the sampled sequence zn is obtained, having two possible values ±1 (normalised). In the presence of noise these two values will of course map into two intervals separated by a threshold level chosen to be equal to the mean level (which would be zero in this case, assuming that binary 1 and binary 0 are equally likely in the message sequence mn ). The cosine filter, on the other hand, converts the same binary input into output waveform y(t) which when sampled yields a ternary sequence yn having values drawn from the alphabet {−2, 0, 2}. In other words, the raised cosine filtering technique maintains binary signalling (or whatever M-ary signalling is delivered to it by the line coder used), whereas the cosine filtering technique transforms binary signalling into ternary, or more generally M-ary signalling into (2M − 1)-ary. This happens because a cosine filter introduces a controlled ISI between the current and previous pulses, so that if these had binary levels ±1 the possible outcomes will be adding two positive pulses, or two negative pulses, or two opposite polarity pulses, which leads to the output levels +2, −2, or 0, respectively. Precoding allows each level to be linked to a unique identity of the bit in the current interval, thus making instantaneous decoding possible as illustrated in Figure 12.10. 
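The ternary levels and the instantaneous decoding described above can be reproduced with a minimal noise-free sketch of the chain in Figure 12.9. It relies on the fact that the cosine filter impulse response equals 1 at t = 0 and t = Ts and 0 at all other sampling instants, so that the sampled output is yn = bn + bn−1. The function name `duobinary_link` is hypothetical, and the start-up value d0 = 0 follows the table of Figure 12.10.

```python
def duobinary_link(message):
    """Precode, bipolar-code, cosine-filter (sampled) and decode a list of bits."""
    d_prev = 0                  # start-up precoder output d_0 = 0
    b_prev = 2 * d_prev - 1     # corresponding line coder weight b_0 = -1
    samples, decoded = [], []
    for m in message:
        d = m ^ d_prev          # precoder: modulo-2 sum of bit and previous output
        b = 2 * d - 1           # bipolar line coder: 1 -> +1, 0 -> -1
        y = b + b_prev          # cosine filter output sampled at t = n Ts
        samples.append(y)
        decoded.append(0 if abs(y) > 1 else 1)   # rectifier + threshold of 1
        d_prev, b_prev = d, b
    return samples, decoded

samples, decoded = duobinary_link([1, 0, 0, 1, 0, 1])
print(samples)   # [0, 2, 2, 0, -2, 0]
print(decoded)   # [1, 0, 0, 1, 0, 1]
```

Each decision uses only the current sample, which is why a single corrupted sample cannot propagate into later decisions.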
It was important to explain the transformation of M-ary signalling into (2M − 1)-ary signalling by a cosine filter before stating the consequence, which is that M-ary signalling is more robust to channel noise than (2M − 1)-ary signalling at the same signal power and frequency. You may recall from the discussion of M-ary modulation in Chapter 11 that the states of M-ary amplitude shift keying (ASK), phase shift keying (PSK), and amplitude and phase shift keying (APSK) are closer together in signal space as M increases. This means, for example, that channel noise added to a pulse sent with amplitude in the midpoint of one interval can more readily cause the pulse to be received with amplitude in an adjacent interval, giving rise to bit errors. To maintain the same bit error ratio (BER) as M increases, we must increase signal amplitudes and hence signal power in order to preserve a constant spacing between states. These comments are also applicable to PSK where the states have equal amplitude but different phases and can be treated as being arranged on a circle centred at the origin, with radius equal to normalised signal
amplitude. As the number of states on the circle increases, the only way to maintain a constant distance between the states is to increase the radius of the circle and hence signal amplitude and power. Cosine filtering therefore eliminates ISI while operating at Nyquist bandwidth but requires more signal power in order to ensure that its ternary transmission (assuming a binary line coder) has the same BER as a zero-ISI binary transmission using a raised cosine filter at larger bandwidth. So this is further evidence of a fact emphasised throughout this book that engineering just does not do free lunches: in building a realisable ISI-free system using a raised cosine filter, the price we pay is more bandwidth than the minimum, but in building the system at the minimum bandwidth based on a cosine filter, the price we pay is more signal power (for the same transmission quality).

12.2.4.3 Sine Filter
Consider a modification to the basic scheme discussed above whereby in Figure 12.8 we increase the delay to 2Ts and subtract (rather than add) this delayed element, as shown in Figure 12.11a. The result is a sine filter with impulse response

$$h(t) = \mathrm{sinc}\left(\frac{t}{T_s}\right) - \mathrm{sinc}\left(\frac{t - 2T_s}{T_s}\right) \tag{12.23}$$

and gain response, derived using a similar manipulation as for the cosine filter, given by

$$|H(f)| = \begin{cases} \dfrac{2}{R_s}|\sin(2\pi f/R_s)|, & -\dfrac{R_s}{2} \le f \le \dfrac{R_s}{2} \\ 0, & \text{Otherwise} \end{cases} \tag{12.24}$$
These are shown in Figure 12.11b and c. We see that the sine filter has a spectral null at DC and its output pulse is therefore well suited to capacitor- or transformer-coupled channels that block DC. This sine filter is referred to in the literature as a modified duobinary filter. Notice in Figure 12.11 that the impulse response of the sine filter has nonzero value h(t) = ±1 at two sampling points t = nTs for n = 0, 2, which indicates that, in this variant of the scheme, controlled ISI is introduced between each pulse and its next but one subsequent neighbour. The impulse response of the sine filter also decays rapidly beyond its twin main lobe which minimises ISI in the event of small timing errors at the receiver.

Figure 12.11 The sine filter: (a) block diagram; (b) impulse response; (c) transfer function.

12.2.4.4 Polybinary Signalling
Another modification to the basic scheme discussed here leads to so-called polybinary signalling in which the delay element in Figure 12.8 is replaced by a tapped delay line so that controlled ISI is introduced amongst multiple adjacent pulses resulting in multilevel signalling and hence further improvement in bandwidth efficiency without ISI, but at the price of a larger signal power requirement. See reference [4] for a discussion of the application of polybinary signalling in optical transmission systems.

Worked Example 12.1

(a) Determine the occupied bandwidth of a 6B4T baseband system that transmits at 139264 kb/s using pulses with a full-cosine roll-off characteristic.
(b) If the bandwidth requirement is to be reduced to 60 MHz, calculate the roll-off factor of the raised-cosine filter.
(c) Suggest how the same bit rate may be transmitted using less than the bandwidth of the ideal Nyquist channel.

(a) We determine in Worked Example 10.6 that the symbol rate for this baseband system is Rs = 92.84 MBd. Pulses with a full-cosine roll-off characteristic are produced by a raised cosine filter with roll-off factor 𝛼 = 1, which, from Eq. (12.11) for a baseband system has bandwidth

$$B_{occ} = (1+\alpha)\frac{R_s}{2} = (1+1)\frac{R_s}{2} = R_s = 92.84\ \text{MHz}$$

(b) Making 𝛼 the subject of Eq. (12.11) and putting Bocc = 60 MHz and Rs = 92.84 MBd, we obtain

$$\alpha = \frac{2B_{occ}}{R_s} - 1 = \frac{2\times 60}{92.84} - 1 = 0.293$$

(c) If an ideal Nyquist channel (𝛼 = 0) is used then the bandwidth can be reduced to Rs/2 = 46.42 MHz. This is the minimum bandwidth required to transmit at Rs = 92.84 MBd. The only way to reduce the bandwidth any further without incurring significant impairment due to ISI is by reducing the symbol rate, and this would require changing the coding scheme if the same bit rate Rb is to be maintained. In the 6B4T system, each ternary symbol carries 6/4 = 1.5 bits. We must adopt a scheme in which each symbol carries k > 1.5 bits. That is, we represent a block of k bits using one code symbol. Clearly, we require M = 2^k unique symbols to cover all possible k-bit input blocks. Following the naming convention for block codes (Chapter 10), this coding scheme may be identified as kB1M, where M denotes M-ary (just as T denotes ternary in, e.g. 6B4T). In general, using M unique symbols (sometimes called levels) allows us to represent up to k = log2(M) bits per symbol, which gives a symbol rate

$$R_s = \frac{\text{Bit Rate}}{k} \equiv \frac{R_b}{k} \tag{12.25}$$
For example, if M = 16, we have k = 4, and a symbol rate Rs = Rb ∕4 = 139264000∕4 = 34.82 MBd can be used in this example. With this coding scheme, we can (ideally) transmit at 139264 kb/s using a transmission bandwidth of only 17.41 MHz. In theory, we can reduce transmission bandwidth indefinitely by correspondingly increasing k. So why, you must be asking, are kB1M block codes (called M-ary modulation in modulated systems) not used in baseband systems? There are two main reasons. Codec complexity increases dramatically, and the unique code symbols become so close in identity that it is difficult to correctly detect them at the receiver in the presence of noise. As a result, symbol error becomes more frequent and excessive signal power may be required to bring errors down to an acceptable level. The impact is even more severe because each symbol error potentially affects not just one bit but up to k bits. See Section 11.11 for an in-depth discussion of these trade-offs in the context of M-ary modulated systems.
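The arithmetic of Worked Example 12.1 can be retraced with a few lines of code. This is only a sketch: the helper names `occupied_bandwidth`, `roll_off`, and `nyquist_bandwidth` are illustrative, and the symbol rate is recomputed here from the 1.5 bits per ternary symbol quoted in part (c) rather than taken from Worked Example 10.6.

```python
import math

Rb = 139.264e6            # bit rate in b/s (139264 kb/s)
Rs = Rb / (6 / 4)         # 6B4T: each ternary symbol carries 6/4 = 1.5 bits

def occupied_bandwidth(Rs, alpha):
    """Baseband occupied bandwidth of a raised cosine filter, Eq. (12.11)."""
    return (1 + alpha) * Rs / 2

def roll_off(B_occ, Rs):
    """Roll-off factor that fits symbol rate Rs into B_occ (Eq. (12.11) rearranged)."""
    return 2 * B_occ / Rs - 1

def nyquist_bandwidth(Rb, M):
    """Ideal Nyquist bandwidth Rs/2 with log2(M) bits per symbol, Eq. (12.25)."""
    return Rb / math.log2(M) / 2

print(f"(a) B_occ = {occupied_bandwidth(Rs, 1.0) / 1e6:.2f} MHz")
print(f"(b) alpha = {roll_off(60e6, Rs):.3f}")
print(f"(c) 6B4T Nyquist bandwidth = {occupied_bandwidth(Rs, 0.0) / 1e6:.2f} MHz")
print(f"    M = 16 Nyquist bandwidth = {nyquist_bandwidth(Rb, 16) / 1e6:.2f} MHz")
```

Changing `M` in the last line shows how quickly the ideal bandwidth falls as k = log2(M) grows, which is the trade-off against codec complexity and noise sensitivity discussed above.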
12.3 Information Capacity Law

Equation (12.3) suggests that we can increase bit rate indefinitely simply by increasing M. For example, a baseband channel of bandwidth B = 3.1 kHz can be made to support a bit rate of 62 kb/s by choosing M = 1024 symbols, so that Eq. (12.3) yields Rbmax = 2 × 3.1 × log2 1024 = 62 kb/s. The bit rate could be doubled to 124 kb/s in the same channel by increasing M to 1 048 576. You are right to wonder whether there is no restriction on the maximum possible value of M. Consider Figure 12.12 for the case where the M symbols differ only in amplitude between 0 and A. In Figure 12.12, the receiver associates any symbol having amplitude within a shaded interval with the k bits assigned to that interval. As M increases, these amplitude intervals become smaller. For a fixed value of A, the symbols would remain distinguishable as M → ∞ only if the channel were noiseless and the receiver had the capability to distinguish between infinitesimally close levels.

We may use the following intuitive argument to determine the limiting effect of noise on bit rate. With a noise power Pn at the receiver, the spacing of symbol levels cannot be less than the root-mean-square (rms) noise voltage $\sqrt{P_n}$. Otherwise, adjacent symbols would be indistinguishable due to noise. The received signal (of power Ps) and noise (of power Pn) are uncorrelated, so their powers add to give a range of symbol levels equal to $\sqrt{P_n + P_s}$. With a noise-imposed minimum separation $\sqrt{P_n}$ between levels, the maximum number of distinguishable symbol levels is therefore

$$M = \frac{\sqrt{P_n + P_s}}{\sqrt{P_n}} = \sqrt{1 + P_s/P_n}$$

which gives the maximum number of bits per symbol kmax as

$$k_{\max} = \log_2\left(\sqrt{1 + P_s/P_n}\right) = \frac{1}{2}\log_2(1 + P_s/P_n) \tag{12.26}$$
Substituting in Eq. (12.3) yields the maximum rate at which information may be transmitted through a channel without error, i.e. with the transmitted information-bearing symbols remaining distinguishable at the receiver so that error-free operation is possible. This maximum rate is known as the information capacity Rbmax of the system. Thus

$$R_{b\max} = 2Bk_{\max} = B\log_2(1 + P_s/P_n) \equiv B\log_2(1 + C/N) \ \text{b/s} \tag{12.27}$$
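As a quick numerical illustration of Eqs. (12.26) and (12.27), the following sketch evaluates the maximum number of distinguishable levels and the information capacity. The function names are illustrative, and the operating point (B = 4 kHz, C/N = 63) is the one used as an example later in this section.

```python
import math

def max_levels(ps: float, pn: float) -> float:
    """Maximum number of distinguishable symbol levels, M = sqrt(1 + Ps/Pn)."""
    return math.sqrt(1 + ps / pn)

def capacity_bps(bandwidth_hz: float, cn_ratio: float) -> float:
    """Information capacity B*log2(1 + C/N) of Eq. (12.27), in b/s.

    cn_ratio is the carrier-to-noise ratio as a plain ratio, not in dB.
    """
    return bandwidth_hz * math.log2(1 + cn_ratio)

print(max_levels(63, 1))                            # 8.0 levels, i.e. 3 bits/symbol
print(f"{capacity_bps(4e3, 63) / 1e3:.1f} kb/s")    # 24.0 kb/s
print(f"{capacity_bps(4e3, 4095) / 1e3:.1f} kb/s")  # 48.0 kb/s
```

Note that doubling the capacity at fixed bandwidth requires C/N to rise from 63 to 4095, not merely to double.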
Equation (12.27) is the celebrated information capacity law, referred to as the Shannon–Hartley law in recognition of the work of Claude Shannon [5] building on the early work of Hartley [6]. A rigorous and complete derivation of this equation is presented in Claude Shannon’s seminal paper [5].

Figure 12.12 M distinct states represent log2 M bits per state as shown for amplitude states (panels for M = 2, 4, 8, 16, and 32, with amplitude ranging from 0 to A). Bits allocated to adjacent states differ in only one bit position, an arrangement called Gray coding.

It should by now be clear that when transmitting through a bandlimited noisy channel ISI places a limit on symbol rate as stated in Eq. (12.2), whereas noise precludes an unlimited bit rate, which would otherwise have been possible by indefinitely increasing M as per Eq. (12.3). It is the latter that places a limit on bit rate when signal power is finite, as stated in Eq. (12.27), which lays down the rule that governs how bandwidth and signal power may be exchanged in the design of a transmission system affected by noise.

Shannon’s channel coding theorem states as follows: There exists a coding scheme which can be used to transmit with a vanishingly small probability of error over a channel of capacity kmax at a rate not greater than kmax bits per symbol. For rates greater than kmax it is not possible by any encoding method to have an arbitrarily small probability of error.

Thus, reliable communication through a channel cannot be carried out at a bit rate higher than stipulated by Eq. (12.27). In practice, communication is deemed reliable if the probability of error or BER satisfies a specified threshold, such as BER ≤ $10^{-4}$ for voice communication and BER ≤ $10^{-7}$ for data. The Shannon–Hartley law indicates that as bandwidth B is increased the signal power required for reliable communication decreases. Letting Eb denote the average signal energy per bit and No the noise power per unit bandwidth then (see Eq. (11.19) in Section 11.3.1 and Eq. (11.124) in Section 11.11)

$$\frac{C}{N} = \frac{E_b R_{b\max}}{N_o B} \equiv \frac{E_b}{N_o}\cdot\frac{R_{b\max}}{B}$$
where we have assumed the best-case (ideal) filtering situation in which occupied bandwidth and noise-equivalent bandwidth are equal and denoted B. The ratio between bit rate and occupied bandwidth is known as the bandwidth efficiency of the system in b/s/Hz, denoted η (see Section 11.10.1). Substituting this relation in Eq. (12.27) allows us to state that the Shannon–Hartley law specifies a relationship between Eb/No and the highest achievable bandwidth efficiency η of an error-free transmission system as

$$\frac{R_{b\max}}{B} \equiv \eta = \log_2\left(1 + \eta\frac{E_b}{N_o}\right)\ \text{b/s/Hz} \quad\Rightarrow\quad \frac{E_b}{N_o} = \frac{2^{\eta} - 1}{\eta} \tag{12.28}$$

With Rb/B defined as bandwidth efficiency, the dimensionless ratio Eb/No is usually interpreted as giving a measure of the power efficiency of the transmission. The information capacity law therefore shows us how to trade between bandwidth efficiency and power efficiency. The latter is an important consideration, for example in portable transceivers where battery life must be prolonged. The Shannon–Hartley law, in the form of Eq. (12.28), is plotted in Figure 12.13, which shows bandwidth efficiency (b/s/Hz) against Eb/No in dB. The shaded region of the graph lying above the curve of η versus Eb/No is the region of unattainable information capacity or bandwidth efficiency.

Figure 12.13 Specification of Shannon’s information capacity theorem as bandwidth efficiency η in b/s/Hz (vertical axis, 1/16 to 32) versus Eb/No in dB (horizontal axis, 0 to 80). The region above the curve is the region of unattainable capacity or bandwidth efficiency; the region below it is the region of operation for practical transmission systems.

We devote the rest of this section to discussing the many implications of the Shannon–Hartley law of Eq. (12.27). Equation (12.27) indicates that, by increasing bandwidth, bit rate Rb can be proportionately increased without degrading noise performance (i.e. without increasing BER) or raising carrier-to-noise ratio C/N. Thus, if the current operating point is at bit rate Rb, noise equivalent bandwidth B, carrier-to-noise ratio C/N, and bit error ratio BER, denoted (Rb, B, C/N, BER), and the bit rate is to be increased by a factor of n at the same C/N and BER then bandwidth B must be increased by the same factor n so that the new operating point is (nRb, nB, C/N, BER). In practice, this change is facilitated by staying with the same modulation or line coding scheme which has a fixed number of bits k per symbol, but increasing the bandwidth by a factor of n in order to allow symbol rate Rs to be increased by the same factor without incurring ISI. In this way, bit rate Rb = kRs is increased by the required factor. It is important, however, to note that to keep C/N constant when B increases by a factor of n, signal power Ps will also have to increase by the same factor n since noise power N ≡ Pn = NoB. Thus, the original operating point is (Rb, B, Ps/NoB, BER), whereas the new one, with a factor of n increase in bit rate at an unchanged carrier-to-noise ratio, is (nRb, nB, nPs/nNoB, BER).

Equation (12.27) also indicates that bit rate can be increased while keeping bandwidth and BER fixed, but this requires an exponential increase in C/N. For example, with B = 4 kHz and C/N = 63, we obtain Rbmax = 24 kb/s. To double the bit rate to 48 kb/s at the same bandwidth, we must increase the carrier-to-noise ratio to $(C/N)_2 = 4095$. Note that $(C/N)_2 = [C/N + 1]^2 - 1$. In general, to increase bit rate by a factor of n without increasing bandwidth or BER, we must move from operating point (Rb, B, C/N, BER) to a new point (nRb, B, $[C/N + 1]^n - 1$, BER). This change is carried out in practice by increasing k (the number of bits per symbol) by a factor of n while maintaining the same symbol rate Rs (since bandwidth is fixed). This requires switching to a higher M-ary APSK modulation scheme which necessitates an exponential increase in power if transmitted symbols are to continue to be detected at the same BER as in the lower M-ary scheme.
Here, we use APSK to include ASK and PSK but specifically exclude frequency shift keying (FSK).

The law does not place any lower limit on the bandwidth required to support any given bit rate. For example, Eq. (12.27) indicates that it is possible to reliably communicate at Rb = 100 kb/s over a 10 Hz bandwidth. The required C/N is obtained by manipulating the equation to obtain

$$100\times10^3 = 10\log_2(1 + C/N) \;\Rightarrow\; C/N = 2^{10\times10^3} - 1 \approx 10\log_{10}\left(2^{10\times10^3}\right)\ \text{dB} = 10\times10\times10^3\log_{10}(2)\ \text{dB} \approx 30\,103\ \text{dB}$$
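The size of these numbers makes direct evaluation of 2^(Rb/B) impractical in floating point, so a minimal sketch of the calculation works in the logarithmic domain once Rb/B is large. The function name and the switch-over threshold of 500 are arbitrary choices made for this illustration.

```python
import math

def required_cn_db(bit_rate_bps: float, bandwidth_hz: float) -> float:
    """C/N in dB at which Eq. (12.27) gives a capacity equal to bit_rate_bps.

    Solves C/N = 2**(Rb/B) - 1. For large Rb/B the '-1' is negligible and
    2**(Rb/B) would overflow a float, so the dB value is computed directly
    as 10*(Rb/B)*log10(2).
    """
    eta = bit_rate_bps / bandwidth_hz
    if eta < 500:                      # threshold is an arbitrary safe choice
        return 10 * math.log10(2.0 ** eta - 1.0)
    return 10 * eta * math.log10(2.0)

print(round(required_cn_db(100e3, 10)))   # 30103
print(round(required_cn_db(100e3, 1)))    # 301030
print(round(required_cn_db(24e3, 4e3), 2))  # 17.99, i.e. C/N = 63
```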
And if the bandwidth is further reduced to just 1 Hz then reliable communication at 100 kb/s over this 1 Hz bandwidth is still possible, provided C/N ≥ 301 030 dB, and so on without limit. The practical constraint is obvious. If noise power is −210 dBW/Hz (typical) then received signal power would need to be in excess of 300 820 dBW or $10^{30\,076}$ MW to allow reliable communication at 100 kb/s using a 1 Hz bandwidth. You would need to install a sun transmitter to get that kind of power, but it would vaporise the entire planet on being turned on! Therefore, while in theory there is no lower limit on required bandwidth, in practice there is an indirect limit due to constraints on achievable or safe or authorised levels of signal power.

In general, increasing channel bandwidth allows us to reduce the signal power needed for reliable communication. However, the law does place a lower limit on signal power below which reliable communication is not possible even at infinite transmission bandwidth. This limit is given in terms of Eb/No and is called the Shannon limit. Since η → 0 when B → ∞, this limiting value is given by the value of Eb/No as η → 0. It can be found either graphically or analytically. For the analytic approach, Eq. (12.28) yields

$$\begin{aligned}\left.\frac{E_b}{N_o}\right|_{B\to\infty} &= \lim_{\eta\to 0}\frac{2^{\eta} - 1}{\eta}\\ &= \left.\frac{2^{\eta}\ln 2}{1}\right|_{\eta=0} = \ln 2 = 0.69315 = -1.6\ \text{dB (Shannon limit)}\end{aligned} \tag{12.29}$$
In the above, the second line is obtained by applying L’Hôpital’s rule, i.e. taking the derivatives (with respect to η) of the numerator and denominator of the expression under the limit operator. The channel capacity Rblim at this Shannon limit is obtained by evaluating Eq. (12.27) in the limit B → ∞

$$\begin{aligned} R_{b\lim} &= \lim_{B\to\infty}\left[B\log_2\left(1 + \frac{P_s}{N_oB}\right)\right]\\ &= \lim_{B\to\infty}\left[B\log_e\left(1 + \frac{P_s}{N_oB}\right)\log_2 e\right]\\ &= \left[B\frac{P_s}{N_oB}\log_2 e\right]\\ &= 1.442695\frac{P_s}{N_o}\ \text{b/s}\end{aligned} \tag{12.30}$$

where the third line uses the fact that $\log_e(1 + x) \to x$ as $x \to 0$.

A little examination of Figure 12.13 reveals that the benefit gained from a trade-off between bandwidth and signal power depends on the operating point of the transmission system. At low bandwidth efficiency η, which corresponds to a scenario of plentiful bandwidth and low or scarce signal power, a small increase in Eb/No yields the benefit of a large increase in η. But at high bandwidth efficiency, which corresponds to a scenario of scarce bandwidth and plentiful signal power, a small decrease in η (which means a little extra bandwidth) produces the benefit of a large reduction in Eb/No. What this means is that the factor by which bandwidth must be increased to compensate for a given amount of reduction in signal energy per bit depends on the operating point. For example, if Eb/No is reduced by 1.5 dB, the required increase in bandwidth is a factor of 16 at Eb/No = 0 dB, but a factor of only ≈ 1.02 at Eb/No = 60 dB. This also means that the amount of saving in energy per bit achieved by increasing bandwidth depends on the operating point. For example, doubling bandwidth at η = 1/8 (by reducing η from 1/8 to 1/16) delivers a reduction in Eb/No of only 0.1 dB, whereas doubling bandwidth at η = 32 (by reducing η from 32 to 16) allows a reduction in Eb/No by 45.2 dB.

Finally, we must point out that Shannon’s channel coding theorem merely tells us that it is possible to have error-free transmission at the maximum bit rate given by the information capacity law of Eq. (12.27) but does not show us how to design such a system. In practice, all digital transmission systems, both modulated and baseband, fall short of achieving the specified maximum bit rate for a given bandwidth and C/N. Equation (12.27) is nevertheless a useful benchmark against which the performance of practical systems can be measured.

Worked Example 12.2

In this worked example, we want to compare four practical transmission systems to the Shannon–Hartley benchmark. You may wish to also read the related Worked Example 6.9. A digital transmission system operates at a message bit rate Rb = 2048 kb/s in AWGN using a raised cosine filter of roll-off factor α = 0.1. Reliable communication is set at BER = $10^{-7}$. Determine how each of the following implementations of this system compares with the Shannon–Hartley benchmark of Eq. (12.28) and comment on your results.

(a) Using quadriphase shift keying (QPSK) modulation with a modem having modem implementation loss Lmil = 0.5 dB and no error control coding.
(b) Using 16-APSK modulation with a modem having modem implementation loss Lmil = 1 dB and no error control coding.
(c) Using 16-FSK modulation with a modem having modem implementation loss Lmil = 1 dB and no error control coding.
(d) Using 16-APSK modulation with a modem having modem implementation loss Lmil = 1 dB along with error control coding of code rate r = 1/2 and coding gain Gc = 10.5 dB.
(a) For QPSK modulation, M = 4, so symbol rate

Rs = Rb/log2 M = 2048/2 = 1024 kBd

Occupied bandwidth Bocc = (1 + α)Rs = (1 + 0.1)Rs = 1126.4 kHz

And bandwidth efficiency η = Rb/Bocc = 2048/1126.4 = 1.8182

Using Eq. (12.28), the minimum required value of Eb/No for operating at this bandwidth efficiency is determined as

$$(E_b/N_o)_{\min} = \frac{2^{\eta} - 1}{\eta} = \frac{2^{1.8182} - 1}{1.8182} = 1.3895 = 1.43\ \text{dB}$$

We determine the Eb/No value required by our QPSK system, denoted (Eb/No)QPSK, by reading the BER versus Eb/No curve of Figure 11.45 for QPSK at BER = $10^{-7}$ (which gives Eb/No = 11.3 dB) and adding the extra value of 0.5 dB required due to modem implementation loss. Thus, (Eb/No)QPSK = 11.8 dB. Since the Shannon–Hartley law specifies a minimum Eb/No of 1.43 dB for this bandwidth efficiency, we conclude that our QPSK is less power efficient than the benchmark specification by

ΔP = (Eb/No)QPSK − (Eb/No)min = 11.8 − 1.43 = 10.4 dB

(b) For 16-APSK modulation, M = 16, so symbol rate

Rs = Rb/log2 M = 2048/4 = 512 kBd

Occupied bandwidth Bocc = (1 + α)Rs = (1 + 0.1)Rs = 563.2 kHz

And bandwidth efficiency η = Rb/Bocc = 2048/563.2 = 3.6364

Equation (12.28) yields the minimum required value of Eb/No as

$$(E_b/N_o)_{\min} = \frac{2^{\eta} - 1}{\eta} = \frac{2^{3.6364} - 1}{3.6364} = 3.1447 = 4.98\ \text{dB}$$

We determine the Eb/No value required by our 16-APSK system, denoted (Eb/No)16APSK, by reading the BER versus Eb/No curve of Figure 11.52 for 16-APSK at BER = $10^{-7}$ (which gives Eb/No = 15.2 dB) and adding the extra 1 dB required due to modem implementation loss. Thus, (Eb/No)16APSK = 16.2 dB. We conclude that our 16-APSK system is less power efficient than the benchmark specification by

ΔP = (Eb/No)16APSK − (Eb/No)min = 16.2 − 4.98 = 11.2 dB

(c) For 16-FSK modulation, M = 16, so symbol rate

Rs = Rb/log2 M = 2048/4 = 512 kBd

Occupied bandwidth is determined using Eq. (11.84) as

$$B_{occ} = \left(\frac{M + 1}{2} + \alpha\right)R_s = (8.5 + 0.1)\times 512 = 4403.2\ \text{kHz}$$

And bandwidth efficiency η = Rb/Bocc = 2048/4403.2 = 0.4651
Equation (12.28) yields the minimum required value of Eb/No as

$$(E_b/N_o)_{\min} = \frac{2^{\eta} - 1}{\eta} = \frac{2^{0.4651} - 1}{0.4651} = 0.818 = -0.87\ \text{dB}$$

We determine the Eb/No value required by our 16-FSK system, denoted (Eb/No)16FSK, by reading the BER versus Eb/No curve of Figure 11.48 for 16-FSK at BER = $10^{-7}$ (which gives Eb/No = 8.9 dB) and adding the extra 1 dB due to modem implementation loss to obtain (Eb/No)16FSK = 9.9 dB. We conclude that our 16-FSK system is less power efficient than the benchmark specification by

ΔP = (Eb/No)16FSK − (Eb/No)min = 9.9 − (−0.87) = 10.8 dB

(d) For 16-APSK modulation with error control coding of code rate r = 1/2, the bit rate is increased by the inserted redundant bits to Rbc = Rb × 1/r = 4096 kb/s, so the symbol rate is now

Rs = Rbc/log2 M = 4096/4 = 1024 kBd

Occupied bandwidth Bocc = (1 + α)Rs = (1 + 0.1) × 1024 = 1126.4 kHz

And bandwidth efficiency η = Rb/Bocc = 2048/1126.4 = 1.8182

Equation (12.28) yields the minimum required value of Eb/No as

$$(E_b/N_o)_{\min} = \frac{2^{\eta} - 1}{\eta} = \frac{2^{1.8182} - 1}{1.8182} = 1.3895 = 1.43\ \text{dB}$$

We determine the Eb/No required by this error control coded 16-APSK system, denoted (Eb/No)c16APSK, by reading the BER versus Eb/No curve of Figure 11.52 for 16-APSK at BER = $10^{-7}$ (which gives Eb/No = 15.2 dB), adding the extra 1 dB due to modem implementation loss and subtracting the coding gain of 10.5 dB. Thus

(Eb/No)c16APSK = (Eb/No)theoretical + Lmil − Gc = 15.2 + 1 − 10.5 = 5.7 dB

We conclude that our error control coded 16-APSK system is less power efficient than the benchmark specification by

ΔP = (Eb/No)c16APSK − (Eb/No)min = 5.7 − 1.43 = 4.27 dB

Examining the above results, we note that the use of error control coding enabled us to approach closer to the benchmark level of power efficiency than is possible with modulation without error control coding. To compare the bandwidth and power trade-off involved in each implementation, we will use the uncoded 16-APSK system as a reference and will need the value of C/No for each system, which is obtained from the equation (see Eq. (11.63))

$$C/N_o = (E_b/N_o)_{\text{theoretical}} + 10\log_{10}(R_b) + L_{mil} - G_c \tag{12.31}$$

Substituting relevant values, noting that Gc = 0 for the first three systems and Rb in the fourth system is the coded bit rate (= 4096 kb/s), we obtain

(C/No)QPSK = 74.9 dB; (C/No)16APSK = 79.3 dB; (C/No)16FSK = 73.0 dB; (C/No)c16APSK = 71.8 dB
Thus, switching from (uncoded) 16-APSK to (uncoded) QPSK enabled a signal power saving of 79.3−74.9 = 4.4 dB at the price of a factor of 2 increase in bandwidth (from 563.2 to 1126.4 kHz), whereas switching from (uncoded) 16-APSK to the coded 16-APSK having the above specified capability (r = 1/2; Gc = 10.5 dB) enabled
a larger saving in signal power of 79.3−71.8 = 7.5 dB at the same price of a factor of 2 increase in bandwidth. Switching from (uncoded) 16-APSK to (uncoded) 16-FSK enabled a power saving of 79.3−73 = 6.3 dB but at the price of a factor of 7.8 increase in bandwidth (from 563.2 to 4403.2 kHz). Error control coding in general facilitates a more efficient exchange between bandwidth and signal power than is possible by only switching modulation schemes.
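The calculations of Worked Example 12.2 can be reproduced with the following sketch. The Eb/No readings (11.3, 15.2, and 8.9 dB) are the values quoted above from the BER curves; the helper names `ebno_min_db` and `analyse` are illustrative and not part of any library.

```python
import math

RB = 2048e3      # message bit rate, b/s
ALPHA = 0.1      # raised cosine roll-off factor

def ebno_min_db(eta: float) -> float:
    """Shannon-Hartley minimum Eb/No (dB) at bandwidth efficiency eta, Eq. (12.28)."""
    return 10 * math.log10((2 ** eta - 1) / eta)

def analyse(M, ebno_curve_db, l_mil_db, code_rate=1.0, gain_db=0.0, fsk=False):
    """Return (Bocc in Hz, eta, min Eb/No in dB, shortfall dP in dB, C/No in dB)."""
    rbc = RB / code_rate                      # channel bit rate incl. redundancy
    rs = rbc / math.log2(M)                   # symbol rate, Bd
    width = (M + 1) / 2 + ALPHA if fsk else 1 + ALPHA   # Eq. (11.84) for FSK
    b_occ = width * rs
    eta = RB / b_occ                          # message bits per second per hertz
    required = ebno_curve_db + l_mil_db - gain_db
    c_no = ebno_curve_db + 10 * math.log10(rbc) + l_mil_db - gain_db  # Eq. (12.31)
    return b_occ, eta, ebno_min_db(eta), required - ebno_min_db(eta), c_no

systems = {
    "QPSK": analyse(4, 11.3, 0.5),
    "16-APSK": analyse(16, 15.2, 1.0),
    "16-FSK": analyse(16, 8.9, 1.0, fsk=True),
    "coded 16-APSK": analyse(16, 15.2, 1.0, code_rate=0.5, gain_db=10.5),
}
for name, (b, eta, emin, dp, cno) in systems.items():
    print(f"{name:14s} Bocc={b / 1e3:7.1f} kHz  eta={eta:.4f}  "
          f"min={emin:5.2f} dB  dP={dp:5.2f} dB  C/No={cno:.1f} dB")
```

Small differences in the last digit relative to the hand calculation arise only from rounding intermediate values.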
12.4 The Digital Receiver

We establish in Section 12.2 that at the decision point in the receiver the received pulse, although spread by the transmission medium and of a longer duration than it had when generated at the transmitter, must have the right waveform or shape (such as a sinc pulse) to avoid ISI. The focus of Section 12.2 is on mitigating the impact of pulse spreading caused by a finite system bandwidth. In addition to spreading, the transmission medium or channel does distort the transmitted pulse in both amplitude and phase in a manner fully described by the channel transfer function Hc(f). Furthermore, noise is also added to the transmitted pulse in the transmission medium and the front end of the receiver. This section briefly considers the measures of equalisation, matched filtering, and clock extraction used at the receiver to optimise the detection of a transmitted pulse sequence in the presence of channel distortion and noise.
12.4.1 Adaptive Equalisation

In nearly every case, the channel effect is undesirable and, as discussed in Section 4.7.4, a filter known as an equaliser is needed at the receiver to remove the channel distortion. That is, we must have

$$H_c(f)H_e(f) = 1\ \text{(normalised)} \tag{12.32}$$
where H e (f ) is the transfer function of the equaliser as shown in the block diagram of Figure 12.6b. Note that Eq. (12.32) specifically excludes propagation delay 𝜏, which can be accounted for by inserting the factor exp(−j2𝜋f𝜏) in the right hand side. In most cases, as in the public switched telephone network (PSTN), the channel characteristic and hence H c (f ) is time-varying in a nondeterministic way. The equalisation then has to be adaptive, with H e (f ) constantly and automatically adjusting itself to satisfy Eq. (12.32). One way of achieving this is by using a tapped-delay-line filter (such as in Figure 10.16). In this case, however, the tap gains are adjustable to minimise the mean squared error between expected and received pulse shapes averaged over several sampling instants. This can be done with the help of a training sequence of bits transmitted prior to the information bits. The optimum values computed for the tap gains are then maintained for the duration of the call on the assumption that the channel does not change significantly during this (short) period.
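The training procedure described above can be sketched as follows. This is only an illustration under stated assumptions: the least-mean-squares (LMS) update is one common way of minimising the mean squared error and is not necessarily the method the text has in mind, and the two-path channel, noise level, number of taps, and step size `mu` are all hypothetical values chosen for the demonstration.

```python
import random

def lms_equaliser(received, training, n_taps=5, mu=0.01):
    """Train a tapped-delay-line equaliser with the LMS rule; return tap gains."""
    taps = [0.0] * n_taps
    line = [0.0] * n_taps                  # delay line, newest sample first
    for x, d in zip(received, training):
        line = [x] + line[:-1]
        y = sum(w * v for w, v in zip(taps, line))
        err = d - y                        # expected minus equalised sample
        taps = [w + mu * err * v for w, v in zip(taps, line)]
    return taps

def apply_taps(received, taps):
    """Filter a received sequence with fixed tap gains."""
    line = [0.0] * len(taps)
    out = []
    for x in received:
        line = [x] + line[:-1]
        out.append(sum(w * v for w, v in zip(taps, line)))
    return out

def channel(symbols, noise_std, rng):
    """Hypothetical dispersive channel: y[n] = s[n] + 0.5*s[n-1] + noise."""
    out, prev = [], 0.0
    for s in symbols:
        out.append(s + 0.5 * prev + rng.gauss(0.0, noise_std))
        prev = s
    return out

def mse(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

rng = random.Random(1)
train = [rng.choice((-1.0, 1.0)) for _ in range(5000)]   # training sequence
taps = lms_equaliser(channel(train, 0.05, rng), train)

data = [rng.choice((-1.0, 1.0)) for _ in range(2000)]    # information symbols
rx = channel(data, 0.05, rng)
mse_before = mse(rx, data)
mse_after = mse(apply_taps(rx, taps), data)   # tap gains held fixed after training
print("tap gains:", [round(w, 3) for w in taps])
print(f"MSE before = {mse_before:.4f}, after = {mse_after:.4f}")
```

For this particular channel the tap gains settle close to the first few terms of the inverse response 1, −0.5, 0.25, …, so that the cascade of channel and equaliser approximates Eq. (12.32).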
12.4.2 Matched Filter

Let us now consider the effect of noise in order to correctly specify the transfer function Hy(f) of the receiver filter indicated in the block diagram of Figure 12.6. The noise w(t) indicated in this block diagram includes contributions from both the channel and the front end of the receiver. We will assume that this noise is AWGN. As discussed in detail in Chapter 6, the noise is described as additive because it is present in the same amount regardless of signal value. It is white because it contains all frequency components at the same amplitude (i.e. it has a uniform amplitude spectrum, or equivalently, a uniform power spectral density), just as white light contains equal amounts of all colours. Finally, it is described as Gaussian because it has a normal (i.e. Gaussian) probability density function.
Figure 12.14 Optimum detection of a signal g(t) in the presence of additive white Gaussian noise (AWGN): (a) the matched filter; (b) graphical illustration of relationship between input pulse g(t) and matched filter impulse response h(t); (c) correlation implementation of matched filter.
The receive filter must be designed to minimise the effect of noise. This is accomplished by maximising the ratio between signal power and noise power at the receiver output at the decision instant Ts. The situation is as illustrated in Figure 12.14a in which a pulse g(t) has AWGN w(t) added to it. Our task is to specify a matched filter which will produce an output y(t) that has a maximum SNR. This ensures that the sample of y(t) taken at t = Ts will contain the smallest possible amount of noise perturbation ñ(Ts). We derive the matched filter’s transfer function H(f) and impulse response h(t) required to optimise the detection of the pulse in the presence of AWGN and then discuss a practical implementation approach using correlation processing.

12.4.2.1 Specification of a Matched Filter
The transfer function H(f) of the matched filter in Figure 12.14a may be obtained using a heuristic approach involving three increasingly prescriptive observations. Once H(f) has been specified, the impulse response h(t) is obtained by taking the inverse FT.

1. The bandwidth of the filter must be just enough to pass the incoming pulse g(t). If filter bandwidth is too wide, noise power is unnecessarily admitted, and if it is too narrow then some pulse energy is blocked. Thus, the filter transfer function H(f) must span the same frequency band as the FT G(f) of the pulse g(t). How should the spectral shape of |H(f)| compare to |G(f)|?

2. The gain response |H(f)| of the filter should not necessarily be flat within its passband. Rather, it should be such that the filter attenuates the white noise significantly at those frequencies where G(f) is small, since these frequencies contribute little to the signal energy. And the filter should boost those frequencies at which G(f) is large in order to maximise the output signal energy. Therefore, the filter should be tailored to the incoming pulse, with a gain response that is small where G(f) is small and large where G(f) is large. In other words, the gain response of the filter should be identical in shape to the amplitude spectrum of the pulse. That is

$$|H(f)| = K|G(f)| \tag{12.33}$$

where K is some constant.
3. To complete the specification of the filter, its phase response is required. The filter output y(t) may be written as

$$y(t) = g_o(t) + \tilde{n}(t) \tag{12.34}$$

where go(t) is the signal component and ñ(t) is coloured noise, coloured because after white noise passes through a filter its amplitude spectrum is no longer flat but is stronger at some frequencies than at others. The maximum instantaneous output signal power occurs at the sampling instant t = Ts if every frequency component (i.e. cosine function) in go(t) is delayed by the same amount Ts and has zero initial phase so that

$$g_o(t) = A_1\cos[2\pi f_1(t - T_s)] + A_2\cos[2\pi f_2(t - T_s)] + A_3\cos[2\pi f_3(t - T_s)] + \cdots \tag{12.35}$$

which results in the maximum possible signal sample at t = Ts given by

$$g_o(T_s) = A_1 + A_2 + A_3 + \cdots$$

where A1, A2, A3, …, are the amplitudes of the sinusoidal components of go(t) of respective frequencies f1, f2, f3, … Note that, in practice, these frequencies will be infinitesimally spaced, giving rise to a continuous spectrum Go(f). Rewriting Eq. (12.35) in the form

$$g_o(t) = A_1\cos(2\pi f_1 t - 2\pi f_1 T_s) + A_2\cos(2\pi f_2 t - 2\pi f_2 T_s) + A_3\cos(2\pi f_3 t - 2\pi f_3 T_s) + \cdots$$

makes it clear that the phase spectrum of go(t) is

$$\phi_o(f) = -2\pi f T_s$$

From the discussion in Chapter 4, this output phase response is the sum of the input signal’s phase spectrum φg(f) and the filter’s phase response φH(f). That is, φo(f) = φg(f) + φH(f), which means that φH(f) = φo(f) − φg(f), and hence

$$\phi_H(f) = -2\pi f T_s - \phi_g(f) \tag{12.36}$$
Equation (12.33) gives the required filter gain response and Eq. (12.36) gives its phase response. What remains is for us to combine these two into a single expression for the filter’s transfer function H(f) and then take the inverse FT of H(f) to obtain the filter’s impulse response h(t)

$$\begin{aligned} H(f) &= K|G(f)|\exp[j\phi_H(f)]\\ &= K|G(f)|\exp[-j\phi_g(f)]\exp(-j2\pi f T_s)\\ &= KG^*(f)\exp(-j2\pi f T_s)\end{aligned} \tag{12.37}$$

where * denotes complex conjugation, and we have used the fact that

$$G(f) = |G(f)|\exp[j\phi_g(f)];\qquad G^*(f) = |G(f)|\exp[-j\phi_g(f)]$$

Noting (from Eqs. (4.84) and (4.86)) that complex conjugation of G(f) corresponds to a time reversal of the real signal g(t) to give g(−t); and (from Eq. (4.80)) that multiplying G*(f) by the exponential term exp(−j2πfTs) corresponds to delaying g(−t) by Ts, we see that the inverse FT of Eq. (12.37) yields

$$h(t) = Kg(T_s - t) \tag{12.38}$$
This is an important result which states that if the detection filter is to be matched to an AWGN-corrupted pulse g(t) of duration Ts (matched in the sense of maximising the output signal power when compared to the noise power at the decision instant t = Ts) then the filter must be designed with an impulse response that is (except for a nonzero positive scale factor K) a time-reversed replica of the pulse, delayed by the sampling interval Ts. See Figure 12.14b for a graphical illustration of the process of converting g(t) into g(Ts − t) for any given value of Ts. It may also be useful to review the material on time shifting and time reversal in Sections 3.2.1 and 3.2.2 for more in-depth information. Note that here and elsewhere in this chapter the term delay is used with reference to time t and amounts to a rightward translation in the +t axis direction, which corresponds to an advance with reference to −t and the −t axis direction.

12.4.2.2 Matched Filter by Correlation
It is informative to consider how the matched filter relates to the correlation receiver employed in Chapter 11 for coherent detection of modulated pulses. A matched filter designed to detect a known pulse g(t) will have the impulse response h(t) given by Eq. (12.38). As discussed in Section 3.6, the output go(t) of this filter in response to an input pulse gi(t) is obtained by convolving gi(t) with the impulse response of the filter. Thus

$$\begin{aligned} g_o(t) &= g_i(t) \ast h(t)\\ &= \int_{-\infty}^{\infty} g_i(\tau)h(t - \tau)\,d\tau\\ &= \int_{0}^{T_s} g_i(\tau)g(T_s - t + \tau)\,d\tau\end{aligned} \tag{12.39}$$
where the second line is merely the definition of the convolution integral, and in the last line we make use of the fact that the input pulse is of duration Ts to narrow the range of integration, and we substitute h(t) ≡ g(Ts − t), ignoring the scale factor K or normalising it to unity. Refer to Figure 12.14b for a graphical aid in seeing how h(t − τ) becomes g(Ts − t + τ). You will notice the series of steps that transform the known pulse g(t) for which the matched filter is designed, shown as the leftmost waveform of Figure 12.14b, to h(t − τ), the rightmost waveform. By comparing these two waveforms you will be able to see that the rightmost is simply the first plotted on a τ axis and made to start earlier (i.e. advanced) by Ts − t, hence it is g(τ + Ts − t). Sampling the matched filter output at t = Ts simply means replacing t by Ts on both sides of Eq. (12.39). This yields the result

$$\begin{aligned} g_o(T_s) &= \int_{0}^{T_s} g_i(\tau)g(T_s - T_s + \tau)\,d\tau\\ &= \int_{0}^{T_s} g_i(\tau)g(\tau)\,d\tau\end{aligned} \tag{12.40}$$
The right-hand side of this equation shows that to obtain go(Ts), which is the matched filter’s output sampled at t = Ts, we may use an arrangement that multiplies the incoming pulse gi(t) by the known pulse g(t) and then integrates the product over the pulse interval 0 → Ts. This process is known as the correlation of gi(t) with g(t). A block diagram of this implementation of the matched filter is shown in Figure 12.14c. Note therefore that the coherent demodulator (Figure 7.30), correlation receiver (Figures 11.8c and 12.14c), and matched filter (Figure 12.14a) in fact perform equivalent operations, as stated in Chapter 11. This equivalence is further emphasised in Figure 12.15. When comparing Figure 12.15a,c, recall that an integrator is just a special LPF, with a gain response that decreases in inverse proportion to frequency.

Figure 12.15 Equivalence between (a) correlation receiver, in which the received pulse g′(t) is multiplied by the known pulse g(t) and integrated over 0 to Ts to give go(Ts); (b) matched filter of impulse response h(t) = g(Ts − t), whose output go(t) is sampled at t = Ts to give go(Ts); (c) coherent demodulator, in which the modulated signal is multiplied by the known carrier and passed through an LPF to give the message signal.

In Eqs. (12.39) and (12.40), time t is measured from the start of each pulse interval, hence the range 0 to Ts even though we are in the nth pulse interval, where n can take on any positive integer value. The correlator performs an integrate-and-dump operation whereby an accumulator is reset to zero at the start of each pulse interval, and the product of gi(t) and g(t) is then accumulated for a period Ts at the end of which the accumulator contains the result go(Ts) for that interval. This result is passed to a decision device and the accumulator is immediately reset to zero ready to repeat the same computation in the next pulse interval. It is worth emphasising that the various descriptions of a demodulator as coherent, a carrier or pulse as known, and a filter as matched to a known pulse imply accurate knowledge of the phase and frequency of the incoming carrier or pulse. This information is usually obtained by clock extraction as discussed after the following worked examples.

12.4.2.3 Matched Filter Worked Examples
Worked Example 12.3 Matched Filter Impulse Response: Graphical Approach

We wish to sketch the impulse response of a matched filter for receiving the pulse g(t) shown in Figure 12.16a, where the pulse duration Ts = 5 μs. Further discussion of the operations of time delay and time reversal is available in Sections 3.2.1 and 3.2.2.

The required impulse response is given by Eq. (12.38) with K = 1. This can be obtained in two steps. First, the pulse g(t) is time-reversed to give g(−t). Then, g(−t) is delayed by Ts to give g(Ts − t), which is the required impulse response. The waveforms g(−t) and g(Ts − t) are sketched in Figure 12.16b,c. It is important to understand how these two waveforms are obtained. Observe that the waveform of g(−t) may be obtained simply by flipping g(t) horizontally about t = 0, and that the waveform g(Ts − t) results from delaying g(−t) by a time Ts. In this case, since g(−t) ‘starts’ at the time t = −5 μs, it follows that g(Ts − t) must ‘start’ at a time Ts (= 5 μs) later, which is therefore the time t = 0.

Table 12.1 provides verification of the above procedures. Noting that g(ti) is the value of the pulse g(t) at t = ti, it follows by definition of g(t) in Figure 12.16a that g(−10) = 0, g(−5) = 0, g(1) = 1, g(4) = 0.5, g(10) = 0, and so on, where t is in μs. Table 12.1 gives values of the waveforms g(t), g(−t), and g(Ts − t) at various values of t. For example, at t = 4, g(t) = g(4) = 0.5 (by definition); g(−t) = g(−4) = 0 (by definition); and g(Ts − t) = g(5 − 4) = g(1) = 1 (by definition). Plotting the entries of this table leads to Figure 12.16, with column 3 plotted against column 1 to give Figure 12.16b, and column 4 plotted against column 1 to give Figure 12.16c.
Table 12.1 Worked Example 12.3. Entries plotted in Figure 12.16. Ts = 5 μs.

| t (μs) | g(t) | g(−t) | h_r(t) = g(Ts − t) |
|---|---|---|---|
| −10 | 0 | = g(10) = 0 | = g(5 − (−10)) = g(15) = 0 |
| −5 | 0 | = g(5) = 0 | = g(5 − (−5)) = g(10) = 0 |
| −4 | 0 | = g(4) = 0.5 | = g(5 − (−4)) = g(9) = 0 |
| −3 | 0 | = g(3) = 1 | = g(5 − (−3)) = g(8) = 0 |
| −2 | 0 | = g(2) = 1 | = g(5 − (−2)) = g(7) = 0 |
| −1 | 0 | = g(1) = 1 | = g(5 − (−1)) = g(6) = 0 |
| 0 | 0 | = g(0) = 0 | = g(5 − 0) = g(5) = 0 |
| 1 | 1 | = g(−1) = 0 | = g(5 − 1) = g(4) = 0.5 |
| 2 | 1 | = g(−2) = 0 | = g(5 − 2) = g(3) = 1 |
| 3 | 1 | = g(−3) = 0 | = g(5 − 3) = g(2) = 1 |
| 4 | 0.5 | = g(−4) = 0 | = g(5 − 4) = g(1) = 1 |
| 5 | 0 | = g(−5) = 0 | = g(5 − 5) = g(0) = 0 |
| 10 | 0 | = g(−10) = 0 | = g(5 − 10) = g(−5) = 0 |
Figure 12.16 Worked Example 12.3: (a) pulse g(t); (b) time-reversed pulse g(−t); (c) impulse response h(t) = g(Ts − t), with Ts = 5 μs. [Vertical axes marked at 0.5 and 1.0; horizontal axes: t in μs, from −5 to 5.]
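The two-step construction (time reversal, then delay by the pulse duration) can be sketched numerically. The following is only an illustrative sketch: the staircase pulse `g` is a hypothetical shape chosen to agree with the sample values tabulated in Table 12.1, and it may differ from the actual waveform of Figure 12.16a between those sample points. The function names are likewise assumptions made for this example.

```python
import numpy as np

TS = 5.0  # pulse duration in microseconds, as in the worked example


def g(t):
    """Hypothetical staircase pulse that agrees with the tabulated samples."""
    t = np.asarray(t, dtype=float)
    full = (t > 0) & (t <= 3)   # amplitude 1.0 region (edges are assumed)
    half = (t > 3) & (t < TS)   # amplitude 0.5 region (edges are assumed)
    return np.where(full, 1.0, np.where(half, 0.5, 0.0))


def matched_filter(t, K=1.0):
    """Impulse response K*g(Ts - t): time reversal followed by a delay of Ts."""
    return K * g(TS - np.asarray(t, dtype=float))


# Rebuild the columns of Table 12.1 at the same time instants.
t_values = np.array([-10, -5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5, 10], dtype=float)
col_g = g(t_values)
col_g_reversed = g(-t_values)
col_h = matched_filter(t_values)

for row in zip(t_values, col_g, col_g_reversed, col_h):
    print("t = %5.1f  g(t) = %.1f  g(-t) = %.1f  g(Ts - t) = %.1f" % row)
```

Evaluating `matched_filter` on a fine time grid instead of the thirteen tabulated instants gives a sampled version of the impulse response sketched in panel (c).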
Worked Example 12.4 Matched Filter Output: Graphical Approach

We wish to determine the output pulse $g_o(t)$ that is obtained at the decision point of the receiver when the transmitted pulse g(t) in the previous example (Figure 12.16a) is detected using a matched filter of impulse response h(t). See a more detailed discussion of the convolution operation in Section 3.6.2.

What is needed here is the response $g_o(t)$ of a (matched) filter of impulse response h(t) to an input signal g(t). We know from Section 3.6 that the output signal $g_o(t)$ is given by

$$g_o(t) = g(t) \star h(t) = \int_{-\infty}^{\infty} g(\tau)\,h(t - \tau)\,d\tau \qquad (12.41)$$
Equation (12.41) defines the convolution integral, which states that $g_o(t)$ is given at each time instant t by the total area under the function g(τ)h(t − τ), which is the product of the input waveform and a time-reversed and delayed (by t) version of the impulse response. For convenience, the input pulse g(τ) and the impulse response h(τ) of the matched filter (obtained in the previous worked example) are sketched again in Figure 12.17a,b. When both g(t) and h(t) are of finite duration, as in this case, it is easier, and indeed very illuminating, to evaluate the convolution integral graphically as follows.

(i) Obtain the waveform h(t − τ) using the procedure described in the previous worked example. In Figure 12.17c, a few examples of h(t − τ) are shown for t = −2, 0, 2, 5, 7, and 10 μs.
(ii) Multiply together the waveforms h(t − τ) and g(τ) to obtain the integrand g(τ)h(t − τ) in Eq. (12.41). Note that this integrand is identically zero for those values of t for which h(t − τ) does not overlap g(τ). It can be seen in Figure 12.17 that this happens for t ≤ 0 and t ≥ 10, which means that the output pulse $g_o(t)$ is zero in these two regions of time. Example curves of g(τ)h(t − τ) are shown in Figure 12.17d, for t = 2, 5, and 7 μs.
(iii) The value of the output pulse $g_o(t)$ at a time t is the area under the curve of g(τ)h(t − τ). For example, it can be seen in Figure 12.17d that the area under the curve of g(τ)h(7 − τ) is 1.5, which means that $g_o(t)$ = 1.5 at t = 7 μs.
(iv) Repeat the above steps for different values of t to obtain the output $g_o(t)$ sketched in Figure 12.18.

Note that the matched filter has distorted the transmitted pulse g(t) in such a way that the maximum value of the output pulse $g_o(t)$ occurs at the decision instant t = $T_s$. It can be seen in Figure 12.17c that h($T_s$ − τ) = g(τ). Since the pulse g(t) is a real signal, it follows from Eq. (12.41) that

$$g_o(T_s) = \int_{-\infty}^{\infty} g(\tau)\,h(T_s - \tau)\,d\tau = \int_{-\infty}^{\infty} g(\tau)\,g(\tau)\,d\tau = \int_{-\infty}^{\infty} |g(\tau)|^2\,d\tau \equiv E \equiv \text{Energy of signal } g(t) \qquad (12.42)$$
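The graphical procedure and the energy result of Eq. (12.42) can be checked with a short numerical sketch. It approximates the convolution integral by a Riemann sum on a fine grid. The staircase pulse used here is an assumed shape (amplitude 1 up to 3 μs, then 0.5 up to 5 μs), so the numbers it produces need not match values read from Figures 12.17 and 12.18; the grid size `N` is also an arbitrary choice.

```python
import numpy as np

Ts = 5.0                              # pulse duration in microseconds
N = 5000                              # samples per pulse duration (assumed)
dt = Ts / N
tau = (np.arange(N) + 0.5) * dt       # midpoints of the integration cells

# Assumed staircase pulse: 1.0 for tau <= 3 us, 0.5 for 3 us < tau < 5 us.
g = np.where(tau <= 3.0, 1.0, 0.5)

# On this symmetric grid, reversing the samples gives h(tau) = g(Ts - tau).
h = g[::-1]

# Riemann-sum approximation of the convolution integral of Eq. (12.41).
g_o = np.convolve(g, h) * dt
t_out = (np.arange(g_o.size) + 1) * dt   # time instant of each output sample

energy = np.sum(g ** 2) * dt             # pulse energy, Eq. (12.42)
peak_index = int(np.argmax(g_o))

print("peak of g_o(t) occurs at t =", t_out[peak_index], "us")
print("peak value =", g_o[peak_index], " pulse energy =", energy)
```

Under these assumptions the output peaks at the decision instant t = Ts, and the peak value equals the pulse energy, which is the content of Eq. (12.42).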
Thus, if a known pulse g(t) is presented to the input of a filter matched to it, then when the output $g_o(t)$ of the matched filter is sampled at time t = $T_s$, that sample $g_o(T_s)$ gives an estimate of the energy of the transmitted pulse g(t), assuming of course that the gain of the matched filter is normalised to unity and the effect of the transmission channel has been equalised according to Eq. (12.32).

Worked Example 12.5 Irrelevance of Pulse Shapes

We wish to show that the signal-to-noise ratio $\mathrm{SNR}_o$ at the output of a matched filter depends only on the ratio between input pulse energy E and noise power density, and not on the particular shape of the pulse.
Figure 12.17 Worked Example 12.4: (a) input pulse g(τ); (b) impulse response h(τ); (c) h(t − τ) for t = −2, 0, 2, 5, 7, and 10 μs; (d) integrand g(τ)h(t − τ) for t = 2, 5, and 7 μs. [Horizontal axes: τ in μs.]
Figure 12.18 Worked Example 12.4: output $g_o(t)$ of matched filter. [Plot of $g_o(t)$ against t in μs from 0 to 10, with the peak $g_o(T_s)$ at the decision instant t = $T_s$ = 5 μs.]

It is clear from the last worked example, and more specifically Eq. (12.42), that the signal at the output of a matched filter has a maximum value E at the decision instant t = $T_s$, where E is the transmitted pulse energy. Since instantaneous power is defined as the square of the absolute value of the signal at a given instant, it follows that the instantaneous power of the output signal $g_o(t)$ at the decision instant is

$$P_s = [g_o(T_s)]^2 = E^2 \qquad (12.43)$$

From Eq. (4.164) in our discussion of output spectral density of LTI systems (Section 4.7.2), the output noise power spectral density of the matched filter is

$$S_o(f) = \frac{N_o}{2}\,|G(f)|^2 \qquad (12.44)$$

where $N_o/2$ is the power spectral density of white noise w(t) at the input of the matched filter, G(f) is the spectrum of the transmitted pulse, and we have used Eq. (12.33) for the filter's gain response with the constant K set to 1. Integrating $S_o(f)$ over the entire frequency axis yields output noise power as

$$P_n = \frac{N_o}{2}\int_{-\infty}^{\infty} |G(f)|^2\,df = \frac{N_o}{2}\,E \qquad (12.45)$$

where we obtained the final term by applying Rayleigh's energy theorem (also known as Parseval's theorem) given in Eq. (4.94). We obtain $\mathrm{SNR}_o$ as the ratio between $P_s$ in Eq. (12.43) and $P_n$ in Eq. (12.45)

$$\mathrm{SNR}_o = \frac{P_s}{P_n} = \frac{E^2}{E N_o/2} = \frac{2E}{N_o} \qquad (12.46)$$
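A quick numerical illustration of Eq. (12.46) is sketched below. Three pulse shapes (rectangular, triangular, and half-sine, all chosen arbitrarily for this example) are scaled to the same energy, and the output SNR of a filter matched to each is computed from Eqs. (12.43) and (12.45). The noise power is evaluated in the time domain as $(N_o/2)\int |h(t)|^2 dt$, which equals the frequency-domain integral by Rayleigh's energy theorem. The values of `N0`, `E_target`, and the grid size are assumptions.

```python
import numpy as np

N = 10_000                 # samples per pulse duration (assumed)
Ts = 1.0                   # pulse duration, normalised (assumed)
dt = Ts / N
t = (np.arange(N) + 0.5) * dt
N0 = 2.0e-3                # assumed noise parameter N_o, arbitrary units
E_target = 1.0             # common pulse energy (assumed)


def normalise(pulse):
    """Scale a sampled pulse so that its energy equals E_target."""
    return pulse * np.sqrt(E_target / (np.sum(pulse ** 2) * dt))


pulses = {
    "rectangular": normalise(np.ones(N)),
    "triangular": normalise(1.0 - np.abs(2.0 * t / Ts - 1.0)),
    "half-sine": normalise(np.sin(np.pi * t / Ts)),
}


def output_snr(g):
    """SNR at the decision instant for a filter matched to g (K = 1)."""
    h = g[::-1]                                   # samples of g(Ts - t)
    peak = np.sum(g * h[::-1]) * dt               # g_o(Ts), Eq. (12.42)
    noise_power = 0.5 * N0 * np.sum(h ** 2) * dt  # Eq. (12.45) via Parseval
    return peak ** 2 / noise_power                # Eq. (12.46)


snr = {name: output_snr(p) for name, p in pulses.items()}
for name, value in snr.items():
    print(f"{name:12s} SNR_o = {value:.3f}   (2E/N_o = {2 * E_target / N0:.3f})")
```

All three shapes give the same value, 2E/N_o, in this sketch, since only the pulse energy enters the calculation.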
Equation (12.46) is the desired result. It is interesting that the shape or waveform of the transmitted pulse g(t) does not feature in the achievable signal-to-noise ratio. All that matters is the pulse energy, which may be increased to improve $\mathrm{SNR}_o$ by increasing the amplitude and/or duration of the pulse. The latter option, however, would reduce symbol rate. In summary then, provided a matched filter is used at the receiver, all pulses of the same energy are detected equally well in the presence of white noise irrespective of the pulse shapes. We must therefore emphasise that pulse shaping (studied in Section 12.2) is required for ISI minimisation and has no bearing whatsoever on the impact of white noise.

Worked Example 12.6 Matched Filter Output: Mathematical Approach

The outputs of various matched filters are presented in Chapter 11. See, for example, Figure 11.24. We now wish to show how some of these outputs were determined. We will derive an expression for the output y(t) of a matched filter for detecting a binary PSK pulse g(t), which is made up of a sinusoidal carrier that completes n cycles in its interval of duration $T_b$ from t = 0 to t = $T_b$, where n is a positive integer. The expression for the binary PSK pulse is

$$g(t) = A_c \cos\left(2\pi \frac{n}{T_b}\,t\right) \mathrm{rect}\left(\frac{t - T_b/2}{T_b}\right) \qquad (12.47)$$
You may wish to refer to discussions of the rectangular function rect() in Section 2.6.3 and sinusoidal function in Section 2.7, if in any doubt. In line with Eq. (12.38), setting K = 1, we replace t wherever it occurs in Eq. (12.47) with $T_b$ − t to obtain the impulse response h(t) of the matched filter as

$$h(t) = A_c \cos\left(2\pi \frac{n}{T_b}(T_b - t)\right) \mathrm{rect}\left(\frac{(T_b - t) - T_b/2}{T_b}\right) \qquad (12.48)$$

Next, we replace t wherever it occurs in Eq. (12.48) with t − τ to obtain

$$\begin{aligned}
h(t - \tau) &= A_c \cos\left(2\pi \frac{n}{T_b}(\tau + T_b - t)\right) \mathrm{rect}\left(\frac{(\tau + T_b - t) - T_b/2}{T_b}\right) \\
&= A_c \cos\left(2\pi \frac{n}{T_b}(\tau + T_b - t)\right) \mathrm{rect}\left(\frac{\tau - (t - T_b/2)}{T_b}\right)
\end{aligned} \qquad (12.49)$$
Equation (12.41) specifies that the desired output y(t) is obtained by integrating the product signal g(τ)h(t − τ) in the limits −∞ to ∞. The integration only needs to be carried out over the interval where this product signal is nonzero, i.e. in the region where the pulses g(τ) and h(t − τ) overlap. Since

$$\mathrm{rect}\left(\frac{t}{T_b}\right) = \begin{cases} 1, & -T_b/2 \le t \le T_b/2 \\ 0, & \text{Otherwise} \end{cases}$$

it follows that

$$g(\tau) = \begin{cases} A_c \cos\left(2\pi \dfrac{n}{T_b}\,\tau\right), & 0 \le \tau \le T_b \\ 0, & \text{Otherwise} \end{cases}$$

and

$$h(t - \tau) = \begin{cases} A_c \cos\left(2\pi \dfrac{n}{T_b}(\tau + T_b - t)\right), & -\dfrac{T_b}{2} \le \tau - \left(t - \dfrac{T_b}{2}\right) \le \dfrac{T_b}{2} \\ 0, & \text{Otherwise} \end{cases}
= \begin{cases} A_c \cos\left(2\pi \dfrac{n}{T_b}(\tau + T_b - t)\right), & t - T_b \le \tau \le t \\ 0, & \text{Otherwise} \end{cases}$$
Note that the right end of the interval of h(t − τ) is τ = t and the left end of the interval of g(τ) is τ = 0. Clearly there is no overlap if the right end of h(t − τ) is below the left end of g(τ). This means that the output y(t) = 0 for t ≤ 0. Similarly, there is no overlap if the left end of h(t − τ), which is τ = t − $T_b$, exceeds the right end of g(τ), which is τ = $T_b$. So, y(t) = 0 for t − $T_b$ ≥ $T_b$, i.e. for t ≥ 2$T_b$. Combining these two regions, it means that y(t) is zero outside the interval 0 ≤ t ≤ 2$T_b$. Next, we note that when $T_b$ ≤ t ≤ 2$T_b$ the region of overlap is from the left end of h(t − τ) to the right end of g(τ), whereas when 0 ≤ t ≤ $T_b$, the region of overlap is from the left end of g(τ) to the right end of h(t − τ). The output signal is therefore given by the integrations

$$y(t) = \begin{cases}
\displaystyle\int_{0}^{t} \left\{ A_c \cos\left(2\pi \frac{n}{T_b}\,\tau\right) \cdot A_c \cos\left(2\pi \frac{n}{T_b}(\tau + T_b - t)\right) \right\} d\tau, & 0 \le t \le T_b \\[2ex]
\displaystyle\int_{t - T_b}^{T_b} \left\{ A_c \cos\left(2\pi \frac{n}{T_b}\,\tau\right) \cdot A_c \cos\left(2\pi \frac{n}{T_b}(\tau + T_b - t)\right) \right\} d\tau, & T_b \le t \le 2T_b
\end{cases}$$
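The integrand is the product of two sinusoidal functions of τ. For the first interval, applying the trigonometric identity for the product of two cosines

$$\begin{aligned}
y(t) &= \frac{A_c^2}{2} \int_{0}^{t} \left\{ \cos\left(2\pi \frac{n}{T_b}(T_b - t)\right) + \cos\left(2\pi \frac{n}{T_b}(2\tau + T_b - t)\right) \right\} d\tau \\
&= \frac{A_c^2}{2} \left\{ \tau \cos\left(2\pi \frac{n}{T_b}(T_b - t)\right) \Bigg|_{\tau=0}^{\tau=t} + \frac{T_b}{4\pi n} \sin\left(2\pi \frac{n}{T_b}(2\tau + T_b - t)\right) \Bigg|_{\tau=0}^{\tau=t} \right\} \\
&= \frac{A_c^2}{2} \left( t \cos\left(2\pi n - \frac{2\pi n t}{T_b}\right) + \frac{T_b}{4\pi n} \left[ \sin\left(2\pi n + \frac{2\pi n t}{T_b}\right) - \sin\left(2\pi n - \frac{2\pi n t}{T_b}\right) \right] \right) \\
&= \frac{A_c^2}{2} \left( t \cos(2\pi n t/T_b) + \frac{T_b}{4\pi n} \left[ \sin(2\pi n t/T_b) - \sin(-2\pi n t/T_b) \right] \right) \\
&= \frac{A_c^2}{2} \left( t \cos(2\pi n t/T_b) + \frac{T_b}{2\pi n} \sin(2\pi n t/T_b) \right)
\end{aligned}$$

Similarly, for the second interval

$$\begin{aligned}
y(t) &= \frac{A_c^2}{2} \int_{t - T_b}^{T_b} \left\{ \cos\left(2\pi \frac{n}{T_b}(T_b - t)\right) + \cos\left(2\pi \frac{n}{T_b}(2\tau + T_b - t)\right) \right\} d\tau \\
&= \frac{A_c^2}{2} \left( (2T_b - t) \cos(2\pi n t/T_b) - \frac{T_b}{2\pi n} \sin(2\pi n t/T_b) \right)
\end{aligned}$$

To summarise

$$y(t) = \frac{A_c^2}{2} \begin{cases}
\left( t \cos(2\pi n t/T_b) + \dfrac{T_b}{2\pi n} \sin(2\pi n t/T_b) \right), & 0 \le t \le T_b \\[2ex]
\left( (2T_b - t) \cos(2\pi n t/T_b) - \dfrac{T_b}{2\pi n} \sin(2\pi n t/T_b) \right), & T_b \le t \le 2T_b
\end{cases} \qquad (12.50)$$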
The symmetry of the two expressions for y(t) in Eq. (12.50) allows them to be merged into the following single expression for y(t)

$$\begin{aligned}
y(t) &= \begin{cases}
\dfrac{A_c^2}{2} \left[ (T_b - |t - T_b|) \cos\left(\dfrac{2\pi n}{T_b}\,t\right) - \dfrac{T_b}{2\pi n} \sin\left(\dfrac{2\pi n}{T_b}\,|t - T_b|\right) \right], & 0 \le t \le 2T_b \\
0, & \text{Otherwise}
\end{cases} \\
&= \frac{A_c^2}{2} \left[ (T_b - |t - T_b|) \cos\left(\frac{2\pi n}{T_b}\,t\right) - \frac{T_b}{2\pi n} \sin\left(\frac{2\pi n}{T_b}\,|t - T_b|\right) \right] \mathrm{rect}\left(\frac{t - T_b}{2T_b}\right)
\end{aligned} \qquad (12.51)$$

This result is plotted in Figure 12.19 for n = 3. We see that the matched filter output y(t) has a maximum value y($T_b$) = $A_c^2 T_b/2$ at the decision instant t = $T_b$. Note that this maximum value is the energy $E_b$ of the binary pulse, which is a sinusoidal signal of amplitude $A_c$ and duration $T_b$.
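As a sanity check on the derivation, the closed-form output can be compared with a direct numerical convolution of the BPSK pulse with its matched filter. The sketch below is illustrative only: the parameter values ($A_c$ = 1, $T_b$ = 1, n = 3) and the grid size are assumptions. The closed form is written with the sine term evaluated at |t − T_b|, which reproduces the sign change between the two intervals of Eq. (12.50).

```python
import numpy as np

Ac, Tb, n = 1.0, 1.0, 3           # assumed amplitude, bit duration, cycles per bit
N = 4000                          # samples per bit interval (assumed)
dt = Tb / N
tau = (np.arange(N) + 0.5) * dt   # midpoints of the integration cells

g = Ac * np.cos(2 * np.pi * n * tau / Tb)   # BPSK pulse on 0..Tb, Eq. (12.47)
h = g[::-1]                                 # samples of h(t) = g(Tb - t)

# Numerical evaluation of the convolution integral, Eq. (12.41).
y_numeric = np.convolve(g, h) * dt
t = (np.arange(y_numeric.size) + 1) * dt    # time instant of each output sample


def y_closed_form(t):
    """Merged piecewise result, valid for 0 <= t <= 2*Tb."""
    w = 2 * np.pi * n / Tb
    return 0.5 * Ac ** 2 * ((Tb - np.abs(t - Tb)) * np.cos(w * t)
                            - Tb / (2 * np.pi * n) * np.sin(w * np.abs(t - Tb)))


max_error = np.max(np.abs(y_numeric - y_closed_form(t)))
peak_time = t[np.argmax(y_numeric)]
peak_value = y_numeric.max()

print("largest difference between numeric and closed form:", max_error)
print("peak at t =", peak_time, "with value", peak_value)
```

With these assumed values the peak occurs at t = T_b and equals A_c²T_b/2 = 0.5, the pulse energy.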
Figure 12.19 Worked Example 12.6: output of filter matched to BPSK pulse of amplitude $A_c$ and duration $T_b$. [Plot of y(t) against t over 0 to 2$T_b$; vertical axis marked −$E_b$, 0, and $E_b$, where $E_b = A_c^2 T_b/2$.]

12.4.3 Clock Extraction

Decision instants at the receiver must be accurately spaced at intervals of the transmitted symbol duration $T_s$. This allows the matched filter output to be sampled at the optimum instants, for negligible ISI and maximum output SNR. Small short-term deviations from the optimum timing instants are known as timing jitter. If this is unchecked, especially in long-distance high-data-rate systems with many intermediate repeaters, it may accumulate sufficiently so that the timing error exceeds half the symbol duration, causing the decision instant to be set
within the wrong symbol interval, entirely missing out one or more intervals. This problem is known as symbol slip. It causes subsequent symbols to be in error until there is a realignment.

Clock or timing extraction is a process that seeks to derive from the incoming symbol stream a sinusoidal signal of the correct phase and of a frequency equal to the symbol rate ($R_s = 1/T_s$). This sinusoid may then be passed through a comparator (a zero-crossing detector) to give a square wave clock signal of period $T_s$. The incoming symbol stream is then decoded by arranging for the matched filter output to be sampled at every rising (or falling) edge of the clock signal.

The need for the transmitted symbol stream to contain frequent voltage transitions (e.g. between ±V volts for binary coding) is emphasised in our discussion of line coding in Chapter 10. When this is the case, the symbol stream may contain a significant component at the sampling frequency $f_s$ (= $R_s$), which can be directly extracted using a narrow bandpass filter tuned to $f_s$. However, some symbol patterns may only contain a fraction or multiple of the desired frequency component. Therefore, in general, the incoming symbol stream is passed through a suitable nonlinear device, e.g. a square-law device, a full-wave rectifier, etc. From our discussion of nonlinear distortion (Section 4.7.6), the output of such a device will contain the desired frequency component $f_s$, which may then be extracted by filtering. Figure 12.20 shows one possible arrangement for clock extraction. A phase-locked loop (PLL), discussed in Sections 7.5.2 and 8.6.2, may be used in place of the narrow band filter to improve the phase match between the clock signal used at the transmitter and that extracted at the receiver.
Figure 12.20 Clock extraction. [Block diagram: received symbol stream (containing fs/2, etc.) → nonlinear device (output contains fs, etc.) → narrow band BPF or PLL tuned to fs (output fs only) → threshold comparator → clock signal of period Ts.]
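The chain of Figure 12.20 can be mimicked in a few lines of code. The following is a simplified sketch under stated assumptions, not a practical clock recovery circuit: the bipolar symbols are carried by half-sine pulses (chosen here only so that squaring produces a clean periodic component), the narrow bandpass filter is imitated by keeping a single FFT bin, and the comparator is a plain sign test. All variable names and parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
sps = 16                  # samples per symbol (assumed)
n_sym = 256               # number of symbols in the stream (assumed)
Rs = 1.0                  # symbol rate, normalised to 1 symbol per unit time
f_sim = sps * Rs          # simulation sampling rate

# Random bipolar symbol stream carried by half-sine pulses (an assumption).
symbols = rng.choice([-1.0, 1.0], size=n_sym)
pulse = np.sin(np.pi * np.arange(sps) / sps)
stream = np.kron(symbols, pulse)

# Nonlinear device: a square-law device removes the data-dependent polarity.
squared = stream ** 2

# Locate the strongest spectral component (excluding DC) of the squared signal.
spectrum = np.fft.rfft(squared - squared.mean())
freqs = np.fft.rfftfreq(squared.size, d=1.0 / f_sim)
k = int(np.argmax(np.abs(spectrum)))
clock_freq = freqs[k]

# Crude stand-in for a narrow bandpass filter: keep only that one FFT bin.
narrow = np.zeros_like(spectrum)
narrow[k] = spectrum[k]
tone = np.fft.irfft(narrow, n=squared.size)

# Threshold comparator: square-wave clock, one rising edge per symbol period.
clock = tone > 0
rising_edges = int(np.count_nonzero(~clock[:-1] & clock[1:]))

print("extracted clock frequency:", clock_freq, "(symbol rate:", Rs, ")")
print("rising edges:", rising_edges, "for", n_sym, "symbols")
```

In this idealised setting the squared stream is exactly periodic at the symbol rate, so the extracted frequency equals $R_s$ regardless of the data pattern. With noise and band-limited pulses the component would be weaker, which is where a PLL becomes useful.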
Figure 12.21 (a) Incoming distorted NRZ waveform; (b) corresponding eye diagram. [Labels: bit stream; threshold level; best decision instant.]
12.4.4 Eye Diagrams

An indication of the likelihood of decision error at the receiver due to the corruption of the incoming symbol stream by undesirable noise and filtering can be readily displayed using eye diagrams. Figure 12.21a shows a corrupted bipolar non-return-to-zero (NRZ) symbol stream, with adjacent symbol elements identified by different line patterns. If all the incoming symbol elements are superimposed in one symbol interval, the result is the plot shown in Figure 12.21b, which is called an eye diagram because it resembles the human eye. The eye diagram of an actual transmission can easily be displayed on an oscilloscope. The symbols in successive intervals will be automatically superimposed on the screen when the oscilloscope is triggered using the receiver's clock signal. Useful performance information provided by the eye diagram includes:

● The width of the eye opening gives the timing error that can be tolerated in the sampling instants at the receiver. The best sampling instant is at the centre of the eye opening.
● The slope of the opening gives an indication of the sensitivity of the transmission system to timing error.
● The height of the eye opening gives the noise margin of the system.
It is therefore obvious that the larger the eye opening the lower will be the symbol error ratio of the system. Figure 12.22 demonstrates the impact of noise and timing error on the eye diagram of a binary system that uses raised-cosine-filtered pulses. A narrowing of the eye opening by these effects clearly indicates an increased probability of error. The eye diagram is indeed a very useful diagnostic tool for checking for the presence of timing error, noise, and pulse distortion in a digital transmission system.
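The superposition that produces an eye diagram, and the measurement of its opening, can be sketched as follows. This is a minimal illustration under assumed conditions: a bipolar NRZ stream is smeared by a simple moving-average filter (standing in for a band-limited channel) and perturbed by a small amount of Gaussian noise. The filter length, noise level, and names are all hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(1)
sps = 16                     # samples per symbol (assumed)
n_sym = 200                  # number of symbols (assumed)

# Bipolar NRZ stream, smeared by a moving-average "channel" and light noise.
symbols = rng.choice([-1.0, 1.0], size=n_sym)
tx = np.repeat(symbols, sps)
channel = np.ones(sps // 2) / (sps // 2)
rx = np.convolve(tx, channel)[: tx.size]
rx = rx + 0.05 * rng.standard_normal(rx.size)

# Superimpose all symbol intervals: one row per symbol gives the eye traces.
# The first symbol is dropped because it has no predecessor.
traces = rx.reshape(n_sym, sps)[1:]
sent = symbols[1:]

# Vertical opening at each sampling offset: lowest '+1' trace minus
# highest '-1' trace, clipped at zero where the eye is closed.
upper = traces[sent > 0].min(axis=0)
lower = traces[sent < 0].max(axis=0)
opening = np.clip(upper - lower, 0.0, None)

best_offset = int(np.argmax(opening))                # best sampling instant
eye_height = float(opening[best_offset])             # noise margin indicator
eye_width = np.count_nonzero(opening > 0) / sps      # fraction of Ts that is open

print("best sampling offset (samples):", best_offset)
print("eye height:", round(eye_height, 3), " eye width (fraction of Ts):", eye_width)
```

Plotting each row of `traces` on the same axes would display the eye itself. Increasing the noise level or the filter length in this sketch shrinks `eye_height` and `eye_width`, mirroring the narrowing of the eye described in the text.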
Figure 12.22 Effects of noise and timing error on eye diagram. [Panels: perfect; timing error only; noise only; noise and timing error.]

12.5 Summary

The design of a reliable communication system throws up many challenges which may be skilfully navigated through a sound understanding of the interplay amongst key design parameters and the trade-offs involved, as well as a good grounding in the tools and techniques needed to optimally exploit such trade-offs. One of the most
significant equations in information theory is the Shannon–Hartley law, which lays down the rule governing how bandwidth and signal power may be exchanged in the design of a reliable transmission system affected by noise. We presented a simple intuitive argument leading to the law and then paid a great deal of attention to evaluating its implications. All practical systems fall short of the combined bandwidth and power efficiencies stipulated by this law. We demonstrated through a worked example that the use of error control coding facilitates a more efficient exchange between bandwidth and signal power than is possible by only switching modulation schemes and therefore enables a closer approach to the benchmark laid down by the law. We examined the techniques available for dealing with the challenges of inter-symbol interference (ISI) caused by the finiteness of the transmission system’s bandwidth and the challenges of noise arising from the transmission medium and the receiver’s front end. Various ISI-mitigating filtering techniques were discussed, and their merits and drawbacks compared in terms of occupied bandwidth, sensitivity to timing error, signal power requirement, and complexity or realisability in real-time. Of these, the most widely used is the raised cosine filter which permits an excellent trade-off between occupied bandwidth and complexity through its roll-off factor parameter. Using a heuristic and nonmathematical approach, we derived a specification for the transfer function of a matched filter which provides the best detection (in terms of maximising the output signal-to-noise ratio) of known pulses in the presence of additive white Gaussian noise (AWGN). It turns out that the impulse response of this filter is simply a time-reversed version of the pulse delayed by the pulse duration. 
We showed that, quite remarkably, both requirements of mitigating ISI and optimising symbol detection in the presence of noise can be met by using a pair of square root raised cosine (RRC) filters, one at the transmitter and the other at the receiver. In addition to noise and ISI, the communication channel will introduce channel distortion and the receiver may experience small timing errors in the clock it uses to set precise decision instants for pulse detection. We briefly discussed the techniques of equalisation and clock extraction and presented the eye diagram as a useful diagnostic tool for investigating the effects of ISI, timing error, and noise on received pulses.
In the next chapter, we undertake a step-by-step and comprehensive study of multiplexing strategies for multi-user communication systems, which will include information on various international telecommunication standards.
Questions

1 A wideband audio signal of baseband frequencies 50 Hz to 7 kHz is processed in a 10-bit linear analogue-to-digital conversion (ADC) at a sampling rate of 1.5 times the Nyquist rate. The resulting bit stream is conveyed in a noiseless channel using: (a) Binary (M = 2) signalling; and (b) Quaternary (M = 4) signalling. In each case, determine (i) the minimum required transmission bandwidth and (ii) the transmission bandwidth when the channel has a raised cosine response with α = 0.5.

2 Determine the minimum SNR and hence signal power required for error-free transmission of the signal in Question 12.1 over an AWGN channel of noise power per unit bandwidth $10^{-15}$ W/Hz and bandwidth (a) 10 kHz (b) 20 kHz (c) 200 kHz. Comment on the trend of your results.

3 Sketch the impulse response of a matched filter for detecting each of the pulses shown in Figure Q12.3.

Figure Q12.3 Question 12.3. [Three pulses: (a) g1(t), (b) g2(t), and (c) g3(t); amplitude in V against t in μs.]

4 Determine and sketch the matched filter output for each of the pulses in Question 12.3. What is the maximum value of each output pulse?

5 Repeat Questions 12.3 and 12.4 for a triangular pulse of amplitude A and duration $T_s$.

6 Assuming zero modem implementation loss, a Nyquist channel (i.e. raised cosine filter of roll-off factor α = 0) and reliable communication when BER = $1 \times 10^{-8}$, determine the limit placed by the Shannon–Hartley law on the maximum possible error control coding gain when the modulation scheme is QPSK and the code rate is (a) r = 9/10 (b) r = 1/2 (c) r = 1/8. (Note: you may find the graphs of BER versus $E_b/N_o$ in Chapter 11 useful.)

7 (a) Repeat Question 12.6 for a 256-FSK modulation scheme. (b) Comment on the trends of your results in Questions 12.6 and 12.7(a).

8 Making the same assumptions as in Question 12.6, determine the limit placed by the Shannon–Hartley law on the maximum BER in the bit stream from a demodulator output to the input of a realisable error control decoder if reliable communication is to be achieved for the pairs of modulation scheme and code rate listed below. Comment on the trend of your results. (a) 16-APSK, r = 9/10 (b) 16-APSK, r = 1/8 (c) 256-APSK, r = 9/10 (d) 256-APSK, r = 1/8.

9 A digital transmission system operates at a message bit rate $R_b$ = 139 264 kb/s in AWGN using a raised cosine filter of roll-off factor α = 0.25. Reliable communication is set at BER = $10^{-6}$. Determine how each of the following implementations of this system compares with the Shannon–Hartley benchmark of Eq. (12.28) and comment on your results. Assume that all modems have a modem implementation loss $L_{mil}$ of 1 dB and determine BER versus $E_b/N_o$ values using Figures 11.39, 11.48, and 11.52 as appropriate. (a) Binary ASK modulation and no error control coding. (b) 64-APSK modulation and no error control coding. (c) 1024-FSK modulation and no error control coding. (d) 64-APSK modulation with error control coding using a codec of code rate 4/5 that can correct on average seven bit errors in every 100 bits.

10 You are given that a 1024-APSK modem has a BER of 0.1 at $E_b/N_o$ = 11.6 dB. In view of the Shannon–Hartley information capacity law, assess the possibility of realising a reliable transmission system that uses a 1024-APSK modem with modem implementation loss 1 dB in conjunction with a codec of code rate 9/10 that can correct on average one bit error in every 10 bits.
11 The pulse g(t) in Eq. (12.47) is a sinusoidal signal that is constrained to duration $T_b$ through multiplication by a rectangular window function. Using a raised cosine window function (instead of a rectangular window function) produces the sinusoidal pulse

$$g_{rc}(t) = A_c \cos\left(2\pi \frac{n}{T_s}\,t\right) \left[\frac{1}{2} + \frac{1}{2}\cos\left(\frac{2\pi}{T_s}\,t\right)\right] \mathrm{rect}\left(\frac{t}{T_s}\right)$$

which is a pulse that completes an even number n of cycles within its duration $T_s$ in the range −$T_s$/2 to $T_s$/2. (a) Derive an expression for the output $y_{rc}(t)$ of a matched filter that receives $g_{rc}(t)$. (b) Make a graphical sketch of $y_{rc}(t)$ and discuss how it compares with the case of a rectangular-windowed pulse examined in Worked Example 12.6.
13 Multiplexing Strategies

No, an enemy did not build our prisons and set our limits in life. We did all that ourselves.

In this Chapter
✓ A nonmathematical introduction of four classes of techniques for simultaneously accommodating multiple users in a communication system.
✓ Frequency division multiplexing (FDM): you will see that FDM is indispensable to radio communication services and learn various standardised hierarchical implementations of FDM telephony.
✓ Time division multiplexing (TDM): a step-by-step and detailed discussion of plesiochronous and synchronous digital hierarchies and an introduction to ATM (asynchronous transfer mode).
✓ Code division multiplexing (CDM): a discussion of spread spectrum techniques, including a detailed step-by-step graphical description of signal processing in CDM. You will learn the simplicity of this free-for-all sharing strategy.
✓ Space division multiplexing (SDM): this indispensable strategy for global and cellular mobile telecoms is discussed in the introductory section.
✓ Multiple access: a brief treatment of frequency division multiple access (FDMA), time division multiple access (TDMA), and code division multiple access (CDMA) in the final section.
13.1 Introduction

The discussion in previous chapters concentrates mainly on the processing of a telecommunication signal emanating from a single source. There are several reasons why a communication system must be able to simultaneously handle signals from multiple and independent sources without mutual interference.

● To satisfy the communication needs of a larger number of people. Modern lifestyle has become very dependent on telecommunication so that at any given time in an average city there will be a large number of people needing to make a phone call, send a text message, access the Internet, hold a teleconference, etc. If the communication system could only handle one signal at a time, and each user occupied the system continuously for an average duration of three minutes then only 480 users per day could be serviced, assuming inconvenient times (such as 2.00 a.m.) are not rejected. If such a communication system served a city of one million people then at this rate it would take nearly six years for every person to have just one three-minute access. Clearly, you couldn't rely on such a system to call an ambulance in a health emergency. By the time it reached your turn on the service queue, you would either have fully recovered or been dead and buried.
Communication Engineering Principles, Second Edition. Ifiok Otung. © 2021 John Wiley & Sons Ltd. Published 2021 by John Wiley & Sons Ltd. Companion Website: www.wiley.com/go/otung
● To reduce the cost of the service to each user. This important consideration can be demonstrated by assuming a satellite communication system built exclusively for telephony at a total cost of £300 m, which includes design, construction, launching, and maintenance costs over a projected satellite lifetime of 10 years. Allowing a 16% profit margin, the operator must earn (by charging users of the service) a total sum of £348 m during a period of 10 years, or about 5.26 million minutes. Excluding system idle time of eight hours per day (you would not normally like to make or receive a phone call during sleeping hours) leaves us with about 3.5 million income-yielding minutes over which to recover £348 m. It is easy to see that if the system could only handle one call at a time then the charge for each call would have to be about £100 per minute. However, if we could somehow design the system to handle up to 24 000 simultaneous calls then, assuming on average 20 000 users every minute, the operator's required earning could be spread out over this number of users, bringing down the charge per user to a mere half a penny per minute.
● To allow the coexistence of a multiplicity of telecommunication services in each geographical area or city. Audio broadcast, television broadcast, and mobile communication, to name but a few radio services, must operate simultaneously and independently without mutual interference.
● To improve the exploitation of the available bandwidth of a transmission medium. For example, if a coaxial cable of bandwidth 10 MHz is used to carry one voice signal (of bandwidth 4 kHz), only 0.04% of the cable capacity is being utilised. As the communication distance and hence link cost increases it becomes more and more important to dramatically increase the utilisation of the cable capacity by somehow packing many voice signals onto the cable medium.
● To allow the use of identical radio systems for the provision of localised broadcast and communication services in different geographical regions. For example, frequency modulation (FM) radio broadcast can be provided in two different cities using the same carrier frequency of, say, 98.5 MHz.
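The arithmetic behind the first, second, and fourth reasons above is easy to reproduce. The short sketch below uses the figures quoted in the text; the only added assumption is a 365-day year, so the results agree with the rounded values in the text only approximately (for example, roughly £100 per minute for a single-call system).

```python
MINUTES_PER_DAY = 24 * 60

# Single-call system serving a city of one million people.
call_minutes = 3
population = 1_000_000
users_per_day = MINUTES_PER_DAY // call_minutes
years_for_everyone = population / users_per_day / 365   # assumes 365-day years

# Satellite system cost recovery over a 10-year lifetime.
total_cost = 300e6                      # pounds
profit_margin = 0.16
required_income = total_cost * (1 + profit_margin)
lifetime_minutes = 10 * 365 * MINUTES_PER_DAY
earning_minutes = lifetime_minutes * 16 / 24            # 8 idle hours per day
charge_single_call = required_income / earning_minutes  # pounds per minute
average_calls = 20_000
charge_shared_pence = 100 * charge_single_call / average_calls

# Utilisation of a 10 MHz cable carrying one 4 kHz voice signal.
utilisation_percent = 100 * 4e3 / 10e6

print("users per day:", users_per_day)
print("years for everyone to get one call: %.1f" % years_for_everyone)
print("single-call charge: %.1f pounds per minute" % charge_single_call)
print("shared charge: %.2f pence per minute" % charge_shared_pence)
print("cable utilisation: %.2f %%" % utilisation_percent)
```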
To realise the above benefits, there are four multiplexing strategies that may be used separately, but frequently in combination, to simultaneously accommodate multiple users and services in a common transmission medium. Figure 13.1 provides an illustration of these resource-sharing techniques for N users. Three axes are used, namely frequency, which represents the available bandwidth of the transmission medium; time, which represents the instants of usage of the medium; and space, which represents the physical location of the medium.

● In time division multiplexing (TDM) the duration of usage of the transmission medium is divided into time slots each of which is allocated to a single user. Thus, each of the N signals has exclusive use of the entire transmission medium during the time slot allocated to it. TDM is briefly introduced in Sections 1.3.1.2 and 1.5.3.2, which you may wish to review at this point. A useful analogy to TDM is the sharing of the use of a lecture room by four different groups of students, each needing the room for a total period of one hour. We may draw up a schedule dividing the room usage into one-hour time slots so that each group occupies the entire room once in turn; or we may allocate 20-minute time slots so that it takes three slots for each group to complete their business, etc. There is, however, an important difference between this analogy and the way TDM is implemented in communication systems. In the analogy there is a noticeable sense of queuing and waiting for one's turn, whereas in real TDM the time slots are extremely short and are used to convey samples of each signal taken at regular and sufficiently short intervals. Thus, users of the TDM system are totally oblivious of the time-sharing roster and the receiver can reconstruct (without distortion) each of the original signals from their samples. See Chapter 9 if in any doubt about sampling.
● Frequency division multiplexing (FDM) gives each user exclusive use of a separate frequency band (often referred to as a channel) for all time. Ideally then, with an average channel bandwidth $B_c$, and total available bandwidth $B_t$ in the transmission medium, the maximum number of users that can be accommodated is

$$N = B_t / B_c \qquad (13.1)$$
Figure 13.1 Multiplexing strategies: (a) time division multiplexing; (b) space division multiplexing; (c) frequency division multiplexing; (d) code division multiplexing; (e) SDM in cellular telephony. [Panels (a) to (d) use frequency, time, and space axes; panel (e) shows clusters of seven cells using channel sets f1 to f7, with co-channel cells separated by a distance D.]
Space division multiplexing (SDM) allocates the same frequency band or all the available bandwidth to more than one user for all time, but user signals of the same frequency are confined to physically separate regions or zones. In closed transmission media it means that each user has exclusive use of a separate line pair, whereas in open media it requires that the radiated strength of each signal be negligible outside the signal’s geographical region. In our lecture room analogy, we may apply SDM by allowing all four groups simultaneous use of the room, but with each group seated sufficiently far apart at different corners of the room. As long as the students follow a simple SDM rule of speaking softly then all groups can coexist with little mutual disturbance. An important area of application of SDM is in cellular mobile communications where the same frequency bands are reused many times. In this way a limited radio spectrum allocation is very efficiently utilised to meet a huge demand in a given serving area, such as a city. For example, in the North American advanced mobile phone system (AMPS) only 25 MHz in the ultra high frequency (UHF) band was available to one operator in a serving area. Of this, 12.5 MHz was for transmission in the forward direction from base station to mobile, and a further 12.5 MHz for transmission in the reverse direction. With 30 kHz per channel and making provision for control channels it follows that only about 400 users could be accommodated simultaneously in the available bandwidth. This was grossly inadequate to meet the demand for mobile services. The use of SDM dramatically increased capacity, enabling the operator to handle tens of thousands of simultaneous calls. A typical SDM or frequency reuse plan is shown in Figure 13.1e. The serving area is divided into small zones called cells, each of which has one base station for communication with mobile units. 
A group of cells (enclosed in bold lines in the diagram) across which the entire bandwidth allocation is used up is called a cluster. Figure 13.1e shows a cluster size of 7, but it can also be 3, 4, 9, 12, or multiples of these. The available channels are shared amongst the cells in each cluster. We identify the sets of channels as f 1 , f 2 , f 3 , etc. A mobile unit wanting to make a call is assigned an available channel from the set allocated to its cell. Notice how the frequencies are reused in cells separated by a distance D, meaning, for example, that calls can be made at the same time in each of the shaded cells using exactly the same set of frequency bands. Obviously, radiated power in each cell must be limited to minimise co-channel interference, i.e. interference between cells that use the same frequency. The choice of cell diameter and cluster size is influenced by many factors, such as required capacity, acceptable carrier-to-interference ratio, etc. A smaller cell size allows a particular frequency band to be reused more times in the serving area, thus increasing capacity, but handover (the process of a mobile unit’s transmission being changed from one channel to another as the mobile crosses a cell boundary) occurs more frequently. Code division multiplexing (CDM) is a kind of free-for-all sharing strategy in which multiple users transmit in the same frequency band at the same time and in the same physical medium. The secret is that each user is assigned a unique pseudorandom code sequence with which their signal is spread over a wide bandwidth giving it a noise-like appearance. A target receiver equipped with exactly the same code sequence is able to extract the wanted signal from the received mix of signals, and to effectively block out the unwanted signals from other users. 
Returning to our lecture room analogy, the four groups of students may simultaneously share the entire room in CDM fashion, with one group speaking, say, in German, another in Swahili, another in Igbo, and the remaining in Chinese. So long as the students understand only the language of their group then secure and effective communication can take place, with only a slight inconvenience of background noise in each group.
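The AMPS capacity figures quoted in the SDM discussion follow from simple arithmetic, which the sketch below reproduces. The 12.5 MHz per direction, the 30 kHz channel spacing, and the cluster size of 7 come from the text and Figure 13.1e; the 21 control channels, the 700-cell serving area, and the function names are illustrative assumptions only.

```python
# Rough AMPS capacity estimate with and without frequency reuse (SDM).
BAND_HZ = 12_500_000       # per-direction allocation (from the text)
CHANNEL_HZ = 30_000        # channel spacing (from the text)
CONTROL_CHANNELS = 21      # assumption: channels set aside for control
CLUSTER_SIZE = 7           # cluster size shown in Figure 13.1e


def voice_channels(band_hz=BAND_HZ, channel_hz=CHANNEL_HZ,
                   control=CONTROL_CHANNELS):
    """Voice channels available when the whole band is used only once."""
    return band_hz // channel_hz - control


def reuse_capacity(n_cells, cluster_size=CLUSTER_SIZE):
    """Simultaneous calls when the band is reused once per full cluster."""
    full_clusters = n_cells // cluster_size
    return full_clusters * voice_channels()


single_use = voice_channels()          # band used once over the whole area
per_cell = single_use // CLUSTER_SIZE  # channels available in each cell
with_reuse = reuse_capacity(700)       # hypothetical 700-cell serving area

print(single_use, per_cell, with_reuse)
```

Under these assumed numbers the band supports just under 400 calls when used once, but 39 500 calls when reused across 100 clusters, which is in line with the "tens of thousands" quoted in the text.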
It should be noted that these multiplexing strategies are rarely used in isolation. In fact, by taking a broad interpretation of FDM we see that it is inherent in all radio communication systems to allow a multiplicity of services in a given locality. Similarly, to allow the reuse of the same radio band in different regions (or localities in some cases) of the world, SDM is inherent in nearly all radio systems, except, for example, international broadcasting at high frequency (HF). Thus, if TDM is used on a satellite link at, say, 6 GHz then we could describe the system as employing SDM/FDM/TDM. However, we will henceforth define multiplexing more restrictively in terms of how
multiple signals are combined for transmission on a common link. Therefore, the satellite system in this example is regarded simply as a TDM system. In the remaining sections of this chapter we consider FDM, TDM and CDM in detail, and study the implementation of various standard FDM and TDM hierarchies.
13.2 Frequency Division Multiplexing

13.2.1 General Concepts

Frequency division multiplexing stacks a number of narrowband signals in nonoverlapping frequency bands and transmits them simultaneously in a common transmission medium. The required frequency translation of each baseband signal can be achieved using any of the modulation techniques studied in Chapters 7 and 8, such as FM, phase modulation, and amplitude modulation (AM) or its variants, namely double sideband (DSB) and single sideband (SSB). The use of SSB modulation minimises bandwidth and power requirements and hence the cost of high-capacity long-distance transmission systems. However, for low-capacity short-haul systems (e.g. in rural telephony) the use of AM with its simple modulation and demodulation circuits may yield a more cost-effective FDM implementation. The following discussion is confined to the use of SSB, a much more prevalent choice for which international standards exist in FDM telephony. It is a straightforward matter to extend the principle to other modulation techniques. A detailed discussion of SSB modulation and demodulation is presented in Section 7.7.2. For convenience, Figure 13.2 shows a block diagram of an SSB modulator. Note that it contains pre- and post-modulation filters of the indicated passbands. The roles of these filters will become clear in the following discussion.

Figure 13.3a shows an FDM multiplexer that combines N independent input signals v_1(t), v_2(t), …, v_N(t) to give a composite (FDM) signal v_fdm(t). Each input signal occupies the baseband spectrum with frequencies in the range f_a to f_b. Clearly, combining these signals in their baseband form would result in an unacceptable state of complete mutual interference. To avoid this, each signal v_i(t), i = 1, 2, …, N, is first passed through a lower-sideband SSB modulator supplied with a unique carrier f_ci.
The resulting modulation converts v_i(t) to the signal v_ssbi(t), which occupies an exclusive frequency band or channel f_ci − f_b → f_ci − f_a. This frequency translation process is illustrated in Figure 13.3b for i = 1, 2. A symbolic baseband spectral shape (a triangle by convention) is used for the illustration. Note that the spectrum of v_ssbi(t) is inverted relative to that of v_i(t) because it is the lower sideband that is taken by the post-modulation filter from the product of v_i(t) and the carrier. An erect spectrum of v_ssbi(t) in the frequency band f_ci + f_a → f_ci + f_b would be obtained if the upper sideband (USB) were taken. Following this modulation, the SSB signals are added together in a summing device to yield the desired FDM signal

\[ v_{\mathrm{fdm}}(t) = v_{\mathrm{ssb}1}(t) + v_{\mathrm{ssb}2}(t) + \cdots + v_{\mathrm{ssb}N}(t) \tag{13.2} \]

Figure 13.2 (Lower sideband) SSB modulator: the message signal v(t) passes through a pre-modulation filter (f_a → f_b), is multiplied by the carrier (f_c), and then passes through a post-modulation filter (f_c − f_b → f_c − f_a) to give the SSB signal v_ssb(t).
Figure 13.3 FDM: (a) multiplexer; (b) frequency translation effect of SSB modulation; (c) spectrum of FDM signal.
The remarkable outcome of this multiplexing process is that v_fdm(t) is a composite signal containing N independent signals each of which can be extracted at the receiver without mutual interference. Figure 13.3c shows the spectrum of v_fdm(t), denoted V_fdm(f). An arrangement for de-multiplexing v_fdm(t) is shown in Figure 13.4. The FDM signal is connected to a bank of N bandpass filters. Clearly, the filter with passband f_ci − f_b → f_ci − f_a passes the ith component v_ssbi(t) in Eq. (13.2) and blocks all others. The signal v_ssbi(t) so extracted is demodulated in an SSB demodulator which is supplied with a carrier of frequency f_ci. This yields the original signal v_i(t). In this way, all the multiplexed signals are successfully recovered.

Figure 13.4 FDM demultiplexer: a bank of N bandpass filters (f_ci − f_b → f_ci − f_a), each followed by an SSB demodulator (f_ci), recovers v_1(t), v_2(t), …, v_N(t) from v_fdm(t).

There are a number of conditions that must be satisfied for the above implementation of FDM to be free of interference in any of the channels.

● Each of the N input signals to the multiplexer must be bandlimited, with frequency components in the range f_a → f_b, where

\[ 0 < f_a < f_b < \infty \tag{13.3} \]

If this condition is not satisfied and f_b is infinite then the signals cannot be confined within exclusive bands. On the other hand, if f_a = 0 then SSB cannot be used for frequency translation since it becomes impossible to separate the sidebands using a realisable filter. To satisfy the condition of Eq. (13.3), a pre-modulation filter is employed to remove all nonessential frequency components below f_a and above f_b in each input signal. In speech telephony, for example, f_a = 300 Hz and f_b = 3400 Hz. Video signals contain essential components down to DC. This means that f_a = 0. It is for this reason that SSB cannot be used for obtaining the FDM of television signals.

● The carrier frequencies f_c1, f_c2, …, f_cN used by the bank of SSB modulators in the multiplexer must be sufficiently spaced to allow a frequency gap, called a guard band (GB), between adjacent spectra of the SSB signals that constitute the FDM signal. Without such a gap, a nonrealisable brickwall bandpass filter would be required at the receiver to extract each of the SSB signals. In Figure 13.3c a guard band G is shown. Thus, the bandwidth of each signal, and hence the spacing of the carrier frequencies, is given by

\[ B = f_b - f_a + G \tag{13.4} \]

With the bank of carrier frequencies starting at f_c1 for the lowest channel, it follows that the value of the ith carrier is

\[ f_{ci} = f_{c1} + B(i - 1) \tag{13.5} \]

and the bandwidth of the composite FDM signal is

\[ B_{\mathrm{fdm}} = NB = N(f_b - f_a + G) \tag{13.6} \]

A GB of 900 Hz is used in speech telephony, with f_b and f_a as given above, so that the channel bandwidth B = 4 kHz.

● Amplifiers in the transmission system must be operated in the linear region of their transfer (i.e. input to output) characteristic. Any nonlinearity causes harmonic and intermodulation products to be generated in one channel that may fall in the frequency interval of some other channel, giving rise to noise. See Section 4.7.6 for a discussion of this effect.

● The post-modulation filter must suppress the USB; otherwise, any remnants will interfere with the next higher channel, causing unintelligible crosstalk, since the interfering USB is inverted relative to the wanted lower sideband (LSB) of the next channel. In practice, perfect elimination of the USB is not possible, and it is sufficient for the post-modulation filter to reduce the USB by 60 dB or more relative to the LSB.
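Equations (13.4) to (13.6) can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, all frequencies are in Hz, and the 12-channel plan starting at a 64 kHz carrier is simply a convenient choice of values for the speech telephony parameters quoted above.

```python
def fdm_plan(n_channels, f_a, f_b, guard, f_c1):
    """Return (spacing B, list of carriers, composite bandwidth) in Hz."""
    spacing = f_b - f_a + guard                        # Eq. (13.4)
    carriers = [f_c1 + spacing * (i - 1)               # Eq. (13.5)
                for i in range(1, n_channels + 1)]
    total_bw = n_channels * spacing                    # Eq. (13.6)
    return spacing, carriers, total_bw


def lsb_band(f_c, f_a, f_b):
    """Band occupied by the lower sideband: f_c - f_b -> f_c - f_a."""
    return (f_c - f_b, f_c - f_a)


# Speech telephony values from the text: 300-3400 Hz voice band, 900 Hz GB.
spacing, carriers, total_bw = fdm_plan(
    n_channels=12, f_a=300, f_b=3_400, guard=900, f_c1=64_000)
first_band = lsb_band(carriers[0], 300, 3_400)

print(spacing, carriers[0], carriers[-1], total_bw, first_band)
```

For these inputs the spacing is 4 kHz, the carriers run from 64 kHz to 108 kHz, the composite bandwidth is 48 kHz, and the first voice signal occupies 60.6 → 63.7 kHz within its 4 kHz slot.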
13.2.2 Demerits of Flat-level FDM

We have so far presented a flat-level FDM in which N signals are frequency translated in one step using N carriers uniformly spaced in frequency from f_c1 to f_c1 + B(N − 1). This approach has several serious drawbacks when used in the implementation of high-capacity systems where N may be very large, e.g. up to 10 800 for FDM telephony. First, consider the design of the post-modulation filter, which is required to suppress one of the sidebands at the output of the product modulator. Figure 13.5 shows a piecewise linear approximation of the response of this filter (for the ith channel, i = 1, 2, …, N), indicating the passband for the LSB, the transition width, and the stopband to block the USB. A standard measure of the selectivity of a filter is its quality factor Q, which is defined by

\[ Q = \frac{\text{Filter centre frequency}}{\text{Filter bandwidth}} = \frac{f_r}{f_b - f_a} = \frac{f_{ci} - \tfrac{1}{2}(f_a + f_b)}{f_b - f_a} = \frac{f_{c1} + (i - 1)B - \tfrac{1}{2}(f_a + f_b)}{f_b - f_a} \tag{13.7} \]

A related measure of filter performance is the required steepness in the roll-off of the filter’s response in order to block the unwanted sideband while passing the wanted sideband. This measure is best given by the ratio between the centre frequency of the transition band and the transition width, which we will call the slope factor ℤ

\[ \mathbb{Z} = \frac{\text{Transition centre frequency}}{\text{Transition width}} = \frac{f_{ci}}{2f_a} = \frac{f_{c1} + (i - 1)B}{2f_a} \tag{13.8} \]
Note that a brickwall filter (an unrealisable ideal device) has zero transition width and hence ℤ = ∞. Eqs. (13.7) and (13.8) show that both Q and ℤ increase with i, the channel number. For example, consider the post-modulation filters required for the 10th and 10 000th channels of a high-capacity FDM telephony system, with f_c1 = 64 kHz, f_a = 0.3 kHz, f_b = 3.4 kHz, and B = 4 kHz. Substituting these values in the above equations yields

Q = 32, ℤ = 167 for i = 10
Q = 12 922, ℤ = 66 767 for i = 10 000

Figure 13.5 Post-modulation filter response: a piecewise linear approximation, showing the LSB passband of width f_b − f_a centred at f_r, the transition bands of width 2f_a, and the stopband that blocks the USB.
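The quoted Q and ℤ values can be reproduced directly from Eqs. (13.7) and (13.8). The following is a minimal sketch with hypothetical function names, working in Hz with the parameter values given in the text.

```python
def carrier(i, f_c1, spacing):
    """Carrier frequency of the ith channel, Eq. (13.5)."""
    return f_c1 + (i - 1) * spacing


def quality_factor(i, f_c1, spacing, f_a, f_b):
    """Q of the ith post-modulation filter, Eq. (13.7)."""
    f_r = carrier(i, f_c1, spacing) - 0.5 * (f_a + f_b)  # LSB passband centre
    return f_r / (f_b - f_a)


def slope_factor(i, f_c1, spacing, f_a):
    """Slope factor of the ith post-modulation filter, Eq. (13.8)."""
    return carrier(i, f_c1, spacing) / (2 * f_a)


# Values from the text, expressed in Hz.
PARAMS = dict(f_c1=64_000, spacing=4_000, f_a=300, f_b=3_400)

for i in (10, 10_000):
    q = quality_factor(i, **PARAMS)
    z = slope_factor(i, PARAMS["f_c1"], PARAMS["spacing"], PARAMS["f_a"])
    print(f"i = {i}: Q = {q:.0f}, slope factor = {z:.0f}")
```

Rounded to the nearest integer, the results agree with the figures above: Q = 32 and ℤ = 167 for i = 10, and Q = 12 922 and ℤ = 66 767 for i = 10 000.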
Note that the channel i = 10 000 requires a filter with very high values of quality and slope factors, which is both expensive and difficult to achieve. So we see that in a flat-level FDM it is difficult to realise post-modulation filters for the higher-frequency channels. The same argument holds for the bandpass filters required in the demultiplexer at the receiver. The other problems posed by flat-level FDM implementation include the following.

● Provision would have to be made for generating N different carrier frequencies at the transmitter, and the same number at the receiver. Considering that the carrier frequencies are required to be highly stable (to guarantee distortionless transmission of especially nonvoice signals), it is obvious that such a system would be very complex for large N.

● The required Q and ℤ factors of the filter in each channel depend on the channel number, according to Eqs. (13.7) and (13.8). So, no two filters are identical, leading to N different filter designs. Building and maintaining a system with, say, 10 800 unique filters is, to say the least, very cumbersome.

● The structure of each FDM system depends on the number of channels. Standardisation is therefore lacking. Standardisation makes it easier and cheaper to set up systems of various capacities by using a small set of standardised equipment obtainable from various manufacturers.

● The summing device at the multiplexer is fed by N different sources, whereas at the receiver the incoming FDM signal is connected to N different bandpass circuits. As N increases, the problem of loading becomes significant, necessitating, for example, a much higher signal level to drive the bandpass filters.

● We require N different pairs of wires to carry the signals to the single multiplexing point. This can be very expensive (and unsightly if carried on overhead poles) for providing telephone services to, say, N different homes. Preferably, we would like to perform the multiplexing in stages and use a single wire pair to carry signals from a small cluster of homes.
To overcome the above problems, FDM is implemented in a hierarchy. Hierarchical arrangements were standardised for FDM telephony, which we discuss in detail in Section 13.2.4. The way nonvoice signals were accommodated within these FDM telephony hierarchy plans is also briefly addressed in that section. However, we must first make an important clarification about the future of FDM technology.
13.2.3 Future of FDM Technology

We should point out straightaway that the hierarchical FDM telephony discussed in the next section is an analogue technology that has no place in our twenty-first-century digital telecommunication network. The future of FDM in general is, however, assured. For as long as radio communication endures (and this is guaranteed), there will always be the need to allocate different frequency bands to different services and users (as in two-way radio systems and mobile cellular telephony). These are all instances of FDM. Also, wavelength division multiplexing (WDM) is an application of FDM in optical fibre communication that has a very promising future. FDM will also continue to be an important multiple access technique in satellite communications, for example. Here, a satellite transponder is partitioned into frequency bands, which are assigned to users who (almost certainly) use digital transmission techniques within their allotted band.

The deployment of (analogue) FDM telephony reached its peak in the mid-1970s. Since then, developments in digital transmission techniques with all their advantages led to a rapid digitalisation of the telephone network and a replacement of FDM telephony with TDM. The telephone network in most countries is now practically 100% digital. There was a period of transition from analogue to digital transmission, which lasted for a few years because of the huge investment that had been made in analogue transmission technology.
The International Telecommunication Union (ITU) specified some transition equipment, namely transmultiplexers (TMUXs), FDM codecs, and transition modems, that allowed the interconnection of digital and analogue systems during this transition period. The TMUX transformed an FDM telephony signal to TDM telephony in one direction of transmission and performed the opposite conversion in the other direction. For example, a 60-channel TMUX transformed a supergroup (SG) signal (discussed below) to two 2048 kb/s TDM signals, and vice versa, whereas a 24-channel TMUX converted between two group signals (see below) and one 1544 kb/s TDM signal. An FDM codec was used for digitising an FDM signal before transmission over a digital link. At the other end of the link, the codec converted the incoming bit stream back into the original FDM signal. The ITU recommended two types of transition modems for high-speed data transfer over an analogue link, namely the data-in-voice (DIV) modem and the data-over-voice (DOV) modem. A suitable carrier was modulated by the digital signal in both modems. The DIV modem displaced several FDM channel assemblies, whereas the DOV modem placed the signal above the frequency band occupied by the voice signals and so did not replace them. Therefore, although the future of FDM technology in general is assured, the material on FDM telephony hierarchy presented in the next section is a (now obsolete) twentieth-century analogue technology and may be skipped in a tight curriculum.
13.2.4 FDM Hierarchies

There were three different hierarchical implementations of FDM telephony in the world, namely the UK System, the European System, and the Bell System (used in North America). These schemes are identical at the first two levels of the hierarchy and differ only at the higher levels. It is important to bear this in mind as we initially discuss these first two levels, which achieve the multiplexing of 60 voice channels into an FDM signal known as the supergroup (SG) signal. The three FDM standards are realised by multiplexing SG signals in different ways, which we discuss in the relevant sections below.

Figure 13.6 shows how a SG signal is generated using two stages or levels of multiplexing. A notation f_ci,j has been used to identify the carrier frequencies in the SSB modulators. It denotes the carrier frequency used in the ith level of the FDM hierarchy to translate the jth signal combined at that level. In the first multiplexing level, 12 voice signals each of which contains frequency components from f_a = 0.3 kHz to f_b = 3.4 kHz are frequency translated and summed to give an FDM signal known as a group signal. The 12 carriers used are spaced at 4 kHz with frequencies 64, 68, 72, …, 108 kHz. This carrier spacing allows a 900 Hz GB between the translated voice signals and gives a nominal voice channel bandwidth of 4 kHz. Since it is the LSB that is selected, it follows that the first voice signal is translated by the 64 kHz carrier to the band 60 → 64 kHz. Note that this band includes the 900 Hz GB, made up of two gaps of 600 Hz and 300 Hz on either side of the voice spectrum. The second voice signal is translated by the 68 kHz carrier to the band 64 → 68 kHz, and so on, until the 12th voice signal, which is translated by the 108 kHz carrier to the band 104 → 108 kHz. Thus, the group signal, which comprises 12 independent and noninterfering voice signals, lies in the frequency band 60 → 108 kHz, and has a bandwidth of 48 kHz.
The spectrum of a group signal is shown in Figure 13.7a. A standard existed that allowed a 33% increase in system capacity with the same transmission bandwidth by packing 16 voice channels into the 48 kHz group signal. To do this, each voice signal is restricted to the frequencies 0.25 → 3.05 kHz, and the frequency translation is performed in a manner that allows a GB of 0.2 kHz between each of the translated voice bands. The voice channel bandwidth is therefore 3 kHz. The use of a 16-channel group signal (i.e. 3 kHz voice channels) was restricted mostly to submarine cable installations where equipment cost was very high. We will henceforth concentrate on the 4 kHz voice channel, which was prevalent but will consider further details of the 3 kHz channel in Question 13.1. In the second level, five group signals are frequency translated and summed to give a SG signal. The carriers used to effect these translations are spaced apart by 48 kHz and have frequencies 420, 468, 516, 564, and 612 kHz. Clearly, the first group signal (of frequencies f a = 60 kHz to f b = 108 kHz) is translated by the carrier of frequency
f_c2,1 = 420 kHz to the band f_c2,1 − f_b → f_c2,1 − f_a, which is 312 → 360 kHz. The other group signals are similarly translated, and the fifth group signal is translated to the band 504 → 552 kHz. Thus, the SG signal occupies the frequency band 312 → 552 kHz, has a bandwidth of 240 kHz, and contains 60 voice channels. The spectrum of this SG signal is shown in Figure 13.7b. Note particularly that the spectrum is erect, having undergone double inversion in the two multiplexing stages.

Figure 13.6 First two levels of FDM hierarchy: generation of supergroup (SG) signal. FDM Level 1 (Group): 12 voice signals (0.3 → 3.4 kHz) feed SSB modulators with carriers f_c1,1 = 64 kHz to f_c1,12 = 108 kHz in 4 kHz steps. FDM Level 2 (Supergroup): 5 group signals (60 → 108 kHz) feed SSB modulators with carriers f_c2,1 = 420 kHz to f_c2,5 = 612 kHz in 48 kHz steps, giving the supergroup signal (312 → 552 kHz).
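The band edges quoted for the group and supergroup signals can be traced with a few lines of code. This is a sketch under simple assumptions: each voice signal is represented by its nominal 0 → 4 kHz slot (guard band included), every translation keeps the lower sideband, and the helper names are hypothetical. All values are in kHz.

```python
def translate_lsb(band, carrier):
    """Band occupied after LSB translation: (carrier - high, carrier - low)."""
    low, high = band
    return (carrier - high, carrier - low)


def overall(bands):
    """Smallest band that contains all the given bands."""
    return (min(b[0] for b in bands), max(b[1] for b in bands))


VOICE_SLOT = (0, 4)                        # nominal voice channel, kHz

# Level 1: twelve carriers 64, 68, ..., 108 kHz give the group signal.
group_slots = [translate_lsb(VOICE_SLOT, fc) for fc in range(64, 109, 4)]
group = overall(group_slots)

# Level 2: five carriers 420, 468, ..., 612 kHz give the supergroup signal.
sg_slots = [translate_lsb(group, fc) for fc in range(420, 613, 48)]
supergroup = overall(sg_slots)

print(group, sg_slots[0], sg_slots[-1], supergroup)
print("voice channels:", len(group_slots) * len(sg_slots))
```

The output confirms the group band 60 → 108 kHz, the first and fifth translated groups at 312 → 360 kHz and 504 → 552 kHz, and a 60-channel supergroup in 312 → 552 kHz.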
Figure 13.7 (a) Spectrum of group signal: twelve 4 kHz voice channels in the band 60 → 108 kHz (48 kHz bandwidth); (b) spectrum of SG signal: five 48 kHz group channels in the band 312 → 552 kHz (240 kHz bandwidth).
The advantages of this two-level hierarchical multiplexing are immediately obvious. The most stringent filter performance required is for the 12th voice channel at the first multiplexing level and has a quality factor of 34. Using flat-level FDM to combine 60 voice signals would require a filter with Q = 96 for the 60th channel. Secondly, standardised equipment can be used for the hierarchical implementation. A group signal is generated using a channel translating equipment (CTE) shown in Figure 13.8a, and a SG signal by a group translating equipment (GTE) in Figure 13.8b. This means that a 60-channel FDM system can be set up very quickly using only five CTEs and one GTE connected, as shown in Figure 13.8c.

To build systems of higher capacity, we must go to higher levels in the FDM hierarchy, and this is where the adopted standards differed. The ITU recommended two procedures. The European system corresponded to ITU Procedure 1, the UK system to ITU Procedure 2, whereas the Bell system used in North America did not conform to either of the two recommendations.

13.2.4.1 UK System
Figure 13.9 shows a self-explanatory block diagram of the supergroup translating equipment (STE) in the UK system. Fifteen SG signals v_sg1(t), v_sg2(t), …, v_sg15(t) are multiplexed to give one hypergroup (HG) signal v_hg(t). Clearly, v_hg(t) contains 60 × 15 = 900 independent and noninterfering voice channels. Examining Figure 13.9, we can make the following observations on the operation of the STE in this system.

● The first SG signal v_sg1(t) is connected directly to the summing point without any frequency translation. It will therefore have an erect spectrum in the range 312 → 552 kHz within the spectrum of the HG signal v_hg(t). The remaining SGs, namely v_sgi(t), i = 2, 3, …, 15, all have inverted spectra at the output since they are frequency translated using a carrier of frequency f_c3,i. Thus, v_sgi(t) occupies the frequency range f_c3,i − 552 → f_c3,i − 312 kHz within the spectrum of v_hg(t).

● The carriers used for the frequency translation of v_sg2(t), v_sg3(t), …, v_sg15(t) have frequencies spaced apart by 248 kHz and starting from 1116 kHz to 4340 kHz. Since the SGs have a bandwidth of 240 kHz, it follows that the spectrum of the composite HG signal includes a GB of 8 kHz between each of the component spectra, except between the first and second component spectra, which are separated by 12 kHz. You should be able to see that v_sg2(t) is translated to the band 564 → 804 kHz, whereas v_sg1(t) of frequency range 312 → 552 kHz is directly added, hence the separation of 12 kHz (i.e. 564 − 552) between the two bands.

● Following the above observations, the spectrum of the HG signal can be easily sketched, as was done in Figure 13.7a for a group signal. This is left as an exercise in Question 13.2, which you may wish to tackle at this point. Note that the last SG signal v_sg15(t) is translated using a 4340 kHz carrier from its baseband at 312 → 552 kHz to the band 3788 → 4028 kHz. Thus, reckoning from the location of v_sg1(t), we see that the HG signal occupies the band 312 → 4028 kHz. It therefore carries 900 voice signals in a bandwidth of 3716 kHz.

Figure 13.8 (a) CTE: 12 voice signals in, one group signal (60 → 108 kHz) out; (b) GTE: 5 group signals in, one SG signal (312 → 552 kHz) out; (c) 60-channel FDM assembled from five CTEs and one GTE.
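The hypergroup spectrum described in these observations can be tabulated as follows. This is a sketch with hypothetical variable names, using the carrier values stated in the text (1116 kHz to 4340 kHz in 248 kHz steps) and treating the first supergroup as directly added. All values are in kHz.

```python
SG = (312, 552)                                # supergroup band, kHz

# Fourteen carriers for SG2..SG15: 1116, 1364, ..., 4340 kHz.
carriers = [1116 + 248 * k for k in range(14)]

# SG1 is added directly; SG2..SG15 are LSB-translated (inverted spectra).
bands = [SG] + [(fc - SG[1], fc - SG[0]) for fc in carriers]

# Guard band between each pair of adjacent component spectra.
gaps = [bands[k + 1][0] - bands[k][1] for k in range(len(bands) - 1)]

hypergroup = (bands[0][0], bands[-1][1])
bandwidth = hypergroup[1] - hypergroup[0]
voice_channels = 60 * len(bands)

for index, (low, high) in enumerate(bands, start=1):
    print(f"SG{index}: {low} -> {high} kHz")
print(hypergroup, bandwidth, voice_channels, gaps)
```

The listing shows SG2 at 564 → 804 kHz and SG15 at 3788 → 4028 kHz, a 12 kHz gap after SG1 and 8 kHz gaps elsewhere, and a 900-channel hypergroup occupying 312 → 4028 kHz (3716 kHz).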
Figure 13.9 UK System: generation of hypergroup (HG) signal. In the SG translating equipment (STE), v_sg1(t) is added directly, whereas v_sg2(t) to v_sg15(t) pass through SSB modulators with carriers f_c3,2 = 1116 kHz to f_c3,15 = 4340 kHz in 248 kHz steps; the summed output is the HG signal v_hg(t) (312 → 4028 kHz).
The HG signal is used in a fourth level of the FDM hierarchy as a building block to assemble more voice channels depending on the required capacity. A few examples are given below.

● Multiplexing two HG signals, as shown in Figure 13.10a, to obtain an 1800-channel FDM signal with frequencies in the range 312 → 8120 kHz, and a bandwidth of 7.808 MHz. This FDM signal is used to frequency-modulate a suitable high-frequency carrier and transmitted by radio.

● Multiplexing three HG signals, as shown in Figure 13.10b, to obtain a 2700-channel FDM signal with frequencies in the range 312 → 12 336 kHz, and a bandwidth of 12.024 MHz. This signal could be conveyed as is on a coaxial cable system or by radio using FM.

● A 3600-channel FDM signal with frequencies in the range 312 → 16 612 kHz and a bandwidth of 16.3 MHz, which is obtained by multiplexing four HG signals. It was suitable for transmission on 18-MHz coaxial cable systems.

● A 10 800-channel FDM signal occupying the frequency band 4404 → 59 580 kHz and resulting from the multiplexing of 12 HG signals. This was recommended for transmission on 60 MHz coaxial cable systems. Figure 13.10c shows in a self-explanatory manner how the signal was assembled. You will have an opportunity in Question 13.3 to determine the most stringent filter performance (in terms of Q and ℤ factors) required in the entire 10 800-channel hierarchical FDM system, and to compare this with the case of flat-level FDM.

Figure 13.10 Examples of UK system high capacity FDM built using HG signals as building blocks: (a) 1800-channel system; (b) 2700-channel system; (c) 10 800-channel system.
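The frequency ranges of the 1800-channel and 2700-channel examples can be checked in the same way. The sketch below assumes, as labelled in Figure 13.10a and b, that HG1 is added directly while HG2 and HG3 are LSB-translated by carriers of 8432 kHz and 12 648 kHz; the helper names are hypothetical and all values are in kHz.

```python
HG = (312, 4028)                       # hypergroup band, kHz


def translate_lsb(band, carrier):
    """Band occupied after LSB translation: (carrier - high, carrier - low)."""
    low, high = band
    return (carrier - high, carrier - low)


def summarise(bands):
    """Return (overall band, bandwidth in kHz, number of voice channels)."""
    low = min(b[0] for b in bands)
    high = max(b[1] for b in bands)
    return (low, high), high - low, 900 * len(bands)


hg2 = translate_lsb(HG, 8432)          # carrier f_c4,2 from Figure 13.10
hg3 = translate_lsb(HG, 12648)         # carrier f_c4,3 from Figure 13.10

system_1800 = summarise([HG, hg2])
system_2700 = summarise([HG, hg2, hg3])

print(hg2, hg3)
print(system_1800)
print(system_2700)
```

The translated hypergroups fall at 4404 → 8120 kHz and 8620 → 12 336 kHz, giving the quoted ranges of 312 → 8120 kHz (7.808 MHz) and 312 → 12 336 kHz (12.024 MHz).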
13.2.4.2 European System
In the European system, a two-stage multiplexing procedure was employed to assemble (starting from SG signals) a 900-channel FDM signal, called supermastergroup (SMG). Figure 13.11 shows a block diagram of an SMG generator. In the first stage, which corresponds to level 3 of the overall FDM hierarchy, five SG signals are translated and combined to give one mastergroup (MG) signal occupying the band 812 → 2044 kHz. The translation uses carriers at frequencies 1364, 1612, 1860, 2108, and 2356 kHz. Thus, an MG signal has bandwidth 1232 kHz, contains 300 voice channels and includes a GB of 8 kHz between its component translated SG signals. In the second stage, which corresponds to level 4 in the overall FDM hierarchy, three MG signals are translated and combined into an SMG signal occupying the band 8516 → 12 388 kHz. The translation uses carriers at frequencies 10 560, 11 880, and 13 200 kHz. Thus, an SMG signal has bandwidth 3872 kHz, contains 900 voice channels and includes a GB of 88 kHz between the spectra of the translated MG signals that form it.

Higher-capacity systems were built by multiplexing various combinations of the MG and SMG signals at level 5 of the overall FDM hierarchy. For example, four MGs were combined to give a 1200-channel signal in the band 312 → 5564 kHz. Two SMGs were combined to give an 1800-channel signal in the band 316 → 8204 kHz. A 2700-channel system of baseband 316 → 12 388 kHz was realised by combining three SMG signals, a 3600-channel system of baseband 316 → 17 004 kHz was realised by combining four SMG signals, and a 10 800-channel system of baseband 4332 → 59 684 kHz was realised by combining 12 SMG signals.

Figure 13.11 European System. Generation of mastergroup (MG) and supermastergroup (SMG) signals. In the SG translating equipment (STE), 5 SG signals (312 → 552 kHz) are translated by carriers f_c3,1 = 1364 kHz to f_c3,5 = 2356 kHz; in the MG translating equipment (MTE), 3 MG signals (812 → 2044 kHz) are translated by carriers f_c4,1 = 10 560 kHz, f_c4,2 = 11 880 kHz, and f_c4,3 = 13 200 kHz to give the SMG signal (8516 → 12 388 kHz).

13.2.4.3 Bell System

Figure 13.12 Bell System. Generation of Bell U600 MG (UMG) signal. In the SG translating equipment (STE), 10 SG signals (312 → 552 kHz) are translated by carriers f_c3,1 = 1116 kHz to f_c3,10 = 3396 kHz to give the UMG signal (564 → 3084 kHz).
Figure 13.12 shows the implementation of level 3 of the overall FDM hierarchy in the Bell system. Ten SG signals are translated and combined into one MG signal occupying the band 564 → 3084 kHz and referred to as the U600 mastergroup (UMG) signal. The carrier frequencies used for the translation are as indicated in the diagram, and are spaced at 248 kHz, except at the seventh carrier, where the frequency increment is 296 kHz. The UMG signal therefore has a bandwidth of 2520 kHz, contains 600 voice channels, and includes a GB of 8 kHz between the constituent supergroups, except between the sixth and seventh, where the gap is 56 kHz. Systems of various capacities were built using mainly the UMG. For example, six MG signals may be multiplexed to form what is referred to as a jumbogroup (JG), which contains 3600 voice channels and occupies the band 564 → 17 548 kHz. To form the JG, one UMG is connected directly to the summing point without frequency translation, whereas the other five UMGs are added after being frequency translated using carriers at frequencies 6336, 9088, 11 968, 14 976, and 18 112 kHz. A 10 800-channel system in the band 3124 → 51 532 kHz could be realised by multiplexing 18 UMGs or three JGs.
13.2.4.4 Nonvoice Signals
Only telephony speech signals have been considered thus far. However, the following nonvoice signals were also transmitted in the hierarchical FDM systems discussed above.
13.2.4.4.1 Pilot Tones
The translating equipment at the transmitter inserts a reference pilot sinusoid into the composite signal at its output. This signal can be monitored at various points in the transmission system for fault location – a missing pilot indicates a malfunctioning of the relevant translating equipment. The pilot signal is also used at repeater stations and receivers for automatic gain control to compensate for variations in the attenuation along the cable path due especially to seasonal temperature changes. A CTE adds a group reference pilot at 84.080 kHz, a GTE adds a supergroup reference pilot at 411.920 kHz, an STE adds a hypergroup or mastergroup reference pilot at 1552 kHz, and an MTE (mastergroup translating equipment) adds an SMG reference pilot at 11 092 kHz. Note that the group and SG reference pilots lie in the 0.9 kHz gaps between translated voice bands, whereas the 1552 and 11 092 kHz pilots are in the GBs between translated SG and HG/MG assemblies, respectively.
Cable transmission systems also included regulation pilots with frequencies (shown in Table 13.1) at the top end of the passband of the cable, where the unwanted variation in path attenuation is at a maximum. The gain of each repeater along the transmission system is adjusted to maintain a constant level of the regulation pilot signal. In addition, the frequency comparison pilots (FCPs) shown in Table 13.1 are included at the bottom end of the cable passband, where phase error is at a minimum. The FCP was employed at frequency translation points to maintain the frequency stability of the master oscillator from which all carrier frequencies were derived.
13.2.4.4.2 Data
The individual 4 kHz voice channels were extensively used to carry data signals. First, the bit stream of the data signal is used to modulate a voice-frequency carrier in a modem. The technique of digital modulation is covered in Chapter 11. Early modems (e.g. Bell 202 standard) achieved bit rates of 1.2 kb/s using frequency shift keying (FSK). Bit rates up to 56 kb/s were achieved (e.g. in ITU V.90 standard) using amplitude and phase shift keying (APSK) modulation formats. To achieve higher bit rates, an entire group channel of bandwidth 48 kHz or SG channel of bandwidth 240 kHz was used to transmit data, which were of course again carried using a suitable carrier frequency. Data transmission is particularly sensitive to phase distortion and it was necessary to employ an adaptive equaliser at the receiver (see Chapter 12) to compensate for group delay distortion.
Table 13.1 Regulation and frequency comparison pilots of various cable transmission systems.
| Cable system bandwidth (MHz) | No. of channels | Cable size (mm) | Repeater spacing (km) | Ref. pilot (kHz) | FCP (kHz) |
|---|---|---|---|---|---|
| 1.3 | 300 | 1.2/2.4 | 8 | 1 364 | 60 |
| 4 | 960 | 1.2/2.4 | 4 | 4 092 | 60 |
| 12 | 2 700 | 1.2/2.4 | 2 | 12 435 | 308 |
| 18 | 3 600 | 2.6/9.5 | 4 | 18 480 | 564 |
| 60 | 10 800 | 2.6/9.5 | 1.5 | 61 160 | 564 |
13.2.4.4.3 Wideband Audio
The bands of several adjacent voice channels were used for transmitting a single wideband audio or sound programme signal. The three standards specified by the ITU included:
● 2 voice channels for 50 Hz → 6.4 kHz audio
● 3 voice channels for 50 Hz → 10 kHz audio
● 6 voice channels for 30 Hz → 15 kHz audio.
Note that the lower-frequency limit of the above audio signals is lower than the value of 300 Hz allowed for voice signals. This gives rise to a more stringent filter requirement, increasing ℤ by a factor of about 6 and 10 for the 50 Hz and 30 Hz limits, respectively. An entire group channel was also used for the transmission of stereophonic sound.
13.2.4.4.4 Television
Analogue television signals could also be carried in high-capacity FDM systems. Because of the significant low-frequency content, which makes frequency translation by SSB impracticable, and the large video signal bandwidth (∼ 6.0 MHz), a modulation technique known as vestigial sideband (VSB) was employed to place the television signal in the desired frequency band. VSB is discussed at length in Chapter 7. One television signal (in the upper band) and up to 1200 voice channels (in the lower band) could be accommodated in a 12 MHz coaxial cable system. In the 18 and 60 MHz coaxial cable systems, one television signal could be carried in two adjacent HG or SMG bands. Thus, the 18 MHz system could carry a maximum of two television signals, and the 60 MHz system could carry six. Alternatively, 1800 voice channels + one television signal were simultaneously carried in the 18 MHz system. And the 60 MHz system could carry 9000 voice channels + one television signal, or 7200 voice channels + two television signals.
13.2.5 Wavelength Division Multiplexing
FDM is used in some high-capacity optical fibre communication systems, where it is referred to as wavelength division multiplexing (WDM). The reason for this change of nomenclature is that above the radio band 3 kHz → 3000 GHz physicists have traditionally identified electromagnetic radiation by its wavelength, rather than frequency. And since the electromagnetic radiation that can propagate in a fibre transmission medium with minimum attenuation lies in the infrared band, we have followed physicists to identify the carrier signal by its wavelength. WDM then results when multiple carrier signals, each carrying an independent bit stream, are transmitted simultaneously along one fibre.
Figure 13.13a shows a basic implementation of WDM in which N independent bit streams are multiplexed onto a single fibre transmission medium. Each bit stream is a TDM signal (see Section 13.3), for example the OC-48 signal, which carries 32 256 voice channels, or the OC-192 carrying 129 024 voice channels. The bit streams are represented as non-return-to-zero (NRZ) voltage waveforms, and each modulates (by on–off keying) the optical emission of a separate laser source of respective wavelengths λ1, λ2, …, λN. The ITU has defined six optical transmission bands in the infrared region. See Table 5.1 in Chapter 5, where Figure 5.24 also shows the fibre loss per kilometre in these bands, which is roughly 0.5 dB/km in the original (O) band and 0.2 dB/km in the conventional (C) band. A separation of 2 nm between the wavelengths of the optical emissions (i.e. carrier signals) from the laser sources would allow up to N = 50 WDM channels in O band (1260–1360 nm) and N = 17 channels in C band.
Current systems have three different WDM regimes depending on the value of N. Normal WDM is the most basic in which N = 2 and the two simultaneously transmitted wavelengths are usually 1310 nm and 1550 nm on a single optical fibre.
Coarse WDM (CWDM) describes systems that have N > 2 and a moderate spacing between wavelengths. In 2003, the ITU standardised an 18-channel CWDM involving simultaneously transmitted wavelengths from 1271 to 1611 nm at a spacing of 20 nm on a single optical fibre. Finally, dense WDM (DWDM) refers to systems implemented in the optical C band using a very dense spacing of wavelengths. For example, using speed of light c = 299 792 458 m/s, a 40-channel DWDM in C band (1530–1565 nm) requires optical carrier spacing of 109.55 GHz, which corresponds to a wavelength spacing of 0.875 nm. And an 80-channel DWDM requires an optical carrier spacing of 54.77 GHz or 0.4375 nm. Raman amplification enables the usable wavelengths to be extended into the optical L band (1565–1625 nm), which allows the number of channels in DWDM systems to be increased even further.
Figure 13.13 (a) WDM system; (b) optical demultiplexer. In (a), bit streams 1 to N drive laser sources 1 to N (wavelengths λ1 to λN) at the transmitter, and an optical multiplexer combines their emissions onto a single fibre; at the receiver, an optical demultiplexer separates λ1 to λN and passes them to optical detectors 1 to N, which recover bit streams 1 to N. In (b), the input fibre carrying λ1 + λ2 + … + λN illuminates a lens and diffraction grating, and the separated wavelengths λ1, λ2, …, λN are collected by the output fibres.
The optical multiplexer is a passive coupler, which may be realised by butting all N laser diodes to a large-diameter fibre or mixing rod of short length. A single fibre butted to the other end of the rod collects the composite signal, which is a well-diffused mixture of the emissions from all the diodes. Not surprisingly, there is a significant insertion loss penalty in this simple multiplexer realisation. Demultiplexing is based on the spatial dispersion of the mixture of wavelengths by a prism or diffraction grating, as shown in Figure 13.13b. The incoming optical signal consisting of N wavelengths is focussed by a lens onto a diffraction grating, which returns the incident beam back to the lens with the different wavelengths separated at different angles. The lens focuses the separated optical signals onto different output fibres so that each fibre carries one of the originally multiplexed signals to an optical detector (e.g. a PIN diode or avalanche photodiode), which extracts the corresponding bit stream. A 16-channel WDM of OC-48 TDM signals enables a single fibre to carry 16 × 32 256 or 516 096 voice channels. Applying the same multiplexing strategy to OC-192 TDM signals allows one fibre to handle a massive 2 064 384 voice channels.
Note that this is an example of hybrid multiplexing in which independent signals are packed into a common transmission medium using more than one multiplexing strategy. In this case, the OC-48 or OC-192 signal is assembled by TDM of a number of digitised voice signals. These TDM signals are then stacked in different frequency bands of the fibre transmission medium using frequency (all right, wavelength) division multiplexing.
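The channel counts and carrier spacings quoted in this section can be checked with a few lines of arithmetic. The following Python sketch is illustrative only: the function names dwdm_spacing and max_channels are hypothetical, and it assumes that the channels are spread uniformly between the band edges quoted in the text.

```python
C = 299_792_458.0  # speed of light in m/s, as quoted in the text


def dwdm_spacing(band_nm, n_channels):
    """Return (carrier spacing in GHz, wavelength spacing in nm) when
    n_channels are spread uniformly across the band (lo_nm, hi_nm)."""
    lo_nm, hi_nm = band_nm
    f_hi = C / (lo_nm * 1e-9)  # shortest wavelength gives the highest frequency
    f_lo = C / (hi_nm * 1e-9)
    return (f_hi - f_lo) / n_channels / 1e9, (hi_nm - lo_nm) / n_channels


def max_channels(band_nm, spacing_nm):
    """Number of whole channel spacings that fit in the band."""
    lo_nm, hi_nm = band_nm
    return int((hi_nm - lo_nm) // spacing_nm)


O_BAND = (1260, 1360)  # nm
C_BAND = (1530, 1565)  # nm

print(max_channels(O_BAND, 2), max_channels(C_BAND, 2))  # 2 nm separation
for n in (40, 80):
    df_ghz, dlam_nm = dwdm_spacing(C_BAND, n)
    print(f"{n}-channel DWDM: {df_ghz:.2f} GHz, {dlam_nm} nm")

# Hybrid TDM/WDM capacity: 16 wavelengths, each carrying one TDM signal
print(16 * 32_256, 16 * 129_024)
```

Note that the 80-channel carrier spacing evaluates to about 54.776 GHz, so it appears as 54.77 GHz when truncated and 54.78 GHz when rounded to two decimal places.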
13.3 Time Division Multiplexing
13.3.1 General Concepts
We see in Chapter 9 that an analogue signal can be perfectly reconstructed from its samples taken at intervals of Ts (= 1/fs), provided the sampling frequency fs is at least twice the bandwidth of the analogue signal. Based on this (sampling) theorem, all bandlimited analogue signals can be digitised (e.g. converted to a pulse code modulation (PCM) signal), as is discussed in detail in Chapter 10. The signal is sampled at a suitable frequency fs and each sample is represented using k bits, referred to as a word. The case k = 8 is called a byte or octet. This yields an information-bearing bit stream having bit rate
$$R_c = k f_s \quad \text{bits/second} \tag{13.9}$$
Equation (13.9) gives the bit rate of one signal, which in this context is also referred to as a channel or tributary. We wish to examine how N such tributaries may be combined into one composite bit stream by TDM and the steps necessary to ensure accurate recovery of each channel at the receiver. An analogue TDM system is discussed in Chapter 1 (Section 1.5.3.2) using Figures 1.16 and 1.17, which you may wish to refer to at this point. Figure 1.17 shows a TDM signal (for N = 3) obtained by interleaving samples from each of the N tributaries. Here we are dealing with a digital system in which the samples have been digitised and each is represented with k bits. Thus, for correct reconstruction at the receiver, each of the N channels must have k bits in time slots of duration Ts. This time interval over which one word has been taken from each of the N channels is known as a frame. There are two types of frame organisation.
● Word-interleaved frame: the frame (of duration Ts) is filled by an interleaver, which visits each of the N channel ports once during the interval Ts, and at each visit takes one word (of k bits) from the storage dedicated to that channel. These bits are clocked out serially to give the TDM signal. The result is the frame structure shown in Figure 13.14a, and we see that a frame contains kN message bits. In this diagram, Wordj is the k-bit code bk−1 … b2 b1 b0 of the sample taken from the jth channel during the interval Ts, where bk−1 is the most significant bit of the word, b0 the least significant bit (lsb), etc.
● Bit-interleaved frame: a bit-interleaved frame is formed by taking one bit at a time from each of the N channel ports visited by the interleaver in a cyclical order (i.e. 0, 1, 2, …, N − 1, 0, 1, 2, …). The bits are clocked out in a serial fashion to give the output TDM signal. Since each channel requires a word of k bits to be sent in each interval of Ts, the interleaver must visit each port k times during this interval. The structure of the bit-interleaved frame is therefore as shown in Figure 13.14b, where b0(j) is the lsb of the sample from the jth channel, etc.
Note that both types of frames (bit- and word-interleaved) are of the same duration Ts and contain the same number of message bits kN. However, bit-interleaving does not require storage at the tributary ports, as does word-interleaving, to hold each message word until it is read by the interleaver. We will see that TDM is obtained at the first level of the plesiochronous digital hierarchy (PDH) by word-interleaving, whereas bit-interleaving is used at the higher levels.
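The two frame organisations can be illustrated with a short Python sketch. This is a simplified model rather than a description of any particular muldex: words are held as bit strings, and it is assumed here (purely for illustration) that channel 0 is visited first and that the most significant bit of each word is sent first. The function names are hypothetical.

```python
def word_interleave(words):
    """One word-interleaved frame: a whole k-bit word from each channel in turn."""
    return "".join(words)


def bit_interleave(words):
    """One bit-interleaved frame: one bit per channel per visit, k visits in all."""
    k = len(words[0])
    return "".join(word[i] for i in range(k) for word in words)


def bit_deinterleave(frame, n_channels):
    """Recover each channel's word: every n_channels-th bit belongs to one channel."""
    return [frame[j::n_channels] for j in range(n_channels)]


# N = 3 channels, k = 4 bits per word (illustrative values only)
words = ["1100", "1010", "0001"]
word_frame = word_interleave(words)
bit_frame = bit_interleave(words)
print(word_frame)  # 110010100001
print(bit_frame)   # 110100010001
print(bit_deinterleave(bit_frame, 3) == words)
```

Both frames contain the same kN = 12 message bits; only the order of transmission differs, which is why word-interleaving needs a k-bit store per tributary and bit-interleaving does not.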
Synchronisation between transmitter and receiver is crucial to the correct operation of any digital transmission system. In Section 12.4.3 we discuss clock extraction, which enables the receiver to achieve bit synchronisation with the transmitter and hence to use precisely correct decision instants for detecting the incoming bit stream. However, the packaging of bits from N tributaries into frames introduces a further synchronisation requirement, known as frame alignment or frame synchronisation. This is needed to give the receiver a precise knowledge of the start of each frame so that the bits in the TDM signal can be correctly distributed to their respective channels without the need for additional address information. To this end, the multiplexer inserts at regular intervals a special pattern of bits known as a frame alignment word (FAW). This serves as a marker with which the demultiplexer is synchronised at the receiver. Two different arrangements of the framing bits are in common use.
Figure 13.14 Frame organisation: (a) word-interleaving, in which the frame is the sequence of words WordN−1, …, Word2, Word1, Word0; (b) bit-interleaving, in which the frame is the sequence of bits bk−1(N−1), …, bk−1(1), bk−1(0), …, b1(N−1), …, b1(1), b1(0), b0(N−1), …, b0(1), b0(0). In both cases the frame contains kN bits and has duration Ts.
● Grouped or bunched FAW: here the FAW occupies a number of consecutive bit positions in each frame.
● Distributed FAW: a distributed FAW consists of several bits spread over one frame, or one bit per frame spread over several adjacent frames, called a multiframe or superframe.
Grouped FAW is employed in the European E1 TDM system, whereas the T1 system of North America uses a distributed FAW. There is a chance that a FAW can occur within the message bits leading to wrong alignment, and that a transmitted FAW can be corrupted by one or more bit errors. To minimise the problems posed by these two events, alignment is declared only after a correct FAW is detected at the same relative position within several (say three) consecutive time intervals. This interval is that over which a complete FAW was inserted at the transmitter, which could be a frame (for a bunched FAW) or a multiframe (for some distributed FAWs). Secondly, a loss of alignment is declared (and a free search for the FAW thereby initiated) only after a number of (say four) incorrect FAWs are received in consecutive intervals. Thirdly, the FAW is chosen to be of an intermediate length. Too long and it is more readily corrupted by noise; too short and it is more frequently imitated in the message bits. Furthermore, the FAW must be a sequence of bits that cannot be reproduced when a part of the FAW is concatenated with adjacent message bits (with or without bit errors), or when several FAWs are bit interleaved.
The control of switching and execution of other network management functions require the transmission of signalling information in addition to the message and FAW bits discussed above. This is accomplished by inserting a few auxiliary bits in various ways.
● Bit robbing: a signalling bit periodically replaces the lsb of a message word. This is done in every sixth frame in the T1 system that employs this technique. The resulting degradation is imperceptible for voice messages but is clearly totally unacceptable for data (e.g. American Standard Code for Information Interchange (ASCII)-coded) messages. For this reason, in a TDM system that uses bit-robbed signalling the lsb of message words in all frames are left unused when carrying data.
● Out-of-word signalling: within the sampling interval Ts, the message word from each channel is accompanied by one signalling bit, which gives a signalling rate of fs bits/second per channel. Alternatively, a time slot of k bits in every sampling interval is dedicated as a signalling channel whose bits are assigned in turn to each of the N channels. The signalling rate in this case is therefore kfs/N bits/second per channel. We will see that the E1 system uses this type of signalling.
● Common signalling: one slot of k bits is dedicated in each time interval Ts to signalling, which leads to an overall signalling rate of kfs bits/second. The entire signalling slot is assigned to one message channel at a time according to need. Some of the bits are, however, used to provide a label that identifies which channel the signalling belongs to.
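The frame alignment strategy described earlier in this section (declare alignment only after several consecutive correct FAWs, and declare loss only after several consecutive incorrect ones) amounts to a small state machine. The Python sketch below is a simplified model under stated assumptions: each interval is reduced to a single boolean saying whether a correct FAW was found at the expected position, the thresholds default to the "say three" and "say four" values mentioned in the text, and the class name FrameAligner is hypothetical.

```python
class FrameAligner:
    """Hysteresis on FAW detection: 'confirm' good intervals to gain
    alignment, 'lose' bad intervals in a row to declare it lost."""

    def __init__(self, confirm=3, lose=4):
        self.confirm = confirm
        self.lose = lose
        self.aligned = False
        self.good = 0  # consecutive intervals with a correct FAW
        self.bad = 0   # consecutive intervals with an incorrect FAW

    def update(self, faw_ok):
        if faw_ok:
            self.good += 1
            self.bad = 0
            if not self.aligned and self.good >= self.confirm:
                self.aligned = True
        else:
            self.bad += 1
            self.good = 0
            if self.aligned and self.bad >= self.lose:
                self.aligned = False  # a free search for the FAW would start here
        return self.aligned


aligner = FrameAligner()
observed = [True, True, False, True, True, True,    # alignment gained on the 6th interval
            False, False, False, True,              # three errors are tolerated
            False, False, False, False]             # four in a row: alignment lost
states = [aligner.update(ok) for ok in observed]
print(states)
```

A single corrupted FAW therefore does not trigger a search, and a chance imitation of the FAW in the message bits does not cause a false declaration of alignment unless it recurs at the same position in consecutive intervals.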
From the foregoing discussion we see that the bit rate of an N-channel TDM signal exceeds N times the bit rate of each tributary because of the insertion of overhead bits for frame alignment and signalling. There are kN message bits in each frame of duration Ts. Thus, fs (= 1/Ts) is the frame rate. If we denote the total number of framing and signalling bits in each frame by l (for control bits), it follows that the bit rate of the TDM signal is given by
$$R = \frac{Nk + l}{T_s} = Nkf_s + lf_s = NR_c + lf_s \tag{13.10}$$
where Rc is the tributary bit rate stated earlier in Eq. (13.9). Considering the fraction of message bits in the TDM signal, we may define the data transmission efficiency as
$$\eta = \frac{\text{Number of message bits}}{\text{Total number of bits}} \times 100\% = \frac{NR_c}{R} \times 100\% \tag{13.11}$$
It is important to note the significance of the parameters on the right-hand side of Eq. (13.11). N is the number of message channels at the input of the nonhierarchical or flat-level TDM multiplexer, Rc is the message bit rate emanating from each channel, and R is the output bit rate of the multiplexer. Equation (13.11) can be applied to a TDM signal obtained after several hierarchical levels of multiplexing, with NRc being the total number of message bits per second in the TDM signal, which includes bits added ahead of the multiplexer to each of the tributary bit streams for error control. We have so far discussed in very general terms what is a flat-level TDM. To allow the building of high-capacity TDM systems using standardised equipment, a hierarchical multiplexing procedure was adopted.
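As a quick numerical illustration of Eqs. (13.10) and (13.11), the Python sketch below evaluates the TDM bit rate and efficiency for one set of parameters. The function names are hypothetical, and the example values (N = 30 channels, k = 8 bits, fs = 8 kHz, and l = 16 control bits per frame, i.e. two 8-bit overhead slots) are those of the E1 system treated in Section 13.3.2.1.

```python
def tdm_bit_rate(n_channels, k_bits, fs_hz, l_control_bits):
    """Eq. (13.10): R = (N*k + l) / Ts = (N*k + l) * fs, in bits/second."""
    return (n_channels * k_bits + l_control_bits) * fs_hz


def tdm_efficiency(n_channels, k_bits, fs_hz, l_control_bits):
    """Eq. (13.11): percentage of message bits, 100 * N * Rc / R."""
    rc = k_bits * fs_hz  # Eq. (13.9): tributary bit rate
    r = tdm_bit_rate(n_channels, k_bits, fs_hz, l_control_bits)
    return 100.0 * n_channels * rc / r


R = tdm_bit_rate(30, 8, 8000, 16)
eta = tdm_efficiency(30, 8, 8000, 16)
print(R, eta)  # 2048000 bits/second and 93.75 %
```

With l = 0 the efficiency is 100% and R reduces to N times the tributary rate, which confirms that the overhead term l·fs is the only source of inefficiency in a flat-level multiplexer.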
13.3.2 Plesiochronous Digital Hierarchy
The basic building block of the PDH is the 64 kb/s channel, which results from the digitisation of analogue speech in the following manner. The analogue speech signal is first filtered to limit its frequency content to a maximum value of 3400 Hz. It is then sampled at the rate fs = 8 kHz, meaning that the sampling interval is Ts = 1/fs = 125 μs, which constitutes a frame. Each sample is quantised and represented using k = 8 bits. The coding scheme follows a nonuniform quantisation procedure, which is either A-law (in Europe) or μ-law (in North America). Thus, each voice signal is converted to a bit stream generated at the rate 8000 samples/second × 8 bits/sample = 64 kb/s. This is the bit rate of each input channel or tributary at the very first level of the multiplexing hierarchy. This brief summary of sampling and digitisation is adequate for our treatment here, but you should feel free to consult Chapters 9 and 10 for a more detailed discussion. There are three different procedures for hierarchical multiplexing of these 64-kb/s channels, namely the E1 system in Europe, the T1 system in North America, and the (non-ITU standardised) J1 system in Japan.
13.3.2.1 E1 System
The first level of multiplexing combines 30 digitised speech signals, each of bit rate 64 kb/s, to give the Order-1 TDM signal or simply E1. The equipment used for this purpose is known as a primary muldex – a portmanteau of multiplexer and demultiplexer – a block diagram of which is shown in Figure 13.15 with emphasis on the multiplexing operation. Note that the PCM codec (for coder and decoder) is the A-law type. The E1 frame is often described as CEPT PCM-30, where CEPT refers to Conference of European Posts and Telecommunications, and 30 signifies the number of voice channels.
Figure 13.15 E1 first-order TDM. E1 primary muldex: the 30 analogue voice signal inputs v1(t) to v30(t) are each digitised by an A-law PCM codec into a 64 kb/s bit stream. A byte interleaver places these in time slots 1 to 15 and 17 to 31, with multiplex control bits (64 kb/s) in slot 0 and signalling control bits (64 kb/s) in slot 16, to give the Order-1 TDM signal output at 2048 kb/s.
13.3.2.1.1 E1 Frame Structure
A frame of duration Ts = 125 μs is divided into 32 time slots, and 8 bits are placed in each slot. These bits are clocked out serially to give the E1 signal, which therefore has a bit rate
$$R_1 = \frac{32 \times 8\ \text{bits}}{125 \times 10^{-6}\ \text{seconds}} = 2048\ \text{kb/s} \tag{13.12}$$
Of the 32 slots or channels C0 to C31 in each frame, 30 carry message bits, which will vary from frame to frame according to the samples of the respective message signals. Two of the channels (C0 and C16) carry overhead bits for managing the multiplexing operation and signalling. The efficiency of the E1 signal therefore follows from Eq. (13.11), with N = 30, Rc = 64 kb/s, and R = 2048 kb/s
$$\eta_1 = \frac{30 \times 64}{2048} \times 100\% = 93.75\%$$
Out-of-word signalling is employed with channel C16 providing the signalling needs of two of the message channels at a time, 4 bits to each channel. It therefore takes 15 adjacent frames to cover the signalling of the 30 message channels. Dedicating channel C16 in the first frame for marking the beginning of this group of frames, we have what is known as a multiframe that consists of 16 adjacent frames and is of duration 16 × 125 μs = 2 ms. The complete content of channels C0 and C16 can be seen over an entire multiframe consisting of frames F1 to F16, as shown in Figure 13.16. In considering this multiframe, we ignore the contents of channels C1 to C15 and C17 to C31 in each frame since these are message bits. We note the following:
● Signalling channel C16: In the first frame F1, the first 4 bits of the channel (C16) are used to carry a multiframe alignment word (MAW) = 0000, which marks the beginning of a multiframe and allows correct numbering of the component frames. Three bits are unassigned extra bits (XB); and one bit is used as an alarm bit (AB) to signal the loss of multiframe alignment. AB is binary 0 during normal operation. In all other frames Fj (for j = 2, 3, …, 16), four bits of the channel carry signalling data (e.g. on or off hook, dialling digits, call progress, etc.) for channel Cj − 1, and the remaining four bits carry the signalling for channel Cj + 15. What this means is that channel C16 in frame F2 carries the signalling bits for message-carrying channels C1 and C17. In frame F3, the signalling channel C16 carries signalling bits for channels C2 and C18, and so on until in the last frame F16 of the multiframe it carries signalling bits for channels C15 and C31. In this way, each one of the message channels uses four signalling bits in every multiframe of duration 2 ms, which gives a channel signalling bit rate of
$$R_{\text{signalling}} = \frac{4\ \text{bits}}{2\ \text{ms}} = 2\ \text{kb/s}$$
● Alignment channel C0: The first bit of this channel in all frames is an international bit (IB), which is now used to provide a check sequence for error detection as follows. A 4-bit cyclic redundancy check (CRC-4) code is computed using all 2048 bits in frames F1 to F8, referred to as the first submultiframe (SMF). This code is conveyed in the first bit (IB) of frames F9, F11, F13, and F15 (i.e. within the next SMF). Another CRC-4 code is computed on the bits in the second SMF (i.e. frames F9 to F16), and conveyed in the first bit of frames F1, F3, F5, and F7 in the next SMF. At the receiver the same computation is repeated, and the result is compared with the received CRC-4 code. Any discrepancy is an indication of error in one or more bits of the relevant SMF. In the odd frames F1, F3, …, F15, the last seven bits of channel C0 carry a FAW = 0011011. In the even frames F2, F4, …, F16, the second bit is always set to binary 1 to avoid a chance imitation of the FAW within these even frames. The third bit is used as an alarm bit (AB), which is set to 1 to indicate the loss of frame alignment and is 0 during normal operation. The last five bits are designated as national bits (NB), which are set to 1 when an international boundary is crossed.
Figure 13.16 E1 level-1 frame format. A multiframe comprises 16 frames F1 to F16, and each frame comprises 32 channels C0 to C31. Alignment channel C0: IB followed by FAW 0011011 in odd frames (F1, F3, …, F15); IB, 1, AB, NB, NB, NB, NB, NB in even frames (F2, F4, …, F16). Signalling channel C16: MAW 0000 followed by XB, AB, XB, XB in F1 only; in other frames Fj (j = 2, 3, …, 16), signalling bits b3 b2 b1 b0 for the message in channel Cj − 1 followed by signalling bits b3 b2 b1 b0 for the message in channel Cj + 15.
13.3.2.1.2 E1 Hierarchy
A digital multiplexing hierarchy, shown in Figure 13.17, is used to build TDM systems of the required capacity, which allows a better exploitation of the bandwidth available on the transmission medium. Five levels of multiplexing are shown.
● In level 1, referred to as the primary level and discussed in detail above, 30 voice channels are multiplexed by byte-interleaving in a primary muldex to yield an Order-1 TDM or E1 signal of bit rate 2048 kb/s. This rate is often identified simply as 2 Mb/s.
● Four Order-1 TDM signals are combined in a muldex, more specifically identified as a 2–8 muldex. The output is an Order-2 TDM or E2 signal of bit rate 8448 kb/s (a rate often referred to as 8 Mb/s), which contains 4 × 30 = 120 voice channels. The multiplexing is by bit-interleaving in this and higher levels of the hierarchy. From Eq. (13.11), the efficiency η2 of this Order-2 TDM signal is given by
$$\eta_2 = \frac{120 \times 64\ \text{kb/s}}{8448\ \text{kb/s}} \times 100\% = 90.91\%$$
● Four Order-2 TDM signals are combined in an 8–34 muldex. The output is an Order-3 TDM or E3 signal of bit rate 34 368 kb/s – often abbreviated to 34 Mb/s. This signal contains 4 × 120 = 480 voice channels and has an efficiency η3 = 89.39%.
● Four Order-3 TDM signals are multiplexed in a 34–140 muldex to give an Order-4 TDM or E4 signal of bit rate 139 264 kb/s – often referred to simply as 140 Mb/s, which contains 4 × 480 = 1920 voice channels, and has an efficiency η4 = 88.24%.
● Finally, four Order-4 TDM signals are multiplexed in a 140–565 muldex to give an Order-5 TDM or E5 signal of bit rate 564 992 kb/s – referred to simply as 565 Mb/s, which contains 4 × 1920 = 7680 voice channels, and has an efficiency η5 = 87.00%. A further multiplexing level is possible that combines four Order-5 TDM signals to yield a 2.5 Gb/s TDM signal carrying 30 720 voice channels.
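The efficiencies listed above follow directly from Eq. (13.11) and can be reproduced with the short Python sketch below. The table of levels simply restates the bit rates and channel counts given in the list; the names LEVELS and efficiency_percent are hypothetical. The sketch also prints how much each muldex adds on top of its four input tributaries.

```python
# (name, voice channels, output bit rate in kb/s), as listed in the text
LEVELS = [
    ("E1", 30, 2048),
    ("E2", 120, 8448),
    ("E3", 480, 34368),
    ("E4", 1920, 139264),
    ("E5", 7680, 564992),
]


def efficiency_percent(voice_channels, rate_kbps, channel_kbps=64):
    """Eq. (13.11) with N*Rc = voice_channels * 64 kb/s."""
    return 100.0 * voice_channels * channel_kbps / rate_kbps


etas = {name: round(efficiency_percent(n, r), 2) for name, n, r in LEVELS}
print(etas)

# Overhead inserted by each higher-order muldex: output rate minus 4 x input rate
added_kbps = {
    LEVELS[i][0]: LEVELS[i][2] - 4 * LEVELS[i - 1][2]
    for i in range(1, len(LEVELS))
}
print(added_kbps)
```

The second dictionary shows, for example, that the 2–8 muldex outputs 8448 kb/s from four 2048 kb/s inputs, a difference of 256 kb/s that is taken up by the control bits of the E2 frame.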
We observe that at each level of the hierarchy the bit rate of the output TDM signal is more than the sum of the bit rates of the input tributaries. The efficiency of the TDM signals therefore decreases monotonically as we go up the hierarchy. The reason for this is the insertion of control bits into the TDM frame produced by each muldex in the hierarchy. It is worthwhile to examine this further.
Figure 13.17 CEPT plesiochronous digital hierarchy. The 30 analogue voice signals v1(t) to v30(t) enter the primary muldex, which outputs 2048 kb/s (30 voice channels). Four Order-1 TDM signals enter a 2–8 muldex, which outputs 8448 kb/s (120 voice channels). Four Order-2 TDM signals enter an 8–34 muldex, which outputs 34 368 kb/s (480 voice channels). Four Order-3 TDM signals enter a 34–140 muldex, which outputs 139 264 kb/s (1920 voice channels). Four Order-4 TDM signals enter a 140–565 muldex, which outputs the Order-5 TDM signal at 564 992 kb/s (7680 voice channels).
13.3.2.1.3 Higher-level Frames Structure
The frame format of the primary muldex has already been discussed in detail. The frame structure at the other levels of the hierarchy is shown in Figure 13.18. We note the following.

● A bunched FAW is inserted at the beginning of each frame, with FAW2 = FAW3 = 1111010000 and FAW4 = 111110100000. This enables the demultiplexer to recognise the start of each frame and therefore to correctly route the bits to their tributaries.
● The tributary bits are filled by taking one bit at a time in order from each of the four (input) tributaries. That is, bit-interleaving is used.
● Ji is a justification bit for the ith tributary, i.e. J1 for tributary 1, J2 for tributary 2, and so on. It is either a dummy bit or a legitimate bit taken from the tributary. The basis of this decision and the purpose of the J bit are explained below. The role of the J bits is to allow correct multiplexing of four tributaries whose bit rates, although nominally equal, may drift slightly apart. In fact, this is why the hierarchy is referred to as plesiochronous, which means nearly synchronous. Each tributary bit stream is written into a buffer under the control of a clock frequency extracted from the bit stream. The buffer is then read by the interleaver under the control of a common clock of slightly higher frequency. Occasionally, to prevent buffer i (for tributary i) from emptying, a dummy bit Ji is given to the interleaver rather than a bit being read from the buffer. This is known as positive justification or bit stuffing. Thus, Ji will be either a legitimate bit from the ith tributary or a dummy bit that must be discarded at the demultiplexer. The demultiplexer must therefore have a way of knowing which one is the case.
● Ci is a control bit that is set to 1 to indicate that Ji is a dummy bit. Ci = 0 thus indicates that Ji is a legitimate bit from the ith tributary. To protect this important control bit from error, it is sent more than once at different locations within the frame. The demultiplexer decides on the value of Ci based on majority voting. For example, in the third-order multiplex frame, Ci is taken to be a 1 if at least two of its three repeated transmissions are 1's. Note that a wrong decision about Ci and hence about whether Ji is a dummy bit would lead to a very serious problem of bit slip in subsequent bit intervals of the frame.

Figure 13.18 CEPT higher level frame formats. Nominal bit rates are 8448 kb/s, 34 368 kb/s, and 139 264 kb/s for E2, E3, and E4, respectively. FAW2 = FAW3 = 1111010000; FAW4 = 111110100000.
848-bit E2 frame: 10-bit FAW2, 1 alarm bit, 1 national bit, 200 tributary bits; C1 C2 C3 C4, 208 tributary bits; C1 C2 C3 C4, 208 tributary bits; C1 C2 C3 C4, J1 J2 J3 J4, 204 tributary bits.
1536-bit E3 frame: 10-bit FAW3, 2 alarm bits, 372 tributary bits; C1 C2 C3 C4, 380 tributary bits; C1 C2 C3 C4, 380 tributary bits; C1 C2 C3 C4, J1 J2 J3 J4, 376 tributary bits.
2928-bit E4 frame: 12-bit FAW4, 4 alarm bits, 472 tributary bits; four blocks each of C1 C2 C3 C4 followed by 484 tributary bits; C1 C2 C3 C4, J1 J2 J3 J4, 480 tributary bits.
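The majority-voting decision on the repeated control bit can be illustrated with a short Python sketch. The function names are hypothetical and chosen only for illustration; the sketch assumes that the repeated copies of Ci have already been located in the received frame and that an odd number of copies is sent (three in the E2 and E3 frames, five in the E4 frame).

```python
def is_stuff_bit(c_copies):
    """Majority vote over the repeated copies of control bit Ci.

    Returns True when more than half of the copies are 1, i.e. when
    Ji is judged to be a dummy (stuffing) bit.
    """
    if len(c_copies) % 2 == 0:
        raise ValueError("an odd number of copies is needed for a clear majority")
    return 2 * sum(c_copies) > len(c_copies)


def recover_j_bit(c_copies, j_bit):
    """Return Ji if it is a legitimate tributary bit, or None if it must be discarded."""
    return None if is_stuff_bit(c_copies) else j_bit


# One of the three copies is corrupted, but the decision is still correct.
print(is_stuff_bit([1, 0, 1]))      # Ji is a dummy bit
print(recover_j_bit([0, 0, 1], 1))  # Ji is a legitimate bit of value 1
```

With three copies a single bit error is tolerated; with the five copies of the E4 frame two errors are tolerated.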
13.3.2.2 T1 and J1 Systems
The T1 system (in North America) and J1 system (in Japan) both have identical first-order or primary multiplexing. Twenty-four analogue voice signals are each digitised to 64 kb/s using μ-law PCM and multiplexed by byte interleaving to yield a TDM signal, which is referred to as DS1. The DS1 frame, of duration Ts = 125 μs, contains 24 time slots of 8 bits per slot, plus one extra bit used for framing. Thus, the bit rate R1 and efficiency η1 of the DS1 signal are given by

$$R_1 = \frac{(24 \times 8) + 1 \ \text{bits}}{125 \times 10^{-6} \ \text{seconds}} = 1544 \ \text{kb/s}$$

$$\eta_1 = \frac{24 \times 64 \ \text{kb/s}}{1544 \ \text{kb/s}} \times 100\% = 99.48\%$$
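The DS1 bit rate and efficiency can be checked with a few lines of Python. This is only an arithmetic sketch; the function names are illustrative, and the frame rate of 8000 frames per second follows from the 125 μs frame duration.

```python
FRAMES_PER_SECOND = 8000  # one frame every 125 microseconds


def tdm_bit_rate(bits_per_frame, frames_per_second=FRAMES_PER_SECOND):
    """Bit rate (b/s) of a TDM signal carrying bits_per_frame bits in each frame."""
    return bits_per_frame * frames_per_second


def efficiency_percent(n_channels, channel_rate, line_rate):
    """Percentage of the line rate that carries message bits."""
    return 100.0 * n_channels * channel_rate / line_rate


ds1_rate = tdm_bit_rate(24 * 8 + 1)                  # 24 slots of 8 bits + 1 framing bit
ds1_eff = efficiency_percent(24, 64_000, ds1_rate)

print(ds1_rate)           # 1544000
print(round(ds1_eff, 2))  # 99.48
```

The same efficiency function applies at any level of a hierarchy once the number of 64 kb/s channels and the line rate are known.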
13.3.2.2.1 T1 Hierarchy
Figure 13.19a shows the North American PDH, which features four levels of multiplexing that are used to build systems of the required capacity. The first level of multiplexing generates the DS1 signal referred to above and discussed further shortly. Subsequent levels of multiplexing are based on bit-interleaving of the input tributaries, with extra bits inserted for distributed frame alignment, justification and justification control, and other services, such as alarm. The second level of multiplexing combines four DS1 signals into a 96-channel 6312 kb/s DS2 signal of efficiency 97.34%. At the third level, seven DS2 signals are multiplexed into a 672-channel 44 736 kb/s DS3 signal of efficiency 96.14%. There are three options at the fourth level of multiplexing. In one procedure, six DS3 signals are multiplexed into a 4032-channel 274 176 kb/s DS4 signal of efficiency 94.12%. Another standard involves the multiplexing of three DS3 signals into a 2016-channel 139 264 kb/s DS4 signal of efficiency 92.65%. Yet another procedure (not standardised by ITU) combines 12 DS3 signals into an 8064-channel 564 992 kb/s DS4 signal of efficiency 91.35%. Observe that the last two procedures yield signals of the same bit rates as the Order-4 and Order-5 TDM signals in the CEPT hierarchy, but the DS4 signals have a higher efficiency by about 4.4%.

13.3.2.2.2 T1 Frame Structure
Let us consider the constitution of the DS1 frame in more detail. Figure 13.19b shows what is referred to as a superframe, which consists of 12 adjacent DS1 frames numbered F1 to F12. We note that each frame (of duration 125 μs) contains 193 bits, which are assigned as follows:

● The first bit of each frame in the superframe is used to provide a distributed 12-bit FAW = 100011011100.
● The remaining 192 bits are message bits taken 8 bits at a time from 24 input channels numbered C0 to C23.
● In every sixth frame, i.e. the 6th and 12th frames of the superframe, the bit interval of the lsb of each channel is used to send a signalling bit, which we have identified as A-bit for the 6th frame and B-bit for the 12th frame. The resulting distortion is imperceptible for voice signals but totally unacceptable for data. Two different approaches may be adopted to get around this problem when transmitting data. (i) The 8th bit is not used at all in any channel of any frame. This restricts each channel to 7 bits per frame at 8000 frames per second, which gives a channel capacity of only 56 kb/s. Efficiency of the output TDM signal drops significantly from 99.48 to 87.05%. (ii) The 24th channel (C23) is devoted to signalling as a common signalling channel, called the D-channel. In this case efficiency equals 95.34%. This was the technique adopted for the North American primary rate integrated services digital network (PRI), termed 23B + D service, in which there were 23 bearer channels each of bit rate 64 kb/s, and one 64 kb/s data channel. The corresponding European PRI was 30B + D, providing 30 bearer channels and one data channel. These integrated services digital network (ISDN) services were carried over wire pairs and offered bit rates of up to 2.048 Mb/s (in multiples of 64 kb/s). They were termed narrowband to distinguish them from broadband integrated services digital network (B-ISDN), which was provided over optical fibre and offered data rates in excess of 45 Mb/s, up to 9.95 Gb/s.
● There exists a different signalling procedure for the T1 system in which 24 frames are grouped into what is known as an extended superframe (ESF). The first bit of each member-frame, formerly dedicated to framing only, is then used to perform various control functions. These 24 bits are assigned as follows. Six bits provide a distributed FAW = 001011, six bits are used for CRC error checking, and the remaining 12 bits are used to provide a management channel known as the facilities data link (FDL). However, signalling is still performed by bit-robbing the lsb of all message channels in every sixth frame.

Figure 13.19 (a) North American PDH; (b) DS1 (or T1) superframe; (c) T2 (DS2) frame structure; (d) 4760-bit T3 (DS3) frame structure.
(a) Primary DS1 muldex: 24 analogue voice signals → 1544 kb/s DS1 (24 voice channels). 1.5–6 muldex: 4 DS1 signals → 6312 kb/s DS2 (96 voice channels). 6–45 muldex: 7 DS2 signals → 44 736 kb/s DS3 (672 voice channels). Fourth level: 45–274 muldex, 6 DS3 signals → 274 176 kb/s DS4 (4032 voice channels); or 45–140 muldex, 3 DS3 signals → 139 264 kb/s DS4 (2016 voice channels); or 45–565 muldex, 12 DS3 signals → 564 992 kb/s DS4 (8064 voice channels).
(b) Superframe of 12 frames F1 to F12. Each frame consists of 1 framing bit (1, 0, 0, 0, 1, 1, 0, 1, 1, 1, 0, 0 for F1 to F12, respectively) followed by 8 bits from each of the channels C0 to C23. In frames F6 and F12 the lsb of each channel is a signalling bit (A in F6, B in F12).
(c) 1176-bit frame of four 294-bit subframes. Subframe k (k = 1, 2, 3, 4) carries six control bits in the order Mk Ck F0 Ck Ck F1 (with the alarm bit A in place of Mk in the 4th subframe). Each control bit is followed by 48 tributary bits, except that the last block consists of T1 T2 T3 T4 with Jk in place of Tk, followed by 44 tributary bits. Tk indicates a bit from tributary k = 1, 2, 3, 4; Trib. bits refer to bits from all tributaries.
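The efficiency figures quoted for the two data-transmission approaches follow from simple counting of message bits per frame. The short sketch below reproduces them; the helper name is illustrative only.

```python
DS1_RATE = 1_544_000      # b/s
FRAMES_PER_SECOND = 8000


def ds1_efficiency(message_channels, bits_per_slot):
    """Efficiency (%) of a DS1 signal when only message_channels channels carry
    user data and only bits_per_slot bits of each 8-bit slot are usable."""
    message_rate = message_channels * bits_per_slot * FRAMES_PER_SECOND
    return 100.0 * message_rate / DS1_RATE


# (i) The 8th bit of every slot is left unused: 24 channels of 56 kb/s each.
print(round(ds1_efficiency(24, 7), 2))   # 87.05

# (ii) Channel C23 is given up as a common signalling channel: 23 x 64 kb/s.
print(round(ds1_efficiency(23, 8), 2))   # 95.34
```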
13.3.2.2.3 DS2 Frame Structure
The structure of the 6312 kb/s DS2 or T2 signal generated at the second level of the North American PDH is shown in Figure 13.19c. One DS2 frame contains 1176 bits made up as shown in the diagram, including:

● A distributed FAW M1 M2 M3 = 011.
● Alarm bit A, which is set to 1 during normal operation.
● Repeating pattern F0 F1 = 01, which helps to identify the control bit time slots.
● Justification bits J1, J2, J3, J4, which work as earlier described for the European CEPT PDH.
● Control bits C1, C2, C3, C4, which work as earlier described for the European CEPT PDH. Ck = 0 indicates that Jk is a legitimate bit from input tributary k; otherwise (for Ck = 1) it indicates that Jk is a dummy (stuffing) bit. Ck is so vital that it is sent three times simply to protect it from error.
● All other bits are tributary bits (as indicated in the diagram) formed by interleaving the four input tributaries.

Figure 13.19 (Continued) (d) 4760-bit T3 (DS3) frame of seven 680-bit subframes. Each subframe consists of eight blocks, each comprising one control bit followed by 84 tributary bits. The control bits of subframe k (k = 1, 2, …, 7) are, in order, a leading bit (X, X, P, P, M0, M1, M0 for subframes 1 to 7, respectively) followed by F1 Ck F0 Ck F0 Ck F1. In the last block of subframe k, the first seven bit positions carry T1 to T7 with Jk in place of Tk, followed by 77 tributary bits.
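A quick bit count confirms that the DS2 frame components add up. The sketch below assumes the layout given in Figure 13.19c, namely four 294-bit subframes, each with six control bits (M or A, three C bits, F0, and F1) and one J bit; the variable names are illustrative.

```python
FRAME_BITS = 1176
SUBFRAMES = 4
CONTROL_BITS_PER_SUBFRAME = 6   # M (or A), C, F0, C, C, F1
J_BITS_PER_SUBFRAME = 1         # one justification bit Jk per subframe
DS2_RATE = 6_312_000            # b/s

overhead_bits = SUBFRAMES * CONTROL_BITS_PER_SUBFRAME        # M, A, F and C bits
j_bits = SUBFRAMES * J_BITS_PER_SUBFRAME
tributary_bits = FRAME_BITS - overhead_bits - j_bits         # guaranteed tributary bits
bits_per_tributary = tributary_bits // 4                     # excluding its J bit
frame_rate = DS2_RATE / FRAME_BITS                           # frames per second

print(overhead_bits, j_bits, tributary_bits, bits_per_tributary)
print(round(frame_rate, 2))
```

Each tributary therefore contributes 287 guaranteed bits per frame, plus one more whenever its J bit carries a legitimate bit.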
13.3.2.2.4 DS3 Frame Structure
Figure 13.19d shows the structure of the 44 736 kb/s DS3 or T3 signal generated at the third level of the North American PDH. One DS3 frame contains 4760 bits and each frame is divided into seven 680-bit subframes as shown. There are seven input tributaries and the C bits (C1 to C7) and J bits (J1 to J7) work as discussed above for the DS2 structure. Bits F1 F0 F0 F1 = 1001 serve as a distributed FAW for each subframe. Bits M0 M1 M0 = 010 serve as a distributed multi-subframe alignment word for the entire DS3 frame. Bit X is permanently set to 1 or may be used for low-speed signalling. Bit P (sent in two separate locations within the frame) is the modulo-2 sum of the 4704 tributary and J bits in the previous frame. It serves as a parity check for error control.

Figure 13.20 Japanese PDH. Primary muldex: 24 analogue voice signals → 1544 kb/s DS1 (24 voice channels). 1.5–6 muldex: 4 DS1 signals → 6312 kb/s DS2 (96 voice channels). 6–32 muldex: 5 DS2 signals → 32 064 kb/s J3 (480 voice channels). 32–98 muldex: three J3 signals → 97 728 kb/s J4 (1440 voice channels).

13.3.2.2.5 J1 Hierarchy
The Japanese PDH is shown in Figure 13.20. The first two levels of multiplexing are identical with the North American hierarchy. Beyond this, at the third level of multiplexing, five DS2 signals are combined to give a 480-channel 32 064 kb/s TDM signal of efficiency 95.81%. We call this signal J3. At the fourth level, three J3 signals are multiplexed to obtain a 1440-channel 97 728 kb/s J4 signal of efficiency 94.30%.

Worked Example 13.1
We wish to determine the following parameters for an M1–2 muldex that generates a DS2 signal from four DS1 inputs:
(a) Nominal stuffing rate of a DS2 signal.
(b) Maximum stuffing rate of a DS2 signal.
(c) Allowable range of bit rate variation in DS1 signals if they are to be successfully multiplexed into a DS2 signal.

(a) Figure 13.21a shows the nominal condition of the muldex that produces the DS2 signal. Each of the four (input) tributaries has its nominal DS1 bit rate of 1544 kb/s and the output DS2 signal has its nominal bit rate of 6312 kb/s. Looking at the DS2 frame structure given in Figure 13.19c, we see that 24 overhead bits (counting all nontributary bits except J, i.e. counting the M, A, F, and C bits) are inserted per 1176 bits. In addition, some stuffing bits are inserted (through some of the J bits) under nominal condition. This information is shown in Figure 13.21a. It means that for every 6 312 000 bits flowing out of the muldex (per second) under nominal condition, 1 544 000 × 4 are from the four tributaries, a fraction 24/1176 of the 6 312 000 bits are overhead bits, and the rest are stuffing bits. The stuffing bit rate, denoted Snom, is therefore determined by equating the number of all muldex input bits each second with the number of output bits. That is

$$1544000 \times 4 + \frac{24}{1176} \times 6312000 + S_{nom} = 6312000$$

$$\Rightarrow \quad S_{nom} = 7184 \ \text{b/s} = 1796 \ \text{b/s/tributary}$$

where we divide by 4 to obtain the number of stuffing bits per tributary.
Figure 13.21 Conditions of muldex that produces DS2 signal: (a) nominal; (b) minimum workable, which corresponds to maximum stuffing rate; (c) maximum workable, which corresponds to no stuffing bits required. In each panel a 1.5–6 muldex combines four DS1 inputs with overhead bits (24 per 1176-bit frame) into a 6 312 000 b/s DS2 output. In (a) each input is at 1 544 000 b/s and stuffing bits are added; in (b) each input is at R1min and all J bits (4 per frame) are stuffing bits; in (c) each input is at R1max.
(b) The maximum stuffing rate occurs when all J bits in the frame are stuff bits. This happens when the input tributaries are at their lowest workable rate R1min so that stuffing bits must be inserted in all J positions to complete an outgoing DS2 frame. This scenario is illustrated in Figure 13.21b including the fact (from Figure 13.19c) that there are four J bits per 1176-bit frame. The maximum stuffing rate (per tributary) is therefore

$$S_{max} = \frac{1}{1176} \times 6312000 = 5367 \ \text{b/s/tributary}$$
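The balance of input and output bits used in parts (a) and (b) can be verified numerically. The following sketch simply restates the two equations in Python; the variable names are illustrative.

```python
DS2_RATE = 6_312_000       # output bit rate, b/s
DS1_NOMINAL = 1_544_000    # nominal input bit rate per tributary, b/s
FRAME_BITS = 1176
OVERHEAD_BITS = 24         # M, A, F and C bits per frame
N_TRIBUTARIES = 4

# (a) Nominal condition: inputs + overhead + stuffing = output.
overhead_rate = OVERHEAD_BITS / FRAME_BITS * DS2_RATE
s_nom_total = DS2_RATE - N_TRIBUTARIES * DS1_NOMINAL - overhead_rate
s_nom_per_tributary = s_nom_total / N_TRIBUTARIES

# (b) Maximum stuffing: the one J bit per tributary in every frame is a dummy bit.
s_max_per_tributary = DS2_RATE / FRAME_BITS

print(round(s_nom_total), round(s_nom_per_tributary))   # 7184 1796
print(round(s_max_per_tributary))                        # 5367
```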
(c) At the highest workable input rate R1max, the tributary bits are coming in so fast that every J position in the frame must be used to convey a tributary bit. That is, every J bit is a tributary bit and there is no stuffing bit. This scenario is illustrated in Figure 13.21c. The allowable range of DS1 bit rate variation is from R1min in Figure 13.21b to R1max in (c). We determine these values by simply equating muldex input with output (per second) in each scenario. Thus

$$4 \times R_{1min} + \frac{28}{1176} \times 6312000 = 6312000 \quad \Rightarrow \quad R_{1min} = 1540.429 \ \text{kb/s}$$

And

$$4 \times R_{1max} + \frac{24}{1176} \times 6312000 = 6312000 \quad \Rightarrow \quad R_{1max} = 1545.796 \ \text{kb/s}$$

That is, the allowable range of DS1 bit rate variation is from 1540.429 kb/s to 1545.796 kb/s.

13.3.2.3 PDH Problems

● It is clear from the above discussions that the plesiochronous digital hierarchies used in North America (T1), Japan (J1), and the rest of the world (E1) are not compatible. The interconnection of PDH systems originating from different parts of the world thus requires proprietary conversion equipment, which maps 8-bit time slots (at the 1.5 and 2 Mb/s levels) from one hierarchy into slots in the other hierarchy, a process known as timeslot interchange. Note that when converting between E1 and T1 or J1 systems that carry voice traffic, it is also necessary to re-code each 8-bit word between A-law and μ-law PCM.
● PDH was designed to cater for the basic 64 kb/s telephony rate. It is therefore unsuitable for the transmission of video traffic, which (even with the use of compression) requires much higher bit rates. Inherent limitations in PDH make it impossible to realise a well-managed system having the required video rates simply by concatenating several 64 kb/s channels. Furthermore, multiple stages of multiplexing are needed to realise higher-capacity systems for data and voice traffic. There is a small problem here in that control bits must be inserted at each stage to manage the multiplexing process. This leads to lower efficiencies at the higher levels of the hierarchy. For example, the T1 system has an efficiency of 99.48% at the first multiplexing stage that produces the 1544 kb/s 24-channel signal, but an efficiency of only 92.65% for the 139 264 kb/s 2016-channel signal generated at the fourth multiplexing stage.
● Beyond the first level of the hierarchy, PDH involves the multiplexing of plesiochronous tributaries, which requires occasional insertion of dummy bits into the TDM signal, as earlier explained. This leads to two problems, one minor and the other major. When the dummy bits are removed at the receiver, gaps are left in the extracted bit stream, which causes jitter that must be smoothed using a process that involves extra buffering.
More seriously, the presence of these dummy bits and framing bits makes it impossible to drop and insert a lower rate tributary without completely demultiplexing (i.e. unpacking) the TDM signal down to its component tributaries at the desired rate. Thus, a multiplexing mountain is required at the drop and insert point, an example of which is shown in Figure 13.22 for the case where a 2 Mb/s bit stream is extracted from a 140 Mb/s system. PDH-based networks are therefore expensive, inflexible, and cumbersome to build with provision for cross-connect points. Such points are needed in modern networks to allow lower-rate tributaries to be dropped and inserted at intermediate points, channels to be provided for private networks, and subnetworks to be interconnected to provide alternative paths through a larger network as a backup against the failure of a particular link.
13.3.3 Synchronous Digital Hierarchy

The synchronous digital hierarchy (SDH) is a multiplexing technique designed to address the above PDH shortcomings and to operate in synchronism with the digital switches used at network nodes. It allows individual tributaries (down to 64 kb/s) to be readily accessed, and it very conveniently accommodates the standardised PDH signals presented above, as well as being well suited for carrying ATM payloads (discussed later). Moreover, SDH makes ample provision of channel capacity to meet all the requirements of advanced network management and maintenance for the foreseeable future.

Figure 13.22 PDH multiplex mountain required to access one of the 2 Mb/s channels within a 140 Mb/s signal. On the input side, 34–140, 8–34, and 2–8 muldexes perform demultiplexing of the 140 Mb/s signal into 34 Mb/s, 8 Mb/s, and 2 Mb/s streams, from which a 2 Mb/s stream is extracted. On the output side, 2–8, 8–34, and 34–140 muldexes perform multiplexing of the inserted 2 Mb/s stream back into the outgoing 140 Mb/s signal.

13.3.3.1 SDH Rates
You will recall that the E1 frame (in Figure 13.16) contains 32 bytes in a 125 μs duration, which gives it a bit rate of 2048 kb/s. The basic SDH frame is called synchronous transport module at level 1 (STM-1) and contains 2430 bytes. It has the same duration Ts = 125 μs, which means that a total of 8000 STM-1 frames are transmitted each second, i.e. frame rate = 8 kHz. Thus, the basic SDH bit rate is

$$R_1 = \frac{2430 \times 8}{125 \times 10^{-6}} = 155.52 \ \text{Mb/s}$$

The STM-1 frame can therefore accommodate all the American and European PDH multiplex signals up to the fourth level of the plesiochronous hierarchy, namely 1.5, 2, 6, 8, 34, 45, and 140 Mb/s. Higher SDH rates are obtained by multiplexing, through byte-interleaving, a number (N) of STM-1 frames to give what is referred to as the STM-N frame. The ITU has standardised those rates in which N is a power of 4. For example, the STM-4 and STM-16 frames have duration Ts = 125 μs, contain 2430 × 4 and 2430 × 16 bytes, respectively, and hence have bit rates

$$R_4 = \frac{2430 \times 4 \times 8}{125 \times 10^{-6}} = 622.08 \ \text{Mb/s}$$

$$R_{16} = \frac{2430 \times 16 \times 8}{125 \times 10^{-6}} = 2488.32 \ \text{Mb/s}$$

Figure 13.23 shows the assembly of higher-capacity transport modules up to STM-64. It can be seen that the output bit rate at each level is the sum of the bit rates of the input tributaries. This is because of the synchronous operation of these inputs, which makes bit-stuffing unnecessary. Note that although we have depicted a hierarchical assembly in Figure 13.23, an STM-N frame, whatever the value of N, can also be obtained by byte-multiplexing N STM-1 frames in a 1-N SDH muldex.

13.3.3.2 SDH Frame Structure
Figure 13.24 shows the structure of the STM-1 and STM-N frames. It is sufficient that we consider only the STM-1 frame structure in detail, since the structure of the STM-N frame follows straightforwardly from a byte-by-byte interleaving of N STM-1 frames.
Figure 13.23 Higher-capacity synchronous transport modules. SDH 1–4 muldex: 4 STM-1 signals (155.52 Mb/s each) → STM-4 (622.08 Mb/s). SDH 4–16 muldex: 4 STM-4 signals → STM-16 (2488.32 Mb/s). SDH 16–64 muldex: 4 STM-16 signals → STM-64 (9953.28 Mb/s).
We have shown the 2430 bytes of the STM-1 frame arranged in nine rows of 270 bytes each. However, it must be emphasised that the frame is transmitted serially 1 bit at a time starting from row 1, then row 2, and so on to row 9. The MSB of each byte is transmitted first. One STM-1 frame is sent in an interval of 125 μs, followed by the next frame in the next 125 μs interval, and so on. Note that there are 2430 (= 270 × 9) cells in our rectangular-matrix representation of the STM-1 frame. Each cell corresponds to 8 bits transmitted in 125 μs, which represents a 64 kb/s channel capacity. Similarly, each column represents a channel capacity of 64 × 9 = 576 kb/s. Clearly then, one cell of the STM-1 frame can carry one PCM voice signal. Three columns can carry one DS1 signal (of bit rate 1544 kb/s), with some bits to spare. Four columns can carry one E1 signal (of bit rate 2048 kb/s), etc. We will have more to say on this when considering how the STM-1 frame is assembled. The STM-1 frame is divided into two parts. The first part is the frame header and consists of a 9-byte pointer field and a 72-byte section overhead (SOH). The frame header covers the first nine columns of the frame, which corresponds to a channel capacity of 5.184 Mb/s. It is used for carrying control bits, such as frame alignment, error monitoring, multiplex and network management, etc. The remaining part of the frame is the payload, which consists of 261 columns or a channel capacity of 150.336 Mb/s. This area is used for carrying a variety of signals and is therefore referred to as a virtual container (VC). More specifically, it is identified as VC-4, to distinguish it from smaller-sized virtual containers, since it is large enough to contain the 140 Mb/s PDH signal at the fourth level of the plesiochronous hierarchy. 
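The capacities quoted above all follow from the 9-row by 270-column layout and the 8 kHz frame rate. The sketch below recomputes them; the function and constant names are illustrative, and the column count needed for a tributary ignores any pointer or path overhead, so it is only a lower bound on capacity, not a statement of the standardised mappings.

```python
import math

FRAME_RATE = 8000       # frames per second (one frame per 125 microseconds)
ROWS = 9
COLUMNS = 270
HEADER_COLUMNS = 9

CELL_RATE = 8 * FRAME_RATE            # one byte per frame: 64 kb/s
COLUMN_RATE = ROWS * CELL_RATE        # nine bytes per frame: 576 kb/s


def stm_rate(n):
    """Bit rate (b/s) of an STM-N frame built from N byte-interleaved STM-1 frames."""
    return n * ROWS * COLUMNS * 8 * FRAME_RATE


def min_columns_for(rate_bps):
    """Smallest whole number of columns whose raw capacity covers rate_bps."""
    return math.ceil(rate_bps / COLUMN_RATE)


header_rate = HEADER_COLUMNS * COLUMN_RATE                 # 5.184 Mb/s
payload_rate = (COLUMNS - HEADER_COLUMNS) * COLUMN_RATE    # 150.336 Mb/s

print(stm_rate(1), stm_rate(4), stm_rate(16))
print(header_rate, payload_rate)
print(min_columns_for(1_544_000), min_columns_for(2_048_000))   # DS1, E1
```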
In general, the payload area provides virtual containers of various sizes identified as VC-j, which is large enough to accommodate the PDH signal at the jth level of the plesiochronous hierarchy, but too small for the signal at the next higher level. At lower levels j < 4, a second digit is appended to the identification to distinguish between the American (1) and European (2) signals. Thus, VC-11 (pronounced veecee-one-one) is a virtual container adequate for the American DS1 signal (of bit rate 1544 kb/s). Similarly, VC-12 is for the European E1 signal (of bit rate 2048 kb/s), VC-21 is for the American DS2 signal (of bit rate 6312 kb/s), VC-22 is for the European E2 signal (of bit rate 8448 kb/s), etc. VC-1 and VC-2 are described as lower-order virtual containers, whereas VC-3 and VC-4 are higher order.

VCs include bits for a path overhead (POH), which is added at the point that the tributary signal is incorporated into the SDH system and is used to manage the transmission of the signal and ensure its integrity. The process of adding a POH is known as mapping. A VC without its POH is known simply as a container (C), which is therefore the maximum information payload available to a user in the VC. The entire first column (9 bytes) of a VC-4 is used for the POH. Thus, a C-4 has 260 × 9 bytes or a capacity of 149.76 Mb/s, which is the maximum information rate in a VC-4, more than enough for the 140 Mb/s PDH signal. As a reminder we may write

$$\text{VC} = \text{C} + \text{POH} \tag{13.13}$$
Figure 13.24 SDH frame structure: (a) STM-1; (b) STM-N, for N = 4, 16, 64, …
(a) STM-1 frame: 270 columns of 9 bytes each, comprising a 9-column frame header (RSOH in rows 1 to 3, AU pointers in row 4, MSOH in rows 5 to 9) and a 261-column payload. The header bytes shown are A1, A2, C1, B1, M1 to M6, E1, F1, and D1 to D3 in the RSOH, and B2, K1, K2, D4 to D12, Z1, Z2, and E2 in the MSOH.
(b) STM-N frame: 270 × N columns of 9 bytes each, comprising a 9 × N-column SOH and a 261 × N-column payload.
When a tributary signal is inserted into the SDH system, we say that it has been incorporated in a container. This process requires single-bit or asynchronous justification if the tributary and SDH clocks are not locked in frequency. Justification is discussed in Section 13.3.2 in connection with the multiplexing of nearly synchronous tributaries in PDH. The capacity of a container is always larger than that required by the tributary signal for which it is defined. So, as part of the mapping process, the spare byte positions in the container are filled with a defined filler pattern of stuffing bits to synchronise the tributary signal with the payload capacity. The POH and stuffing bits are removed at the drop point in the network where the tributary is demultiplexed. Before leaving this issue, it is worth pointing out that the maximum efficiency of an SDH system can be obtained as follows

$$\eta_{max} = \frac{\text{Max no. of information cells in frame}}{\text{Total no. of cells in frame}} \times 100\% = \frac{260 \times 9}{2430} \times 100\% = 96.30\% \tag{13.14}$$

A VC need not start at the first byte of the frame payload. Typically, it begins in one frame and ends in the next. The starting point of a VC (i.e. the location of its first byte) in an STM frame is indicated by a pointer, which keeps track of the phase offset between the two. There is therefore no need for delay-causing re-timing buffers at network nodes, which may be controlled by slightly different clock rates. A higher-order VC and its pointer constitute an administrative unit (AU). The pointer is called an AU pointer. It is 3 bytes long and located in the header part of the STM frame.

The scenario just described applies to the simple case where no intervening multiplexing is required because the input tributary is large enough (e.g. the 140 Mb/s signal) to use up the available information capacity of the STM-1 frame. In this case a VC-4 is formed, followed by the addition of a pointer to give an AU-4. Finally, an SOH is added to the AU-4 to complete the STM-1 frame. A more general case involving the multiplexing (always through byte-interleaving) of several lower-rate signals to fill the STM-1 frame is considered under frame multiplexing.

13.3.3.2.1 Frame Header
Returning to Figure 13.24a we see that the 81-byte frame header consists of 27 bytes for a regenerator section overhead (RSOH), 9 bytes for AU pointers, and 45 bytes for a multiplex section overhead (MSOH). The RSOH is interpreted at regenerators along the transmission path, whereas the MSOH remains intact and is interpreted only at multiplexing points.

● RSOH: The regenerator SOH consists of the following elements:
⚬ A 48-bit FAW A1A1A1A2A2A2, composed using two bytes A1 and A2 only. This marks the start of the STM-1 frame.
⚬ A label C1 that identifies the order of appearance of this frame in the STM-N frame, N = 4, 16, …
⚬ A byte B1 used for error detection by bit interleaved parity (BIP). In general, a BIP-n code provides error protection for a specified section of a transmitted bit stream. The jth bit of the code gives an even parity check on the jth bit in all blocks of n bits in the section. For example, you should verify that the BIP-3 code for protecting the bit stream 100010110111011000101 is given by BIP-3 code = 001.
⚬ Medium-specific bytes M1 to M6, the use of which depends on the type of transmission medium: radio, coaxial cable, or optical fibre.
⚬ A 64 kb/s voice channel E1, for use by system maintenance personnel. This communication channel is usually referred to as engineer's orderwire (EOW), often also called engineering order wire.
⚬ A 64 kb/s user channel F1.
⚬ Data communication channels (DCCs) D1, D2, and D3, providing a 192 kb/s capacity for network management.
● AU pointers: The AU pointer field consists of 9 bytes, three of which indicate the exact location of one VC in the STM-1 frame. The frame may either carry one VC-4, in which case only 3 AU pointer bytes are used, or it may carry up to three VC-3's.
● MSOH: The bytes of the multiplex SOH are assigned as follows:
⚬ Byte B2 is used for error monitoring.
⚬ Bytes K1 and K2 are used for automatic protection switching (APS) in line transmission.
⚬ Bytes D4 to D12 provide a 576 kb/s data communication channel for network management.
⚬ Bytes Z1 and Z2 are reserved for future functions.
⚬ Byte E2 provides a 64 kb/s EOW.
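The BIP-n rule stated for byte B1 is easy to express in code. The sketch below is an illustration of the even-parity definition only, not of any particular equipment implementation; the function name is hypothetical, and the bit stream is assumed to contain a whole number of n-bit blocks.

```python
def bip(bits, n):
    """Bit interleaved parity code BIP-n of a string of '0'/'1' characters.

    The jth code bit is chosen so that the jth bits of all n-bit blocks,
    taken together with the code bit itself, contain an even number of 1's.
    """
    if len(bits) % n != 0:
        raise ValueError("bit stream must contain a whole number of n-bit blocks")
    parity = [0] * n
    for position, bit in enumerate(bits):
        parity[position % n] ^= int(bit)   # running modulo-2 sum per bit position
    return "".join(str(p) for p in parity)


# The BIP-3 example given in the text.
print(bip("100010110111011000101", 3))   # 001
```

Because each code bit depends on only one bit position in every block, a burst of up to n consecutive errors changes at least one parity bit and is therefore detected.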
13.3.3.2.2 Path Overhead (POH)
We indicated earlier that the frame payload may consist of only one virtual container VC-4, or several smaller-sized VCs. An important part of a VC is its POH, which is used to support and maintain the transport of the VC between its entry and exit path terminations. The POH of a VC-4 as well as that of a VC-3 is 9 bytes long and occupies the entire first column of the VC. Figure 13.25 shows the composition of this 9-byte POH.

Figure 13.25 Composition of path overhead (POH) for VC-3 and VC-4. The POH bytes occupy the first column of the VC, ahead of the container, and are assigned as follows:
● J1 = Path trace: a unique identifier for verifying the VC-n connection.
● B3 = BIP-8 code: computed over all bits of the previous container and used for monitoring error.
● C2 = Signal label: shows the composition of the VC.
● G1 = Path status: indicates received signal status.
● F2 = Path user channel: for network operator communication between path equipment.
● H4 = Multiframe indicator: gives multiframe indication or cell start for ATM.
● Z3, Z4, Z5 = Three bytes reserved for use by national network operator.

On the other hand, VC-1 and VC-2 each has a shorter POH that is only one byte long. The bits b7 b6 b5 b4 b3 b2 b1 b0 of this 1-byte POH are assigned as follows:

● b7 b6 = BIP-2 for error monitoring.
● b5 = Far end block error (FEBE) to indicate receipt of a BIP error.
● b4 = Unused.
● b3 b2 b1 = Signal label (L1, L2, L3) to indicate type of VC payload.
● b0 = Remote alarm to indicate receiving failure.
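The bit assignment of the 1-byte POH can be illustrated by unpacking a byte into its fields. This is a minimal sketch that assumes b7 is the most significant bit of the byte; the class and function names are hypothetical.

```python
from typing import NamedTuple


class LowerOrderPOH(NamedTuple):
    bip2: int           # b7 b6: BIP-2 error-monitoring code (0 to 3)
    febe: int           # b5: far end block error flag
    signal_label: int   # b3 b2 b1: type of VC payload (0 to 7)
    remote_alarm: int   # b0: remote alarm flag


def parse_lower_order_poh(byte):
    """Split the 1-byte POH of a VC-1 or VC-2 into its fields (b4 is unused)."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("POH must be a single byte")
    return LowerOrderPOH(
        bip2=(byte >> 6) & 0b11,
        febe=(byte >> 5) & 0b1,
        signal_label=(byte >> 1) & 0b111,
        remote_alarm=byte & 0b1,
    )


print(parse_lower_order_poh(0b10100101))
```

For the byte 10100101 the sketch yields a BIP-2 value of binary 10, FEBE set, signal label 010, and the remote alarm set.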
13.3.3.2.3 Frame Assembly
The process of constituting an STM-1 frame starting from a C-4, which may carry, for example, the 140 Mb/s PDH signal, was described earlier and is illustrated in Figure 13.26. The STM-1 frame may also be constituted using lower-rate tributaries in smaller-sized containers. An illustration of this procedure for C-12, which carries one 2.048 Mb/s E1 signal, is shown in Figure 13.27a. The result of this multiplexing procedure is that the STM-1 frame has been employed to convey 63 E1 signals, each of which can be easily extracted at a network node using a simple add/drop muldex (ADM) without having to unpack the entire frame. For this to be possible, the starting point of every lower-order VC must be indicated by a pointer known as a tributary unit (TU) pointer. The process of adding a pointer to a VC (whether higher order or lower order) is known as aligning. Other multiplexing possibilities are shown in the ITU-defined basic SDH multiplexing structure of Figure 13.27b. Note that Figures 13.26 and 13.27a were obtained by following two different routes in Figure 13.27b. Taking a moment to identify these routes will help you understand how to interpret Figure 13.27b. The following discussion explains the new SDH terms that appear in these figures.

Figure 13.26 Assembly of STM-1 frame from C-4. C-4 (e.g. E4 + fixed stuff) → add POH → VC-4 → add AU pointer → AU-4 → add SOH (RSOH and MSOH) → STM-1.

A lower-order VC together with its TU pointer constitute what is known as a tributary unit, which occupies a certain number of columns in the STM-1 payload area. For example, it is shown in Figure 13.28 that TU-11 has 3 columns, TU-12 has 4, and TU-21 and TU-22 both have 12 columns. An assembly of identical-rate TUs, obtained using byte-interleaving, is known as a tributary unit group (TUG). A TUG-2 consists of one TU-2, or three TU-12's, or four TU-11's. And a TUG-3 consists of one TU-3, or seven TUG-2's. Similarly, an assembly of identical-rate administrative units is known as an administrative unit group (AUG). Only two realisations of an AUG have been defined, namely one AU-4 or three AU-3's. Finally, adding an SOH to an AUG yields an STM-1 frame, and N of these frames may be multiplexed to obtain the higher-capacity transport module STM-N.

13.3.3.3 SONET
The synchronous optical network (SONET) transmission standard was developed in 1988 by the T1X1 committee of the American National Standards Institute (ANSI). It is the forerunner of SDH. Both SONET and SDH are based on the same principles, the most noticeable differences between them being in terminology and the standardised transmission rates. The basic SONET frame is called the synchronous transport signal level 1 (STS-1) or optical carrier level 1 (OC-1). This frame has a duration of 125 μs and contains 810 bytes, which corresponds to a rate of 51.84 Mb/s. The structure of the frame may be represented similarly to Figure 13.24 as a rectangular matrix of 90 columns of 9 bytes each. The first 3 columns (equivalent to 1.728 Mb/s) serve as the header, whereas the remaining 87 columns (or 50.112 Mb/s) are the payload. Thus the STS-1 frame can contain one DS3 signal (of bit rate 44.736 Mb/s) or 28 DS1 signals or 28 × 24 = 672 voice channels. Based on the SDH considerations discussed earlier, individual DS1 signals in the STS-1 frame can be extracted without having to disassemble the entire frame. Higher-capacity frames, called STS-N or OC-N, are obtained by multiplexing N basic frames. In particular, N = 3 gives the STS-3 frame, which has exactly the same capacity (155.52 Mb/s) as the basic SDH frame, namely STM-1. The other standardised SONET and SDH rates that are identical include OC-12 and STM-4 with a line rate of 622.08 Mb/s, OC-48 and STM-16 with a line rate of 2488.32 Mb/s, and OC-192 and STM-64 with a line rate of 9953.28 Mb/s. International transmission is based on SDH, with the required conversions performed at North American gateways. SONET has, however, been around longer than SDH and is currently more widely implemented in North America than SDH is in the rest of the world. As the name suggests, optical fibre is the transmission medium for which SONET was designed. The PDH-based T1 carrier systems in North America were gradually replaced by SONET technology. Offices in a metropolitan area can be linked together in an optical fibre ring network that runs the OC-48 system carrying 48 × 672 = 32 256 channels. An add/drop muldex at each office allows desired channels to be efficiently extracted and inserted.
Figure 13.27 (a) Assembly of STM-1 frame from C-12; (b) basic SDH multiplexing structure.
Figure 13.28 Tributary units (TU-11, TU-12, TU-21 and TU-22). Each cell represents 64 kb/s, i.e. 8 bits per frame duration $T_s$ = 125 μs.
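The SONET rates quoted in this section follow directly from the frame geometry (9 rows, 90 columns, one frame every 125 μs). The short Python sketch below reproduces them; the function names are illustrative only, and the relation STM-N = STS-3N is taken from the statement above that STS-3 and STM-1 have the same capacity.

```python
FRAME_RATE = 8000        # frames per second, i.e. one frame every 125 us
ROWS = 9                 # bytes per column
COLUMNS = 90             # columns in an STS-1 frame
HEADER_COLUMNS = 3       # columns used as header


def column_rate_bps(columns: int) -> int:
    """Bit rate carried by the given number of STS-1 columns."""
    return columns * ROWS * 8 * FRAME_RATE


def sts_rate_bps(n: int) -> int:
    """Line rate of STS-N (OC-N), obtained by multiplexing N basic frames."""
    return n * column_rate_bps(COLUMNS)


def stm_rate_bps(n: int) -> int:
    """Line rate of STM-N, using the equivalence STM-1 = STS-3."""
    return sts_rate_bps(3 * n)


print(f"STS-1 header : {column_rate_bps(HEADER_COLUMNS) / 1e6:.3f} Mb/s")
print(f"STS-1 payload: {column_rate_bps(COLUMNS - HEADER_COLUMNS) / 1e6:.3f} Mb/s")
for n in (1, 3, 12, 48, 192):
    print(f"STS-{n}/OC-{n}: {sts_rate_bps(n) / 1e6:.2f} Mb/s")
print(f"Voice channels on OC-48: {48 * 28 * 24}")
```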
13.3.4 ATM
You may have observed that PDH and SDH techniques are optimised for voice transmission, their basic frame duration of 125 μs being the sampling interval of a voice signal. Nonvoice traffic of diverse bit rates cannot be simultaneously accommodated in a flexible and efficient manner. ATM is a flexible transmission scheme that efficiently accomplishes the following:
● Accommodation of multiple users by statistical TDM, as demonstrated in Figure 13.29. Time slot allocation is not regular as in the (nonstatistical) TDM techniques discussed hitherto. Rather, each user is allocated time slots (and hence bandwidth) as required by their bit rate. A time slot contains a group of bits known as a cell or packet, which consists of user data plus identification and control bits called a header. If in any time slot there are no bits to send then an idle cell is inserted to maintain a constant bit rate (CBR) in the transmission medium.
● Provision of multiple services, such as transmission of voice, text, data, image, video, and high definition television, and connections to LAN, including LAN and WAN interconnections.
● Support of multiple transmission speeds or bit rates (ranging from 2 to 622 Mb/s) according to the requirements of each service.
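The idea of statistical TDM described in the first item can be sketched in a few lines of Python. This is only a toy model under stated assumptions: one cell is sent per time slot, waiting cells are served first-come first-served from a single buffer, and the source name stands in for the cell header. The function name `statistical_mux` and the sample traffic are hypothetical.

```python
from collections import deque

IDLE_CELL = ("idle", b"")


def statistical_mux(arrivals):
    """Return the sequence of cells placed on the line.

    arrivals[t] is the list of (source, payload) cells generated during slot t.
    One cell is transmitted per slot; an idle cell is sent when nothing waits.
    """
    buffer = deque()
    line = []
    for new_cells in arrivals:
        buffer.extend(new_cells)                 # cells queue up as they arrive
        line.append(buffer.popleft() if buffer else IDLE_CELL)
    while buffer:                                # drain cells still waiting
        line.append(buffer.popleft())
    return line


arrivals = [
    [("voice", b"v0")],
    [("video", b"d0"), ("video", b"d1")],        # a burst from the video codec
    [],
    [],
    [("data", b"x0")],
]
for slot, (source, _) in enumerate(statistical_mux(arrivals)):
    print(slot, source)
```

The bursty video source takes two consecutive slots while the other sources are silent, and the one slot with nothing to send is filled by an idle cell.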
ATM is more efficient than PDH and SDH because it dynamically and optimally allocates available network resources (e.g. bandwidth) via cell relay switching. It was the transfer mode or protocol adopted for B-ISDN, which supported all types of interactive point-to-point and distributive point-to-multipoint communication services. These include voice and video telephony, videoconferencing, high-speed data connection, email messaging, information retrieval, multimedia communication, video-on-demand, pay-per-view TV, digital audio broadcast, digital TV broadcast, and high-definition television (HDTV). In what follows, we briefly discuss the features, structure, and network components and interfaces of ATM.
Figure 13.29 Statistical multiplexing: cells from a voice codec, a video codec, and a data source, each carrying a cell header, are combined by a statistical multiplexer onto the transmission medium, with idle cells filling unused slots.
Figure 13.30 ATM cell: a 5-byte header followed by a 48-byte payload.
ATM breaks the information bit stream, whatever its origin (voice, video, text, etc.), into small packets of fixed length. A header is attached to each data packet to enable correct routing of the packets and reassembling of the bit stream at the desired destination. The fixed-length combination of service (or other) data and header is known as an ATM cell, which is shown in Figure 13.30. It is 53 bytes long, with a 48-byte payload that carries service data, and a 5-byte header that carries identification, control, and routing information. The maximum transmission efficiency of ATM is therefore
$$\eta_{\text{ATM}} = \frac{48}{53} \times 100\% = 90.57\%$$
The size of the cell is a compromise between the conflicting requirements of high transmission efficiency and low transmission delay and delay variation. To see this, imagine that the header is maintained at 5 bytes and the cell size is increased to 50 000 bytes. The efficiency would increase to 99.99%, but so would the delay if two sources A and B attempted to send data simultaneously, and the cell from, say, source A had (inevitably) to wait temporarily in a buffer for the cell from B to go first. The waiting time is a switching delay given by the cell duration (in this simple case of waiting for only one cell):
$$\tau_d = \frac{\text{Cell Size (in bits)}}{\text{Line Speed (in bits/second)}} \qquad (13.15)$$
Thus, at a typical line speed of 2 Mb/s and the above cell size, we have $\tau_d$ = 200 ms. It is important to see the implication of this result. A received signal would have to be assembled at a destination from cells some of which were not buffered at all, and some of which were buffered for 200 ms or even longer in the event of a queue at the switch. This amounts to a variation in propagation time or cell delay variation of at least 200 ms, which is unacceptable for delay-sensitive traffic such as voice and video. On top of this, there is also another cell-size-dependent delay known as packetisation delay $\tau_p$. This occurs at the source of real-time signals and is the time it takes to accumulate enough bits to fill one cell.
$$\tau_p = \frac{\text{Cell Payload Size (in bits)}}{\text{Source Bit Rate (in bits/second)}} \qquad (13.16)$$
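Equations (13.15) and (13.16), together with the efficiency expression given earlier, are easy to evaluate for any trial cell size. The Python sketch below does so for a 5-byte header, a 2 Mb/s line, and a 64 kb/s voice source (the values used in this discussion); the function names are illustrative.

```python
HEADER_BYTES = 5


def efficiency(cell_bytes: int) -> float:
    """Fraction of the cell that carries payload."""
    return (cell_bytes - HEADER_BYTES) / cell_bytes


def switching_delay_s(cell_bytes: int, line_bps: float) -> float:
    """Eq. (13.15): time spent waiting for one cell to be sent ahead."""
    return cell_bytes * 8 / line_bps


def packetisation_delay_s(cell_bytes: int, source_bps: float) -> float:
    """Eq. (13.16): time needed for the source to fill one cell payload."""
    return (cell_bytes - HEADER_BYTES) * 8 / source_bps


LINE_BPS = 2e6       # line speed
VOICE_BPS = 64e3     # voice source bit rate

for size in (6, 53, 50_000):
    print(
        f"cell = {size:>6} bytes: "
        f"efficiency = {100 * efficiency(size):6.2f}%, "
        f"tau_d = {1e3 * switching_delay_s(size, LINE_BPS):10.3f} ms, "
        f"tau_p = {1e3 * packetisation_delay_s(size, VOICE_BPS):10.3f} ms"
    )
```

Running the loop over a very small, the standard 53-byte, and a very large cell size makes the compromise visible in one table: efficiency rises with cell size, but so do both delays.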
Thus, for a voice signal (of source bit rate 64 kb/s) and a 50 000-byte cell (with a 5-byte header as above), $\tau_p$ = 6.25 s. Some samples of the signal would be more than six seconds old before even beginning the journey from transmitter to receiver, and this is clearly unacceptable in interactive communication. At the other extreme, if we make the cell size very small, say 6 bytes, then the efficiency is only 16.67%, but the packetisation delay and cell delay variation are also drastically reduced, with $\tau_p$ = 125 μs and $\tau_d$ = 24 μs, which are practically imperceptible.
13.3.4.1 ATM Layered Architecture
The functions performed in an ATM system can be organised hierarchically into layers with clearly defined interfaces. See Figure 13.31.
Figure 13.31 ATM layered architecture (from top to bottom): Higher Layers, ATM Adaptation Layer (AAL), ATM Layer, Physical Layer.
13.3.4.1.1 Physical Layer
The physical layer is divided into two sublayers, the physical medium (PM) sublayer, and the transmission convergence (TC) sublayer. The PM sublayer defines (i) the electrical/optical interface; (ii) the line code, i.e. the voltage waveforms used for representing binary 1’s and 0’s; (iii) the insertion and extraction of bit timing information; and (iv) the transmission medium, e.g. optical fibre, coaxial cable, or wire pair. The TC sublayer performs the following cell functions:
● Transmission frame (e.g. STM-1) generation at the transmitter and recovery at destination.
● Transmission frame adaptation: this is the process of adapting the flow of cells to match the capacity and structure of the transmission frame.
● Generation of header error control (HEC) at sending node, and verification at destination. Any cell arriving at a network node with an error in the header that cannot be corrected is discarded. Note that the ATM network does not perform error monitoring on bits in the payload.
● Cell delineation: this is a process of identifying the cell boundaries in a stream of bits arriving at the destination. If byte boundaries are known (as will be the case if ATM cells are transported using SDH frames), the process is as follows. In the HUNT state, the receiver (using a 5-byte-long window) takes the first 5 bytes, performs HEC calculation on the first 4 and compares the result to the 5th. If they match then those 5 bytes are probably the header, and the receiver skips the next 48 (supposedly payload) bytes and repeats the process on the 5 bytes that follow. If a match is obtained several times (typically 6) in a row then it is concluded that the cell boundaries have been found, and the receiver enters the SYNCH state. However, if the specified number of consecutive matches is not obtained, the receiver simply slides the window by one byte before starting all over in the HUNT state. While in the SYNCH state, if the HEC calculation fails a certain number of times (typically seven) in a row, cell delineation is assumed lost and the receiver returns to the HUNT state.
● Cell rate decoupling: a constant cell rate is maintained by inserting idle cells (as necessary) for transmission, and removing them at the destination. Idle cells are indicated by the following standardised bit pattern in the cell header, listed from the first to the fifth byte: 00000000 00000000 00000000 00000001 01010010. All (48) bytes of the idle cell payload are filled with the bit pattern 01101010.
13.3.4.1.2 ATM Layer
The ATM layer provides the functionality of a basic ATM network and controls the transport of cells through the network of ATM switches. Specific functions performed include the following:
● Generation and extraction of cell header, excluding HEC calculation.
● Multiplexing and demultiplexing of cells. Services are allocated bandwidth on demand by assigning to them only the number of cells they require. This is clearly a more efficient utilisation of transmission system resource than in nonstatistical TDM (e.g. PDH), which allocates a fixed time slot and hence system bandwidth to each service.
● Translation of values of virtual path identifier (VPI) and virtual channel identifier (VCI) at switches and cross-connect nodes. See later.
● Generic flow control (GFC). This controls the rate at which user equipment submits cells to the ATM network.
13.3.4.1.3 ATM Adaptation Layer
The ATM adaptation layer (AAL) defines how the higher layer information bits are bundled into the ATM cell payload. It is divided into two sublayers, namely the segmentation and re-assembly (SAR) sublayer and the convergence sublayer (CS). The functions of these sublayers depend on the class of service, of which ITU has defined four. Generally, the CS divides the higher-level information into suitable sizes, whereas the SAR sublayer segments them into 48-byte chunks, and at the destination re-assembles the payloads into data units for the higher layers. The ITU-defined classes of service are as follows:
● Class A service has a CBR, is connection-oriented (CO), and requires end-to-end timing relation (TR). Examples of a class A service include voice and CBR video. The applicable protocol is identified as AAL1. One byte of the 48-byte payload is used as a header, which performs various functions, such as cell loss detection. If a cell is lost the receiver inserts a substitute cell to maintain the TR between transmitter and receiver. Numbering of the cells by the SAR sublayer enables the detection of cell loss.
● Class B service, like Class A, is both CO and TR, but unlike Class A has a variable bit rate (VBR). A VBR arises in compressed video and audio where the compression ratio and hence bit rate at any time depends on the detailed content of the signal segment.
● Class C service, like Class B, is both CO and VBR, but unlike Class B does not require timing relation, and so is described as a timing-relation-not-required (TRN) service. An example of this service is CO data transfer.
● Class D service, like Class C, is both VBR and TRN, but unlike any of the other services is connectionless (CL). An example is CL data transfer.
Two different AAL protocols have been defined for data transfer (Classes C and D).
● In AAL3/4 the first 2 bytes of the 48-byte payload are used as a header, which gives a message identifier (MID) that allows the multiplexing of several packets onto a single virtual channel. The last two bytes provide a trailer for a CRC to monitor bit errors. Thus, only 44 bytes of the 53-byte ATM cell carry data bits, giving an efficiency of only 83%.
● AAL5 is for those data transfer services that do not require shared media support and protection against mis-sequencing, e.g. point-to-point ATM links. A block of bits is formed in the convergence sublayer consisting of the following: (i) data payload ranging from 0 to 65 535 bytes, (ii) a padding ranging from 0 to 47 bytes to make the block an integer number of 48-byte segments, (iii) a length field, and (iv) a CRC-32 field for error detection. This block is then broken into 48-byte cells in the SAR sublayer and sent sequentially. A payload type identifier (PTI) bit in the ATM cell header is set to 1 to indicate the last cell. Thus, AAL5 makes (almost) the entire 48-byte ATM cell payload available for data. This yields an efficiency value approaching the maximum 90.57%.
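The AAL5 padding rule and the two efficiency figures above can be checked with a small Python sketch. One detail is an assumption not stated in the text: the AAL5 trailer is taken to be 8 bytes in total (2-byte length field, 4-byte CRC-32, plus two further 1-byte fields), as specified in ITU-T I.363.5. All function names are illustrative.

```python
import math

CELL_BYTES = 53
PAYLOAD_BYTES = 48
AAL5_TRAILER_BYTES = 8   # assumption (ITU-T I.363.5): UU + CPI + length + CRC-32


def aal5_padding(data_bytes: int) -> int:
    """Padding (0 to 47 bytes) that makes data + pad + trailer a multiple of 48."""
    return -(data_bytes + AAL5_TRAILER_BYTES) % PAYLOAD_BYTES


def aal5_cells(data_bytes: int) -> int:
    """Number of ATM cells needed to carry one AAL5 block."""
    return math.ceil((data_bytes + AAL5_TRAILER_BYTES) / PAYLOAD_BYTES)


def aal5_efficiency(data_bytes: int) -> float:
    """User data bytes divided by total bytes sent on the line."""
    return data_bytes / (aal5_cells(data_bytes) * CELL_BYTES)


AAL34_EFFICIENCY = 44 / CELL_BYTES   # 44 data bytes in every 53-byte cell

print(f"AAL3/4: {100 * AAL34_EFFICIENCY:.1f}%")
for size in (40, 1500, 48_000):
    print(
        f"AAL5, {size:>5} data bytes: pad = {aal5_padding(size):2d}, "
        f"cells = {aal5_cells(size):4d}, "
        f"efficiency = {100 * aal5_efficiency(size):.2f}%"
    )
```

For long data blocks the AAL5 efficiency approaches, but never reaches, the 48/53 ceiling, whereas short blocks pay proportionally more for padding and trailer.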
13.3.4.1.4 Higher Layers
There are three types of higher layer information, usually identified as the user plane, the control plane, and the management plane. The user plane involves all types of user application information: voice, video, etc. The control plane deals with control information for setting up or clearing calls, and for providing switched services. The management plane provides network management information for monitoring and configuring network elements, and for communication between network management staff.
13.3.4.2 ATM Network Components
Figure 13.32 shows a network reference model for ATM made up of four types of equipment and three standard interfaces.
● The customer equipment (CEQ) or B-ISDN terminal equipment (B-TE) communicates across the network, serving as a source and sink for the video, audio, and data bit streams carried by ATM. These streams are referred to as virtual channels (VCs). The interface between a CEQ and the network is known as the user network interface (UNI), and is standardised to allow interoperability of equipment and network from different manufacturers.
● The ATM multiplexer enables a number of VCs from different UNI ports to be carried over a single transmission line. In ATM parlance we say that the virtual channels have been bundled into a container, called a virtual path (VP) just as several letters are bundled into a postal sack in the postal system for easier transportation to a depot or sorting office. Note, however, that a VP is not synonymous with the physical link, and there may be several VPs on one link, just as there may be several postal sacks in one van.
● The ATM cross-connect routes a VP from an input port to an output port according to a routing table, leaving the contents of each VP (i.e. their VCs) undisturbed. In this respect a cross-connect is analogous to a postal depot where sacks may be moved unopened from one van to another.
● An ATM switch is the most complicated equipment of the ATM network, able not only to cross-connect VPs but also to sort and switch their VC contents. This is like a postal sorting office where some sacks are opened and the letters are re-sorted into new sacks that contain letters with a narrower range of destinations. Other sacks may be switched, i.e. loaded onto a designated van, with all their contents intact.
● The network node interface (NNI) is the interface between network nodes or subnetworks, whereas the internetwork interface (INI) is the interface between two ATM networks. INI includes features for security, control and administration of connections between networks belonging to different operators.
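The difference between a cross-connect and a switch, as described in the list above, can be pictured as two kinds of routing table lookup. The Python sketch below is a toy model only: the table contents, port numbers, and function names are all hypothetical. A cross-connect translates the virtual path and leaves the virtual channel number untouched (the sack is moved unopened), whereas a switch may also translate individual virtual channels (the sack is opened and re-sorted).

```python
# Hypothetical VP routing table: (input port, VPI) -> (output port, new VPI)
VP_TABLE = {
    (1, 5): (3, 9),
}

# Hypothetical VC routing table for VPs that terminate at the switch:
# (input port, VPI, VCI) -> (output port, new VPI, new VCI)
VC_TABLE = {
    (1, 7, 32): (2, 4, 45),
    (1, 7, 33): (3, 9, 46),
}


def cross_connect(port: int, vpi: int, vci: int) -> tuple:
    """Route a whole VP; the VCs inside it are left undisturbed."""
    out_port, out_vpi = VP_TABLE[(port, vpi)]
    return out_port, out_vpi, vci


def switch(port: int, vpi: int, vci: int) -> tuple:
    """Cross-connect the VP if it passes through intact, else switch its VCs."""
    if (port, vpi) in VP_TABLE:
        return cross_connect(port, vpi, vci)
    return VC_TABLE[(port, vpi, vci)]


print(cross_connect(1, 5, 100))   # VP translated, VCI 100 unchanged
print(switch(1, 7, 32))           # VC-level translation
print(switch(1, 7, 33))           # same incoming VP, different outgoing port
```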
Figure 13.32 ATM network and interfaces (showing CEQs, ATM multiplexer, ATM cross-connect, ATM switches, the UNI, NNI and INI interfaces, and a second ATM network).
13.3.4.3 ATM Cell Header
The structure of the ATM cell header is shown in Figure 13.33. The header consists of 5 bytes or 40 bits in all with the following designations.
● 28 bits (at NNI) or 24 (at UNI) are VPI and VCI fields used for routing.
● At UNI, the first 4 bits provide a GFC field, which is used to control cell transmission between the CEQ and the network. The GFC field is only of local significance and is usually set to the uncontrolled access mode with a value of 0000 where it has no effect on the CEQ. Any other value in this field will correspond to the controlled access mode, where the rate of transmission from the CEQ is expected to be modified in some (yet to be specified) manner.
● The PTI field has 3 bits b4 b3 b2. Bit b4 is set to 0 to indicate that the cell is carrying user information. A maintenance/operation information cell is identified with b4 = 1. Bit b3 is a congestion experience bit, which is set to 1 if the cell passes a point of network congestion, to allow a (yet unspecified) reaction. Bit b2 is carried transparently by the network and is currently used by AAL5 (as explained earlier) to indicate the last cell in a block of bits.
● One bit serves as the cell loss priority (CLP) field. When set (i.e. CLP = 1), it indicates that the cell is of lower priority and should be discarded (if need be) before cells with CLP = 0.
● The HEC field has 8 bits, which are used in one of two modes to provide error protection for the cell header. This is especially important to prevent an error in the VPI/VCI values causing a cell to be delivered to the wrong address. In the correction mode, 1-bit errors can be corrected. The detection mode, on the other hand, only allows errors to be detected. The corrupted cell is then simply discarded. Using the correction mode may be appropriate in an optical fibre transmission medium where errors are rare and isolated. The detection mode is, however, preferred in copper transmission media where error bursts are not uncommon. This avoids the risk of a multiple-bit error being mistaken for a single-bit error and erroneously ‘corrected’.
The VPI/VCI values change at each network node, necessitating a recalculation of the HEC field.
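To make the header layout and the HEC recalculation concrete, here is a minimal Python sketch for the UNI header. Two details go beyond the text and should be read as assumptions: the split of the 24 routing bits into an 8-bit VPI and a 16-bit VCI (the standard UNI layout), and the HEC rule itself, taken here from ITU-T I.432 as a CRC-8 with generator x^8 + x^2 + x + 1 whose remainder is added modulo 2 to 01010101. The function names are illustrative, field values are not range-checked, and only the detection mode is modelled.

```python
def hec(first4: bytes) -> int:
    """HEC byte computed over the first four header bytes.

    Assumption (ITU-T I.432, not stated in the text): CRC-8 with generator
    x^8 + x^2 + x + 1, then XOR with 01010101.
    """
    crc = 0
    for byte in first4:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ 0x07) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc ^ 0x55


def build_uni_header(gfc: int, vpi: int, vci: int, pti: int, clp: int) -> bytes:
    """Pack GFC(4) | VPI(8) | VCI(16) | PTI(3) | CLP(1), then append the HEC."""
    word = (gfc << 28) | (vpi << 20) | (vci << 4) | (pti << 1) | clp
    first4 = word.to_bytes(4, "big")
    return first4 + bytes([hec(first4)])


def parse_uni_header(header: bytes) -> dict:
    """Unpack a 5-byte UNI header, rejecting it if the HEC does not match."""
    if len(header) != 5 or hec(header[:4]) != header[4]:
        raise ValueError("header error detected: cell would be discarded")
    word = int.from_bytes(header[:4], "big")
    return {
        "gfc": word >> 28,
        "vpi": (word >> 20) & 0xFF,
        "vci": (word >> 4) & 0xFFFF,
        "pti": (word >> 1) & 0b111,
        "clp": word & 1,
    }


idle = build_uni_header(gfc=0, vpi=0, vci=0, pti=0, clp=1)
print(" ".join(f"{b:08b}" for b in idle))

cell = build_uni_header(gfc=0, vpi=5, vci=33, pti=0b001, clp=0)
print(parse_uni_header(cell))
```

As a consistency check, the first header built above reproduces the idle cell header pattern quoted in Section 13.3.4.1.1. A node that rewrites the VPI/VCI values would simply call `build_uni_header` again with the new values, which recomputes the HEC.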
Figure 13.33 Structure of ATM cell header at: (a) UNI; (b) NNI.
13.3.4.4 ATM Features Summary
In concluding our brief discussion of ATM let us summarise some of the most important features of this transmission technique.
● ATM was the transmission technique adopted for B-ISDN. It did not handle voice signals as well as PDH and SDH systems and was not as efficient for data transmission as the packet switching protocols (e.g. the X.25 whose efficiency approached 99.93%), but it was an excellent compromise for handling all types of services.
● Bits to be transmitted are packaged in fixed-length cells of 53 bytes, which include a 5-byte header. The cells are transported at regular intervals, with idle periods carrying idle (i.e. unassigned) cells.
● Cell sequence integrity is maintained. There is a nonzero cell delay variation and occasional loss of cells, but cells are delivered to their destinations in the right order.
● ATM provides a connection establishment contract whereby the bit rate and quality of service (QOS) are specified by the end-user device at call set-up and guaranteed by the network for the duration of the call. QOS parameters include maximum permissible delay, delay variation, and cell loss ratio.
● Network overload control is implemented. In the event of network congestion, new connections are prevented, and available capacity is allocated to delay sensitive services, namely voice and video.
● ATM functionality follows a layered architecture with a physical layer that is nearly equivalent to the physical layer of the open systems interconnection (OSI) model. This is followed by an ATM layer, which implements a basic ATM network. Finally, there is an ATM adaptation layer (AAL), which interfaces various user, control, and management information to ATM.
13.3.4.5 ATM Versus IP
You may want to revisit Section 1.5.4.2 to find a detailed context for the analogies that we will employ in this brief section. If we liken the communication system resource to a community cake, ATM allows many to eat as much as they need, although it means that some may have to do without when the cake is finished. IP (Internet protocol), on the other hand, ensures that everyone has a smaller piece of the cake, although it means that the piece may sometimes be too small to satisfy some appetites. Furthermore, as a servant or messenger, ATM is a slow-start sprinter who will obediently follow the route the master mapped out in advance and will faithfully report back if things go wrong. IP, on the other hand, is an instant-start jogger who simply grabs the address from the master and attempts to make its own way through thick and thin but will simply walk away without saying a word if it fails. So, which of these two servants would you prefer to charge with the important responsibility of taking your urgent message from A to B? And which of these two community-cake-sharing models would you prefer to live under? In the 1990s the technical world hailed ATM as the switching technology we had all been waiting for. With much hype and fanfare, it was adopted as the transmission technique of B-ISDN, which was at the time seen as the network of the future. Meanwhile, IP continued to quietly grow with the Internet. Many began to fall in love with IP’s egalitarianism and flexibility and to concede that IP’s weaknesses, especially its propensity to be irresponsible, could be fully mitigated by always pairing it with a smart and diligently responsible supervisor such as TCP (transmission control protocol). So gradually, many masters who had hitherto shunned IP began to come around. First it was voice, and we coined the phrase voice over Internet protocol (VoIP) in celebration of the deal. Then it was TV, and we were stunned and gave it the less celebrated name TVoIP. 
However, because data had always been comfortable using IP, we had no need to prefix that service. It was simply just IP. Eventually, and ultimately to no one’s surprise, everything came around. Yes, it became everything over IP, but with no need for a fancy name. The competition between ATM and IP had been decisively won by IP. ‘So, what happened to ATM?’ you ask. Well, it has been relegated to being an in-house servant where it can be generous in sharing the cake with a small number of diners and where it can also be an excellent messenger on an internal route from A to B that rarely needs a detour.
IP has become the switching technology of our twenty-first-century broadband networks, including both the Internet and mobile communication networks since 4G. We will, however, resist the urge to delve any further into IP so that we do not stray too far into networking, which, although extremely exciting, is beyond the scope of this book.
13.4 Code Division Multiplexing
CDM is based on spread spectrum modulation, a technique that was developed in the 1940s for military communications. The message signal, of (unspread) bandwidth $B_m$, is spread in a pseudorandom manner over a bandwidth $B_c \gg B_m$. The bandwidth ratio
$$G = \frac{B_c}{B_m} \qquad (13.17)$$
represents a processing gain, which accounts for an increase in the signal-to-noise ratio at the output of a spread spectrum receiver. Transmitting a signal by spread spectrum modulation yields important benefits.
● The signal is immune to intentional interference, called jamming. A high-power jamming signal is necessarily narrowband and will fail to drown the information signal since only a small fraction of the signal energy is corrupted. More accurately, the process of spread spectrum demodulation at the receiver involves the use of a pseudorandom code, which de-spreads the wanted signal back into a narrow band $B_m$. Interestingly, the effect of this process on the jamming signal is to spread it over a wide band $B_c$. In this way, the jamming signal energy is rendered insignificant within the narrow band occupied by the recovered wanted signal.
● By a similar consideration, spread spectrum signals are immune to frequency-selective fading arising from multipath propagation.
● An unauthorised receiver cannot recover the information signal from the transmitted spread spectrum signal. Simply put, you must have knowledge of the carrier frequency in order to tune into a transmission. And if, as in an equivalent view of spread spectrum, the carrier frequency is not fixed but changes pseudorandomly then the oscillator frequency at the receiver must change exactly in step for demodulation to be possible. Only authorised receivers will know precisely the pseudorandom sequence of carrier frequencies used at the transmitter.
● Spread spectrum signals have a noise-like appearance to other (unauthorised) receivers. Thus, multiple user transmissions can simultaneously occupy the same frequency band with guaranteed message privacy, provided each user’s signal has been spread using a unique pseudorandom code, also referred to as pseudonoise (PN) sequence. This is CDM, which is finding increased nonmilitary applications in satellite and mobile cellular communications. Clearly, as the number of users increases a point is reached where the ‘background noise’ at each receiver becomes excessive leading to unacceptable bit error ratios (BERs).
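As a quick numerical illustration of Eq. (13.17), consider bandwidth values chosen purely for the sake of example (they are assumptions, not figures for any particular system). The processing gain and its decibel equivalent then work out as follows:

```latex
B_m = 10\ \text{kHz}, \qquad B_c = 1.28\ \text{MHz}
\;\;\Rightarrow\;\;
G = \frac{B_c}{B_m} = \frac{1.28 \times 10^{6}}{10 \times 10^{3}} = 128,
\qquad
G_{\text{dB}} = 10 \log_{10} 128 \approx 21.1\ \text{dB}
```

To a first approximation, a jamming signal that is spread over $B_c$ by the despreading process leaves only about 1/G of its power inside the band $B_m$ of the recovered signal, which is the sense in which its energy is rendered insignificant.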
13.4.1 Types of Spread Spectrum Modulation
There are various types of spread spectrum (SS) modulation depending on the method employed to spread the message signal over a wider bandwidth.
● Time-hopping (TH): the message signal is transmitted in bursts during pseudorandomly selected time slots. Figure 13.34a shows the block diagram of a TH transmitter. Let $R_m$ denote the bit rate of the encoded message signal, giving a bit interval $T_m = 1/R_m$, and a message bandwidth $B_m = R_m$. Each time interval $T \gg T_m$ is divided into L equal time slots, and one of these slots is selected pseudorandomly (by opening the gate for this duration) for transmission. To keep up with the message rate, we must take from the buffer an average of $R_m$ bits per second, or $R_m T$ bits in each interval T, which must all be sent during the one time slot (of duration T/L) when the gate is open. Thus, the burst bit rate is
$$R_s = \frac{R_m T}{T/L} = L R_m$$
With PSK modulation, the transmission bandwidth is $B_c = L R_m$, which gives processing gain G = L. A TH receiver is shown in Figure 13.34b. The gate must be opened in precise synchronism with the transmitter, which requires that (i) the gate is controlled by the same PN code used at the transmitter and (ii) both codes are in phase. This synchronisation is very stringent and becomes more difficult to achieve as L increases. Note that the role of the buffer at the receiver is to play out the demodulated bursty bit stream at the uniform rate of the coded message signal.
Figure 13.34 Time-hopping spread spectrum (THSS): (a) transmitter; (b) receiver.
● Frequency-hopping (FH): the message signal is conveyed on a carrier, which hops pseudorandomly from one frequency to another, making $R_h$ hops per second. Figure 13.35a shows a block diagram of a frequency-hopping spread spectrum (FHSS) transmitter. A coded message bit stream first FSK modulates a carrier signal, which is then multiplied in a mixer by a digital frequency synthesiser output, and the sum frequency is selected. The output frequency $f_o$ of the synthesiser is controlled by a PN sequence taken k bits at a time. Noting that an all-zero combination does not occur in a PN sequence, we see that there are $L = 2^k - 1$ different values over which $f_o$ hops. The FSK modulator generates symbols at a rate $R_s$ (one symbol per bit for binary FSK, or per $\log_2 M$ bits for M-ary FSK). If the hop rate $R_h$ is an integer multiple of the symbol rate $R_s$, several frequency hops occur during each symbol interval. This type of FHSS is known as fast-frequency hopping. If, however, $R_h \le R_s$, then one or more symbols are transmitted on each hop, and we have slow-frequency hopping. At the receiver (Figure 13.35b), exactly the same pseudorandom sequence of frequencies $f_o$ is generated and used in a mixer to remove the frequency hopping imposed on the FSK signal.
It is extremely difficult for frequency synthesisers to maintain phase coherence between hops, which means that a noncoherent FSK demodulator must be used at the receiver. The main advantages of FHSS are that synchronisation requirements are less stringent, and larger spread spectrum bandwidths can be more easily achieved to realise higher processing gains $G \approx 2^k - 1$.
Figure 13.35 Frequency-hopping spread spectrum (FHSS): (a) transmitter; (b) receiver.
Direct sequence (DS): the coded message signal, of bit duration $T_m$, is multiplied by a PN bit stream of much shorter bit duration $T_c$, referred to as chip duration. This pseudorandomises the message bit stream and spreads its (null) bandwidth from $B_m = 1/T_m$ to $1/T_c$, which yields a processing gain
\[ G = T_m / T_c \tag{13.18} \]
This highly spread product signal is then used to modulate a carrier by BPSK, QPSK, or M-ary APSK. Direct sequence spread spectrum (DSSS) is the type of spread spectrum modulation employed in CDM-based mobile cellular communication (e.g. the old standard IS-95), and our discussion of CDM will be restricted to this method. One disadvantage of DSSS (compared to FHSS) is that the processing gain that can be achieved is limited by current device technology as $T_m$ decreases (in high information rate systems), since the required low values of $T_c$ become difficult to implement. Timing requirements in DSSS are also more stringent than in FHSS, but less than in time-hopping spread spectrum (THSS).

Hybrid methods: hybrid SS techniques are possible that combine TH, FH, and DS. The most common hybrid technique is DS/FH, which combines the large processing gain possible in FH with the advantage of coherent detection in DS. Each frequency hop carries a DS spread spectrum signal and is coherently detected, but the signals from different hops have to be incoherently combined because of their lack of phase coherence.
13.4.2 CDM Transmitter

Figure 13.36a shows the block diagram of a CDM transmitter based on DSSS modulation. The waveforms associated with this transmitter are shown in Figure 13.36b. Unit-amplitude bipolar waveforms are assumed for convenience. The coded message waveform $v_m(t)$ has the indicated bit duration $T_m$, whereas the PN waveform $v_{pn}(t)$ has a chip duration $T_c$. Note that the waveforms correspond to the case $T_m = 15T_c$. More than one user (say $N$) can be accommodated, with each assigned a unique PN code $v_{pn1}(t), v_{pn2}(t), \ldots, v_{pnN}(t)$, or a unique time shift in a common PN code $v_{pn}(t - \tau_1), v_{pn}(t - \tau_2), \ldots, v_{pn}(t - \tau_N)$.

The PN code generator is in general a linear feedback shift register. Figure 13.37a shows the circuit connection that produces the PN sequence $v_{pn}(t)$ used in Figure 13.36b. The shift register consists of four flip-flops (FF1 to FF4), which are controlled by a common clock. Clock pulses occur at intervals of $T_c$, and at each clock pulse the input state of each flip-flop is shifted to its output. The outputs of FF1 and FF4 are added in an EX-OR gate and fed back as input to the shift register. This gate performs a modulo-2 addition defined as follows
\[ 0 \oplus 0 = 0, \quad 0 \oplus 1 = 1, \quad 1 \oplus 0 = 1, \quad 1 \oplus 1 = 0 \tag{13.19} \]
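The shift-register operation described above can be sketched in a few lines of Python. This is a minimal illustration rather than the book's own implementation: the function name `lfsr_states` is hypothetical, and the initial state (FF1, FF2, FF3, FF4) = (1, 1, 0, 0) is the one indicated in Figure 13.37a. Taps are given as 1-based flip-flop numbers, so `[4, 1]` means the outputs of FF4 and FF1 are added modulo 2 and fed back to the register input.

```python
def lfsr_states(taps, initial_state, n_clocks):
    """Return the register state after each of n_clocks clock pulses.

    taps: 1-based flip-flop numbers whose outputs are XORed (e.g. [4, 1]).
    initial_state: flip-flop outputs (FF1, ..., FFm) before the first clock pulse.
    """
    state = list(initial_state)
    history = []
    for _ in range(n_clocks):
        feedback = 0
        for tap in taps:
            feedback ^= state[tap - 1]       # modulo-2 addition, Eq. (13.19)
        state = [feedback] + state[:-1]      # feedback enters FF1, the rest shift along
        history.append(tuple(state))
    return history


states = lfsr_states([4, 1], (1, 1, 0, 0), 15)
pn_sequence = [s[-1] for s in states]        # the PN code is the output of FF4

for clock, s in enumerate(states, start=1):
    print(clock, s)
print("PN sequence:", pn_sequence)
```

With four flip-flops the register passes through all 15 non-zero states before returning to its starting state, which is what makes the sequence maximum-length.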
Figure 13.36 Direct sequence spread spectrum: (a) transmitter; (b) waveforms $v_m(t)$, $v_{pn}(t)$, $v_{mpn}(t)$, and $v_{dsss}(t)$ for input bits 1001.
Figure 13.37 Maximum-length PN sequence: (a) generator of the [4, 1] code listed in Table 13.2; (b) [5, 4, 3, 2] code generator; (c) [5, 4, 3, 2] code.
Because the feedback taps are located at the outputs of the fourth and first flip-flops, we have what is known as a [4, 1] code generator. In general, a linear feedback shift register that consists of m flip-flops and has feedback taps at the outputs of flip-flops m, i, j, … is identified as [m, i, j, …]. The serial PN code generated is of course the sequence of states of the mth flip-flop. As an example, Figure 13.37b shows the connection of a [5, 4, 3, 2] PN code generator, which gives the cyclic pseudorandom sequence shown in (c). The following discussion clarifies how this sequence is obtained. Let us assume that the [4, 1] PN code generator in Figure 13.37a has the indicated initial register state (FF1, FF2, FF3, FF4) = (1, 1, 0, 0). This is the state before the first clock pulse occurs at time t = 0. The initial feedback input is therefore FF1 ⊕ FF4 = 1. Table 13.2 lists the sequence of flip-flop outputs. After the first clock pulse at t = 0, the initial feedback state is shifted to become the FF1 output, the initial FF1 output becomes FF2 output, etc. Thus, the register state just after
Table 13.2 Sequence of flip-flop outputs in [4, 1] PN code generator. Columns: Time (t); Input to shift register (feedback); Flip-flop output FF1; FF2; FF3; FF4 (PN sequence).

Figure 13.38 Direct sequence spread spectrum: (a) receiver; (b) waveforms of indicated signals $v_{mpn}(t) = v_m(t) \times v_{pn}(t)$, $v_o(t) = v_{mpn}(t) \times v_{pn}(t)$, $v_{od}(t) = v_{mpn}(t) \times v_{pn}(t - T_c)$, and $v_{o2}(t) = v_{mpn}(t) \times v_{pn2}(t)$. (Note: $v_{mpn}(t)$ corresponds to message signal segment $v_m(t) = 101$ spread using $v_{pn}(t)$ = Code [6, 1]; and $v_o(t)$, $v_{od}(t)$, and $v_{o2}(t)$ are the results of de-spreading using locally generated codes, where $v_{pn2}(t)$ = Code [6, 5, 2, 1].)
code $v_{pn}(t)$ used at the transmitter. The multiplication yields a de-spread signal $v_o(t)$, which is then integrated in regular intervals of one message bit duration $T_m$. A decision device compares the integration result $V_o(T_m)$ in each interval to a zero threshold. The (message) bit interval is declared to contain a binary 1 if $V_o(T_m)$ exceeds zero. A decision in favour of binary 0 is taken if $V_o(T_m)$ is less than zero, and a random guess of 1 or 0 is made if $V_o(T_m)$ is exactly equal to zero.

An illustration of the operation of the receiver is given in Figure 13.38b. The waveform $v_{mpn}(t)$ corresponds to a message bit stream segment $v_m(t) \equiv 101$ that was spread at the transmitter using $v_{pn}(t)$ = Code [6, 1]. Multiplying $v_{mpn}(t)$ with a perfectly synchronised Code [6, 1] yields $v_o(t)$. Clearly, this process has somehow extracted the original waveform $v_m(t)$ from a signal $v_{mpn}(t)$ that is noise-like in appearance. You can see that this is the case by noting that $v_{pn}(t) = \pm 1$, so that
\[ v_{pn}^2(t) = 1 \tag{13.21} \]
Hence
\[ v_o(t) = v_{mpn}(t)v_{pn}(t) = [v_m(t)v_{pn}(t)]v_{pn}(t) = v_m(t)[v_{pn}^2(t)] = v_m(t) \tag{13.22} \]
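The de-spreading identity above can be checked numerically at chip level. The sketch below is illustrative only and makes several assumptions that are not part of the original discussion: it uses the shorter [4, 1] code (15 chips, one code period per message bit) instead of Code [6, 1], an initial register state of (1, 1, 0, 0), and hypothetical helper names `m_sequence` and `to_bipolar`. Integration over a bit interval is replaced by a sum over the 15 chips of that bit.

```python
def m_sequence(taps, state):
    """One period of the last flip-flop's output of an LFSR, as 0/1 bits."""
    state = list(state)
    out = []
    for _ in range(2 ** len(state) - 1):
        feedback = 0
        for tap in taps:
            feedback ^= state[tap - 1]
        state = [feedback] + state[:-1]
        out.append(state[-1])
    return out


def to_bipolar(bits):
    """Map binary 1 to +1 and binary 0 to -1 (unit-amplitude bipolar)."""
    return [1 if b else -1 for b in bits]


pn = to_bipolar(m_sequence([4, 1], (1, 1, 0, 0)))   # v_pn: 15 chips per period
L = len(pn)
message = to_bipolar([1, 0, 1])                      # v_m = 101

# Spreading at the transmitter: v_mpn = v_m * v_pn, chip by chip
v_mpn = [bit * chip for bit in message for chip in pn]

# De-spreading with a synchronised code: v_o = v_mpn * v_pn = v_m
v_o = [v_mpn[i] * pn[i % L] for i in range(len(v_mpn))]

# Integrate (sum) over each bit interval and compare with a zero threshold
sums = [sum(v_o[k * L:(k + 1) * L]) for k in range(len(message))]
decisions = [1 if s > 0 else 0 for s in sums]
print(sums, decisions)
```

Because every chip of the code squares to 1, each bit interval sums to the full 15 chips with the sign of the message bit, and the decision device recovers 101.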
The importance of synchronisation is illustrated in the waveform $v_{od}(t)$, which is the result of using the right code [6, 1] but with a misalignment of one chip duration $T_c$. In addition, we illustrate in $v_{o2}(t)$ the effect of using a wrong code $v_{pn2}(t)$ = Code [6, 5, 2, 1]. Proceeding as in Eq. (13.22), we write
\[ \begin{aligned} v_{od}(t) &= v_m(t)[v_{pn}(t)v_{pn}(t - T_c)] \\ v_{o2}(t) &= v_m(t)[v_{pn}(t)v_{pn2}(t)] \end{aligned} \tag{13.23} \]
You can see that in these two cases we have failed to de-spread $v_{mpn}(t)$ since the term in brackets is not a constant but just another PN code (i.e. a random sequence of ±1). The input to the integrator is therefore a randomised version of the original signal $v_m(t)$, the spreading signal being the term in brackets. It means that $v_m(t)$ remains hidden, and the integrator sees only noise-like signals $v_{od}(t)$ and $v_{o2}(t)$. By examining these two waveforms you can see that the decision device will make random guesses of 1 or 0, since the average of these waveforms in intervals of $T_m$ is approximately zero. Note that the process of integration is equivalent to averaging except for a scaling factor.

Figure 13.39 illustrates the code misalignment problem more clearly. Here the output $V_o(T_m)$ of the correlation receiver is plotted against misalignment $\tau$. With perfect synchronisation between transmitter and receiver codes, $\tau = 0$ and $V_o(T_m) = E_m$ for binary 1 and $-E_m$ for binary 0, where $E_m$ is the energy per message bit. We see that the noise margin (i.e. difference between the output levels of the correlation receiver for binary 1 and 0 in the absence of noise) is $2E_m$. As $\tau$ increases, the noise margin decreases steadily causing increased BER, and reaching zero – with $V_o(T_m) = 0$ and BER = 0.5 – at
\[ \tau = \frac{L}{L+1} T_c \tag{13.24} \]
Here $L = 2^m - 1$ is the length of the PN code and $m$ is the length of the linear feedback register that generates the code. For a misalignment of $T_c$ or larger we have
\[ V_o(T_m) = \begin{cases} -E_m/L, & \text{Binary 1} \\ +E_m/L, & \text{Binary 0} \end{cases} \tag{13.25} \]
That is, the noise margin is −2Em /L, which is negative and implies that in a noiseless receiver a binary 1 would always be mistaken for binary 0, and vice versa. However, a practical receiver will always be subject to noise and, because L is large, the value V o (T m ) = ±Em /L will be negligible compared to noise. Thus, the input to the decision device will simply fluctuate randomly about 0 according to the variations of noise. Under this scenario, the output of the decision device will be a random sequence of bits 1 and 0 so that BER = 50%, which is what you get in the long run from random guesses in a binary sample space. It is important to note that these comments are only applicable to a misalignment in the range T c ≤ 𝜏 ≤ (L − 1)T c . Beyond (L − 1)T c , the two codes will begin to approach perfect alignment at 𝜏 = LT c due to their cyclic sequence. We have demonstrated above that a message signal can only be recovered from a spread spectrum signal in a receiver equipped with a synchronised correct PN code. We now demonstrate with the aid of Figure 13.40 that a receiver will also correctly extract its desired signal from a multitude of spread spectrum signals. We show two message waveforms vm1 (t) ≡ 101 and vm2 (t) ≡ 001, which have been spread using different PN codes vpn1 (t) and
Figure 13.39 Output $V_o(T_m)$ of correlation receiver as a function of PN code misalignment $\tau$.

Figure 13.40 Two-channel CDM example showing (a) user data waveforms $v_{m1}(t)$ and $v_{m2}(t)$, (b) received CDM waveform $v_{cdm}(t)$, and (c) waveforms $v_{o1}(t)$ and $v_{o2}(t)$ after each channel at the receiver multiplies $v_{cdm}(t)$ by its allotted PN code.
$v_{pn2}(t)$ assigned to users 1 and 2, respectively. Thus, the signal at the receiver of each user (at the output of the PSK detector) is a composite signal given by
\[ v_{cdm}(t) = v_{m1}(t)v_{pn1}(t) + v_{m2}(t)v_{pn2}(t) \tag{13.26} \]
Receiver 1 multiplies $v_{cdm}(t)$ by its unique code $v_{pn1}(t)$, whereas receiver 2 multiplies by $v_{pn2}(t)$ to obtain
\[ \begin{aligned} v_{o1}(t) &= v_{m1}(t) + v_{m2}(t)v_{pn2}(t)v_{pn1}(t) \\ v_{o2}(t) &= v_{m2}(t) + v_{m1}(t)v_{pn1}(t)v_{pn2}(t) \end{aligned} \tag{13.27} \]
These waveforms are also plotted in Figure 13.40 from which we see that integrating each one over every message bit interval will lead to correct decisions regarding the transmitted bit streams. That is, each receiver successfully extracts only the bit stream intended for it from the mix of bit streams contained in the CDM signal $v_{cdm}(t)$. This is a remarkable result. In general, the signal at the input of a receiver in a CDM system with $N$ simultaneous transmissions is given by
\[ v_{cdm}(t) = v_{m1}(t)v_{pn1}(t) + v_{m2}(t)v_{pn2}(t) + \cdots + v_{mN}(t)v_{pnN}(t) \tag{13.28} \]
The de-spread signal of, say, receiver 1 is therefore
\[ \begin{aligned} v_{o1}(t) &= v_{cdm}(t)v_{pn1}(t) \\ &= v_{m1}(t) + v_{m2}(t)v_{pn2}(t)v_{pn1}(t) + \cdots + v_{mN}(t)v_{pnN}(t)v_{pn1}(t) \\ &= v_{m1}(t) + \text{Noise} \end{aligned} \tag{13.29} \]
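A two-user version of this result can be sketched at chip level. The example is a hedged illustration under stated assumptions: both users share the 15-chip [4, 1] code, with user 2 assigned a time shift of 5 chips (an arbitrary, hypothetical choice; shifted versions of a common code are one of the assignment options mentioned in Section 13.4.2), one code period spans one message bit, both signals arrive with equal amplitude, and the channel is noiseless. The helper names are hypothetical.

```python
def m_sequence(taps, state):
    """One period of the last flip-flop's output of an LFSR, as 0/1 bits."""
    state = list(state)
    out = []
    for _ in range(2 ** len(state) - 1):
        feedback = 0
        for tap in taps:
            feedback ^= state[tap - 1]
        state = [feedback] + state[:-1]
        out.append(state[-1])
    return out


def to_bipolar(bits):
    return [1 if b else -1 for b in bits]


def spread(message, code):
    """Chip-level product v_m * v_pn, one code period per message bit."""
    return [bit * chip for bit in message for chip in code]


def receive(v_cdm, code, n_bits):
    """Multiply by the local code, sum over each bit interval, threshold at zero."""
    L = len(code)
    sums = [sum(v_cdm[k * L + i] * code[i] for i in range(L)) for k in range(n_bits)]
    return sums, [1 if s > 0 else 0 for s in sums]


pn1 = to_bipolar(m_sequence([4, 1], (1, 1, 0, 0)))
shift = 5                                   # hypothetical time shift for user 2, in chips
pn2 = pn1[-shift:] + pn1[:-shift]           # v_pn(t - 5 T_c), cyclic

m1 = to_bipolar([1, 0, 1])                  # v_m1 = 101
m2 = to_bipolar([0, 0, 1])                  # v_m2 = 001

# Composite signal of Eq. (13.26)
v_cdm = [a + b for a, b in zip(spread(m1, pn1), spread(m2, pn2))]

sums1, bits1 = receive(v_cdm, pn1, 3)
sums2, bits2 = receive(v_cdm, pn2, 3)
print(sums1, bits1)
print(sums2, bits2)
```

Each correlator output is the wanted term (±15) plus a small contribution (±1) from the other user, so the decisions are still correct. That small extra term is the "Noise" of Eq. (13.29).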
Similar results apply to every one of the $N$ receivers. The situation is therefore reduced to the familiar problem of digital signal detection in noise. As the number of users $N$ goes up, the noise term in Eq. (13.29) becomes larger, and the probability of error, or BER, increases. So, there is a limit on the number of users in order to guarantee a specified BER. To quantify this limitation, consider an $N$-user CDM system in which the same signal power $P_s$ reaches the receiver from each user. The unwanted accompanying $N - 1$ CDM signals constitute noise in addition to the receiver noise power $P_n$, but typically $P_s \gg P_n$. The carrier-to-noise ratio $(C/N)_i$ of the wanted signal at the receiver input is thus
\[ (C/N)_i = 10\log_{10}\left(\frac{P_s}{P_n + (N-1)P_s}\right) \simeq 10\log_{10}\left(\frac{1}{N-1}\right) \text{ dB} \]
De-spreading by the receiver adds a processing gain $G$ given earlier in Eq. (13.17), so that the carrier-to-noise ratio C/N of the de-spread signal (which is the signal from which a demodulator will attempt to recover the original bit stream) increases to
\[ C/N = (C/N)_i + 10\log_{10}(G) = 10\log_{10}\left(\frac{G}{N-1}\right) \simeq 10\log_{10}\left(\frac{G}{N}\right) \text{ dB} \tag{13.30} \]
We see therefore that $G$ must be much larger than the number of users $N$ if this C/N is to meet the threshold needed by the demodulator to achieve a specified BER. We may rearrange the above equation to obtain the number of users $N$ in terms of $G$ and C/N (in dB)
\[ N = \frac{G}{10^{(C/N)/10}} \tag{13.31} \]
For example, assume that the transmission system includes error control coding which allows the BER at demodulator output to be as high as $10^{-2}$ (because it is followed by a codec that reduces BER from this high value to an acceptable $10^{-7}$). We can then determine the minimum C/N needed by a QPSK modem as follows. We read from Figure 11.45 the value of $E_b/N_o$ needed to achieve BER $10^{-2}$. This gives $E_b/N_o = 4.32$ dB. We then convert this value to C/N, assuming an ideal modem (with no implementation loss) and an ideal Nyquist filter (i.e. a raised cosine filter with roll-off factor $\alpha = 0$) and recalling that QPSK is M-ary PSK with $M = 4$. Thus
\[ C/N = 4.32 + 10\log_{10}(\log_2 M) = 4.32 + 10\log_{10}(2) = 7.33 \text{ dB} \]
So, in this case, Eq. (13.31) gives $N = G/5.41$. This means that to accommodate 100 users we would need $G = 541$, which means that we require a 541-fold increase in bandwidth (from message bandwidth to the bandwidth of the transmitted CDM signal), and this could be a significant barrier. Alternatively, this means that chip duration $T_c$ must be (1/541)th the message bit duration $T_m$ and this may be difficult to achieve with available technology if message bit rate is high.
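The arithmetic of this example is easy to reproduce. The short sketch below simply evaluates Eq. (13.31) and the $E_b/N_o$ to C/N conversion used above; the function names are hypothetical, and the conversion assumes, as in the text, an ideal modem and a roll-off factor of zero.

```python
import math


def cn_db_from_ebno_db(ebno_db, M):
    """C/N in dB for M-ary PSK, assuming an ideal modem and roll-off factor 0."""
    return ebno_db + 10 * math.log10(math.log2(M))


def users_supported(G, cn_db):
    """Eq. (13.31): number of users for processing gain G (a ratio) and C/N in dB."""
    return G / 10 ** (cn_db / 10)


def gain_required(n_users, cn_db):
    """Eq. (13.31) rearranged for the processing gain G."""
    return n_users * 10 ** (cn_db / 10)


cn_db = cn_db_from_ebno_db(4.32, 4)        # QPSK at Eb/No = 4.32 dB
print(round(cn_db, 2))                     # C/N in dB
print(round(gain_required(100, cn_db)))    # G needed for 100 users
print(users_supported(541, cn_db))         # users supported when G = 541
```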
13.4.4 Crucial Features of CDM

Let us conclude our discussion of CDM by emphasising those features that are crucial to the smooth operation of this multiplexing strategy.

13.4.4.1 Synchronisation
The PN code generated at the receiver must be identical to and synchronised with the spreading code used at the transmitter. There is usually no problem with the two codes being identical – unless of course the receiver is unauthorised – so we concentrate on the synchronisation requirement. Let the transmitter code be $v_{pn}(t)$ and the receiver code $v_{pn}(t - \tau)$ – with a misalignment $\tau$. It follows from the receiver block diagram that
\[ \begin{aligned} V_o(T_m) &= \int_0^{T_m} v_{mpn}(t)v_{pn}(t - \tau)\,dt \\ &= \int_0^{T_m} v_m(t)v_{pn}(t)v_{pn}(t - \tau)\,dt \\ &= \pm\frac{E_m}{T_m}\int_0^{T_m} v_{pn}(t)v_{pn}(t - \tau)\,dt \\ &= \pm E_m R_p(\tau) \end{aligned} \tag{13.32} \]
In the above we have used the fact that $v_m(t)$ is a (normalised) constant $\pm E_m/T_m$ in the integration interval spanning one message bit interval. The positive sign applies to binary 1 and the negative sign to binary 0. $R_p(\tau)$ is the autocorrelation function of a periodic signal – in this case $v_{pn}(t)$ – of period $T_m$ and is defined by
\[ R_p(\tau) = \frac{1}{T_m}\int_0^{T_m} v_{pn}(t)v_{pn}(t - \tau)\,dt \tag{13.33} \]
The autocorrelation function of a signal has several interesting properties, which we discuss in Section 3.5.5. Equation (13.33) has been evaluated for a normalised unit-amplitude maximum-length PN sequence of length L and chip duration T c , and is shown in Figure 13.41. By examining this figure, we see the importance of synchronisation. Equation (13.32) states that the output of the correlation receiver is proportional to Rp (𝜏), which from Figure 13.41 is clearly maximum at 𝜏 = 0 and decreases rapidly to −1/L at 𝜏 = T c . You may wish to look back at Figure 13.39 and note that it is actually a plot of Eq. (13.32). In practice, synchronisation is accomplished at the receiver in two stages. First is the acquisition stage, also known as coarse synchronisation, which is performed at the start of signal reception, or after loss of synchronisation, by sliding the timing of the locally generated PN code until a peak output is obtained. To do this the PN code first modulates a carrier, as was done in the transmitter. The resulting signal is then correlated with the incoming spread spectrum signal, and the code alignment is shifted until maximum correlation is achieved. Next follows the tracking stage or fine synchronisation in which a phase-locked loop is used to keep the locally generated PN code in step with the transmitter code.
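The two-valued shape of $R_p(\tau)$ and the idea behind coarse synchronisation can be illustrated at whole-chip misalignments. The sketch below is a simplified baseband illustration, not a description of a practical acquisition circuit: it assumes the 15-chip [4, 1] code with initial state (1, 1, 0, 0), ignores the carrier, noise, and fractional-chip offsets, and uses hypothetical function names. The `true_delay` value is an arbitrary choice for the demonstration.

```python
def m_sequence(taps, state):
    """One period of the last flip-flop's output of an LFSR, as 0/1 bits."""
    state = list(state)
    out = []
    for _ in range(2 ** len(state) - 1):
        feedback = 0
        for tap in taps:
            feedback ^= state[tap - 1]
        state = [feedback] + state[:-1]
        out.append(state[-1])
    return out


pn = [1 if b else -1 for b in m_sequence([4, 1], (1, 1, 0, 0))]
L = len(pn)


def periodic_autocorrelation(code, shift):
    """Discrete form of Eq. (13.33) for a misalignment of `shift` whole chips."""
    n = len(code)
    return sum(code[i] * code[(i - shift) % n] for i in range(n)) / n


def acquire(received, local):
    """Slide the local code one chip at a time and return the shift of peak correlation."""
    n = len(local)
    corr = [sum(received[i] * local[(i - k) % n] for i in range(n)) for k in range(n)]
    return max(range(n), key=lambda k: corr[k])


R = [periodic_autocorrelation(pn, k) for k in range(L)]
print(R)                                   # 1 at zero shift, -1/L at every other chip shift

true_delay = 6                             # hypothetical misalignment in chips
received = [pn[(i - true_delay) % L] for i in range(L)]
print(acquire(received, pn))               # shift at which the correlation peaks
```

The sliding search finds the delay because the correlation is −1/L at every wrong chip alignment and 1 only at the correct one. The tracking stage described above would then hold that alignment.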
Figure 13.41 Autocorrelation of a maximum-length PN sequence of length $L$ and chip duration $T_c$.
13.4.4.2 Cross-correlation of PN Codes
Returning to Eq. (13.29), which deals with the detection of a spread spectrum signal in a multi-user environment, we see that the correlation output of receiver 1 may be written as
\[ \begin{aligned} V_{o1}(T_m) &= \int_0^{T_m} v_{cdm}(t)v_{pn1}(t)\,dt \\ &= \int_0^{T_m} [v_{m1}(t)v_{pn1}(t) + v_{m2}(t)v_{pn2}(t - \tau_2) + \cdots + v_{mN}(t)v_{pnN}(t - \tau_N)]v_{pn1}(t)\,dt \\ &= \pm E_{m1} \pm \frac{E_{m2}}{T_m}\int_0^{T_m} v_{pn1}(t)v_{pn2}(t - \tau_2)\,dt \pm \cdots \pm \frac{E_{mN}}{T_m}\int_0^{T_m} v_{pn1}(t)v_{pnN}(t - \tau_N)\,dt \\ &= \pm E_{m1} \pm E_{m2}R_{12}(\tau_2) \pm \cdots \pm E_{mN}R_{1N}(\tau_N) \end{aligned} \tag{13.34} \]
Here, $\tau_k$ (for $k = 2, 3, \ldots, N$) is the misalignment between receiver 1 and the PN code of the $k$th user transmission. And $R_{1k}(\tau)$ is the cross-correlation function of the PN sequences $v_{pn1}(t)$ and $v_{pnk}(t)$, defined by
\[ R_{1k}(\tau) = \frac{1}{T_m}\int_0^{T_m} v_{pn1}(t)v_{pnk}(t - \tau)\,dt \tag{13.35} \]
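For whole-chip misalignments, Eq. (13.35) reduces to a sum over one code period, which can be evaluated directly. The sketch below does this for the [5, 2] and [5, 4, 3, 2] codes; it is an illustration under assumptions, namely the shift-register convention used for the [4, 1] generator, an initial state of (1, 0, 0, 0, 0) for both registers, and hypothetical function names. Other starting states or conventions would give time-shifted or time-reversed versions of the same codes, so individual values may not line up chip for chip with a plotted curve.

```python
def m_sequence(taps, state):
    """One period of the last flip-flop's output of an LFSR, as 0/1 bits."""
    state = list(state)
    out = []
    for _ in range(2 ** len(state) - 1):
        feedback = 0
        for tap in taps:
            feedback ^= state[tap - 1]
        state = [feedback] + state[:-1]
        out.append(state[-1])
    return out


def to_bipolar(bits):
    return [1 if b else -1 for b in bits]


def cross_correlation(code1, code2, shift):
    """Discrete form of Eq. (13.35) for a misalignment of `shift` whole chips."""
    n = len(code1)
    return sum(code1[i] * code2[(i - shift) % n] for i in range(n)) / n


bits1 = m_sequence([5, 2], (1, 0, 0, 0, 0))
bits2 = m_sequence([5, 4, 3, 2], (1, 0, 0, 0, 0))
pn1, pn2 = to_bipolar(bits1), to_bipolar(bits2)
L = len(pn1)

R12 = [cross_correlation(pn1, pn2, k) for k in range(L)]
print(L, max(abs(r) for r in R12))         # code length and worst-case |R12|
```

The worst-case magnitude of `R12` is non-zero, consistent with the point that zero cross-correlation at every misalignment cannot be achieved in practice.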
Equation (13.34) shows that for there to be no interference between users in a CDM system, the cross-correlation of any two PN codes in the system must be zero. Equivalently, we say that the PN sequences should be mutually orthogonal. This requirement is impossible to meet in practice, and we can only search for classes of PN codes that give acceptably small (i.e. good) cross-correlation. Figure 13.42 shows the cross-correlation of maximum-length PN sequences [5, 2] versus [5, 4, 3, 2], and [7, 1] versus [7, 6, 5, 4, 2, 1]. A class of codes known as Gold sequences gives better cross-correlation properties. A Gold sequence results from an EX-OR combination of two carefully selected m-sequences. In general (and this is apparent from Figure 13.42), the larger the sequence length $L$, the smaller the cross-correlation, which leads to reduced mutual interference. However, processing delay (for example during coarse synchronisation) increases with $L$.

13.4.4.3 Power Control
By examining Eq. (13.34) we see that our failure to find a class of PN codes with zero cross-correlation leads to stringent power control requirements in order to minimise mutual interference in a multi-user CDM system. To be more specific, consider a receiver such as a base station in CDM-based cellular telephony or a satellite transponder that employs a CDM-based multiple access. This receiver is equipped with N correlators, one for each of the N user transmissions. Let V oj (T m ) denote the output of correlator j and Emj the energy per message bit reaching the receiver from the jth user. It is clear that the unwanted contribution to V o1 (T m ) from the other user
Figure 13.42 Cross-correlation $R_{12}(\tau)$ of maximum-length PN sequences: $v_{pn1}(t)$ = Code [5, 2] and $v_{pn2}(t)$ = Code [5, 4, 3, 2] (top); $v_{pn1}(t)$ = Code [7, 1] and $v_{pn2}(t)$ = Code [7, 6, 5, 4, 2, 1] (bottom).
transmissions will depend on $E_{m2}, E_{m3}, \ldots, E_{mN}$ according to Eq. (13.34). Similarly, the unwanted contribution to $V_{o2}(T_m)$ depends on $E_{m1}, E_{m3}, \ldots, E_{mN}$. And so on. You can therefore see that the condition for minimum mutual interference is that
\[ E_{m1} = E_{m2} = E_{m3} = \cdots = E_{mN} \tag{13.36} \]
That is, the transmission from each of the N users must reach the receiver at the same power level. As the radio link from each user to the receiver is typically of a different length and subject to different propagation conditions, we must implement some form of power control in order to achieve Eq. (13.36). One way of implementing power control is by each user terminal monitoring the level of a pilot signal from the base station and adjusting its transmitted power level accordingly. A low pilot level indicates high path loss between terminal and base station, perhaps because the two are far apart, and causes the terminal to increase its transmitted power level. A high pilot level, on the other hand, indicates low path loss, perhaps due to increased proximity between user terminal and base station and causes the terminal to reduce its transmitted power. This technique is known as open loop power control. It assumes identical propagation conditions in the two directions of transmission between terminal and base station, which will often not be the case for a mobile terminal or if the pilot signal is at a different frequency from the user transmission. For example, if the monitored pilot signal undergoes a frequency-selective fade, the user terminal could overestimate the path loss experienced by its transmissions to the base station, which would cause it to increase its transmitted power excessively. There could also be situations when the user terminal grossly underestimates the attenuation on its transmissions to the base station because it is in a deep fade, whereas the pilot signal is not. Closed loop power control solves the above problem but requires a higher operational overhead. Here, the base station monitors the transmission from each terminal and regularly issues a command that causes the terminal to increase or decrease its transmitted power. For example, a one-bit command could be issued every 1.25 ms. 
A binary 1 indicates that the power transmitted by the terminal is too high, the terminal responding by decreasing
its power by 1 dB. A binary 0 indicates that the power reaching the base station from the terminal is too low, and in response the terminal increases its radiated power by 1 dB. The power transmitted by a base station is also controlled based on power measurement reports received from user terminals, which indicate the signal strength reaching each terminal and the number of detected bit errors.

13.4.4.4 Processing Gain
It is illuminating to examine PN codes in the frequency domain in order to make important observations on their signal processing roles. The autocorrelation function $R_p(\tau)$ of an m-sequence $v_{pn}(t)$ was sketched in Figure 13.41. The Fourier transform of $R_p(\tau)$ gives the power spectral density $S_p(f)$ of $v_{pn}(t)$, which furnishes complete information on the frequency content of the PN sequence. We obtain $S_p(f)$ readily from the waveform of $R_p(\tau)$ by noting that $R_p(\tau)$ is a centred triangular pulse train of period $T = LT_c$, pulse width $2T_c$, and hence duty cycle $d = 2T_c/LT_c = 2/L$. The pulse train has amplitude $A = 1 + 1/L$ and has been reduced (shifted vertically down) by a constant level $1/L$. From Chapter 4 (see, for example, Eq. (4.19)), we know that this waveform has the following Fourier series
\[ R_p(\tau) = \frac{Ad}{2} - \frac{1}{L} + Ad\sum_{n=1}^{\infty} \mathrm{sinc}^2(nd/2)\cos(2\pi n f_o \tau); \quad f_o = \frac{1}{T} \]
Substituting the above values, we obtain
\[ R_p(\tau) = \frac{1}{L^2} + \frac{2(L+1)}{L^2}\sum_{n=1}^{\infty} \mathrm{sinc}^2(n/L)\cos(2\pi n f_o \tau); \quad f_o = \frac{1}{LT_c} \]
This indicates that $R_p(\tau)$ has DC component of amplitude $A_o = 1/L^2$ and contains harmonics spaced apart by frequency $f_o = 1/LT_c$, with the $n$th harmonic (of frequency $nf_o$) having amplitude $A_n$ given by
\[ A_n = \frac{2(L+1)}{L^2}\,\mathrm{sinc}^2(n/L) \]
This, being the spectrum of the autocorrelation function Rp (𝜏) of the PN sequence, is the PSD of the sequence and is shown in Figure 13.43 for L = 15. This spectrum provides valuable information. A PN sequence vpn (t) contains sinusoidal components of frequencies up to a (null) bandwidth equal to the reciprocal of the chip duration T c . You may recall from Chapter 7 that the effect of multiplying a baseband signal by a sinusoid of (carrier) frequency f c is to shift the baseband spectrum to be centred at f c . Furthermore, the frequency translation may be removed without distortion to the baseband spectrum simply by performing the multiplication a second time using a carrier of the same frequency and phase. Thus, multiplying a message bit stream vm (t) by a PN sequence will duplicate (in other words spread) the spectrum of vm (t) at intervals of 1/LT c over a bandwidth of 1/T c . A little thought will show that the duplicated spectra are diminished in amplitude in proportion to T c , and that they overlap each other. The composite spread spectrum is therefore roughly uniform (somewhat like that of white noise) and bears little resemblance to the baseband spectrum, which means that the signal has been successfully hidden. Overlapping of the duplicated spectra is important; otherwise, the situation reduces to the straightforward process of sampling where the spectra are distinguishable and the ‘spreading’ can be removed by lowpass filtering. We ensure overlapping by using a spreading sequence that has a dense spectrum – implying a small value of 1/LT c . For a given chip duration T c , this requires that we make the sequence length L very large. A second multiplication by vpn (t) at the receiver has the ‘magical’ effect of reconstructing the message signal spectrum, because the spectra originally translated to f = k/LT c , k = 1, 2, 3, …, L are simply thrown back to f = 0, and these all add to give the original baseband spectrum. 
This is provided the ‘carrier frequencies’ k/LT c have the same phases at transmitter and receiver, which is another way of saying that the two PN codes must be synchronised. You will recall that a time shift of 𝜏 on vpn (t) has the effect of altering the phases of its frequency components by 2𝜋f𝜏. Yes, the multiplication also creates new spectra at 2 k/LT c , but these are filtered out.
Figure 13.43 PSD of PN sequence of length $L = 15$.
Herein lies the spread spectrum processing gain. For interference signals entering the receiver along with the wanted signal, this is their first multiplication with this code, which therefore spreads them over a bandwidth $1/T_c$. However, for the wanted signal it is the second multiplication, and this returns its frequency components back to the (null) bandwidth $1/T_m$, where $T_m$ is the message bit interval. Clearly then, the C/N at the output of the de-spreader is larger than the C/N at the input by the amount
\[ \text{Processing Gain} = 10\log_{10}\left(\frac{T_m}{T_c}\right) \text{ dB} = 10\log_{10}\left[\frac{\text{Spread (null) bandwidth (Hz)}}{\text{Message bit rate (b/s)}}\right] \text{ dB} \tag{13.37} \]
For example, if a spread bandwidth of 1.23 MHz is employed to transmit at a message bit rate of 9.6 kb/s, Eq. (13.37) gives a processing gain of 21.1 dB. For a given message bit rate, processing gain can be increased to realise improved performance in the presence of noise and interference by using a larger spread bandwidth $1/T_c$. But allocated radio bandwidths are limited, and device technology places a limit on how small we can make the chip duration $T_c$.
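As a quick check of the numerical example, Eq. (13.37) can be evaluated directly. The function name below is hypothetical; the inputs are the 1.23 MHz spread bandwidth and 9.6 kb/s bit rate quoted above.

```python
import math


def processing_gain_db(spread_bandwidth_hz, bit_rate_bps):
    """Eq. (13.37): processing gain in dB from spread (null) bandwidth and bit rate."""
    return 10 * math.log10(spread_bandwidth_hz / bit_rate_bps)


gain_db = processing_gain_db(1.23e6, 9.6e3)
print(round(gain_db, 1))                   # processing gain in dB
```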
13.5 Multiple Access

Each of the main multiplexing strategies discussed in this chapter may be utilised as a multiple access technique. It is important to note the distinction between multiplexing and multiple access. Multiplexing is the process of combining multiple user signals for simultaneous transmission as a composite signal on one transmission link, whereas multiple access is concerned with how one communication resource such as a satellite transponder or a terrestrial base station is shared by various transmitting stations, each operating on a separate transmission link. Although the concept is more generally applicable, we will limit our brief discussion to a satellite communication application of multiple access. There are three main types of multiple access, namely frequency division multiple access (FDMA), time division multiple access (TDMA), and code division multiple access (CDMA).
13.5.1 FDMA

In FDMA, each transmitting earth station (ES) is allocated a separate frequency band in the satellite transponder. Figure 13.44 illustrates the sharing of a 36 MHz transponder among three earth stations (ESs). Each ES is allocated an exclusive 10 MHz bandwidth, which is used for simultaneous and continuous transmission as required. Note that a composite FDM signal exists on the downlink, whereas the uplink has separate links, each operating on a separate carrier frequency. To allow the use of realisable filters when extracting a desired channel from the downlink FDM signal, FDMA always includes a guard band (GB) between allocated adjacent sub-bands. In the illustration, a GB of 2 MHz is used. The signal transmitted by each ES may contain a single user signal or it may be a multiplex of several user signals. The former is known as single channel per carrier (SCPC) and the latter as multiple channel per carrier (MCPC).

FDMA is very simple and cheap to implement using well-established filter technology, but it is prone to intermodulation distortion when the transponder amplifier is operated in its nonlinear region. To reduce this distortion, a lineariser is often used to pre-distort the incoming signal at the transponder in such a way as to make up for the subsequent amplification distortion. Additionally, input power is reduced from the level that would saturate the amplifier – an action known as back-off – by the amount (in dB) necessary to ensure operation in a linear region of the transfer characteristic of the combined lineariser/amplifier system.

FDMA capacity may be readily determined in terms of the number of ESs $N$ that can share a transponder bandwidth $B_{xp}$ with each station needing bandwidth $B_{es}$ and a GB $B_g$ maintained between allocated sub-bands
\[ N = \left\lfloor \frac{B_{xp}}{B_{es} + B_g} \right\rfloor \tag{13.38} \]
The bandwidth requirement $B_{es}$ of each transmission depends on the required bit rate and the modulation scheme employed. These relationships are summarised in Section 11.11.
Power issues must be considered when
Figure 13.44 FDMA concept: three earth stations (ES 1, ES 2, ES 3), each allocated 10 MHz with a guard band GB = 2 MHz, transmitting simultaneously through a 36 MHz transponder.
assessing capacity because, while there may be enough bandwidth to accommodate N stations, there may be insufficient power to support that number at the required C/N. When considering power in FDMA, it must always be remembered that a receiving ES shares the advertised transponder power (i.e. satellite transmit effective isotropically radiated power (EIRP)) only in proportion to the ES’s allocated bandwidth (as a fraction of transponder bandwidth).
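The bandwidth-limited FDMA capacity of Eq. (13.38) can be evaluated as follows. This is only the bandwidth side of the calculation; as noted above, available power may impose a lower limit. The function name is hypothetical, and bandwidths are passed as whole numbers of hertz so that integer floor division can be used.

```python
def fdma_capacity(b_xp_hz, b_es_hz, b_g_hz):
    """Eq. (13.38): number of earth stations sharing a transponder (bandwidth limit only)."""
    return b_xp_hz // (b_es_hz + b_g_hz)


# Values from the Figure 13.44 illustration: 36 MHz transponder, 10 MHz per ES, 2 MHz GB
n_stations = fdma_capacity(36_000_000, 10_000_000, 2_000_000)
print(n_stations)
```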
13.5.2 TDMA

The concept of TDMA is illustrated in Figure 13.45. ESs take turns in making use of the entire transponder bandwidth by transmitting bursts of RF signals within centrally allocated and nonoverlapping time slots. If $N$ transmitting ESs share the transponder then over an interval known as a frame duration $T_f$, each station is allocated one time slot for its burst transmission. Therefore, on the uplink, there are separate transmission links operating at the same frequency but at different time intervals, and on the downlink there is the TDMA frame carrying the signal of each of the $N$ stations in separate time slots. A frame duration may range from 125 μs up to 20 ms, but a value of 2 ms is common. A large frame duration yields high frame efficiency but increases delay and complexity. It is always necessary to include a guard time between each time slot to ensure that bursts from stations using adjacent time slots will not overlap in their arrival time at the satellite even if there is a slight variation in the arrival of each burst. Such small variations in arrival time may be caused by timing error at the ES, changes in the distance between an ES and the satellite, and changes in propagation condition.

TDMA is not affected by intermodulation distortion since only one carrier is present in the transponder at any given time. This is an advantage because it allows amplifiers to be operated at their maximum capacity, which saves on bulk and weight (since one no longer must use a larger amplifier at a reduced output due to back-off). However, each ES must transmit at a high symbol rate (to fill the transponder bandwidth) and must do so using
Uplink
Downlink
rom sf s rst on bu tati ion h s iss art sm t e an en Tr iffer d
e tim
Frame duration, Tf Figure 13.45
TDMA concept.
Info bits (ES N)
Guard time Preamble (ES 1)
Preamble (ES N)
Info bits (ES 2)
Guard time
Info bits (ES 1)
Guard time Preamble (ES 2)
Preamble (ES 1)
TDMA Frame (exists only on satellite downlink): Info bits (ES 1)
869
870
13 Multiplexing Strategies
enough signal power to provide an acceptable C/N, taking into consideration the proportionate increase in noise power with bandwidth as N_o B. For this reason, TDMA is not as suitable as FDMA for transmitting narrowband signals from small ESs.
TDMA capacity, in terms of the number of stations N that can share a transponder of bandwidth Bxp using a TDMA frame of duration Tf and guard time Tg between time slots and number of preamble bits nbp in each station’s transmit burst, may be determined in several ways. One way is to start by specifying the required bit rate Res of each station, the M-ary modulation scheme (APSK or PSK) and the raised cosine filter roll-off factor 𝛼. This sets the burst bit rate as
$$R_{\text{burst}} = \frac{B_{xp}}{1 + \alpha}\,\log_2 M \tag{13.39}$$
The duration of the time slot Tes needed by each ES within the TDMA frame of duration Tf is
$$T_{es} = \frac{R_{es}\,T_f}{R_{\text{burst}}} \tag{13.40}$$
We must also allow a total guard time NTg within the frame as well as total time NTp for all stations to send their preamble bits, where
$$T_p = \frac{n_{bp}}{R_{\text{burst}}} \tag{13.41}$$
Since NTes + NTg + NTp = Tf, it follows that the TDMA system capacity is
$$N = \left\lfloor \frac{T_f}{T_g + T_{es} + T_p} \right\rfloor \tag{13.42}$$
where Tf and Tg are in the system specification, and Tes and Tp are calculated using the preceding equations.
Worked Example 13.2
A TDMA system has the following specifications:
● Satellite transponder bandwidth, Bxp = 72 MHz.
● TDMA frame duration, Tf = 2 ms.
● Guard time between time slots, Tg = 1 μs.
● Number of preamble bits in each ES burst, nbp = 148 bits.
Determine the number of stations that can be accommodated if each station operates at a bit rate Res = 20 Mb/s (which includes redundancy due to error control coding) using 16-APSK and a raised cosine filter of roll-off 𝛼 = 0.05. Determine also the TDMA frame efficiency.
The supported burst bit rate is
$$R_{\text{burst}} = \frac{B_{xp}}{1 + \alpha}\,\log_2 M = \frac{72}{1.05} \times 4 = 274.286\ \text{Mb/s}$$
Required time slot per station per frame is
$$T_{es} = \frac{R_{es}\,T_f}{R_{\text{burst}}} = \frac{20 \times 2}{274.286} = 0.1458\ \text{ms}$$
Required time for preamble per station per frame is
$$T_p = \frac{n_{bp}}{R_{\text{burst}}} = \frac{148}{274.286} = 0.5396\ \mu\text{s}$$
The number of stations is therefore
$$N = \left\lfloor \frac{T_f}{T_g + T_{es} + T_p} \right\rfloor = \left\lfloor \frac{2}{1 \times 10^{-3} + 0.1458 + 0.5396 \times 10^{-3}} \right\rfloor = \lfloor 13.57 \rfloor = 13$$
Frame efficiency is the fraction of the frame duration that is devoted to carrying bits from the N stations. Since, in this case, in every 2 ms we have 13 stations using 0.1458 ms each to transmit their data, frame efficiency is
$$\eta = \frac{N T_{es}}{T_f} \times 100\% = \frac{13 \times 0.1458}{2} \times 100\% = 94.79\%$$
This TDMA efficiency is quite high. In practical systems, the guard time will be higher than 1 μs and the number of preamble bits will be larger. This will reduce frame efficiency and hence system capacity.
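The capacity calculation of Eqs. (13.39) to (13.42) can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `tdma_capacity` and its parameter names are assumptions introduced here, and all quantities are taken in base SI units (Hz, s, b/s).

```python
import math

def tdma_capacity(b_xp_hz, t_frame_s, t_guard_s, n_preamble_bits,
                  r_es_bps, m_ary, rolloff):
    """Return (number of stations N, frame efficiency in %)."""
    r_burst = b_xp_hz / (1 + rolloff) * math.log2(m_ary)   # Eq. (13.39)
    t_es = r_es_bps * t_frame_s / r_burst                  # Eq. (13.40)
    t_p = n_preamble_bits / r_burst                        # Eq. (13.41)
    n = math.floor(t_frame_s / (t_guard_s + t_es + t_p))   # Eq. (13.42)
    efficiency = 100 * n * t_es / t_frame_s                # share of frame carrying info bits
    return n, efficiency

# Specifications of Worked Example 13.2
n_stations, eta = tdma_capacity(72e6, 2e-3, 1e-6, 148, 20e6, 16, 0.05)
print(n_stations, round(eta, 2))
```

With the specifications of Worked Example 13.2 this reproduces N = 13 and a frame efficiency of about 94.79%. Increasing `t_guard_s` or `n_preamble_bits` shows how quickly practical overheads erode capacity.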
13.5.3 CDMA
In CDMA, each transmitting ES multiplies its signal by a unique orthogonal spreading code prior to transmission. These signals reach the satellite transponder at the same time and in the same frequency band. Following frequency down conversion and power amplification, a composite CDM signal is transmitted on the satellite downlink, containing N differently spread signals from the sharing ESs. The theoretical details of the CDM signal generation and detection are as discussed under CDM.
Figure 13.46 CDMA system operation. (Uplink: transmit earth station k multiplies its message mk(t) by the output pk(t) of its PN code generator and by the carrier cos(2πfct), followed by filtering and amplification. Downlink: the composite signal is the sum over k = 1 to N of mk(t)pk(t)cos(2πfct). Receive earth station 1, after the LNA and mixer, multiplies by p1(t) to obtain m1(t)cos(2πfct) + nI(t), multiplies by 2cos(2πfct), and integrates over 0 to Tm to recover m1(t).)
Figure 13.46 shows a CDMA system consisting of N transmit ESs. Details of the receiver operations on the incoming downlink signal from the satellite in order to recover
the wanted message signal from any of the N stations are also given in the diagram, using Earth Station 1 as an example. CDMA has several benefits, such as helping to reduce interference from co-channel systems, since the unwanted carrier signals will be spread and therefore mostly rejected by the receiver. It also serves to reduce multipath effect since the reflected waves will be spread by the receiver if their delay exceeds a chip duration. Also, unlike TDMA, coordination between ESs sharing one transponder is not required, although synchronisation of the spreading sequences at transmitter and receiver is essential, as discussed in Section 13.4. Theoretically, CDMA facilitates 100% frequency reuse between beams in a multiple spot beam satellite system. However, CDMA’s main drawback is its low efficiency, expressed in terms of the total achievable data rate of a CDMA system utilising an entire transponder when compared to the data rate of a single carrier that fully occupies the same transponder. We did earlier discuss the constraint on CDMA capacity and derived Eq. (13.31) for number of users in terms of processing gain and C/N. CDMA requires a large contiguous bandwidth in order to achieve the high processing gain required, and this may not be available on a satellite transponder. Furthermore, strict power control must be maintained, as also earlier discussed.
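The spreading and despreading steps can be illustrated with a small baseband sketch in Python. The assumptions here are loud ones: carrier modulation, noise, and filtering are omitted; the stations are taken to be perfectly chip-synchronised; and length-4 Walsh codes (chosen only because they are exactly orthogonal) stand in for the PN codes pk(t). The names `CODES`, `spread`, and `despread` are hypothetical.

```python
# Assumed orthogonal spreading codes (length-4 Walsh codes), one per station.
CODES = {
    1: [+1, +1, +1, +1],
    2: [+1, -1, +1, -1],
    3: [+1, +1, -1, -1],
}

def spread(bits, code):
    """Multiply each bipolar bit (+1/-1) by every chip of the code."""
    return [b * c for b in bits for c in code]

def despread(composite, code):
    """Correlate the composite signal with one code over each bit interval."""
    n = len(code)
    decisions = []
    for i in range(0, len(composite), n):
        corr = sum(x * c for x, c in zip(composite[i:i + n], code))
        decisions.append(1 if corr > 0 else -1)
    return decisions

messages = {1: [+1, -1, +1], 2: [-1, -1, +1], 3: [+1, +1, -1]}

# All stations transmit at the same time in the same band, so the
# transponder carries the chip-by-chip sum of the spread signals.
spread_signals = [spread(messages[k], CODES[k]) for k in messages]
composite = [sum(chips) for chips in zip(*spread_signals)]

recovered = {k: despread(composite, CODES[k]) for k in messages}
print(recovered == messages)
```

Because the assumed codes are mutually orthogonal, correlating the composite signal with one station's code cancels the contributions of the other stations, which is the baseband counterpart of the multiply-and-integrate receiver in Figure 13.46.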
13.5.4 Hybrid Schemes
The multiple access techniques discussed above are often used in combination. For example, a satellite transponder bandwidth may be divided into sub-bands and each sub-band is shared by multiple small ESs using TDMA. This combination of FDMA and TDMA is known as multifrequency TDMA (MF-TDMA). An example is shown in Figure 13.47, where 12 stations share one transponder which has been partitioned into three sub-bands denoted f1, f2, and f3. Four stations share each sub-band using a TDMA frame with four allocated time slots.
Figure 13.47 Example of 12 stations sharing one transponder using MF-TDMA. (Axes: time and frequency. The transponder bandwidth is divided into sub-bands f1, f2, and f3, each carrying its own TDMA frame, labelled TDMA frames 1 to 3.)
Figure 13.48 Hybrid multiple access schemes for a star VSAT network. (Three scenarios, each showing the transponder bandwidth shared between an outbound wideband TDM channel and, respectively, inbound narrowband SCPC channels, inbound MF-TDMA channels in sub-band 1 and sub-band 2 divided into TDMA slots, and inbound CDMA channels.)
Note therefore that the system resource allocated in FDMA is a frequency band, and in TDMA it is a time slot, but in MF-TDMA
the allocation is both a time slot and a frequency band. This allocation is usually dynamic and according to need and may change in a time interval of a few frames. Figure 13.48 shows three hybrid multiple access arrangements for a VSAT (very small aperture terminal) star network implementation. In all three scenarios, half of the transponder bandwidth is used to carry a wideband TDM signal on the outbound link (from the hub station to the VSAT terminals) and the other half is used for the inbound link supporting transmissions from multiple VSAT terminals to the hub. These VSATs share their half of the transponder bandwidth using three different schemes: SCPC FDMA is used in the top scenario, MF-TDMA is used in the middle, and CDMA is used in the bottom. Note therefore that the bottom scenario is a combination of CDMA and FDMA, the top scenario is pure FDMA but with unequal sub-band partitioning, whereas the middle scenario is a combination of MF-TDMA and FDMA.
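The idea that an MF-TDMA allocation is a (sub-band, time slot) pair can be sketched as follows for the 12-station example of Figure 13.47. This is a static, in-order assignment made purely for illustration; as noted above, practical allocation is usually dynamic. The function name `mf_tdma_allocation` and the station labels `ES1` to `ES12` are hypothetical.

```python
from itertools import product

def mf_tdma_allocation(stations, sub_bands, slots_per_frame):
    """Assign each station one (sub-band, time slot) pair, in order."""
    resources = list(product(sub_bands, range(1, slots_per_frame + 1)))
    if len(stations) > len(resources):
        raise ValueError("more stations than (sub-band, slot) pairs")
    return dict(zip(stations, resources))

# 12 stations, 3 sub-bands, 4 time slots per TDMA frame (as in Figure 13.47)
stations = [f"ES{k}" for k in range(1, 13)]
allocation = mf_tdma_allocation(stations, ["f1", "f2", "f3"], 4)
print(allocation["ES1"], allocation["ES5"], allocation["ES12"])
```

Here the first four stations occupy the four slots of sub-band f1, the next four those of f2, and the last four those of f3. A dynamic scheme would recompute this mapping every few frames according to demand.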
13.6 Summary This now completes our study of multiplexing strategies. We started by giving several compelling reasons for multiplexing in modern telecommunications and then presented a nonmathematical discussion of the four strategies of SDM, FDM, TDM, and CDM. This was followed by a more detailed discussion of the last three techniques. FDM is truly indispensable to radio communications and has a very long list of applications. It allows the existence of several audio and TV broadcast houses in one locality, and the simultaneous provision of a large variety of communication services. Capacity enhancement in cellular telephony and satellite communication relies heavily on FDM. Closed media applications of FDM include wavelength division multiplexing (WDM) in optical fibre, which allows literally millions of toll-quality digital voice signals to be transmitted in one fibre. FDM telephony allowing up to 10 800 analogue voice signals to be transmitted in one coaxial cable was popular up till the 1980s. FDM implementation is very straightforward. A frequency band is allocated to a user, and the user’s signal is applied to modulate a suitable carrier thereby translating the signal into its allocated band. The type of modulation technique depends on the communication system. SSB was used in FDM telephony, on–off keying (OOK) is
used in optical fibre, FM was used in first generation cellular telephony and analogue satellite communication, and M-ary APSK is used in modern satellite and terrestrial communications, etc. Each of these modulation techniques is covered in previous chapters. Our discussion of FDM included its application in telephony where a complete set of standards was specified for hierarchical implementation. TDM fits in very well with the techniques of digital switching in modern networks and is ideally suited for transmitting digital signals or bit streams from multiple sources. It allows the advantages of digital communications to be extended to a larger number of simultaneous users in a common transmission medium than would be possible with FDM. We presented a detailed discussion of (nonstatistical) TDM optimised for digital transmission of voice, including the plesiochronous and synchronous digital hierarchies. To show how the requirements of broadband integrated services are satisfied, we presented the statistical TDM technique of ATM, which satisfactorily multiplexes all types of digital signals, including voice, data, and video. However, we noted by way of analogies that IP has beaten ATM to become the preferred transmission technology of the twenty-first century. We also discussed various spread spectrum modulation techniques, including time-hopping, frequency-hopping, and direct sequence. In studying CDM, we demonstrated the importance of several factors, including code synchronisation and cross-correlation, power control, processing gain, and the length of the code sequence. We concluded the chapter with a brief discussion of the application of the above multiplexing strategies for multiple access in satellite communication systems. The suitability of a multiple access technique for a given application or its superiority to other techniques is still open to debate. 
We briefly presented the merits and drawbacks of each technique and provided formulas for system capacity calculations. You should therefore now be better equipped to make an informed choice.
Questions 13.1
Higher line utilisation may be realised by multiplexing 16 voice signals into one 48 kHz group signal. The following procedure has been specified by the ITU for doing this. Each voice signal is limited to frequencies of 0.25–3.05 kHz. Frequency translation of each voice signal to an exclusive passband is accomplished for the odd-numbered channels using eight LSB-modulated carriers at frequencies (kHz) 63.15, 69.15, 75.15, … The even-numbered channels are translated using eight USB modulated carriers at frequencies (kHz) 62.85, 68.85, 74.85, … (a) Draw detailed block diagrams of the multiplexer and demultiplexer for this 16-channel FDM system. (b) Sketch a clearly labelled spectrum (similar to Figure 13.7a) of the 16-channel group signal. (c) Determine the nominal bandwidth and GB of each voice channel. (d) Determine the Q and ℤ factors of the most stringent filter used in this FDM system. (e) Compare your result in (d) to that of a standard 12-channel group signal.
13.2
(a) Sketch a clearly labelled spectrum of the HG signal in the UK FDM system. (b) Determine the Q factor of the most stringent filter in the STE of Figure 13.9.
13.3
Determine the Q and ℤ factors of the most stringent filter in a 10 800-channel UK hierarchical FDM system. How does this compare with a flat-level assembly of the same number of voice channels using subcarriers of 4 kHz spacing starting at 1 MHz?
13.4
Determine the number of CTE, GTE, and STE required to set up a 600-channel Bell FDM system. Draw a block diagram showing the connection of these devices from the second multiplexing stage.
13.5
Frequency gaps or GBs are necessary in FDM signals to allow the use of realisable filters and the transmission of control tones in the gaps. The efficiency of an FDM signal is the percentage of total bandwidth that contains voice frequencies. Determine the efficiency of a 10 800-channel FDM signal in each of the three hierarchical standards, namely UK, Europe, and Bell, discussed in the chapter.
13.6
Determine the number of signals multiplexed and the GBs involved when a group signal (60–108 kHz) is built, according to ITU standards, exclusively from each of the following types of wideband audio signals: (a) 50–6400 Hz (b) 50–10 000 Hz (c) 30–15 000 Hz. If each baseband audio signal is translated by LSB modulation so that half of the GB is on either side of the translated spectrum, determine the set of carrier frequencies required in (a)–(c).
13.7
Examine the frame structures of the E1 and T1 signals shown in Figures 13.16 and 13.19b and calculate the rate of each of the following types of bits for the indicated frame: (a) Framing bits in E1 and T1. (b) Signalling bits in E1 and T1. (c) Signalling bits per channel in E1 and T1. (d) CRC-4 error checking using IB bit in E1. (e) Message bits in T1.
13.8
Justification bits are used in PDH frames to accommodate slight variations in the rates of input tributaries. For example, the E1 signal may vary slightly from its nominal 2048 kb/s rate without hindering the operation of the 2–8 muldex. Examine the frame structures given in Figure 13.18 and determine the allowable range of bit rate variation in the following CEPT PDH signals: (a) E1 (b) E2 (c) E3.
13.9
We showed that the maximum efficiency of SDH is 𝜂 max = 96.30%. Considering the conveyed PDH signals to be the message bits, determine the actual efficiency of the following STM-1 frames: (a) STM-1 assembled as shown in Figure 13.26. (b) STM-1 assembled as shown in Figure 13.27a. (c) Why is actual efficiency significantly lower than 𝜂 max ? (d) Give reasons for the discrepancy between your answers in (a) and (b). (e) In what situation would the assembly procedure of Figure 13.27a be preferred to that of Figure 13.26?
13.10
Determine the rates of the following types of bits in the STM-1 frame of Figure 13.27a: (a) Filler pattern bits (b) POH bits (c) TU pointer bits (d) AU pointer bits (e) SOH bits.
13.11
Repeat all of Question 13.10, except (c), for the STM-1 frame of Figure 13.26.
13.12
The fixed-length ATM cell is a compromise between the requirements of high efficiency in data transmission and low delay in voice and video transmission. Determine the following: (a) The ATM packetisation delay for 64 kb/s voice signals. (b) The ATM cell duration in a transmission medium operating at (i) 2 Mb/s and (ii) 140 Mb/s. (c) The efficiency of an AAL1 ATM cell. Comment on your results in (b) and the impact of line rate on cell delay variation.
13.13
Using a table like Table 13.2, show that the PN sequence of the code generator of Figure 13.37b is as given in Figure 13.37c. You may assume any initial register state, except of course all-zero.
13.14
Draw the block diagram of a [4, 2] PN sequence generator. Determine the PN sequence. Is this an m-sequence? Repeat your analysis for a [4, 3] code, and determine whether it is an m-sequence.
13.15
Determine the processing gain of a CDM system that uses BPSK modulation and an m-sequence spreading code generated by a linear feedback shift register of length 12. Note that the spreading code is periodic, with period equal to the bit interval of the coded message signal.
13.16
A TDMA system has the following specifications: (a) Satellite transponder bandwidth, Bxp = 54 MHz. (b) TDMA frame duration, Tf = 5 ms. (c) Guard time between time slots, Tg = 5 μs. (d) Preamble bits in each ES burst, nbp = 280 bits. This system is to be used to provide broadband connection for 25 rural bank branches and the satellite link will use 64-APSK and a raised cosine filter of roll-off factor 𝛼 = 0.1. Determine the maximum data rate (including any redundant bits for error control coding) of each connection.
Appendix A Character Codes
Table A.1 International Morse Code

Character   Morse Code   Character          Morse Code
A           •—           2                  ••———
B           —•••         3                  •••——
C           —•—•         4                  ••••—
D           —••          5                  •••••
E           •            6                  —••••
F           ••—•         7                  ——•••
G           ——•          8                  ———••
H           ••••         9                  ————•
I           ••           : (colon)          ———•••
J           •———         , (comma)          ——••——
K           —•—          ; (semicolon)      —•—•—•
L           •—••         ?                  ••——••
M           ——           . (period)         •—•—•—
N           —•           ’ (apostrophe)     •————•
O           ———          "                  •—••—•
P           •——•         /                  —••—•
Q           ——•—         - (hyphen)         —••••—
R           •—•          =                  —•••—
S           •••          ) or (             —•——•—
T           —            Attention          —•—•—
U           ••—          Break              —•••—•—
V           •••—         End of Message     •—•—•
W           •——          Error              ••••••••
Communication Engineering Principles, Second Edition. Ifiok Otung. © 2021 John Wiley & Sons Ltd. Published 2021 by John Wiley & Sons Ltd. Companion Website: www.wiley.com/go/otung
Table A.1 (Continued)

Character   Morse Code   Character          Morse Code
X           —••—         Go ahead           —•—
Y           —•——         OK                 •—•
Z           ——••         SOS                •••———•••
0           —————        End of Contact     •••—•—
1           •————        Wait               •—•••
Table A.2 International Telegraph Alphabet No. 2 (ITA-2). A modification of the Baudot code.

Character                          ITA-2 Code   Character                          ITA-2 Code
Letters   Figures                               Letters           Figures
A         -                        11000        Q                 1                11101
B         ?                        10011        R                 4                01010
C         :                        01110        S                 ’ (US¹ = Bell)   10100
D         WRU² (US¹ = $)           10010        T                 5                00001
E         3                        10000        U                 7                11100
F         ! (UD³)                  10110        V                 = (US¹ = ;)      01111
G         & (UD³)                  01011        W                 2                11001
H         £ (US¹ = #) (UD³)        00101        X                 /                10111
I         8                        01100        Y                 6                10101
J         Bell (US¹ = ’)           11010        Z                 + (US¹ = ")      10001
K         (                        11110        Letter shift                       11111
L         )                        01001        Figure shift                       11011
M         .                        00111        Space                              00100
N         ,                        00110        Carriage Return                    00010
O         9                        00011        Line Feed                          01000
P         0                        01101        Blank (Null)                       00000

¹ Where US standard differs from ITA-2, this is indicated in brackets as (US = …).
² WRU ≡ Who are you?
³ Three codes were left undefined (UD) to allow for national variants.
Table A.3
EBCDIC code. Second hexadecimal (hex) digit (0 → F ≡ b8 b7 b6 b5 = 0000 → 1111)
1st hex digit ↓
0 0000
1 0001
0
NUL
DLE
1
SOH
DC1
2
STX
DC2
FS
3
ETX
DC3
WUS
IR
c
l
t
C
4
SEL
RES
BYP
PP
d
m
u
D
5
HT
NL
LF
TRN
e
n
v
E
6
RNL
BS
ETB
NBS
f
o
w
7
DEL
POC
ESC
BOT
g
p
x
8
GE
CAN
SA
SBS
h
q
y
i
r
z
2 0010
3 0011
4 0100
5 0101
DS
SP
&
SOS
RSP
6 0110
7 0111
9
SPS
EM
SPE
IT
RPT
UBS
SM
RFF
‘ ¢
!
^
A 1010
B 1011
{ a
j
∼
A
b
k
s
B
K
:
[ ]
B
VT
CU1
CSP
CU3
.
$
,
#
C
FF
IFS
MFA
DC4