Simulating Wireless Communication Systems: Practical Models in C++ (ISBN 0130222682, 9780130222688)

The practical, inclusive reference for engineers simulating wireless systems.


English; xvi, 571 pages: illustrations; 24 cm; 2004.



SIMULATING WIRELESS COMMUNICATION SYSTEMS

SIMULATING WIRELESS COMMUNICATION SYSTEMS Companion Software Website http://authors.phptr.com/rorabaugh/

C. Britton Rorabaugh

PRENTICE HALL Professional Technical Reference Upper Saddle River, NJ 07458 www.phptr.com

Library of Congress Cataloging-in-Publication Data A CIP catalog record for this book can be obtained from the Library of Congress.


Editorial/Production Supervision: Patti Guerrieri
Cover Design Director: Jerry Votta
Cover Design: Anthony Gemmellaro
Art Director: Gail Cocker-Bogusz
Manufacturing Buyer: Maura Zaldivar
Publisher: Bernard Goodwin
Editorial Assistant: Michelle Vincenti
Marketing Manager: Dan DePasquale

© 2004 Pearson Education, Inc.
Publishing as Prentice Hall Professional Technical Reference
Upper Saddle River, New Jersey 07458

Prentice Hall PTR offers excellent discounts on this book when ordered in quantity for bulk purchases or special sales. For more information, please contact: U.S. Corporate and Government Sales, 1-800-382-3419, [email protected]. For sales outside of the U.S., please contact: International Sales, 1-317-581-3793, [email protected].

Company and product names mentioned herein are the trademarks or registered trademarks of their respective owners.

All rights reserved. No part of this book may be reproduced, in any form or by any means, without permission in writing from the publisher.

Printed in the United States of America

First Printing

ISBN 0-13-022268-2

Pearson Education Ltd. Pearson Education Australia Pty., Limited Pearson Education South Asia Pte. Ltd. Pearson Education Asia Ltd. Pearson Education Canada, Ltd. Pearson Educación de Mexico, S.A. de C.V. Pearson Education — Japan Pearson Malaysia SDN BHD

To Joyce, Geoff, Amber, and Eleanor


CONTENTS

PREFACE  xv

1  SIMULATION: BACKGROUND AND OVERVIEW  1
   1.1  Communication Systems  2
   1.2  Simulation Process  2
   1.3  Simulation Programs  3

2  SIMULATION INFRASTRUCTURE  4
   2.1  Parameter Input  4
      2.1.1  Individual Parameter Values  5
      2.1.2  Parameter Arrays  5
      2.1.3  Enumerated Type Parameters  7
      2.1.4  System Parameters  7
      2.1.5  Signal-Plotting Parameters  8
   2.2  Signals  9
      2.2.1  Signal Management Strategy  10
      2.2.2  SMS Implementation  20
   2.3  Controls  29
   2.4  Results Reporting  30

2A  EXAMPLE SOURCE CODE  33
   2A.1  PracSimModel  33
   2A.2  GenericSignal  38

3  SIGNAL GENERATORS  44
   3.1  Elementary Signal Generators  44
      3.1.1  Unit Step  44
      3.1.2  Rectangular Pulse  45
      3.1.3  Unit Impulse  46
      3.1.4  Software Implementation  47
   3.2  Tone Generators  49
      3.2.1  Software Implementation  50
   3.3  Sampling Baseband Signals  51
      3.3.1  Spectral View of Sampling  53
   3.4  Baseband Data Waveform Generators  54
      3.4.1  NRZ Baseband Signaling  55
      3.4.2  Biphase Baseband Signaling  57
      3.4.3  Delay Modulation  58
      3.4.4  Practical Issues  59
   3.5  Modeling Bandpass Signals  61

3A  EXAMPLE SOURCE CODE  64
   3A.1  MultipleToneGener  64
   3A.2  BasebandWaveform  69

4  RANDOM PROCESS MODELS  78
   4.1  Random Sequences  78
      4.1.1  Discrete Distributions  79
      4.1.2  Discrete-Time Random Processes  82
   4.2  Random Sequence Generators  83
      4.2.1  Linear Congruential Sequences  84
      4.2.2  Software Implementations  90
      4.2.3  Evaluating Random-Number Generators  92
   4.3  Continuous-Time Noise Processes  93
      4.3.1  Continuous Random Variables  94
      4.3.2  Random Processes  97
   4.4  Additive Gaussian Noise Generators  99
      4.4.1  Gaussian Distribution  99
      4.4.2  Error Function  100
      4.4.3  Spectral Properties  101
      4.4.4  Noise Power  102
      4.4.5  Gaussian Random Number Generators  102
   4.5  Bandpass Noise  104
      4.5.1  Envelope and Phase  104
      4.5.2  Rayleigh Random Number Generators  109
   4.6  Parametric Models of Random Processes  109
      4.6.1  Autoregressive Noise Model  110

4A  EXAMPLE SOURCE CODE  112
   4A.1  AdditiveGaussianNoise  112

5  DISCRETE TRANSFORMS  119
   5.1  Discrete Fourier Transform  119
      5.1.1  Parameter Selection  120
      5.1.2  Properties of the DFT  120
   5.2  Decimation-in-Time Algorithms  123
      5.2.1  Software Notes  126
   5.3  Decimation-in-Frequency Algorithms  131
   5.4  Small-N Transforms  136
   5.5  Prime Factor Algorithm  138
      5.5.1  Software Notes  138

5A  EXAMPLE SOURCE CODE  141
   5A.1  FFT Wrapper Routines  141
   5A.2  FFT Engines  141

6  SPECTRUM ESTIMATION  146
   6.1  Sample Spectrum  146
      6.1.1  Software Implementation  147
   6.2  Daniell Periodogram  148
      6.2.1  Software Implementation  149
   6.3  Bartlett Periodogram  151
      6.3.1  Software Implementation  152
   6.4  Windowing and Other Issues  153
      6.4.1  Triangular Window  154
      6.4.2  Software Considerations  155
      6.4.3  von Hann Window  157
      6.4.4  Hamming Window  160
      6.4.5  Software Implementation  161
   6.5  Welch Periodogram  167
      6.5.1  Software Implementation  167
   6.6  Yule-Walker Method  167
      6.6.1  Software Implementation  168

6A  EXAMPLE SOURCE CODE  171
   6A.1  BartlettPeriodogramWindowed  171
   6A.2  GenericWindow  177

7  SYSTEM CHARACTERIZATION TOOLS  182
   7.1  Linear Systems  182
      7.1.1  Characterization of Linear Systems  183
      7.1.2  Transfer Functions  184
      7.1.3  Computer Representation of Transfer Functions  186
      7.1.4  Magnitude, Phase, and Delay Responses  189
   7.2  Constellation Plots  192
      7.2.1  Eye Diagrams  193

7A  EXAMPLE SOURCE CODE  199
   7A.1  CmpxIqPlot  200
   7A.2  HistogramBuilder  203

8  FILTER MODELS  207
   8.1  Modeling Approaches  207
      8.1.1  Numerical Integration  207
      8.1.2  Sampled Frequency Response  208
      8.1.3  Digital Filters  208
   8.2  Analog Filter Responses  209
      8.2.1  Magnitude Response Features of Lowpass Filters  210
      8.2.2  Filter Transformations  210
   8.3  Classical Analog Filters  217
      8.3.1  Butterworth Filters  217
      8.3.2  Chebyshev Filters  218
      8.3.3  Elliptical Filters  222
      8.3.4  Bessel Filters  227
   8.4  Simulating Filters via Numerical Integration  229
      8.4.1  Biquadratic Form  231
      8.4.2  Software Design  232
   8.5  Using IIR Digital Filters to Simulate Analog Filters  234
      8.5.1  Properties of IIR Filters  236
      8.5.2  Mapping Analog Filters into IIR Designs  237
      8.5.3  Software Design  240
   8.6  Filtering in the Frequency Domain  242
      8.6.1  Fast Convolution  242
      8.6.2  Software Design  244

8A  EXAMPLE SOURCE CODE  247
   8A.1  Classical Filters  247

9  MODULATION AND DEMODULATION  262
   9.1  Simulation Issues  262
      9.1.1  Using the Recovered Carrier  263
   9.2  Quadrature Phase Shift Keying  264
      9.2.1  Nonideal Behaviors  266
      9.2.2  Quadrature Modulator Models  269
      9.2.3  Correlation Demodulator Models for QPSK  270
      9.2.4  Quadrature Demodulator Models  273
      9.2.5  QPSK Simulations  275
      9.2.6  Properties of QPSK Signals  279
      9.2.7  Offset QPSK  282
   9.3  Binary Phase Shift Keying  286
      9.3.1  BPSK Modulator Models  286
      9.3.2  BPSK Demodulation  287
      9.3.3  BPSK Simulations  289
      9.3.4  Properties of BPSK Signals  290
      9.3.5  Error Performance  292
   9.4  Multiple Phase Shift Keying  293
      9.4.1  Ideal m-PSK Modulation and Demodulation  293
      9.4.2  Power Spectral Densities of m-PSK Signals  295
      9.4.3  Error Performance  298
   9.5  Frequency Shift Keying  299
      9.5.1  FSK Modulators  303
   9.6  Minimum Shift Keying  306
      9.6.1  Nonideal Behaviors  306
      9.6.2  MSK Modulator Models  309
      9.6.3  Properties of MSK Signals  312

9A  EXAMPLE SOURCE CODE  315
   9A.1  MskModulator  316
   9A.2  MpskOptimalDemod  320

10  AMPLIFIERS AND MIXERS  325
   10.1  Memoryless Nonlinearities  326
      10.1.1  Hard Limiters  326
      10.1.2  Bandpass Amplifiers  327
   10.2  Characterizing Nonlinear Amplifiers  342
      10.2.1  AM/AM and AM/PM  342
      10.2.2  Swept-Frequency Response  343
   10.3  Two-Box Nonlinear Amplifier Models  344
      10.3.1  Filter Measurements  344

10A  EXAMPLE SOURCE CODE  350
   10A.1  NonlinearAmplifier  350

11  SYNCHRONIZATION AND SIGNAL SHIFTING  356
   11.1  Shifting Signals in Time  356
      11.1.1  Delaying Signals by Multiples of the Sampling Interval  357
      11.1.2  Advancing Signals by Multiples of the Sampling Interval  360
      11.1.3  Continuous-Time Delays via Interpolation  368
   11.2  Correlation-Based Delay Estimation  385
      11.2.1  Software Implementation  387
   11.3  Phase-Slope Delay Estimation  388
   11.4  Changing Clock Rates  393

11A  EXAMPLE SOURCE CODE  398
   11A.1  DiscreteDelay  398

12  SYNCHRONIZATION RECOVERY  406
   12.1  Linear Phase-Locked Loops  407
   12.2  Digital Phase-Locked Loops  412
      12.2.1  Phase-Frequency Detector  412
   12.3  Phase-Locked Demodulators  424
      12.3.1  Squaring Loop  424
      12.3.2  Costas Loop  426

12A  EXAMPLE SOURCE CODE  430
   12A.1  DigitalPLL  430

13  CHANNEL MODELS  440
   13.1  Discrete Memoryless Channels  440
      13.1.1  Binary Symmetric Channel  440
      13.1.2  Other Binary Channels  441
      13.1.3  Nonbinary Channels  443
   13.2  Characterization of Time-Varying Random Channels  449
      13.2.1  System Functions  449
      13.2.2  Randomly Time-Varying Channels  455
   13.3  Diffuse Multipath Channels  459
      13.3.1  Uncorrelated Tap Gains  460
      13.3.2  Correlated Tap Gains  461
   13.4  Discrete Multipath Channels  463

14  MULTIRATE SIMULATIONS  465
   14.1  Basic Concepts of Multirate Signal Processing  465
      14.1.1  Decimation by Integer Factors  466
      14.1.2  Interpolation by Integer Factors  466
      14.1.3  Decimation and Interpolation by Noninteger Factors  468
   14.2  Filter Design for Interpolators and Decimators  469
      14.2.1  Interpolation  471
      14.2.2  Decimation  480
   14.3  Multirate Processing for Bandpass Signals  487
      14.3.1  Quadrature Demodulation  487
      14.3.2  Quadrature Modulation  487

15  MODELING DSP COMPONENTS  491
   15.1  Quantization and Finite-Precision Arithmetic  491
      15.1.1  Coefficient Quantization  491
      15.1.2  Signal Quantization  495
      15.1.3  Finite-Precision Arithmetic  495
   15.2  FIR Filters  496
   15.3  IIR Filters  501

16  CODING AND INTERLEAVING  506
   16.1  Block Codes  506
      16.1.1  Cyclic Codes  507
   16.2  BCH Codes  509
   16.3  Interleavers  513
      16.3.1  Block Interleavers  513
      16.3.2  Convolutional Interleavers  514
   16.4  Convolutional Codes  515
      16.4.1  Trellis Representation of a Convolutional Encoder  518
      16.4.2  Viterbi Decoding  519
   16.5  Viterbi Decoding with Soft Decisions  525

A  MATHEMATICAL TOOLS  532
   A.1  Trigonometric Identities  532
   A.2  Table of Integrals  534
   A.3  Logarithms  536
   A.4  Modified Bessel Functions of the First Kind  536
      A.4.1  Identities  537

B  PROBABILITY DISTRIBUTIONS IN COMMUNICATIONS  538
   B.1  Uniform Distribution  538
   B.2  Gaussian Distribution  538
   B.3  Exponential Distribution  539
   B.4  Rayleigh Distribution  540
      B.4.1  Relationship to Exponential Distribution  541
   B.5  Rice Distribution  541
      B.5.1  Marcum Q Function  542

C  GALOIS FIELDS  543
   C.1  Finite Fields  543
      C.1.1  Fields  545
   C.2  Polynomial Arithmetic  545
   C.3  Computer Generation of Extension Fields  551
      C.3.1  Computer Representations for Polynomials  552
      C.3.2  Using a Computer to Find Primitive Polynomials  552
      C.3.3  Programming Considerations  557
   C.4  Minimal Polynomials and Cyclotomic Cosets  559

D  REFERENCES  563

INDEX  566

PREFACE

Modern communications systems and the devices operating within these systems would not be possible without simulation, but practical information specific to the simulation of communications systems is relatively scarce. My motive for writing this book was to collect and capture in a useful form the techniques that can be used to simulate a wireless communication system using C++.

It has been my experience that organizations newly confronted with a need to simulate a communication system are in a rush to get started. Consequently, these organizations will purchase a commercial simulation package like SPW or MATLAB Simulink without even considering the alternative of constructing their own simulation using C++. In the beginning, progress comes quickly as simple systems are configured from standard library models. Only when they begin to model the more complex proprietary parts of their systems do these organizations begin to realize how much control and flexibility they sacrificed in going with a commercial package. It is not possible for any library of precoded models to be absolutely complete. There will always be a need to build a highly specialized model or make modifications to existing models. A user attempting to do either, using a commercial package, usually spends more time dealing with the rules and limitations of the simulation infrastructure than with the details of the model algorithms themselves.

In the mid 1990s, I was the architect and lead designer for a proprietary simulation package that was used to simulate the wireless data communication links in several very large U.S. defense systems. This package wasn’t perfect—software never is—but I drew upon this experience, and while writing this book, I developed a simpler simulation package that avoids many of the complexities and objectionable features of my earlier effort. This new package is called PracSim, which is short for Practical Simulation.

All of the source code for the models and infrastructure comprising the PracSim package is provided on the Prentice Hall Web site (http://authors.phptr.com/rorabaugh/). Examples of this code are presented and discussed throughout the book, but there is far too much code to include it all in the text. The library of PracSim models is not intended to be complete, but rather to provide a foundation that users can modify or build upon as needed to capture the nuances of the particular systems they are attempting to model. I didn’t keep accurate records, but I’m sure that construction of the PracSim software took far more time than the actual writing of the text.

I would like to thank my wife Joyce, son Geoffrey, daughter Amber, and mother-in-law Eleanor for not complaining too much about all the time I spent on this project and for dealing with all of the household problems that I never seemed to have time for. I would also like to thank my editor, Bernard Goodwin, for his patience despite the numerous times that I postponed delivery of the final manuscript.

Chapter 1

SIMULATION: BACKGROUND AND OVERVIEW

Modern communications systems and the devices operating within these systems would not be possible without simulation. The expanded use of digital signal processing techniques has spawned cell phones and wireless transceivers that offer incredible performance and features at a per-unit cost that puts them within the reach of nearly everyone. However, these low per-unit costs are achieved through mass production of hundreds of thousands or even millions of units from a single design. The design of a new cell phone or wireless modem for a PDA is a very complex and expensive affair. Because of the complexity in such devices, it is not practical to breadboard prototypes for testing until after the design has been exhaustively tested and honed using simulation.

Even after a new device has been prototyped, it is usually impractical to test it under every possible combination of operating conditions. For example, the nature of CDMA and GSM cellular phone systems is such that all of the phones in a given area unavoidably interfere with each other. The phones and base stations all include processing to mitigate this interference, but the severity of the interference and the effectiveness of the countermeasures depend upon the relative locations, with respect to the base station tower, of all the potentially interfering phones. Assessment of the interference is complicated by the fact that the phones can individually vary their transmit powers via power-control loops executing in the phones or in response to commands from the base station. Analysis is impossible and exhaustive testing is impractical. Simulation using carefully constructed models of the phones and base station is the only answer. In the design of nearly any type of communications equipment, simulation provides an inexpensive way to explore possibilities and design trades before the more expensive process of prototyping is initiated.

1.1 Communication Systems

There are many aspects to the operation of a large complex communication system, and simulations of various kinds can be used to assess the system’s performance with respect to these various aspects. Consider a typical cellular phone system. At the lowest level, there is the radio link between the mobile phone and the base station tower. The performance of this link is degraded by additive noise, interference from other phones, interference from other towers, interference from other noncellular man-made sources, attenuation of the RF signal, and multipath propagation. The system architects can employ a number of techniques to combat these sources of degradation. These techniques include selection of modulation technique, transmit power control, improved receiver sensitivity, diversity combining, error-correction coding, interleaving, equalization, RAKE demodulator designs and interference cancellation. Analytically assessing the performance of these techniques in various combinations is always difficult and often impossible. Simulation is often the only practical way to estimate the performance of the link without actually building and testing it. The simulation techniques presented in this book are concerned with the performance of this link. Other aspects of the mobile-to-base-station link (such as the capacity and throughput limitations of a particular multiple-access protocol under various traffic-loading conditions) involve discrete event simulations of the sort used to model local area network protocols and are covered elsewhere [1]. Another variant of discrete event simulation would be needed to assess the performance of a particular geographic deployment of a cell cluster with respect to tower-to-tower handover of fast moving mobiles on a nearby interstate highway.

1.2 Simulation Process

The simulation process begins with an analysis of the system to be simulated. Obviously, the nature of this analysis depends upon the nature of the system and the maturity of the design. If the system has already been completely designed, the simulation effort can immediately focus on the selection or development of high-fidelity models for constituent parts of the system. On the other hand, if the simulation effort is being mounted for the purpose of assisting in system architecture decisions, the effort might begin with textbook models of channel impairments and idealized subsystems.

In many satellite communication systems, the transmit power amplifiers onboard the satellite introduce significant amounts of signal distortion, and the rest of the system must be designed to tolerate or mitigate this distortion. In the early stages of architecting the system, simulations using a detailed model of the power amplifier along with idealized, or perhaps “typical-performance,” models of other components might be used to decide whether the amplifier-induced distortions are best mitigated by predistortion at the transmitter, equalization at the receiver, or some combination of both.

For early-stage, “broad-brush” estimates of system performance, simulations can use idealized models that are implemented directly from textbook descriptions of modulators, demodulators, codecs, and equalizers. These are the types of models usually included with commercial simulation packages. In other situations, high-fidelity models that include detailed second- and third-order behaviors of the actual devices must be used. At 110 bits per second, an equalizer can be implemented using digital signal processing (DSP) techniques, and performance can be freely traded against cost and complexity using “canned” models that capture the quantization strategy of the proposed implementations. At one gigabit per second, equalizers must be implemented using analog techniques, and the irreducible distortions that remain in a state-of-the-art design must be captured in a handcrafted model that is validated against measurements and circuit-level simulations of the proposed or brassboard device. Models of nonlinear power amplifiers almost always need to be validated against measurements of the as-built device.

1.3 Simulation Programs

Thirty years ago, the few simulations of truly large communication systems were performed using large, hand-coded FORTRAN programs. Because of the computation-intensive nature of simulation and the limited computer speeds available at the time, these programs were tightly integrated monoliths of code that used subroutines only when portions of the code needed to be written in assembly language. Now simulations are performed using modular packages—either proprietary or COTS—that emphasize flexibility and ease of use.

PracSim is a modular set of simulation models and connective infrastructure that was developed during the writing of this book. The infrastructure is discussed at length in Chapter 2, and the models are mentioned throughout all of the other chapters. The source code for both the infrastructure and models is available on the Prentice Hall Web site (http://authors.phptr.com/rorabaugh/). The models include various sources of performance degradation that are often left out of the “textbook” models included with some of the popular commercial simulation packages. However, these models cannot be considered truly “industrial grade” because their internal coding is structured for tutorial clarity rather than for absolute optimum execution speed.

Chapter 2

SIMULATION INFRASTRUCTURE

Among designers of real-time software, there is a tendency to shun object-oriented programming in C++ in favor of hand-optimized programming in a mix of C and assembly. Because they often involve very short sampling intervals and very long durations, simulations can take a long time to execute. The lengthy execution time creates an incentive to apply runtime optimization techniques to simulation programs. However, simulation tools also need to be convenient and easy to use correctly. When designing the simulation software for this book, the convenience and robustness of an object-oriented approach won out over execution speed advantages potentially offered by other approaches.

2.1 Parameter Input

Simulations, and the models from which they are constructed, require a number of input parameters. Usually the total number of parameters for a simulation is large enough to make it impractical to interactively enter the parameters each time the simulation is executed. PracSim provides a capability to read input parameters from a simply formatted text file. The names of the input file and several output files are tied to the name of the simulation. The simulation’s name is established at the beginning of the main program by the macro definition for SIM_NAME. To name the simulation BpskSim, the definition would be

   #define SIM_NAME "BpskSim\0"

An extract from a typical parameter file is shown in Table 2.1. The file is divided into sections for the system and each constituent model. Each new section begins with a line containing a single dollar sign. The second line of each section is the name of the model instance to which the parameters pertain. The section for system-level parameters uses system for the name of the model instance. All of the other sections use the instance names that are passed into each model constructor by main. The final section is tagged SignalPlotter and specifies which system-level signals are to be plotted, along with some constraints on how they are to be plotted. The file ends with a line containing $EOF.
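The file layout just described is simple enough that its essential parsing logic can be sketched in a few lines of C++. The sketch below is illustrative only: ReadParmSections is a hypothetical name, and the real PracSim ParmFile class is organized quite differently, but it shows the section structure (a lone "$", then the instance name, then name = value lines, terminated by "$EOF") being consumed.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Hypothetical reader for the parameter-file layout described above.
// Returns a map from model-instance name to its "name = value" settings,
// with values kept as raw strings.
std::map<std::string, std::map<std::string, std::string>>
ReadParmSections(std::istream& in) {
  std::map<std::string, std::map<std::string, std::string>> sections;
  std::string line, instance;
  while (std::getline(in, line)) {
    if (line == "$EOF") break;        // explicit end-of-file marker
    if (line == "$") {                // new section: next line names the instance
      std::getline(in, instance);
      continue;
    }
    auto eq = line.find('=');
    if (eq == std::string::npos || instance.empty()) continue;
    auto trim = [](std::string s) {
      auto b = s.find_first_not_of(" \t");
      auto e = s.find_last_not_of(" \t");
      return (b == std::string::npos) ? std::string() : s.substr(b, e - b + 1);
    };
    sections[instance][trim(line.substr(0, eq))] = trim(line.substr(eq + 1));
  }
  return sections;
}
```

A real implementation would also convert the value strings to typed values on demand, which is exactly the role the GetDoubleParm family of methods plays in PracSim.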

2.1.1

Individual Parameter Values

To read a value into a double variable named Pulse_Duration, the constructor for the wav_gen instance of the BitsToWave model would have to invoke ParmFile::GetDoubleParm using the syntax:

Pulse_Duration = ParmInput->GetDoubleParm("Pulse_Duration");

In order for this method to be successful, the input file section for model instance wav_gen must contain a line that assigns a value to Pulse_Duration, such as

Pulse_Duration = 1.0

The header file parmfile.h provides a number of macro definitions that can be used to simplify this syntax to

GET_DOUBLE_PARM(Pulse_Duration);

Similar macro definitions are provided for other parameter types:

GET_INT_PARM(X)
GET_BOOL_PARM(X)
GET_LONG_PARM(X)
GET_FLOAT_PARM(X)
GET_STRING_PARM(X)

2.1.2

Parameter Arrays

To read values into a double array named Tone_Gain, the constructor for the SinesInAwgn model would have to invoke ParmFile::GetDoubleParmArray using the syntax


Table 2.1 Extract from parameter file.

$
system
Date_In_Short_Rpt_Name = false
Date_In_Full_Rpt_Name = false
Max_Pass_Number = 50
$
symb_gen
Initial_Seed = 7733115
Bits_Per_Symb = 3
$
m_psk_mod
Bits_Per_Symb = 3
Samps_Per_Symb = 16
Symb_Duration = 1.0
$
spec_analyzer_1
Kind_Of_Spec_Estim = SPECT_CALC_BARTLETT_PDGM
Num_Segs_To_Avg = 600
Seg_Len = 4000
Fft_Len = 4096
Norm_Factor = 1.0
Hold_Off = 0
Psd_File_Name = test_sig_psd
Freq_Norm_Factor = 1.0
Output_In_Decibels = true
Plot_Two_Sided = true
Plot_Relative_To_Peak = true
Halt_When_Completed = false
Num_Bits_Per_Symb = 3
Time_Const_For_Pwr_Mtr = 30.0
Seed = 69069
Sig_Pwr_Meas_Enabled = true
Outpt_Pwr_Scaling_On = false
Sig_Filtering_Enabled = false
$
SignalPlotter
Num_Plot_Sigs = 2
symb_vals, 0.0, 500.0, 1, 1, 0
modulated_signal, 0.0, 500.0, 1, 0, 0
$
$EOF


Tone_Gain = ParmInput->GetDoubleParmArray("Tone_Gain\0", Tone_Gain, Num_Sines);

where Num_Sines is the number of elements to be read into the Tone_Gain array. If Num_Sines equals 3, and the input file section for the instance of SinesInAwgn being initialized contains a line of the form

Tone_Gain = 5.0,100.0,5.0

then Tone_Gain[0] is set to 5.0, Tone_Gain[1] is set to 100.0, and Tone_Gain[2] is set to 5.0. The header file parmfile.h provides a macro definition that can be used to simplify this syntax to

GET_DOUBLE_PARM_ARRAY(Tone_Gain,Num_Sines);

2.1.3

Enumerated Type Parameters

To improve the readability of the parameter input files, PracSim includes a number of enumerated types for specifying various model options. For example, the DiscreteDelay model described in Chapter 11 has four different operating modes, each represented by a value of the enumerated type DELAY_MODE_T. The possible values are DELAY_MODE_NONE, DELAY_MODE_FIXED, DELAY_MODE_DYNAMIC, and DELAY_MODE_GATED. These values are read from the parameter input file using the function GetDelayModeParm, which is provided in file delay_modes.h. This function is not a member of the ParmFile class. Table 2.2 summarizes the other enumerated types implemented in PracSim.

2.1.4

System Parameters

System-level parameters are placed in the system section of the parameter input file. There are normally only three system-level parameters: two booleans, Date_In_Long_Report_Name and Date_In_Short_Report_Name, which are discussed in Section 2.4; and Max_Pass_Number, an int that specifies the final pass number after which the simulation terminates execution. System-level parameters are read by code located in files sim_preamble.cpp and sim_startup.cpp. The file sim_preamble.cpp should be inserted using the #include directive at the very beginning of the main program for every simulation. This code reads Max_Pass_Number directly and calls SimulationStartup, which reads the two system-level booleans.

Table 2.2 Enumerated types in PracSim.

Type                   File                  Using Module
ADVANCE_MODE_T         adv_modes.h           ContinuousAdvance, DiscreteAdvance
DELAY_MODE_T           delay_modes.h         ContinuousDelay, DiscreteDelay
FILT_BAND_CONFIG_T     filter_types.h        AnalogFilterByIir, AnalogFilterByInteg, DenormalizedPrototype
FILT_RESP_CONFIG_T     filter_resp.h         FilterResponse
PCM_WAVE_KIND_T        wave_kinds.h          BasebandWaveform
INTERP_MODE_T          interp_modes.h        ContinuousAdvance, ContinuousDelay, DftDelay
KIND_OF_SPECT_CALC_T   spect_calc_kinds.h    SpectrumAnalyzer
WINDOW_SHAPE_T         window_shapes.h       BartlettPeriodogram, WelchPeriodogram

2.1.5

Signal-Plotting Parameters

The signal-plotting parameters are contained in the SignalPlotter section of the input file. This section always begins with the parameter Num_Plot_Sigs, which indicates the number of signals to be plotted. This is followed by a series of one-line plot specifications, each having a form like

input_sig, 0.0, 500.0, 1, 1, 0

The first parameter in the specification is the name of the signal to be plotted. This name must match the signal’s name as it is defined in the main simulation program. The second and third parameters are the starting and stopping times of the signal interval to be plotted. The fourth parameter is the plot decimation rate. A value of 1 indicates that every available sample in the interval is to be written to the plot file. A value of k indicates that k − 1 samples are to be skipped between samples that are written to the plot file. The fifth parameter is a flag, which for most signals is set to zero. If the flag is set to 1, the plotting routine interprets the values provided for the start and stop times as sample counts rather than time values. For each signal plotted, PracSim creates a file that has the same name as the signal and an extension of .txt. Each sample appears on a separate line. The time value appears first on the line, followed by a comma and then the sample value. Complex signal values are written as two floating-point values separated by a comma.

2.2

Signals

In their simplest embodiment, signals in a simulation would be little more than buffer areas to hold some number of consecutive values from a sampled waveform or a discrete-time sequence. However, such a simple implementation would impose a significant amount of bookkeeping and housekeeping on the simulation models that write to or read from these buffers. PracSim implements signals as objects that each include a number of ancillary parameters and housekeeping methods in addition to the essential buffer for sample values.

In a hierarchical system, something as apparently simple as a signal's name can be a source of complexity. At the system level, a signal might have a name like filtered baseband waveform. Within the filter model that creates this signal, it would be more appropriately called filtered output or simply output_sig. Within a bit slicer model that takes this signal as an input, it might be called input wave, while in a correlator model used to estimate delay, this signal might be named delayed input. PracSim handles this situation by allocating a separate Signal object for each context in which a signal appears. One of these objects is designated as the master, and all of the other objects for the same signal are linked or connected to this master. Only the master instance of Signal actually allocates the buffer space needed to store sample values; the connected instances all point to this common buffer.

The selection of the instance of Signal to be designated as the master instance is not arbitrary. A signal can exhibit "fan-out," in which it serves as an input for multiple models. However, a signal cannot usually exhibit "fan-in," where a single input on a single model is driven by signals from multiple model outputs. Therefore, it makes sense for the master instance of Signal to be the one associated with the output of the model that actually generates the values to be placed in the buffer. This model controls the one and only write pointer for the sample buffer. Fan-out is accomplished by multiple subordinate instances of Signal that each maintain a separate read pointer for the sample buffer.

Signal is actually a class template that can be instantiated for a number of different types of sample values. PracSim includes specializations of Signal for types float, int, bit_t, and byte_t. A complete implementation of signals involves many attributes and methods that do not depend upon the type of the sample values, and these attributes and methods have been extracted into the nontemplate base class GenericSignal. Tables 2.3 through 2.5 list the attributes and methods belonging to Signal and GenericSignal.

Table 2.3 Summary of class template Signal.

Constructors:
  Signal::Signal( char* name );
  Signal::Signal( Signal* root_id, char* name, PracSimModel* model );
Public Methods:
  ~Signal(void);
  void AllocateSignalBuffer(void);
  void InitializeReadPtrs(void);
  T* GetRawOutputPtr(PracSimModel* model);
  T* GetRawInputPtr(PracSimModel* model);
  Signal* AddConnection( PracSimModel* model, char* name_in_model );
  void Dump(ofstream);
  void PassUpdate(void);
  void SetupPlotSignal(void);
  void IssuePlotterData(void);
Private Attributes:
  T* Buf_Beg
  T* Phys_Buf_Beg
  T* Buf_Final_Mem_Beg
  T* Next_Loc_To_Plot
Notes:
  1. This class inherits all methods belonging to GenericSignal.
  2. Source code is contained in file signal_t.cpp.

2.2.1

Signal Management Strategy

The simplest approach to controlling the flow of signal samples in a simulation would be to have a constant sampling interval throughout the simulation and to have each model, each time invoked, read one sample from each of its input signals and generate one sample for each of its output signals. However, due to the overhead processing incurred every time a model is invoked, it is more efficient to have each model process a block containing multiple samples upon each invocation. Some models,

Table 2.4 Summary of class GenericSignal.

Constructors:
  GenericSignal( char* name, PracSimModel* model );
Public Methods:
  ~GenericSignal(void);
  int GetBlockSize();
  int GetValidBlockSize();
  void SetBlockSize(int block_size);
  void SetValidBlockSize(int block_size);
  char* GetName(void);
  double GetSampIntvl(void);
  void SetSampIntvl(double samp_intvl);
  void SetAllocMemDepth( int req_mem_depth );
  virtual void AllocateSignalBuffer(void){};
  GenericSignal* GetId();
  virtual void InitializeReadPtrs(void){};
  virtual void SetupPlotSignal(void){};
  virtual void IssuePlotterData(void){};
  void SetupPlotFile( GenericSignal* sig_id, double start_time, double stop_time, int plot_decim_rate, bool count_vice_time, bool header_desired );
  virtual void PassUpdate(void){};
  double GetTimeAtBeg(void);
  void SetTimeAtBeg(double time_at_beg);
  void SetEnclave(int enclave_num);
  int GetEnclave(void);
Notes:
  1. Source code is contained in file gensig.cpp.

such as those that use fast convolution or fast correlation, must process a block of a particular size each time invoked. Furthermore, in many situations, it is inconvenient

Table 2.5 Attributes of class GenericSignal.

Protected Attributes:
  int Buf_Len
  int Block_Size
  int Valid_Block_Size
  int Prev_Block_Size
  long Cumul_Samps_Thru_Prev_Block
  int Alloc_Mem_Depth
  double Samp_Intvl
  char* Name
  PracSimModel* Owning_Model
  GenericSignal* Root_Id
  bool Sig_Is_Root
  double Plot_Start_Time
  double Plot_Stop_Time
  int Plot_Decim_Rate
  ofstream* Plotter_File
  bool Plotting_Enabled
  bool Plot_Setup_Complete
  bool Count_Vice_Time
  int Start_Sample
  int Stop_Sample
  int Plotting_Wakeup
  int Plotting_Bedtime
  int Cumul_Samp_Cnt
  double Time_At_Beg
  int Enclave_Num
  std::vector<GenericSignal*>* Connected_Sigs
Notes:
  1. Definitions are contained in file gensig.h.

to maintain a constant sampling interval throughout the simulation. Some models (such as BitsToWave from Chapter 3) have an inherent need to generate multiple output waveform samples for each input sample. Other models, discussed in Chapter 14, exist for the specific purpose of changing the sampling rates between their inputs and outputs. PracSim includes signal management infrastructure designed to facilitate the use of different sample rates at different places within a simulation. This


section explores the strategy behind this infrastructure, and Section 2.2.2 discusses the details of its implementation.

Consider a model that has one input signal and one output signal. The model and its signals can be represented using a graph, as shown in Figure 2.1. Each signal is represented by a vertex in the graph, and the model is represented by an edge. Each signal has several attributes that are of interest in a discussion of multirate simulation. A processing block size and a sampling interval are associated with each signal. In a single-rate simulation, every signal has the same block size and the same sampling interval. In a multirate simulation, different signals may have different block sizes and different sampling rates, and the simulation can contain both single-rate models and multirate models. For a single-rate model, every input signal and every output signal has the same block size and the same sampling interval. For a multirate model, each signal can, in principle, have a different block size and a different sampling interval, subject to one significant constraint: the average time epoch covered by each signal block must be constant across the entire simulation. In other words, for each model,

Nout Tout = Nin Tin

where N is the average block size and T is the sampling interval. There is a resampling rate associated with each input/output pair for a multirate model. The value of the resampling rate R is given by

R = Nout / Nin = Tin / Tout

Once determined, each value of Tin and Tout remains constant over the life of a simulation run. However, only the average values of Nin and Nout need to remain constant over the life of a simulation run. As discussed in Chapter 11, if there is a signal-shifting model in the upstream signal path, there may be slight variations in Nin from block to block. It would be possible to specify the values for N and T as parameter inputs for every model in the simulation, but this would be very cumbersome. Instead, PracSim has been designed to require that values for N and T be specified as input parameters for only a few critical places within a simulation. The PracSim signal management system (SMS) then propagates a consistent set of values for N and T throughout the simulation. To support this approach, each multirate model must convey certain information about itself to the SMS. The information to be conveyed can vary depending upon the nature of the model and the relationships between its inputs and outputs.

Figure 2.1 Model graph. (Signal vertices S0, labeled input_sig, and S1, labeled output_sig, are joined by model edge M1, labeled model_A.)

2.2.1.1

Example System

This section uses a simple example system to explore some of the issues involved in the design of the SMS. For the system shown in Figure 2.2, let's make the following assumptions:

• Encoder uses a code with Rcode = 1/2.
• The bit pulses that are output from BitsToWave have a duration of 1.0 normalized time units.
• For the fast Fourier transform (FFT) in CmpxFrequencyDomainFilter, Tmem = 20.0, TFFT = 0.0625, and NFFT = 4096.

BitGener is a source: it has no input signals, and it outputs a single output signal (S1) comprising a sequence of ones and zeros (one sample per bit). The model's mission is to generate a block of Nout bits each time its Execute method is invoked. The model is hardwired to produce one sample per bit. A value for Nout must eventually be determined by the SMS, which sets the required size for this output block equal to the required input block size for models adjacent to BitGener. (The term adjacent is used in the nonreflexive, directed graph sense. In Figure 2.2, Encoder is adjacent to BitGener, but BitGener is not adjacent to Encoder.) This model's constructor does not convey any signal parameters to the SMS. The required value for Nout is obtained by the model's Initialize method.

The Encoder model accepts as input a bit sequence (S1) having one sample per bit. The model encodes this bit sequence in accordance with the definition of the code being implemented. The result of encoding is an output sequence of bits containing more than one bit for each bit in the input sequence. If the input block size is Nin, the output block size Nout will equal Nin/Rcode, where Rcode is the code rate


Figure 2.2 Block diagram of example simulation. (Models M1–M9 and signals S1–S6: BitGener (M1) → S1 → Encoder (M2) → S2 → BitsToWave (M3) → S3 → FrequencyDomainFilter (M4) → S4 → BitSlicer (M5) → S5 → Decoder (M6) → S6. CombErrorCounter (M7) takes S2 and S4 as inputs; Binary Error Counter #1 (M8) compares S2 with S5; Binary Error Counter #2 (M9) compares S1 with S6.)

for the particular code being implemented. The Encoder model "knows" that its resampling rate is equal to the inverse of the code rate, and therefore the constructor conveys the value Rresamp = 1/Rcode to the SMS as the model's resampling rate. The model cannot immediately determine values for Tin, Tout, Nin, or Nout. The values for these signal parameters are determined by the SMS based upon signal parameter values defined elsewhere in the system. Because Encoder has a value for Rresamp defined in the model constructor, the SMS can immediately propagate signal parameters through this model. For this example, let's assume Rcode = 1/2.

The model BitsToWave accepts a sequence of ones and zeros and generates as output a train of rectangular pulses: a positive-valued pulse for each 1 in the input sequence and a negative-valued pulse for each 0. The input signal (S2) has one sample per bit, and the output signal (S3) has multiple samples per bit. The user must provide a value for the pulse duration TP. Because there is one pulse per input bit, the BitsToWave model knows that the sampling interval for its input signal is equal to TP. The model's constructor conveys this value to the SMS as the sampling interval Tin for the input signal.


FrequencyDomainFilter is one of the few models that determines its own block size. Most models have their block sizes determined for them by the SMS. A model based on the discrete Fourier transform (DFT), such as FrequencyDomainFilter, is an obvious place to fix both block size and sampling interval because of the relationship F N T = 1, which holds for all DFTs. In this model it is assumed that Rresamp = 1, Tin = Tout, and Nin = Nout. The user must provide values for the parameters NFFT, TFFT, and Tmem, where Tmem is the effective time duration of the filter's impulse response. As discussed in Chapter 7, Section 7.6, the number of saved samples in the overlap-and-save segmenting strategy must be sufficient to cover a time interval of at least Tmem. The model computes Nin and Nout as

Nin = Nout = NFFT − ⌊Tmem/TFFT + 0.5⌋

For the assumed values of Tmem = 20.0, TFFT = 0.0625, and NFFT = 4096, the resulting value for Nin and Nout is 3776. The model conveys the values of Tin, Tout, Nin, Nout, and Rresamp to the SMS.

CombErrorCounter is a hypothetical model that generates a sampling comb that is applied to the input waveform (S4) once per symbol time. The sample values are compared to a bit sequence provided as the reference input (S2). This model was contrived to illustrate what happens when a model has inputs at two or more different sampling rates. This model does not convey any signal parameters to the SMS.

The model BitSlicer accepts an input signal (S4), which is a baseband binary waveform, and samples this waveform to generate a sequence of individual output bits. The input signal has multiple samples per bit, and the output signal has one sample per bit. The user must provide a value for the pulse duration TP. Because there is one output bit per pulse, the BitSlicer model knows that the sampling interval for its output signal is equal to TP. The model conveys this value to the SMS as the sampling interval for the output signal (S5).
The Decoder model accepts as input an encoded bit sequence (S5) and performs decoding in accordance with the definition of the particular code being implemented. Multiple input bits will be consumed in the generation of each bit in the output sequence (S6). The Decoder model knows that its resampling rate is equal to the code rate and therefore conveys the value Rresamp = Rcode to the SMS as the model's resampling rate. The model cannot immediately determine values for Tin, Tout, Nin, or Nout. The values for these signal parameters are determined by the SMS based upon signal parameter values defined elsewhere in the system. Because Decoder has a value for Rresamp defined in the model constructor, the SMS can immediately


propagate signal parameters through this model.

The BerCounter model compares a received bit sequence to the "true" or reference bit stream for the purpose of counting errors in the received bit sequence. The only signal parameter that this model needs to know is the block size Nin for its two input signals. This value is obtained from the SMS by the model's Initialize method.

All of the signal parameter information provided to the SMS by the model constructors is summarized in the directed graph shown in Figure 2.3. Each signal is represented by a node, and each model instance is represented by an edge in this graph. To avoid unterminated edges in this graph, a dummy source node is provided for BitGener, and dummy destination nodes are provided for CombErrorCounter and both instances of BerCounter.

2.2.1.2

Propagation of Signal Parameters

PracSim models are carefully designed to make maximum use of known relationships between input and output signal parameters, and to provide sufficient information to the simulation infrastructure so that missing signal parameters can be computed from relationships between models that are connected to each other. To generate these missing parameters, the PracSim infrastructure follows these steps:

1. The SMS must search through the graph until it finds a node for which both time increment and block size are defined. Such a node represents a base node from which the SMS can begin to propagate time increments and block sizes to other nodes in the graph. In this example, such a search will find node S3. From the discussion of FrequencyDomainFilter in the previous section, we can determine the time increment TS3 = 0.0625 and block size NS3 = 3776. The SMS attempts to propagate "upstream" from the base node by locating (1) an incident edge to the base node and (2) the source node for this incident edge. In this example, the SMS would locate edge M3 and node S2. Parameter propagation along an edge in the graph occurs in one of two ways, depending on what parameters have already been defined for the edge and its source node:

(a) If a resampling rate is defined for the edge, the SMS can use this value to calculate the time increment and block size for the source node from the time increment and block size defined for the base node:

Tsource = Rresamp Tbase        (2.2.1)
Nsource = Nbase / Rresamp      (2.2.2)


Figure 2.3 Signal dependency graph after all models have been constructed. (Nodes S0–S6 and sinks D1–D3 are connected by edges M1–M6 plus M7a/M7b, M8a/M8b, and M9a/M9b. Parameters known at construction: RM2 = 1/Rcode = 2; TS2 = TP = 1.0; TS3 = TFFT = 0.0625; NS3 = NFFT − ⌊Tmem/TFFT + 0.5⌋ = 3776; TS4 = TFFT = 0.0625; NS4 = NS3 = 3776; TS5 = TP = 1.0; RM6 = Rcode = 0.5.)


In this example, there is no resampling rate yet defined for edge M3.

(b) If no resampling rate is defined for the edge, the SMS must turn to the edge's source node. If either time increment or block size is defined for the source node, the SMS can use this information in Eqs. (2.2.1) and (2.2.2) to compute the resampling rate for the edge and the missing quantity for the source node. In this example, the time increment for node S2 has been defined to be the pulse duration TP. For the sake of concreteness, let's say TS2 = TP = 1.0. Then we can write

Rresamp = Tsource / Tbase = TS2 / TS3 = 1.0 / 0.0625 = 16
Nsource = Nbase / Rresamp = NS3 / Rresamp = 3776 / 16 = 236

2. After both TS2 and NS2 have been determined, the SMS treats S2 as a temporary base node and attempts another step of upstream propagation. The new incident edge of interest is M2, and the source node for M2 is S1. It so happens there is a resampling rate defined for M2, and it is Rresamp = 1/Rcode = 2. Thus, we can write

TS1 = RM2 TS2 = 2.0
NS1 = NS2 / RM2 = 236 / 2 = 118

3. Node S0 and edge M1 are special cases.

4. Now the SMS begins attempts at forward or downstream propagation. A choice must be made here, and the decision is not immediately obvious. The SMS can begin its campaign of forward propagation at node S1, or it could move forward to the original base node S3 and begin from there. The SMS arrived at S1 via backward propagation through edge M2, so there is no need to attempt forward propagation through M2. There is a second departing edge, M9a. The destination node for edge M9a is vertex D3, which is a sink node. Sink nodes exist strictly for keeping the graph tidy, and they do not have a time increment or a block size.

5. After discovering D3 to be a sink node, the SMS backs up to S1 and then moves forward from S1 to S2. Departing from S2 are two previously untraversed edges, M7a and M8a, which terminate in sink nodes D1 and D2.


6. Both of these edges terminate in sink nodes, so the SMS moves forward to S3 and attempts forward propagation to S4 via M4. Both TS4 and NS4 are already defined, so the SMS computes RM4 as

RM4 = NS4 / NS3 = 3776 / 3776 = 1

and then confirms this value as

RM4 = TS3 / TS4 = 0.0625 / 0.0625 = 1

7. The SMS moves forward to S4 and attempts forward propagation to S5 via edge M5. The time increment TS5 has been defined, so the resampling rate for M5 can be computed as

RM5 = TS4 / TS5 = 0.0625 / 1.0 = 0.0625

and the block size for S5 can be computed as

NS5 = RM5 NS4 = (0.0625)(3776) = 236

8. The SMS moves forward to S5 and attempts forward propagation to S6 via edge M6. The resampling rate RM6 has been defined, so the time increment and block size for S6 can be computed as

TS6 = TS5 / RM6 = 1.0 / (1/2) = 2.0
NS6 = RM6 NS5 = (1/2)(236) = 118

Figure 2.4 shows the directed graph after all signal parameters have been propagated throughout the system.

2.2.2

SMS Implementation

In order to make systematic use of the signal parameters provided by individual model constructors, the SMS must construct a directed graph, or digraph, similar to the one shown in Figure 2.3. This graph is used to determine dependencies between the various signals, controls, and model instances. In this graph, signals and controls are represented as vertices, and models are represented as directed edges going from the vertices representing model inputs to vertices representing model outputs. For

Figure 2.4 Signal dependency graph with all node and edge parameters defined. (Propagated values: TS1 = RM2 TS2 = 2.0, NS1 = NS2/RM2 = 118; RM2 = 2; TS2 = 1.0, NS2 = NS3/RM3 = 236; RM3 = TS2/TS3 = 16; TS3 = 0.0625, NS3 = 3776; RM4 = NS4/NS3 = 1; TS4 = 0.0625, NS4 = 3776; RM5 = TS4/TS5 = 0.0625; TS5 = 1.0, NS5 = RM5 NS4 = 236; RM6 = 0.5; TS6 = TS5/RM6 = 2.0, NS6 = RM6 NS5 = 118.)

a single-rate model, the default is to define an edge from each input vertex to each output vertex. For a multirate model, each edge must be explicitly specified by the model constructor. The total "connection picture" for a model is built up using a series of declarations in the model constructor. Eventually, the pictures for all models are merged into one large graph for the entire active system. Until the merging can be accomplished, the connection pattern for each model is stored in an instance of the ModelGraph class.

2.2.2.1

Top-Level Approach

The implementation of the signal management strategy is spread across a number of different classes and methods. This section outlines the sequence of signal-related events that take place as a simulation is constructed, initialized, and executed. Each event is discussed in greater detail in subsequent sections.

1. The program main allocates the master instance of each Signal object that will be used in the simulation. (Section 2.2.2.2.)

2. The program main calls the various model constructors in the order that the models are to be executed. In the current implementation of PracSim, the user must establish this sequence manually by the order in which the constructor calls are placed in main. As discussed on the companion Web site, some infrastructure is provided to support an eventual migration to a graphical specification of the system topology, with the execution sequence of the models determined automatically based on signal dependencies between the models. In the current implementation of PracSim, pointers to the appropriate input and output Signal objects are passed in the call to each constructor.

3. For each model constructed,

(a) If the model being constructed is not the first model in the simulation, the constructor for the base class PracSimModel calls the method PracSimModel::CloseoutModelGraph for the previous model. This causes the model graph for the previous model to be integrated into the system graph. (Section 2.2.2.4.)

(b) The constructor for the base class PracSimModel creates an instance of ModelGraph. (Section 2.2.2.3.)

(c) The constructor for each specific derived model class takes the following actions with respect to signals:


i. For multirate models, the constructor enables multirate operation using the macro ENABLE_MULTIRATE. (Section 2.2.2.3.)

ii. The constructor reads parameters from the simulation setup file. Some of these parameters may pertain to input or output signal characteristics.

iii. The model constructor copies each passed-in Signal pointer to a corresponding class variable.

iv. The model constructor registers each output signal with the current instance of ModelGraph using the macro MAKE_OUTPUT(X). (Section 2.2.2.3.)

v. The model constructor registers each input signal with the current instance of ModelGraph using the macro MAKE_INPUT(X). (Section 2.2.2.3.)

vi. If appropriate for the specific model being constructed, the constructor sets edge (i.e., model) parameters in the current model graph using the macro CHANGE_RATE(X,Y,Z) or SAME_RATE(X,Y). (Section 2.2.2.3.)

vii. If appropriate for the specific model being constructed, the constructor sets node (i.e., signal) parameters in the current model graph using SET_SAMP_INTVL(X,Y) or SET_BLOCK_SIZE(X,Y). (Section 2.2.2.3.)

4. After the final model constructor has completed, main calls the Executive method MultirateSetup, which performs the following actions:

(a) Calls CloseoutModelGraph for the final model.

(b) Initializes the SigPlot class.

(c) Invokes SystemGraph::ResolveSignalParms, which uses the strategy presented in Section 2.2.1.2 to propagate known signal parameters throughout the entire system graph.

(d) Invokes SystemGraph::DistributeSignalParms, which sets the propagated values of signal parameters within each individual signal object.

(e) Invokes SystemGraph::AllocateStorageBuffers, which causes the master instance of each signal to allocate the buffer space needed to store a block of signal samples. This allocation is deferred to this point in the processing because not all Signal objects "know" how large


their buffers are until after propagated signal parameters have been set in step 4d. (f) Invokes SystemGraph::InitializeReadPtrs, which causes the master instance of each signal to initialize the buffer read pointers in subordinate Signal objects that are connected to the master instance. (g) Invokes SystemGraph::AllocatePlotPointers, which causes the master instance of each signal to create and initialize a complicated structure that is essentially a signal buffer read pointer used by the signalplotting subsystem. The buffer pointers allocated in step 4f are simple pointers that get reset to the beginning of the buffer area on each pass through the simulation. The plotting pointers are complicated by the fact that the plotting of a signal can span many passes and may not even start until after the simulation has been running for some amount of time. (h) Invokes SystemGraph::InitializeModels, which runs the initialization method of every model instance in the system. Invoking the Initialize methods at this point allows these methods to include any calculations that depend upon the availability of valid signal parameters. 5. The simulation enters a loop that invokes SystemGraph::RunSimulation once for each simulation pass. RunSimulation performs the following tasks: (a) Calls the Execute method for each model in the same sequence that the models were constructed. Each time called, the model processes one block’s worth of signal samples. (b) Calls the PassUpdate method for each signal in the system after each model has been invoked. 2.2.2.2

2.2.2.2 Signal Allocation

The master instance of each Signal object is allocated in main prior to calling the first model constructor. Signal is a class template that can be specialized for a number of different signal types. The constructor for Signal takes an input argument that is a string containing the name of the signal by which it is known at the system level. The calling syntax to allocate a float-valued signal named tx_signal would be:

   Signal<float>* tx_signal = new Signal<float>("tx_signal");

Section 2.2

Signals


A number of macros have been defined in sigstuff.h to simplify the syntax for allocating different specialized Signal objects:

   BIT_SIGNAL(X);
   BYTE_SIGNAL(X);
   INT_SIGNAL(X);
   FLOAT_SIGNAL(X);
   COMPLEX_SIGNAL(X);
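One plausible way such macros could be written uses the preprocessor stringizing operator so that the variable name doubles as the system-level signal name. This is an illustrative sketch, not the actual contents of sigstuff.h, and the toy Signal template merely stands in for the real class:

```cpp
#include <complex>
#include <string>

// Toy stand-in for PracSim's Signal class template
template <typename T>
struct Signal {
    std::string name;
    explicit Signal(const char* n) : name(n) {}
};

// #X stringizes the macro argument, so the allocated object records
// the same name as the pointer variable
#define FLOAT_SIGNAL(X)   Signal<float>* X = new Signal<float>(#X)
#define INT_SIGNAL(X)     Signal<int>* X = new Signal<int>(#X)
#define COMPLEX_SIGNAL(X) \
    Signal<std::complex<float>>* X = new Signal<std::complex<float>>(#X)
```

With this sketch, FLOAT_SIGNAL(tx_signal); expands to exactly the allocation syntax shown above for a float-valued signal.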

2.2.2.3 Current Model Graph

A directed graph can be implemented in software using a list of node descriptors, a list of edge descriptors, and an adjacency matrix that specifies the connections between nodes and edges. Edge k going from vertex i to vertex j is indicated by placing the index k into row i, column j of the adjacency matrix. The class DirectedGraph provided in file digraph.cpp is a minimal implementation of a digraph. The implementation of DirectedGraph (and ModelGraph) makes use of the C++ standard template library (STL). Each node descriptor is simply a pointer to the master instance of the Signal object for the corresponding signal. The list of pointers to Signal instances is kept in an STL vector. ModelGraph maintains a number of lists parallel to the list of node descriptors in DirectedGraph. These lists are parallel in the sense that element k of each list pertains to signal k in the list of node descriptors. These parallel lists contain information like sampling interval, block size, and input/output sense. The hard-core object-oriented approach would be to create node descriptor objects that contain all of this information and eliminate the need for parallel lists. However, PracSim is implemented using parallel lists to avoid the speed penalty associated with dereferencing complicated data structures. In a similar vein, each edge descriptor is simply a pointer to the particular model instance that the edge represents.

Each time a model is instantiated, the constructor for the PracSimModel base class creates an instance of the class ModelGraph and establishes the protected class variable Curr_Model_Graph as a pointer to this instance. The constructor for ModelGraph performs the following tasks:

1. Allocates one instance of DirectedGraph.

2. Allocates STL vector objects for the lists of node and edge properties.

3. Sets Model_Is_Multirate to the default value of false.

4. Sets Model_Is_Constant_Interval to the default value of false.
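The adjacency-matrix storage and the parallel property lists can be illustrated with a small sketch. The types and names here are hypothetical simplifications, not the actual digraph.cpp or model_graph.cpp source:

```cpp
#include <string>
#include <vector>

// Adjacency-matrix digraph: adj[i][j] holds the index of the edge from
// vertex i to vertex j, or -1 if no such edge exists.
struct DirectedGraphSketch {
    std::vector<std::string> nodes;     // stand-ins for Signal* node descriptors
    std::vector<std::vector<int>> adj;  // adjacency matrix of edge indices
    int num_edges = 0;

    int AddVertex(const std::string& name) {
        nodes.push_back(name);
        for (auto& row : adj) row.push_back(-1);       // new column
        adj.emplace_back(nodes.size(), -1);            // new row
        return static_cast<int>(nodes.size()) - 1;
    }
    int AddEdge(int from, int to) { adj[from][to] = num_edges; return num_edges++; }
};

// Parallel lists: element k of each list describes node k of the digraph.
struct ModelGraphSketch {
    DirectedGraphSketch g;
    std::vector<bool>   vertex_is_input;
    std::vector<double> samp_intvl;   // 0.0 = undefined
    std::vector<int>    block_size;   // 0 = undefined

    int InsertSignal(const std::string& name, bool is_input) {
        int v = g.AddVertex(name);
        vertex_is_input.push_back(is_input);
        samp_intvl.push_back(0.0);
        block_size.push_back(0);
        // inputs get edges to existing outputs; outputs get edges
        // from existing inputs (mirroring InsertSignal in the text)
        for (int u = 0; u < v; ++u) {
            if (is_input && !vertex_is_input[u]) g.AddEdge(v, u);
            if (!is_input && vertex_is_input[u]) g.AddEdge(u, v);
        }
        return v;
    }
};
```

Keeping the per-node properties in flat vectors, as here, is the same design choice the text credits to PracSim: lookups stay cheap because no composite node object has to be dereferenced.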


Model-Wide Signal Properties  After the constructor for PracSimModel has created an instance of ModelGraph, and before any specific signals are inserted into the digraph, the constructor for the derived model class has an opportunity to change several of the defaults that were set in the constructor for ModelGraph:

1. The macro ENABLE_MULTIRATE, which expands to

   Curr_Mod_Graph->EnableMultirate();

can be used to indicate that the model is a multirate model.

2. The macro ENABLE_CONST_INTERVAL, which expands to

   Curr_Mod_Graph->EnableConstantInterval();

can be used to indicate that the model uses the same sampling interval for all of its inputs and outputs.

Signal Registration  Before any specific signal parameters can be conveyed to the current model graph, all of the signals that are inputs or outputs of the model must be registered with *Curr_Model_Graph using the macros MAKE_OUTPUT(X) and MAKE_INPUT(X). The macro MAKE_OUTPUT(X) expands to

   Curr_Mod_Graph->InsertSignal( X, this, false);

where X is a pointer to the signal object being registered as an output, this is the pointer to the calling model, and false is a boolean constant indicating that the signal is not an input. The macro MAKE_INPUT(X) expands to

   Curr_Mod_Graph->InsertSignal( X, this, true);
   X = X->AddConnection( this, #X);

where this is the pointer to the calling model, true is a boolean constant indicating that the signal is an input, and X starts out as a pointer to the master instance of Signal that actually contains the buffer of samples that were written as output by some upstream model and that will be read as input by the current model. The second line of the macro expansion invokes Signal::AddConnection, which creates a slave instance of Signal, connects this instance to the master instance and then


returns a pointer to the new slave instance. Before MAKE_INPUT(X) is executed, X points to the master instance for the signal of interest. After MAKE_INPUT(X) has executed, X points to the newly created slave instance that holds a read pointer for the sample buffer and the signal's name as it is known inside the model. The InsertSignal method performs the following tasks:

1. Calls the AddVertex method of DirectedGraph to add a new vertex for the signal being inserted.

2. Sets default values for node property list items:

   (a) Vertex_Is_Input is set to true or false depending upon which macro was used to invoke InsertSignal.
   (b) Vertex_Kind is set to the enumerated value SK_REGULAR_SIGNAL.
   (c) Node_Is_Feedback is set to false.
   (d) Block_Size is set to zero.
   (e) Samp_Intvl is set to zero.

3. Compares the new vertex against existing vertices to see where new edges need to be added to the graph. If the new vertex is an input, edges are added from the new vertex to existing output vertices. If the new vertex is an output, edges are added from existing input vertices to the new vertex. For each newly added edge, it sets default values for edge property list items:

   (a) Delta_Delay is set to zero.
   (b) Const_Intvl is set to the value of Model_Is_Constant_Interval, which was set for the entire model prior to any signals being registered.
   (c) Resamp_Rate is set to an undefined rate if the model is a multirate model. Otherwise, Resamp_Rate is set to 1.0.

Setting Signal Parameters  Once a model's constructor has registered all of its input and output signals with *Curr_Model_Graph, known signal parameters can be inserted into the graph using a number of macros from the file sigstuff.h, which invoke public methods from class ModelGraph:

1. The macro CHANGE_RATE(X,Y,Z), which expands to

   Curr_Mod_Graph->ChangeRate(X->GetId(), Y, Z, this);


can be used to set the resampling rate to Z for the edge in the digraph that connects input signal X to output signal Y. The pointer Y can be used directly because, as an output signal, the instance of Signal pointed to by Y will be the master instance for this particular signal. As an input signal, the instance of Signal pointed to by X will only be a connected instance, and the pointer X must be used to invoke GenericSignal::GetId(), which returns a pointer to the corresponding master instance.

2. The macro SAME_RATE(X,Y), which expands to

   Curr_Mod_Graph->ChangeRate(X->GetId(), Y, 1.0, this);

can be used to set the resampling rate to 1.0 for the edge in the digraph that connects input signal X to output signal Y.

3. The macro SET_SAMP_INTVL(X,Y), which expands to

   Curr_Mod_Graph->SetSampIntvl(X->GetId(),Y);

can be used to set the sampling interval for signal X to the value Y.

4. The macro SET_BLOCK_SIZE(X,Y), which expands to

   Curr_Mod_Graph->SetBlockSize(X->GetId(),Y);

can be used to set the block size for signal X to the value Y.

2.2.2.4 Building the System Graph

The first thing that the constructor for the PracSimModel base class does is call the method PracSimModel::CloseoutModelGraph for the previous model. The constructor for the first model in a simulation detects that no previous model exists and therefore does not call CloseoutModelGraph. The last model in a simulation is taken care of by Executive::MultirateSetup, which runs immediately after the final model’s constructor. Except for some optional debug output, CloseoutModelGraph accomplishes its work by invoking two methods belonging to other classes. The first method is ModelGraph::Closeout, which puts the finishing touches on the previous model’s digraph. To avoid the possibility of “dangling” edges, each current model graph (CMG) must have at least one input node


and one output node. If the CMG does not have any input nodes, Closeout adds a dummy source node for which Vertex_Kind is set to SK_DUMMY_SOURCE_SIGNAL. If the CMG does not have any output nodes, Closeout adds a dummy destination node for which Vertex_Kind is set to SK_DUMMY_DEST_SIGNAL. SystemGraph::MergeCurrModelGraph is the second method invoked by CloseoutModelGraph and performs the following tasks:

1. Checks each node in the model graph against each node that is already in the system graph. If a match is found, it places the node's identity in a temporary list of merged nodes and performs the following:

   (a) If the sampling rate is undefined for the matching node in the system graph (SG), then it copies the sampling rate from the matching CMG node. If the sampling rate is defined differently in the CMG and SG, then a fatal error condition exists.
   (b) If the block size is undefined for the matching node in the SG, then it copies the block size from the matching CMG node. If the block size is defined differently in the CMG and SG, then a fatal error condition exists.

2. For each CMG node not matched to a node in the SG, MergeCurrModelGraph adds a new node to the SG, places this node's identity in the temporary list of merged nodes, and sets the sampling rate and block size to the values defined in the CMG.

3. Once all the nodes in the CMG have been matched or added to the SG, the adjacency matrix is examined for the presence of edges between nodes appearing on the temporary list of merged nodes. If any such edges are already in the SG, a fatal error condition exists. Such edges must have been placed by some other model, implying that that model and the current model are both trying to produce the same output signal. This type of fan-in is not supported in PracSim.

4. Inserts new edges into the SG so that every input signal in the merged node list is connected by an edge to every output signal in this list.
Edge parameters are copied from the corresponding edges in the CMG.
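The parameter checks in step 1 amount to a simple merge rule: an undefined value is filled in from the CMG, and a conflicting defined value is fatal. A sketch, using zero to represent "undefined" and an exception in place of PracSim's fatal-error handling (both of which are assumptions of this illustration):

```cpp
#include <stdexcept>

struct NodeParms {
    double samp_intvl = 0.0;  // 0.0 means undefined
    int    block_size = 0;    // 0 means undefined
};

// Merge one matched CMG node's parameters into its SG counterpart.
void MergeNodeParms(NodeParms& sg, const NodeParms& cmg)
{
    if (sg.samp_intvl == 0.0)
        sg.samp_intvl = cmg.samp_intvl;                 // copy into SG
    else if (cmg.samp_intvl != 0.0 && cmg.samp_intvl != sg.samp_intvl)
        throw std::runtime_error("fatal: conflicting sampling intervals");

    if (sg.block_size == 0)
        sg.block_size = cmg.block_size;                 // copy into SG
    else if (cmg.block_size != 0 && cmg.block_size != sg.block_size)
        throw std::runtime_error("fatal: conflicting block sizes");
}
```

The same rule applied node by node is what lets partially specified signal parameters from many small model graphs settle into one consistent system graph.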

2.3 Controls

Controls in PracSim are "lightweight" signals, created to handle situations in which one model needs to exercise some form of control over another model. Although this


control could be accomplished using the signal mechanism discussed in Section 2.2, a separate control mechanism can make it easier to configure a simulation and eliminate a significant amount of the overhead associated with signals. Control is a class template that can be instantiated for a number of different types of control values. PracSim includes specializations of Control for types double, float, int, and bool. A complete implementation of controls involves a few attributes and methods that do not depend upon the type of the control values, and these attributes and methods have been extracted into the nontemplate base class GenericControl. Tables 2.6 and 2.7 list the attributes and methods belonging to Control and GenericControl.

Table 2.6

Summary of class template Control.

Constructors:
   Control::Control( char* name );
   Control::Control( char* name, PracSimModel* model );
Public Methods:
   ~Control(void);
   T GetValue(void);
   void SetValue(T value);
Private Attribute:
   T Cntrl_Value;
Notes:
   1. This class inherits all methods belonging to GenericControl.
   2. Source code is contained in file control_t.cpp.
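The split between the nontemplate base class and the class template can be sketched as follows. This is a simplified illustration of the structure summarized in Tables 2.6 and 2.7, not the actual PracSim source (for example, the real constructors also record the owning model, which is omitted here):

```cpp
#include <string>

// Type-independent part of a control (sketch of GenericControl)
class GenericControl {
public:
    explicit GenericControl(const char* name) : Name(name), Root_Id(this) {}
    virtual ~GenericControl() {}
    const char* GetName() const { return Name.c_str(); }
    GenericControl* GetId() { return Root_Id; }
protected:
    std::string Name;
    GenericControl* Root_Id;   // identity of the root/master instance
};

// Type-dependent part (sketch of the Control class template)
template <typename T>
class Control : public GenericControl {
public:
    explicit Control(const char* name) : GenericControl(name), Cntrl_Value(T()) {}
    T GetValue() const { return Cntrl_Value; }
    void SetValue(T value) { Cntrl_Value = value; }
private:
    T Cntrl_Value;
};
```

Factoring the name and identity handling into a nontemplate base keeps that code out of every template instantiation, the same motivation the text gives for GenericSignal and GenericControl.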

2.4 Results Reporting

PracSim uses stream I/O for all results reporting. Not including the signal plotting files, PracSim creates two or three output files and defines four output streams for each simulation. Often, a user will want different levels of detail in the results

Table 2.7

Summary of class GenericControl.

Constructor:
   GenericControl( char* name, PracSimModel* model );
Public Methods:
   ~GenericControl(void);
   char* GetName(void);
   GenericControl* GetId(void);
Protected Attributes:
   char* Name;
   PracSimModel* Owning_Model;
   GenericControl* Root_Id;
Notes:
   1. Source code is contained in file genctl.cpp.

report depending upon where the simulation is in its development cycle. Early in the development cycle, very detailed results can be useful in determining that the simulation and its constituent models are configured and operating correctly. Later, when the development is complete and a number of different parametrized cases are to be run, a less-detailed report may be more convenient. PracSim creates both a full report and a short report. The user has a choice, via the system-level parameters Date_In_Full_Report_Name and Date_In_Short_Report_Name, concerning whether or not the report file names will include the time and date at the start of the simulation. For a simulation named BpskSim, started at 14:27:03 on December 28, 2003, the full report file would have one of the following names:

   BpskSim_full.txt
   BpskSim_full_031228_14_27_03.txt

The corresponding short report would have one of two similar names:

   BpskSim_short.txt
   BpskSim_short_031228_14_27_03.txt
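The naming convention can be reproduced with a small helper. This is an illustrative sketch (ReportFileName is a hypothetical function, not part of PracSim):

```cpp
#include <cstdio>
#include <ctime>
#include <string>

// Build "<sim>_<kind>.txt" or "<sim>_<kind>_YYMMDD_HH_MM_SS.txt"
std::string ReportFileName(const std::string& sim_name, const std::string& kind,
                           bool include_date, const std::tm& t)
{
    if (!include_date) return sim_name + "_" + kind + ".txt";
    char stamp[32];
    std::snprintf(stamp, sizeof stamp, "_%02d%02d%02d_%02d_%02d_%02d",
                  t.tm_year % 100, t.tm_mon + 1, t.tm_mday,
                  t.tm_hour, t.tm_min, t.tm_sec);
    return sim_name + "_" + kind + stamp + ".txt";
}
```

Embedding the start time in the name is what prevents a later run from silently overwriting an earlier report, as the following paragraph explains.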


If the date is not included in the report file name, running the simulation multiple times will cause the new file to overwrite the existing file of the same name. This is convenient while a simulation is being constructed and the results are of no interest. If the flag _DEBUG is defined, PracSim will also create a debug file BpskSim.dbg. It will be appropriate for some results to be written to more than one report file. To make this easy for the user, PracSim defines a number of output streams that automatically route the results to the appropriate files. The relationship between these streams and the output files is summarized in Figure 2.5.

Figure 2.5 Relationship between output streams and files (streams shown: DetailedResults, BasicResults, ErrorStream; files shown: long report file, short report file, debug file).

Appendix 2A

EXAMPLE SOURCE CODE

The support directory on the companion Web site contains the classes that implement the PracSim simulation infrastructure. A number of these classes are listed in Table 2A.1. In addition to the items listed, the support directory also contains I/O support routines for the enumerated types presented in Section 2.1.3.

Table 2A.1

Classes in the support directory.

   class            file
   DirectedGraph    digraph.cpp
   Executive        exec.cpp
   GenericControl   genctl.cpp
   GenericSignal    gensig.cpp
   PsModelError     model_error.cpp
   ModelGraph       model_graph.cpp
   ParmFile         parmfile.cpp
   PracSimModel     psmodel.cpp
   PracSimStream    psstream.cpp
   SignalPlotter    sigplot.cpp
   SystemGraph      syst_graph.cpp
   Control          control_T.cpp
   Signal           signal_T.cpp

2A.1 PracSimModel

Every model in PracSim inherits from the base class PracSimModel. The header for PracSimModel is provided in Listing 2A.1, and implementations for the various methods are provided in Listings 2A.2 through 2A.4.


Listing 2A.1

Header for the PracSimModel base class.

class PracSimModel
{
public:
   PracSimModel( char *model_name,
                 PracSimModel* outer_model);
   PracSimModel( int dummy_for_unique_signature,
                 char *model_name);
   ~PracSimModel(void);
   const char* GetModelName(void);
   const char* GetInstanceName(void);
   int GetNestDepth(void);
   void CloseoutModelGraph(int key);
   virtual void Initialize(void){};
   virtual int Execute(void){return(-1);};
protected:
   typedef struct{
      GenericSignal *Ptr_To_Sig;
      bool Sig_Is_Optional;
   } Sig_List_Elem;
   char *Model_Name;
   char *Instance_Name;
   std::list<Sig_List_Elem> *Output_Sigs;
   std::list<Sig_List_Elem> *Input_Sigs;
   ModelGraph* Curr_Mod_Graph;
   int Nest_Depth;
};


Listing 2A.2

Constructor for the PracSimModel base class.

PracSimModel::PracSimModel( char* instance_name,
                            PracSimModel* outer_model)
{
   //--------------------------------------------------
   // Closeout the CMG for previous model instance
   // and merge it with the Active System Graph

   Nest_Depth = 1 + outer_model->GetNestDepth();
   if( (PrevModelConstr != NULL) && (Nest_Depth==1)) {
      #ifdef _DEBUG
      ...
      #endif
      ...
   }
   ...
}

Constructor:
   MskModulator::MskModulator( char* instance_name,
         PracSimModel* outer_model,
         Signal<float> *i_in_sig,
         Signal<float> *q_in_sig,
         Signal< std::complex<float> > *cmpx_out_sig,
         Signal<float> *mag_out_sig,
         Signal<float> *phase_out_sig);
Parameters:
   double Bit_Durat;
   double Data_Skew;
   double Subcar_Misalign;
   double Phase_Unbal;
   double Amp_Unbal;
   bool Shaping_Is_Bipolar;
Notes:
   1. Source code is contained in file mskmod.cpp.


Figure 9.37 Envelope variations for an MSK modulator with an amplitude unbalance of 1.1 and a phase unbalance of 5 degrees.


Modulation and Demodulation

Chapter 9


Figure 9.38 Envelope variations for an MSK modulator with an amplitude unbalance of 1.1, a phase unbalance of 5 degrees, and a data skew of 0.05.

9.6.3 Properties of MSK Signals

The power spectral density of an MSK signal is given by

   P_MSK(f) = (8 E_b / π²) { [ cos(2π(f − f_c)T_b) / (1 − [4T_b(f − f_c)]²) ]²
                           + [ cos(2π(f + f_c)T_b) / (1 − [4T_b(f + f_c)]²) ]² }   (9.6.4)

Evaluation of this equation is straightforward except at the values of f for which the denominators in the two fraction terms become zero. When f = f_c ± (4T_b)⁻¹, the denominator of the first term equals zero. For these values of f, the numerator will also be zero, so we have an indeterminate form of type 0/0 that can be evaluated using L'Hospital's rule. Similarly, the second term will be an indeterminate form of type 0/0 when f = −f_c ± (4T_b)⁻¹. Applying L'Hospital's rule, we obtain

   lim_{f → f_c ± (4T_b)⁻¹}  cos(2π(f − f_c)T_b) / (1 − [4T_b(f − f_c)]²) = π/4

   lim_{f → −f_c ± (4T_b)⁻¹} cos(2π(f + f_c)T_b) / (1 − [4T_b(f + f_c)]²) = π/4

These values of f are treated as special cases in the function MskPsd that was used to generate the normalized plot of Eq. (9.6.4) shown in Figure 9.39. An estimated PSD for a complex baseband simulation of an MSK signal is shown in Figure 9.40.

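A function in the spirit of MskPsd can evaluate Eq. (9.6.4) while substituting the L'Hospital limit π/4 at the indeterminate points. This is an illustrative sketch, not the book's MskPsd source; the tolerance used to detect the special cases is an assumption:

```cpp
#include <cmath>

const double kPi = 3.14159265358979323846;

// One squared fraction term of Eq. (9.6.4), with x = f - fc or x = f + fc.
static double MskPsdTerm(double x, double tb)
{
    double den = 1.0 - (4.0 * tb * x) * (4.0 * tb * x);
    double frac;
    if (std::fabs(den) < 1.0e-9)
        frac = kPi / 4.0;                       // L'Hospital limit at the 0/0 points
    else
        frac = std::cos(2.0 * kPi * x * tb) / den;
    return frac * frac;
}

// Power spectral density of an MSK signal per Eq. (9.6.4).
double MskPsd(double f, double fc, double tb, double eb)
{
    return (8.0 * eb / (kPi * kPi)) *
           (MskPsdTerm(f - fc, tb) + MskPsdTerm(f + fc, tb));
}
```

At the special frequencies the squared term equals (π/4)² = π²/16, so the function stays continuous through the points where the naive quotient would be 0/0.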
Section 9.6


Minimum Shift Keying

Figure 9.39 PSD for an MSK signal (horizontal axis: normalized frequency offset from carrier, (f − fc)/Rs).

Figure 9.40 PSD estimated from a complex baseband simulation of an MSK signal.

9.6.3.1 Error Performance

The probability of bit error for MSK is the same as for QPSK; that is,

   P_b = Q( √(2E_b/N_0) ) = (1/2) erfc( √(E_b/N_0) )

Figure 9.41 contains a plot of P_b and several estimated bit-error-rate values that were obtained from simulation of an ideal MSK modulator and perfectly synchronized I&D demodulator.
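This formula is easy to evaluate with the standard-library complementary error function. QFunc and MskBitErrorProb are illustrative helper names, not PracSim models:

```cpp
#include <cmath>

// Gaussian tail probability Q(x) = (1/2) erfc(x / sqrt(2))
double QFunc(double x)
{
    return 0.5 * std::erfc(x / std::sqrt(2.0));
}

// Pb for MSK/QPSK, given Eb/N0 in dB
double MskBitErrorProb(double ebno_db)
{
    double ebno = std::pow(10.0, ebno_db / 10.0);
    return QFunc(std::sqrt(2.0 * ebno));  // Q(sqrt(2 Eb/N0))
}
```

Note that Q(√(2x)) = (1/2) erfc(√x), which is why the two forms of the equation agree.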

Figure 9.41 Probability of bit error for MSK signals.

Appendix 9A

EXAMPLE SOURCE CODE

The companion Web site includes 14 Microsoft Visual Studio .NET projects, each comprising a simulation that demonstrates and provides a test vehicle for a different pairing of modulator and demodulator models, as listed in Table 9A.1.

Table 9A.1

Projects in Modulation directory.

   project        modulator              demodulator
   BpskSim        BpskModulator          BpskCorrelationDemod
   BpskSim_Bp     BpskBandpassModulator  BpskBandpassDemod
   FskCohSim      ComplexVco             FskCoherentDemod
   FskCohSim_Bp   BandpassVco            FskCoherentBandpassDemod
   FskSim_Bp      FskTwoToneModulator    FskBandpassDemod
   MpskSim        MpskSymbsToQuadWave    MpskOptimalDemod
                  QuadratureModulator
   MpskSim_Bp     MpskSymbsToQuadWave    MpskOptimalBandpassDemod
                  QuadBandpassModulator
   MskSim         MskModulator           QuadratureDemod
   OqpskSim       QuadratureModulator    QuadratureDemod
   QamSim         QamSymbsToQuadWaves    QamOptimalDemod
                  QuadratureModulator    IntegrateDumpAndSlice
   QamSim_Bp      QamSymbsToQuadWaves    QuadBandpassMixer
                  QuadBandpassModulator  IntegrateAndDump
                                         QamSymbolDecoder
   QpskSim        QuadratureModulator    QuadratureDemod
   QpskSim_Bp     QuadBandpassModulator  QuadBandpassMixer
                                         IntegrateDumpAndSlice
   QpskSim_Corr   QuadratureModulator    QpskOptimalBitDemod


9A.1 MskModulator

The header for MskModulator is shown in Listing 9A.1. The constructor is provided in Listing 9A.2, and the Execute method is provided in Listing 9A.3.

Listing 9A.1

Header for MskModulator model.

class MskModulator : public PracSimModel
{
public:
   MskModulator( char* instance_name,
                 PracSimModel *outer_model,
                 Signal< float >* in_signal_i,
                 Signal< float >* in_signal_q,
                 Signal< std::complex<float> >* out_signal,
                 Signal< float >* mag_signal,
                 Signal< float >* phase_signal );
   ~MskModulator(void);
   void Initialize(void);
   int Execute(void);
private:
   float Phase_Unbal;
   float Amp_Unbal;
   double Pi_Over_Bit_Dur;
   double Bit_Durat;
   double Samp_Intvl;
   float Subcar_Misalign;
   float Data_Skew;
   int Shaping_Is_Bipolar;
   std::complex<float> Phase_Shift;
   int Samps_Out_Cnt;
   int Block_Size;
   Signal< float > *I_In_Sig;
   Signal< float > *Q_In_Sig;
   Signal< std::complex<float> > *Cmpx_Out_Sig;
   Signal< float > *Mag_Out_Sig;
   Signal< float > *Phase_Out_Sig;
};


Listing 9A.2

Constructor for MskModulator model.

MskModulator::MskModulator( char* instance_name,
                            PracSimModel* outer_model,
                            Signal< float >* i_in_sig,
                            Signal< float >* q_in_sig,
                            Signal< std::complex<float> >* cmpx_out_sig,
                            Signal< float >* mag_out_sig,
                            Signal< float >* phase_out_sig )
   :PracSimModel(instance_name, outer_model)
{
   MODEL_NAME(MskModulator);

   // Read model config parms
   OPEN_PARM_BLOCK;
   GET_DOUBLE_PARM(Bit_Durat);
   GET_DOUBLE_PARM(Data_Skew);
   GET_DOUBLE_PARM(Subcar_Misalign);
   GET_DOUBLE_PARM(Phase_Unbal);
   GET_DOUBLE_PARM(Amp_Unbal);
   GET_BOOL_PARM(Shaping_Is_Bipolar);

   // Connect input and output signals
   I_In_Sig = i_in_sig;
   Q_In_Sig = q_in_sig;
   Cmpx_Out_Sig = cmpx_out_sig;
   Mag_Out_Sig = mag_out_sig;
   Phase_Out_Sig = phase_out_sig;
   MAKE_OUTPUT( Cmpx_Out_Sig );
   MAKE_OUTPUT( Mag_Out_Sig );
   MAKE_OUTPUT( Phase_Out_Sig );
   MAKE_INPUT( I_In_Sig );
   MAKE_INPUT( Q_In_Sig );

   // Set up derived parms
   double phase_unbal_rad = PI * Phase_Unbal / 180.0;
   Pi_Over_Bit_Dur = PI/Bit_Durat;
   Phase_Shift = std::complex<float>( -sin(phase_unbal_rad),
                                      cos(phase_unbal_rad));
}


Listing 9A.3


Execute method for MskModulator model.

int MskModulator::Execute(void)
{
   float *i_in_sig_ptr, *q_in_sig_ptr;
   float *phase_out_sig_ptr, *mag_out_sig_ptr;
   float subcar_misalign, amp_unbal, data_skew;
   float work, work1;
   std::complex<float> work2;
   std::complex<float> *cmpx_out_sig_ptr;
   int samps_out_cnt;
   double samp_intvl;
   double pi_over_bit_dur, argument;
   std::complex<float> phase_shift;
   long int_mult;
   int shaping_is_bipolar;
   int block_size;
   int is;

   cmpx_out_sig_ptr = GET_OUTPUT_PTR( Cmpx_Out_Sig );
   phase_out_sig_ptr = GET_OUTPUT_PTR( Phase_Out_Sig );
   mag_out_sig_ptr = GET_OUTPUT_PTR( Mag_Out_Sig );
   i_in_sig_ptr = GET_INPUT_PTR( I_In_Sig );
   q_in_sig_ptr = GET_INPUT_PTR( Q_In_Sig );

   samps_out_cnt = Samps_Out_Cnt;
   samp_intvl = Samp_Intvl;
   subcar_misalign = Subcar_Misalign;
   amp_unbal = Amp_Unbal;
   data_skew = Data_Skew;
   pi_over_bit_dur = Pi_Over_Bit_Dur;
   phase_shift = Phase_Shift;
   shaping_is_bipolar = Shaping_Is_Bipolar;

   block_size = I_In_Sig->GetValidBlockSize();
   Cmpx_Out_Sig->SetValidBlockSize(block_size);
   Mag_Out_Sig->SetValidBlockSize(block_size);
   Phase_Out_Sig->SetValidBlockSize(block_size);

   for (is=0; is<block_size; is++) {
      ...
   }
   ...
}

9A.2 MpskOptimalDemod

Listing 9A.4

Header for MpskOptimalDemod model.

class MpskOptimalDemod : public PracSimModel
{
public:
   MpskOptimalDemod( char* instance_name,
                     PracSimModel* outer_model,
                     Signal< std::complex<float> >* in_sig,
                     Signal< bit_t >* symb_clock_in,
                     Signal< byte_t >* out_sig );
   ~MpskOptimalDemod(void);
   void Initialize(void);
   int Execute(void);
private:
   double Out_Samp_Intvl;
   int Block_Size;
   Signal< byte_t > *Out_Sig;
   Signal< std::complex<float> > *In_Sig;
   Signal< bit_t > *Symb_Clock_In;
   int Bits_Per_Symb;
   int Samps_Per_Symb;
   byte_t Num_Diff_Symbs;
   double *Integ_Val;
   std::complex<float> *Conj_Ref;
};


Listing 9A.5

Constructor for MpskOptimalDemod model.

MpskOptimalDemod::MpskOptimalDemod( char* instance_name,
                                    PracSimModel* outer_model,
                                    Signal< std::complex<float> >* in_sig,
                                    Signal< bit_t >* symb_clock_in,
                                    Signal< byte_t >* out_sig )
   :PracSimModel(instance_name, outer_model)
{
   MODEL_NAME(MpskOptimalDemod);
   ENABLE_MULTIRATE;

   // Read model config parms
   OPEN_PARM_BLOCK;
   GET_INT_PARM(Bits_Per_Symb);
   GET_INT_PARM(Samps_Per_Symb);

   // Connect input and output signals
   Out_Sig = out_sig;
   Symb_Clock_In = symb_clock_in;
   In_Sig = in_sig;
   MAKE_OUTPUT( Out_Sig );
   MAKE_INPUT( Symb_Clock_In );
   MAKE_INPUT( In_Sig );

   double resamp_rate = 1.0/double(Samps_Per_Symb);
   CHANGE_RATE( In_Sig, Out_Sig, resamp_rate );
   CHANGE_RATE( Symb_Clock_In, Out_Sig, resamp_rate );

   Num_Diff_Symbs = 1;
   for(int i=1; i<=Bits_Per_Symb; i++)
      Num_Diff_Symbs *= 2;

   Block_Size = Out_Sig->GetBlockSize();
   Out_Samp_Intvl = Out_Sig->GetSampIntvl();
   //
   // set up table of phase references
   Conj_Ref = new std::complex<float>[Num_Diff_Symbs];
   Integ_Val = new double[Num_Diff_Symbs];
   for( byte_t isymb=0; isymb<Num_Diff_Symbs; isymb++) {
      ...
   }
}

Listing 9A.6

Execute method for MpskOptimalDemod model.

int MpskOptimalDemod::Execute(void)
{
   ...
   block_size = In_Sig->GetValidBlockSize();
   Out_Sig->SetValidBlockSize(block_size/Samps_Per_Symb);
   integ_val = Integ_Val;
   for (is=0; is<block_size; is++) {
      ...
   }
   ...
}

Constructor:
   PolarFreqDomainFilter::PolarFreqDomainFilter(
         char* instance_name,
         PracSimModel* outer_model,
         Signal< std::complex<float> > *in_sig,
         Signal< std::complex<float> > *out_sig);
Parameters:
   int Fft_Size;
   double Dt_For_Fft;
   float Overlap_Save_Mem;
   bool Bypass_Enabled;
   char* Magnitude_Data_Fname;
   double Mag_Freq_Scaling_Factor;
   char* Phase_Data_Fname;
   double Phase_Freq_Scaling_Factor;
Notes:
   1. Source code is contained in file polar_freq_dom_filt.cpp.

Example 10.4 As shown in Example 10.3, the scatter diagram for an 8-PSK signal is virtually unchanged when the signal is passed through a memoryless nonlinearity. However, this same signal will be degraded by a two-box

Amplifiers and Mixers

Chapter 10

Figure 10.24 Simulation architecture for Example 10.4 (signal flow through SymbGener, MpskSymbsToQuadWaves, QuadratureModulator, additive Gaussian noise, CmpxToQuadrature, PolarFreqDomainFilter, two AnlgDirectFormFir models used as lowpass data filters, QuadratureToCmpx, NonlinearAmplifier, and CmpxIqPlot).

nonlinear amplifier model that uses the magnitude and phase responses from Figures 10.22 and 10.23. The simulation architecture is shown in Figure 10.24. Figure 10.25(b) shows the scatter diagram for an 8-PSK signal (with Eb /N0 = 14 dB) passed through the two-box nonlinear amplifier. Compare this to Figure 10.25(a), which shows the scatter diagram when the filter is bypassed so that the signal passes through only the memoryless nonlinearity from Example 10.3. In AWGN, an 8-PSK signal with Eb /N0 = 14 dB achieves an SER of 2 × 10−5 when optimally demodulated. When the two-box model from this example is added to the signal path, the SER degrades to 9 × 10−4 .

Section 10.3 Two-Box Nonlinear Amplifier Models

Figure 10.25 Scatter diagram of 8PSK constellation for Example 10.4: (a) output of data filters with PolarFreqDomainFilter bypassed and (b) with PolarFreqDomainFilter enabled.

Appendix 10A

EXAMPLE SOURCE CODE

The companion Web site includes four Microsoft Visual Studio projects, each comprising a simulation that demonstrates and provides a test vehicle for a different aspect of amplifier modeling. The project Zm_Nonlin uses the model IdealHardLimiter to demonstrate the operation of a memoryless nonlinearity. The project AmAm_AmPm uses the NonlinearAmplifier model presented in Section 10A.1 to simulate the AM/AM and AM/PM conversion often exhibited by practical amplifiers. The project NLA_2_Box combines NonlinearAmplifier with the PolarFreqDomainFilter model to assess the impact of nonlinear amplification upon modulated data waveforms. Project Msk_2Box is similar to NLA_2_Box, but structured to use MSK modulator and demodulator models.

10A.1 NonlinearAmplifier

The header for NonlinearAmplifier is provided in Listing 10A.1, and implementations for the constructor and the Execute method are provided in Listings 10A.2 and 10A.3, respectively. This model uses the SampledCurve class shown in Listing 10A.4 to interpolate sampled AM/AM and AM/PM curves for various input power levels.


Listing 10A.1

Header for NonlinearAmplifier model.

class NonlinearAmplifier : public PracSimModel
{
public:
   NonlinearAmplifier( char* instance_name,
                       PracSimModel *outer_model,
                       Signal< complex<float> > *in_signal,
                       Signal< complex<float> > *out_sig );
   ~NonlinearAmplifier(void);
   void Initialize(void);
   int Execute(void);
private:
   int Out_Avg_Block_Size;
   int In_Avg_Block_Size;
   Signal< complex<float> > *In_Sig;
   Signal< complex<float> > *Out_Sig;
   double Output_Power_Scale_Factor;
   double Phase_Scale_Factor;
   double Anticipated_Input_Power;
   double Operating_Point;
   double Agc_Time_Constant;
   float Input_Power_Scale_Factor;
   bool Agc_On;
   SampledCurve *Am_Am_Curve;
   SampledCurve *Am_Pm_Curve;
   char *Am_Am_Fname;
   char *Am_Pm_Fname;
};


Listing 10A.2


Constructor for NonlinearAmplifier model.

NonlinearAmplifier::NonlinearAmplifier( char* instance_name,
                                        PracSimModel* outer_model,
                                        Signal< complex<float> >* in_sig,
                                        Signal< complex<float> >* out_sig )
   :PracSimModel(instance_name, outer_model)
{
   MODEL_NAME(NonlinearAmplifier);

   // Read model config parms
   OPEN_PARM_BLOCK;
   GET_DOUBLE_PARM(Output_Power_Scale_Factor);
   GET_DOUBLE_PARM(Phase_Scale_Factor);
   GET_DOUBLE_PARM(Anticipated_Input_Power);
   GET_DOUBLE_PARM(Operating_Point);
   GET_DOUBLE_PARM(Agc_Time_Constant);
   Input_Power_Scale_Factor =
         float(Operating_Point/Anticipated_Input_Power);
   Am_Am_Fname = new char[64];
   strcpy(Am_Am_Fname, "\0");
   GET_STRING_PARM(Am_Am_Fname);
   Am_Pm_Fname = new char[64];
   strcpy(Am_Pm_Fname, "\0");
   GET_STRING_PARM(Am_Pm_Fname);

   // Connect input and output signals
   In_Sig = in_sig;
   Out_Sig = out_sig;
   MAKE_OUTPUT( Out_Sig );
   MAKE_INPUT( In_Sig );

   Am_Am_Curve = new SampledCurve(Am_Am_Fname);
   Am_Pm_Curve = new SampledCurve(Am_Pm_Fname);
}


Listing 10A.3

Execute method for NonlinearAmplifier model.

int NonlinearAmplifier::Execute()
{
   complex<float> *out_sig_ptr, out_sig;
   complex<float> *in_sig_ptr, in_sig;
   complex<float> agc_in_sig;
   float power, power_out;
   float input_phase;
   double phase_shift;
   double amplitude;
   double phase_out;
   double sum_in, sum_out;
   double amp_sqrd;
   double avg_power_in, avg_power_out;
   int block_size, is;

   block_size = In_Sig->GetValidBlockSize();
   Out_Sig->SetValidBlockSize(block_size);
   //-------------------------------------------------
   out_sig_ptr = GET_OUTPUT_PTR( Out_Sig );
   in_sig_ptr = GET_INPUT_PTR( In_Sig );

   sum_in = 0.0;
   sum_out = 0.0;
   for(is=0; is<block_size; is++) {
      ...
      power_out = float(Am_Am_Curve->GetValue(power));
      sum_out += power_out;
      phase_shift = Am_Pm_Curve->GetValue(power);
      amplitude = sqrt(2.0*power_out);
      phase_out = input_phase + Phase_Scale_Factor*phase_shift;
      out_sig = complex<float>( float(amplitude*cos(phase_out)),
                                float(amplitude*sin(phase_out)));
      *out_sig_ptr++ = out_sig;
   }
   avg_power_out = sum_out/block_size;
   avg_power_in = sum_in/block_size/2;
   ...
}
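The per-sample AM/AM and AM/PM operation in the loop above can be sketched independently of the PracSim infrastructure. Interp below is a piecewise-linear stand-in for the book's SampledCurve class, and the power convention (instantaneous power = |x|²/2) follows the listing's sqrt(2.0*power_out) amplitude mapping; everything else is illustrative:

```cpp
#include <cmath>
#include <complex>
#include <utility>
#include <vector>

// Piecewise-linear lookup over (x, y) samples, clamped at the ends.
double Interp(const std::vector<std::pair<double,double>>& curve, double x)
{
    if (x <= curve.front().first) return curve.front().second;
    if (x >= curve.back().first)  return curve.back().second;
    for (size_t i = 1; i < curve.size(); ++i)
        if (x <= curve[i].first) {
            double f = (x - curve[i-1].first) / (curve[i].first - curve[i-1].first);
            return curve[i-1].second + f * (curve[i].second - curve[i-1].second);
        }
    return curve.back().second;
}

// Apply AM/AM (input power -> output power) and AM/PM (input power ->
// phase shift, radians) conversion to one complex baseband sample.
std::complex<double> AmAmAmPm(std::complex<double> in,
                              const std::vector<std::pair<double,double>>& am_am,
                              const std::vector<std::pair<double,double>>& am_pm)
{
    double p_in  = 0.5 * std::norm(in);             // instantaneous power
    double p_out = Interp(am_am, p_in);             // AM/AM conversion
    double phase = std::arg(in) + Interp(am_pm, p_in);  // AM/PM conversion
    double amp   = std::sqrt(2.0 * p_out);          // back to amplitude
    return std::polar(amp, phase);
}
```

With an identity AM/AM curve and a zero AM/PM curve the sample passes through unchanged, which is a convenient sanity check when building real curve files.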

Constructors:
   DiscreteDelay< T >::DiscreteDelay( //note 2
         char* instance_name,
         PracSimModel* outer_model,
         Signal<T>* in_signal,
         Signal<T>* out_signal,
         Control *dynamic_delay)
   DiscreteDelay< T >::DiscreteDelay( //note 3
         char* instance_name,
         PracSimModel* outer_model,
         Signal<T>* in_signal,
         Signal<T>* out_signal)
Parameters:
   DELAY_MODE_T Delay_Mode;
   int Initial_Delay_In_Samps;
   int Num_Initial_Passes;
   int Max_Delay_In_Samps;
Notes:
   1. BlockSize for this model is set by the PracSim system.
   2. This constructor does not support DELAY_MODE_GATED.
   3. This constructor does not support DELAY_MODE_GATED or DELAY_MODE_DYNAMIC.
   4. Source code is contained in file discrete_delay_T.cpp.


Chapter 11 Synchronization and Signal Shifting

the second input block. The remaining Nblock − Nadv samples from the second input block are saved inside the model for use in building the second output block. A model that operates this way to realize a signal advance does not adhere to a block-synchronous protocol—output block B cannot be generated until after input block B + 1 is available. The problem becomes even worse when Nadv is greater than Nblock. In this case, output block B cannot be generated until after input block B + Badv, where Badv = ⌈Nadv/Nblock⌉. Some fundamental changes to the simulation infrastructure must be made to accommodate models that operate this way. Two different approaches are supported in PracSim.

11.1.2.1 Block-Synchronous Enclaves

Consider the hypothetical simulation shown in Figure 11.2. The models upstream from the SignalAdvance model—Model_A through Model_B—must each be executed in sequence for a total of Badv + 1 passes to generate the input samples needed by SignalAdvance to generate the first block of its output signal, sig_b. The models downstream from the SignalAdvance model—Model_C through Model_D—are not permitted to execute until after SignalAdvance begins producing blocks of sig_b. Considered apart from the rest of the simulation, Model_A through Model_B can be operated in a block-synchronous fashion. Likewise, the models Model_C through Model_D can be operated in a block-synchronous fashion when the rest of the simulation is not involved. Thus, even though the simulation as a whole is not block-synchronous, it can be divided into two block-synchronous enclaves joined together by the SignalAdvance model, which acts as a gateway for the flow of data and control between the enclaves.

During the execution phase of a simulation, the Execute methods of the various models are called in the proper sequence by the RunSimulation method of the ActiveSystemGraph class. When the Execute methods of block-synchronous models terminate normally, they return a value of MES_AOK (Model Execution Status, All OKay) to RunSimulation, thus allowing the execution to continue on to the next model in the sequence. Until it is able to issue a block of output samples, a block-asynchronous model such as SignalAdvance will return a value of MES_RESTART that causes RunSimulation to increment the global value of PassNumber and begin executing the first model in enclave 0. Once enclave 0 has been executed in this way for a number of passes and the block-asynchronous model has received a number of input blocks sufficient to allow generation of the first output block, the block-asynchronous model will perform the following exit sequence:

Section 11.1 Shifting Signals in Time

[Figure 11.2: Simulation architecture for exploring the problems raised by signal advance. Enclave 0 comprises Model_A through Model_B; their output signal_a feeds the SignalAdvance model, whose output signal_b feeds enclave 1, comprising Model_C through Model_D.]

1. Call SigPlot.CollectData to cause plot values for enclave 0 to be collected. (For the final enclave in a simulation, the call to SigPlot.CollectData is made by RunSimulation at the end of each pass. When the entire simulation is block-synchronous, the final enclave is the only enclave, and none of the models needs concern itself with calling SigPlot.CollectData.)
2. Increment the global value of the enclave number.
3. Set the local value of New_Pass_Number to 1.
4. Set the global value of PassNumber equal to the local value of New_Pass_Number.
5. Return a value of MES_AOK to RunSimulation. This allows execution to continue on to the first model in the next enclave rather than return to the top of enclave 0.

On subsequent passes, the model performs all of these actions with one exception. Instead of setting New_Pass_Number equal to 1, the existing value is simply incremented by 1.
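The restart handshake described above can be caricatured in a few lines. Everything here except the MES_* status names is invented for illustration; the real RunSimulation walks an ordered model graph rather than a simple counter:

```cpp
#include <cstddef>

// Model execution status values, as named in the text.
enum ModelExecStatus { MES_AOK, MES_RESTART };

// Toy gateway model: returns MES_RESTART until it has consumed enough
// enclave-0 input blocks to emit one output block, then resets.
struct ToyGateway {
  std::size_t blocks_needed = 1;  // input blocks required per output block
  std::size_t blocks_seen = 0;

  ModelExecStatus Execute() {
    ++blocks_seen;
    if (blocks_seen < blocks_needed) return MES_RESTART;
    blocks_seen = 0;              // start accumulating for the next output
    return MES_AOK;
  }
};

// Re-run enclave 0 until the gateway releases an output block; returns
// how many enclave-0 passes were needed before enclave 1 could proceed.
std::size_t PassesPerOutputBlock(ToyGateway& gw) {
  std::size_t passes = 0;
  for (;;) {
    ++passes;                     // execute all models in enclave 0 ...
    if (gw.Execute() == MES_AOK)  // ... then the gateway model
      return passes;              // control proceeds into enclave 1
  }
}
```

When blocks_needed is 1 the gateway is effectively block-synchronous and each enclave-0 pass yields one enclave-1 pass.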


11.1.2.2 Variable Block Lengths

For operation using block-synchronous enclaves, execution is not permitted to progress from one enclave to the next until the block-asynchronous model joining the two enclaves has received a number of input samples sufficient to allow generation of a complete output block. An alternative approach is to allow the block-asynchronous model to generate an incomplete output block during every pass. On each pass, the output block will be as large as possible given the available inputs. Models downstream from the block-asynchronous model must be prepared to deal with varying block sizes. For most models, this is simply a matter of reading the block size from the signal once per pass rather than just once at system initialization. Models that use FFTs and models that implement memory as fixed-length circular buffers both depend upon input signals having block lengths that are constant. A reblocking model can be used to allow simulations with variable block lengths to include models that depend upon fixed block lengths. A reblocking model contains a buffer that accumulates input samples over multiple simulation passes. Whenever the buffer does not contain enough samples for an output block to be issued, the model sets a flag that downstream models can check during each pass to determine whether or not they should execute.
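A reblocking buffer of the kind described can be sketched as follows (hypothetical names, not the PracSim API): variable-sized blocks go in, fixed-size blocks come out, and a ready flag tells downstream models whether to execute on this pass.

```cpp
#include <cstddef>
#include <vector>

// Accumulates variable-length input blocks and issues fixed-length
// output blocks once enough samples are pending.
template <typename T>
class Reblocker {
 public:
  explicit Reblocker(std::size_t out_block_size)
      : out_size_(out_block_size) {}

  void PutBlock(const std::vector<T>& in) {
    pending_.insert(pending_.end(), in.begin(), in.end());
  }

  // The flag downstream models would check each pass.
  bool OutputReady() const { return pending_.size() >= out_size_; }

  // Precondition: OutputReady() is true.
  std::vector<T> GetBlock() {
    std::vector<T> out(pending_.begin(), pending_.begin() + out_size_);
    pending_.erase(pending_.begin(), pending_.begin() + out_size_);
    return out;
  }

 private:
  std::size_t out_size_;
  std::vector<T> pending_;
};
```

Leftover samples simply remain pending until subsequent passes supply enough input to complete the next fixed-length block.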

11.1.2.3 Dynamic Advances

Implementing a dynamic advance model can run into a buffering problem that is similar to the buffering problem encountered in connection with the dynamic delay model. Consider the case of a signal having 10 samples per block, as depicted in Figure 11.3. If the initial advance is four sample intervals in passes 1 and 2 of enclave 0, the samples in the enclave 1, pass 1, output block will be the samples x4, x5, x6, . . . , x13. Input samples x14 through x19 will be saved in the model's internal buffer for use in pass 2 of enclave 1. If the advance is then reduced to one sample, the samples in the pass 2 output block should be x11, x12, . . . , x20. However, if the model is implemented using a minimal buffering scheme, the samples x11, x12, and x13 would not be available, having been used and discarded during pass 1 of enclave 1.

Similar to the approach taken for variable delays, the variable advance model should allocate and maintain its buffer based on a maximum specified advance that is provided as one of the model's configuration parameters. The correct implementation of buffering for variable advances is slightly more complicated than it is for variable delays. Not only does the specified maximum advance govern the size of the internal buffer, it also establishes the offset between the pass numbers in the enclaves for which the advance model serves as a gateway. Suppose that we specify a maximum advance of 24 samples for the case of a signal

[Figure 11.3: Sample loss in a simple implementation of dynamic advance. With 10 samples per block and an advance of 4 samples during enclave 0 passes 1 and 2, the enclave 1, pass 1, output block is x4 through x13, with x14 through x19 held in the buffer. When the advance control drops to 1 sample, the pass 2 output should begin at x11, but x11 through x13 have already been discarded.]


having 10 samples per block. If the variable advance starts out at the maximum, the model must skip over the first two input blocks and the first four samples of the third block. The first output block (i.e., pass 1 of enclave 1) will contain samples x24 through x33. The enclave providing input to the advance model must execute pass 4 before the enclave accepting the advanced output can begin pass 1; thus the offset between enclaves is 4 − 1 = 3 passes. If the two enclaves are to remain block synchronous within themselves, this offset cannot change over the life of the simulation—even if the desired amount of advance changes. If the desired advance is reduced from 24 to zero during pass 5 of enclave 0, and the offset between enclaves is held constant, the second output block of the advance model should contain samples x10 through x19, and the internal buffer should hold samples x20 through x49 for use in subsequent passes. We can generalize on this example to conclude that the internal buffer must be sized to hold P complete input blocks, where

P = ⌈Nmax/Nblock⌉

Another way to look at this is to forget about enclaves for a minute and just recognize that an advance of 24 is equivalent to an advance of 30 plus a delay of 6. This delay could be accomplished using a buffer of length 6. The two different block-synchronous enclaves offset by three passes simply provide a systematic way to mechanize an advance of 30 samples. However, once the offset between enclaves is set to 3 passes, it must stay fixed for the life of the simulation. Thus, when the advance is reduced to zero, it must be viewed as an advance of 30 plus a delay of 30, the latter requiring the model to have an internal buffer of length 30. In general, an advance of Nmax samples can be viewed as an advance of P·Nblock samples plus a delay of P·Nblock − Nmax samples.
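The sizing rule can be checked numerically. This is a worked sketch with invented names, not PracSim code:

```cpp
#include <cstddef>

// P = ceil(N_max / N_block): the number of complete input blocks the
// internal buffer must hold for a maximum advance of n_max samples.
std::size_t PassesP(std::size_t n_max, std::size_t n_block) {
  return (n_max + n_block - 1) / n_block;   // integer ceiling
}

// An advance of n_max is treated as an advance of P*N_block plus a
// delay of P*N_block - n_max; this returns the delay component.
std::size_t DelayComponent(std::size_t n_max, std::size_t n_block) {
  return PassesP(n_max, n_block) * n_block - n_max;
}
```

For the example in the text (Nmax = 24, Nblock = 10), P = 3 and the delay component is 6; reducing the advance to zero requires the delay component to grow to P·Nblock = 30.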
It turns out that even though a maximum advance of Nmax is specified, the model can actually support an advance of up to P·Nblock samples by simply allowing the delay component to become zero.

11.1.2.4 Discrete Advance Model

Table 11.2 summarizes the model DiscreteAdvance that can be used for advancing signals by integer multiples of the sampling interval. The model can operate in one of four different modes depending upon the value of Advance_Mode that is read from the parameter input file when the simulation is being configured. Advance_Mode is a variable of type ADVANCE_MODE_T, which is an enumerated type defined in advance_modes.h. Parameter input and stream I/O support for this enumeration are provided in advance_modes.cpp. The possible values for Advance_Mode and the corresponding model behaviors are analogous to those given above for the Delay_Mode in the DiscreteDelay model.

Section 11.1

Shifting Signals in Time

Table 11.2 Summary of model DiscreteAdvance.

Constructors:
DiscreteAdvance< T >::DiscreteAdvance(
    char* instance_name,
    PracSimModel* outer_model,
    Signal* in_signal,
    Signal* out_signal,
    Control* dynamic_adv,
    Control* adv_change_enabled)

DiscreteAdvance< T >::DiscreteAdvance(      // note 2
    char* instance_name,
    PracSimModel* outer_model,
    Signal* in_signal,
    Signal* out_signal,
    Control* dynamic_adv)

DiscreteAdvance< T >::DiscreteAdvance(      // note 3
    char* instance_name,
    PracSimModel* outer_model,
    Signal* in_signal,
    Signal* out_signal)

Parameters:
    ADVANCE_MODE_T Advance_Mode;
    int Initial_Adv_In_Samps;
    int Num_Initial_Passes;
    int Max_Adv_In_Samps;

Notes:
1. BlockSize for this model is set by the PracSim system.
2. This constructor does not support ADVANCE_MODE_GATED.
3. This constructor does not support ADVANCE_MODE_GATED or ADVANCE_MODE_DYNAMIC.
4. Source code is contained in file discrete_adv_T.cpp.

11.1.3 Continuous-Time Delays via Interpolation

Sometimes it is necessary to delay a signal by an interval that is not an integer multiple of the sampling interval. In these cases, interpolation must be used to generate new sample values that fall within the intervals between existing samples. If the signal that is to be delayed has been sampled at a relatively high rate, linear interpolation will often provide sufficient accuracy.

Consider the case of a signal having 10 samples per block, as depicted in Figure 11.4. Let's say the signal is to be delayed by 3.6T, where T is the sample interval. This would delay the input x0 to a point 0.6T to the right of y3 and 0.4T to the left of y4. Similarly, x1 will be delayed to a point 0.6T to the right of y4 and 0.4T to the left of y5. The model's output samples must be those values that occur at integer multiples of T, not at times (n + 0.6)T. The output y4 occurring at time 4T needs to be the value that x would have at this point if the signal were a function of continuous time. As depicted for arbitrary values of x0 and x1 in Figure 11.5, the output y4 occurs at a point 0.4T to the right of the delayed x0 and at 0.6T to the left of the delayed x1. Using linear interpolation to calculate y4 yields

(y4 − x0)/(x1 − x0) = (4T − 3.6T)/(4.6T − 3.6T)
(y4 − x0)/(x1 − x0) = 0.4T/T
y4 = 0.6x0 + 0.4x1

A similar equation can be written for each output and generalized to obtain

yn = 0.6xn−4 + 0.4xn−3          (11.1.1)

If we assume xm ≡ 0 for m < 0, we have y0 = y1 = y2 = 0. Calculation of y3 requires special consideration. If we assume x−1 = 0 and apply (11.1.1), we obtain

y3 = 0.4x0

On the other hand, we could just define y3 = 0. The output sample y9 is the last output in block 1 and makes use of inputs x5 and x6. The samples x6, x7, x8, and x9 must be saved inside the model for use during pass 2 of the simulation.

These results can be extended to the general case of a signal that is to be delayed by τT, where T is the sampling interval and τ is nonnegative and real. Let m be the largest integer that does not exceed τ:

m = ⌊τ⌋

Then we can state the following:


[Figure 11.4: Relative sample positions when yk is delayed by 3.6T with respect to xk.]

1. The first m + 1 samples of output block 1 will be zero.
2. The last m + 1 samples in each input block must be saved for use at the start of the next pass.
3. The output samples yn for n > m can be computed as

   yn = W·xn−m−1 + (1 − W)·xn−m

   where W = τ − ⌊τ⌋.
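The rules above can be checked with a small standalone routine. This is a sketch (not the PracSim ContinuousDelay code) that assumes x[m] = 0 for m < 0, so that y[m] is computed from rule 3 as (1 − W)x[0] rather than being forced to zero:

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Delay a finite sequence by tau sample intervals (tau >= 0) using
// linear interpolation: y[n] = W*x[n-m-1] + (1-W)*x[n-m], with
// m = floor(tau) and W = tau - floor(tau). Out-of-range inputs are zero.
std::vector<double> LinearDelay(const std::vector<double>& x, double tau) {
  std::size_t m = static_cast<std::size_t>(std::floor(tau));
  double w = tau - std::floor(tau);
  std::vector<double> y(x.size(), 0.0);
  for (std::size_t n = 0; n < x.size(); ++n) {
    double xa = (n >= m + 1) ? x[n - m - 1] : 0.0;   // x[n-m-1]
    double xb = (n >= m)     ? x[n - m]     : 0.0;   // x[n-m]
    y[n] = w * xa + (1.0 - w) * xb;
  }
  return y;
}
```

For tau = 3.6 this reproduces the worked example: y4 = 0.6x0 + 0.4x1, the first three outputs are zero, and y3 = 0.4x0.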

11.1.3.1 Interpolation Using Sampling Functions

According to the uniform sampling theorem, if the spectrum of a signal vanishes beyond an upper frequency of fH , the signal can be completely determined by


[Figure 11.5: Linear interpolation of output y4 from inputs x0 and x1. The delayed x0 lies at 3.6T and the delayed x1 at 4.6T, bracketing the output time 4T.]

samples taken at uniform intervals of T < 1/(2fH). The sampled signal x[m] is related to the analog xa(t) by

x[m] = xa(mT)

The original signal xa(t) can be reconstructed from x[m] by

xa(t) = Σ_{m=−∞}^{∞} x[m] · sin[(π/T)(t − mT)] / [(π/T)(t − mT)]     (11.1.2a)
      = Σ_{m=−∞}^{∞} x[m] sinc(t/T − m)                              (11.1.2b)

where

sinc(τ) ≜ 1,              τ = 0
          sin(πτ)/(πτ),   otherwise                                  (11.1.3)

Because Eq. (11.1.2) can be used to find xa(t) for any value of t, it can be used as the basis for a time shifter that can shift a sampled signal by an arbitrary amount. To realize a delay of τ, we need to calculate values of xa(t) for t = nT − τ:

xdly(nT) = xa(nT − τ) = Σ_{m=−∞}^{∞} x[m] sinc(n − m − τ/T)          (11.1.4)

Equation (11.1.4) corresponds to the “usual” view of sinc interpolation, depicted in Figure 11.6, in which each input sample x[m] is used to weight a sinc function that is centered at t = mT. The interpolated value at any arbitrary time td is obtained as the sum of the values at time td of all the weighted sinc functions. However, because

[Figure 11.6: Interpolation using sinc functions. Each input x[m] weights a sinc centered at mT; the interpolated value x(td) is the sum of the weighted sincs evaluated at td.]


sinc(τ) is symmetric about τ = 0, it is a simple matter to show that the value at t = nT − τ of a sinc function centered at t = mT is equal to the value at t = mT of a sinc function centered at t = nT − τ. This observation leads to an equivalent view of sinc interpolation at time td, which has a set of weighted sinc functions all centered at td, as depicted in Figure 11.7. The sinc weighted by x[m1] is evaluated at time m1T, the sinc weighted by x[m2] is evaluated at time m2T, and so on. The equation corresponding to this alternate view of interpolation is given by

xdly(nT) = xa(nT − τ) = Σ_{m=−∞}^{∞} x[m] sinc(m + τ/T − n)          (11.1.5)

This alternate view is used in the theoretical development of a continuous-delay model. Equation (11.1.3) indicates that for all integer values of τ other than zero, the value of sinc(τ) is zero. This means that in cases where the desired shift, τ, is an integer multiple of T, the only nonzero summand in both Eqs. (11.1.4) and (11.1.5) will be the one for which m = n − τ/T, so the interpolation is merely selecting an input sample for each shifted output such that

xout[n] = x[n − τ/T]

The more interesting case is when τ is not an integer multiple of T. Using the alternative view of Figure 11.7, the peaks of all the sinc functions used to compute xout[n] occur at a time between samples k and k + 1, where

k = ⌊n − τ/T⌋                                                        (11.1.6)

The output index n can be expressed in terms of the sinc alignment index k as

n = k + ⌈τ/T⌉                                                        (11.1.7)

Substitution of Eq. (11.1.7) into Eq. (11.1.4) yields

xdly(nT) = Σ_{m=−∞}^{∞} x[m] sinc(k − m + ⌈τ/T⌉ − τ/T)               (11.1.8)

The quantity ⌈τ/T⌉ − τ/T, which lies between 0 and 1, is the displacement, as a fraction of the sample interval, between t = kT and the centers of the sinc functions.

The values of sinc(τ) become small for large values of τ, so the range of the summation in Eq. (11.1.8) can be truncated to a range of m for which the values of sinc(k − m + ⌈τ/T⌉ − τ/T) are "significant." Assume that each output sample is to include contributions from M leading input samples that occur immediately prior to the sinc peak as well as contributions from M lagging input samples that occur immediately after the sinc peak. The M leading samples are x[k + 1 − M] through x[k], and the M lagging samples are x[k + 1] through x[k + M]. The interpolation equation then becomes

y[n] = Σ_{m=k−M+1}^{k+M} x[m] sinc(k − m + ⌈τ/T⌉ − τ/T)              (11.1.9)

For any given value of k, the summation will require input samples k − M + 1 through k + M, and the result will be stored in output sample k + ⌈τ/T⌉. For given values of M and τ, the sinc function needs to be evaluated at the same 2M abscissae for every value of the output index n. This fact allows a continuous-delay model to compute a set of 2M sinc factors at the beginning of a simulation and use these same values for generating each output for as long as the delay remains constant. It turns out that an interpolating delay element based on Eq. (11.1.9) is really nothing more than an FIR filter in which the filter coefficients h[p] are obtained as

h[p] = sinc(p − M + ⌈τ/T⌉ − τ/T),    p = 0, 1, . . . , 2M − 1
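The coefficient computation can be sketched as follows. The identifiers are invented, and the tap indexing follows the truncated-window form reconstructed here, so treat it as illustrative rather than the PracSim code:

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Truncated-sinc fractional-delay FIR taps:
// h[p] = sinc(p - M + ceil(d) - d), p = 0..2M-1, where d = tau/T is the
// (non-integer) delay expressed in sample intervals.
std::vector<double> SincDelayTaps(double d, std::size_t M) {
  const double kPi = 3.14159265358979323846;
  double shift = std::ceil(d) - d;          // fraction of one sample interval
  std::vector<double> h(2 * M);
  for (std::size_t p = 0; p < 2 * M; ++p) {
    double arg = double(p) - double(M) + shift;
    h[p] = (std::fabs(arg) < 1e-12) ? 1.0
                                    : std::sin(kPi * arg) / (kPi * arg);
  }
  return h;
}
```

For a half-sample delay component the window is symmetric, so the two taps of an M = 1 filter are equal (each 2/π), which is a quick check on the indexing.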

Block Mode Considerations

Before using Eq. (11.1.9) to implement a delay model, it is useful to explore how the various sample locations are related to the block boundaries of the input and output signals. In the discussions so far, the indices n and k have been assumed to start at zero and continually increment until the simulation ends. A different notation is needed to indicate indices within a particular block. In the development that follows, k[B] is used to indicate a sample index within input block B. The intrablock index k[B] and corresponding global index k

[Figure 11.7: Alternate view of sinc interpolation. The sincs are all centered at td; the sinc weighted by x[mi] is evaluated at miT, and the sum of these values gives x(td).]

are related by

k[B] = k,                       B = 1
k[B] = k − Σ_{b=1}^{B−1} Nb,    otherwise

where Nb is the number of samples in block b. (Note that in PracSim, block numbering begins with one rather than zero because it is convenient to keep pass numbering and block numbering the same, and pass zero is reserved for some advanced consistency checking that does not generate any blocks of signal samples.) To simplify the initial development, let's assume that M + ⌈τ/T⌉ < N. For development of implementation rules, it is more convenient to express the position of the sinc window in terms of its first and last samples rather than in terms of the alignment index k. The alignment index is a contrived quantity introduced solely for the purpose of allowing window position and window length to be considered independently in the foregoing development. Let mL represent the global index of the leftmost (i.e., first) sample in the truncated sinc window, and let mR represent the global index of the rightmost sample in the truncated sinc window:

mL = k − M + 1
mR = k + M

y[n] = Σ_{m=mL}^{mR} x[m] sinc(mL + M − 1 − m + ⌈τ/T⌉ − τ/T)

The output index n can be expressed in terms of mL as

n = mL + M − 1 + ⌈τ/T⌉
  = mL + M + ⌊τ/T⌋

Consider input block B containing NB samples. The earliest output sample that depends on any inputs from block B is the output computed when m[B]R = 0 or

mR = Σ_{b=1}^{B−1} Nb

Generation of this output also involves samples m[B−1] = NB−1 − 2M + 1 through m[B−1] = NB−1 − 1 saved from input block B − 1. (When B = 1, the values of


these saved samples are each assumed to be zero.) Using Eq. (11.1.7), the index for this output sample is obtained as

nE = Σ_{b=1}^{B−1} Nb − M + ⌈τ/T⌉                                    (11.1.10)

The latest output sample that can be computed before input block B + 1 becomes available is the one computed when m[B]R = NB − 1 or

mR = NB − 1 + Σ_{b=1}^{B−1} Nb = Σ_{b=1}^{B} Nb − 1

Using Eq. (11.1.7), the index for this output sample is obtained as

nL = Σ_{b=1}^{B−1} Nb + NB − M + ⌈τ/T⌉ − 1

The output samples, indexed nE through nL, comprise a sequence of exactly NB samples, and it seems logical to keep them together as a single output block. Let's denote this block as output block D and not immediately assume that D = B. Close examination of Eq. (11.1.10) reveals that the number of input samples prior to block B and the number of output samples prior to block D differ by ⌈τ/T⌉ − M. When ⌈τ/T⌉ = M, the difference is zero, and it seems logical to make D = B and keep the output block size equal to the input block size. When ⌈τ/T⌉ < M, the number of input samples prior to block B is greater than the number of output samples prior to block D. If M − ⌈τ/T⌉ is less than the nominal block size, it still makes sense to make D = B, but one output block (block 1 is the logical choice) needs to be shorter than the corresponding input block. If M − ⌈τ/T⌉ is greater than the nominal block size, we are faced with a choice. We can totally eliminate a number of output blocks such that at most one output block needs to be shortened, or we can keep D = B and shorten two or more output blocks. When ⌈τ/T⌉ > M, the number of input samples prior to block B is less than the number of output samples prior to block D. This case arises when the delay is larger than half the span of the truncated sinc window. The sequence of output samples will begin with a preamble of ⌈τ/T⌉ − M zeros taking the place of output samples


whose calculation would require delayed samples from input blocks prior to block 1. Signal management is easier if the nominal block size is made the maximum block size. Therefore, when ⌈τ/T⌉ > M, a number of extra output blocks must be introduced so that the size of any one block does not exceed the nominal size.

The first sample in output block B has an intrablock index of n[B] = 0, which corresponds to a global index n = (B − 1)N. The position of the sinc window for generating this output is obtained from Eq. (11.1.6) as

k1 = ⌊(B − 1)N − τ/T⌋

The first input sample needed for the summation in this position has a global index m1A given by

m1A = ⌊(B − 1)N − τ/T⌋ − M + 1                                       (11.1.11)
    = (B − 1)N − ⌈τ/T⌉ − M + 1                                       (11.1.12)

and the final input sample needed has a global index m1B given by

m1B = ⌊(B − 1)N − τ/T⌋ + M                                           (11.1.13)
    = (B − 1)N − ⌈τ/T⌉ + M                                           (11.1.14)

The final sample in output block B has an intrablock index of n[B] = N − 1, which corresponds to a global index n = BN − 1. The position of the sinc window for generating this output is obtained from Eq. (11.1.6) as

k2 = ⌊BN − 1 − τ/T⌋

The first input sample needed for the summation in this position has a global index m2A given by

m2A = ⌊BN − 1 − τ/T⌋ − M + 1                                         (11.1.15)
    = BN − ⌈τ/T⌉ − M                                                 (11.1.16)

and the final input sample needed has a global index m2B given by

m2B = ⌊BN − 1 − τ/T⌋ + M                                             (11.1.17)
    = BN − 1 − ⌈τ/T⌉ + M                                             (11.1.18)
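Equations (11.1.11) through (11.1.18) are easy to sanity-check numerically. The helper below (hypothetical names, illustration only) returns the span of input samples needed for the first and last samples of output block B:

```cpp
#include <cmath>

// First and last input indices needed for the first (m1A..m1B) and last
// (m2A..m2B) samples of output block B, for fixed block size N, a 2M-tap
// sinc window, and a delay of tau sample intervals.
struct SincWindowSpan {
  long m1A, m1B;   // inputs for the first sample of output block B
  long m2A, m2B;   // inputs for the last sample of output block B
};

SincWindowSpan InputSpanForOutputBlock(long B, long N, long M, double tau) {
  long ceil_tau = static_cast<long>(std::ceil(tau));
  SincWindowSpan s;
  s.m1A = (B - 1) * N - ceil_tau - M + 1;   // Eq. (11.1.12)
  s.m1B = (B - 1) * N - ceil_tau + M;       // Eq. (11.1.14)
  s.m2A = B * N - ceil_tau - M;             // Eq. (11.1.16)
  s.m2B = B * N - 1 - ceil_tau + M;         // Eq. (11.1.18)
  return s;
}
```

For N = 10, M = 4, τ/T = 3.6 (so ⌈τ/T⌉ = 4) and B = 2, the first output sample of block 2 needs inputs x3 through x10, so the last 2M − 1 = 7 samples of block 1 must be saved across the pass boundary.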


Large Delays

When ⌈τ/T⌉ > M, the early samples in output block B will be generated using only samples from input block B − 1, as depicted in Figure 11.8. Curly braces are used to indicate the span of the sinc interpolation window for various output samples. Specifically, when ⌈τ/T⌉ = M + p, 0 < p ≤ (N + 1 − 2M), the first p output samples in block B are generated using only samples from input block B − 1. The first sample in output block B is computed when

m1A = (B − 1)N − ⌈τ/T⌉ − M + 1
    = (B − 1)N − (M + p) − M + 1
    = (B − 2)N + N − 2M − p + 1

or m[B−1]1A = N − 2M − p + 1. The first sample in output block B that can be computed without using any samples from input block B − 1 corresponds to the point at which m[B]1A = 0 or m1A = (B − 1)N. This sample is stored in location n[B] = 2M + p − 1. Evaluation of Eqs. (11.1.15) and (11.1.17) for ⌈τ/T⌉ = M + p reveals that the final sample in output block B is generated using samples N − 2M − p through N − 1 − p from input block B. When ⌈τ/T⌉ = M + p for p > (N + 1 − 2M), some of the early samples in output block B will require input samples from input blocks B − 2 and prior.

Small Delays

When the delay is smaller than half the width of the sinc window, the generation of some samples late in output block B will require samples from input block B + 1, as depicted in Figure 11.9. Specifically, when ⌈τ/T⌉ = M − q, 0 < q < M, generation of sample N − 1 in output block B requires samples 0 through q − 1 from input block B + 1. Generation of the first sample in output block B uses samples N − 2M + q + 1 through N − 1 from input block B − 1 plus samples 0 through q from input block B. Because the sinc interpolation sometimes must wait for future samples, even when used to implement a delay, it encounters buffering difficulties similar to those discussed in Section 11.1.2 for signal advance models.

11.1.3.2 Dynamic Block Size

The first output sample that depends upon block B input is generated when the sinc window is positioned such that its final sample is aligned with the first sample in input block B. Thus, the first sample in output block B is computed using one input from block B and 2M − 1 inputs from block B − 1. The final output that can be

[Figure 11.8: Relationship between input and output blocks when ⌈τ/T⌉ > M.]

generated before block B + 1 becomes available is computed when the sinc window is positioned such that its final sample is aligned with the final sample in input block B, as depicted in Figure 11.10. If there are N samples in the input block, there will be a total of N output samples corresponding to the end of the sinc window aligning with each of the N input samples. However, it would be inappropriate to issue these N outputs as output block B. Assuming that all input blocks have fixed length N, sample N − 1 of block B has a global index of BN − 1. When the final sample of the sinc window is aligned with this sample, k = BN − 1 − M and the corresponding output index is obtained from Eq. (11.1.7) as

n = BN − 1 − M + ⌈τ/T⌉

which corresponds to a sample in block B + 1 or beyond whenever ⌈τ/T⌉ > M.

[Figure 11.9: Relationship between input and output blocks when ⌈τ/T⌉ < M.]

The output that should be issued as sample N − 1 of block B is generated when

k = BN − 1 − ⌈τ/T⌉

or when the final sample of the sinc window is aligned with input sample m, where m is obtained as

m = k + M = BN − 1 + M − ⌈τ/T⌉

Consider the very first output sample indexed by n = 0. Equation (11.1.7) indicates that for this sample, the sinc alignment index is given by

k = −⌈τ/T⌉


[Figure 11.10: Alignment of the sampling function for generating the final sample in an output block.]

For this alignment, the summation (11.1.9) involves input samples x[m0A] through x[m0B], where

m0A = −⌈τ/T⌉ − M + 1
m0B = −⌈τ/T⌉ + M

For ⌈τ/T⌉ > M, the summation for y[0] involves only inputs x[m] for which m < 0. Assuming that x[m] = 0 for m < 0, the proper output value should be y[0] = 0. The summation will not begin to involve inputs x[m] for m ≥ 0 until the output index n equals or exceeds ⌈τ/T⌉ − M. For ⌈τ/T⌉ = M with n = 0, the final summand in Eq. (11.1.9) involves x[0]. All of the other summands involve x[m] for m < 0. For ⌈τ/T⌉ < M, the summation for y[0] involves a number of inputs x[m] for m ≥ 0.

Assuming that the input block has a length of N, the final sample in the block will have an index m = N − 1. When the final sample of the sinc window is aligned with this sample, the alignment index is k = N − 1 − M, and the corresponding output index is obtained from Eq. (11.1.7) as

n = N − 1 − M + ⌈τ/T⌉

Thus, if the first output block contains all of the samples that can be generated from the first input block, there will be N − M + ⌈τ/T⌉ samples in this block. The final 2M − 1 samples from the input block must be saved for use in generating the first 2M − 1 samples in the second output block. The first sample in the second output block is generated when the sinc window is positioned such that its final sample is aligned with the first sample of the second input block. Thus, the first sample will be computed using one input from the second block and 2M − 1 inputs that were saved from the first block. The final output that can be generated using the second input block is computed when the sinc window is positioned such that its final sample is aligned with the final sample in the input block. If there are N samples in the input block, there will be N output samples—one output corresponding to each of the N possible alignments of the sinc window.

If ⌈τ/T⌉ is larger than N, there may be one or more all-zero output blocks before the interpolation starts using actual input samples. In such a situation it would be possible to determine the exact length of the all-zero preamble and (1) prepend this preamble to the first output block computed from actual inputs, (2) issue this preamble as one large block during any pass prior to the pass that begins using actual inputs, or (3) issue this preamble as a number of smaller blocks spread over all passes prior to the pass that begins using actual inputs.

11.1.3.3 Continuous Delay Model

Table 11.3 summarizes the model ContinuousDelay that can be used for delaying signals by arbitrary amounts. This model is provided as a template that can be instantiated for signals of various types. The model can operate in any of the four different modes described for DiscreteDelay. Additional constructors are provided that do not require the controls for gating and dynamic delay to be connected if they are not needed. The model uses interpolation to delay the input signal by intervals that in general are not integer multiples of the sampling interval. The particular interpolation technique to be used is selected by the value of Interp_Mode that is read from the parameter input file when the simulation is being configured.

1. INTERP_MODE_LINEAR. Delayed signal values are determined using simple linear interpolation between the point immediately before the desired time and the point immediately after the desired time.


2. INTERP_MODE_QUADRATIC. Quadratic interpolation is used to determine delayed signal values.

3. INTERP_MODE_SINC. Delayed signal values are interpolated using the sampling function (sin x)/x. The number of points used for this interpolation is specified by an input parameter to the model.

Near the end of each block, the model must save a number of input samples for use during processing of the subsequent block. It would be possible to draw most samples to be processed directly from the input signal buffer, and only use the internal buffer for those samples that have been saved from the previous block. However, it is conceptually easier to copy samples from the input buffer to an internal interpolation buffer and then perform all processing using samples from this internal buffer.

Let's assume that this internal buffer is implemented as a circular buffer having L ≥ 2M locations. At the beginning of pass 1, the model initializes locations L − M through L − 1 to zero. The first M input samples are read into locations 0 through M − 1. At this point, the buffer is set up for interpolating a value at a time t, where −T < t ≤ 0. In other words, the maximum available right-bracketing input sample is x0. Saying that input sample xk+1 is the maximum available right bracket means that samples xk+1 through xk+M, but not xk+M+1, have been read into the interpolation buffer. By convention in PracSim, when rate-changing is being performed, the first output sample always has the same value as the first input sample. Then, if the output waveform is being compressed in time, the second output is interpolated at some time t, where 0 < t ≤ T. In this case, the second output can be generated without reading additional inputs into the interpolation buffer. If the output waveform is being stretched in time, then the second output is interpolated at some time t, where T < t ≤ 2T.
In this case, it is necessary to read input sample xM into the buffer and thereby change the maximum available right-bracket sample from x0 to x1. In general, to compute output sample yN, the maximum required right-bracketing input sample will be xk+1, where k + 1 is obtained as

    k + 1 = ⌈N Tout / Tin⌉

Whenever the available right-bracket index equals or exceeds the required right-bracket index, the output sample can be interpolated without reading further input samples into the interpolation buffer. Whenever the required right-bracket index exceeds the available right-bracket index, additional samples must be read into the interpolation buffer.


Table 11.3  Summary of model ContinuousDelay.

Constructors:
ContinuousDelay< T >::ContinuousDelay(
        char* instance_name,
        PracSimModel* outer_model,
        Signal* in_signal,
        Signal* out_signal,
        Control *new_delay,
        Control *delay_change_enabled )
ContinuousDelay< T >::ContinuousDelay(  // note 2
        char* instance_name,
        PracSimModel* outer_model,
        Signal* in_signal,
        Signal* out_signal,
        Control *new_delay )
ContinuousDelay< T >::ContinuousDelay(  // note 3
        char* instance_name,
        PracSimModel* outer_model,
        Signal* in_signal,
        Signal* out_signal )

Parameters:
DELAY_MODE_T Delay_Mode;
INTERP_MODE_T Interp_Mode;
double Initial_Delay;
int Max_Delay;

Notes:
1. BlockSize for this model is set by the PracSim system.
2. This constructor does not support DELAY_MODE_GATED.
3. This constructor does not support DELAY_MODE_GATED or DELAY_MODE_DYNAMIC.
4. Source code is contained in file contin_delay_T.cpp.

11.2 Correlation-Based Delay Estimation

A simple cross-correlation between a transmitted waveform and the corresponding received waveform can be used to estimate the delay experienced by the waveform in passing through a simulated channel. In continuous time, the correlation of x(t) and y(t) is defined by

    Rxy(τ) = ∫_{−∞}^{∞} x(t) y∗(t + τ) dt

Subject to certain constraints, if y(t) is simply a delayed version of x(t), the value of τ for which Rxy(τ) is maximized is equal to the delay between x(t) and y(t). One important constraint is that x(t) be a waveform that has "good" autocorrelation properties. The autocorrelation Rx(τ) is simply the correlation of a signal with itself, as in

    Rx(τ) = ∫_{−∞}^{∞} x(t) x∗(t + τ) dt

A signal is said to have good autocorrelation properties if the autocorrelation function Rx(τ) has a single maximum that is sharply peaked and easily distinguished from other large "near-maximum" values. A sinusoid is an example of a waveform with "bad" autocorrelation properties—the autocorrelation of sin ωt is a periodic train of impulses at delays τ = 2kπ/ω, where k = 0, ±1, ±2, . . . . In simulations, correlations must be performed between finite blocks of samples from discrete-time signals. The length-N discrete-time correlation of x[n] and y[n] is defined by

    Rxy[k] = Σ_{n=0}^{N−1} x[n] y∗[n + k]          (11.2.1)

Subject to certain constraints, estimating the delay can be mapped into the equivalent problem of finding the delay between x̃[n] and ỹ[n], where x̃[n] and ỹ[n] are the periodic extensions of x[n] and y[n]:

    x̃[mN + n] = x[n]
    ỹ[mN + n] = y[n]    for n = 0, 1, 2, . . . , N − 1 and m = 0, ±1, ±2, . . .

Assuming that N is a power of 2, this mapping allows the correlation to be performed using FFT-based fast correlation techniques. Specifically, the correlation of x[n] and y[n] can be obtained as the inverse FFT of the product X∗[m]Y[m], where X[m] and

Y[m] are respectively the FFTs of x[n] and y[n]. The result of the IFFT is searched to find the sample with the greatest magnitude, and the index of this sample is multiplied by the sampling interval to obtain the delay interval. In other words, if sample L in the IFFT result has the largest magnitude, then the delay is τ = LTS, where TS is the sampling interval.

Because of the assumed periodicities implicit in the DFT, it is impossible to distinguish between the case of y[n] delayed by L samples with respect to x[n] and the case of x[n] delayed by N − L samples with respect to y[n]. There are three different approaches for coping with this ambiguity. If it is known that y[n] is always delayed with respect to x[n], then all delays can be assumed to be positive and computed as τ = LTS. If it is not known whether y[n] is delayed with respect to x[n] or x[n] is delayed with respect to y[n], but the magnitude of the delay is restricted to be less than NTS/2, then the delay can be computed as

    τ = LTS            for 0 ≤ L ≤ N/2
    τ = (L − N)TS      for N/2 < L < N

A negative value of τ indicates that x[n] is delayed with respect to y[n]. The third, and most robust, approach involves padding both x[n] and y[n] with N zero-valued samples prior to performing the FFTs. The result of the IFFT will then have a length of 2N, and the delay can be computed as

    τ = LTS            for 0 ≤ L ≤ N
    τ = (L − 2N)TS     for N < L < 2N

This approach accommodates delays from (1 − N)TS to (N − 1)TS without ambiguity. However, as the magnitude of the delay approaches N samples, the amount of overlap between x[n] and y[n] decreases, making the results of the finite correlation very unreliable. The estimation of delays larger than approximately 0.8NTS is usually performed as a two-step process. Using knowledge about the source and nature of the delay, an analyst can make a rough estimate of the delay between signals x[n] and y[n]. In the simulation, an extra delay element is used to explicitly create xD[n] as a delayed version of x[n]. The amount of delay between x[n] and xD[n] is an integer number of sampling intervals and is chosen to be large enough that the delay between xD[n] and y[n] is "guaranteed" to be less than 0.8N sample intervals. This residual delay can be estimated using fast correlation and added to the fixed delay between x[n] and xD[n] to obtain the total delay between x[n] and y[n].

11.2.1 Software Implementation

The CoarseDelayEstimator model, summarized in Table 11.4, uses fast correlation to estimate the delay between an input waveform in_sig and a reference waveform ref_sig. The model assumes that in_sig is similar to a delayed version of the reference waveform. If the two waveforms are unrelated, the CoarseDelayEstimator model still returns a delay estimate that corresponds to the peak of the correlation. The model does not perform any thresholding that would be needed to determine whether the peak represents a valid alignment between similar waveforms or simply the largest correlation between two random uncorrelated waveforms.

Table 11.4  Summary of model CoarseDelayEstimator.

Constructor:
CoarseDelayEstimator::CoarseDelayEstimator(
        char* instance_name,
        PracSimModel* outer_model,
        Signal* in_sig,
        Signal* ref_sig,
        Control* estim_valid_ctl,
        Control* delay_est_ctl,
        Control* samps_delay_est_ctl);

Parameters:
int Num_Corr_Passes
bool Limited_Search_Window_Enab
int Search_Window_Beg
int Search_Window_End
bool Invert_Input_Sig_Enab

Notes:
1. This model is a signal sink and has no output signals.
2. The nominal block length for the input signals is set by the PracSim infrastructure. The correlation length is 2^(k+1), where 2^k is the smallest power of two that equals or exceeds the nominal block length.
3. Source code is contained in file coarse_delay_est.cpp.


A block diagram of the CoarseDelayEstimator model is shown in Figure 11.11. The input signals in_sig and ref_sig must have nominal block lengths that are equal. However, because these signals generally arrive via two very different paths, they may have different values for valid block length in any given pass. The model must employ special measures to ensure that different block lengths do not cause an apparent block-to-block variation in the relative delay between in_sig and ref_sig. An instance of the SignalReblocker class is created for each input to ensure that an equal number of samples from each input signal is used each time a correlation is performed. If there is an insufficient number of available samples for either waveform, the model exits without generating a new delay estimate. The SignalReblocker objects accumulate unused samples for use in subsequent passes.

The parameter Limited_Search_Window_Enab, when set true, causes the search for the correlation peak to be confined to delays greater than Search_Window_Beg and less than Search_Window_End. If Limited_Search_Window_Enab is set false, the search window parameters need not be specified, and the search is conducted over all delays from (1 − N)TS to (N − 1)TS. If the parameter Invert_Input_Sig_Enab is set true, each sample of in_sig is multiplied by −1 before the correlation is performed. Correlation is performed for the number of passes indicated by Num_Corr_Passes. Once this limit is reached, the model operates in a bypass mode for the remainder of the simulation. The delay estimate is conveyed to other models via three controls, which are outputs from CoarseDelayEstimator.
Once a correlation has been performed, (1) the value of the samps_delay_est_ctl control is set to the estimated delay in samples, (2) the value of the delay_est_ctl control is set to the estimated delay in normalized time units, and (3) the value of estim_valid_ctl is set to true, indicating that the delay estimates are valid for use by downstream models. On subsequent passes in which a correlation is performed, the values of delay_est_ctl and samps_delay_est_ctl are updated to reflect the results of the new correlation. Figure 11.12 shows a block diagram indicating how these controls might be used in a simulation. A simulation similar to this block diagram is provided in the file coarsedelayest_sim.cpp.

11.3 Phase-Slope Delay Estimation

The delay estimation approach described in Section 11.2 can estimate a delay to the nearest integer multiple of the sampling interval. For certain applications, finer estimates may be needed. Such estimates can be obtained using the differential phase slope approach, which is based on the time-delay property of the Fourier transform.


Figure 11.11 Block diagram for CoarseDelayEstimator model. (Diagram: in_sig and ref_sig each pass through a signal reblocker; in_sig is optionally negated under control of Invert_Input_Sig_Enab; both signals are FFT'd and one spectrum is conjugated; the spectra are multiplied sample by sample and inverse FFT'd; the sample-by-sample magnitude is then searched for a peak, subject to the model parameters Ltd_Search_Win_Enab, Search_Win_Beg, Search_Win_End, and Samp_Intvl, to produce the outputs est_is_valid_ctl, delay_est_ctl, and samps_delay_est_ctl.)


Figure 11.12 Block diagram for a simulation that uses the CoarseDelayEstimator model. (Diagram: BitGener → BasebandWaveform → ButterworthFilterByIir → DiscreteDelay; the delayed waveform drives the in_sig input of CoarseDelayEstimator, with the undelayed waveform as ref_sig; the est_is_valid_ctl and samps_delay_est_ctl outputs drive the delay_change_enab_ctl and dynam_delay_ctl inputs of a second DiscreteDelay, whose out_sig is then time aligned with the first delayed signal.)


Consider a signal x[n] having X[m] as its discrete Fourier transform. If x[n] is delayed by τ, the Fourier transform of the delayed signal is equal to exp(−j2πτf) times the transform of the original signal:

    x[n]  ⇐⇒  X[m]
    xD[n] ⇐⇒  X[m] exp(−j2πτf)

If the transform of x[n] is multiplied by the conjugate of the transform of xD[n], the result will have a magnitude of |X[m]|² and a phase of θ[m] = 2πτmF. Theoretically, the delay τ can be determined by estimating the phase θ[m] at any discrete frequency m and computing

    τ = θ[m] / (2πmF)

However, in practice, an estimate based on a single frequency is subject to various sources of error. A robust estimate based on multiple frequencies can be obtained by observing that the phase function θ[m] is a linear function of the frequency index m. The value of θ[m] can be estimated at a number of frequencies, and then a line can be fitted to the estimated points using linear regression, a standard statistical analysis technique described in numerous texts. The estimated delay is then given by the slope of the fitted line.

Figure 11.13 shows plots of |X[m]|² and θ[m] in degrees for the case of an NRZ waveform (see Chapter 3) for a random sequence of bits. The sampling interval is T = 0.125, and there are eight samples per bit. The nominal block length is 2048 samples, padded with zeros to make a correlation record of length 4096. The delay is τ = 1.46. There are two important phenomena illustrated in this figure. First, the phase spectrum exhibits "wrapping" such that the phase values remain in the range −180 to +180 degrees. Second, the phase spectrum appears to become "noisier," exhibiting larger deviations from a straight line at those frequencies for which the power spectrum becomes very small. The linear regression should be confined to a range of frequencies for which wrapping does not occur and the deviations from linear remain small. The explicit formula for computing the delay estimate is

# 2 NT m ¯ θ m − θ¯ m=m1 (m − m) τ= # 2 π m ¯ 2 m=m1 (m − m) τ=

where θ is in radians and the overbar denotes the time average from m = m1 to m2 . For large delays, the slope of the phase spectrum becomes steep and wrapping occurs within a relatively small number of samples. Therefore, the two-step process described in Section 11.2 is extended to a three-step process when high resolution estimates of large delays are desired:


1. Using knowledge about the source and nature of the delay, an analyst makes a rough estimate of the delay between signals x[n] and y[n]. In the simulation, an extra delay element is used to explicitly create xD[n] as a delayed version of x[n]. The amount of delay between x[n] and xD[n] is an integer number of sampling intervals and is chosen to be large enough that the delay between xD[n] and y[n] is "guaranteed" to be less than N sample intervals.

2. The residual delay is estimated using fast correlation and added to the fixed delay between x[n] and xD[n].

3. The delay between the adjusted xD[n] and y[n] is estimated using the differential phase slope approach.

The FineDelayEstimator model, summarized in Table 11.5, uses the differential phase-slope technique to estimate the delay between an input waveform in_sig and a reference waveform ref_sig. The model assumes that in_sig is similar to a delayed version of the reference waveform. A block diagram of the FineDelayEstimator model is shown in Figure 11.14. The input signals in_sig and ref_sig are reblocked using the SignalReblocker class, as discussed for the CoarseDelayEstimator model in Section 11.2. The parameters Regression_Index_Begin and Regression_Index_End specify the range of frequencies over which regression of the phase is to be performed. (A deluxe version of FineDelayEstimator might be designed to use knowledge about the spectra of various types of waveform to automatically set the range of the frequency index over which the regression is performed.) If the parameter Invert_Input_Sig_Enab is set true, each sample of in_sig is multiplied by −1 before the phase is estimated. Estimation is performed for the number of passes indicated by Num_Corr_Passes. Once this limit is reached, the model operates in a bypass mode for the remainder of the simulation.
Because the FineDelayEstimator model is often used to refine a delay estimate produced by the CoarseDelayEstimator model, an input control, estim_enab_cntl, is provided so that operation of FineDelayEstimator can be suppressed until after other models have applied a coarse delay adjustment to the ref_sig input signal. The fine estimate of delay is conveyed to other models via two controls, which are outputs from FineDelayEstimator. Once an estimate has been computed, the value of estimated_delay_ctl is set to the estimated delay in normalized time units, and the value of estim_valid_ctl is set to true, indicating that the delay estimate is valid for use by downstream models. On subsequent passes in which a phase-slope estimation is performed, the value of estimated_delay_ctl is updated to reflect the results of the new estimate.


Figure 11.13 Spectrum for fine delay estimation for simulated ideal NRZ waveform: (a) power spectrum (magnitude in dB, −40 to 20 dB, versus sample index 0 to 2048), (b) differential phase spectrum (phase in degrees, −180 to 180, versus sample index 0 to 2048).

Figure 11.15 shows a block diagram indicating how these controls might be used in a simulation. A simulation similar to this block diagram is provided in the file finedelayest_sim.cpp.

11.4 Changing Clock Rates

It is sometimes necessary to model the effects of clock-rate discrepancies between a transmitter and a receiver. For example, in a data transmitter, baseband I and Q inputs

Table 11.5  Summary of model FineDelayEstimator.

Constructor:
FineDelayEstimator::FineDelayEstimator(
        char* instance_name,
        PracSimModel* outer_model,
        Signal* in_sig,
        Signal* ref_sig,
        Signal* out_sig);

Parameters:
int Num_Corr_Passes
bool Limited_Search_Window_Enab
int Search_Window_Beg
int Search_Window_End
bool Invert_Input_Sig_Enab

Notes:
1. Source code is contained in file fine_delay_est.cpp.

for a quadrature modulator are often generated by passing a symbol-rate sequence of complex samples through a pair of pulse-shaping filters. Ideally, a similar pair of baseband waveforms will be available at the output of a quadrature demodulator in the receiver. These waveforms are then sampled at the symbol rate to obtain the sequence of symbol values. The symbol-rate clock signals in the transmitter and receiver are nominally equal, but in practical systems, the two clock rates usually differ by some small amount, and it is often necessary to model the effects of this difference.

A constant offset between clock rates is technically a frequency shift, but the techniques for modeling a small offset are more closely related to the techniques of time-shifting than they are to the techniques of frequency-shifting. The interpolation techniques of Section 11.1.3.1 can easily be adapted for making small rate changes. In a fixed-delay model, a single set of interpolation coefficients is computed based on the relative time offset between the set of available samples and the set of desired samples. In a rate-change model, the relative offset between available samples and desired samples is continuously changing, so in general, a new set of interpolation coefficients must be computed for each output sample to be produced. In some cases, it would be possible to have a number of different sets of interpolation coefficients and periodically cycle through these sets. Specifically, consider the rate change factor FR, defined as

    FR = Tin / Tout = Rout / Rin

Figure 11.14 Block diagram for FineDelayEstimator model. (Diagram: in_sig and ref_sig each pass through a signal reblocker; in_sig is optionally negated under control of Invert_Input_Sig_Enab; both signals are FFT'd and one spectrum is conjugated; the spectra are multiplied sample by sample, the sample-by-sample phase is taken, and a linear regression over the range set by the model parameters Regression_Start and Regression_Stop yields a slope that is scaled by NT/(2π) to produce delay_est_ctl along with est_is_valid_ctl.)


If Tin and Tout are rational numbers, then FR is a rational number. If FR is expressed as a ratio

    FR = NF / DF

where NF and DF are integers whose greatest common factor is 1, then a complete cycle of interpolation coefficients will occur every NF output samples. Generating each cycle of NF output samples will consume exactly DF input samples. The RateChanger model, summarized in Table 11.6, takes the most general approach of computing a new set of interpolation coefficients for each output sample.

Table 11.6  Summary of model RateChanger.

Constructor:
RateChanger::RateChanger(
        char* instance_name,
        PracSimModel* outer_model,
        Signal* in_sig,
        Signal* out_sig);

Parameters:
int Num_Sidelobes
double Rate_Change_Factor

Notes:
1. Source code is contained in file rate_changer_T.cpp.
2. This template model is instantiated for types int, float, and complex.


Figure 11.15 Block diagram for a simulation that uses the FineDelayEstimator model. (Diagram: BitGener → BasebandWaveform → ButterworthFilterByIir → ContinuousDelay; a CoarseDelayEstimator compares the delayed waveform (in_sig) against the undelayed reference (ref_sig) and drives a DiscreteDelay through est_is_valid_ctl and samps_delay_est_ctl (delay_change_enab_ctl and dynam_delay_ctl); a FineDelayEstimator then compares the coarsely aligned signals and drives a ContinuousDelay through est_is_valid_ctl and delay_est_ctl so that the two final signals are time aligned.)

Appendix 11A

EXAMPLE SOURCE CODE

The companion Web site includes nine Microsoft Visual Studio projects, each comprising a simulation that demonstrates and provides a test vehicle for a different signal shifter model, as listed in Table 11A.1.

Table 11A.1  Projects in Shifters directory.

project             featured model
DiscDelay           DiscreteDelay
DiscAdv             DiscreteAdvance
ContinDelay         ContinuousDelay
ContinAdv           ContinuousAdvance
RealCorrTest        RealCorrelator
RateChange          RateChanger
DftDelay            DftDelay
CoarseDelayEst      CoarseDelayEstimator
FineDelayEst        FineDelayEstimator

11A.1 DiscreteDelay

As discussed in Section 11.1.1.2, DiscreteDelay has three different constructors, which are provided in Listings 11A.1 through 11A.3. Tasks common to all the constructor forms are performed by the Constructor_Common_Tasks method, which is provided in Listing 11A.4. The Execute method is provided in Listing 11A.5.


Listing 11A.1 Constructor for DiscreteDelay model that supports gated, dynamic operation.

template< class T >
DiscreteDelay< T >::DiscreteDelay(
                    char* instance_name,
                    PracSimModel* outer_model,
                    Signal* in_sig,
                    Signal* out_sig,
                    Control *dynam_dly_ctl,
                    Control *delay_chg_enab_ctl )
          :PracSimModel( instance_name, outer_model)
{
  MODEL_NAME(DiscreteDelay_T);
  this->Constructor_Common_Tasks( instance_name,
                                  in_sig,
                                  out_sig);
  //---------------------------
  // Controls
  Dynam_Dly_Ctl = dynam_dly_ctl;
  Delay_Chg_Enab_Ctl = delay_chg_enab_ctl;
  return;
}


Listing 11A.2 Constructor for DiscreteDelay model that supports ungated, dynamic operation.

template< class T >
DiscreteDelay< T >::DiscreteDelay(
                    char* instance_name,
                    PracSimModel* outer_model,
                    Signal* in_signal,
                    Signal* out_signal,
                    Control *dynam_dly_ctl )
          :PracSimModel(instance_name, outer_model)
{
  this->Constructor_Common_Tasks( instance_name,
                                  in_signal,
                                  out_signal);
  //------------------------------------------
  // Controls
  Dynam_Dly_Ctl = dynam_dly_ctl;

  switch (Delay_Mode){
  case DELAY_MODE_NONE:
  case DELAY_MODE_FIXED:
  case DELAY_MODE_DYNAMIC:
    break;
  case DELAY_MODE_GATED:
    ostrstream temp_stream;
    temp_stream

this->Constructor_Common_Tasks( instance_name,
                                in_signal,
                                out_signal);
char *message;
ostrstream temp_stream;
switch (Delay_Mode){
case DELAY_MODE_NONE:
case DELAY_MODE_FIXED:
  break;
case DELAY_MODE_DYNAMIC:
  temp_stream

GetBlockSize();
Samp_Intvl = fsig_Input->GetSampIntvl();
Filter_Core->Initialize(block_size, Samp_Intvl);
Osc_Output_Prev_Val = 0.0;
OscOutput = 0;
Phi_Sub_2 = 0;
Prev_Input_Positive = true;
Prev_Input_Val = 0;
Prev_Filt_Val = 0;
Prev_Time_Zc = 0;
Prev_State = 0;
Prev_Cap_Val = 0;
Time_Of_Samp = 0.0;
Prev_Osc_Phase = 0.0;
}

Listing 12A.4  Execute method for DigitalPLL model.

int DigitalPLL::Execute()
{
  // pointers for signal data
  float *fsOutput_ptr;
  float *fs_filtered_error_ptr;
  float *fsOscOutput_ptr;
  float *fsOscFreq_ptr;
  float *fsOscPhase_ptr;
  float *fsInput_ptr;

  float input_val;
  float prev_input_val;
  float osc_output_val;
  float filt_val;
  float inst_freq;

  double samp_intvl;
  double err_sum=0;
  double err_avg;
  int block_size, is;
  double time_of_samp;
  double time_zc;
  double delta_T;
  double osc_phase;
  double cap_val;
  double prev_cap_val;
  double prev_osc_phase;
  double prev_filt_val;
  double prev_time_zc;
  double output_phase;
  double tau_n;
  int prev_state;
  int new_state;
  double time_dwell_plus;
  double time_dwell_minus;

  // set up pointers to data buffers for input and
  // output signals
  fsOutput_ptr = GET_OUTPUT_PTR( fsig_Output );
  fs_filtered_error_ptr = GET_OUTPUT_PTR( fsig_Filtered_Error );


  fsOscOutput_ptr = GET_OUTPUT_PTR( fsig_Osc_Output );
  fsOscFreq_ptr = GET_OUTPUT_PTR( fsig_Osc_Freq );
  fsOscPhase_ptr = GET_OUTPUT_PTR( fsig_Osc_Phase );
  fsInput_ptr = GET_INPUT_PTR( fsig_Input );

  samp_intvl = Samp_Intvl;
  osc_output_val = Osc_Output_Prev_Val;
  prev_input_val = Prev_Input_Val;
  prev_filt_val = Prev_Filt_Val;
  prev_time_zc = Prev_Time_Zc;
  prev_state = Prev_State;
  prev_cap_val = Prev_Cap_Val;
  prev_osc_phase = Prev_Osc_Phase;
  time_of_samp = Time_Of_Samp;

  block_size = fsig_Input->GetValidBlockSize();
  fsig_Output->SetValidBlockSize(block_size);
  fsig_Filtered_Error->SetValidBlockSize(block_size);
  fsig_Osc_Output->SetValidBlockSize(block_size);
  fsig_Osc_Freq->SetValidBlockSize(block_size);
  fsig_Osc_Phase->SetValidBlockSize(block_size);

  for (is=0; is<block_size; is++){
    time_of_samp += samp_intvl;
    input_val = *fsInput_ptr;
    if( (input_val >= 0) != Prev_Input_Positive ){
      // zero crossing has occurred
      time_zc = time_of_samp - samp_intvl * input_val
                /(input_val - prev_input_val);
      // compute elapsed interval
      delta_T = time_zc - prev_time_zc;
      // update oscillator phase
      inst_freq = Omega_Sub_0 + K_Sub_0 * prev_filt_val;
      osc_phase = prev_osc_phase + inst_freq * delta_T;


      // based on osc_phase and prev_osc_phase,
      // determine if the oscillator waveform has
      // had a positive-going zero crossing between
      // times prev_time_zc and time_zc.
      //
      // Normalize osc_phase and prev_osc_phase
      if(osc_phase > TWO_PI){
        osc_phase -= TWO_PI;
        prev_osc_phase -= TWO_PI;
      }
      if(osc_phase >= 0.0 && prev_osc_phase < 0.0){
        // a positive-going zero crossing has
        // occurred, so compute the crossing time
        tau_n = -delta_T * prev_osc_phase/
                ( osc_phase - prev_osc_phase);
      }
      else{
        tau_n = 0.0;
      }
      //-----------------------------------------
      // do state machine for phase detector
      switch (prev_state){
      case 1:
        if(tau_n != 0.0){
          new_state = 0;
          time_dwell_plus = tau_n;
          time_dwell_minus = 0.0;
        }
        else{
          new_state = 0;
          time_dwell_plus = delta_T;
          time_dwell_minus = 0.0;
        }
        break;
      case -1:
        if(!Prev_Input_Positive){
          //step 3 Algorithm 12.3
          new_state = -1;
          time_dwell_plus = 0.0;
          time_dwell_minus = delta_T;
        }


        else{
          if(tau_n != 0.0){
            //step 4a Algorithm 12.3
            new_state = -1;
            time_dwell_plus = 0.0;
            time_dwell_minus = delta_T - tau_n;
          }
          else{
            //step 4b Algorithm 12.3
            new_state = 0;
            time_dwell_plus = 0.0;
            time_dwell_minus = 0.0;
          }
        }
        break;
      case 0:
        if(Prev_Input_Positive){
          if(tau_n != 0.0 ){
            // step 5a Algorithm 12.3
            new_state = 0;
            time_dwell_plus = tau_n;
            time_dwell_minus = 0.0;
          }
          else{
            // step 5b Algorithm 12.3
            new_state = 1;
            time_dwell_plus = delta_T;
            time_dwell_minus = 0.0;
          }
        }
        else{
          if(tau_n != 0.0 ){
            // step 6a Algorithm 12.3
            new_state = -1;
            time_dwell_plus = 0.0;
            time_dwell_minus = delta_T - tau_n;
          }
          else{
            // step 6b Algorithm 12.3
            new_state = 0;
            time_dwell_plus = 0.0;
            time_dwell_minus = 0.0;
          }
        }
      }


      // Perform Filtering
      if(time_dwell_plus == 0 && time_dwell_minus == 0){
        // step 6 Algorithm 12.5
        cap_val = prev_cap_val;
        filt_val = cap_val;
      }
      else{
        if(time_dwell_minus > 0){
          // step 7 Algorithm 12.5
          cap_val = prev_cap_val*(1.0 -
                    time_dwell_minus/(Tau_1 + Tau_2));
          filt_val = cap_val * (1.0 - (Tau_2/
                     (Tau_1+Tau_2))*
                     (time_dwell_minus/delta_T));
        }
        else{
          // step 8 Algorithm 12.5
          cap_val = prev_cap_val +
                    (time_dwell_plus/(Tau_1 + Tau_2))*
                    (Supply_Volts - prev_cap_val);
          filt_val = cap_val +
                     (time_dwell_plus/delta_T)*
                     (Tau_2/(Tau_1+Tau_2))*
                     (Supply_Volts - cap_val);
        }
      }
      //--------------------------------------------
      // update delayed variables
      prev_state = new_state;
      prev_time_zc = time_zc;
      prev_osc_phase = osc_phase;
      prev_cap_val = cap_val;
      prev_filt_val = filt_val;
      Prev_Input_Positive = (input_val >= 0);
    }
    delta_T = time_of_samp - prev_time_zc;
    inst_freq = Omega_Sub_0 + K_Sub_0 * prev_filt_val;
    output_phase = prev_osc_phase + inst_freq * delta_T;
    *fsOutput_ptr++ = sin(output_phase);


    *fs_filtered_error_ptr++ = prev_filt_val;
    *fsOscPhase_ptr++ = output_phase;
    *fsOscFreq_ptr++ = inst_freq/TWO_PI;
    fsInput_ptr++;
  }
  Prev_Input_Val = prev_input_val;
  Prev_Filt_Val = prev_filt_val;
  Prev_Time_Zc = prev_time_zc;
  Prev_State = prev_state;
  Prev_Cap_Val = prev_cap_val;
  Time_Of_Samp = time_of_samp;
  Prev_Osc_Phase = prev_osc_phase;

  err_avg = err_sum / block_size;
  BasicResults

14.1.2 Interpolation by Integer Factors

Figure 14.1 Block diagram of decimation, which consists of antialias filtering followed by downsampling. (Diagram: x[n] → LPF → ↓M → y[n].)

Table 14.1  Summary of template model Downsampler.

Constructor:
Downsampler::Downsampler(
        char* instance_name,
        PracSimModel* outer_model,
        Signal* in_signal,
        Signal* out_signal);

Parameter:
int Decim_Rate

Note:
1. Source code is contained in file downsampler_t.cpp.

The basic idea behind interpolation is to produce new sample values between the existing samples. Consider the sampled signal and its continuous-frequency spectrum shown in Figure 14.2. With a sampling rate of FS, the simulation bandwidth

extends from −FS/2 to FS/2, thus rejecting all but the baseband image, as shown in Figure 14.2(c). Suppose we wish to triple the sampling rate. Upsampling is accomplished by inserting two zero-valued samples between each pair of original samples to yield the sequence shown in Figure 14.3. With a new sampling rate of 3FS, the system bandwidth now extends from −3FS/2 to 3FS/2. Now three images of the original spectrum fit within the system bandwidth, as shown in Figure 14.4. Lowpass filtering is performed to limit the signal to a bandwidth equal to half the original sampling rate. This filtering removes the two extra spectral images from the system bandwidth to yield the signal and spectrum depicted in Figure 14.5. For this reason, the filter in an interpolator is sometimes called an anti-imaging filter.

Figure 14.2 Waveforms for discussion of interpolation: (a) sampled signal, (b) its spectrum, and (c) relationship between spectral images and simulation bandwidth.

Sometimes in the DSP literature, the introduction of the zero-valued samples, as in Figure 14.3, is described as compressing the signal's spectrum by a factor of M. Upsampling does not really compress the spectrum, but this description arises from the common practice of using the sampling rate to normalize the frequencies in a DSP system. If the frequencies in Figure 14.2(b) are normalized by FS, the bandwidth of the baseband image is confined to the normalized frequency range of ±1/2. After interpolation, the sampling rate is 3FS, and if the frequencies are normalized accordingly, the baseband image is confined to the normalized frequency range of ±1/6. Thus, due to the change in normalization, the spectrum appears to have been compressed by a factor of three.

To allow for maximum flexibility in the choice of anti-imaging filters, PracSim does not include an integrated interpolation model. Instead, interpolation is accomplished by using an upsampler followed by a separate filter model. The Upsampler template model is summarized in Table 14.2.

14.1.3 Decimation and Interpolation by Noninteger Factors

The sampling rate of a signal can be changed by a rational factor L/M by first interpolating by a factor of L and then decimating by a factor of M. If L > M, the net effect is interpolation by a factor of L/M. If M > L, the net effect is decimation by a factor of M/L. Interpolation by a noninteger factor can be used to convert a compact disc (CD) signal into a digital audio tape (DAT) signal. The sampling rate for CD recordings is 44.1 kHz, and the sampling rate for DAT recordings is 48 kHz. Interpolation by a factor of 160 can be used to convert the CD sample rate from 44.1 kHz to 7056 kHz. Decimation by a factor of 147 can then be used to convert the 7056 kHz sample rate down to the DAT rate of 48 kHz.

Figure 14.3 Sampled signal after zero-valued samples have been inserted.

Figure 14.4 Relationship between simulation bandwidth and the spectrum of the signal after zero-valued samples have been inserted.
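The upsample-filter-downsample cascade described above can be sketched in a few lines of stand-alone C++. The function names here (`upsample`, `fir`, `downsample`, `resample_rational`) are illustrative and are not PracSim models; in PracSim the same job is done by the Upsampler and Downsampler models with a separate filter model between them.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Insert L-1 zero-valued samples between samples (upsampling by L).
std::vector<double> upsample(const std::vector<double>& x, std::size_t L) {
    std::vector<double> y(x.size() * L, 0.0);
    for (std::size_t n = 0; n < x.size(); ++n) y[n * L] = x[n];
    return y;
}

// Direct-convolution FIR filtering with zero initial conditions.
std::vector<double> fir(const std::vector<double>& x, const std::vector<double>& h) {
    std::vector<double> y(x.size(), 0.0);
    for (std::size_t n = 0; n < x.size(); ++n)
        for (std::size_t k = 0; k < h.size() && k <= n; ++k)
            y[n] += h[k] * x[n - k];
    return y;
}

// Keep every M-th sample (downsampling by M).
std::vector<double> downsample(const std::vector<double>& x, std::size_t M) {
    std::vector<double> y;
    for (std::size_t n = 0; n < x.size(); n += M) y.push_back(x[n]);
    return y;
}

// Rate change by the rational factor L/M: upsample by L, filter with a
// combined anti-imaging/anti-aliasing filter h, then downsample by M.
std::vector<double> resample_rational(const std::vector<double>& x, std::size_t L,
                                      std::size_t M, const std::vector<double>& h) {
    return downsample(fir(upsample(x, L), h), M);
}
```

For the CD-to-DAT conversion above, L = 160 and M = 147, with h designed for the intermediate 7056 kHz rate.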

14.2 Filter Design for Interpolators and Decimators

In DSP applications in which both sampling rates and filter lengths need to be kept as low as possible, the design of filters for decimation and interpolation can sometimes be quite a challenge. However, in simulations, which typically sample signals at rates many times higher than the Nyquist rate, and which tolerate whatever filter lengths are needed for high-fidelity modeling, the design of the necessary filters can be relatively easy. In DSP applications, the filter design difficulties are often eased by using a multistage approach [2], but in a simulation context, single-stage designs are almost always adequate.

Figure 14.5 (a) Sampled signal from Figure 14.3, and (b) its spectrum after lowpass filtering.

Table 14.2 Summary of template model Upsampler.

Constructor:
    Upsampler::Upsampler( char* instance_name,
                          PracSimModel* outer_model,
                          Signal* in_signal,
                          Signal* out_signal);
Parameter: int Interp_Rate
Note: 1. Source code is contained in file upsampler_t.cpp.


The Remez exchange algorithm provides a convenient way to design FIR filters that can be used for interpolation and decimation. In most digital filter design programs, such as those provided in [2] or in various commercial packages, passband and stopband edge frequencies are normalized by the sample rate. For example, given a sampling rate of 10,000 samples per second, a passband edge at 3 kHz would be specified using a normalized value given by

    f_p = (3 × 10^3) / 10^4 = 0.3

The other design parameter used by the Remez exchange is the ripple ratio, which is the ratio of maximum passband ripple to minimum stopband attenuation. For a maximum passband ripple of δ1 = 0.025 and a minimum stopband attenuation of L_S = 60 dB, the ripple ratio K can be obtained as

    K = δ1 / 10^(−L_S/20) = 0.025 / 10^(−60/20) = 25.0

14.2.1 Interpolation

Figure 14.4 shows the continuous-frequency spectrum for a signal that has been upsampled by a factor of three. Three images of the original spectrum fit within the new system bandwidth indicated by the unshaded area. The relative spacing of the images depicts a typical DSP situation in which the original sampling rate is only slightly larger than twice the original signal’s one-sided bandwidth. The interpolation filter must remove all but the center image; therefore, the filter’s passband must be wide enough to accommodate this image. The filter’s transition band must be narrow so that the adjacent spectral image falls completely within the stopband.

Example 14.1 Consider a signal consisting of four sinusoids of equal magnitudes. The frequencies of the sinusoids are 3.0, 1.5, 0.75, and 0.375 Hz. The sampling rate is 8 samples per second. A segment of such a signal generated by the MultipleToneGener model is shown in Figure 14.6. The spectrum of this signal is shown in Figure 14.7. The spectrum shown was generated by the SpectrumAnalyzer model using the parameters listed in Table 14.3. After the signal is upsampled by a factor of four, the spectrum develops images as shown in Figure 14.8. The interpolation filter needs to have a flat passband response and good attenuation in the stopband. The filter must pass the signal components at ±3 Hz while rejecting all components at ±5 Hz and beyond. A filter having a passband edge frequency of 3.5 Hz and a stopband edge frequency of 4.5 Hz should satisfy these requirements. Normalized to the sampling rate, these two frequencies are specified as f_P = 3.5/32 = 0.109375 and f_S = 4.5/32 = 0.140625. Figures 14.9 and 14.10 show the response of a 71-tap filter having these critical frequencies and a ripple ratio of 20. When the upsampled signal is passed through this filter, the result will have the spectrum shown in Figure 14.11. The baseband image is passed with low distortion, and the undesired images are attenuated by more than 55 dB.

The ultimate test of goodness for an interpolator would be some measure of the deviation between the interpolated waveform and what the original waveform would have been if it had been originally generated at the higher sample rate. While not possible in a real system, this kind of test can easily be performed in a simulated system. Simply generate a test waveform at the sample rate desired for the interpolated signal and downsample to the rate that will be used for the interpolator input in the production runs of the simulation. The output of the interpolator can then be compared to the original high-sample-rate waveform. The filtering operation will introduce a delay in the interpolated signal, and the original reference signal must be delayed by the same amount before a sample-by-sample comparison can be performed to gauge the fidelity of the interpolated signal. When the simulation depicted in Figure 14.12 is run using the interpolation filter of Figure 14.9, the measured signal-to-distortion ratio is approximately 33.3 dB. This simulation generates a reference signal directly at the interpolator’s output sample rate. The reference signal is then downsampled to obtain the test signal that will be interpolated. The signal-to-distortion ratio is measured by comparing the interpolated signal to the originally generated reference signal.
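The edge-normalization and ripple-ratio arithmetic used in this example can be checked with a few lines of C++. The helper names are illustrative, not part of any filter design package.

```cpp
#include <cassert>
#include <cmath>

// Normalize an edge frequency (Hz) by the sampling rate (Hz).
double normalized_edge(double f_hz, double fs_hz) {
    return f_hz / fs_hz;
}

// Ripple ratio: maximum passband ripple delta1 divided by the minimum
// stopband amplitude 10^(-Ls/20), with Ls the stopband attenuation in dB.
double ripple_ratio(double delta1, double Ls_dB) {
    return delta1 / std::pow(10.0, -Ls_dB / 20.0);
}
```

With the numbers from the text, `normalized_edge(3000.0, 10000.0)` gives 0.3, `ripple_ratio(0.025, 60.0)` gives 25.0, and `normalized_edge(3.5, 32.0)` gives 0.109375.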


Figure 14.6 Segment of the four-sinusoid signal used in Example 14.1.

Figure 14.7 Spectrum of four-sinusoid signal for Example 14.1.

Table 14.3 Spectrum analyzer parameters for Example 14.1.

Kind_Of_Spec_Estim = SPECT_CALC_BARTLETT_PDGM
Num_Segs_To_Avg = 600
Seg_Len = 4000
Fft_Len = 4096
Hold_Off = 0
Norm_Factor = 1.0
Freq_Norm_Factor = 1.0
Output_In_Decibels = true
Plot_Two_Sided = true

Figure 14.8 Spectrum of four-sinusoid signal after upsampling by a factor of four.


Figure 14.9 Magnitude response of 71-tap interpolation filter designed for a passband edge of 3.5 Hz and a stopband edge of 4.5 Hz.

Figure 14.10 Magnified passband detail for filter response shown in Figure 14.9.


Figure 14.11 Spectrum of interpolated signal after filtering.

Figure 14.12 Block diagram for assessing the signal-to-distortion ratio due to interpolation. (Signal chain: MultipleSineGener produces ref_sig; a Downsampler produces test_sig; an Upsampler produces upsamp_test_sig; an AnlgDirectFormFir produces filt_sig; ref_sig, routed through a SignalAnchor and a DiscreteDelay, produces delayed_ref_sig; filt_sig and delayed_ref_sig feed a MeanSquareError model.)
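The comparison performed at the end of this measurement can be sketched outside PracSim as a single function. The name `sdr_dB` is illustrative; `delay` is the group delay of the interpolation chain in output samples ((N − 1)/2 for an N-tap linear-phase FIR).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Signal-to-distortion ratio in dB. 'sig' is assumed to be 'ref' passed
// through a processing chain with 'delay' samples of group delay, so
// sig[n] is compared against ref[n - delay].
double sdr_dB(const std::vector<double>& ref, const std::vector<double>& sig,
              std::size_t delay) {
    double p_sig = 0.0, p_err = 0.0;
    for (std::size_t n = delay; n < sig.size(); ++n) {
        if (n - delay >= ref.size()) break;
        double r = ref[n - delay];       // delayed reference sample
        double e = sig[n] - r;           // distortion sample
        p_sig += r * r;
        p_err += e * e;
    }
    return 10.0 * std::log10(p_sig / p_err);
}
```

In a full measurement, `ref` would be the high-rate reference signal and `sig` the interpolator's filtered output, exactly as in Figure 14.12.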


In Example 14.1, the original sampling rate was only slightly larger than twice the bandwidth of the test signal. In a simulation context, the sampling rate is usually much larger than twice the signal bandwidth, making it easier to design a high-performance interpolation filter. Example 14.2 Consider the signal used in Example 14.1 but with a sample rate of 16 rather than 8 samples per second. After the signal is upsampled by a factor of four, the spectrum develops images as shown in Figure 14.13. The interpolation filter must pass the signal components at ±3 Hz while rejecting all components at ±11 Hz and beyond. Figures 14.14 and 14.15 show the response of a 71-tap filter designed for a passband edge frequency of 4 Hz, a stopband edge frequency of 8 Hz, and a ripple ratio of 20. When the upsampled signal is passed through this filter, the result will have the spectrum shown in Figure 14.16. The baseband image is passed with virtually no distortion, and the undesired images are attenuated by more than 90 dB. The measured signal-to-distortion ratio is approximately 66.7 dB. When a signal consisting of a desired signal plus white noise is interpolated, the SNR at the interpolator output is generally different from the SNR at the interpolator input. For an interpolation factor of L, the upsampling process spreads both the original signal energy and the original noise energy over a bandwidth L times greater than the original bandwidth. A well-designed interpolation filter passes almost exactly 1/L of the upsampled signal’s energy. However, the amount of noise energy passed by the filter depends upon the noise-equivalent bandwidth of the filter. If the amount passed is less than 1/L times the total noise energy in the simulation bandwidth, then the overall SNR will be improved by the interpolation process. 
On the other hand, if the amount of noise energy passed is greater than 1/L times the total noise energy, then the overall SNR will be degraded by the interpolation process. These changes must be accounted for when setting the level of additive noise in a simulation.
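This SNR bookkeeping can be made concrete with a back-of-the-envelope sketch (not a PracSim model). It assumes white noise filling the full simulation bandwidth; the fraction of noise power passed by an FIR filter h[n], relative to its DC power gain, is Σh²[n]/(Σh[n])², which is the filter's two-sided noise-equivalent bandwidth as a fraction of the sampling rate. Comparing that fraction with 1/L predicts the SNR change.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Fraction of full-band white-noise power passed by an FIR filter,
// normalized by the filter's DC power gain: (sum h^2) / (sum h)^2.
double noise_fraction(const std::vector<double>& h) {
    double sum = 0.0, sumsq = 0.0;
    for (double c : h) { sum += c; sumsq += c * c; }
    return sumsq / (sum * sum);
}

// Predicted SNR change (dB) for interpolation by L: the filter passes
// 1/L of the upsampled signal energy but noise_fraction(h) of the noise
// energy, so SNR improves when noise_fraction(h) < 1/L.
double snr_change_dB(const std::vector<double>& h, int L) {
    return 10.0 * std::log10((1.0 / L) / noise_fraction(h));
}
```

A length-4 boxcar with taps of 0.25 (a crude zero-order-hold interpolator for L = 4) has a noise fraction of exactly 1/4, so it predicts a 0 dB SNR change.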


Figure 14.13 Spectrum of four-sinusoid signal from Example 14.2 after upsampling by a factor of four.

Figure 14.14 Magnitude response of 71-tap interpolation filter designed for a passband edge of 4 Hz and a stopband edge of 8 Hz.


Figure 14.15 Magnified passband detail for filter response shown in Figure 14.14.

Figure 14.16 Spectrum of interpolated signal from Example 14.2 after filtering.

14.2.2 Decimation

Consider the case in which a lowpass signal is to be decimated. If decimation is to be useful, the signal of interest must occupy a bandwidth that is much smaller than half the original sampling rate. For DSP applications, after the sampling rate is reduced, the signal bandwidth must still be less than half the new sampling rate. These requirements drive the design of the decimation filter. As shown in Figure 14.17, the passband of the filter must be wide enough to accommodate the bandwidth of the desired signal, and the filter’s stopband edge frequency must be less than half the new sampling rate. If the signal to be decimated consists of a desired signal plus some amount of AWGN, care must be taken to ensure that the decimation filter does not introduce unacceptable changes in the spectral characteristics of the noise.

Figure 14.17 Critical frequencies in the design of a decimation filter.

Example 14.3 Consider a signal consisting of three sinusoids of equal magnitudes. The frequencies of the sinusoids are 1.5, 0.75, and 0.375 Hz. The sampling rate is 64 samples per second. A segment of such a signal generated by the MultipleToneGener model is shown in Figure 14.18. The estimated PSD of this signal is shown in Figure 14.19. The signal is to be decimated by a factor of four, reducing the sample rate to 16 samples per second. In a DSP application, a decimation filter having a passband just large enough to pass the desired signal might be used. If the filter from Example 14.2 is applied to the signal from Figure 14.18 prior to downsampling, the result after downsampling will have a PSD that agrees with Figure 14.19. However, if AWGN is added to the original signal such that the SNR is 5 dB, the signal will be as shown in Figure 14.20, and the estimated PSD of this signal will be as shown in Figure 14.21. If the filter from Example 14.2 is applied to this signal prior to downsampling, the result after downsampling will have a PSD as shown in Figure 14.22. The noise in this signal is not white; it is bandlimited to a range of ±4 Hz.

Figure 14.18 Segment of the three-sinusoid signal used in Example 14.3.


Figure 14.19 Estimated PSD of the three-sinusoid signal for Example 14.3.

Figure 14.20 Noisy test signal for Example 14.3.


Figure 14.21 Estimated PSD for noisy test signal used in Example 14.3.

Figure 14.22 Estimated PSD for decimated signal from Example 14.3.


In order to preserve the whiteness of the downsampled noise, it seems reasonable that the bandwidth of the decimation filter should be matched to the downsampled simulation bandwidth. It may not be immediately clear, however, exactly what constitutes a good match. Practical filters have transition bands of nonzero width. If the bandwidth is set so that the transition band falls within the simulation bandwidth, the PSD will roll off before reaching the folding frequency, as demonstrated in Example 14.3. If the bandwidth is set so that the transition band falls outside of the simulation bandwidth, the energy in the transition band will alias back into the simulation bandwidth, and the PSD will exhibit peaks near the folding frequency. The trick is to find the combination of (1) passband edge frequency, (2) transition width, and (3) transition-band response shape such that the transition band straddles the folding frequency as shown in Figure 14.23. In the ideal combination, energy (a) from the transition band above f_fold is aliased into transition-band frequencies (b) below f_fold in just the right amounts to replace the “missing” energy at (c), thereby yielding a flat PSD out to the folding frequency.
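Where transition-band energy lands after downsampling can be computed directly. This small sketch (illustrative name, not a PracSim model) folds a component at f hertz into the new band below fs_new/2; for example, with a new rate of 16 samples per second, energy at 10 Hz (inside an 8-to-11 Hz transition band) aliases to 6 Hz.

```cpp
#include <cassert>
#include <cmath>

// After downsampling to a new sampling rate fs_new, a real-signal
// component at f hertz appears at the folded frequency below fs_new/2:
// reduce f modulo fs_new, then reflect about fs_new/2.
double alias_frequency(double f, double fs_new) {
    double r = std::fmod(f, fs_new);
    if (r < 0.0) r += fs_new;
    return (r <= fs_new / 2.0) ? r : fs_new - r;
}
```

Components already below the folding frequency are unchanged, which is why transition-band energy just above f_fold can "fill in" the roll-off just below it.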

Figure 14.23 Optimal transition-band configuration for a decimation filter. (The passband edge, folding frequency, and stopband edge are marked along the frequency axis; regions a, b, and c are the transition-band regions referred to in the text.)


Example 14.4 Figure 14.24 shows the magnitude response of a 71-tap filter designed for a passband edge frequency of 8 Hz, a stopband edge frequency of 11 Hz, and a ripple ratio of 20. If a signal comprising only the noise portion of Figure 14.20 is decimated using this filter, the result will have a PSD as shown in Figure 14.25. This spectrum has peaks near the folding frequency due to noise energy in the transition band of the filter being aliased to frequencies within the passband. A 71-tap filter designed for a passband edge frequency of 7.12 Hz and a stopband edge frequency of 10.12 Hz comes close to the ideal, as shown by the nearly flat PSD in Figure 14.26.

Figure 14.24 Magnitude response of a 71-tap filter designed for a passband edge frequency of 8 Hz and a stopband edge frequency of 11 Hz.


Figure 14.25 Estimated PSD for noise-only signal from Example 14.4 decimated using a filter having a transition band from 8 Hz to 11 Hz.

Figure 14.26 Estimated PSD for noise-only signal from Example 14.4 decimated using a filter having a transition band from 7.1 Hz to 10.1 Hz.

14.3 Multirate Processing for Bandpass Signals

Prior to downsampling a bandpass signal, the bandwidth of the signal is usually reduced by moving the signal’s passband to a lower center frequency. Conversely, for interpolation of bandpass signals, once the signal has been upsampled, the signal bandwidth is usually increased by moving the signal’s passband to a higher center frequency.

14.3.1 Quadrature Demodulation

A real-valued bandpass signal has a spectrum that is conjugate-symmetric; that is, the real part of the spectrum is even-symmetric and the imaginary part of the spectrum is odd-symmetric. Consider the real-valued discrete-time signal x[n] having the DTFT spectrum X(e^{jω}) shown in Figure 14.27(a). If the signal x[n] is multiplied by e^{jω₀nT}, the spectrum will be shifted to the right by ω₀, as shown in Figure 14.27(b). The signal can then be lowpass-filtered to remove the spectral component centered at 2ω₀. The resulting spectrum, shown in Figure 14.27(c), is in general not conjugate-symmetric. This means that the corresponding time signal will in general be complex-valued. This signal is called the complex envelope [2] of x[n] and is usually denoted as x̃[n]. The complex envelope can be expressed in terms of an inphase component x_I[n] and a quadrature component x_Q[n]:

    x̃[n] = x_I[n] + j x_Q[n]        (14.3.1)

Because x[n] is real-valued and

    e^{jω₀nT} = cos(ω₀nT) + j sin(ω₀nT)

the inphase and quadrature components of x[n] can be obtained via quadrature demodulation using the demodulator structure shown in Figure 14.28. As an alternative to Eq. (14.3.1), the complex envelope can be expressed in polar form as

    x̃[n] = a(nT) exp(jφ(nT))

where a(nT) is the envelope and φ(nT) is the phase of the signal x[n].

Figure 14.27 Spectral interpretation of quadrature demodulation: (a) real-valued bandpass signal, (b) shifted spectrum, (c) filtered spectrum.

14.3.2 Quadrature Modulation

Given a complex-valued signal x̃[n] that has been obtained via quadrature demodulation of a real-valued bandpass signal, the process of quadrature modulation can be used to reconstruct the original bandpass signal. Consider the complex envelope

signal’s spectrum shown in Figure 14.27(c) and repeated in Figure 14.29(a). Multiplying x̃[n] by exp(−jω₀nT) will shift the spectrum to the left by ω₀, as shown in Figure 14.29(b). Our goal is to replicate the spectrum shown in Figure 14.27(a), so we can multiply x̃*[n], the complex conjugate of x̃[n], by exp(jω₀nT) to produce the shifted spectrum shown in Figure 14.29(c). Clearly, the original bandpass spectrum of Figure 14.27(a) can be obtained by adding together the two signals represented by the spectra in Figures 14.29(b) and 14.29(c):

    x[n] = x̃[n] exp(−jω₀nT) + x̃*[n] exp(jω₀nT)
         = (x_I[n] + j x_Q[n]) exp(−jω₀nT) + (x_I[n] − j x_Q[n]) exp(jω₀nT)
         = 2 x_I[n] cos(ω₀nT) + 2 x_Q[n] sin(ω₀nT)
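The reconstruction identity can be checked numerically; note the factor of 2 that falls out of expanding x̃ exp(−jω₀nT) + x̃* exp(jω₀nT). This stand-alone sketch is not a PracSim model: `demodulate` stands in for the multiplier-plus-lowpass-filter of Figure 14.28 by averaging over a whole number of carrier periods, and `remodulate` applies the equation above.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Crude quadrature demodulation: multiply by e^{+j w0 n T} (the text's
// right-shift convention) and average over a whole number of carrier
// periods; the average plays the role of the lowpass filter.
std::complex<double> demodulate(const std::vector<double>& x, double w0T) {
    std::complex<double> acc(0.0, 0.0);
    for (std::size_t n = 0; n < x.size(); ++n)
        acc += x[n] * std::exp(std::complex<double>(0.0, w0T * n));
    return acc / static_cast<double>(x.size());
}

// Quadrature modulation: rebuild one real bandpass sample from the
// complex envelope, x = x~ e^{-j w0 n T} + x~* e^{+j w0 n T}
//                     = 2 Re{ x~ e^{-j w0 n T} }
//                     = 2 xI cos(w0 n T) + 2 xQ sin(w0 n T).
double remodulate(double xI, double xQ, double w0nT) {
    std::complex<double> env(xI, xQ);
    std::complex<double> rot = std::exp(std::complex<double>(0.0, -w0nT));
    return 2.0 * std::real(env * rot);
}
```

Demodulating x[n] = cos(ω₀nT) over one full period yields the envelope 0.5 + j0, and remodulating that envelope recovers the original samples.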


Figure 14.28 Block diagram of a quadrature demodulator. (The input x[n] is multiplied by cos(ω₀nT) and by sin(ω₀nT); each product is lowpass-filtered by h[n] to produce x_I[n] and x_Q[n], respectively.)

Figure 14.29 Spectral interpretation of quadrature modulation: (a) spectrum of complex envelope, (b) shifted spectrum, (c) shifted complex-conjugate spectrum.


Chapter 15

MODELING DSP COMPONENTS

Modeling DSP components is different from modeling other constituent parts of a communications system. The model of an analog device is based on mathematical descriptions of processes that are defined by the laws of physics, often with only limited observability of all the various processes actually involved. Modeling of a DSP device is simplified by the fact that it is based on mathematical descriptions of processes that are themselves defined mathematically. On the other hand, modeling of a DSP device is often complicated by the need to model the effects of quantized signal values, quantized coefficients, and finite-precision arithmetic. When quantization effects are included, otherwise linear systems become nonlinear, making the exact architecture of the device and the order of mathematical operations an important consideration in the design of the model.

15.1 Quantization and Finite-Precision Arithmetic

In the analysis and modeling of DSP devices, quantization effects are usually grouped into three categories: coefficient quantization, signal quantization, and finite-precision arithmetic.

15.1.1 Coefficient Quantization

Design algorithms for digital filters often determine the filter coefficients to a very fine level of precision. If the filters are to be implemented using floating-point hardware or software, most or all of the precision in the coefficients can be maintained. However, if the filter is to be implemented using fixed-point hardware or software, the necessary quantization of the design coefficients introduces degradation into the performance of the filter. It is sometimes possible to predict and possibly mitigate this degradation as part of the design process by introducing coefficient quantization into the design process itself. These approaches may reduce, but not entirely eliminate, the need for simulations to determine filter performance. Most of the attempts to treat coefficient quantization in the design process still assume nearly infinite precision in the representation of the input signal and in the arithmetic used to implement the filter. The ultimate filter performance will include the effects of interactions between coefficient quantization, signal quantization, and finite-precision arithmetic; and these interactions are most easily explored via simulation.

15.1.1.1 Floating-Point Formats

Most higher-level languages such as Microsoft Visual C++ represent floating-point numbers using formats specified in IEEE Standard 754. A float in Visual C/C++ is stored in the IEEE 754 single-precision format depicted in Figure 15.1. The 23-bit mantissa is effectively 24 bits because for normalized values there is always an implicit bit with value 1 just to the left of the radix point. The 8-bit exponent is in excess-127 form with values 0x00 and 0xFF reserved for special values. In excess-127 format, the actual value of the exponent is the amount by which the stored value exceeds 127. An exponent of −3 is represented as 124₁₀ = 0x7C, an exponent of 0 is represented as 127₁₀ = 0x7F, and an exponent of 51 is represented as 178₁₀ = 0xB2.

The implicit bit to the left of the mantissa’s radix point makes it impossible to exactly represent the value zero. The smallest possible value in normalized form corresponds to an explicit mantissa of zero and has a value of 2^−126 ≈ 1.175 × 10^−38. The second smallest value in normalized form corresponds to the least significant bit (LSB) of the mantissa set to 1 and has a value of (1 + 2^−23) × 2^−126. The interval between the negative value closest to zero and the smallest positive value is 2^24 times larger than the interval between the smallest and second smallest positive values, thus creating a resolution “gap” around zero. The standard includes provisions for a denormalized form that provides an exact representation for zero and fills in the resolution gap around zero. The denormalized form, which is indicated by the reserved exponent value of 0x00, removes the implicit bit to the left of the radix point and uses an effective scale factor of 2^−126. In this form, when the mantissa LSB is set to 1, the value represented is 2^−149. Zero is exactly represented in denormalized form by a mantissa of zero.

When the result of an operation has a magnitude that exceeds the largest representable magnitude of (2 − 2^−23) × 2^127 ≈ 3.4 × 10^38, the value is reported as infinity by setting the mantissa to zero and the exponent to all ones. Infinity can be negative or positive depending upon the value of the sign bit. The result of an indeterminate operation is reported as a Quiet Not a Number (QNaN), which has an exponent of all ones and

Figure 15.1 Single-precision floating-point format. (Bit 31 is the sign s; bits 30–23 are the exponent e; bits 22–0 are the explicit mantissa m. Normalized: value = (−1)^s × M × 2^(e−127) for 1 ≤ e ≤ 254, with an implicit 1 to the left of the radix point. Denormalized: value = (−1)^s × M × 2^−126, with no implicit 1.)

a nonzero mantissa, with the most significant explicit bit of the mantissa set to 1. The result of an invalid operation is reported as a Signaling Not a Number (SNaN), which has an exponent of all ones and a nonzero mantissa, with the most significant explicit bit of the mantissa set to 0.

A C/C++ double is stored in the IEEE 754 double-precision format depicted in Figure 15.2. This format is similar to the single-precision format but has longer mantissa and exponent fields. The exponent is in excess-1023 form with values 0x000 and 0x7FF reserved to indicate denormalized form, infinity, and NaNs as described for the single-precision format.

Figure 15.2 Double-precision floating-point format. (Bit 63 is the sign s; bits 62–52 are the exponent e; bits 51–0 are the explicit mantissa m. Normalized: value = (−1)^s × M × 2^(e−1023) for 1 ≤ e ≤ 2046, with an implicit 1 to the left of the radix point. Denormalized: value = (−1)^s × M × 2^−1022, with no implicit 1.)
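The excess-127 examples above can be verified directly in C/C++. This sketch assumes the platform's float is the IEEE 754 single-precision format, which holds for Visual C/C++ and essentially all modern compilers; the struct and function names are illustrative.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

// Split an IEEE 754 single-precision value into its three bit fields.
struct FloatFields {
    std::uint32_t sign;      // bit 31
    std::uint32_t exponent;  // bits 30..23, excess-127 (0x00/0xFF reserved)
    std::uint32_t mantissa;  // bits 22..0, implicit leading 1 not stored
};

FloatFields dissect(float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);  // well-defined type pun
    return { bits >> 31, (bits >> 23) & 0xFFu, bits & 0x7FFFFFu };
}
```

For example, `dissect(1.0f)` shows exponent 127 (true exponent 0), `dissect(-0.125f)` shows exponent 124 = 0x7C (true exponent −3), and the smallest denormal has exponent field 0x00 with mantissa 1, the value 2^−149 described above.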

15.1.2 Signal Quantization

Quantization of the signals in a communications system can be significantly more complicated than it first appears. It is fairly straightforward to quantize a given range of voltages into 2^N different N-bit digital values. The difficult part is determining where the signal of interest will be positioned within this range of voltages. If the gain is optimized with respect to the quantizer range, the signal of interest may span 7 or 8 bits. On the other hand, if the signal is particularly weak or the gain is set improperly, the signal may span only 2 or 3 bits of the quantizer range. The performance of subsequent DSP stages will differ greatly for these two cases. A particular system may need to be designed so as to provide adequate performance at both extremes.

15.1.3 Finite-Precision Arithmetic

Assessing performance degradations due to finite-precision arithmetic is very much dependent on the details of how a particular algorithm is implemented. For rough estimates of the impact that quantization will have on the performance of a particular DSP device, it is a common practice to use a floating-point model of the device and apply “generic” quantization to the input, coefficients, and output. Generic quantization ignores the numerical format of the actual implementation and simply introduces granularity into the representation of the various values. One very simple approach for approximating N-bit quantization is to multiply each floating-point input sample by 2^(N−M−1), truncate the result to an integer, and then floating-point-divide this integer by 2^(N−M−1). The value of M is the smallest positive integer for which the input samples are guaranteed to have magnitudes less than 2^M. For very simple devices like FIR filters, this treatment of quantization may be adequate. However, for devices that involve feedback, like IIR filters, or adaptive coefficients, like equalizers and RAKE demodulators, the only way to ensure accuracy of the simulation is to employ bit-true modeling of the device. If the actual device multiplies a 6-bit signed input sample in fractional form by a 7-bit signed coefficient in fractional form and truncates the result to a 10-bit signed value, then a bit-true model does exactly the same thing.

Constructing bit-true models of DSP devices is facilitated by libraries that perform the finite-precision arithmetic. In one particularly elegant approach, every signal sample is represented by a C++ object that encapsulates both a fixed-point representation and a “full-precision” floating-point representation. Arithmetic operations are implemented in methods belonging to the class. These operations are performed on both the fixed-point and floating-point representations, storing the results in a new instance of the C++ object. Consequently, at any point in the processing, every fixed-point result can be immediately compared to the value it would have had if quantization were not in the picture. These comparisons help pinpoint those locations in a proposed device where additional precision will yield the greatest performance improvements.
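The "generic" quantization recipe described above fits in one function. This is a sketch of the rough-estimate approach, not a bit-true model; truncation toward zero is assumed as the reading of "truncate the result to an integer."

```cpp
#include <cassert>
#include <cmath>

// "Generic" N-bit quantization: multiply by 2^(N-M-1), truncate to an
// integer (toward zero), and divide back down. M is the smallest positive
// integer such that |x| < 2^M for every input sample. This only
// introduces granularity; it ignores the numeric format of a real device.
double generic_quantize(double x, int n_bits, int m) {
    const double scale = std::ldexp(1.0, n_bits - m - 1);  // 2^(N-M-1)
    return std::trunc(x * scale) / scale;
}
```

With N = 8 and M = 1 (inputs known to satisfy |x| < 2), the step size is 1/64, so 0.3 quantizes to 19/64 = 0.296875.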

15.2 FIR Filters

A finite impulse response (FIR) digital filter is one of the simplest DSP devices. The defining equation for an N-tap FIR filter is

    y[k] = Σ_{n=0}^{N−1} h_n x[k − n]

where
    y[k] is the filter output at time k
    x[k] is the filter input at time k
    h_n is the filter coefficient for delay n

There are a number of well-known techniques for determining the coefficients for an FIR filter having specified characteristics. These techniques include windowing, frequency sampling, and the Remez exchange [2]. The principles of modeling FIR filters are the same regardless of the method used to generate the coefficients. As depicted in Figure 15.3, the direct form of an N-tap FIR filter consists of a string of N − 1 single-sample delays with N coefficient multipliers and a summer. Sometimes this structure is referred to as a tapped-delay line or transversal filter. If the input signal consists of signed values having b_in bits and the coefficients have b_coef bits, then the outputs of the multipliers can have at most (b_in + b_coef − 1) bits. The sum of the multiplier outputs will have at most (b_N + b_in + b_coef − 1) bits, where b_N = ⌈log₂ N⌉.

The model IntDirectFormFir, provided on the companion Web site, models an FIR filter implemented using integer arithmetic. This model accepts an input signal of type Signal assumed to contain no more than b_in bits per value. Coefficients are externally scaled into integers of b_coef bits and supplied to the model via the parameter input mechanism. Multiplier and summer outputs are allowed to grow to as many bits as needed. Internal calculations use type int64, so use of this model is limited to cases where (b_N + b_in + b_coef) < 64. The summer output is right-shifted by b_shift bits and the b_mask least significant bits of the result are issued as the filter output. Both b_shift and b_mask are user-specified values.

The model FracDirectFormFir, provided on the companion Web site, takes a different approach, which is diagrammed in Figure 15.4. The b_scale LSBs of the
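The word-growth bookkeeping behind IntDirectFormFir can be sketched in a few lines. The interface here is illustrative (the actual PracSim model works on Signal objects and its parameter mechanism); the point is that a 64-bit accumulator never overflows as long as b_N + b_in + b_coef < 64, after which the sum is right-shifted by b_shift and masked to b_mask bits.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One output of a direct-form FIR computed in integer arithmetic, in the
// spirit of the IntDirectFormFir description above. 'x' holds the input
// history with the newest sample last; 'h' holds the integer-scaled
// coefficients. Products and the running sum use 64-bit accumulators.
int64_t int_fir_output(const std::vector<int32_t>& x,
                       const std::vector<int32_t>& h,
                       int b_shift, int b_mask) {
    int64_t acc = 0;
    for (std::size_t n = 0; n < h.size() && n < x.size(); ++n)
        acc += static_cast<int64_t>(h[n]) * x[x.size() - 1 - n];
    const int64_t mask = (int64_t(1) << b_mask) - 1;   // keep b_mask LSBs
    return (acc >> b_shift) & mask;                    // scale, then mask
}
```

With x = {1, 2, 3}, h = {1, 1, 1}, the accumulator is 6; shifting right by 1 and masking to 8 bits yields 3.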

Figure 15.3 FIR filter.

input signal are scaled into fractional values of b_in bits. Coefficients are externally scaled into fractional values of b_coef bits and supplied to the model via the parameter input mechanism. Multiplier outputs are in fractional form and truncated to b_mult bits. The summer output is truncated to b_sum bits. Values for b_scale, b_in, b_coef, b_mult, and b_sum are all user-specified.

In exploring alternative topologies for implementing digital filters, it is convenient to depict the topology in the form of a signal flow graph (SFG) as in Figure 15.5. The interpretation of an SFG is subject to four simple rules:

1. The direction of signal flow is indicated by an arrowhead near the center of each branch, and the gain of the branch is indicated near this arrowhead.
2. An unlabeled branch has unity gain.
3. A delay of M sample times is indicated by a branch gain of z^−M.
4. All of the branch signals entering a node are added together to obtain the signal exiting the node.

Interpreting Figure 15.5 by these rules reveals that the SFG in the figure is equivalent to the block diagram in Figure 15.3.

The transposition theorem for SFGs states that the system represented by a particular SFG can be transposed into a different but equivalent system by simply

Figure 15.4 Fractional quantization scheme for a direct-form FIR filter.

reversing the flow direction in every branch and reversing the roles of the system-level input node and output node. Figure 15.5 can be transposed to yield the SFG shown in Figure 15.6, which can be redrawn as in Figure 15.7. The direct-form implementation delays raw input samples, multiplies the current sample and N − 1 delayed samples by the filter coefficients, and then immediately sums the multiplier outputs. The transposed direct form multiplies the current sample by all N filter coefficients and then sums each multiplier output into a different point in a delay chain, as depicted in Figure 15.8.

FIR filters are often selected for an application because they can be designed to have constant group delay, which is a desirable property for filters because nonconstant group delay will cause envelope distortion in modulated-carrier signals and pulse-shape distortion in baseband data signals. A filter's frequency response H(e^jω) can be expressed in terms of amplitude response A(ω) and phase response

Figure 15.5 Signal flow graph for a direct-form realization of an FIR filter.

Figure 15.6 Transposed signal flow graph for a direct-form realization of an FIR filter.

θ(ω):

H(e^jω) = A(ω) e^{jθ(ω)}

The filter will have constant group delay if and only if

θ(ω) = β + αω    (15.2.1)

where α and β are constants. It can be shown that an N-tap FIR filter will satisfy Eq. (15.2.1) if all of the following are satisfied:

α = −(N − 1)/2    (15.2.2a)
β = ±π/2    (15.2.2b)
h[n] = −h[N − 1 − n],  0 ≤ n ≤ N − 1    (15.2.2c)

Figure 15.7 Signal flow graph for a transposed direct-form realization of an FIR filter.

Figure 15.8 Transposed direct-form FIR filter.
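The transposed structure of Figure 15.8 is easy to express in code. The sketch below implements both the direct form and the transposed direct form as stand-alone floating-point functions so that their outputs can be compared; the function names and the use of double arithmetic are illustrative choices, not taken from the PracSim models.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reference direct form: y[k] = sum_{n=0}^{N-1} h[n] * x[k-n],
// with zero initial conditions.  h is assumed nonempty.
std::vector<double> fir_direct(const std::vector<double>& h,
                               const std::vector<double>& x) {
    std::vector<double> y(x.size(), 0.0);
    for (std::size_t k = 0; k < x.size(); ++k)
        for (std::size_t n = 0; n < h.size() && n <= k; ++n)
            y[k] += h[n] * x[k - n];
    return y;
}

// Transposed direct form (Figure 15.8): the current sample is multiplied
// by all N coefficients, and each product is summed into a different
// point of the delay chain.
std::vector<double> fir_transposed(const std::vector<double>& h,
                                   const std::vector<double>& x) {
    const std::size_t N = h.size();               // assumed N >= 1
    std::vector<double> reg(N > 1 ? N - 1 : 0, 0.0);   // delay chain
    std::vector<double> y(x.size(), 0.0);
    for (std::size_t k = 0; k < x.size(); ++k) {
        y[k] = h[0] * x[k] + (N > 1 ? reg[0] : 0.0);
        // forward update: each reg[n] reads the old value of reg[n+1]
        for (std::size_t n = 0; n + 1 < reg.size(); ++n)
            reg[n] = h[n + 1] * x[k] + reg[n + 1];
        if (N > 1) reg[N - 2] = h[N - 1] * x[k];
    }
    return y;
}
```

Because the two forms realize the same transfer function, their outputs agree sample for sample; one practical attraction of the transposed form is that each adder in the chain sums only two values, rather than feeding a single N-input summer.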

A constant group-delay filter will have linear phase if the phase response passes through the origin; that is, if

θ(ω) = αω    (15.2.3)

It can be shown that an N-tap FIR filter will satisfy Eq. (15.2.3) if both of the following are satisfied:

α = −(N − 1)/2    (15.2.4a)
h[n] = h[N − 1 − n],  0 ≤ n ≤ N − 1    (15.2.4b)
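The coefficient symmetry h[n] = h[N − 1 − n] can be exploited directly in an implementation: delayed samples that share a coefficient are added before the multiply, roughly halving the number of multiplications per output. Below is a sketch of this folded computation for the case of N odd with even symmetry (Type 1 in the classification that follows); it is an illustrative stand-alone function, not one of the PracSim models.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Delayed input sample with zero initial conditions.
static double tap(const std::vector<double>& x, std::size_t k, std::size_t n) {
    return (k >= n) ? x[k - n] : 0.0;
}

// Folded Type 1 FIR: h[n] == h[N-1-n], N odd.  Samples sharing a
// coefficient are summed *before* the multiply, so only (N+1)/2
// multiplies are needed per output sample instead of N.
std::vector<double> fir_type1(const std::vector<double>& h,
                              const std::vector<double>& x) {
    const std::size_t N = h.size();        // assumed odd and symmetric
    const std::size_t M = (N - 1) / 2;     // center-tap index
    std::vector<double> y(x.size(), 0.0);
    for (std::size_t k = 0; k < x.size(); ++k) {
        double acc = h[M] * tap(x, k, M);  // unpaired center tap
        for (std::size_t n = 0; n < M; ++n)
            acc += h[n] * (tap(x, k, n) + tap(x, k, N - 1 - n));
        y[k] = acc;
    }
    return y;
}
```

An impulse applied to this structure returns the symmetric coefficient set itself, confirming that folding does not change the impulse response.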

FIR filters having constant group delay are usually categorized into four types corresponding to the four combinations of odd/even N and odd/even symmetry of h[n]. These four types and their properties are summarized in Table 15.1. Because of the symmetry in their coefficients, FIR filters with constant group delay can be implemented more efficiently than the general implementations of Figures 15.5 and 15.7.

Signal flow graphs for implementations of types 1 through 4 are shown in Figures 15.9 through 15.12.

Table 15.1 Properties of FIR filters having constant group delay.

Type                    1      2      3      4
Length, N               Odd    Even   Odd    Even
h[n] symmetry           Even   Even   Odd    Odd
Linear phase            Yes    Yes    No     No
Constant group delay    Yes    Yes    Yes    Yes

Figure 15.9 Signal flow graph for a Type 1 constant-group-delay FIR filter.

15.3 IIR Filters

Infinite impulse response (IIR) digital filters were discussed in Chapter 8 in connection with using the bilinear transformation to model classical analog filters. The current section revisits IIR filters from the perspective of modeling them when they are used as part of the DSP processing in a communication system. IIR filters offer some advantages over FIR filters; they also suffer some disadvantages. IIR filters can usually achieve narrow transition bands and high levels of stopband attenuation using significantly fewer coefficients than a comparable FIR filter, but IIR filters cannot be designed to have exactly linear phase or constant group delay. IIR filters are also more likely to experience stability and numerical precision problems when

Figure 15.10 Signal flow graph for a Type 2 constant-group-delay FIR filter.

Figure 15.11 Signal flow graph for a Type 3 constant-group-delay FIR filter.

implemented using finite-precision arithmetic. The defining equation for an IIR filter is

y[k] = Σ_{n=1}^{N} a_n · y[k − n] + Σ_{m=0}^{M} b_m · x[k − m]

The SFG for the direct-form 1 realization of an IIR filter is shown in Figure 15.13. This system can be viewed as two systems in cascade: a moving average (MA) system followed by an autoregressive (AR) system. Because both of these systems are linear time-invariant systems, the order of the cascade can be reversed to obtain the system shown in Figure 15.14. The two delay chains running down the center of the figure are delaying the same signal, so they can be merged into a single chain

Figure 15.12 Signal flow graph for a Type 4 constant-group-delay FIR filter.

to yield the system in Figure 15.15. This system is known as the direct-form 2 realization of an IIR filter. The models DirectForm1Iir and DirectForm2Iir, provided on the companion Web site, both use generic quantization strategies.

Figure 15.13 Signal flow graph for direct-form 1 IIR filter.

Figure 15.14 IIR filter with order of AR and MA sections reversed.

Figure 15.15 Signal flow graph for direct-form 2 IIR filter.
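The direct-form 1 and direct-form 2 computations can be sketched as follows. Both functions implement the defining equation y[k] = Σ a_n y[k−n] + Σ b_m x[k−m]; the first delays x and y separately (Figure 15.13), while the second routes everything through the single shared delay chain w of Figure 15.15. These are floating-point illustrations with hypothetical function names, not the DirectForm1Iir and DirectForm2Iir models.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Coefficient conventions: a[0] is unused (kept 0 for clarity),
// a[1..N] are the feedback taps, b[0..M] are the feedforward taps.

// Direct form 1: MA section operates on x, AR section on y.
std::vector<double> iir_df1(const std::vector<double>& a,
                            const std::vector<double>& b,
                            const std::vector<double>& x) {
    std::vector<double> y(x.size(), 0.0);
    for (std::size_t k = 0; k < x.size(); ++k) {
        double acc = 0.0;
        for (std::size_t m = 0; m < b.size(); ++m)
            if (k >= m) acc += b[m] * x[k - m];
        for (std::size_t n = 1; n < a.size(); ++n)
            if (k >= n) acc += a[n] * y[k - n];
        y[k] = acc;
    }
    return y;
}

// Direct form 2: one shared delay chain holds the intermediate signal,
// w[k] = x[k] + sum a[n] w[k-n], and y[k] = sum b[m] w[k-m].
std::vector<double> iir_df2(const std::vector<double>& a,
                            const std::vector<double>& b,
                            const std::vector<double>& x) {
    std::vector<double> w(x.size(), 0.0), y(x.size(), 0.0);
    for (std::size_t k = 0; k < x.size(); ++k) {
        double v = x[k];
        for (std::size_t n = 1; n < a.size(); ++n)
            if (k >= n) v += a[n] * w[k - n];
        w[k] = v;
        for (std::size_t m = 0; m < b.size(); ++m)
            if (k >= m) y[k] += b[m] * w[k - m];
    }
    return y;
}
```

Driven by the same input, the two realizations produce the same output, which is the point of the derivation above: direct form 2 merely shares the delay chain that direct form 1 duplicates.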

Chapter 16

CODING AND INTERLEAVING

Error-correction codes used in wireless communication can be

divided into two broad categories: block codes and convolutional codes. The implementations for these two categories are very different, and so are the corresponding simulation models. Convolutional codes almost always need to be simulated at a level of detail that amounts to a de facto implementation. On the other hand, it is often possible to simulate the performance of block codes without explicitly modeling the details of the encoder or decoder at all. Interleavers are often employed with block codes to improve their performance. An introduction to both block and convolutional codes can be found in [35].

16.1 Block Codes

A block code operates on fixed-length blocks of information bits. These blocks are called message blocks. A block encoder operates on a message block of k information bits to generate an output block of n bits, where n > k. The output block is called a code word, code vector, or code block. The rules for generating the code block vary depending upon the specific code (Hamming, Reed-Solomon, BCH, etc.) being used.

An information block of k bits is capable of conveying 2^k different messages. The encoded transmission uses n bits to convey only k bits of useful information, so the code block contains (n − k) bits of redundancy. It is this redundancy that is the source of the code's ability to detect or correct errors. A block code having n bits per code block and k bits per message block is designated as an (n, k) code. The minimum distance d or number of correctable errors τ is often included as an explicitly identified parameter, as in a "(7, 4, d = 3) code" or "(7, 4, τ = 1) code." Encoding and decoding of block codes involve arithmetic over Galois fields, which is summarized in Appendix C.

16.1.1 Cyclic Codes

A binary block code is linear if and only if the modulo-2 sum of any two codewords is also a codeword. A linear block code C is a cyclic code if every cyclic shift of a codeword in C is also a codeword in C. Linear cyclic codes possess mathematical properties that make them easier to encode and decode than linear codes that are not cyclic.

The codewords for a (7, 4, d = 3) cyclic code are listed in Table 16.1. The codewords have been sorted into groups such that the codewords within each group are cyclic shifts of each other. Just like the elements of a Galois field, as discussed in Appendix C, each of the codewords can be represented by a polynomial, as shown in the table. The nonzero code polynomial of minimum degree is unique and is designated as the generator polynomial of the code, denoted as g(x). A number of useful results have been developed concerning the generator polynomial:

• If the generator polynomial of a cyclic code is of the form

g(x) = Σ_{i=0}^{r} g_i x^i

the constant term g_0 will always equal 1.

• For an (n, k) cyclic code, the degree of the generator polynomial is n − k.

• A polynomial of degree n − 1 or less with binary coefficients is a code polynomial of the code C if and only if the polynomial is divisible by the code's generator polynomial g(x).

• If g(x) is the generator polynomial of an (n, k) cyclic code, then g(x) is a factor of x^n + 1.

• If g(x) is a polynomial of degree n − k and is a factor of x^n + 1, then g(x) is the generator polynomial for some (n, k) cyclic code.

Algorithm 16.1 provides an encoding approach for cyclic codes that is based on synthetic division. For small values of n and k, this approach can be applied manually using pencil and paper, perhaps to verify the correct operation of a cyclic encoder model. The generator polynomial for the code in Table 16.1 is x^3 + x + 1. The information sequence for the tenth codeword in the table corresponds to the polynomial p(x) = x^3 + 1. When multiplied by x^{n−k}, this becomes x^6 + x^3. Synthetic division of x^6 + x^3 by x^3 + x + 1 is shown in Figure 16.1. The remainder of x^2 + x corresponds to the check bits 110, which agrees with the check bits shown for the tenth entry in Table 16.1.
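The synthetic division of Algorithm 16.1 reduces to XOR operations when polynomials over GF(2) are stored as bit arrays, since subtraction modulo 2 is XOR. A minimal sketch follows; the helper names are illustrative, not taken from the book's encoder model.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Polynomials over GF(2) as bit vectors: poly[i] is the coefficient of x^i.
using Gf2Poly = std::vector<int>;

static int degree(const Gf2Poly& p) {
    for (int i = (int)p.size() - 1; i >= 0; --i)
        if (p[i]) return i;
    return -1;   // zero polynomial
}

// Remainder of a(x) divided by g(x) over GF(2): synthetic division
// where each subtraction step is an XOR of shifted copies of g(x).
Gf2Poly gf2_mod(Gf2Poly a, const Gf2Poly& g) {
    const int dg = degree(g);
    for (int da = degree(a); dg >= 0 && da >= dg; da = degree(a))
        for (int i = 0; i <= dg; ++i)
            a[da - dg + i] ^= g[i];        // subtract x^(da-dg) * g(x)
    a.resize(dg > 0 ? dg : 1);             // keep only the check-bit span
    return a;
}

// Check bits for an (n, k) cyclic code: remainder of x^(n-k) * p(x) mod g(x),
// where p(x) is the information polynomial (Algorithm 16.1).
Gf2Poly cyclic_check_bits(const Gf2Poly& info, const Gf2Poly& g,
                          int n, int k) {
    Gf2Poly shifted(n - k, 0);             // multiply by x^(n-k)
    shifted.insert(shifted.end(), info.begin(), info.end());
    return gf2_mod(shifted, g);
}
```

Running cyclic_check_bits with the information polynomial x^3 + 1, g(x) = x^3 + x + 1, n = 7, and k = 4 reproduces the remainder x^2 + x, i.e., the check bits 110 of the tenth codeword in Table 16.1.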

Table 16.1 Codewords for linear cyclic (7, 4) code.

Info block   Check block   Polynomial
0000         000           0
0001         011           x^3 + x + 1
0010         110           x^4 + x^2 + x
0101         100           x^5 + x^3 + x^2
1011         000           x^6 + x^4 + x^3
0110         001           x^5 + x^4 + 1
1100         010           x^6 + x^5 + x
1000         101           x^6 + x^2 + 1
0100         111           x^5 + x^2 + x + 1
1001         110           x^6 + x^3 + x^2 + x
0011         101           x^4 + x^3 + x^2 + 1
0111         010           x^5 + x^4 + x^3 + x
1110         100           x^6 + x^5 + x^4 + x^2
1101         001           x^6 + x^5 + x^3 + 1
1010         011           x^6 + x^4 + x + 1
1111         111           x^6 + x^5 + x^4 + x^3 + x^2 + x + 1

Algorithm 16.1 Using synthetic division to encode cyclic codes.

n is the code length. k is the number of information bits.

Execute:
1. Form the polynomial representation of the k-bit information sequence.
2. Multiply this polynomial by x^{n−k}.
3. Divide this product by the generator polynomial g(x). The remainder produced by this division is the polynomial representation of the check bits.

              x^3 + x
x^3 + x + 1 ) x^6       + x^3
              x^6 + x^4 + x^3
                    x^4
                    x^4 + x^2 + x
                          x^2 + x

Figure 16.1 Synthetic division of x^6 + x^3 by x^3 + x + 1.

16.2 BCH Codes

BCH codes are a class of linear cyclic block codes discovered by R. C. Bose and D. K. Ray-Chaudhuri [36] and independently by A. Hocquenghem [37]. A coding theory result known as the BCH bound states that a linear cyclic code is guaranteed to have a minimum distance of δ or greater if the code is constructed such that

• each codeword contains n bits.

• the code's generator polynomial g(x) has included among its roots (δ − 1)


consecutive powers of β, where β is an element of order n from the extension field GF(2^m).

BCH codes are the result of creating a generator polynomial having a sequence of roots that satisfies the BCH bound. In order to correct τ errors, the code must have a minimum distance of at least δ = 2τ + 1. The sequence of required roots can be denoted as

β^{b+1}, β^{b+2}, . . . , β^{b+2τ}

Because the roots β^{b+1} through β^{b+2τ} are drawn from the extension field GF(2^m), the polynomial formed as the product of the factors (x + β^{b+1}) through (x + β^{b+2τ}) will, in general, have coefficients that are also elements of GF(2^m). For a binary code, the generator polynomial must have coefficients from the prime field GF(2). Therefore, the generator polynomial is formed as

g(x) = (x + β^{b+1})(x + β^{b+2})(x + β^{b+3}) · · · (x + β^{b+2τ}) p(x)

where the polynomial p(x) contains additional roots that are needed to ensure that each coefficient of g(x) is either 0 or 1. The additional roots needed to define p(x) can be found using minimal polynomials. (See Appendix C.) For each required root β^r, there is a minimal polynomial M^{(r)}(x) that has β^r as a root and has binary coefficients drawn from GF(2). It follows, then, that a generator polynomial that has binary coefficients and that includes all the required roots can be obtained as the least common multiple of the minimal polynomials M^{(r)}(x) for r = b + 1 through r = b + 2τ:

g(x) = lcm[ M^{(b+1)}(x), M^{(b+2)}(x), . . . , M^{(b+2τ)}(x) ]

The BCH codes most often encountered in practical communications systems are primitive narrow-sense BCH codes. A primitive BCH code results when the element β is a primitive element of the extension field GF(2^m). A narrow-sense BCH code results when b = 0, making the sequence of required roots

β^1, β^2, . . . , β^{2τ}

All of the BCH codes considered in this book are primitive narrow-sense BCH codes even if not specifically identified as such.

Algorithm 16.2 can be used to construct the generator polynomial for given values of n and τ. This algorithm is implemented by the class BchGenPoly, which is summarized in Table 16.2.

Algorithm 16.2 Constructing the generator polynomial for a primitive narrow-sense BCH code.

n is the code length, subject to the constraint n = 2^m − 1, where m is an integer. τ is the desired maximum number of errors to be corrected in each block of n bits.

Initialize: g(x) = 1

Execute:
1. From GF(2^m) select a primitive element β = α^j, where j is an integer such that 1 ≤ j ≤ 2^m − 2 and gcd(j, 2^m − 1) = 1.
2. Use Algorithm C.5 to decompose 2^m − 1 into cyclotomic cosets C_0, C_1, C_3, . . . , C_Q.
3. Set u_p = 0 for p = 1, 3, . . . , Q.
4. For j = 0, 1, 2, . . . , τ − 1, compare (2j + 1) to the elements of each cyclotomic coset. If (2j + 1) is an element of cyclotomic coset C_p, then set u_p = 1.
5. For p = 1, 3, . . . , Q, if u_p = 1, then form the minimal polynomial M^{(p)}(x) as

M^{(p)}(x) = ∏_{q ∈ C_p} (x + β^q)

6. Form the generator polynomial g(x) as

g(x) = ∏_{p = 1, 3, . . . , Q} [M^{(p)}(x)]^{u_p}
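Step 2 of the algorithm relies on Algorithm C.5 from Appendix C. For readers without the appendix at hand, the cyclotomic cosets of 2 modulo 2^m − 1 can be computed in a few lines: the coset containing s is {s, 2s, 4s, . . .} reduced modulo 2^m − 1. The sketch below is an illustrative stand-in, not the book's CyclotomicPartition class.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Cyclotomic cosets of 2 modulo n = 2^m - 1, in order of their
// smallest (coset-leader) element.
std::vector<std::vector<int>> cyclotomic_cosets(int m) {
    const int n = (1 << m) - 1;
    std::vector<std::vector<int>> cosets;
    std::set<int> seen;
    for (int s = 0; s < n; ++s) {
        if (seen.count(s)) continue;       // already in an earlier coset
        std::vector<int> coset;
        int e = s;
        do {                               // multiply by 2 until we wrap
            coset.push_back(e);
            seen.insert(e);
            e = (2 * e) % n;
        } while (e != s);
        cosets.push_back(coset);
    }
    return cosets;
}
```

For m = 4 (n = 15) this produces the cosets {0}, {1, 2, 4, 8}, {3, 6, 12, 9}, {5, 10}, and {7, 14, 13, 11}, which is the decomposition a (15, k) BCH construction would use.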

The models BchEncoder and BchDecoder both make use of BchGenPoly. The BchEncoder model implements Algorithm 16.1 as tailored for the BCH case. The BchDecoder model implements the Peterson-Berlekamp algorithm described in the next section.

Table 16.2 Summary of class BchGenPoly.

Constructors:
  BchGenPoly::BchGenPoly( int code_block_len, int max_correctable_errs )
      : PolyOvrPrimeField();
  BchGenPoly::BchGenPoly( int code_block_len, int info_block_len )
      : PolyOvrPrimeField();

Public methods:
  int GetMinDistance( void );
  int GetInfoBlockLen( void );
  int GetMaxCorrectableErrs( void );

Notes:
1. This class does not inherit from PracSimModel and does not read from ParmInput.
2. This class inherits from the base class PolyOverPrimeField.
3. This class creates an instance of class BinaryExtenField.
4. This class creates an instance of class CyclotomicPartition.
5. This class creates several instances of class MinimalPolynomial.
6. Source code is contained in file bch_gen_poly.cpp.

16.3 Interleavers

Block codes such as BCH codes fail when more than the correctable number of bit errors occur within a single code block. There are many channels in which errors tend to occur in bursts, greatly increasing the likelihood that an excessive number of bit errors will occur within a single code block. In communications systems designed to operate over bursty channels, interleavers are often used to reduce the likelihood of code-block failures. The interleavers permute the transmission order of the encoded bits so that bits from a single code block are dispersed over many code-block durations in the channel. Within a single code-block duration in the channel, there will be bits from many different code blocks. Thus, a burst of errors created in the channel will be spread over many different code blocks when the deinterleaver restores the encoded bits to their original sequence at the receiver prior to being decoded. With a properly designed combination of block code and interleaver length, each deinterleaved code block will contain sufficiently few bit errors so that the decoder can correct them and deliver an error-free information block. There are two basic types of interleavers: block interleavers and convolutional interleavers.

16.3.1 Block Interleavers

A block interleaver is conceptually very simple. As depicted in Figure 16.2, a rectangular array of bit cells is filled row by row, and when the array is full, the bits are read out column by column. In a practical system, continuous delivery of bits out of the interleaver is accomplished by having two arrays so that one can be read out while the other is being written. Once the input array is completely filled, it becomes the output array, and the original output array becomes the new input array. At the receiver, the deinterleaver conceptually fills its array column by column and reads out row by row.

As a practical matter, the sense of the rows and columns is not important. If the interleaver has an NR × NC array and fills this array row by row, a second interleaver with an NC × NR array can accomplish the deinterleaving while filling its array row by row also. The important issue is the relative dimensions of the interleaving and deinterleaving arrays. A single simulation model can be designed to perform either interleaving or deinterleaving and avoid altogether the notion of rows and columns. The BlockPermuter model, summarized in Table 16.3, configures its internal arrays based on user-supplied values for Fill_Segment_Len and Drain_Segment_Len. The value of Drain_Segment_Len for the interleaver must equal the value of Fill_Segment_Len for the deinterleaver. Likewise for the interleaver Fill_Segment_Len and the deinterleaver Drain_Segment_Len.
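The fill/drain behavior can be sketched in a few lines. The function below is an illustrative stand-in for BlockPermuter (the real model streams blocks through the PracSim signal mechanism rather than operating on whole vectors, and the parameter names here merely echo its parameters):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Block (de)interleaver: fill an array in segments of fill_len, drain it
// in segments of drain_len.  The input length is assumed to be a multiple
// of fill_len * drain_len.  Swapping the two arguments inverts the permutation.
std::vector<int> block_permute(const std::vector<int>& in,
                               std::size_t fill_len, std::size_t drain_len) {
    std::vector<int> out(in.size(), 0);
    const std::size_t block = fill_len * drain_len;
    for (std::size_t base = 0; base + block <= in.size(); base += block)
        // element (r, c) was written at r*fill_len + c; drain column by column
        for (std::size_t c = 0; c < fill_len; ++c)
            for (std::size_t r = 0; r < drain_len; ++r)
                out[base + c * drain_len + r] = in[base + r * fill_len + c];
    return out;
}
```

With fill_len = 8 and drain_len = 4, this reproduces the read-out order 1, 9, 17, 25, 2, 10, 18, 26, . . . of a 4 × 8 array filled by rows and read by columns, and applying the function again with the segment lengths swapped restores the original order, which is exactly the interleaver/deinterleaver pairing described above.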

fill by rows: 1, 2, 3, 4, 5, 6, 7, 8, . . .

 1  2  3  4  5  6  7  8
 9 10 11 12 13 14 15 16
17 18 19 20 21 22 23 24
25 26 27 28 29 30 31 32

read by columns: 1, 9, 17, 25, 2, 10, 18, 26, . . .

Figure 16.2 Block interleaver.

Table 16.3 Summary of model BlockPermuter.

Constructors:
  BlockPermuter( char* instance_name,
                 PracSimModel* outer_model,
                 Signal< int >* in_sig,
                 Signal< int >* out_sig )

Parameters:
  int Fill_Segment_Len;
  int Drain_Segment_Len;

Notes:
1. BlockSize for this model is set by the PracSim system.
2. Source code is contained in file block_perm.cpp.

16.3.2 Convolutional Interleavers

The conceptual design of a convolutional interleaver is depicted in Figure 16.3. A routing commutator cycles through a set of delay lines, routing each successive symbol through a different line. At the other end of the delay lines, a selecting commutator cycles through the parallel lines to construct a permuted serial symbol


stream for transmission over the channel. At the receiver, a routing commutator cycles through a set of delay lines, routing each successive symbol from the permuted symbol sequence through a different line. A selecting commutator cycles through the parallel lines to construct a serial symbol stream that is restored to the same order as before the interleaver at the transmitter. The delay lines in the interleaver are fed in sequence from the shortest (no delay) to the longest delay. The delay lines in the deinterleaver are fed in sequence from longest to shortest. The model ConvolutionalPermuter, summarized in Table 16.4, can be used to implement either an interleaver or deinterleaver according to the value of the parameter Shortest Delay First.
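The commutator-and-delay-line behavior can be sketched with one FIFO queue per delay line. The class below is an illustrative stand-in for ConvolutionalPermuter, with hypothetical constructor arguments: L delay lines whose delays grow in steps of D commutator cycles, and a flag selecting whether the shortest line is fed first (interleaver) or last (deinterleaver).

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Convolutional (de)interleaver.  shortest_first = true gives the
// interleaver of Figure 16.3 (line i delays i*D cycles); false gives the
// matching deinterleaver (line i delays (L-1-i)*D cycles).
class ConvPermuter {
public:
    ConvPermuter(std::size_t L, std::size_t D, bool shortest_first)
        : lines_(L), next_(0) {
        for (std::size_t i = 0; i < L; ++i) {
            std::size_t delay = (shortest_first ? i : L - 1 - i) * D;
            lines_[i].assign(delay, 0);   // prime each line with zeros
        }
    }
    // One commutator step: route sym into the current line, emit the
    // symbol leaving the other end of that line.
    int process(int sym) {
        std::deque<int>& q = lines_[next_];
        next_ = (next_ + 1) % lines_.size();
        if (q.empty()) return sym;        // the zero-delay line
        q.push_back(sym);
        int out = q.front();
        q.pop_front();
        return out;
    }
private:
    std::vector<std::deque<int>> lines_;
    std::size_t next_;
};
```

Cascading an interleaver (shortest delay first) with its matching deinterleaver (longest delay first) restores the input sequence after a fixed end-to-end delay of D·L·(L − 1) symbols, with zeros emitted during the start-up transient.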

Figure 16.3 Convolutional interleaver.

16.4 Convolutional Codes

An encoder for a simple convolutional code is shown in Figure 16.4. Bits are shifted into the left end of the three-stage shift register. The switch changes position at twice the input shift rate, thereby producing two output bits for each input bit. The code produced by this encoder can be described as a rate one-half, constraint length 3 code, because the input rate is half the output rate and the encoder has 3 bits of memory. The two adders shown in the figure perform modulo-2 addition of the shift-register bits to which they are connected. The encoder is a state machine that can be represented as the Moore machine shown in Figure 16.5.

The taps (connections to the shift register) are characterized by polynomials, which can be represented using a k-tuple of bits. These k-tuples are further abbreviated by using their equivalent octal values. For the encoder in Figure 16.4, the octal representations are g0 = 7 and g1 = 5. For the encoder in Figure 16.6, the octal tap representations are g0 = 171 and g1 = 133.
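The encoder of Figure 16.4 can be sketched in a few lines: with taps g0 = 7 and g1 = 5 (octal), each input bit b produces the dibit (b XOR b1 XOR b2, b XOR b2), where b1 and b2 are the two previous input bits. This illustrative function is not the book's encoder model.

```cpp
#include <cassert>
#include <vector>

// Rate one-half, constraint length 3 convolutional encoder with taps
// g0 = 7 (binary 111) and g1 = 5 (binary 101).  Two output bits are
// produced for each input bit; the register starts cleared.
std::vector<int> conv_encode(const std::vector<int>& bits) {
    int r1 = 0, r2 = 0;                  // the two older register stages
    std::vector<int> out;
    for (int b : bits) {
        out.push_back(b ^ r1 ^ r2);      // G0: taps 1,1,1
        out.push_back(b ^ r2);           // G1: taps 1,0,1
        r2 = r1;                         // shift the register
        r1 = b;
    }
    return out;
}
```

Encoding the nine input bits 100010100 (a 7-bit message plus a 2-bit flushing tail) yields the sequence 11 10 11 00 11 10 00 10 11 used in the Viterbi decoding discussion later in this section.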

Table 16.4 Summary of model ConvolutionalPermuter.

Constructors:
  ConvolutionalPermuter( char* instance_name,
                         PracSimModel* outer_model,
                         Signal* in_sig,
                         Signal* out_sig )

Parameters:
  int Numb_Delay_Lines;
  int Delay_Increment;
  bool Shortest_Delay_First;

Notes:
1. BlockSize for this model is set by the PracSim system.
2. Source code is contained in file conv_perm.cpp.

Figure 16.4 A simple convolutional encoder.

In the Moore machine model of the encoder, there are 2^3 = 8 states, and the two output bits are functions of the machine's state immediately after a new input bit has been shifted in to create the new state. The rightmost bit of the shift register at state-time k plays a role in the outputs for time k, but is of no consequence in the transition to a new state at time k + 1. The state at time k + 1 and the corresponding outputs are completely determined by the input value at time k + 1 along with

Figure 16.5 Moore machine representation of the encoder from Figure 16.4.

Figure 16.6 A convolutional encoder for a constraint length 7 code with g0 = 171 (octal) and g1 = 133 (octal).

the two leftmost register bits at time k. This view of encoder operation suggests the Mealy machine representation shown in Figure 16.7. Because only the two leftmost register bits are considered, there are only 2^2 = 4 states in this representation. In a Mealy machine, the outputs are associated with the state transitions rather than with the states. The Moore machine representation is more of a static view and is perhaps easier to think about. However, the Mealy machine representation leads to a simpler trellis representation of encoder operation.

Figure 16.7 Mealy machine representation of the encoder from Figure 16.4.

16.4.1 Trellis Representation of a Convolutional Encoder

The trellis representation of encoder operation is essential to the understanding of how a Viterbi decoder operates. Figure 16.8 shows a trellis representation for the encoder corresponding to the Mealy machine in Figure 16.7. Each vertical column of nodes corresponds to the possible states at a single time step. As drawn, the trellis indicates the assumption that the encoder is always initialized to the 00 state at time 0. Depending upon the value of the first input bit, the encoder can remain in state 00 or transition to state 10 for time 1. Remaining in state 00 is indicated by the upper branch leaving state 00 at time 0 and entering state 00 at time 1. This branch is labeled 00, indicating that output bits G0 and G1 are both 0. Transition to state 10 is indicated by the lower branch leaving state 00 at time 0 and entering state 10 at time 1. This branch is labeled 11, indicating that output bits G0 and G1 are both 1.

If the encoder remains in state 00 at time 1, the possible transitions from time 1 to time 2 are the same as for time 0 to time 1. However, if the encoder is in state 10 at time 1, the second input bit will force a transition to either state 01 or state 11 for time 2. Transition to state 01 is indicated by the upper branch leaving state 10 at time 1 and entering state 01 at time 2. This branch is labeled 10, indicating that the output bits are


G0 = 1 and G1 = 0. Transition to state 11 is indicated by the lower branch leaving state 10 at time 1 and entering state 11 at time 2. This branch is labeled 01, indicating that the output bits are G0 = 0 and G1 = 1. Notice that the transition branches are not labeled with the particular input value needed to cause the transition. To eliminate clutter in trellis diagrams, it is a common practice to arrange the states within each column so that for the transitions leaving a state, the transition caused by a 0 input is drawn above the transition caused by a 1 input.

Figure 16.8 Encoder trellis for 7-bit message plus 2-bit tail.

16.4.2 Viterbi Decoding

Figure 16.9 shows the encoder trellis from Figure 16.8 with highlighting on the path that would be traversed to encode the 7-bit message 1000101 plus a 2-bit tail of zeros added to flush out the encoder. The trellis has been modified for the specific message length. The eighth and ninth input bits will always be zeros, so the transitions corresponding to inputs of 1 have been removed from the eighth and ninth transition columns of the trellis. The output generated by the encoder is

11 10 11 00 11 10 00 10 11

Suppose that two bit errors occur so that the received code sequence is

10 10 11 00 11 11 00 10 11

The receiver "knows" that the encoder started at the 00 state. Therefore, upon receiving the first dibit of 10, the receiver knows that one of two possibilities must have occurred:

Figure 16.9 Encoder trellis showing the path traversed to encode the input sequence 100010100.

1. The first information bit was 0, causing the encoder to remain in the 00 state and produce an output symbol of 00. This symbol was received with an error in the first bit, changing 00 into 10.

2. The first information bit was 1, causing the encoder to transition from the 00 state to the 10 state and produce an output symbol of 11. This symbol was received with an error in the second bit, changing 11 into 10.

The two possibilities are equally likely because each implies that one bit error was introduced by the channel. These results are summarized in the partial trellis of Figure 16.10.

Upon receiving the second symbol of 10, the receiver knows that one of four possibilities must have occurred:

1. The encoder was in the 00 state, and the second information bit was 0, causing the encoder to remain in the 00 state and produce an output symbol of 00. This symbol was received with an error in the first bit, changing 00 into 10. This possibility implies one bit error in symbol 1 and one bit error in symbol 2 for a total of two bit errors.

2. The encoder was in the 00 state, and the second information bit was 1, causing the encoder to transition from the 00 state to the 10 state and produce an output symbol of 11. This symbol was received with an error in the second

Figure 16.10 Partial encoder trellis for time 1.

bit, changing 11 into 10. This possibility implies one bit error in symbol 1 and one bit error in symbol 2 for a total of two bit errors.

3. The encoder was in the 10 state, and the second information bit was 0, causing the encoder to transition from the 10 state to the 01 state and produce an output symbol of 10. This symbol was received correctly. This possibility implies one bit error in symbol 1.

4. The encoder was in the 10 state, and the second information bit was 1, causing the encoder to transition from state 10 to state 11 and produce an output symbol of 01. This symbol was received with errors in both the first and second bits, changing 01 into 10. This possibility implies one bit error in symbol 1 plus two bit errors in symbol 2 for a total of three bit errors.

Possibility 3 is the most likely, because only one bit error would be needed to turn the hypothesized transmit sequence into the received sequence. The other possibilities would require two or three bit errors to produce the received sequence. The results after reception of the second symbol are summarized in the partial trellis of Figure 16.11.

The analysis of the third received symbol gets interesting, because this analysis will reveal the "trick" that makes the Viterbi decoder into something that is vastly more efficient than a combinatorial analysis of all possible error scenarios.
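The "trick" referred to above amounts to keeping, at each time step, only the best (lowest-metric) path into each state, so the work per received symbol stays fixed no matter how long the message is. A compact hard-decision sketch for this rate one-half, constraint length 3 code follows. It is an illustration, not the book's decoder model; it assumes the encoder starts in state 00 and is flushed back to state 00 by the tail bits.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <vector>

// Hard-decision Viterbi decoder for the g0 = 7, g1 = 5 (octal) code of
// Figure 16.4.  rx holds the received dibits flattened into single bits.
// State = (b[k], b[k-1]), the two most recent input bits; the branch
// metric is the Hamming distance between the received dibit and the
// dibit the encoder would have emitted on that transition.
std::vector<int> viterbi_decode(const std::vector<int>& rx) {
    const int NS = 4;                            // number of states
    const std::size_t T = rx.size() / 2;
    std::vector<int> metric(NS, INT_MAX / 2);
    metric[0] = 0;                               // encoder starts in 00
    std::vector<std::vector<int>> prev(T, std::vector<int>(NS, -1));
    std::vector<std::vector<int>> inbit(T, std::vector<int>(NS, 0));
    for (std::size_t t = 0; t < T; ++t) {
        std::vector<int> next(NS, INT_MAX / 2);
        for (int s = 0; s < NS; ++s) {           // s = (s1, s0)
            if (metric[s] >= INT_MAX / 2) continue;   // unreachable
            int s1 = (s >> 1) & 1, s0 = s & 1;
            for (int u = 0; u <= 1; ++u) {       // hypothesized input bit
                int g0 = u ^ s1 ^ s0, g1 = u ^ s0;
                int dist = (g0 ^ rx[2 * t]) + (g1 ^ rx[2 * t + 1]);
                int ns = (u << 1) | s1;          // new state (u, s1)
                int m = metric[s] + dist;
                if (m < next[ns]) {              // keep only the survivor
                    next[ns] = m;
                    prev[t][ns] = s;
                    inbit[t][ns] = u;
                }
            }
        }
        metric = next;
    }
    // trace back from state 00, where the tail bits leave the encoder
    std::vector<int> bits(T);
    int s = 0;
    for (std::size_t t = T; t-- > 0; ) {
        bits[t] = inbit[t][s];
        s = prev[t][s];
    }
    return bits;
}
```

Fed the received sequence 10 10 11 00 11 11 00 10 11 from above, the decoder recovers the transmitted input bits 100010100 despite the two channel errors; the surviving path's metric of 2 is strictly smaller than that of any competing complete path, because this code's free distance is larger than twice the number of errors.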

Figure 16.11 Partial encoder trellis for time 2.

Upon receiving the third symbol of 11, the receiver knows that one of eight possibilities must have occurred:

1. The encoder was in state 00, and the third information bit was 0, causing the encoder to remain in state 00 and produce an output symbol of 00. This symbol was received with errors in both the first and second bits, changing 00 into 11. This possibility implies a total of four bit errors—two in symbol 3 and two in prior symbols.

2. The encoder was in state 00, and the third information symbol was 1, causing the encoder to transition from state 00 to state 10 and produce an output symbol of 11. This symbol was received correctly. This possibility implies two bit errors, both in prior symbols.

3. The encoder was in state 10, and the third information bit was 0, causing the encoder to transition from state 10 to state 01 and produce an output symbol of 10. This symbol was received with an error in the second bit, changing 10 into 11. This possibility implies a total of three bit errors—one in symbol 3 and two in prior symbols.

4. The encoder was in state 10, and the third information bit was 1, causing the

Section 16.4

Convolutional Codes

523

encoder to transition from state 10 to state 11 and produce an output symbol of 01. This symbol was received with an error in the first bit, changing 01 into 11. This possibility implies a total of three bit errors—one in symbol 3 and two in prior symbols. 5. The encoder was in state 01, and the third information bit was 0, causing the encoder to transition from state 01 to state 00 and produce an output symbol of 11. This symbol was received correctly. This possibility implies one bit error in a prior symbol. 6. The encoder was in state 01, and the third information bit was 1, causing the encoder to transition from state 01 to state 10 and produce an output symbol of 00. This symbol was received with errors in both the first and second bits, changing 00 into 11. This possibility implies a total of three bit errors—two in symbol 3 and one in a prior symbol. 7. The encoder was in state 11, and the third information bit was 0, causing the encoder to transition from state 11 to state 01 and produce an output symbol of 01. This symbol was received with an error in the first bit, changing 01 into 11. This possibility implies a total of four bit errors—one in symbol 3 and three in prior symbols. 8. The encoder was in state 11, and the third information bit was 1, causing the encoder to remain in state 11 and produce an output symbol of 10. This symbol was received with an error in the second bit, changing 10 into 11. This possibility implies a total of four bit errors—one in symbol 3 and three in prior symbols. The results after reception of the third symbol are summarized in the partial trellis of Figure 16.12. The ultimate goal of the decoding process is to determine, after all nine 2-bit symbols have been received, which complete path through the trellis the encoder most likely followed during the encoding operation. 
If the selected path is indeed the path that the encoder followed, then all nine information bits can be recovered correctly even though errors may have occurred in the reception of the encoded symbols. In Figure 16.12, each state at time 3 has two paths arriving from different states in time 2. The circled number near each path indicates the metric for the path. In this case, the metric is the Hamming distance, which is simply the cummulative number of differing bits between the received sequence and the encoder output sequence corresponding to that path. It is assumed that paths with lower metrics are

524

Coding and Interleaving

received: 10 00 00 11

10

11

00

4 errors

00

1 error

11

11

2 errors

11

10

00 10

10 01

01

Chapter 16

3 errors 3 errors

01 4 errors

01

3 errors

11 10 time: 0

Figure 16.12

1

2

4 errors 3

Partial encoder trellis for time 3.

more likely than paths with higher metrics. If the most likely complete path passes through state X at time 3, then this path must include the most likely partial path from state 00 at time 0 to state X at time 3. Thus, of the two paths arriving at each node for time 3, the less likely one can be pruned away. Figure 16.13 shows the pruning result of the partial trellis from Figure 16.12. A sequence of partial trellises for times 4 through 9 is shown in Figures 16.14 through 16.19. The trellis for time k shows all eight possible transitions from states at time k − 1. Then, in the trellis for time k + 1, the non-surviving paths from time k − 1 to time k are pruned away. The trellis in Figure 16.16 shows a tie for state 10 at time 4; both transitions entering state 10 have a metric of 3. In cases of a tie, the surviving path can be selected arbitrarily. The soft-decision metrics, discussed in Section 16.5, greatly reduce the incidence of ties. In the contrived example just presented, the total message length was only 9 bits, including 2 flush bits. Typical message lengths are longer than this, but the practice of using flush bits to drive the encoder back to the 00 state is common for relatively short message lengths. However, for longer messages, the decoding of received bits does not need to wait for the entire encoded message to be received. As shown in Figure 16.16, by time 6, all surviving paths share a common subpath from time 0 through time 3. More complicated codes have more rows in their trellis and it takes more than three symbol times for the surviving paths to merge into a

525

Section 16.5 Viterbi Decoding with Soft Decisions

received: 10 00 00 11

10

11 1 error

00 11

11

11

10 10 01

2 errors 10 3 errors

01

11 time: 0

3 errors 1

2

3

Figure 16.13 Partial encoder trellis for time 3 after pruning to remove non-surviving paths.

single subpath. However, at some point, the early portions of the surviving paths all share a common subpath and it is possible to decode the bits corresponding to this subpath. The number of symbol intervals that must elapse before the decoder can assume that all surviving paths are merged is called the traceback depth of the decoder. For the commonly used constraint-length 7 codes, a traceback depth of 40 symbol times is typical. This means that at time k the decoder can issue the decoded bits for times up through k − 40.

[Figure 16.14 Partial encoder trellis for time 4.]

[Figure 16.15 Partial encoder trellis for time 5.]

16.5 Viterbi Decoding with Soft Decisions

The previous discussion of Viterbi decoders involved only hard decisions: the received symbol decisions input to the decoder were 00, 01, 10, and 11. The real strength of Viterbi decoders is their ability to easily make use of soft decisions. Soft decisions convey an indication of signal quality and how confident the receiver is regarding the decisions that have been made. Assume that a demodulator output voltage of +1 V corresponds to a bit value of 1, and an output voltage of −1 V corresponds to a bit value of 0. When Gaussian noise is added to the signal, the demodulator output for a binary 1 will have a probability density function (PDF) like the one shown in Figure 16.20. Under a hard-decision paradigm, all positive voltages would be decided as 1, and all negative voltages decided as 0. The shaded area in the figure represents the probability of correctly deciding 1 for this noisy demodulator output. The unshaded area under the curve represents the probability of incorrectly deciding 0 instead of 1. A similar PDF can be drawn for a demodulator output consisting of a −1 V signal plus additive Gaussian noise.

[Figure 16.16 Partial encoder trellis for time 6: by this time, all surviving paths share a common subpath from time 0 through time 3.]

[Figure 16.17 Partial encoder trellis for time 7.]

[Figure 16.18 Partial encoder trellis for time 8.]

[Figure 16.19 Partial encoder trellis for time 9.]

Soft decisions divide the decision space for a bit interval into more than two regions. In Figure 16.21, the abscissa has been divided into four zones. The zone v > 0.5 is a confident decision of 1, designated 1H. The zone 0 < v < 0.5 is a less confident decision of 1, designated 1L. The zone v < −0.5 is a confident (albeit incorrect) decision of 0, designated 0H, and the zone −0.5 < v < 0 corresponds to a less confident decision of 0, designated 0L. Instead of the binary symmetric channel assumed for hard-decision decoding, the set of four soft decisions 0H, 0L, 1H, and 1L implies the binary-to-quaternary discrete memoryless channel (DMC) diagrammed in Figure 16.22. The probabilities shown in the figure should not be used directly as branch metrics in a Viterbi decoder. Because they are probabilities, they would need to be multiplied when combining branch metrics into path metrics, and multiplication is something to be avoided in high-speed decoder implementations. Instead, the branch metrics for the decoder should be based on the logarithms of the probabilities, which are listed in Table 16.5. Because multiplication of two numbers corresponds to the addition of their logarithms, branch metrics based on logarithms can be added to form path metrics. The values shown in Table 16.5 are all negative, with the least negative value corresponding to the most likely event. For convenience, the negative signs can be dropped, making the smallest value correspond to the most likely event. Figure 16.23 shows the trellis from Figure 16.9 redrawn to indicate metrics based on soft decisions.

[Figure 16.20 PDF for signal consisting of a 1 V level plus additive Gaussian noise: the area above 0 V (0.9) is the probability of a correct hard decision of 1; the area below 0 V (0.1) is the probability of incorrectly deciding 0.]

[Figure 16.21 PDF for noisy signal showing regions for soft decisions: zone boundaries at −0.5, 0, and +0.5 V, with areas of 0.025, 0.075, 0.3, and 0.6.]

[Figure 16.22 Binary-to-quaternary discrete memoryless channel. Transition probabilities: P(0H|0) = P(1H|1) = 0.6; P(0L|0) = P(1L|1) = 0.3; P(1L|0) = P(0L|1) = 0.075; P(1H|0) = P(0H|1) = 0.025.]

Table 16.5 Transition probabilities and their logarithms for the DMC shown in Figure 16.22.

probability    logarithm
0.60           −0.22
0.30           −0.52
0.075          −1.12
0.025          −1.60

[Figure 16.23 Decoder trellis showing soft-decision path metrics: the trellis of Figure 16.9 redrawn with branch metrics formed from the log-probability magnitudes of Table 16.5, with transmitted symbols, received soft decisions, and accumulated path metrics shown along each path.]

Appendix A

MATHEMATICAL TOOLS

A.1 Trigonometric Identities

tan x = (sin x)/(cos x)    (A.1.1)
sin(−x) = −sin x    (A.1.2)
cos(−x) = cos x    (A.1.3)
tan(−x) = −tan x    (A.1.4)
cos²x + sin²x = 1    (A.1.5)
cos²x = (1/2)[1 + cos(2x)]    (A.1.6)
sin(x ± y) = (sin x)(cos y) ± (cos x)(sin y)    (A.1.7)
cos(x ± y) = (cos x)(cos y) ∓ (sin x)(sin y)    (A.1.8)
tan(x + y) = [(tan x) + (tan y)] / [1 − (tan x)(tan y)]    (A.1.9)
sin(2x) = 2(sin x)(cos x)    (A.1.10)
cos(2x) = cos²x − sin²x    (A.1.11)
tan(2x) = 2(tan x) / (1 − tan²x)    (A.1.12)
(sin x)(sin y) = (1/2)[cos(x − y) − cos(x + y)]    (A.1.13)
(cos x)(cos y) = (1/2)[cos(x + y) + cos(x − y)]    (A.1.14)
(sin x)(cos y) = (1/2)[sin(x + y) + sin(x − y)]    (A.1.15)
(sin x) + (sin y) = 2 sin[(x + y)/2] cos[(x − y)/2]    (A.1.16)
(sin x) − (sin y) = 2 sin[(x − y)/2] cos[(x + y)/2]    (A.1.17)
(cos x) + (cos y) = 2 cos[(x + y)/2] cos[(x − y)/2]    (A.1.18)
(cos x) − (cos y) = −2 sin[(x + y)/2] sin[(x − y)/2]    (A.1.19)

A cos(ωt + ψ) + B cos(ωt + φ) = C cos(ωt + θ)    (A.1.20)
  where C = [A² + B² + 2AB cos(φ − ψ)]^(1/2)
        θ = tan⁻¹[(A sin ψ + B sin φ) / (A cos ψ + B cos φ)]

A cos(ωt + ψ) + B sin(ωt + φ) = C cos(ωt + θ)    (A.1.21)
  where C = [A² + B² + 2AB sin(φ − ψ)]^(1/2)
        θ = tan⁻¹[(A sin ψ − B cos φ) / (A cos ψ + B sin φ)]

A.2 Table of Integrals

∫ (1/x) dx = ln x    (A.2.1)
∫ e^(ax) dx = (1/a) e^(ax)    (A.2.2)
∫ x e^(ax) dx = [(ax − 1)/a²] e^(ax)    (A.2.3)
∫ sin(ax) dx = −(1/a) cos(ax)    (A.2.4)
∫ cos(ax) dx = (1/a) sin(ax)    (A.2.5)
∫ sin(ax + b) dx = −(1/a) cos(ax + b)    (A.2.6)
∫ cos(ax + b) dx = (1/a) sin(ax + b)    (A.2.7)
∫ x sin(ax) dx = −(x/a) cos(ax) + (1/a²) sin(ax)    (A.2.8)
∫ x cos(ax) dx = (x/a) sin(ax) + (1/a²) cos(ax)    (A.2.9)
∫ sin²(ax) dx = x/2 − sin(2ax)/(4a)    (A.2.10)
∫ cos²(ax) dx = x/2 + sin(2ax)/(4a)    (A.2.11)
∫ x² sin(ax) dx = (1/a³)[2ax sin(ax) + 2 cos(ax) − a²x² cos(ax)]    (A.2.12)
∫ x² cos(ax) dx = (1/a³)[2ax cos(ax) − 2 sin(ax) + a²x² sin(ax)]    (A.2.13)
∫ sin³x dx = −(1/3)(cos x)(sin²x + 2)    (A.2.14)
∫ cos³x dx = (1/3)(sin x)(cos²x + 2)    (A.2.15)
∫ sin x cos x dx = (1/2) sin²x    (A.2.16)
∫ sin(mx) cos(nx) dx = −cos[(m − n)x]/[2(m − n)] − cos[(m + n)x]/[2(m + n)],    m² ≠ n²    (A.2.17)
∫ sin²x cos²x dx = (1/8)[x − (1/4) sin(4x)]    (A.2.18)
∫ sin x cos^m x dx = −cos^(m+1)x / (m + 1)    (A.2.19)
∫ sin^m x cos x dx = sin^(m+1)x / (m + 1)    (A.2.20)
∫ u dv = uv − ∫ v du    (A.2.21)

A.3 Logarithms

The base-10 logarithm, or common logarithm, of a number x is the power to which 10 must be raised to equal x:

y = log10 x ⇔ x = 10^y

The base-e logarithm, or natural logarithm, of a number x is the power to which e must be raised to equal x:

y = loge x = ln x ⇔ x = e^y

In the study of Galois fields and coding theory, it is sometimes necessary to use base-2 logarithms:

y = log2 x ⇔ x = 2^y

Table A.1 lists a number of useful properties of logarithms.

Table A.1 Properties of Logarithms.

1. logb(xy) = logb x + logb y
2. logb(1/x) = −logb x
3. logb(y^x) = x logb y
4. logb x = (logc x)/(logc b) = (logc x)(logb c)
5. ln(1 + z) = Σ (n = 1 to ∞) (−1)^(n−1) z^n / n,    |z| < 1
6. ln x = ∫ (1 to x) (1/y) dy,    x > 0
7. (d/dx)(ln x) = 1/x,    x > 0

A.4 Modified Bessel Functions of the First Kind

The modified Bessel function of the first kind of order zero is used in the analysis of Rice random variables. The modified Bessel function of the first kind of order n is denoted In(x) and is defined as

In(x) = (1/2π) ∫ (−π to π) exp(x cos θ) cos(nθ) dθ    (A.4.1)

A.4.1 Identities

In(x) = Σ (m = 0 to ∞) (x/2)^(2m+n) / [m!(n + m)!]    (A.4.2)
I−n(x) = In(x)    (A.4.3)
In(−x) = (−1)^n In(x)    (A.4.4)
exp(x cos θ) = Σ (n = −∞ to ∞) In(x) exp(jnθ)    (A.4.5)
exp(x cos θ) = I0(x) + 2 Σ (n = 1 to ∞) In(x) cos(nθ)    (A.4.6)
(d/dx)[x^n In(x)] = x^n In−1(x)    (A.4.7)
(d/dx)[In(x)/x^n] = In+1(x)/x^n    (A.4.8)
I0(x) = (1/π) ∫ (0 to π) exp(x cos θ) dθ    (A.4.9)
I0(x) = (1/π) ∫ (0 to π) cosh(x cos θ) dθ    (A.4.10)

Appendix B

PROBABILITY DISTRIBUTIONS IN COMMUNICATIONS

B.1 Uniform Distribution

A random variable uniformly distributed between a and b, where a < b, has a probability density function (PDF) given by

p(x) = 1/(b − a),    a ≤ x ≤ b
p(x) = 0,            elsewhere

The mean is given by

μ = (a + b)/2

and the variance is given by

σ² = (b − a)²/12

B.2 Gaussian Distribution

The Gaussian distribution is ubiquitous throughout science and engineering. The Gaussian distribution is also called the normal distribution. A Gaussian random variable has a PDF given by

p(x) = [1/(σ√(2π))] exp[−(x − μ)²/(2σ²)]

where μ is the mean and σ² is the variance. For zero mean and unit variance, the PDF reduces to

p(x) = [1/√(2π)] exp(−x²/2)

The cumulative distribution function (CDF) is obtained by integrating the PDF:

P(x ≤ X) = [1/√(2π)] ∫ (−∞ to X) exp(−x²/2) dx

This integral cannot be evaluated in closed form, but to facilitate manipulations of Gaussian CDFs, the error function has been defined as

erf x = (2/√π) ∫ (0 to x) exp(−y²) dy

The complementary error function, erfc, is defined by

erfc x = (2/√π) ∫ (x to ∞) exp(−y²) dy = 1 − erf x

The Q function is defined as

Q(x) = [1/√(2π)] ∫ (x to ∞) exp(−y²/2) dy = (1/2) erfc(x/√2)

B.3 Exponential Distribution

An exponentially distributed random variable has a PDF given by

p(x) = λ exp(−λx),    x ≥ 0
p(x) = 0,             x < 0