Coordinated Multiuser Communications [1 ed.] 1402040741, 9781402040757, 9781402040740

Coordinated Multiuser Communications provides for the first time a unified treatment of multiuser detection and multiuse


English Pages 282 Year 2006


COORDINATED MULTIUSER COMMUNICATIONS

by

CHRISTIAN SCHLEGEL
University of Alberta, Edmonton, Canada

and

ALEX GRANT
University of South Australia, Adelaide, Australia

A C.I.P. Catalogue record for this book is available from the Library of Congress.

ISBN-10 1-4020-4074-1 (HB)
ISBN-13 978-1-4020-4074-0 (HB)
ISBN-10 1-4020-4075-X (e-book)
ISBN-13 978-1-4020-4075-7 (e-book)

Published by Springer, P.O. Box 17, 3300 AA Dordrecht, The Netherlands. www.springer.com

Printed on acid-free paper

All Rights Reserved © 2006 Springer No part of this work may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, microfilming, recording or otherwise, without written permission from the Publisher, with the exception of any material supplied specifically for the purpose of being entered and executed on a computer system, for exclusive use by the purchaser of the work. Printed in the Netherlands.

to Rhonda and Robyn

Contents

List of Figures
List of Tables
Preface

1 Introduction
  1.1 The Dawn of Digital Communications
  1.2 Multiple Terminal Networks
  1.3 Multiple-Access Channel
  1.4 Degrees of Coordination
    1.4.1 Transmitter and Receiver Cooperation
    1.4.2 Synchronization
    1.4.3 Fixed Allocation Schemes
  1.5 Network vs. Signal Processing Complexity
  1.6 Future Directions

2 Linear Multiple-Access
  2.1 Continuous Time Model
  2.2 Discrete Time Model
  2.3 Matrix-Algebraic Representation
  2.4 Symbol Synchronous Model
  2.5 Principles of Detection
    2.5.1 Sufficient Statistics and Matched Filters
    2.5.2 The Correlation Matrix
    2.5.3 Single-User Matched Filter Detector
    2.5.4 Optimal Detection
    2.5.5 Individually Optimal Detection
  2.6 Access Strategies
    2.6.1 Time and Frequency Division Multiple-Access
    2.6.2 Direct-Sequence Code Division Multiple Access
    2.6.3 Narrow Band Multiple-Access
    2.6.4 Multiple Antenna Channels
    2.6.5 Cellular Networks
    2.6.6 Satellite Spot-Beams Channels
  2.7 Sequence Design
    2.7.1 Orthogonal and Unitary Sequences
    2.7.2 Hadamard Sequences

3 Multiuser Information Theory
  3.1 Introduction
  3.2 The Multiple-Access Channel
    3.2.1 Probabilistic Channel Model
    3.2.2 The Capacity Region
  3.3 Binary-Input Channels
    3.3.1 Binary Adder Channel
    3.3.2 Binary Multiplier Channel
  3.4 Gaussian Multiple-Access Channels
    3.4.1 Scalar Gaussian Multiple-Access Channel
    3.4.2 Code-Division Multiple-Access
  3.5 Multiple-Access Codes
    3.5.1 Block Codes
    3.5.2 Convolutional and Trellis Codes
  3.6 Superposition and Layering
  3.7 Feedback
  3.8 Asynchronous Channels

4 Multiuser Detection
  4.1 Introduction
  4.2 Optimal Detection
    4.2.1 Jointly Optimal Detection
    4.2.2 Individually Optimal Detection: APP Detection
    4.2.3 Performance Bounds – The Minimum Distance
  4.3 Sub-Exponential Complexity Signature Sequences
  4.4 Signal Layering
    4.4.1 Correlation Detection – Matched Filtering
    4.4.2 Decorrelation
    4.4.3 Error Probabilities and Geometry
    4.4.4 The Decorrelator with Random Spreading Codes
    4.4.5 Minimum-Mean Square Error (MMSE) Filter
    4.4.6 Error Performance of the MMSE
    4.4.7 The MMSE Receiver with Random Spreading Codes
    4.4.8 Whitening Filters
    4.4.9 Whitening Filter for the Asynchronous Channel
  4.5 Different Received Power Levels
    4.5.1 The Matched Filter Detector
    4.5.2 The MMSE Filter Detector

5 Implementation of Multiuser Detectors
  5.1 Iterative Filter Implementation
    5.1.1 Multistage Receivers
    5.1.2 Iterative Matrix Solution Methods
    5.1.3 Jacobi Iteration and Parallel Cancellation Methods
    5.1.4 Stationary Iterative Methods
    5.1.5 Successive Relaxation and Serial Cancellation Methods
    5.1.6 Performance of Iterative Multistage Filters
  5.2 Approximate Maximum Likelihood
    5.2.1 Monotonic Metrics via the QR-Decomposition
    5.2.2 Tree-Search Methods
    5.2.3 Lattice Methods
  5.3 Approximate APP Computation
  5.4 List Sphere Detector
    5.4.1 Modified Geometry List Sphere Detector
    5.4.2 Other Approaches

6 Joint Multiuser Decoding
  6.1 Introduction
  6.2 Single-User Decoding
    6.2.1 The Projection Receiver (PR)
    6.2.2 PR Receiver Geometry and Metric Generation
    6.2.3 Performance of the Projection Receiver
  6.3 Iterative Decoding
    6.3.1 Signal Cancellation
    6.3.2 Convergence – Variance Transfer Analysis
    6.3.3 Simple FEC Codes – Good Codeword Estimators
  6.4 Filters in the Loop
    6.4.1 Per-User MMSE Filters
    6.4.2 Low-Complexity Iterative Loop Filters
    6.4.3 Examples and Comparisons
  6.5 Asymmetric Operating Conditions
    6.5.1 Unequal Received Power Levels
    6.5.2 Optimal Power Profiles
    6.5.3 Unequal Rate Distributions
    6.5.4 Finite Numbers of Power Groups
  6.6 Proof of Lemma 6.7

A Estimation and Detection
  A.1 Bayesian Estimation and Detection
  A.2 Sufficiency
  A.3 Linear Cost
  A.4 Quadratic Cost
    A.4.1 Minimum Mean Squared Error
    A.4.2 Cramér-Rao Inequality
    A.4.3 Jointly Gaussian Model
    A.4.4 Linear MMSE Estimation
  A.5 Hamming Cost
    A.5.1 Minimum Probability of Error
    A.5.2 Relation to the MMSE Estimator
    A.5.3 Maximum Likelihood Estimation

References
Author Index
Subject Index

List of Figures

1.1 Basic setup for Shannon’s channel coding theorem.
1.2 Multi-terminal networks.
1.3 A historical overview of multiuser communications.
1.4 Multiple-access channel.
1.5 Degrees of cooperation.

2.1 Simplified two-user linear multiple-access channel.
2.2 Continuous time linear multiple-access channel.
2.3 Sampling of the modulation waveform.
2.4 The modulation vectors s_k[i].
2.5 Synchronous model.
2.6 Symbol synchronous matched filtered model.
2.7 Structure of the cross-correlation matrix.
2.8 Symbol synchronous single-user correlation detection for antipodal modulation.
2.9 Optimal joint detection.
2.10 Modulating waveform built from chip waveforms.
2.11 Chip match-filtered model.
2.12 Multiple transmit and receive antennas.
2.13 Simplified cellular system.
2.14 Satellite spot beam up-link.

3.1 Two-user multiple-access channel.
3.2 Example of a discrete memoryless multiple-access channel.
3.3 Coded multiple-access system.
3.4 Two-user achievable rate region.
3.5 Three-user achievable rate region.
3.6 Two-user binary adder channel.
3.7 Convex hull of two achievable rate regions for the two-user binary adder channel.
3.8 Capacity region of the two-user binary adder channel.
3.9 Channel as seen by user two.
3.10 Capacity region of the two-user binary multiplier channel.
3.11 Example of Gaussian multiple-access channel capacity region.
3.12 Rates achievable with orthogonal multiple-access.
3.13 Convergence of spectral density.
3.14 Spectral efficiency of DS-CDMA with optimal, orthogonal and random spreading. Eb/N0 = 10 dB.
3.15 Spectral efficiency of DS-CDMA with random spreading.
3.16 Random sequence capacity with Rayleigh fading.
3.17 Finding the asymptotic spectral efficiency via a geometric construction.
3.18 Rates achieved by some existing codes for the BAC.
3.19 Combined two-user trellis for the BAC.
3.20 Two-user nonlinear trellis code for the BAC.
3.21 Successive cancellation approach to achieve vertex of capacity region.
3.22 Two-user MAC with perfect feedback.
3.23 Capacity region for the two-user binary adder channel with feedback.
3.24 Simple feedback scheme for the binary adder channel.
3.25 Channel seen by V.
3.26 Capacity region for the two-user GMAC channel with feedback.
3.27 Capacity region for symbol-asynchronous two-user Gaussian multiple-access channel.
3.28 Capacity region for two-user collision channel without feedback.

4.1 Classification of multiuser detection and decoding methods.
4.2 A joint detector considers all available information.
4.3 Matched filter bank serving as a front-end for an optimal multiuser CDMA detector.
4.4 Illustration of the correlation matrix R for three asynchronous users.
4.5 Illustration of the recursive computation of the quadratic form in (4.11).
4.6 Illustration of a section of the CDMA trellis used by the optimal decoder, shown for three interfering users, i.e. K = 3, causing 8 states. Illustrated is the merger at state s, where each of the paths arrives with the metric (4.12).
4.7 Illustration of the forward and backward recursion of the APP algorithm for individually optimal detection.
4.8 Bounded tree search.
4.9 Histograms of the distribution of the minimum distances of a CDMA system with length-31 random spreading sequences, for K = 31 (dashed lines), and K = 20 users (solid lines), and maximum width of the search tree.
4.10 Linear preprocessing used to condition the channel for a given user (shaded).
4.11 Information theoretic capacities of various preprocessing filters.
4.12 Geometry of the decorrelator.
4.13 Shannon bounds for the AWGN channel and the random CDMA decorrelator-layered channel. Compare with Figure 4.11.
4.14 Shannon bounds for the AWGN channel and the MMSE layered single-user channel for random CDMA. Compare with Figures 4.11 and 4.13.
4.15 The partial decorrelating feedback detector uses a whitened matrix filter as a linear processor, followed by successive cancellation.
4.16 Shannon bounds for an MMSE joint detector for the unequal received power scenarios of one strong user, and equal power for the remaining users.
4.17 Shannon bounds for an MMSE joint detector for the case of two power classes with equal numbers of users in each group.

5.1 Illustration of the asynchronous blocks in the correlation matrix R.
5.2 The multistage receiver for synchronous and asynchronous CDMA systems.
5.3 Example performance of a parallel cancellation implementation of the decorrelator as a function of the number of iteration steps.
5.4 BER performance of Jacobi receivers versus system load for an equal power system, i.e. A = I.
5.5 Visualization of the Gauss-Seidel update method as iterative minimization procedure, minimizing one variable at a time. The algorithm starts at point A.
5.6 Iterative MMSE filter implementations for random CDMA systems.
5.7 Shannon bounds for the AWGN channel and multistage filter approximations of the MMSE filter for a random CDMA system with load β = 0.5. The iteration constant τ was chosen according to (5.39).
5.8 Similar Shannon bounds for multistage filter approximations of the decorrelator with load β = 0.5.
5.9 Three-user binary tree.
5.10 Performance of the IDDFD for a system with 20 active users and random spreading sequences of length 31.
5.11 Performance of the IDDFD under an unequal received power situation with one strong user.
5.12 Performance of different multiuser decoding algorithms as a function of the number of active users at Eb/σ² = 7 dB.
5.13 Sphere detector performance.
5.14 Sphere detector average complexity.

6.1 Comparison of the per-dimension capacity of optimal and linearly processed random CDMA channels. The solid lines are for β = 0.5 for both linear and optimal processing, the dashed lines are for a full load β = 1.
6.2 Diagram of a “coded CDMA” system, i.e. a CDMA system using FEC coding.
6.3 Projection Receiver block diagram using an embedded decorrelator.
6.4 Projection Receiver diagram using an embedded decorrelator.
6.5 Lower bound on the performance loss of the PR.
6.6 Performance examples of the PR for random CDMA. The dashed lines are from applying the bound from Theorem 6.1. The values of Eb/N0 are in dB.
6.7 Iterative multiuser decoder with soft information exchange.
6.8 Soft cancellation variance transfer curve.
6.9 Code VT curves for a selection of low-complexity FEC codes.
6.10 VT chart and iteration example for a highly loaded CDMA system. FEC code VT is dashed, the cancellation VT curve is the solid curve.
6.11 Illustration of the turbo effect of iterative joint CDMA detection.
6.12 Illustration of variance transfer curves of various powerful error control codes, such as practical-sized LDPC codes of length N = 5000 and code rate R = 0.5, as well as two serially concatenated turbo codes.
6.13 VT transfer chart and iteration example for a highly loaded CDMA system using a strong serially concatenated turbo code (SCC 2).
6.14 Determination of the limiting performance of FEC coded CDMA systems via VT curve matching for Eb/N0 → ∞.
6.15 Bit error performance of SCC2 from Table 6.1 as a function of the number of iterations.
6.16 Interference cancellation with a weak rate R = 1/3 repetition code, acting as non-linear layering filter.
6.17 Achievable spectral efficiencies using linear and nonlinear layering processing in equal power CDMA systems.
6.18 Variance transfer curves for matched filter (simple) cancellation and per-user MMSE filter cancellation (dashed lines) for β = 2 and Es/N0 = 0 dB and Es/N0 → ∞.
6.19 Variance transfer curves for various multi-stage loop filters for β = 2 and two values of the signal-to-noise ratio: P/σ² = 3 dB and P/σ² = 23 dB.
6.20 Variance transfer chart for an iterative decoder using convolutional error control codes and a two-stage loop filter, showing an open channel at Eb/N0 = 4.5 dB.
6.21 Bit error rate performance of an iterative canceler with a two-stage loop filter for 1, 10, 20, and 30 iterations, compared to the performance of an MMSE loop filter.
6.22 Optimal and linear preprocessing capacities for various system loads for random CDMA signaling.
6.23 Performance of low-rate repetition codes in high-load CDMA systems, compared to single-user layered capacities for matched and MMSE filter systems.
6.24 CDMA spectral efficiencies achievable with iterative decoding with different power groups assuming ideal FEC coding with rates R = 1/3 for simple cancellation as well as MMSE cancellation.
6.25 Capacity polytope illustrated for a three-dimensional multiple-access channel. User 2 is decoded first, then user 1, and finally user 3.
6.26 CDMA spectral efficiencies achievable with iterative decoding with equal power groups assuming ideal FEC coding with optimized rates according to (6.124).
6.27 Illustration of different power levels and average VT characteristics shown for both a serial turbo code (on the left) and a convolutional code (on the right). The system parameters are K1 = 22, K2 = 18, K3 = 16 for the SCC system at an Eb/N0 = 13.45 dB, and K1 = K2 = K3 = 20 at an Eb/N0 = 8.88 dB.

List of Tables

3.1 Coding schemes shown in Figure 3.18.
3.2 Uniquely decodeable rate 1.29 code for the two-user BAC.
3.3 Non uniquely decodeable code for the two-user BAC.
3.4 Rate R = (0.571, 0.558) uniquely decodeable code for the two-user binary adder channel.

6.1 Serially Concatenated Codes of total rate 1/3 whose VT curves are shown in Figure 6.12. For details on serial concatenated turbo codes, see [120, Chapter 11].

Preface

Mathematical communications theory as we know it today is a fairly young but rapidly maturing science, just over 50 years old. Multiple-user theories extend back to the same recent birthplace, but are only now showing the first signs of maturation. The goal of this book is to present both classical and new approaches to the problems of designing coordinated communications systems for large numbers of users. The problems of reliable information transfer are in most cases intertwined with the problems of allocating the scarce resources available for use. The multiuser philosophy attempts to optimize whole systems by combining the multiple-access and information transmission aspects.

It is the purpose of this book to introduce the reader to the concepts involved in designing multiple-user communications systems. To achieve this goal, conventional multiple-access strategies are contrasted with newer techniques, such as joint detection and interference cancellation. By emphasizing the theory and practice of unifying the accessing and transmission aspects of communications, we hope that this book will be a valuable reference for students and engineers, presenting many years of research in book format.

Chapter 2 sets out the main area of interest of the book, namely the linear multiple-access channel. The emphasis is on obtaining a general model with wide application. Chapter 3 gives an overview of results from multiuser information theory, concentrating on the multiple-access channel. The remainder of the book, Chapters 4–6, is devoted to the design and analysis of multiuser detectors and decoders. Chapter 4 describes joint detection strategies for uncoded systems, and implementation details for such detectors are considered in Chapter 5. Joint decoders for systems with error control coding are the subject of Chapter 6, which concentrates on the iterative decoding paradigm.

The multiple-user communications philosophy does not solve the world’s communications needs. With every problem addressed, others are peering out of dark places under the guise of complexity. It is the goal of this book to put the tools, techniques and, most importantly, the philosophy of multiple-user communications systems into the hands and minds of the reader.


As we write these final words, it seems that the information and communications theory community is embarking on a renewed multiple-user revolution, far beyond the scope of this book. We look forward with great anticipation to what the future holds for communications networks.

Park City, Utah and Adelaide, South Australia May 2005

Christian Schlegel Alex Grant

1 Introduction

1.1 The Dawn of Digital Communications

Early in the last century, a fundamental result by Nyquist, the sampling theorem, ushered in the era of digital communications. Nyquist [91] showed that a band-limited signal, that is, a signal whose spectrum is strictly confined to a given frequency band, can be represented by a discrete sequence of samples, a finite number per unit of time. These samples can be used to reconstruct the original signal exactly. In other words, the sampling theorem showed that it is sufficient to know a signal at discrete time intervals, and there is no need to store an entire signal waveform. This discretization of time for purposes of information transmission was a very important starting point for the sweeping success of digital information representation and communications later in the 20th century.

In 1948, Shannon [123] showed that the time-discrete samples used to represent a communications signal could also be discretized in amplitude, and that the number of levels of amplitude discretization depended on the noise present in the communications channel. This is in essence Shannon’s celebrated channel coding theorem, which assigns to any given communications channel a capacity, which represents the largest number of (digital) information bits that can be reliably transported through this channel. Combined with Nyquist’s sampling theorem, Shannon’s channel coding theorem states that information can be transported in discrete amounts at discrete time intervals without compromising optimality. That is, packaging information into time-discrete units with discrete (possibly fractional) numbers of bits in each unit is the optimal way of transmitting information.

This realization has had a profound impact on communications and information processing. Virtually all information nowadays is represented, processed, and transported in discrete digital form. The Shannon channel coding theorem clearly played a pivotal role in this drive towards digital signaling. It quantifies the fundamental limit of the information carrying capacity of a communications channel between a single transmitter and a single receiver. This set-up is illustrated in Figure 1.1, where a transmitter sends time-discrete symbols from a (typically) finite signaling alphabet through a transmission channel. The channel is the sum total of all that happens to the signal from transmitter to receiver. It includes distortion, noise, interference, and other effects the environment has on the signal. The receiver extracts the transmitted information from the channel output signal. It can do so reliably only if the transmission rate R, in bits per symbol, is smaller than the channel capacity C, also measured in bits per symbol.

Information Source

Message

Signal Transmitter

+

Received Signal

Message Receiver

Destination

Noise Source

Fig. 1.1. Basic setup for Shannon’s channel coding theorem.

Shannon’s channel coding theorem says more, however. The above statement is generally known as the converse to the channel coding theorem, stating what is not possible, i.e. where the limits are in terms of admissible rates. The direct part of the theorem states that there exist encoding and decoding procedures that allow the transmitted rate R to approach the channel capacity arbitrarily closely with an error rate that can be made arbitrarily small. The cost to achieve this lies in the length of the code, that is, the block of data that is processed as a unit has to grow in size. Additionally, the complexity of the decoding algorithm increases as well, leading to ever more complex error control decoding circuits which can push rates closer to the capacity of the channel.

Typically, the computation of the channel capacity is a fairly straightforward exercise, while the design, study, and analysis of capacity achieving encoding and decoding methods can be exceedingly difficult. For example, if the transmitter is restricted to transmit with average power P and the channel is affected only by additive white Gaussian noise with power N, and has a bandwidth of W Hz, its capacity is given by

$$C = W \log_2\left(1 + \frac{P}{N}\right) \quad \text{[bits/s]}. \tag{1.1}$$

Equation (1.1) is arguably the most famous of Shannon’s formulas, and its simplicity belies the depth of the channel coding theorem associated with it.
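As a small numerical illustration of Equation (1.1), the following sketch evaluates the capacity for given W, P and N. The function name and the example numbers are hypothetical and chosen only so that the logarithm comes out as an integer; dividing the result by W gives the spectral efficiency in bits/s/Hz.

```python
import math


def awgn_capacity(bandwidth_hz: float, signal_power: float, noise_power: float) -> float:
    """Capacity in bits per second of a band-limited AWGN channel, Equation (1.1)."""
    if bandwidth_hz <= 0 or noise_power <= 0 or signal_power < 0:
        raise ValueError("need W > 0, N > 0 and P >= 0")
    return bandwidth_hz * math.log2(1.0 + signal_power / noise_power)


# Hypothetical example: W = 1 MHz and P/N = 15, so log2(1 + 15) = 4 bits/s/Hz.
capacity = awgn_capacity(1e6, 15.0, 1.0)
print(capacity)        # capacity in bits/s
print(capacity / 1e6)  # spectral efficiency in bits/s/Hz
```

Doubling the bandwidth doubles the capacity at a fixed ratio P/N, whereas doubling the power only adds to the argument of the logarithm.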

We will encounter channels such as the one in Figure 1.1 repeatedly throughout this book. It behooves us to recall that the tremendous progress towards realizing the potential of a communications channel via complex encoding and decoding algorithms was mainly fueled by the enormous strides that the technology of very large-scale integrated (VLSI) circuits and systems has made over the last five decades since the invention of the transistor, also in 1948.

1.2 Multiple Terminal Networks

In real-world situations, the clean arrangement of a single transmitter and a single receiver is more and more becoming a special case, as most transmissions occur in a multi-terminal network environment, which consists of a potentially large number of transmitters and receivers. Communication takes place over a common physical channel. The messages from each transmitter are only intended for some subset of the receivers, but may be received by all receivers. This is illustrated in Figure 1.2, in which network nodes are shown as circles and transmissions are the links.

[Figure 1.2: network graph with nodes numbered 1 to 9.]
Fig. 1.2. Multi-terminal networks.

Transmitters cause interference for the non-intended receivers. Traditionally, this interference has been lumped into the channel noise, and a set of single-channel models has been applied to multiple terminal communications. This is still the case with modern spread-spectrum based multiple-access systems such as the cdma2000 standard [134].

Multiple terminal networks can be decomposed into more basic components depending on the functionality of the system or the service.

• In a multiple-access channel (MAC) a number of terminals attempt to reach a common receiver via a shared joint physical channel (e.g. nodes 1, 3, 4 and 2 in Figure 1.2). This scenario is the closest to a single-user channel and is best understood theoretically.
• In broadcast channels the reverse is the case. A common transmitter attempts to reach a multitude of receivers (e.g. nodes 5, 8, and 9). The goal is to reach each receiver with possibly different messages at different data rates, at a minimum of resources. The simple description of the broadcast channel belies its theoretical challenge, and very little is known for general broadcast channels.
• In a relay channel information is relayed over many communications links from a transmitter to a receiver (e.g. nodes 4, 5, and 8). The relays may have no information of their own to transmit, and exist to assist the transmission of other users. In the simplest case, these links are single-user channels. The Internet uses such channels, where information traverses many communications links from source to destination.
• In an interference channel each transmitter has one intended receiver and one or more unintended receivers (e.g. nodes 4, 5, 6 and 7).
• In two-way channels both terminals act as transmitters and receivers and share a common physical channel which allows transmission to occur in both directions.

A data network comprises an arbitrary combination of all of these channels, and a general theory of network communication is still in its infancy. An early overview of multi-terminal channels can be found in [87]. A more recent treatment can be found in [24].

In this book we will deal almost exclusively with the multiple-access channel, for several reasons.
First, the multiple-access channel is the most natural extension of the single-user channel, and Shannon’s fundamental results are fairly easily extendible to this case. The multiple-access channel is also the most basic physical level channel, since, even in a more general data network, it describes the behavior of a single receiver node which is within reach of several transmitters, and the fundamental limits of the multiple-access channel apply to this situation. Lastly, the information transmission problem for the multiple-access channel, in contrast to the other multiple terminal network arrangements, can be addressed by transmitter and receiver designs conceptually analogously to single-channel designs largely by designing appropriate physical layer systems. The multiple-access model includes some very important modern-day examples, such as the code-division multiple-access (CDMA) channel and the multiple-antenna channel, a recently popularized example of a multiple-input multiple-output (MIMO) channel (see Chapter 3).

Figure 1.3 shows a coarse time-line of some major developments in multiple-access information theory, signaling, and receiver design for the multiple-access channel. Multiple-terminal information theory began with Shannon, who considered the two-way channel. He also claimed to have found the capacity region of the multiple-access channel, but this was never published, and it was not until the early 1970s that research into multiple-terminal information theory became widespread. Successively more detailed and more general channel models have been considered since then, and this progress is the subject of Chapter 3, which also describes some of the early attempts at code design for the multiple-access channel. In the mid 1970s it was realized that the performance of multiple-access receivers for uncoded transmissions could be improved using joint detection, at a cost of increased implementation complexity. This motivated much research into practical signal processing methods for joint detection, particularly linear filtering methods. This is the subject of Chapters 4 and 5, the latter dealing specifically with implementation details. After about 1996, much of the receiver design and analysis has focused on iterative turbo-type receivers for coded transmissions. These methods are discussed in detail in Chapter 6 of this book.

Fig. 1.3. A historical overview of multiuser communications.

1.3 Multiple-Access Channel

Figure 1.4 shows the basic situation of a multiple-access channel where a number of terminals communicate to a joint receiver using the same channel. As mentioned earlier, this situation arises naturally in a number of important practical cases, such as cellular telephone networks. Much is known about the multiple-access channel, such as its capacity, or more appropriately its capacity region discussed in Chapter 3, which is the analog of the channel capacity in the single-channel case. The information theoretic properties of the multiple-access channel were first investigated in [2].

An important innovation that grew out of the information theoretic results for the multiple-access channel is that users should be decoded jointly, rather than independently, treating other users as interference. The joint receiver makes use of the known structure of all the users’ transmitted signals, and potentially significant gains can be accomplished by doing this. The how and why of joint decoding is the major theme of this book.

As can be appreciated from Figure 1.4, the capacity limits of the multiple-access channel, and algorithmic ways to approach them, are of major importance in any kind of multi-terminal network, since the data flow through the multi-access node is limited by the multiple-access channel it forms with the transmitting terminals. These limits apply whether data is destined for the multi-access node or not.

We wish to restate that most of the concepts and results find application in the physical layer of a communications system, although the next higher layer, the medium access layer, will also have to be involved in efficient overall network design. Functions which are to be executed at the medium access layer include the selection of power levels, transmission rates, and transmission formats.

[Figure 1.4: block diagram. Information sources 1 to 3 feed transmitters 1 to 3; the transmitted signals and the output of a noise source add in the multiple-access channel; a single receiver delivers messages 1 to 3 to destinations 1 to 3.]
Fig. 1.4. Multiple-access channel.

1.4 Degrees of Coordination

One important conceptual and practical extension that needs to be added to the multiple-access communications problem is the coordination among the different communicating terminals. Different levels of coordination have a strong impact on the ability of the multi-access receiver to operate efficiently, even though the information theoretic capacity of the channel can be quite insensitive to these different levels of cooperation. The main coordination concepts are (i) source cooperation and (ii) synchronization.

1.4.1 Transmitter and Receiver Cooperation

Different levels of transmitter and receiver cooperation correspond to different statistical assumptions concerning the users’ transmissions, and different assumptions about the type of receiver signal processing. Three types of cooperation are represented in Figure 1.5.

No cooperation: Each source independently transmits information. The received signal is decoded separately by different receivers. The decoder for each user treats every other user as unwanted noise. This situation essentially turns each of the channels into an independent single-user channel, where all the effects of concurrent users are lumped into channel impairments. Current cellular telephony systems and the vast majority of communications systems to date apply this methodology.

Receiver cooperation: Each source independently transmits information. The receiver makes full use of the received signal by performing joint decoding. This is usually what we mean by coordinated multiuser communications. The resulting channel is the multiple-access channel with independent transmissions.

Full cooperation: The sources may fully coordinate their transmissions. Joint decoding is used at the receiver. If full cooperation is allowed (requiring communication between users), there is no restriction of independence on the joint transmission of the different users. Full cooperation allows the channel resource to be used as if by a single “super-source”, which can be used as a benchmark. For many practically important multiple-access channels (such as the Gaussian multiple-access channel), full cooperation may not add to the fundamental capacity (region). In such cases, from a theoretical perspective, it is unimportant whether the communicating terminals are physically separated and independent or are co-located and can coordinate their transmissions. Full cooperation can however be very important in practice to reduce the complexity burden of the receiver.

The remaining possibility, transmitter cooperation without receiver cooperation, is also of interest. This corresponds to the broadcast channel, which we do not consider in this book. Receiver cooperation with independent transmitters is the main theme of this book.
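The gap between no cooperation and receiver cooperation can be sketched numerically. The example below assumes a two-user Gaussian multiple-access channel with received powers P1 and P2 and noise power N, and uses the standard rate expressions for that channel in the logarithmic form of Equation (1.1), per unit bandwidth. These expressions are not derived in this chapter (the capacity region is treated in Chapter 3), and the function names and numbers are illustrative only.

```python
import math


def rates_without_cooperation(p1: float, p2: float, noise: float):
    """Rates in bits/s/Hz when each decoder treats the other user as extra noise."""
    r1 = math.log2(1.0 + p1 / (noise + p2))
    r2 = math.log2(1.0 + p2 / (noise + p1))
    return r1, r2


def sum_rate_with_joint_decoding(p1: float, p2: float, noise: float) -> float:
    """Largest total rate in bits/s/Hz when the receiver decodes both users jointly."""
    return math.log2(1.0 + (p1 + p2) / noise)


# Hypothetical powers: two equally strong users, each 7.5 times the noise power.
r1, r2 = rates_without_cooperation(7.5, 7.5, 1.0)
print(r1 + r2)                                    # total rate, separate decoders
print(sum_rate_with_joint_decoding(7.5, 7.5, 1.0))  # total rate, joint decoder
```

Under these assumptions the separately decoded users are interference-limited, so raising both powers helps them little, while the jointly decoded sum rate keeps growing with the total received power.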

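[Figure 1.5: three block diagrams. (a) No cooperation: transmitters 1 and 2 share a channel and are decoded by separate receivers 1 and 2. (b) Receiver cooperation: transmitters 1 and 2 share a channel and are decoded by a single receiver. (c) Full cooperation: a single transmitter, the channel, and a single receiver.]
Fig. 1.5. Degrees of cooperation.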

1.4.2 Synchronization

The other important type of coordination between users in a network is synchronization. Whereas the different levels of cooperation described above are concerned with the sharing of transmitted data, or coordinating transmission signals, various degrees of synchronism are a consequence of the availability of a common time reference. Depending upon the time reference available, users may align their symbols or frames.

Symbol Synchronism: Users strictly align their symbol epochs to common time boundaries. It is important to note that the reference location is the receiver, i.e. the transmitted symbols must align when received.

Frame Synchronism: The users align their codewords to common time boundaries. This is a less restrictive level of synchronization and therefore easier to accomplish than symbol synchronism.

Phase/Frequency Synchronism: Especially in wireless transmissions the phase and frequency of the carrier signal are very important. Knowing these, a receiver can be built with significantly less effort than without that knowledge. Phase and frequency synchronization are typically only feasible when full cooperation is possible. This is the case for multiple-antenna transmission systems, for example.

We shall see that loss of either type of synchronism may change the information rates that can be achieved.

1.4.3 Fixed Allocation Schemes

One conceptually simple way to share a common channel is to use some kind of fixed allocation. This is the current state of the art, and the following allocation methods are widely used:

Time-Division Multiple-Access (TDMA): Users do not transmit simultaneously. They transmit in orthogonal time periods, and therefore, from a “Shannon” point of view, they all use single channels in the conventional sense. This does require that transmissions are (frame) synchronized in time, which typically adds significant system overhead.

Frequency-Division Multiple-Access (FDMA): Users transmit in orthogonal frequency bands instead of time intervals. The available channel is sliced up into a number of frequency channels, each of which is used by a single user at a time. This requires coordination and frequency synchronization of the transmissions. FDMA and TDMA behave similarly on an ideal channel. On mobile channels, however, an FDMA system may suffer more significantly from signal fading. The pan-European cellular telephony system GSM [37] uses a combination of FDMA and TDMA.

Code-Division Multiple-Access (CDMA): This relative newcomer uses orthogonal, or nearly orthogonal, signaling waveforms for each of the users. It should properly be called signal-space-division multiple-access, but historically, its most popular representative was CDMA, where the signal waveforms were generated by spreading codes, hence the terminology. Both time- and frequency-division can be viewed as special instances of signal-space-division. Code-division multiple-access also requires transmitter synchronization. Furthermore, the signals used must be orthogonal at the receiver. Many real-world channels destroy this orthogonality, and the channel turns into a proper multiple-access channel. CDMA finds application in cellular-based telephony systems such as IS-95 [135], cdma2000 [134], and future third generation systems.

Mathematical and physical details of these accessing schemes are discussed in Chapter 2.
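The remark that orthogonality must hold at the receiver can be made concrete with a toy example. The sketch below assumes length-4 Walsh (Hadamard) sequences as the signaling waveforms, which is one common choice and not necessarily the one used by any system named above. It checks that the aligned sequences are orthogonal and that a one-chip cyclic delay of one user destroys this property. All function names are hypothetical.

```python
def walsh_codes(order: int) -> list:
    """Rows of a 2**order by 2**order Hadamard matrix (Sylvester construction)."""
    codes = [[1]]
    for _ in range(order):
        upper = [row + row for row in codes]
        lower = [row + [-chip for chip in row] for row in codes]
        codes = upper + lower
    return codes


def inner_product(a, b) -> int:
    """Correlation of two equal-length chip sequences."""
    return sum(x * y for x, y in zip(a, b))


def cyclic_delay(code, chips: int) -> list:
    """Delay a sequence cyclically by the given number of chips."""
    chips %= len(code)
    return code[-chips:] + code[:-chips] if chips else list(code)


codes = walsh_codes(2)
# Aligned users: every pair of distinct sequences has zero correlation.
print([[inner_product(a, b) for b in codes] for a in codes])
# A one-chip delay of user 3 correlates strongly with user 4.
print(inner_product(cyclic_delay(codes[2], 1), codes[3]))
```

TDMA fits the same picture if each user's sequence is non-zero only in its own time slot, which is the sense in which time-division is a special instance of signal-space-division.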

1.5 Network vs. Signal Processing Complexity

State-of-the-art networks such as cellular telephony systems use mostly orthogonal allocation schemes to simplify the transmission technology. This, however, puts an organizational and computational burden on the network controls. Networks using fixed allocation schemes must maintain synchronism, which requires reliable feedback channels from the terminals. The network must also perform the resource allocation, which requires that the complete state of the network is known to the controller. These requirements can lead to very complex network control operations, which will become more and more complex as wireless networks migrate to packet transmission formats and ad-hoc operation. The precise power control mechanism required in CDMA cellular networks is an example of such a centralized network function [154].

Joint accessing of a MAC using multiple-user coding strategies has the potential to significantly increase the system capacity while reducing the overhead associated with network control functions. Complexity is transferred into the receiver, which is equipped with signal processing functions which can extract information streams from the correlated composite signal. With the rapid advances of VLSI technology, receiver designs which seemed impossible just years ago are now well within reach as VLSI chips push processing powers into the teraflop region.

A well-designed multiuser receiver will relax or obviate the requirements on network synchronization and coordination. The properties of the signal processing functions provide for access coordination. For example, since a multiuser detector can process signals with different received power levels (see Chapter 6), power control does not need to be realized by the network anymore. Since a multiuser detector can potentially approach the capacity of the multiple-access channel, the physical channel resources can be used optimally.
This can lead to dramatic increases in system capacity.

Clearly, we are not getting anything for free, and the complexity of a multiuser receiver can be significantly higher than that of single-channel receivers. In Chapter 5 we will present joint receiver structures which can be viewed as bridges between single-channel receivers and joint receivers, in that they rely on only moderately complex extensions of conventional single-channel receivers. In general, however, it is fair to say that complexity is transferred from the network to the receiver, where it is realized in VLSI circuits. The ability to implement complex systems with comparable ease allows the exploitation of channel resources right at the receiver, rather than having to try to compensate for inefficient receiver processing with expensive network-level measures.

1.6 Future Directions

As pressure rises to use expensive channel resources such as power and bandwidth more efficiently, highly resource-efficient systems will replace less efficient solutions. Multiuser receivers are required to exploit these resources optimally at the physical layer of a multiple-access channel. A level higher up, resources will likely have to be allocated dynamically as needed rather than as fixed allocations. This requires a high level of coordination. Intelligent medium-access control layer algorithms have to select resource allocations, but physical layer multiuser receivers will allow a chosen allocation to take optimal effect. Such receivers will also alleviate the pressure on the medium-access control layer by allowing the receivers to adjust to many channel aspects, such as different received power levels.

The dropping cost of signal processing allows efficient exploitation of channel resources right at the receiver, making it possible to approach the fundamental limits of the communications channel. Joint signal processing at the receiver, combined with novel medium-access control layer protocols, will allow future data networks to harness the intrinsic capacity of the multitude of channels in a multi-terminal network. Laying the foundations of the physical layer processing methodologies for such future networks is the purpose of this book.

2 Linear Multiple-Access

Most of the multiple-access channel models of interest in this book are linear, meaning that the channel output is a linear transformation of the users’ input signals, affected by additive noise. A simplified example of a two-user linear multiple-access channel is shown in Figure 2.1. The corresponding mathematical model is

$$r(t) = d_1(t) + d_2(t) + z(t),$$

where $r(t)$ is the received signal, $d_1(t)$ and $d_2(t)$ are the signals transmitted by users 1 and 2, and $z(t)$ is an additive noise process.

[Figure 2.1: block diagram. The signals $d_1(t)$ of user 1 and $d_2(t)$ of user 2 and the noise $z(t)$ add to form $r(t)$ at the receiver.]
Fig. 2.1. Simplified two-user linear multiple-access channel.

The assumption of linearity may be based for example on the underlying linear superposition of signals which occurs in radio transmissions. In many real-world applications however, various non-linear effects (for example due to amplifier non-linearities) may be present. Such non-linearities will be ignored in this book.

The purpose of this chapter is to develop a reasonably generic mathematical model for the linear multiple-access channel, to be adopted for the remainder of this book. This model describes a number of users accessing a common channel by modulating their data signals with specially designed waveforms. We shall begin in Section 2.1 by describing the channel model in continuous time, and further develop a signal-space representation. We shall then discretize time in Section 2.2 via sampling, and derive a linear matrix-algebraic model in Section 2.3 that will form the basis for much of the discussion in the remainder of this book. This matrix-algebraic model applies to both the asynchronous and synchronous case, which facilitates the later discussion of detection algorithms. The special case of the symbol synchronous model will be discussed in Section 2.4.

Following on from this, in Section 2.5 we give a brief overview of the principles of single-user and multiple-user detection for these channels. In particular, the single-user correlation detector, described in Section 2.5.3, has motivated the design of many existing multiple-access systems (historically due to its low implementation complexity). In Section 2.6 we describe some of these multiple-access systems, including time- and frequency-division methods, and show how the discrete-time linear model is specialized to each case. Of particular interest is the direct-spread code-division multiple-access channel of Section 2.6.2, which forms a common theme for the remainder of this book. The chapter concludes with an investigation into the design of signaling waveforms for single-user detection in Section 2.7.

2.1 Continuous Time Model

Figure 2.2 shows a schematic representation of a $K$-user continuous time linear multiple-access channel. Each user’s transmitter consists of a data source, an encoder (i.e. forward error correction), and a modulator. Typically the data source and encoder output binary symbols (bits). It will be assumed that any data compression is contained within the data source, and we do not consider it explicitly. Modulation is performed in two stages. First, a baseband modulator maps the coded data onto a complex signaling constellation. The resulting complex samples are then amplified and multiplied by a channel modulation waveform, which is the method used to access the common channel. The receiver observes a signal which is the sum of delayed versions of the modulated users’ signals, together with noise.

[Figure 2.2: block diagram. For each user $k = 1, \dots, K$, a data source, an encoder and a baseband modulator produce $d_k[i]$, which is multiplied by $\sqrt{P_k}\, s_k(t)$ to give $x_k(t)$ and delayed by $\tau_k$; the delayed signals and the noise $z(t)$ add to form $r(t)$ at the receiver. A box marks the multiple-access channel.]
Fig. 2.2. Continuous time linear multiple-access channel.

In Figure 2.2, a dashed box shows the portion of the system that we refer to as the multiple-access channel. Strictly speaking, from an information theoretic point of view, the channel modulation waveforms should not be regarded as part of the channel. For the purposes of channel modeling however, we will consider the modulation waveforms as part of the channel. We will now develop a detailed mathematical model for this linear multiple-access channel.

Considering the output of the baseband modulator, each user $k = 1, \dots, K$ generates a sequence of $n$ data symbols $d_k[i]$, $i = 1, 2, \dots, n$. The subscript $k$ indexes the user and the square-bracket index $i$ denotes the symbol time index. Each data symbol is selected from a baseband modulation alphabet, $d_k[i] \in \mathcal{D}_k$. The alphabets need not be the same for all users, and may be discrete or continuous. In general, they may be subsets of the complex numbers, $\mathcal{D}_k \subset \mathbb{C}$, which allows us to use the sequences $d_k[i]$ to model various forms of complex baseband modulation. For example, antipodal modulation, such as binary phase-shift keying, can be modeled by $\mathcal{D}_k = \{-1, +1\}$. Alternatively, bi-orthogonal, e.g. quaternary phase-shift keyed, data can be modeled by $\mathcal{D}_k = \{1 + j, -1 + j, -1 - j, 1 - j\}/\sqrt{2}$. See [104, 167] for basic concepts of signal space and the complex baseband representation of signals. In Figure 2.2 the thicker lines indicate complex signal paths.

For the purposes of channel modeling, we shall not be concerned with how the discrete-time symbol sequences $d_k[i]$ are generated. For example they could represent either uncoded or coded information sequences. In Chapter 3, we shall be more concerned with statistical models for the users’ messages, which is of relevance for information theory.

Each data symbol has duration $T$ seconds. We turn the discrete-time symbol sequences $d_k[i]$ into continuous-time sequences (a function of the continuous time variable $t \ge 0$) by writing

$$d_k(t) = d_k\!\left[\left\lceil \frac{t}{T} \right\rceil\right],$$

where $\lceil \cdot \rceil$ is the ceiling function, which returns the smallest integer that is greater than or equal to its argument.

We shall use the convention that discrete-time sequences are indicated with square bracket indexing by $i$. Integer indices such as $k$ and $i$ shall start from 1. Continuous time waveforms will be indicated with round bracket indexing by $t \ge 0$.

The symbol sequences for each user are amplified via multiplication by the amplitudes $\sqrt{P_k} \ge 0$. The resulting scaled sequences are then linearly modulated with a real or complex continuous-time waveform $s_k(t)$. It will be assumed without loss of generality that the per-symbol energy of the modulating waveforms is normalized so that

$$\frac{1}{nT} \int_0^{nT} |s_k(t)|^2 \, dt = 1$$

(recalling that $n$ consecutive symbols are transmitted with total duration $nT$). It will also be assumed that

$$s_k(t) = 0, \quad t < 0, \; t > nT.$$
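The definitions above can be sketched in a few lines. The example assumes the two alphabets named in the text and the mapping $d_k(t) = d_k[\lceil t/T \rceil]$. The names `BPSK`, `QPSK`, `symbol_at` and `average_power` are hypothetical, and the normalization integral is approximated by a midpoint sum rather than evaluated exactly.

```python
import math

# Baseband alphabets D_k from the text.
BPSK = (-1 + 0j, 1 + 0j)
QPSK = tuple(c / math.sqrt(2) for c in (1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j))


def symbol_at(symbols, t: float, symbol_period: float) -> complex:
    """Return d_k(t) = d_k[ceil(t / T)], with symbol indices starting from 1."""
    i = math.ceil(t / symbol_period)
    # t = 0 maps to index 0, which the 1-based convention leaves undefined,
    # so this sketch rejects it together with times beyond the n symbols.
    if not 1 <= i <= len(symbols):
        raise ValueError("t lies outside the n transmitted symbols")
    return symbols[i - 1]


def average_power(waveform, n: int, symbol_period: float, steps: int = 10000) -> float:
    """Midpoint-rule approximation of (1 / nT) * integral of |s_k(t)|^2 over [0, nT]."""
    duration = n * symbol_period
    dt = duration / steps
    energy = sum(abs(waveform((m + 0.5) * dt)) ** 2 for m in range(steps)) * dt
    return energy / duration


symbols = [QPSK[0], QPSK[2], QPSK[1]]     # n = 3 hypothetical data symbols
print(symbol_at(symbols, 2.1, 2.0))       # second symbol, since ceil(2.1 / 2) = 2
print(average_power(lambda t: 1.0, 3, 2.0))  # a constant unit waveform is normalized
```

Because of the ceiling, a symbol boundary $t = iT$ still belongs to symbol $i$, and symbol $i + 1$ starts immediately after it.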

Under the further assumption that the modulation waveforms $s_k(t)$ are independent of the data sequences $d_k[i]$, this results in a per-symbol transmit energy $P_k$. Appropriate selection of the modulation waveforms allows us to represent a wide variety of linear modulation schemes. Some examples will be developed in Section 2.6.

Each transmitted waveform experiences a time delay of $\tau_k$ seconds. These delays can model the propagation delays inherent in the channel, and the fact that the individual users may not be synchronized in time. Without loss of generality, we can assume that $0 \le \tau_k < T$ (delays longer than one symbol period can be modeled by re-indexing the users’ symbols). Thus (in the absence of delay) the signal contribution $x_k(t)$ of user $k = 1, 2, \dots, K$ is

$$x_k(t) = d_k\!\left[\left\lceil \frac{t}{T} \right\rceil\right] \sqrt{P_k}\, s_k(t).$$

The delay and noise-affected channel output waveform $r(t)$ is therefore given by

$$r(t) = \sum_{k=1}^{K} x_k(t - \tau_k) + z(t) \tag{2.1}$$

$$\phantom{r(t)} = \sum_{k=1}^{K} d_k\!\left[\left\lceil \frac{t - \tau_k}{T} \right\rceil\right] \sqrt{P_k}\, s_k(t - \tau_k) + z(t), \tag{2.2}$$

where $z(t)$ is a continuous time noise process, which is usually assumed to be Gaussian (real or complex to match the modulating waveforms and data alphabet). See [49] for a discussion of the Gaussian noise process in relation to communications systems.
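Equation (2.2) can be evaluated directly at any time instant. The sketch below does so for a hypothetical two-user example with unit-amplitude rectangular waveforms and no noise. The helper names, the rectangular waveform and all numbers are assumptions made for illustration, and the noise value $z(t)$ is simply passed in by the caller.

```python
import math


def rect_waveform(n: int, symbol_period: float):
    """Hypothetical unit-amplitude waveform s_k(t), zero outside [0, nT]."""
    duration = n * symbol_period
    return lambda t: 1.0 if 0.0 <= t <= duration else 0.0


def received_signal(t, symbols, powers, delays, waveforms, symbol_period, noise=0.0):
    """Evaluate r(t) = sum_k d_k[ceil((t - tau_k)/T)] sqrt(P_k) s_k(t - tau_k) + z(t)."""
    total = 0j
    for d_k, p_k, tau_k, s_k in zip(symbols, powers, delays, waveforms):
        local_t = t - tau_k                      # time seen by user k's waveform
        i = math.ceil(local_t / symbol_period)   # 1-based symbol index
        if 1 <= i <= len(d_k):                   # outside this range s_k is zero
            total += d_k[i - 1] * math.sqrt(p_k) * s_k(local_t)
    return total + noise


# Two users, n = 2 symbols each, T = 1 s; user 2 arrives half a symbol late.
T = 1.0
symbols = [[1, -1], [-1, -1]]
powers = [4.0, 1.0]
delays = [0.0, 0.5]
waveforms = [rect_waveform(2, T), rect_waveform(2, T)]
for t in (0.25, 0.75, 1.25):
    print(t, received_signal(t, symbols, powers, delays, waveforms, T))
```

At $t = 0.25$ only user 1 has started transmitting. From $t = 0.5$ onwards the two contributions overlap with their symbol boundaries offset by half a symbol, which is the asynchronous situation the delays $\tau_k$ are meant to model.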

2.2 Discrete Time Model

For the purposes of analysis and indeed implementation, it can be more convenient to work with an equivalent discrete-time model. By a discrete-time model, we mean a representation of $r(t)$ in terms of an integer-indexed sequence $r[j]$, $j = 1, 2, \dots$. By equivalent, we require $r[j]$ to be a sufficient statistic for detection of the users’ data, according to Section A.2.

A discrete-time model is obtained by representing the received signal $r(t)$ using a complete orthonormal basis (see [49, Chapter 8]). Let the functions $\phi_j(t)$, $j = 1, 2, \dots$ be a complete orthonormal basis. Then the discrete-time representation is

$$r(t) = \sum_{j} r[j]\, \phi_j(t), \quad \text{where} \quad r[j] = \int r(t)\, \phi_j^*(t) \, dt,$$

and $(\cdot)^*$ denotes complex conjugation.

There are many possible choices for the basis $\phi_j$, depending on the structure of the transmitted waveforms. In any practical system, the modulating waveforms $s_k(t)$ will be (at least approximately) band-limited to an interval of frequencies $[-W, W]$, and (approximately) time-limited to an interval $[0, nT]$. In this case, one common choice of basis is the set of sampling functions

$$\phi_j(t) = \frac{\sin\!\left(2W\pi \left(t - \frac{j-1}{2W}\right)\right)}{\sqrt{2W}\, \pi \left(t - \frac{j-1}{2W}\right)}, \quad j = 1, 2, \dots, 2WnT + 1. \tag{2.3}$$

Using this basis (2.3), a finite dimension discrete-time model is obtained by sampling the received waveform (2.2) at intervals of $1/(2W)$ seconds,

$$r[j] = r\!\left(\frac{j-1}{2W}\right) \tag{2.4}$$

$$\phantom{r[j]} = \sum_{k=1}^{K} d_k\!\left[\left\lceil \frac{\frac{j-1}{2W} - \tau_k}{T} \right\rceil\right] \sqrt{P_k}\, s_k\!\left(\frac{j-1}{2W} - \tau_k\right) + z\!\left(\frac{j-1}{2W}\right), \tag{2.5}$$

where $j = 1, 2, \dots, 2WnT + 1$ is the sample time index (recalling our convention to start integer indices from 1).

With reference to Figure 2.3, let $T_k = \lceil 2W\tau_k \rceil$ be the delay of user $k$, rounded up and measured in samples, and for future reference define

$$T_{\max} = \max_k T_k.$$

The first non-zero sample instant for user $k$ is at an offset of

$$\tau_{k0} = \frac{T_k}{2W} - \tau_k$$

seconds from the start of the modulating waveform. Define the sampled modulation waveforms

$$s_k[j] = s_k\!\left(\frac{j-1}{2W} - \tau_k\right), \tag{2.6}$$

which are zero for $j < T_k + 1$ and $j \ge nL + T_k + 1$, where $L = \lfloor 2WT \rfloor$ is the number of samples per symbol period. Using these definitions, we arrive at the following discrete-time model,

$$r[j] = \sum_{k=1}^{K} d_k[i]\, \sqrt{P_k}\, s_k[j] + z[j]. \tag{2.7}$$

In the above equation, the data symbol index $i$ of user $k$ corresponding to sample number $j$ is

$$i = \left\lceil \frac{\frac{j-1}{2W} - \tau_k}{T} \right\rceil,$$

valid for $j > T_k$. The sequence $z[j] = z((j - 1)/(2W))$ contains the noise samples.

Throughout this book, the additive noise will be modeled by a zero mean, circularly symmetric complex white Gaussian process with power spectral density $N_0/L$. In this case, the noise samples $z[j]$ have independent zero mean Gaussian real and imaginary parts, each with variance $N_0/(2L)$, and these samples are also independent and identically distributed (i.i.d.) over time. The received signal to noise ratio $\gamma_k$ for user $k$ is therefore

$$\gamma_k = \begin{cases} P_k/N_0 & \text{for complex modulation,} \\ 2P_k/N_0 & \text{for real modulation.} \end{cases}$$
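The sampling step can be checked on a small example. The sketch below assumes a single user with a hypothetical unit-amplitude rectangular waveform and evaluates the noise-free samples from Equation (2.5), together with the sample delay $T_k = \lceil 2W\tau_k \rceil$ and $L = \lfloor 2WT \rfloor$. Function names and numbers are illustrative assumptions.

```python
import math


def sample_delay(tau: float, bandwidth: float) -> int:
    """T_k = ceil(2 W tau_k): delay of user k rounded up to whole samples."""
    return math.ceil(2 * bandwidth * tau)


def samples_per_symbol(symbol_period: float, bandwidth: float) -> int:
    """L = floor(2 W T)."""
    return math.floor(2 * bandwidth * symbol_period)


def received_sample(j, symbols, powers, delays, waveforms, symbol_period, bandwidth, noise=0.0):
    """Sample r[j] of Equation (2.5), taken at time (j - 1) / (2W)."""
    t_j = (j - 1) / (2 * bandwidth)
    total = 0j
    for d_k, p_k, tau_k, s_k in zip(symbols, powers, delays, waveforms):
        local_t = t_j - tau_k
        i = math.ceil(local_t / symbol_period)   # symbol index for this sample
        if 1 <= i <= len(d_k):                   # otherwise s_k[j] is zero
            total += d_k[i - 1] * math.sqrt(p_k) * s_k(local_t)
    return total + noise


# Hypothetical set-up: W = 2 Hz, T = 1 s, n = 2 symbols, tau_1 = 0.3 s.
W, T, n, tau = 2.0, 1.0, 2, 0.3
rect = lambda t: 1.0 if 0.0 <= t <= n * T else 0.0
T_1 = sample_delay(tau, W)
L = samples_per_symbol(T, W)
print(T_1, L, T_1 / (2 * W) - tau)   # sample delay, samples per symbol, offset tau_k0
print([received_sample(j, [[1, -1]], [4.0], [tau], [rect], T, W) for j in range(1, 12)])
```

In this example the samples vanish for $j < T_1 + 1$ and for $j \ge nL + T_1 + 1$, and each symbol occupies exactly $L$ consecutive samples, as stated below Equation (2.6).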

2.3 Matrix-Algebraic Representation

Much insight into the behavior of the linear multiple-access channel and the design of multiple-user detectors is gained through expressing the discrete-time model (2.7) in terms of matrix-vector operations. In order to accomplish this, we need to define some appropriate vectors and matrices.

Modulation Matrix

Collect each user's sampled modulation waveforms into length $nL + T_{\max}$ column vectors $\mathbf{s}_k[i]$, one vector for each symbol period of each user, with zero padding corresponding to the delays $\tau_k$. The vector for user k and symbol i is

$$\mathbf{s}_k[i] = \big(\underbrace{0, \ldots, 0}_{L(i-1)+T_k},\; s_k[L(i-1)+T_k+1],\; s_k[L(i-1)+T_k+2],\; \ldots,\; s_k[Li+T_k],\; 0, \ldots \big)^t. \tag{2.8}$$

Fig. 2.3. Sampling of the modulation waveform. (Figure: the waveform $s_k(t)$, delayed by $\tau_k$, with samples $s_k[1], \ldots, s_k[5]$ on a time axis marked in multiples of $1/(2W)$ and the offset $\tau_k^0$ indicated.)

Note that

$$\sum_{i=1}^{n} \mathbf{s}_k[i] = \left(s_k[1], s_k[2], \ldots, s_k[nL + T_{\max}]\right)^t$$

is a vector containing the entire sampled modulation waveform for user k.

Example 2.1. Figure 2.4 shows a schematic representation of the modulation vectors for user k in a system with $L = 3$, $T_k = 2$, $T_{\max} = 3$ and $n = 3$. The zero part of each vector is omitted for clarity.

Now define the $(Ln + T_{\max}) \times Kn$ modulation matrix S, which takes the vectors $\mathbf{s}_k[i]$, $k = 1, 2, \ldots, K$, $i = 1, 2, \ldots, n$ as columns,

$$\mathbf{S} = \big(\mathbf{s}_1[1]\;\, \mathbf{s}_2[1]\;\, \ldots\;\, \mathbf{s}_K[1]\;\;\, \mathbf{s}_1[2]\;\, \mathbf{s}_2[2]\;\, \ldots\;\, \mathbf{s}_K[2]\;\;\, \ldots\;\;\, \mathbf{s}_1[i]\;\, \mathbf{s}_2[i]\;\, \ldots\;\, \mathbf{s}_K[i]\;\;\, \ldots \big),$$

i.e. all the users' vectors for symbol 1 (in order of user number), followed by all the vectors for symbol 2 and so on. Column l of S is the sampled modulation sequence of user

$$k = ((l - 1) \bmod K) + 1 \tag{2.9}$$

corresponding to symbol

$$i = \lceil l/K \rceil. \tag{2.10}$$
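The index conversion in (2.9) and (2.10) can be sketched in a few lines of Python. The function names below are hypothetical and chosen only for illustration; indices are 1-based, as in the text.

```python
def column_to_user_symbol(l: int, K: int) -> tuple[int, int]:
    """Map the 1-based column index l of S to (user k, symbol i), per (2.9)-(2.10)."""
    k = ((l - 1) % K) + 1      # (2.9)
    i = (l + K - 1) // K       # integer form of ceil(l / K), (2.10)
    return k, i


def user_symbol_to_column(k: int, i: int, K: int) -> int:
    """Inverse map: column index l = K(i - 1) + k."""
    return K * (i - 1) + k
```

For K = 3, for instance, column 4 maps to user 1 at symbol 2, and `user_symbol_to_column(1, 2, 3)` returns 4 again.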

Amplitude Matrix

Define a diagonal $Kn \times Kn$ matrix

$$\mathbf{A} = \operatorname{diag}\left(\sqrt{P_1}, \sqrt{P_2}, \ldots, \sqrt{P_K}, \sqrt{P_1}, \sqrt{P_2}, \ldots, \sqrt{P_K}, \ldots\right)$$

which contains the user amplitudes, repeated n times. Note that time-varying amplitudes are also easily accommodated.

Fig. 2.4. The modulation vectors sk [i]. (Figure: the three vectors placed along the sample axis starting at offsets Tk, Tk + L and Tk + 2L, ending at Tk + 3L.)

Data Vector

Collect the users' symbols into a single length $Kn$ column vector

$$\mathbf{d} = \left(d_1[1], d_2[1], \ldots, d_K[1], \ldots, d_1[i], d_2[i], \ldots, d_K[i], \ldots, d_K[n]\right)^t.$$

The elements of this vector are in the same ordering as the columns of S, namely data symbol 1 for each user followed by data symbol 2 for each user and so on.

The Basic Model for the Received Vector

Finally define the length $nL + T_{\max}$ column vectors

$$\mathbf{r} = \left(r[1], r[2], \ldots, r[nL + T_{\max}]\right)^t, \qquad \mathbf{z} = \left(z[1], z[2], \ldots, z[nL + T_{\max}]\right)^t,$$

which respectively contain the received samples and the noise samples. We can now re-write the discrete-time model (2.7) using the following matrix-vector representation

$$\mathbf{r} = \mathbf{S}\mathbf{A}\mathbf{d} + \mathbf{z}. \tag{2.11}$$

This is the canonical linear-algebraic model for the linear multiple-access channel. Using the notation established so far, there are two ways to refer to elements of these matrices. In cases where we wish to emphasize user and time indexes, we will use the sequence notation $s_k[i]$. When we wish to emphasize the linear algebraic structure, we will use matrix-vector indices, $s_{il}$. As explained above, the conversion between user/time indexes k, i and row/column indexes is via (2.9) and (2.10).

Example 2.2. The equation below shows the structure of this linear model, for four users, $K = 4$ and two symbols $n = 2$ with $T_1 = 0$, $T_2 = 1$, $T_3 = 2$, $T_4 = 0$ and $L = 4$. All zero elements are omitted for clarity.

$$
\begin{pmatrix} r[1] \\ r[2] \\ r[3] \\ r[4] \\ r[5] \\ r[6] \\ r[7] \\ r[8] \\ r[9] \\ r[10] \end{pmatrix}
=
\begin{pmatrix}
s_1[1] & & & s_4[1] & & & & \\
s_1[2] & s_2[2] & & s_4[2] & & & & \\
s_1[3] & s_2[3] & s_3[3] & s_4[3] & & & & \\
s_1[4] & s_2[4] & s_3[4] & s_4[4] & & & & \\
 & s_2[5] & s_3[5] & & s_1[5] & & & s_4[5] \\
 & & s_3[6] & & s_1[6] & s_2[6] & & s_4[6] \\
 & & & & s_1[7] & s_2[7] & s_3[7] & s_4[7] \\
 & & & & s_1[8] & s_2[8] & s_3[8] & s_4[8] \\
 & & & & & s_2[9] & s_3[9] & \\
 & & & & & & s_3[10] &
\end{pmatrix}
\operatorname{diag}\left(\sqrt{P_1}, \sqrt{P_2}, \sqrt{P_3}, \sqrt{P_4}, \sqrt{P_1}, \sqrt{P_2}, \sqrt{P_3}, \sqrt{P_4}\right)
\begin{pmatrix} d_1[1] \\ d_2[1] \\ d_3[1] \\ d_4[1] \\ d_1[2] \\ d_2[2] \\ d_3[2] \\ d_4[2] \end{pmatrix}
+
\begin{pmatrix} z[1] \\ z[2] \\ z[3] \\ z[4] \\ z[5] \\ z[6] \\ z[7] \\ z[8] \\ z[9] \\ z[10] \end{pmatrix}
$$
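As an illustrative sketch, the following Python/NumPy fragment builds S, A and d for a small asynchronous system and checks that the matrix form (2.11) reproduces the sample-level sum (2.7) in the noise-free case. The function name `build_modulation_matrix`, the random ±1/√L waveform samples, the powers, and the delays T = (0, 1, 2, 0) are assumptions chosen for illustration. Array indices are 0-based, whereas the text indexes from 1, and the stored waveforms are undelayed (the delay is applied when filling S).

```python
import numpy as np


def build_modulation_matrix(waveforms, delays, n):
    """Build the (n*L + Tmax) x (K*n) modulation matrix S.

    waveforms: K x (n*L) array of undelayed waveform samples (hypothetical input).
    delays:    length-K list of integer sample delays T_k.
    """
    K, total = waveforms.shape
    L = total // n
    S = np.zeros((n * L + max(delays), K * n))
    for i in range(n):              # symbol index, 0-based
        for k in range(K):          # user index, 0-based
            start = L * i + delays[k]
            # Column K*i + k holds symbol i of user k, zero-padded elsewhere.
            S[start:start + L, K * i + k] = waveforms[k, L * i:L * (i + 1)]
    return S


rng = np.random.default_rng(0)
K, n, L = 4, 2, 4
delays = [0, 1, 2, 0]                               # assumed delays, in samples
waveforms = rng.choice([-1.0, 1.0], size=(K, n * L)) / np.sqrt(L)
P = np.array([1.0, 2.0, 0.5, 1.5])                  # assumed user powers

S = build_modulation_matrix(waveforms, delays, n)
A = np.diag(np.tile(np.sqrt(P), n))                 # amplitudes repeated n times
d = rng.choice([-1.0, 1.0], size=K * n)             # symbol-major ordering
r_matrix = S @ A @ d                                # noise-free r = S A d

# Direct evaluation of the noise-free sample-level sum, user by user.
r_direct = np.zeros(n * L + max(delays))
for k in range(K):
    for i in range(n):
        start = L * i + delays[k]
        r_direct[start:start + L] += (
            d[K * i + k] * np.sqrt(P[k]) * waveforms[k, L * i:L * (i + 1)]
        )
```

With these parameters S is 10 × 8, and its non-zero pattern is the staircase structure of a four-user, two-symbol system with the assumed delays.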

2.4 Symbol Synchronous Model

In the case $\tau_1 = \tau_2 = \cdots = \tau_K = 0$, the system is symbol synchronous. This is obviously a less general model than the asynchronous model presented above, but it allows us to write a per-symbol matrix model that is very convenient for the development of the main ideas of this book. The degree to which symbol synchronism can be achieved is a technological issue, and in practice it is typically difficult to achieve perfect synchronism across all users. For the symbol synchronous model we have the following symbol-wise discrete-time matrix model,

$$\mathbf{r}[i] = \mathbf{S}[i]\mathbf{A}\mathbf{d}[i] + \mathbf{z}[i], \tag{2.12}$$

where

\begin{align*}
\mathbf{r}[i] &= \left(r[(i-1)L+1], r[(i-1)L+2], \ldots, r[iL]\right)^t \\
\mathbf{z}[i] &= \left(z[(i-1)L+1], z[(i-1)L+2], \ldots, z[iL]\right)^t \\
\mathbf{d}[i] &= \left(d_1[i], d_2[i], \ldots, d_K[i]\right)^t \\
\mathbf{A} &= \operatorname{diag}\left(\sqrt{P_1}, \ldots, \sqrt{P_K}\right)
\end{align*}

and $\mathbf{S}[i]$ is an $L \times K$ matrix with column k being the modulation waveform samples for user k corresponding to symbol i,

$$\mathbf{s}_k[i] = \left(s_k[(i-1)L+1], s_k[(i-1)L+2], \ldots, s_k[iL]\right)^t.$$

It is important to note that the specific waveform samples are different from those used in the asynchronous model. Since the users are completely synchronous, $\tau_k^0 = 0$ for each user and

$$s_k[j] = s_k\left(\frac{j-1}{2W}\right), \tag{2.13}$$

which are different sampling instants compared to the asynchronous model (2.11) (compare (2.13) to (2.6)).

Note that the symbol-synchronous (2.12) and the general asynchronous (2.11) models share the same basic mathematical form. In both cases, the observation (whether it is for a single symbol, or an entire block of symbols) is a noise-affected linear transformation of the corresponding input. The only modification required is in the specification of the matrices and vectors involved. In the case of the symbol synchronous model, it is common practice to drop the symbol indexing when it does not cause confusion, and write

$$\mathbf{r} = \mathbf{S}\mathbf{A}\mathbf{d} + \mathbf{z}. \tag{2.14}$$

With this convention, we can use row/column indexing as explained above to write expressions that apply to both the synchronous (2.14) and asynchronous (2.11) models. This is very useful, since it allows easy translation of results between the two cases. In particular, detection algorithms developed for the synchronous channel can usually be directly applied to the asynchronous channel simply via a re-definition of the linear model. The symbol-synchronous discrete-time channel model is shown in Figure 2.5, where the summation is now a vector sum.

Example 2.3. The equation below shows the structure of the symbol synchronous linear model at symbol i, for three users, $K = 3$ and $L = 4$.

$$
\begin{pmatrix} r[(i-1)L+1] \\ r[(i-1)L+2] \\ r[(i-1)L+3] \\ r[iL] \end{pmatrix}
=
\begin{pmatrix}
s_1[(i-1)L+1] & s_2[(i-1)L+1] & s_3[(i-1)L+1] \\
s_1[(i-1)L+2] & s_2[(i-1)L+2] & s_3[(i-1)L+2] \\
s_1[(i-1)L+3] & s_2[(i-1)L+3] & s_3[(i-1)L+3] \\
s_1[iL] & s_2[iL] & s_3[iL]
\end{pmatrix}
\begin{pmatrix} \sqrt{P_1} & & \\ & \sqrt{P_2} & \\ & & \sqrt{P_3} \end{pmatrix}
\begin{pmatrix} d_1[i] \\ d_2[i] \\ d_3[i] \end{pmatrix}
+
\begin{pmatrix} z[(i-1)L+1] \\ z[(i-1)L+2] \\ z[(i-1)L+3] \\ z[iL] \end{pmatrix}
$$

Fig. 2.5. Synchronous model.

2.5 Principles of Detection

So far we have developed sample-level discrete time models for both the asynchronous and the symbol synchronous linear multiple-access channel. Using these models, we may apply existing results from detection theory to develop a variety of detection techniques with different levels of performance, at the expense of different levels of implementation complexity. In almost all cases

of interest, implementation of the optimal detector is prohibitively complex – its complexity increases exponentially with the number of users. The high cost of implementation for the optimal detector motivates the development of reduced-complexity sub-optimal detectors. See Appendix A for a brief introduction to Bayesian estimation and detection. Most existing multiple-access strategies have been designed with a particular sub-optimal detection method in mind, namely single-user correlation detection, also known as the single-user matched filter. Historically, the correlation detector pre-dates the optimal detector, and was the motivation behind the development of many different multiple-access techniques, for example direct-sequence code-division multiple-access. The purpose of this section is to give a brief introduction to the principles of detection as they apply to the linear multiple-access channel. In particular, we will introduce the single-user correlation receiver as motivation for the access strategies to be developed in Section 2.6. Chapters 4 – 6 develop in further detail a whole range of detection and decoding strategies, providing a trade-off between performance and complexity. As explained in Sections 2.3 and 2.4, the discrete time asynchronous and symbol-synchronous models share the mathematical formulation r = SAd + z

(2.15)

under different definitions of these matrices and vectors. In the asynchronous case, the modulation matrix S is $(nL + T_{\max}) \times nK$ and has as column l the zero-padded portion of the sampled modulation waveform of user $((l-1) \bmod K) + 1$ associated with symbol $\lceil l/K \rceil$. The $nK \times nK$ diagonal matrix A has l-th diagonal element $\sqrt{P_{((l-1) \bmod K)+1}}$. The length $nK$ data vector d has as l-th element $d_{((l-1) \bmod K)+1}[\lceil l/K \rceil]$.


In the symbol synchronous case, (2.15) is a symbol-wise model and at each symbol i, the $L \times K$ modulation matrix S has as columns the users' sampled modulation waveforms associated with symbol i. The $K \times K$ amplitude matrix is $\mathbf{A} = \operatorname{diag}(\sqrt{P_1}, \ldots, \sqrt{P_K})$ and the length K data vector contains the users' data symbols for time i.

Subject to the Gaussian noise model, in both cases, the noise vector z contains either real or circularly symmetric complex i.i.d. Gaussian noise samples with

$$E\left[\mathbf{z}\mathbf{z}^*\right] = \begin{cases} N_0 \mathbf{I} & \text{for complex noise,} \\ (N_0/2)\, \mathbf{I} & \text{for real noise.} \end{cases}$$

We will assume complex modulation. In cases where it causes no confusion, we shall simply refer to the linear model without specifying the degree of synchronism. In this way, the principles can be clearly explained without the need to describe a proliferation of special cases. This is one of the main advantages of this linear model. The basic design objective for the multiple-user detector is to find estimates $\hat{d}_k[i]$ satisfying various optimality criteria.

2.5.1 Sufficient Statistics and Matched Filters

According to (2.15), the observation r, conditioned on knowledge of S, A, d and $N_0$ is Gaussian, with mean SAd and covariance $N_0 \mathbf{I}$, denoted $\mathbf{r} \sim \mathcal{N}(\mathbf{S}\mathbf{A}\mathbf{d}, N_0 \mathbf{I})$. The matched filter output y, obtained by multiplying the channel output with the Hermitian transpose of the modulation sequence matrix,

$$\mathbf{y} = \mathbf{S}^*\mathbf{r} = \mathbf{S}^*\mathbf{S}\mathbf{A}\mathbf{d} + \mathbf{S}^*\mathbf{z} = \mathbf{R}\mathbf{A}\mathbf{d} + \tilde{\mathbf{z}}$$

is a sufficient statistic for the detection of the transmitted symbols d (see Section A.2). The matrix $\mathbf{R} = \mathbf{S}^*\mathbf{S}$ is called the correlation matrix. The resulting noise $\tilde{\mathbf{z}} = \mathbf{S}^*\mathbf{z}$ affecting the matched filter output is correlated according to

$$E\left[\tilde{\mathbf{z}}\tilde{\mathbf{z}}^*\right] = N_0 \mathbf{R}.$$

The matched-filter model is shown schematically in Figure 2.6 (for the symbol synchronous case). A sufficient statistic preserves the statistical properties of the observation. Hence the statistic y is an information lossless, reduced dimension representation of the original observation r. In the asynchronous case, R is $nK \times nK$. For the synchronous model, R is $K \times K$.
The correlation matrix is of particular importance in multiuser detection and its structure will be described in further detail in Section 2.5.2.
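As a small numerical sketch of the matched filter front end, the fragment below forms y = S∗r for a symbol synchronous system and checks that it equals RAd plus the filtered noise S∗z, and that filtering white noise of covariance N0 I yields covariance N0 R. The dimensions, powers, random binary sequences and noise level are assumptions made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
K, L, N0 = 3, 8, 0.5                                  # assumed system parameters

# Unit-energy columns with entries +-1/sqrt(L) (illustrative choice).
S = rng.choice([-1.0, 1.0], size=(L, K)) / np.sqrt(L)
A = np.diag(np.sqrt([1.0, 2.0, 0.5]))                 # assumed user powers
d = rng.choice([-1.0, 1.0], size=K)                   # antipodal data symbols

# Circularly symmetric complex noise with E[z z*] = N0 I.
z = np.sqrt(N0 / 2) * (rng.standard_normal(L) + 1j * rng.standard_normal(L))
r = S @ A @ d + z

R = S.conj().T @ S                                    # correlation matrix
y = S.conj().T @ r                                    # matched filter output
z_tilde = S.conj().T @ z                              # filtered (correlated) noise

# Covariance of the filtered noise: S* (N0 I) S = N0 R.
cov_z_tilde = S.conj().T @ (N0 * np.eye(L)) @ S
```

Here `y` has only K entries, compared with the L entries of `r`, which is the dimension reduction referred to in the text.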

Fig. 2.6. Symbol synchronous matched filtered model.

Using either r or y we can find minimum probability of error, or minimum variance estimates of d (see Section A.2). In many cases, it is more convenient to work with y, but this is not necessary. In fact, any invertible linear transformation of r is also a sufficient statistic. We shall later see other transformations that are useful for coded systems.

2.5.2 The Correlation Matrix

The matrix $\mathbf{R} = \mathbf{S}^*\mathbf{S}$ is called the correlation matrix, and appears frequently in the design and analysis of multiple-user detectors. We shall therefore consider this matrix in a little more detail.

Asynchronous Case

In the asynchronous case, column l of S is the spreading sequence for user $k = ((l-1) \bmod K) + 1$ at symbol period $i = \lceil l/K \rceil$ (reproducing (2.8) for reference),

$$\mathbf{s}_k[i] = \big(\underbrace{0, \ldots, 0}_{L(i-1)+T_k},\; s_k[L(i-1)+T_k+1],\; s_k[L(i-1)+T_k+2],\; \ldots,\; s_k[Li+T_k],\; 0, \ldots \big)^t. \tag{2.16}$$

The correlation matrix R of an asynchronous system has elements

$$R_{lm} = \mathbf{s}_k^*[i]\, \mathbf{s}_{k'}[i'],$$

where

$$l = K(i-1) + k, \qquad m = K(i'-1) + k'.$$

If symbol i for user k does not overlap in time with symbol i′ for user k′ then the resulting $R_{lm} = 0$. Thus the correlation matrix is band-diagonal with bandwidth $2K - 1$, since each symbol can only overlap with at most two symbols from any one other user.

Example 2.4. Figure 2.7 shows the structure of R for a $K = 3$ system and $i = 1, 2, 3$. We have without loss of generality assumed $\tau_1 \le \tau_2 \le \tau_3$. Hence R is a 9 × 9 band-diagonal matrix with bandwidth 5. The columns of the matrix are labeled with values for k and i. The rows are labeled with values for k′ and i′. The entries correspond to the cross-correlation between user k, symbol i and user k′, symbol i′. Note that the matrix consists of a block-diagonal part, together with lower and upper triangular matrices to fill in the band-diagonal structure. The square matrices on the diagonal are the cross-correlations between all the users for a given symbol interval. The upper-triangular matrices contain the inter-symbol interference terms from users at the previous symbol interval. The lower-triangular matrices contain the terms for the ISI from users at the next symbol interval.

Fig. 2.7. Structure of the cross-correlation matrix. (Figure: a 9 × 9 grid with columns labeled by i = 1, 2, 3 and k = 1, 2, 3, and rows labeled by i′ and k′.)
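The band structure of Example 2.4 can be checked numerically. The sketch below assumes a chip-aligned system with K = 3 users, n = 3 symbols, L = 8 samples per symbol, delays (0, 2, 5) ordered as τ1 ≤ τ2 ≤ τ3 and shorter than one symbol, and random binary sequences; all of these values are illustrative assumptions. It confirms that every entry of R further than K − 1 positions from the main diagonal is zero, i.e. that the 9 × 9 matrix has bandwidth 5.

```python
import numpy as np

rng = np.random.default_rng(2)
K, n, L = 3, 3, 8
delays = [0, 2, 5]                   # assumed delays, sorted and shorter than L

# Asynchronous modulation matrix with a fresh random sequence per symbol.
S = np.zeros((n * L + max(delays), K * n))
for i in range(n):
    for k in range(K):
        start = L * i + delays[k]
        S[start:start + L, K * i + k] = rng.choice([-1.0, 1.0], size=L) / np.sqrt(L)

R = S.T @ S                          # real sequences, so S* = S^t

# Entries more than K - 1 positions off the diagonal must vanish.
rows, cols = np.indices(R.shape)
outside_band = np.abs(rows - cols) > K - 1
max_outside = np.abs(R[outside_band]).max()
```

The zero pattern follows from the supports of the columns of S alone, so it does not depend on the particular random sequences drawn.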


Symbol Synchronous Case

For a symbol-synchronous model, there is no inter-symbol interference and the correlation matrix R is block-diagonal,

$$\mathbf{R} = \operatorname{diag}\left(\mathbf{R}[1], \mathbf{R}[2], \ldots\right),$$

where $\mathbf{R}[i]$ is a $K \times K$ Hermitian symmetric matrix with elements

$$R_{kk'}[i] = \mathbf{s}_k^*[i]\, \mathbf{s}_{k'}[i].$$

With reference to Figure 2.7, if the system is symbol synchronous, the upper- and lower-triangular components of R are zero, and the remaining square matrices on the diagonal are exactly R[1], R[2], etc.

2.5.3 Single-User Matched Filter Detector

The correlation receiver, also known as the single-user matched filter receiver, holds a special place in the history of multiple-user detection. It is an essentially single-user technique that forms its decision $\hat{d}_k$ for user k based only on the matched filter output for user k. Now the matched filter output for user k consists of the symbol of user k as well as additive multiple-access interference,

\begin{align}
y_k[i] &= \mathbf{s}_k^*[i]\, \mathbf{r} \tag{2.17} \\
&= \sqrt{P_k}\, d_k[i] + \sum_{(k',i') \neq (k,i)} \mathbf{s}_k^*[i]\, \mathbf{s}_{k'}[i']\, \sqrt{P_{k'}}\, d_{k'}[i'] + \tilde{z}_k[i] \tag{2.18} \\
&= \sqrt{P_k}\, d_k[i] + \underbrace{\sum_{(k',i') \neq (k,i)} R_{lm}\, \sqrt{P_{k'}}\, d_{k'}[i']}_{\text{Multiple-access Interference}} + \underbrace{\tilde{z}_k[i]}_{\text{Thermal Noise}}. \tag{2.19}
\end{align}

The last line above shows how the entries of the correlation matrix affect the amount of multiple-access interference (where as before, $l = K(i-1) + k$ and $m = K(i'-1) + k'$). The correlator receiver is motivated by a Gaussian assumption regarding the distribution of the multiple-access interference. The folklore argument in favor of the correlator receiver goes as follows. "The matched filter output is optimal (in terms of minimizing each user's error probability) when the multiple-access interference is uncorrelated Gaussian noise. If there are many users, the central limit theorem ensures that this interference is in fact asymptotically Gaussian distributed." Invoking the Gaussian assumption on the multiple-access interference, the correlator receiver applies standard detection methods individually to the output of each matched filter. In the case of binary antipodal modulation, $D_k = \{-1, +1\}$, each user's symbol estimate is obtained by hard-limiting the matched filter output as follows.


$$\hat{d}_k[i] = \operatorname{sgn}\left(y_k[i]\right).$$

Figure 2.8 shows the structure of the correlator receiver for the symbol-synchronous channel and antipodal modulation. In the case of other modulation alphabets, the threshold device is replaced by the appropriate decision device (based on the Gaussian assumption).

Fig. 2.8. Symbol synchronous single-user correlation detection for antipodal modulation.
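To make the effect of multiple-access interference on the correlator concrete, the following sketch applies the hard-limiting rule to a hypothetical two-user, noise-free example. The two unit-energy sequences (cross-correlation 0.8), the amplitudes and the data are assumptions chosen so that the stronger user's interference flips the weaker user's decision.

```python
import numpy as np


def correlator_detect(S, r):
    """Single-user matched filter detection for antipodal symbols: sgn(S* r)."""
    y = S.conj().T @ r
    return np.sign(y.real)


# Two unit-energy sequences with cross-correlation 0.8 (illustrative values).
S = np.array([[1.0, 0.8],
              [0.0, 0.6]])
A = np.diag([1.0, 2.0])           # user 2 has twice the amplitude of user 1
d = np.array([1.0, -1.0])         # transmitted symbols

r = S @ A @ d                     # noise-free received vector
d_hat = correlator_detect(S, r)   # y = (-0.6, -1.2), so d_hat = (-1, -1)
```

User 1 is detected incorrectly even without noise: its matched filter output is 1 − 0.8 · 2 = −0.6, so the interference term dominates the desired symbol.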

Largely due to the pioneering work of [121, 142, 147], the Gaussian assumption argument has been refuted, and since then, a proliferation of multiple-user detection strategies, exploiting the inherent structure of the multiple-access interference has occurred. The sub-optimality of the single-user correlator is basically due to the fact that although the vector y is a sufficient statistic for the detection of d, the single element yk is not (in general) a sufficient statistic for the detection of dk . By ignoring the other matched filter outputs, information essential for optimal detection is lost. In Chapter 3 we shall see that not only is the correlator sub-optimal from an uncoded bit-error probability point of view, but if each user performs single-user decoding based on the output of the correlator, a penalty is paid in terms of spectral efficiency. Nevertheless, the simplicity of implementation of the correlator makes it an attractive choice for system designers, and it is the method of choice for most contemporary existing systems.


2.5.4 Optimal Detection

We shall now develop the optimal detector for d, given the observation r. Whenever we talk about optimality, we need to clearly specify what function is being optimized. Of particular interest in this detection problem is the probability of error, and there are two main criteria of interest, corresponding respectively to joint or individual optimality. The jointly optimal detector minimizes the probability of a sequence error,

$$\Pr\left(\hat{\mathbf{d}}[i] \neq \mathbf{d}[i]\right), \tag{2.20}$$

whereas the individually optimal detector minimizes the individual error probabilities,

$$\Pr\left(\hat{d}_k[i] \neq d_k[i]\right).$$

Let us now develop the jointly optimal detector [121, 142] (the individually optimal detector will be developed in the next section). According to the discussion of minimum error probability detection in Section A.5.1, the jointly optimal detector outputs the data vector with maximum a-posteriori probability. Using Bayes rule, the jointly optimal detector¹ is given by

$$\hat{\mathbf{d}}_{\mathrm{MAP}} = \arg\max_{\mathbf{d} \in D^K} \Pr(\mathbf{d})\, f(\mathbf{r} \mid \mathbf{d}), \tag{2.21}$$

where Pr(d) is the prior probability distribution on the transmitted data and (in the case of complex noise),

$$f(\mathbf{r} \mid \mathbf{d}) = (2\pi N_0)^{-L} \exp\left(-\frac{1}{N_0} \left\| \mathbf{r} - \mathbf{S}\mathbf{A}\mathbf{d} \right\|_2^2\right)$$

is the conditional density of the channel output. Alternatively, this detector could have been defined as a function of the matched filter output, since y is a sufficient statistic. The development of this detector, as well as its extension to the asynchronous channel will be given in Chapter 4. Note that the estimate of the entire vector d is based on the entire vector r, or equivalently the output of every matched filter, [142], as depicted in Figure 2.9. In the case that Pr(d) is the uniform distribution, as might be the case for uncoded data, the joint MAP detector becomes the joint maximum likelihood detector. For Gaussian noise, the joint ML detector minimizes Euclidean distance,

$$\hat{\mathbf{d}}_{\mathrm{ML}} = \arg\min_{\mathbf{d} \in D^K} \left\| \mathbf{r} - \mathbf{S}\mathbf{A}\mathbf{d} \right\|_2^2. \tag{2.22}$$
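The minimum-distance rule (2.22) can be sketched as an exhaustive search over all candidate data vectors. The function name `ml_detect` and the two-user test system (unit-energy sequences with cross-correlation 0.8, unequal amplitudes, no noise) are assumptions for illustration only; the search visits every element of the candidate set, so it is practical only for very small K.

```python
import itertools

import numpy as np


def ml_detect(S, A, r, alphabet=(-1.0, 1.0)):
    """Brute-force joint ML detection: minimize ||r - S A d||^2 over all d."""
    K = S.shape[1]
    best_d, best_metric = None, np.inf
    for candidate in itertools.product(alphabet, repeat=K):
        d = np.array(candidate)
        metric = np.linalg.norm(r - S @ A @ d) ** 2
        if metric < best_metric:
            best_d, best_metric = d, metric
    return best_d


# Hypothetical two-user example in which hard-limiting y_1 alone gives -1.
S = np.array([[1.0, 0.8],
              [0.0, 0.6]])
A = np.diag([1.0, 2.0])
d_true = np.array([1.0, -1.0])
r = S @ A @ d_true                 # noise-free observation
d_ml = ml_detect(S, A, r)          # recovers d_true
```

In this example the first matched filter output is 1 − 1.6 = −0.6, so a decision based on that output alone would be wrong, whereas the joint search returns the transmitted vector.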

¹ Also assuming identical modulation alphabets Dk = D for each user.

Fig. 2.9. Optimal joint detection.

The general problem of MAP or ML estimation is NP-hard and brute force evaluation is exponentially complex in K, [149]. The complexity is due

to the discrete nature of the support, which has $|D|^K$ elements. Brute force evaluation involves calculation of $f(\mathbf{y} \mid \mathbf{d})$ for each element of the support. The ML estimator may be implemented using an exponentially complex trellis search. This prohibitive level of complexity motivates many sub-optimal reduced-complexity detection methods, which are the subject of most of the remainder of this book. In particular, Chapter 5 describes reduced complexity methods, mostly based on reduced tree searches for near-optimal detection, which approximate the action of either the jointly optimal, or individually optimal detectors.

2.5.5 Individually Optimal Detection

The individually optimal detector minimizes the symbol error probability,

$$\Pr\left(\hat{d}_k[i] \neq d_k[i]\right),$$

rather than the sequence error probability (2.20). This is accomplished by outputting the symbol which maximizes the marginal a-posteriori probability, $\Pr(d_k[i] \mid \mathbf{y})$. For the symbol synchronous system, this corresponds to

$$\hat{d}_k[i] = \arg\max_{d \in D} \sum_{\mathbf{d} : d_k = d} \Pr(\mathbf{d})\, f(\mathbf{y}[i] \mid \mathbf{d}). \tag{2.23}$$

There are $|D|^{K-1}$ terms in the above summation. Brute-force evaluation of the individually optimal decision is exponentially complex in the number of users. Note also that the individually optimal decision is still a function of the entire channel output.
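A minimal sketch of the marginalization behind the individually optimal decision is given below, under several stated assumptions: a uniform prior on d, the complex-noise likelihood proportional to exp(−‖r − SAd‖²/N0), and evaluation directly on r rather than on y (the two are equivalent because y is a sufficient statistic). The function name and the two-user test system are hypothetical.

```python
import itertools

import numpy as np


def individually_optimal_detect(S, A, r, N0, alphabet=(-1.0, 1.0)):
    """Per-user marginal MAP decisions under a uniform prior (brute force)."""
    K = S.shape[1]
    candidates = np.array(list(itertools.product(alphabet, repeat=K)))
    metrics = np.array([np.linalg.norm(r - S @ A @ d) ** 2 for d in candidates])
    # Likelihoods up to a common factor; subtracting the minimum avoids underflow.
    weights = np.exp(-(metrics - metrics.min()) / N0)
    d_hat = np.empty(K)
    for k in range(K):
        # Sum the likelihoods of all candidates that agree on user k's symbol.
        scores = [weights[candidates[:, k] == a].sum() for a in alphabet]
        d_hat[k] = alphabet[int(np.argmax(scores))]
    return d_hat


S = np.array([[1.0, 0.8],
              [0.0, 0.6]])
A = np.diag([1.0, 2.0])
d_true = np.array([1.0, -1.0])
r = S @ A @ d_true                                   # noise-free observation
d_io = individually_optimal_detect(S, A, r, N0=0.5)
```

Each per-user score sums over the candidates sharing that user's symbol value, and the loop over all candidate vectors is what makes the cost exponential in K.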


2.6 Access Strategies

There are many different existing multiple-access strategies. Each strategy can be described by a specific choice of modulation waveforms, usually intended to have properties particularly suitable for the application of the single-user matched filter receiver. In this section we describe how to formulate several well-known multiple-access schemes within the generic linear multiple-access framework (2.2). We will focus on symbol synchronous discrete-time versions of each channel, and hence the various access strategies will be parameterized by their choices of the modulation matrix S.

2.6.1 Time and Frequency Division Multiple-Access

Perhaps the most obvious way to share a given bandwidth is to use time-division or frequency-division multiple-access. The basic idea is to allocate non-overlapping subsets of the entire time/bandwidth resource to each user.

Time division multiple-access (TDMA) allows each user to transmit using the entire available bandwidth, restricted however to non-overlapping time intervals. Assuming, without loss of generality, that each user transmits one complex baseband symbol per allocated time interval, the resulting modulation matrix for the K-user symbol synchronous TDMA channel is

$$\mathbf{S}_{\mathrm{TDMA}} = \mathbf{I}_K.$$

This is the most obvious example of an orthogonal modulation matrix, i.e. $\mathbf{R}_{\mathrm{TDMA}} = \mathbf{I}_K$. With reference to (2.19), this means that there is no multiple-access interference. In this case the single-user matched filter receiver is indeed optimal.

Frequency division multiple-access (FDMA) allows each user to transmit continuously in non-overlapping frequency bands. Assuming again without loss of generality that each user transmits a single symbol per allocated frequency band, the resulting modulation matrix for the symbol synchronous channel has elements

$$\left(\mathbf{S}_{\mathrm{FDMA}}\right)_{jk} = \frac{1}{\sqrt{K}} \exp\left(2\pi\sqrt{-1}\; jk/K\right).$$
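As a quick numerical check of the two modulation matrices just defined, the sketch below constructs them for an assumed K = 8 and verifies that both have correlation matrix S∗S equal to the identity. Indices j, k run from 1 to K, following the convention of the text.

```python
import numpy as np

K = 8                                          # assumed number of users

S_tdma = np.eye(K)

idx = np.arange(1, K + 1)                      # 1-based indices j, k
S_fdma = np.exp(2j * np.pi * np.outer(idx, idx) / K) / np.sqrt(K)

R_tdma = S_tdma.conj().T @ S_tdma
R_fdma = S_fdma.conj().T @ S_fdma              # equals the identity up to rounding
```

Since both correlation matrices are the identity, the interference sum in (2.19) vanishes for either scheme.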

This is in fact the Fourier transform matrix, F, and once again $\mathbf{R}_{\mathrm{FDMA}} = \mathbf{I}_K$, resulting in optimality of the correlation receiver. Time and frequency-division multiple-access are duals of each other via the Fourier transform, indeed (somewhat trivially)

$$\mathbf{S}_{\mathrm{FDMA}} = \mathbf{F}\, \mathbf{S}_{\mathrm{TDMA}}, \qquad \mathbf{S}_{\mathrm{TDMA}} = \mathbf{F}^* \mathbf{S}_{\mathrm{FDMA}}.$$

The TDMA and FDMA methods just described are two examples of orthogonal multiple-access, in which the modulation matrix is orthogonal, $\mathbf{S}^*\mathbf{S} = \mathbf{I}$. Dividing time or frequency between the users makes a certain amount of "intuitive" sense, especially for engineers who are familiar with time-frequency representations of signals. More generally however, what is going on is that there are $2WT$ signaling dimensions to be shared among K users and there are many more ways to do this.

2.6.2 Direct-Sequence Code Division Multiple Access

Direct-sequence code-division multiple-access (DS-CDMA) uses modulating waveforms $s_k(t)$ that are built from component waveforms $\varphi(t)$ called chip waveforms. These chip waveforms are of short duration compared to the length of the modulating waveforms themselves. We now specialize the linear multiple-access model to the case of chip-synchronous direct-sequence modulation. Let $\varphi(t)$ be a chip waveform satisfying

\begin{align}
\int \varphi(t)\, \varphi^*(t - jT_c)\, dt &= 0, \qquad 0 \neq j \in \mathbb{Z}, \tag{2.24} \\
\int |\varphi(t)|^2\, dt &= 1, \tag{2.25}
\end{align}

i.e. the chip waveform is unit energy and is orthogonal to any translation of itself by integer multiples of $T_c$, the chip period. We assume an integer number of chips L per symbol period, $T = LT_c$. For theoretical purposes, we may consider rectangular chip waveforms, with support $[0, T_c)$, which however results in an infinite bandwidth. In practice, band-limited pulses, such as pulses with a raised cosine spectrum, may be used. The modulating waveform for each user is composed of copies of the chip waveform,

$$s_k(t) = \sum_{j=0}^{nL} s_k[j]\, \varphi(t - jT_c), \tag{2.26}$$

where the integer j denotes the chip index and the chip-rate sequences $s_k[j]$ are the real or complex chip amplitudes. Note that we consider the case in which each user has the same chip waveform, and therefore these waveforms are not indexed by the user number k. The resulting modulating waveforms are made different through the multiplication of the chip waveforms by the chip amplitudes, which can be different from user to user.

Code-division multiple-access was motivated by spread-spectrum communications and typically, each user's modulation waveform $s_k(t)$ occupies a bandwidth considerably larger than that required by Shannon theory for transmission of the data alone, [83]. This is the case when $T_c \ll T$, or equivalently, $L \gg 1$. These sequences $s_k[j]$ are sometimes referred to as spreading sequences, signature sequences or spreading codes (we will however reserve the word code to mean forward error control codes). The elements of the spreading sequence $s_k[j]$ are usually chosen from a finite alphabet, e.g. $\{-1, +1\}/\sqrt{L}$.

The concept of spectrum spreading can be viewed more abstractly as the scenario when each user occupies only a small fraction of the total available signaling dimensions. In the context of multiple-access communications however, the entire signal space may be occupied, even though each user spans only a small subspace. These concepts are better understood from an information theoretic point of view, and in Chapter 3 we will discuss this issue in greater depth.

Multiplication of the symbol-rate data $d_k$ by the spreading sequence $s_k$ is called direct-sequence spreading and the resulting form of multiple-access is known as direct-sequence code-division multiple-access.

Example 2.5. Figure 2.10 shows an example modulating waveform built from chips, corresponding to a Nyquist chip waveform with roll-off 0.3 and the chip amplitude sequence $s_k[1], \ldots, s_k[10] = 1, 1, 1, 1, -1, -1, 1, 1, -1, 1$.

Fig. 2.10. Modulating waveform built from chip waveforms.

With the above definitions, each user's undelayed noise-free contribution is given by

$$x_k(t) = d_k[i]\, \sqrt{P_k} \sum_{j=0}^{nL} s_k[j]\, \varphi(t - jT_c),$$

where $i = \lceil t/T \rceil$. Under the convenient (albeit unrealistic) assumption of chip synchronism (but not symbol synchronism) across the users, we assume that the integers $\tau_k$ measure the delay in whole numbers of chip periods. Although this assumption would rarely hold true in practice, it is a model commonly used in the literature, and suffices for the purposes of the development of the material in this book. The output of the chip-synchronous direct-sequence modulated multiple-access channel is therefore given by

\begin{align*}
r(t) &= \sum_{k=1}^{K} x_k(t - \tau_k T_c) + z(t) \\
&= \sum_{k=1}^{K} d_k\left[\left\lceil \frac{t - \tau_k T_c}{T} \right\rceil\right] \sqrt{P_k} \sum_{j=0}^{nL} s_k[j]\, \varphi\left(t - (j + \tau_k) T_c\right) + z(t).
\end{align*}

Previously, we obtained a discrete-time model (2.11) from the continuous time model (2.2) using the sampling functions (2.3) as a complete orthonormal basis. Of course, any orthonormal basis will do, and in the case of chip synchronous DS-CDMA, it is convenient to use the set of delayed chip functions,

$$\phi_j(t) = \varphi(t - jT_c),$$

which according to (2.24) and (2.25) are orthonormal. Furthermore, since we have by design constructed the modulation waveforms as linear combinations (2.26) of $\phi_j(t)$, and the users are chip synchronous, they are a complete orthonormal basis. We can therefore obtain a discrete-time model for the chip synchronous channel by applying a chip matched filter (assuming that the receiver knows the chip boundaries and therefore the optimal sampling instants). The resulting sequence r[j] is a sufficient statistic for d. This set-up is shown in Figure 2.11.

Fig. 2.11. Chip match-filtered model.

From (2.26) we can see that the modulation waveform samples resulting from the chip matched filter are precisely the chip amplitudes $s_k[j]$. Application of the chip matched filter and sampling at a rate of $1/T_c$ samples per second results in the following discrete-time chip synchronous model,

$$r[j] = \sum_{k=1}^{K} d_k\left[\left\lceil \frac{j - \tau_k}{L} \right\rceil\right] \sqrt{P_k}\; s_k[j - \tau_k] + z[j], \tag{2.27}$$

where z[j] is a sampled Gaussian noise sequence. Note that this model only applies in the case of no inter-chip interference, which in addition to chip synchronism, requires that chip waveforms offset by an integer number of chip periods, such as $\varphi(t)$ and $\varphi(t + T_c)$, are orthogonal. This is true by our assumption (2.24). Note that this discrete-time CDMA channel model (2.27) is identical in form to the generic discrete-time asynchronous model (2.7) presented earlier. The only difference is in the definition of the modulation waveform samples. This discrete time model can be re-written in the familiar matrix-vector form (2.11),

$$\mathbf{r} = \mathbf{S}\mathbf{A}\mathbf{d} + \mathbf{z},$$

where the definitions of A, d and z are identical to those given in Section 2.3. The modulation matrix has exactly the same structure as (2.8), except that the entries are the chip amplitudes rather than the sampled waveforms (this is a somewhat pedantic difference, since here we are in fact using the chip matched filter to achieve the same goal as sampling).

If the delays $\tau_k$ are known to the transmitter, and do not change over time, the modulation matrix S can be chosen (via selection of the $s_k[j]$) to have desired properties. For example, it may be possible to choose an orthogonal S such that R = I, resulting in zero multiple-access interference and optimality of the single-user matched filter. We have already encountered two possible orthogonal matrices, I and F, corresponding to time- and frequency-division multiple-access. There is nothing particularly special about these choices, and it is clear that any orthogonal (or unitary) matrix can be used with the same result. Rather than assigning specific time or frequency dimensions to individual users, an arbitrary orthogonal modulation matrix assigns orthogonal signaling dimensions, from a total of $2WT$ dimensions. All of these orthogonal multiple-access strategies are equivalent via change of basis. One problem with orthogonal modulation however is maintaining synchronism.
A modulation matrix S designed for one set of user delays may no longer have the desired properties if the delays change. In particular, orthogonality may be hard to maintain over a wide range of possible delays. There are also many cases in which it is not reasonable to expect that the transmitter either knows the user delays, or that it has the option to adapt the modulation sequences (it may be desired that the modulation sequences are fixed for all time).
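The sensitivity of orthogonal sequences to delay can be illustrated with a small numerical sketch. It is not taken from the text: it assumes binary sequences from a Sylvester-type Hadamard construction, and it models a one-chip delay as a cyclic shift, which is a simplification of the asynchronous model. The helper names `hadamard`, `correlation` and `cyclic_delay` are hypothetical.

```python
def hadamard(n):
    """Sylvester construction of an n x n matrix of +/-1 (n a power of two)."""
    rows = [[1]]
    while len(rows) < n:
        rows = [r + r for r in rows] + [r + [-x for x in r] for r in rows]
    return rows


def correlation(a, b):
    """Normalized inner product of two equal-length chip sequences."""
    return sum(x * y for x, y in zip(a, b)) / len(a)


def cyclic_delay(seq, chips):
    """Delay a sequence by a whole number of chips (cyclic-shift assumption)."""
    chips %= len(seq)
    return seq[-chips:] + seq[:-chips] if chips else list(seq)


L = 8
seqs = hadamard(L)  # one length-L modulation sequence per user

# With aligned users every pair of distinct sequences is orthogonal.
aligned = [correlation(seqs[i], seqs[j]) for i in range(L) for j in range(i)]

# Delaying one user by a single chip can destroy orthogonality completely:
# here the delayed sequence of one user coincides with that of another user.
delayed = correlation(seqs[2], cyclic_delay(seqs[3], 1))
```

In this particular construction the delayed sequence is identical to another user's sequence, so the cross-correlation jumps from 0 to 1. Other sequence pairs are affected less severely.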


We shall distinguish between two broad classes of spreading sequences: periodic sequences, where the same sequence is used to modulate each data symbol for a given user (but different sequences for each user); and random sequences, in which the spreading sequence corresponding to each symbol is randomly selected.

Definition 2.1 (Periodic Spreading). The spreading sequence $s_k[j]$ is periodic if the same sequence is used for each symbol period,
$$s_k[j] = s_k[j + L].$$

Periodic spreading may be used when symbol synchronism can be enforced and the sequences may be designed to have desired properties, such as low cross-correlation.

Definition 2.2 (Random Spreading). The spreading sequence is random if each element $s_k[j]$ is selected i.i.d. from a fixed distribution with zero mean, unit variance, and finite higher moments.²

Note that this definition includes uniform selection of chips from an appropriately normalized binary (or any other finite cardinality) alphabet. It also includes such choices as Gaussian distributed chips. Random spreading may be approximated by using long pseudo-random sequences, such as m-sequences. The use of the term “random spreading” does not mean that the sequences are unknown to the receiver. In fact we usually assume that the receiver also knows the sequences. This is possible when the sequences are only really pseudo-random (i.e. the receiver can generate the same pseudo-random sequence as the transmitter). Random spreading is a useful model for systems in which the period of the spreading is much greater than the symbol duration. As we shall also see, random spreading is a useful theoretical tool for system analysis.

² This mild restriction on higher moments is a technical requirement for large-system capacity results.

2.6.3 Narrow Band Multiple-Access

Rather than assigning wide-band signaling waveforms, we can consider a system in which each user transmits using the same modulation waveform $s(t)$. In the symbol synchronous case the signal model is
$$r(t) = \sum_{k=1}^{K} d_k\!\left[\left\lfloor \frac{t}{T} \right\rfloor\right] \sqrt{P_k}\, s(t) + z(t), \tag{2.28}$$


and the corresponding discrete-time matrix model is
$$\mathbf{r}[i] = \mathbf{S}[i]\mathbf{A}[i]\mathbf{d}[i] + \mathbf{z}[i],$$
where the modulation matrix $\mathbf{S}[i]$ has identical columns. In that case, the correlation matrix $\mathbf{R}$ is the all-ones matrix,
$$\mathbf{y}[i] = \mathbf{1}\mathbf{A}[i]\mathbf{d}[i] + \mathbf{S}[i]^{*}\mathbf{z}[i],$$
and the output of each user’s matched filter is identical and is given by
$$y[i] = \sum_{k=1}^{K} \sqrt{P_k}\, d_k[i] + \tilde{z}[i]. \tag{2.29}$$
Note that we could have obtained (2.29) directly from (2.28) via matched filtering for $s(t)$.

2.6.4 Multiple Antenna Channels

In certain scenarios, such as in the presence of multipath propagation, it may be advantageous for the receiver and each transmitter to use multiple antennas. Let us develop an idealized channel model for a $K$ user system in which each user has $M$ transmit antennas, and the receiver has $N$ antennas. Extension to a different number of antennas for each transmitter is straightforward. Figure 2.12 shows a two-user example in which each user has two transmit antennas, and there are three receive antennas. The figure is simplified to concentrate on the multiple antenna aspect of the system.

Fig. 2.12. Multiple transmit and receive antennas.


To allow for the most general transmission strategies, let each user employ a different modulation waveform for each transmit antenna and transmit different data symbols over each antenna. To this end, let $s_k^{\mu}(t)$ and $d_k^{\mu}[i]$ be the respective modulation waveforms and data symbols for user $k = 1, 2, \ldots, K$, and antenna $\mu = 1, 2, \ldots, M$. Each user may also transmit using different symbol energies from each transmit antenna. Let $P_k^{\mu}$ denote the symbol energy for antenna $\mu$ of user $k$. We can therefore write the undelayed transmitted signals for antenna $\mu$ of user $k$ as
$$x_k^{\mu}(t) = d_k^{\mu}\!\left[\left\lfloor \frac{t}{T} \right\rfloor\right] \sqrt{P_k^{\mu}}\, s_k^{\mu}(t).$$
Let $h_k^{\nu\mu}$ be the channel gain between transmit antenna $\mu$ of user $k$ and receive antenna $\nu = 1, 2, \ldots, N$. Then the signal received at antenna $\nu$ is
$$\begin{aligned}
r^{\nu}(t) &= \sum_{k=1}^{K} \sum_{\mu=1}^{M} h_k^{\nu\mu}(t)\, x_k^{\mu}(t - \tau_k) + z^{\nu}(t) \\
&= \sum_{k=1}^{K} \sum_{\mu=1}^{M} h_k^{\nu\mu}(t)\, d_k^{\mu}\!\left[\left\lfloor \frac{t - \tau_k}{T} \right\rfloor\right] \sqrt{P_k^{\mu}}\, s_k^{\mu}(t - \tau_k) + z^{\nu}(t).
\end{aligned}$$

The received signal at each antenna shares the same formulation as a $KM$ user system in which each user has a single transmit antenna (consider the mapping $(k, \mu) \to K(\mu - 1) + k$). Sampling the output of each receive antenna leads to a discrete-time model which may be represented in matrix form. Under the assumption of symbol synchronism we can write
$$\mathbf{V}[i] = \mathbf{S}[i]\mathbf{A}\mathbf{D}[i]\mathbf{H}[i] + \mathbf{Z}[i], \tag{2.30}$$
where the various matrices in (2.30) are defined as follows. The $L \times N$ matrix $\mathbf{V}[i]$ contains the received samples for symbol period $i$. Element $(\mathbf{V})_{j\nu}$ of this matrix is sample $j$ from antenna $\nu$. Thus each column is the output of one antenna. The $L \times KM$ matrix $\mathbf{S}[i]$ is defined in a similar way to (2.12). Its columns are the sampled modulation waveforms for each transmit antenna of each user. More specifically, column $l$ is the sampled waveform for transmit antenna $\mu$ of user $k$, according to
$$k = \lceil l/M \rceil, \tag{2.31}$$
$$\mu = ((l - 1) \bmod M) + 1, \tag{2.32}$$

i.e. we get each transmit antenna of user 1 followed by each antenna of user 2 and so on. The $KM \times KM$ matrices $\mathbf{A}[i]$ and $\mathbf{D}[i]$ are both diagonal. The diagonal elements of $\mathbf{A}$ are the transmit amplitudes for each transmit antenna of each user. The diagonal elements of $\mathbf{D}$ are the transmitted symbols for each antenna of each user. The ordering is the same as that used for $\mathbf{S}$, i.e. subject to (2.31) and (2.32),
$$(\mathbf{D}[i])_{ll} = d_k^{\mu}[i], \qquad (\mathbf{A}[i])_{ll} = \sqrt{P_k^{\mu}[i]}.$$
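The column ordering of (2.31) and (2.32) is easy to get wrong by one, so a short sketch may help. It maps a column index $l$ to the pair $(k, \mu)$ and back, and fills the diagonals of $\mathbf{A}$ and $\mathbf{D}$ in that order. The function names and the numerical symbol energies and data symbols are illustrative assumptions, not values from the text.

```python
import math


def user_antenna(l, M):
    """Map column l = 1..K*M to (k, mu) as in (2.31) and (2.32)."""
    k = (l - 1) // M + 1   # equals ceil(l / M) for positive integers
    mu = (l - 1) % M + 1
    return k, mu


def column_index(k, mu, M):
    """Inverse map: antenna mu of user k occupies column M*(k-1) + mu."""
    return M * (k - 1) + mu


K, M = 2, 2
# Hypothetical symbol energies P[k, mu] and BPSK symbols d[k, mu].
P = {(1, 1): 4.0, (1, 2): 1.0, (2, 1): 9.0, (2, 2): 0.25}
d = {(1, 1): 1, (1, 2): -1, (2, 1): -1, (2, 2): 1}

order = [user_antenna(l, M) for l in range(1, K * M + 1)]
A_diag = [math.sqrt(P[km]) for km in order]  # amplitudes, diagonal of A
D_diag = [d[km] for km in order]             # data symbols, diagonal of D
```

With this ordering, all antennas of user 1 occupy the first $M$ columns, followed by those of user 2.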

Subject to the assumption that the channel gains $h_k^{\nu\mu}$ are constant for each symbol interval, but that they may change from symbol interval to symbol interval, the $KM \times N$ matrix $\mathbf{H}[i]$ contains the channel gains, ordered according to (2.31) and (2.32),
$$(\mathbf{H}[i])_{l\nu} = h_k^{\nu\mu}[i].$$
With the introduction of appropriate statistical models, the channel gains may model effects such as fast or slow flat fading, or they may be used to model phased arrays. Note that we may recover the single antenna model of (2.12) via $M = N = 1$ and $\mathbf{H}[i] = \mathbf{1}$, a $K \times 1$ all-ones vector. With these definitions, (2.30) is identical to (2.12) since $\mathbf{d}[i] = \mathbf{D}[i]\mathbf{1}$.

2.6.5 Cellular Networks

Cellular networks spatially re-use signaling dimensions (time and frequency) in order to increase the overall system capacity. The idea is that given enough spatial separation, the signal attenuation due to radio propagation will be sufficient to limit the effects of multiple-access interference from other cells operating in the same signal space.

An idealized model of a narrow band cellular multiple-access channel was developed in [170]. In this model, a user contributes to the received signal of a base station only if it is in that base station’s cell, or if it is in an immediately adjacent cell. This is shown schematically in Figure 2.13. This figure shows ten hexagonal cells, each with a base station represented by a circle. Two mobile terminals are shown. Each transmits an intended signal to its own base station (solid lines) and interference to adjacent base stations (dashed lines).

Fig. 2.13. Simplified cellular system.

Consider the up-link of a cellular network in which there are $L$ cells, each with a single base station. A total of $K$ users populate the network. Assume that each user transmits using identical modulation waveforms and that the system is symbol synchronous. Then (according to the narrow-band model developed in Section 2.6.3), the discrete-time signal model is
$$\mathbf{r}[i] = \mathbf{S}[i]\mathbf{A}\mathbf{d}[i] + \mathbf{z}[i].$$
In this model, $r_j[i]$, $j = 1, 2, \ldots, L$, is the signal received at base station $j$. According to Wyner’s idealized model, the matrix $\mathbf{S}$ defines the connection gains between user $k$ and base station $j$ according to
$$s_{jk} = \begin{cases} 1 & \text{User } k \text{ is in cell } j \\ \alpha & \text{User } k \text{ is in a cell adjacent to cell } j \\ 0 & \text{otherwise.} \end{cases}$$

Thus each user is received in its own cell with symbol energy $P$ and in adjacent cells with symbol energy $P\alpha^2$. In this simplified model, the modulation matrix acts as a connection matrix. This is a crude model for perfect power control within each cell, and a fixed path loss between cells. In a more general model, the gains $s_{jk}$ would be random variables and would include the effect of path loss, shadowing and fading.

Example 2.6. The cellular arrangement shown in Figure 2.13 is modeled by
$$\mathbf{S} = \begin{pmatrix} \alpha & 0 \\ \alpha & 0 \\ \alpha & \alpha \\ 1 & \alpha \\ 0 & \alpha \\ \alpha & 0 \\ \alpha & 1 \\ \alpha & \alpha \\ 0 & \alpha \\ 0 & \alpha \end{pmatrix}.$$
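As a rough illustration of how such a connection matrix can be generated, the sketch below builds $\mathbf{S}$ from a list of user locations and cell adjacencies. The adjacency sets are an assumption: they are read off the matrix of Example 2.6 (user 1 in cell 4, user 2 in cell 7) rather than from the figure itself, and the function name `wyner_matrix` and the value of $\alpha$ are hypothetical.

```python
def wyner_matrix(num_cells, user_cell, neighbours, alpha):
    """Connection matrix: S[j-1][k-1] is 1, alpha or 0 for cell j and user k."""
    S = []
    for j in range(1, num_cells + 1):
        row = []
        for cell in user_cell:
            if cell == j:
                row.append(1.0)      # user is in cell j
            elif j in neighbours[cell]:
                row.append(alpha)    # user is in a cell adjacent to cell j
            else:
                row.append(0.0)      # user is not received at base station j
        S.append(row)
    return S


# Assumed layout, inferred from the matrix of Example 2.6.
user_cell = [4, 7]
neighbours = {4: {1, 2, 3, 6, 7, 8}, 7: {3, 4, 5, 8, 9, 10}}
alpha = 0.5  # illustrative inter-cell gain

S = wyner_matrix(10, user_cell, neighbours, alpha)
```

Rows 4 and 7 of the result carry the unit gains, and every other non-zero entry equals $\alpha$.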


2.6.6 Satellite Spot-Beam Channels

Satellite spot-beam channels [89, 90] share a similar formulation to cellular networks. With reference to Figure 2.14, terrestrial user terminals communicate with a ground station via a satellite which employs a multi-beam antenna (usually implemented using a phased array). The satellite spatially re-uses time and frequency resources by using a spot-beam antenna. Each beam of the spot-beam antenna illuminates a geographical region and the antenna is designed to provide isolation between each beam. It is however difficult to achieve perfect isolation between beams, and multiple-access interference results.

In the cellular model, the average gain between each user and base station pair is determined largely by path loss considerations. In the spot-beam channel, the gain between each user and beam is determined by the antenna roll-off. Under the assumption of symbol synchronism and identical modulating waveforms for each user, the satellite spot-beam channel is the same as the cellular network described in Section 2.6.5. The modulation matrix now contains the gain from each mobile earth terminal to each antenna beam.

Fig. 2.14. Satellite spot beam up-link.


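Example 2.7. The seven-beam example shown in Figure 2.14 is modeled by
$$\mathbf{S} = \begin{pmatrix}
1 & \alpha & \alpha & \alpha & \alpha & \alpha & \alpha \\
\alpha & 1 & \alpha & 0 & 0 & 0 & \alpha \\
\alpha & \alpha & 1 & \alpha & 0 & 0 & 0 \\
\alpha & 0 & \alpha & 1 & \alpha & 0 & 0 \\
\alpha & 0 & 0 & \alpha & 1 & \alpha & 0 \\
\alpha & 0 & 0 & 0 & \alpha & 1 & \alpha \\
\alpha & \alpha & 0 & 0 & 0 & \alpha & 1
\end{pmatrix}$$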
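The structure of this matrix (one central beam coupled to every outer beam, and each outer beam coupled to its two ring neighbours) can be reproduced with a few lines of code. The sketch below is only an illustration: the beam numbering (beam 1 in the centre, beams 2 to 7 around it) is inferred from the matrix, and the name `spot_beam_matrix` and the value of $\alpha$ are hypothetical.

```python
def spot_beam_matrix(ring, alpha):
    """Gain matrix for one centre beam (index 0) and `ring` outer beams."""
    n = ring + 1
    S = [[0.0] * n for _ in range(n)]
    for i in range(n):
        S[i][i] = 1.0                    # each user is received in its own beam
    for b in range(1, n):
        S[0][b] = S[b][0] = alpha        # centre beam overlaps every outer beam
        nxt = b % ring + 1               # next beam around the ring
        S[b][nxt] = S[nxt][b] = alpha    # neighbouring outer beams overlap
    return S


alpha = 0.3  # illustrative inter-beam gain
S = spot_beam_matrix(6, alpha)
```

For six outer beams this reproduces the pattern of zeros and $\alpha$ entries in Example 2.7.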

2.7 Sequence Design

Most modulation sequence design is motivated by the use of sub-optimal detection strategies, in particular the correlation receiver described in Section 2.5.3. Conditioned on this choice of sub-optimal detection strategy, it is natural to try to optimize the performance of the receiver by carefully choosing the modulation sequences. To this end, one could try to choose sequences that minimize the average error probability or that minimize the total cross-correlation.

2.7.1 Orthogonal and Unitary Sequences

If $L \geq K$, it is always possible to find modulation matrices $\mathbf{S}$ that result in zero multiple-access interference. In the case of real matrices, this means $\mathbf{R} = \mathbf{S}^{t}\mathbf{S} = \mathbf{I}$ and $\mathbf{S}$ is called orthogonal. In the case of complex matrices, $\mathbf{R} = \mathbf{S}^{*}\mathbf{S} = \mathbf{I}$, and $\mathbf{S}$ is called unitary. Obviously orthogonal matrices are also unitary if we consider the real elements to be complex numbers with zero imaginary part.

Unitary matrices have the property that each column is orthogonal to every other column, and these columns are the modulating vectors for each user. Starting from a given vector, it is possible to find $K - 1$ other vectors while maintaining orthogonality, via a Gram-Schmidt process. In the case of a unitary $\mathbf{S}$, the matched-filter output is
$$\mathbf{y} = \mathbf{S}^{*}\mathbf{r} = \mathbf{S}^{*}\mathbf{S}\mathbf{d} + \mathbf{S}^{*}\mathbf{z} = \mathbf{d} + \tilde{\mathbf{z}}.$$
It is a property of the Gaussian density that it is invariant to unitary transformations (i.e. it is isotropic). This means that, unlike most transformations, $\tilde{\mathbf{z}}$ has the same distribution as $\mathbf{z}$. Therefore each user $k$ may be detected based only on its own matched filter output.

Some examples of unitary matrices that we have already encountered include $\mathbf{I}$, the identity matrix, and $\mathbf{F}$, the Fourier transform matrix. Use of these matrices results respectively in time- and frequency-division multiple-access.
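A small numerical check of the unitary case may be useful. The sketch below assumes a normalized $L \times L$ Fourier matrix as the modulation matrix, unit amplitudes and no noise, and verifies that the matched filter $\mathbf{S}^{*}\mathbf{r}$ returns the data symbols without any multiple-access interference. The function names and the data vector are illustrative assumptions.

```python
import cmath


def dft_matrix(L):
    """Normalized L x L Fourier matrix; its columns are orthonormal."""
    scale = L ** -0.5
    return [[scale * cmath.exp(-2j * cmath.pi * j * k / L) for k in range(L)]
            for j in range(L)]


def matched_filter(S, r):
    """y = S^* r: correlate the received vector with each column of S."""
    L, K = len(S), len(S[0])
    return [sum(S[j][k].conjugate() * r[j] for j in range(L)) for k in range(K)]


L = 4
F = dft_matrix(L)
d = [1, -1, -1, 1]                                             # one symbol per user
r = [sum(F[j][k] * d[k] for k in range(L)) for j in range(L)]  # noise-free r = S d
y = matched_filter(F, r)                                       # should equal d
```

Each entry of `y` matches the corresponding entry of `d` up to floating-point rounding, which is the interference-free behaviour described above.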


Although it is always possible to construct unitary matrices for $L \geq K$, it is not always possible to ensure that the elements of $\mathbf{S}$ are members of a given modulation alphabet. It may be required for instance that the elements are binary, $s_k[j] \in \{-1, +1\}$.

2.7.2 Hadamard Sequences

Let us now consider the problem of designing modulation sequences to optimize the performance of the correlation receiver. The total variance $\sigma_k^2$ of the multiple-access interference as seen by user $k$ is the sum of the squared entries of the $k$-th row (or column) of $\mathbf{R}$, minus the diagonal term,
$$\sigma_k^2 = \sum_{j=1}^{K} R_{jk}^2 - 1,$$
and a reasonable goal for sequence design is to find binary modulation sequences $\mathbf{S}$ that minimize the total squared cross-correlation
$$\sigma_{\mathrm{tot}}^2 = \sum_{k=1}^{K} \sigma_k^2,$$
subject to a fairness condition $\sigma_k^2 = \sigma_{\mathrm{tot}}^2 / K$.

Now if $K \leq L$, it is obvious that we can achieve $\sigma_{\mathrm{tot}}^2 = 0$ by using orthogonal modulation sequences, since in that case $\mathbf{R} = \mathbf{I}$. The correlation receiver is in fact optimal for orthogonal sequences. Binary matrices $\mathbf{S}$ with the property $\mathbf{S}^{*}\mathbf{S} = \mathbf{I}$ are called Hadamard matrices, and can only exist for $L = 1, 2$ or a multiple of 4. (It is not known whether they exist for every multiple of 4.)

For arbitrary given $K$ and $L$ (i.e. $K$ not necessarily smaller than $L$), how close to orthogonal can we get? The following theorem gives a bound on the total squared cross-correlation [159].

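Theorem 2.1 (Welch’s Bound). Let $\mathbf{s}_k \in \mathbb{C}^L$ and $\|\mathbf{s}_k\|^2 = 1$, $k = 1, 2, \ldots, K$. Then
$$\sum_{i=1}^{K} \sum_{j=1}^{K} R_{ij}^2 \geq \frac{K^2}{L}.$$
Equality is achieved if and only if $\mathbf{S}$ has orthogonal rows. Equality implies
$$\sum_{j=1}^{K} R_{ij}^2 = \frac{K}{L}.$$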


It has been shown in [85] that sequences that meet the Welch bound with equality are optimal if correlation detection is being used. We also see in Section 3.4.2 that such sequences are optimal from an information theoretic point of view.

Example 2.8 (Welch bound equality sequences). The following non-orthogonal sequence set achieves Welch’s bound (where for clarity, $+$ denotes $+1$ and $-$ denotes $-1$).
$$\mathbf{S} = \begin{pmatrix}
+ & + & + & + & + & + & + & + \\
+ & + & + & + & - & - & - & - \\
+ & + & - & - & + & + & - & - \\
+ & - & + & - & + & - & + & -
\end{pmatrix}$$

The requirement for equality in Welch’s bound is the orthogonality of the rows of S. Recall that the modulation sequences themselves form the columns of S. In the case K ≤ L, we can have orthogonality both of the rows, and the columns. If however K > L, the sequences themselves cannot be orthogonal, yet the correlator performance is optimized by having orthogonal rows.
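The claim of Example 2.8 can be checked numerically. The sketch below takes the $4 \times 8$ sign pattern of the example, scales each column to unit energy (an assumption needed to match the normalization $\|\mathbf{s}_k\| = 1$ of Theorem 2.1), and compares the total squared correlation with $K^2/L$. The helper name `gram` is hypothetical.

```python
rows = ["++++++++",
        "++++----",
        "++--++--",
        "+-+-+-+-"]
S = [[1 if c == "+" else -1 for c in row] for row in rows]
L, K = len(S), len(S[0])   # L = 4 chips per sequence, K = 8 users


def gram(S, L, K):
    """R = S^t S / L: correlations between unit-energy columns of S."""
    return [[sum(S[j][a] * S[j][b] for j in range(L)) / L for b in range(K)]
            for a in range(K)]


R = gram(S, L, K)
total = sum(R[a][b] ** 2 for a in range(K) for b in range(K))        # left side of the bound
per_user = [sum(R[a][b] ** 2 for b in range(K)) for a in range(K)]   # row sums of squares
welch = K * K / L                                                    # right side of the bound
```

Here `total` equals `welch` (both are 16), and every entry of `per_user` equals $K/L = 2$, so the fairness condition holds as well.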

3 Multiuser Information Theory

3.1 Introduction

In this chapter we consider some of the Shannon theoretic results and coding strategies for the multiple-access channel. The underlying assumption is that signals are encoded, and that no restrictions will be placed on the capability of the receiver processing. Information theory provides fundamental limits on data compression, reliable transmission and encryption for single user or point-to-point channels. In the context of multiple-user communications, there are many interesting and relevant questions that we could ask:

• What are the fundamental limits for multiple-user communications?
• What guidance can information theory give for system design?
• What is the impact of system features such as feedback, or synchronism?
• What modulation and/or coding techniques should we use?
• How should we deal with multiple-access interference from an engineering point of view?
• What is the cost of using sub-optimal reception techniques?

We will approach some of these questions from an information theoretic perspective, concentrating on the multiple-access channel. We begin in Section 3.2 by formalizing our definition of a multiple-access channel. From this basis, we introduce in Section 3.2.2 the capacity region, which is the multiple-user analog of the channel capacity. In Section 3.3 we consider some simple binary-input channels, namely the binary adder channel and the binary multiplier channel. These serve to illustrate some of the main ideas regarding the computation of the capacity region for discrete channels.

Multiple-access channels with additive Gaussian noise are developed in Section 3.4. Of particular interest is the direct-sequence code-division multiple-access channel, which is the subject of Section 3.4.2. We discuss the relationship between the selection of signature sequences and the channel capacity, and give an overview of large systems analysis, applicable to random signature sequences. This section also serves as an introduction to Chapters 4–6, which discuss the design of multiple-user receivers for DS-CDMA.

In Sections 3.5.1 and 3.5.2, we discuss the design of uniquely decodeable block and trellis codes, which are error-control codes specifically designed for the multiple-access channel. Another coding strategy for the multiple-access channel is superposition coding, discussed in Section 3.6. This approach allows the multiple-access channel to be viewed as a series of single-user channels, with no theoretical loss in channel capacity.

Various modifications of the channel assumptions have different effects upon the size and shape of the capacity region. In Section 3.7, we discuss the effect of feedback, and in Section 3.8 we discuss asynchronous transmission.

3.2 The Multiple-Access Channel

3.2.1 Probabilistic Channel Model

In Chapter 2, we developed signal-based mathematical models for the linear multiple-access channel. In order to obtain an information theoretic understanding of the multiple-access channel, we must first define an appropriate probabilistic model (the underlying probabilistic models were implicit in Chapter 2).

Let us begin by introducing some concise notation to describe many user systems. For a $K$ user system, let $\mathcal{K} = \{1, 2, \ldots, K\}$ be the set of user index numbers. For an arbitrary subset of users, $\mathcal{S} \subseteq \mathcal{K}$, the subscript $\mathcal{S}$ denotes objects indexed by the elements of $\mathcal{S}$. For example, $X_{\mathcal{S}} = \{X_k, k \in \mathcal{S}\}$ are those users indexed by the elements of $\mathcal{S}$. Likewise, letting $R_k$ be the information rate of user $k$, $R_{\mathcal{S}} = \{R_k, k \in \mathcal{S}\}$. For objects with a suitably defined addition function (rates, powers), let the functional notation
$$R(\mathcal{S}) = \sum_{k \in \mathcal{S}} R_k$$
be the sum over those users indexed by $\mathcal{S}$. We are now ready to define the discrete memoryless multiple-access channel.


Definition 3.1 (Discrete Memoryless Multiple-Access Channel). A discrete memoryless multiple-access channel (DMMAC) consists of $K$ finite input alphabets, $\mathcal{X}_1, \mathcal{X}_2, \ldots, \mathcal{X}_K$ (denoted $\mathcal{X}_{\mathcal{K}}$ in the shorthand notation introduced above), a set of transition probabilities, $p(y \mid x_{\mathcal{K}}) = p(y \mid x_1, x_2, \ldots, x_K)$ and an output alphabet $\mathcal{Y}$. We shall refer to such a channel by the triple $(\mathcal{X}_{\mathcal{K}};\, p(y \mid x_{\mathcal{K}});\, \mathcal{Y})$, which emphasizes the entities that completely define the channel.

Figure 3.1 shows a schematic representation of a two-user multiple-access channel. Transmission over the DMMAC takes place as follows. Time is discrete, requiring symbol transmissions to be aligned to common boundaries. At each symbol interval, every user $k = 1, 2, \ldots, K$ transmits a symbol $x_k$, drawn from $\mathcal{X}_k$ according to a source probability distribution $p_k(x_k)$. The received symbol, $y \in \mathcal{Y}$, is observed according to the transition probabilities $p(y \mid x_{\mathcal{K}})$.

Fig. 3.1. Two-user multiple-access channel.

Although we have only formally defined a discrete memoryless multiple-access channel, it is straightforward to extend this definition to some other types of channel. For example, we can model the discrete-input continuous-output MAC by allowing the received symbols to be real-valued, $\mathcal{Y} \subseteq \mathbb{R}$. Accordingly, the conditional transition probabilities must now be density, rather than mass functions. Similarly, continuous-input continuous-output channels (usually subject to input power constraints) allow the sources to take values from the reals, $\mathcal{X}_{\mathcal{K}} \subseteq \mathbb{R}^K$, according to some density function $p_k(x_k)$, yielding now a joint probability density, $p(y \mid x_{\mathcal{K}})$.

Although we have defined the individual users’ (marginal) source probabilities, we have not mentioned any statistical dependence between users. The form of the joint source distribution $p(x_{\mathcal{K}})$ depends on the degree of cooperation allowed between the users. If the users cannot cooperate, $p(x_{\mathcal{K}})$ is restricted to be a product distribution,
$$p(x_{\mathcal{K}}) = \prod_{k=1}^{K} p_k(x_k).$$

Such absence of cooperation between users can be caused for example by physical separation of the users, who have no direct communication links with each other. This is the case in most mobile radio networks. Alternatively, if full cooperation between users is allowed (which implies the existence of a communication link between the users), there is no restriction on the joint distribution $p(x_{\mathcal{K}})$. Such full cooperation of sources allows the complete channel resource to be used as if by a single “super-source” with alphabet $\mathcal{X}_1 \times \mathcal{X}_2 \times \cdots \times \mathcal{X}_K$, and can therefore be used as a benchmark for systems with independent users.

Note that our definition of a DMMAC did not include the source distributions. The system designer is typically free to choose the most advantageous set of source distributions (for example to maximize the rate of transmission). A discrete memoryless multiple-access channel may be schematically represented in a fashion similar to single-user channels.

Example 3.1. Figure 3.2 shows the transition probability diagram for an example two-user channel with $\mathcal{X}_1 = \mathcal{X}_2 = \{0, 1\}$. Each user transmits independently from uniformly distributed binary alphabets. The arrows on the graph show the probability of receiving a particular $y \in \mathcal{Y} = \{0, 1, 2\}$, corresponding to a matrix of transition probabilities (the outputs index the columns)
$$\begin{bmatrix} 0.9 & 0.1 & 0 \\ 0.05 & 0.8 & 0.15 \\ 0 & 0.3 & 0.7 \\ 0 & 0 & 1 \end{bmatrix}.$$
Note that when the users transmit independently, the resulting multiple-access channel model is very similar to that for a single-user discrete channel with memory, e.g. an inter-symbol interference (ISI) channel. Instead of the source labels referring to the current and previous transmitted bit (in an ISI channel with memory one), they now refer to user 1 and user 2.

3.2.2 The Capacity Region

The capacity region is the multiple-user equivalent of the channel capacity for a single-user system.
In a single-user system, there is a single number, $C$, which is the channel capacity.¹ The channel capacity is the rate of transmission below which there exist encoders and decoders that allow arbitrarily low probability of decoding error. In a multiple-user system with $K$ users, there are $K$ possibly different rates, one for each user, and the capacity region is the set of allowable rate combinations, which we represent geometrically in $K$-dimensional Euclidean space as rate vectors, $\mathbf{R} = (R_1, R_2, \ldots, R_K)$. These rate combinations may be used such that arbitrarily low error probabilities can be theoretically achieved.

¹ Provided the channel has a capacity.

Fig. 3.2. Example of a discrete memoryless multiple-access channel.

First, we must define the concept of a channel code, and the probability of error, before any capacity theorem can make sense. With reference to Figure 3.3, a code for a multiple-access channel consists of $K$ encoding functions, one for each source. Since the sources are independent, we can describe the multiple-access code in terms of the individual codes for each user (rather than a single joint encoder). Let $\mathcal{M}_k = \{1, 2, \ldots, M_k\}$ represent the set of possible messages that user $k$ may wish to transmit. Thus the message for user $k$ is a random variable $U_k \in \mathcal{M}_k$, and the encoding function $f_k$ for user $k$ maps from this set of $M_k$ messages onto an $n$-dimensional code vector, whose elements are drawn from the source alphabet $\mathcal{X}_k$:
$$f_k : \mathcal{M}_k \to \mathcal{X}_k^n.$$
The parameter $n$ is called the codeword length. Assuming symbol synchronous transmission of the users, we can represent the entire process as a multiple-access encoder $f_{\mathcal{K}}$ mapping as follows
$$f_{\mathcal{K}} : \mathcal{M}_1 \times \mathcal{M}_2 \times \cdots \times \mathcal{M}_K \to \mathcal{X}_1^n \times \mathcal{X}_2^n \times \cdots \times \mathcal{X}_K^n,$$
where in this instance, $\times$ denotes the Cartesian product. We shall abbreviate this mapping by $f_{\mathcal{K}} : \mathcal{M}_{\mathcal{K}} \to \mathcal{X}_{\mathcal{K}}^n$. For user $k$, define
$$R_k = \frac{1}{n} \log_2 M_k$$
to be the code rate, in bits per symbol.


The decoding function for the multiple-access code is a mapping
$$g : \mathcal{Y}^n \to \mathcal{M}_{\mathcal{K}}$$
from the set of possible received vectors onto the message sets of the individual users. This definition implies the use of receiver cooperation, i.e. joint decoding. The $K$ messages output by the decoder are random variables, $\hat{U}_k$.

Fig. 3.3. Coded multiple-access system.

A decoding error is said to have occurred if the message set at the output of the decoder does not agree completely with the input message set. In other words an error occurs if the decoded message for any of the users is in error. The error probability conditioned on a particular set of transmitted messages $m_{\mathcal{K}} = m_1, m_2, \ldots, m_K$ is therefore
$$\Pr(\text{Error} \mid U_{\mathcal{K}} = m_{\mathcal{K}}) = \Pr\left(g(Y^n) \neq m_{\mathcal{K}}\right).$$
Assuming equi-probable transmission of messages, the average error probability for a multiple-access code is given by
$$\bar{P}_e = \frac{1}{2^{nR(\mathcal{K})}} \sum_{m_{\mathcal{K}} \in \mathcal{M}_{\mathcal{K}}} \Pr\left(g(Y^n) \neq m_{\mathcal{K}}\right).$$

Definition 3.2 (Achievable Rate Vector). A particular rate vector $\mathbf{R} = (R_1, \ldots, R_K)$ is achievable if, for any $\epsilon > 0$, there exists a multiple-access code with rate vector $\mathbf{R}$ such that $\bar{P}_e \leq \epsilon$. Otherwise it is said to be not achievable.


Now that we have formalized our description of the multiple-access system, we can present the theorem which describes the entire set of achievable rates. We describe the capacity region in terms of the achievable rate regions for fixed source distributions.

Theorem 3.1 (Achievable Rate Region). Consider the multiple-access channel $(\mathcal{X}_{\mathcal{K}};\, p(y \mid x_{\mathcal{K}});\, \mathcal{Y})$. For a fixed product distribution on the channel input symbols,
$$\pi(x_{\mathcal{K}}) = \prod_{k} p_k(x_k),$$
every rate vector $\mathbf{R}$ in the set
$$\mathcal{R}\left[\pi(x_{\mathcal{K}}), p(y \mid x_{\mathcal{K}})\right] \triangleq \left\{ \mathbf{R} : R(\mathcal{S}) \leq I\left(X_{\mathcal{S}}; Y \mid X_{\mathcal{K} \setminus \mathcal{S}}\right),\ \forall \mathcal{S} \subseteq \mathcal{K} \right\} \tag{3.1}$$
is achievable in the sense of Definition 3.2.

Polytopes which are defined by a system of constraints such as (3.1) are known as bounded polymatroids [138]. The following example writes out the achievable rate region for the two-user case.

Example 3.2 (Two-user achievable rate region). Consider a two-user DMMAC. For a fixed input distribution $\pi(x_1, x_2)$ the following region is achievable.
$$\mathcal{R}\left[p_1(x_1)\, p_2(x_2), p(y \mid x_{\mathcal{K}})\right] = \left\{ (R_1, R_2) : \begin{array}{rcl} R_1 & \leq & I(X_1; Y \mid X_2) \\ R_2 & \leq & I(X_2; Y \mid X_1) \\ R_1 + R_2 & \leq & I(X_1, X_2; Y) \end{array} \right\}$$
This region is shown in Figure 3.4.
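To make the three bounds concrete, they can be evaluated numerically for the channel of Example 3.1. The sketch below is an illustration under stated assumptions: the rows of the transition matrix are taken to correspond to the input pairs $(0,0)$, $(0,1)$, $(1,0)$, $(1,1)$ in that order, both users use uniform inputs, and the function names are hypothetical. No numerical values for the bounds are quoted here since the text does not give them.

```python
from math import log2

# p(y | x1, x2) for y in {0, 1, 2}, rows of the matrix in Example 3.1
# (assumed input ordering: (0,0), (0,1), (1,0), (1,1)).
P = {(0, 0): [0.9, 0.1, 0.0],
     (0, 1): [0.05, 0.8, 0.15],
     (1, 0): [0.0, 0.3, 0.7],
     (1, 1): [0.0, 0.0, 1.0]}


def entropy(dist):
    """Entropy in bits of a probability vector."""
    return -sum(p * log2(p) for p in dist if p > 0)


def mix(dists, weights):
    """Weighted mixture of output distributions."""
    return [sum(w * d[y] for d, w in zip(dists, weights)) for y in range(3)]


def region_bounds(p1, p2):
    """Bounds of Example 3.2 for independent inputs with pmfs p1 and p2."""
    pairs = [(a, b) for a in (0, 1) for b in (0, 1)]
    h_y_x1x2 = sum(p1[a] * p2[b] * entropy(P[a, b]) for a, b in pairs)
    h_y_x2 = sum(p2[b] * entropy(mix([P[0, b], P[1, b]], p1)) for b in (0, 1))
    h_y_x1 = sum(p1[a] * entropy(mix([P[a, 0], P[a, 1]], p2)) for a in (0, 1))
    p_y = mix([P[a, b] for a, b in pairs], [p1[a] * p2[b] for a, b in pairs])
    return (h_y_x2 - h_y_x1x2,        # I(X1; Y | X2)
            h_y_x1 - h_y_x1x2,        # I(X2; Y | X1)
            entropy(p_y) - h_y_x1x2,  # I(X1, X2; Y)
            p_y)


r1_max, r2_max, sum_max, p_y = region_bounds([0.5, 0.5], [0.5, 0.5])
```

The returned output distribution `p_y` reproduces the values $p(Y)$ marked in Figure 3.2 (0.2375 and 0.4625 for outputs 0 and 2). For independent inputs the sum bound never exceeds the sum of the two individual bounds, which is what gives the region its pentagonal shape.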

Example 3.3 (Three-user achievable rate region). Consider a three-user DMMAC. For a fixed input distribution $\pi(x_1, x_2, x_3)$ the following region is achievable.
$$\mathcal{R}\left[\pi(x_{\mathcal{K}}), p(y \mid x_{\mathcal{K}})\right] = \left\{ (R_1, R_2, R_3) : \begin{array}{rcl} R_1 & \leq & I(X_1; Y \mid X_2, X_3) \\ R_2 & \leq & I(X_2; Y \mid X_1, X_3) \\ R_3 & \leq & I(X_3; Y \mid X_1, X_2) \\ R_1 + R_2 & \leq & I(X_1, X_2; Y \mid X_3) \\ R_1 + R_3 & \leq & I(X_1, X_3; Y \mid X_2) \\ R_2 + R_3 & \leq & I(X_2, X_3; Y \mid X_1) \\ R_1 + R_2 + R_3 & \leq & I(X_1, X_2, X_3; Y) \end{array} \right\}$$
This region is shown in Figure 3.5.

Fig. 3.4. Two-user achievable rate region.

Fig. 3.5. Three-user achievable rate region.


In general there are $2^K - 1$ rate constraints, one for each non-empty subset of users. Note that in some cases, there may be a smaller number of effective constraints, since some inequalities may be implied by others.

The system designer can arbitrarily select the source product distribution $\pi(\cdot)$ and hence the union of achievable regions (over the set of all source product distributions) is achievable. Furthermore, suppose $\mathbf{R}_1$ and $\mathbf{R}_2$ are both achievable rate vectors. Every point on the line connecting $\mathbf{R}_1$ and $\mathbf{R}_2$ is also achievable, by using the codebook corresponding to $\mathbf{R}_1$ for $\lambda n$ symbols and that corresponding to $\mathbf{R}_2$ for the remaining $(1 - \lambda)n$ symbols, $0 \leq \lambda \leq 1$. This process is known as time-sharing and implies that the convex hull² of an achievable region is achievable. A common time reference³ is however required so the users know when to change codebooks.

Theorem 3.2 (Multiple-Access Capacity Region). Assuming the availability of a common time reference to all users, the capacity region is given by the closure of the convex hull (denoted $\mathrm{co}$) of the union of all achievable rate regions over the entire family of product distributions on the sources:
$$\mathcal{C}\left[p(y \mid x_{\mathcal{K}})\right] = \mathrm{co} \bigcup_{\pi(x_{\mathcal{K}})} \mathcal{R}\left[\pi(x_{\mathcal{K}}), p(y \mid x_{\mathcal{K}})\right]. \tag{3.2}$$

The capacity region was first determined by [2], [129] and [76]. More accessible proofs of the multiple-access channel coding theorem may be found in [51], using the concept of jointly typical sequences (see also [24]), and in [50], using random coding arguments.

The DMMAC is rather special in that it admits a single-letter description of its capacity region, i.e. it can be expressed in terms of quantities such as $I(X; Y)$, rather than limiting expressions⁴ such as
$$\lim_{n \to \infty} I\left(X[1], X[2], \ldots, X[n];\, Y[1], Y[2], \ldots, Y[n]\right).$$
There are not many other examples of multi-terminal channels in which the general solution is single-letter. Although single-letter achievable regions may be easily constructed for most multi-terminal channels, it appears difficult in general to obtain single-letter converse theorems.

² The convex hull (convex cover) of a set of points, $A$, is defined in [32] as the set of points which is the intersection of all the convex sets that contain $A$.
³ This qualification shall be examined in more detail in Section 3.8.
⁴ It is possible to obtain “single-letter” expressions at the expense of unbounded increase in the source cardinality, but this is cheating.


3.3 Binary-Input Channels

In order to illustrate the concepts introduced above, let us now consider some simple examples, concentrating on binary-input channels. In particular, we shall consider the binary adder channel and the binary multiplier channel.

3.3.1 Binary Adder Channel

The binary adder channel is a $K$-user DMMAC in which each user transmits from a binary alphabet $\mathcal{X}_k = \{0, 1\}$, $k = 1, \ldots, K$, and the output is given by the real addition
$$Y = \sum_{k=1}^{K} X_k.$$

The channel output will therefore be a member of $\mathcal{Y} = \{0, 1, \ldots, K\}$. What are the allowable rates of transmission for the binary adder channel? A simple upper bound based on the cardinality of $\mathcal{Y}$ is $R(\mathcal{K}) \le \log(K + 1)$. We can however say a lot more than this.

Two Users

First, we shall find the capacity region for the two-user case. Let user 1 and user 2 transmit a 0 with probabilities $p_1(0) = p$ and $p_2(0) = q$ respectively. The transition diagram of the channel is shown in Figure 3.6.

Fig. 3.6. Two-user binary adder channel. [Transition diagram. Input pairs $(x_1, x_2)$ with probabilities $p_1(x_1) p_2(x_2)$: $(0, 0)$: $pq$; $(0, 1)$: $p(1-q)$; $(1, 0)$: $(1-p)q$; $(1, 1)$: $(1-p)(1-q)$. Output probabilities $p_Y(y)$: $y = 0$: $pq$; $y = 1$: $p + q - 2pq$; $y = 2$: $(1-p)(1-q)$.]

By Theorem 3.1, the achievable rate region for a given p and q is given by the rate vectors $(R_1, R_2)$ which satisfy

\begin{align}
R_1 &\le I(X_1; Y \mid X_2) \tag{3.3} \\
R_2 &\le I(X_2; Y \mid X_1) \tag{3.4} \\
R_1 + R_2 &\le I(X_1, X_2; Y) \tag{3.5}
\end{align}

Consider the first inequality (3.3).

\[ I(X_1; Y \mid X_2) = H(X_1 \mid X_2) - H(X_1 \mid X_2, Y) = H(X_1) = h(p), \]

where $h(p)$ is the binary entropy function. The second equality follows from the independence of the two users (no transmitter cooperation), and the fact that if we know both Y and $X_2$, we can always determine $X_1$. Similarly, (3.4) is found to be

\[ I(X_2; Y \mid X_1) = h(q). \]

The sum constraint (3.5) is determined as follows:

\begin{align*}
I(X_1, X_2; Y) &= H(Y) - H(Y \mid X_1, X_2) = H(Y) \\
&= -pq \log pq - (1-p)(1-q) \log (1-p)(1-q) - (p + q - 2pq) \log (p + q - 2pq),
\end{align*}

since the two users' inputs uniquely determine the output.

It is now necessary to take the closure of the convex hull over the union of all such regions, where $0 \le p, q \le 1$. Fortunately, there is a trick to be employed, but for the moment, let us see what this statement means. Figure 3.7 shows an example. The region bounded by the thick line is due to p = 0.1, q = 0.6. The second region, bounded with a thin line, is due to p = 0.5, q = 0.8. The shaded region shows the additional rates found by taking the convex hull over these two regions. In principle, one continues this way for all p, q combinations. It is not hard to see that this process would be rather tedious, since p and q have uncountable support.

Instead, note that for p = q = 0.5, we obtain the maxima $h(p) = h(q) = 1.0$. Fortunately, it is precisely these distributions which also maximize the third constraint, giving $\max_{p,q} H(Y) = 1.5$. Since it is possible to simultaneously maximize all constraints, the resulting region must contain all other regions. The resulting capacity region is therefore

\begin{align*}
R_1 &< 1 \\
R_2 &< 1 \\
R_1 + R_2 &< 1.5.
\end{align*}

Fig. 3.7. Convex hull of two achievable rate regions for the two-user binary adder channel. [Plot of $R_2$ against $R_1$, both axes from 0 to 1.0, showing the regions for $(p, q) = (0.1, 0.6)$ and $(p, q) = (0.5, 0.8)$.]

This region is a pentagon and is shown bounded by the thick line in Figure 3.8.
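The three constraints (3.3) to (3.5) for the two-user binary adder channel are easy to evaluate numerically. The following is a minimal sketch, not part of the original text: the function names are illustrative, and the grid search is only a sanity check of the claim that p = q = 0.5 maximizes all three constraints at once.

```python
import math

def h(p):
    """Binary entropy function in bits; h(0) = h(1) = 0 by convention."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def entropy_bits(probs):
    """Entropy in bits of a probability vector, skipping zero-mass entries."""
    return -sum(x * math.log2(x) for x in probs if x > 0)

def bac_constraints(p, q):
    """Right-hand sides of (3.3), (3.4), (3.5) for P(X1 = 0) = p, P(X2 = 0) = q."""
    p_y = [p * q, p + q - 2 * p * q, (1 - p) * (1 - q)]
    return h(p), h(q), entropy_bits(p_y)

def largest_constraints(steps=100):
    """Grid search over (p, q) for the largest value of each constraint."""
    best = [0.0, 0.0, 0.0]
    for i in range(steps + 1):
        for j in range(steps + 1):
            values = bac_constraints(i / steps, j / steps)
            best = [max(b, v) for b, v in zip(best, values)]
    return tuple(best)

if __name__ == "__main__":
    print(bac_constraints(0.1, 0.6))   # one of the regions in Figure 3.7
    print(bac_constraints(0.5, 0.5))   # the maximizing input distributions
    print(largest_constraints())
```

On a grid that contains p = q = 0.5, the largest values found should agree with the pentagon described above (single-user limits of 1 bit and a sum-rate limit of 1.5 bits).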

Fig. 3.8. Capacity region of the two-user binary adder channel. [Plot of $R_2$ against $R_1$, both axes from 0 to 1.5, showing the regions labeled "No cooperation", "Joint decoding" (with corner points A and B) and "Full cooperation".]

Single-user Decoding and Time Sharing

Two other regions are shown on Figure 3.8, corresponding to different transmission and decoding strategies. The triangle defined by $R_1 + R_2 < 1$ and labeled no cooperation is the set of rates achievable with single-user decoding, in which the decoder for each user operates independently and treats the unwanted user as noise. This region is sometimes referred to as the time-sharing region, since it may be achieved by time sharing the channel between the two users. To achieve a rate point $R_1 = \lambda$, $R_2 = 1 - \lambda$ via time sharing, the input distribution p = 1/2, q = 0 is used λ of the time, and the remaining 1 − λ of the time the input distribution p = 0, q = 1/2 is used. Note that such time-sharing implies a degree of cooperation between the users, namely that they have access to a common time reference.

It is perhaps less well-known that time sharing is not really necessary to achieve points with $R_1 + R_2 = 1$. If each user is decoded independently and treats the other user as noise, the rate constraints are

\begin{align*}
R_1 &\le H(Y) - H(Y \mid X_1) = H(Y) - h(q) \\
R_2 &\le H(Y) - H(Y \mid X_2) = H(Y) - h(p).
\end{align*}

Setting q = 1/2 results in

\begin{align*}
R_1 &= \tfrac{1}{2} h(p) \le \tfrac{1}{2} \\
R_2 &= 1 - \tfrac{1}{2} h(p) \ge \tfrac{1}{2},
\end{align*}

and the portion of the line $R_1 + R_2 = 1$ from R = (0, 1) to R = (1/2, 1/2) may be achieved by varying p. Alternatively, setting p = 1/2 results in

\begin{align*}
R_1 &= 1 - \tfrac{1}{2} h(q) \ge \tfrac{1}{2} \\
R_2 &= \tfrac{1}{2} h(q) \le \tfrac{1}{2},
\end{align*}

which yields the remaining portion of the line $R_1 + R_2 = 1$ from R = (1/2, 1/2) to R = (1, 0).

The full cooperation region is bounded by the sum rate $R_1 + R_2 = \log_2 3$, which is achieved if the users cooperate in their transmissions to ensure a uniform distribution over the three output symbols. This region is outside the multiple-access channel capacity region, because without cooperating in their transmissions, the required uniform distribution of output symbols cannot be obtained.

Cancellation and the Corner Points

It is worthwhile discussing the corner points of the joint decoding capacity region, marked A and B on the figure. Consider the first of these points, A, which is at $(I(X_1; Y \mid X_2), I(X_2; Y)) = (1, 0.5)$. The form of the expression for this point is suggestive of a method for transmission on the channel, namely interference cancellation. Let user 2 transmit at rate $R_2 = I(X_2; Y)$, which is the mutual information between user 2 and the output, treating user 1 as noise. Recall that p = q = 0.5, hence user 2 sees the single-user channel shown in Figure 3.9. This is the binary erasure channel, with erasure probability δ = 0.5.

Fig. 3.9. Channel as seen by user two. [Transition diagram from inputs {0, 1} to outputs {0, 1, 2}, all transitions with probability 1/2.]

The capacity of this channel is known to be 1 − δ = 0.5 [24, p. 187], hence user 2 may indeed transmit with arbitrarily low error probability (using a separate single-user decoder) at 0.5 bits/channel use. Assuming that we have decoded user 2 error free, user 1 may transmit at $R_1 = I(X_1; Y \mid X_2) = 1$ bit per channel use, which is the maximal mutual information between user 1 and the output, given complete knowledge of user 2. Thus we can achieve the point A, using only single-user coding techniques, combined with interference cancellation. Point B may be achieved in a similar way, decoding user 1 first. Any point on the line connecting A and B may be achieved by time-sharing between these two schemes.

K > 2 Users

For the general case of K > 2 users, the capacity region is given [18, 19] as the closure of the convex hull of the union of the following regions:

\[ \left\{ (R_1, R_2, \ldots, R_K) : 0 \le R(\mathcal{S}) \le \sum_{k=0}^{|\mathcal{S}|} \frac{\binom{|\mathcal{S}|}{k}}{2^{|\mathcal{S}|}} \log_2 \frac{2^{|\mathcal{S}|}}{\binom{|\mathcal{S}|}{k}}, \ \forall \mathcal{S} \subseteq \mathcal{K} \right\}. \]


Often, we are interested in the maximum total transmission rate, or sum capacity. A good approximation to the sum rate constraint, $R(\mathcal{K}) \le C_{\text{sum}}$, for this channel is [19, 165]

\[ C_{\text{sum}} \approx \frac{1}{2} \log_2 \frac{\pi e K}{2} \ \text{bits}. \tag{3.6} \]

In fact the following bounds show the asymptotic tightness of this approximation [18]:

\[ \frac{1}{2} \log_2 \frac{\pi K}{2} \le \max R(\mathcal{K}) \le \begin{cases} \frac{1}{2} \log_2 \frac{\pi e K}{2} & K \text{ even}, \\ \frac{1}{2} \log_2 \frac{\pi e (K+1)}{2} & K \text{ odd}. \end{cases} \]

From (3.6), the total throughput is proportional to log K. As the system grows larger, the sum capacity increases, with individual users transmitting less. It is interesting to note that

\[ \lim_{K \to \infty} \frac{\frac{1}{2} \log_2 \frac{\pi e K}{2}}{\log_2 (K + 1)} = \frac{1}{2}. \]

In the limit, the lack of transmitter cooperation reduces the available sum capacity by a factor of two. As the number of users grows, the distribution of the channel output is becoming Gaussian (for independent users), rather than uniform (with transmitter cooperation).

3.3.2 Binary Multiplier Channel

The previous example showed how joint detection could be used to increase the region of achievable rates (and how full cooperation increases this region even further). In the next example, we see that this is not always the case. The two-user binary multiplier channel performs the real multiplication $Y = X_1 X_2$, where $X_1, X_2 \in \{0, 1\}$. Hence $Y \in \{0, 1\}$. The cardinality of the output alphabet places a restriction on the maximum sum rate, $R_1 + R_2 \le 1$. The capacity region is shown in Figure 3.10. In this case, there is nothing to be gained over time-sharing of the channel, or even by complete transmitter cooperation.
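Returning briefly to the K-user binary adder channel of Section 3.3.1, the quality of approximation (3.6) can be checked numerically. The sketch below assumes that every user transmits equiprobable bits, so that the output is Binomial(K, 1/2) and the sum rate equals its entropy; the function names are illustrative and not from the text.

```python
import math

def binomial_entropy_bits(K):
    """Entropy in bits of Y = X_1 + ... + X_K for independent, equiprobable binary X_k."""
    total = 2 ** K
    ent = 0.0
    for k in range(K + 1):
        pk = math.comb(K, k) / total
        if pk > 0:
            ent -= pk * math.log2(pk)
    return ent

def gaussian_approx_bits(K):
    """Approximation (3.6): 0.5 * log2(pi * e * K / 2)."""
    return 0.5 * math.log2(math.pi * math.e * K / 2)

def cooperative_bound_bits(K):
    """Cardinality bound log2(K + 1), attainable only with transmitter cooperation."""
    return math.log2(K + 1)

if __name__ == "__main__":
    for K in (2, 10, 100):
        print(K, binomial_entropy_bits(K), gaussian_approx_bits(K),
              cooperative_bound_bits(K))
```

For K = 2 the entropy is the 1.5 bits found for the two-user case, and the gap between the entropy and (3.6) shrinks as K grows.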

Fig. 3.10. Capacity region of the two-user binary multiplier channel. [Plot of $R_2$ against $R_1$, both axes from 0 to 1.0.]

3.4 Gaussian Multiple-Access Channels

3.4.1 Scalar Gaussian Multiple-Access Channel

The Gaussian multiple-access channel (GMAC) differs from the channels examined so far, in that each user transmits from an infinite alphabet.

Definition 3.3 (Gaussian Multiple-Access Channel). The Gaussian multiple-access channel is a K-user multiple-access channel in which each user transmits from an infinite alphabet, $X_k \in \mathbb{R}$, subject to an average power constraint $E[|X_k|^2] \le P_k$. The channel output $Y \in \mathbb{R}$ is given by the real addition

\[ Y = \sum_{k=1}^{K} X_k + z, \]

where z is a zero mean, variance $\sigma^2$, white Gaussian random variable, denoted $z \sim \mathcal{N}(0, \sigma^2)$.

The above definition is easily extended to a complex channel, in which the users transmit complex numbers and the additive noise is a zero mean circularly symmetric complex Gaussian random variable (i.e. independent real and imaginary parts, each with variance $\sigma^2/2$).

The capacity region of the Gaussian multiple-access channel is found by extending the work of [2], [129] and [76], which was for finite alphabets, to the case of infinite alphabets. This was done independently by [22] and [169]. The resulting capacity region (in the real case) is given by the following constraints

\[ R(\mathcal{S}) \le \frac{1}{2} \log_2 \left( 1 + \frac{P(\mathcal{S})}{\sigma^2} \right) \ \text{bits}, \quad \forall \mathcal{S} \subseteq \mathcal{K}. \]

For complex channels, the factor of 1/2 is omitted.

From this we can see that although the users transmit independently, the total rate is equal to that obtained by a single user, with average power $P(\mathcal{K})$, with exclusive access to the channel. This implies that no rate loss is suffered due to the distributed nature of the various users. As was the case for the binary adder channel, the total sum rate is proportional to log K, with ever increasing capacity as the number of users increases, with each user transmitting a smaller and smaller amount.

It is convenient to introduce the following notation for Gaussian channel capacity,

\[ C(P, \sigma^2) = \begin{cases} \frac{1}{2} \log \left( 1 + P/\sigma^2 \right) & \text{real channel} \\ \log \left( 1 + P/\sigma^2 \right) & \text{complex channel} \end{cases} \tag{3.7} \]

Using this notation, Figure 3.11 shows an example of a capacity region, in which $P_1 > P_2$.

Fig. 3.11. Example of Gaussian multiple-access channel capacity region. [Plot of $R_2$ against $R_1$. Labeled values on the $R_1$ axis: $C(P_1, P_2 + \sigma^2)$ and $C(P_1, \sigma^2)$; on the $R_2$ axis: $C(P_2, P_1 + \sigma^2)$ and $C(P_2, \sigma^2)$; sum-rate boundary $R_1 + R_2 = C(P_1 + P_2, \sigma^2)$.]
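The corner points of the pentagon in Figure 3.11 follow directly from the notation (3.7). The following sketch assumes a real two-user channel and base-2 logarithms (so rates are in bits); the function names and the example powers are illustrative only.

```python
import math

def C(P, sigma2, complex_channel=False):
    """Gaussian capacity notation (3.7), in bits (base-2 logarithm assumed)."""
    c = math.log2(1 + P / sigma2)
    return c if complex_channel else 0.5 * c

def gmac_corner_points(P1, P2, sigma2):
    """Corner points of the two-user GMAC capacity region.

    The first point has user 1 at its single-user rate while user 2 is
    limited by treating user 1 as additional noise; the second point
    swaps the roles of the two users.
    """
    first = (C(P1, sigma2), C(P2, P1 + sigma2))
    second = (C(P1, P2 + sigma2), C(P2, sigma2))
    return first, second

if __name__ == "__main__":
    P1, P2, sigma2 = 2.0, 1.0, 1.0     # example values with P1 > P2
    first, second = gmac_corner_points(P1, P2, sigma2)
    print(first, sum(first))
    print(second, sum(second))
    print(C(P1 + P2, sigma2))          # sum-rate constraint
```

Both corner points lie on the sum-rate boundary, which is why the two rates in each pair add up to $C(P_1 + P_2, \sigma^2)$.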

Orthogonal Multiple-Access

What rate vectors are achievable with only single-user decoding on the GMAC? Consider the use of an orthogonal multiple-access technique, such as time sharing (i.e. time-division multiple-access). For the purposes of illustration, consider a two-user channel and suppose user 1 has exclusive access to the channel $\lambda n$ symbols out of n. Correspondingly, user 2 has exclusive access for $(1 - \lambda) n$ symbols. User 1 transmits only a fraction λ of the time, and hence when it does transmit it may do so with power $P_1/\lambda$ while still maintaining the long-term average power $P_1$. The resulting rate of transmission during the active periods is therefore $C(P_1/\lambda, \sigma^2)$ (noting that there is no interference since user 2 does not transmit at the same time). The overall transmission rate for user 1 is reduced by the factor λ. Similar arguments apply to user 2 and the achievable rate pair is

\begin{align*}
R_1 &= \lambda \, C\!\left( \frac{P_1}{\lambda}, \sigma^2 \right) \\
R_2 &= (1 - \lambda) \, C\!\left( \frac{P_2}{1 - \lambda}, \sigma^2 \right).
\end{align*}

Figure 3.12 shows the region obtained by varying $0 \le \lambda \le 1$ compared to the capacity region (in this example $P_1 = P_2 = \sigma^2 = 1$). Orthogonal multiple-access touches the outer boundary of the capacity region at the trivial points $R = (0, C(P_2, \sigma^2))$ and $R = (C(P_1, \sigma^2), 0)$, in which case either user 1 or user 2 has exclusive access to the channel. The more interesting case is achieved by setting

\[ \lambda = \frac{P_1}{P_1 + P_2}, \]

which results in the non-trivial boundary point

\begin{align*}
R_1 &= \frac{P_1}{P_1 + P_2} \, C\!\left( P_1 + P_2, \sigma^2 \right) \\
R_2 &= \frac{P_2}{P_1 + P_2} \, C\!\left( P_1 + P_2, \sigma^2 \right).
\end{align*}

Figure 3.12 shows how almost the entire capacity region of the two-user GMAC can be achieved, with the only cooperation required being a common time reference to schedule the transmissions. Similarly, a point on the boundary of the K-user capacity region of a GMAC with user powers $P_1, P_2, \ldots, P_K$ can be achieved with user k having exclusive channel access a proportion

\[ \lambda_k = \frac{P_k}{P(\mathcal{K})} \]

of the time. The resulting rate for user k is

\[ R_k = \frac{P_k}{P(\mathcal{K})} \, C\!\left( P(\mathcal{K}), \sigma^2 \right), \]

and the sum of these rates is clearly $C(P(\mathcal{K}), \sigma^2)$.

There is of course nothing special about the use of time-division as the orthogonal accessing method. The same rates result from the use of any orthogonal access method, where the variable λ accounts for the proportion of degrees of freedom used by user 1. For example, in a band-limited system with bandwidth W, we could consider allocating orthogonal frequency bands. Allocating $\lambda W$ Hz to user 1 and $(1 - \lambda) W$ Hz to user 2 results in

\begin{align*}
R_1 &= \lambda W \, C(P_1, \lambda W N_0) = \lambda W \, C\!\left( \frac{P_1}{\lambda}, W N_0 \right) \\
R_2 &= (1 - \lambda) W \, C(P_2, (1 - \lambda) W N_0) = (1 - \lambda) W \, C\!\left( \frac{P_2}{1 - \lambda}, W N_0 \right).
\end{align*}

Fig. 3.12. Rates achievable with orthogonal multiple-access. [Plot of $R_2$ against $R_1$, both axes from 0 to 0.5.]
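The time-sharing rate pair above can be traced numerically. This is a minimal sketch for the real two-user channel with base-2 logarithms (an assumption, giving rates in bits); `tdma_rates` is an illustrative name, not notation from the text.

```python
import math

def C(P, sigma2):
    """Real-channel Gaussian capacity (3.7) in bits (base-2 logarithm assumed)."""
    return 0.5 * math.log2(1 + P / sigma2)

def tdma_rates(lam, P1, P2, sigma2):
    """Rate pair when user 1 is active a fraction lam of the time.

    Each user boosts its power while active so that its long-term
    average power constraint is still met.
    """
    r1 = lam * C(P1 / lam, sigma2) if lam > 0 else 0.0
    r2 = (1 - lam) * C(P2 / (1 - lam), sigma2) if lam < 1 else 0.0
    return r1, r2

if __name__ == "__main__":
    P1 = P2 = sigma2 = 1.0                 # the example of Figure 3.12
    for lam in (0.0, 0.25, 0.5, 0.75, 1.0):
        r1, r2 = tdma_rates(lam, P1, P2, sigma2)
        print(lam, r1, r2, r1 + r2)
    print("sum capacity:", C(P1 + P2, sigma2))
```

With $\lambda = P_1/(P_1 + P_2)$ the sum of the two rates meets the sum-rate constraint $C(P_1 + P_2, \sigma^2)$; for other values of λ it falls below it.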

3.4.2 Code-Division Multiple-Access

The remaining chapters of this book are devoted to the design of multiuser receivers for linear multiple-access channels, typically exemplified by direct-sequence code-division multiple-access. Therefore let us now consider the CDMA channel, from an information theoretic point of view. The approach that we take is to model the spreading as part of the channel specification, and to calculate the resulting capacity. Using this approach we can gain insight into several important questions concerning CDMA channels:

• What are the fundamental limitations of such systems?
• We know from the data processing theorem that spreading cannot increase capacity. However spreading is useful for reasons other than capacity. Are there cases when spreading does not decrease capacity?
• How much spectrum spreading should be used?
• What sort of performance improvement can be expected through the use of receiver cooperation, i.e. joint decoding?
• How do we design sequences to maximize capacity?

There are two main cases of interest: Firstly, time-invariant spreading, according to Definition 2.1. Secondly, we can consider random spreading, according to Definition 2.2, which models aperiodic sequences (or sequences with period much longer than the symbol interval).

Time-invariant spreading sequences, S[i] = S, model periodic, synchronous spreading sequences. The following theorem gives the capacity of the time-invariant CDMA channel (normalized by the number of signaling dimensions) as a function of the spreading sequences [112, 148]. We will concentrate on the complex channel (with complex spreading). Real channel results are obtained via a factor 1/2.

Theorem 3.3 (Synchronous CDMA). The capacity region of the time-invariant CDMA channel with spreading sequence matrix S, average energy constraints W and noise variance $\sigma^2$ is the closure of the region

\[ \mathcal{C}[S] = \bigcap_{\substack{\mathcal{T} \subseteq \mathcal{K} \\ \mathcal{T} \neq \emptyset}} \left\{ R : R(\mathcal{T}) \le \frac{1}{L} \log \det \left( I + \frac{1}{\sigma^2} S_{\mathcal{T}} W_{\mathcal{T}} S_{\mathcal{T}}^{*} \right) \right\}. \]

$S_{\mathcal{T}}$ is the matrix S, with the columns whose indices do not belong to $\mathcal{T}$ omitted. $W_{\mathcal{T}}$ is formed from W by removing rows and columns whose indices do not belong to $\mathcal{T}$. Capacity is achieved for Gaussian inputs X.

Using our notation (3.7), we can also write the sum rate constraint $C_{\text{sum}}(S)$ for real or complex channels in terms of $\lambda_1, \lambda_2, \ldots, \lambda_L$, the eigenvalues of $S W S^{*}$:

\[ R(\mathcal{K}) \le C_{\text{sum}}(S) = \frac{1}{L} \sum_{l=1}^{L} C\!\left( \lambda_l, \sigma^2 \right). \tag{3.8} \]

We shall see that the latter expression is useful for calculating the capacity for randomly spread systems. An upper bound on $C_{\text{sum}}$ is the capacity of the channel with L = 1, namely the K-user Gaussian multiple-access channel with average energy constraint $w_{\text{tot}} = P(\mathcal{K})$. It turns out that there exist choices for the sequences S that achieve this bound [111, 112].

Theorem 3.4 (Optimal Sequences for Time Invariant CDMA). Let W = P I. The Gaussian multiple-access channel upper bound

\[ C_{\text{sum}} \le C_{\text{GMAC}} = C\!\left( w_{\text{tot}}, \sigma^2 \right) \ \text{nats per chip} \]

is achieved by use of sequences that satisfy $S S^{t} = I_L$.


Now the spreading sequences are the columns of S. The condition for equality in the above theorem requires the rows to be orthogonal. It is not required that the sequences themselves be orthogonal. A requirement for row orthogonality is that K ≥ L, otherwise S will be rank deficient. Sequences satisfying this condition also satisfy the Welch lower bound on total squared correlation [159]. Hence, by restricting the amount of spreading to L ≤ K, no capacity penalty need be suffered. Note that these are the same sequences that are optimal for the correlation detector, as described in Section 2.5.3.

Let us now consider the effect of using randomly selected sequences, according to Definition 2.2. Now, according to (3.8), for any particular set of spreading sequences S, we can write the sum capacity $C_{\text{sum}}(S)$ in terms of the eigenvalues of $S S^{t}$ (we shall for the moment consider systems with equal powers for all the users, W = I). This motivates interest in the eigenvalue distribution of random matrices of the form $S S^{t}$.

Now since the spreading sequences are being chosen randomly, the eigenvalues corresponding to any particular S[i] (and the corresponding capacity) are also random. The random nature of these spreading sequences therefore makes it difficult to make statements about the capacity. However, it turns out that if we instead consider a large systems limit, we can calculate the asymptotic capacity. By large systems we mean that we take the limit K → ∞ such that K/L → β, a constant. In this scenario, results from random matrix theory state that the distribution of eigenvalues tends to a non-random, known limit. This in turn enables us to compute the limiting capacity to which the random capacity for a finite system converges, as we increase the system dimensions. The following theorem shows how the eigenvalue spectrum for a large random matrix converges to a non-random limit distribution, see [64, 127, 156, 172].

Theorem 3.5 (Asymptotic Spectral Distribution). Let $\hat{F}(x)$ be the empirical cumulative distribution of a randomly selected eigenvalue of $\frac{1}{K} S S^{t}$, where S is selected according to Definition 2.2. For large systems $\hat{F}(x) \to F(x)$, where

\[ F'(x) = f(x) = \begin{cases} \dfrac{\sqrt{(x - a(\beta))(b(\beta) - x)}}{2 \pi \beta x} & a(\beta) \le x \le b(\beta) \\ 0 & \text{otherwise} \end{cases} \]

\[ a(\beta) = \left( \sqrt{\beta} - 1 \right)^2, \qquad b(\beta) = \left( \sqrt{\beta} + 1 \right)^2. \]

Example 3.4 (Convergence of Eigenvalue Distribution). Figure 3.13 shows how the eigenvalue distribution of $S S^{t}$ converges to the theoretical limit. The figure compares simulated eigenvalue histograms to the limit distribution f(x) for K = L = 5, 10 and 50. We see that the convergence is quite fast. Although the histogram and density do not match very well at K = L = 5, they do match well already at K = L = 10.

Fig. 3.13. Convergence of spectral density. [Panels: K = L = 5; K = L = 10; K = L = 50.]
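The limit density f(x) of Theorem 3.5 can be examined numerically without simulating any random matrices. The sketch below is restricted to β ≤ 1 (an assumption made here so that the continuous density carries the whole unit mass) and uses the substitution $x = \frac{a+b}{2} - \frac{b-a}{2} \cos\theta$, which removes the square-root behaviour at the edges of the support. All function names are illustrative.

```python
import math

def support(beta):
    """End points a(beta), b(beta) of the support in Theorem 3.5."""
    return (math.sqrt(beta) - 1) ** 2, (math.sqrt(beta) + 1) ** 2

def limit_density(x, beta):
    """Limit density f(x) of Theorem 3.5; zero outside (a, b)."""
    a, b = support(beta)
    if x <= a or x >= b:
        return 0.0
    return math.sqrt((x - a) * (b - x)) / (2 * math.pi * beta * x)

def integrate_against_density(g, beta, n=20000):
    """Midpoint-rule estimate of the integral of g(x) f(x) over the support.

    Uses x = mid - half * cos(theta) with theta in (0, pi), so that
    dx = half * sin(theta) d(theta).
    """
    a, b = support(beta)
    mid, half = 0.5 * (a + b), 0.5 * (b - a)
    dtheta = math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * dtheta
        x = mid - half * math.cos(theta)
        total += g(x) * limit_density(x, beta) * half * math.sin(theta) * dtheta
    return total

if __name__ == "__main__":
    for beta in (0.5, 1.0):
        mass = integrate_against_density(lambda x: 1.0, beta)
        mean = integrate_against_density(lambda x: x, beta)
        print(beta, mass, mean)
```

For these values of β both the total mass and the mean of the density come out close to 1. The same helper can integrate other functions of the eigenvalue against f(x).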

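Using Theorem 3.5, it is possible to find the large-systems sum capacity $C_r$ of the randomly spread CDMA channel [150].

Theorem 3.6 (Spectral Efficiency of Random CDMA). For large systems, the sum capacity of random CDMA where the users have identical powers $w_{\text{tot}}/K$ is

\begin{align*}
C_r &= \int \log (1 + \gamma x) f(x) \, dx \\
&= \frac{\beta}{2} \log \left( 1 + \gamma - \frac{1}{4} F(\gamma, \beta) \right) + \frac{1}{2} \log \left( 1 + \gamma \beta - \frac{1}{4} F(\gamma, \beta) \right) - \frac{\log e}{8 \gamma} F(\gamma, \beta)
\end{align*}

bits per chip, where $\gamma = w_{\text{tot}} / (L \sigma^2)$ and

\[ F(x, z) = \left( \sqrt{x \left( 1 + \sqrt{z} \right)^2 + 1} - \sqrt{x \left( 1 - \sqrt{z} \right)^2 + 1} \right)^2 . \]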


The convergence is in probability, i.e. as we increase the system dimensions, $C_{\text{sum}}(S)/L \to C_r$. A direct solution for a closed-form expression for this capacity is also given in [106]. Using similar techniques, one may also find the capacity of the randomly spread CDMA channel with no cooperation (i.e. each user independently decodes based only on the output of its own matched filter) [150].

Theorem 3.7 (No Cooperation Spectral Efficiency). For large systems, the sum capacity of matched-filtered random CDMA with no cooperation is

\[ C_{\text{MF}} = \frac{\beta}{2} \log \left( 1 + \frac{\gamma}{1 + \gamma \beta} \right). \]

Finally, if no spreading (or Welch Bound Equality sequences) is used, the spectral efficiency is the solution to the following equation.

\[ C_{\text{OPT}} = \frac{1}{2} \log \left( 1 + 2 C_{\text{OPT}} \frac{E_b}{N_0} \right) \]

If orthogonal sequences are used (K ≤ L), we have $C_{\perp} = \beta C_{\text{OPT}}$. These results are illustrated on Figure 3.14, which is plotted for $E_b/N_0 = 10$ dB.

For a fixed $E_b/N_0$, the spectral efficiency of either the optimal detector or the correlation detector is maximized by letting β → ∞. In the case of correlation detection,

\[ \lim_{K \to \infty} C_{\text{MF}} = \frac{\log_2 e}{2} - \frac{1}{2} \left( \frac{E_b}{N_0} \right)^{-1}, \]

which converges to $\log_2 e / 2 \approx 0.72$ as $E_b/N_0 \to \infty$ [62, 150]. Figure 3.15 shows the spectral efficiency for both optimal detection and correlation detection as a function of $E_b/N_0$ for K → ∞. The spectral efficiency of the optimal system grows without bound as the SNR increases. In fact, comparing to (3.6), we see that for L = 1 and a fixed, large K, the spectral efficiency behaves like

\[ \frac{1}{2} \log_2 \frac{\pi e K}{2} \]

with increasing SNR. This is the same limit obtained for the K-user binary adder channel in Section 3.3.1. In the limiting case of many users or high signal-to-noise ratios, it is possible to write a closed-form expression for the sum capacity of the randomly spread channel [56].

Fig. 3.14. Spectral efficiency of DS-CDMA with optimal, orthogonal and random spreading. $E_b/N_0 = 10$ dB. [Plot of spectral efficiency (bits/chip) against β. Curves: No spreading/WBE spreading; Random spreading, joint decoding; Orthogonal; Random spreading, matched filter.]

Fig. 3.15. Spectral efficiency of DS-CDMA with random spreading. [Plot of spectral efficiency (bits/chip) against $E_b/N_0$ (dB), from 0 to 10 dB. Curves: Joint decoding; Matched filter.]


Theorem 3.8 (Limiting Capacity). For large systems (many users) or for high signal-to-noise ratios, the average sum capacity of a direct-sequence CDMA channel with equal powers is given by

\[ \lim_{\gamma \to \infty} C_r = \log \gamma + \frac{K - L}{L} \ln \frac{K}{K - L} - 1 \ \text{nats}, \quad K \ge L. \]

Furthermore

\[ \lim_{\gamma \to \infty} C_{\text{GMAC}} - C_r = \begin{cases} 0 & \frac{L}{K} \to 0, \\ 1 & \frac{L}{K} \to 1. \end{cases} \]

From this limiting expression we can see that the penalty for using randomly selected, rather than optimal, spreading sequences in large systems is at most 1 nat. As L/K decreases, random spreading becomes optimal.

All of the large systems results that we have presented have been for the case when all the users transmit with equal powers. If unequal powers are used, the analysis may be extended by using appropriate results from random matrix theory. The results end up being expressed in terms of Stieltjes transforms.

Definition 3.4 (Stieltjes Transform). The Stieltjes transform m(z), $z \in \mathbb{C}^{+}$, of a cumulative distribution F(x) with probability density function $F'(x) = f(x)$ is given by

\[ m(z) = \int \frac{1}{\lambda - z} f(\lambda) \, d\lambda, \]

and possesses the inversion formula

\[ \int_a^b f(x) \, dx = \frac{1}{\pi} \lim_{\eta \to 0^{+}} \int_a^b \Im \left[ m(\xi + i \eta) \right] d\xi. \tag{3.9} \]

We shall continue using the convention of using upper-case letters for cumulative distributions and the corresponding lower-case letter for densities.⁵ As a special case of the result presented in [128] we have the following.

5 Most of the following results are usually presented in measure-theoretic terms. We assume the existence of probability density functions in order to avoid measure-theoretic concepts.


Theorem 3.9 (Limiting Spectrum for Unequal Powers). Let the elements of $S \in \mathbb{C}^{L \times K}$ be chosen according to Definition 2.2. Let $W = \operatorname{diag}(w_1, w_2, \ldots, w_K)$, $w_i \in \mathbb{R}$, be independent of S and let the empirical distribution of $\{w_1, w_2, \ldots, w_K\}$ converge almost surely to a cumulative distribution function H with density $H' = h$ as K → ∞. Then for large systems, the empirical distribution function of the eigenvalues of the matrix $S W S^{t} / L$ converges to a non-random probability distribution whose Stieltjes transform $m(z) \in \mathbb{C}^{+}$ is the unique solution to

\[ m(z) = - \left( z - \frac{1}{\beta} \int \frac{\tau h(\tau)}{1 + \tau m(z)} \, d\tau \right)^{-1}. \tag{3.10} \]

We can now describe the large-systems capacity result for the unequal power case.

Theorem 3.10. At each symbol interval, let the matrix of spreading sequences S be randomly selected as in Theorem 3.9 and let the empirical distribution function of the users' energies converge to a non-random cumulative distribution function H. Then

\[ C_r = \int \log \left( 1 + \frac{x L}{\sigma^2} \right) f(x) \, dx, \tag{3.11} \]

where the Stieltjes transform of the cumulative distribution function F(x) satisfies (3.10).

This theorem shows that for fixed β, the sum capacity depends only upon the noise variance and the limiting distribution of the users' energies, albeit through the rather awkward (3.10), combined with the inverse Stieltjes transform.

Example 3.5. Let the power distribution be discrete, consisting of J point masses at powers $0 < P_1 < P_2 < \cdots < P_J$,

\[ h(\tau) = \sum_{j=1}^{J} \alpha_j \delta(\tau - P_j), \]

where $\sum_j \alpha_j = 1$. Substituting into (3.10) and using the sifting properties of the Dirac delta function results in

\[ m(z) = - \left( z - \frac{1}{\beta} \sum_{j=1}^{J} \frac{P_j \alpha_j}{1 + P_j m(z)} \right)^{-1}. \tag{3.12} \]


The resulting equation for m(z) is a degree J + 1 polynomial. For a single point mass (the equal power case), solution of the corresponding quadratic, together with the inversion formula (3.9), results in the expression for F given in Theorem 3.5.

Capacity Computation for Arbitrary Power Distributions

In all but the simplest of cases, direct computation of the Stieltjes transform m(z) of the limiting eigenvalue distribution F(λ) using (3.10) is difficult, if not intractable (for example four point masses results in a quintic for m). It is possible however to find a parametric relation for the random sequence capacity that side-steps the problem of solving (3.10). This parametric equation is in terms of the capacity of a parallel (orthogonal modulation) channel with the same power distribution.

Definition 3.5 (Shannon Transform). Let h(x) be a probability density on the positive real numbers. Define

\[ \eta_H(\gamma) = \frac{1}{2} \int_0^{\infty} \log (1 + \gamma x) \, h(x) \, dx. \]

Theorem 3.11. Retaining the definitions and assumptions of Theorem 3.10, the spectral efficiency of the randomly spread multiple-access channel is given parametrically via

\begin{align}
C_r = \eta_F(\gamma) &= \frac{1}{\beta} \eta_H(s) - \ln \frac{s}{\gamma} + \frac{s}{\gamma} - 1 \tag{3.13} \\
\gamma &= s \left( 1 - \frac{s}{\beta} \frac{d \eta_H(s)}{ds} \right)^{-1} \tag{3.14}
\end{align}

where for convenience $\gamma = L/\sigma^2$.

Example 3.6 (Equal Powers). Let $h(x) = \delta(x - 1)$ be a single point mass, corresponding to the unit equal power case. Consider β = 1. Then $\eta_H(s) = \ln(1 + s)$ and hence $\gamma = s(1 + s)$, from (3.14). Solving for s in terms of γ gives

\[ s = \frac{1}{2} \left( -1 + \sqrt{1 + 4 \gamma} \right), \]

which when substituted into (3.13) gives

\[ \eta_F(\gamma) = \frac{\sqrt{4 \gamma + 1} - 1}{2 \gamma} + 2 \ln \left( \sqrt{4 \gamma + 1} + 1 \right) - 1 - 2 \ln 2, \]


which is the same result obtained in [150, (9)], which was given in Theorem 3.6 above.

Although this parametric form appears to simplify the problem of finding closed-form capacity results, it suffers from the same basic problem as the fixed-point equation (3.10). For instance, the J-level discrete distribution described in Example 3.5 results again in a degree J + 1 polynomial.

Example 3.7 (Two Point Masses). Consider a two-level power distribution with point masses at powers $P_1 > 1 > P_2$, normalized such that the average power over all users is 1,

\[ h(x) = \frac{1 - P_2}{P_1 - P_2} \delta(x - P_1) + \frac{P_1 - 1}{P_1 - P_2} \delta(x - P_2). \]

Then $\eta_H(s)$ and $\gamma(s)$ are given by

\begin{align}
\eta_H(s) &= \frac{(P_1 - 1) \log (s P_2 + 1) + (1 - P_2) \log (s P_1 + 1)}{P_1 - P_2} \tag{3.15} \\
\gamma(s) &= \frac{s (s P_1 + 1)(s P_2 + 1)}{(P_1 + P_2) s - s + 1}. \tag{3.16}
\end{align}

From (3.16) it can be seen that solution for s in terms of γ (which would yield a closed-form expression for $\eta_F(\gamma)$) involves solution of the same cubic equation arising in (3.12). Nevertheless, it is straightforward to parametrically obtain plots of $\eta_F(\gamma)$ using (3.15) and (3.16) in Theorem 3.11. Thus the polynomial is still present; there is just no need to solve it.

Example 3.8 (Rayleigh Fading). Let each user's transmission experience independent Rayleigh fading, namely

\[ h(x) = \frac{e^{-x/2}}{\sqrt{2 \pi} \sqrt{x}}. \]

Then $\eta_H(s)$ may be easily found via numerical integration and

\[ \gamma(s) = \frac{\sqrt{\frac{2}{\pi}} \, s^{3/2} \, e^{-\frac{1}{2s}}}{1 - \operatorname{erf} \left( \frac{1}{\sqrt{2} \sqrt{s}} \right)}. \]

The resulting random sequence capacity can therefore be numerically determined. The result is shown in Figure 3.16, as a function of γ.

Fig. 3.16. Random sequence capacity with Rayleigh fading. [Plot of $\eta_F(\gamma)$ (0 to 0.8) against γ (0 to 12). Curves: Equal power; Rayleigh.]

Furthermore, it is possible to find $\eta_F(\gamma)$ directly from $\eta_H(s)$, in the following way. For each point $(s, \eta_H(s))$ there is a single corresponding point $(\gamma, \eta_F(\gamma))$, which can be obtained directly from the plot of $\eta_H(s)$ by a simple point-wise geometric construction. Plot the curve $\beta^{-1} \eta_H(s)$ and extend a tangent line back from s to the s = 0 intercept, say $\eta_0$. Denote the vertical drop of this line by $\alpha = \beta^{-1} \eta_H(s) - \eta_0$, which is always less than 1. The point $(s, \eta_H(s))$ corresponds to a point $(\gamma, \eta_F(\gamma)) = (s + \Delta s, \eta_H(s) + \Delta\eta)$ above and to the right, according to the following theorem, illustrated in Figure 3.17.

Theorem 3.12. The following point-wise geometrical construction obtains $\eta_F(\gamma)$ from $\eta_H(s)$.

\begin{align*}
\eta_F(\gamma) &= \frac{1}{\beta} \eta_H(s) + \Delta\eta \\
\gamma &= s + \Delta s,
\end{align*}

where the coordinate increments Δη and Δs are given in terms of

\[ \alpha = s \frac{d \eta_H(s)}{ds} \tag{3.17} \]

as

\begin{align*}
\Delta\eta(\alpha) &= - \ln (1 - \alpha) - \alpha \\
\Delta s(\alpha) &= \frac{\alpha s}{1 - \alpha}.
\end{align*}

The utility of this theorem is due to the fact that typically, $\eta_H(s)$ and its derivative are easily computed and in some cases may be found in closed form.
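The parametric description of Theorems 3.11 and 3.12 translates directly into a few lines of code. The sketch below is a hedged illustration with assumed function names. It takes $\eta_H(s) = \ln(1 + s)$ for equal powers, as in Example 3.6, and (3.15) with natural logarithms for the two-point-mass case. The slope term is scaled by 1/β so that γ agrees with (3.14); for β = 1, the only case exercised here, this coincides with (3.17) as printed.

```python
import math

def construct_point(s, eta_h, d_eta_h, beta=1.0):
    """Map (s, eta_H(s)) to (gamma, eta_F(gamma)) via the increments of Theorem 3.12."""
    alpha = (s / beta) * d_eta_h(s)
    delta_eta = -math.log(1 - alpha) - alpha
    delta_s = alpha * s / (1 - alpha)
    return s + delta_s, eta_h(s) / beta + delta_eta

def eta_f_from_313(s, gamma, eta_h, beta=1.0):
    """Right-hand side of (3.13) for a given pair (s, gamma)."""
    return eta_h(s) / beta - math.log(s / gamma) + s / gamma - 1

def equal_power():
    """eta_H and its derivative for a unit point mass (Example 3.6)."""
    return (lambda s: math.log(1 + s)), (lambda s: 1 / (1 + s))

def two_point(P1, P2):
    """eta_H of (3.15) and its derivative, for point masses at P1 > 1 > P2."""
    w1 = (1 - P2) / (P1 - P2)      # weight on P1
    w2 = (P1 - 1) / (P1 - P2)      # weight on P2
    eta = lambda s: w1 * math.log(1 + s * P1) + w2 * math.log(1 + s * P2)
    d_eta = lambda s: w1 * P1 / (1 + s * P1) + w2 * P2 / (1 + s * P2)
    return eta, d_eta

def gamma_316(s, P1, P2):
    """Closed form (3.16) for gamma(s) in the two-point-mass example."""
    return s * (s * P1 + 1) * (s * P2 + 1) / ((P1 + P2) * s - s + 1)

if __name__ == "__main__":
    eta, d_eta = equal_power()
    for s in (0.5, 1.0, 2.0):
        print(s, construct_point(s, eta, d_eta))
    eta2, d_eta2 = two_point(2.0, 0.5)
    print(construct_point(1.0, eta2, d_eta2), gamma_316(1.0, 2.0, 0.5))
```

Sweeping s over a range of positive values and plotting the returned pairs gives the curve $\eta_F(\gamma)$ without ever solving the polynomial for s in terms of γ.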

Fig. 3.17. Finding the asymptotic spectral efficiency via a geometric construction. [Plot showing the curve $\eta_H(s)/\beta$ and the point $\eta_F(\gamma)$, with the increments Δs and Δη, the vertical drop α, and the abscissae s and γ.]

3.5 Multiple-Access Codes

Although it is important to know the ultimate information rate limits for the multiple-access channel, practical systems require channel coding schemes to achieve these rates. In addition to providing error detection/correction capability in the presence of noise, multiple-access channel codes should also have the property that they separate the users' transmissions (reduce MAI). A code that can perform this task without error is called uniquely decodeable.

Definition 3.6 (Uniquely Decodeable). A code for the multiple-access channel is uniquely decodeable (UD) if, in the absence of noise, the mapping MK → Y^n is one-to-one.

Codes that fulfill this definition have the property that the combination of the users' encoding function and the channel function is uniquely invertible. Both block and trellis uniquely decodeable codes have been constructed for a variety of multiple-access channels. In particular, much effort has been spent on the binary adder channel. In the following sections, we shall summarize some of the currently available techniques. For a survey of channel codes up to 1980, one is directed to Farrell [40].

Note that the requirement of unique decodeability is related to the concept of the zero-error capacity region. If we require codes that can be decoded with zero probability of error, then we should compare their performance to the zero-error capacity region, rather than the ε-error region. In general, the zero-error capacity region for the multiple-access channel is unknown (apart from special cases such as [84]). It has in fact been shown that for the two-user binary adder channel, the zero-error capacity region is strictly smaller than the ε-error region [141].

Much of the literature on coding is for the binary adder channel, and as such, is a useful basis for comparison of various schemes. Figure 3.18 summarizes graphically various codes of historical interest, including the current best known code (in terms of sum rate).

Fig. 3.18. Rates achieved by some existing codes for the BAC. [Plot of $R_2$ against $R_1$, both axes from 0 to 1.0, with numbered points 1 to 15 as listed in Table 3.1.]

3.5.1 Block Codes Let us begin by introducing some notation for block codes. Denote the codebook for user k by Ck , the number of codewords by Mk = |Ck |, and the rate Rk = n1 log2 Mk . Block Codes for the Binary Adder Channel Let us take a look at our first multiuser code for the two-user BAC, and in doing so, we shall illustrate the principle of unique decodeability. Example 3.9 (Uniquely Decodeable Code for the Two-User BAC). Until 1985, the best known code for the 2-BAC was also one of the simplest. It assigns

Point number   R1      R2      R1 + R2   Reference   Type
1              0.5     0.792   1.292     [67]        Block
2              0.571   0.558   1.129     [67]        Block
3              0.512   0.793   1.306     [14]        Block
4              0.5     0.5     1.0       [96]        Convolutional
5              0.75    0.5     1.25      [20]        Trellis
6              0.33    0.878   1.208     [27]        Block asynch.
7              0.100   0.999   1.099     [14]        Block
8              0.143   0.999   1.143     [14]        Block
9              0.178   0.998   1.177     [14]        Block
10             0.222   0.994   1.216     [73]        Block
11             0.288   0.972   1.260     [14]        Block
12             0.307   0.961   1.268     [14]        Block
13             0.414   0.885   1.300     [14]        Block
14             0.434   0.869   1.303     [14]        Block
15             0.512   0.798   1.310     [141]       Block

Table 3.1. Coding schemes shown in Figure 3.18.

to user $X_1$ the codebook $C_1 = \{00, 11\}$, and to user $X_2$ the words $C_2 = \{00, 01, 10\}$. The output words resulting from $C_1 \times C_2$ are shown in Table 3.2. This code was first given in [67]. For this code, $R_1 = 0.5$ and $R_2 = 0.792$, so the sum rate is $R_1 + R_2 = 1.29$. This code is represented by point 1 in Figure 3.18. From the table, one can verify that all output codewords are distinct, hence the code is uniquely decodeable.

             C2
C1      00   01   10
00      00   01   10
11      11   12   21

Table 3.2. Uniquely decodeable rate 1.29 code for the two-user BAC.

Let us now assume that we want to increase the rate of the code by including another codeword for user 1, say 01. Table 3.3 shows the new codebook pair, and the corresponding channel outputs. We now have the problem that two channel outputs are ambiguous: the output 01 could have been caused by the transmission (user 1, user 2) of either (00, 01) or (01, 00). Similarly, the output 11 is ambiguous, since it results from both (11, 00) and (01, 10). The best that the decoder could do in such cases would be to choose the message pair at random, which would of course result in an irreducible error rate.

             C2
C1      00   01   10
00      00   01   10
11      11   12   21
01      01   02   11

Table 3.3. Non uniquely decodeable code for the two-user BAC.

It is interesting to note that the code of the previous example, with sum rate $R_1 + R_2 = 1.2925$, was the best code in terms of sum rate until 1985, when it was beaten by an $R_1 + R_2 = 1.306$ code [14]. More recently, this record was pushed to $R_1 + R_2 = 1.30999$ [141]. The difficulty in improving the sum

rate of such codes may turn out to be a limitation of the zero-error capacity of the channel.

Kasami and Lin were the first to consider code construction for the two-user BAC [67]. They considered code constructions in which $C_1$ is taken to be a linear $(n, k)$ block code [77] which contains the all-ones codeword $11 \ldots 1$. The members of $C_2$ are then chosen from the cosets of $C_1$. One particular UD construction is to select as the members of $C_2$ the coset leaders of the cosets of $C_1$ which are not equal to $C_1$, their one's complements, and the all-zeros codeword $00 \ldots 0$. This gives $2(2^{n-k} - 1) + 1$ codewords for user 2, so that using this technique one can construct codes with rate vector

\[ R = \left( \frac{k}{n},\ \frac{1}{n} \log_2\left( 2^{n-k+1} - 1 \right) \right). \]

They also provide bounds on the number of vectors that can be taken from the cosets of $C_1$.
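The unique decodeability test of Example 3.9 is easy to mechanise for small codes. The following is a minimal sketch, not taken from the references: it models the noiseless BAC as a symbol-wise integer sum and checks by exhaustive enumeration whether distinct codeword pairs give distinct outputs. The function names (`to_bits`, `bac_output`, `is_uniquely_decodeable`, `rates`) are illustrative choices, and the brute-force search is only practical for very short codes.

```python
from itertools import product
from math import log2

def to_bits(word):
    """Convert a string such as '01' into a tuple of integer symbols."""
    return tuple(int(ch) for ch in word)

def bac_output(codewords):
    """Noiseless binary adder channel: symbol-wise integer sum over the users."""
    return tuple(sum(symbols) for symbols in zip(*codewords))

def is_uniquely_decodeable(codebooks):
    """True if the map from codeword tuples to channel outputs is one-to-one."""
    outputs = [bac_output(combo) for combo in product(*codebooks)]
    return len(outputs) == len(set(outputs))

def rates(codebooks):
    """Per-user rates R_k = (1/n) log2 |C_k| in bits per channel use."""
    n = len(codebooks[0][0])
    return [log2(len(book)) / n for book in codebooks]

# Codebooks of Example 3.9, and user 1's codebook extended by the word 01.
C1 = [to_bits(w) for w in ("00", "11")]
C2 = [to_bits(w) for w in ("00", "01", "10")]
C1_extended = C1 + [to_bits("01")]

ud_original = is_uniquely_decodeable([C1, C2])
ud_extended = is_uniquely_decodeable([C1_extended, C2])
sum_rate = sum(rates([C1, C2]))
print(ud_original, ud_extended, round(sum_rate, 4))  # True False 1.2925
```

The same function accepts any number of codebooks, so it can also be applied to small K-user codes.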

Example 3.10 (Kasami-Lin Construction). Let us now use this method to construct a two-user code. Let the codebook for user 1, $C_1$, be the (7, 4) Hamming code. This code is perfect [77], which means that it has as coset leaders every error pattern of weight 0 or 1. We include these eight coset leaders in the codebook for user 2. According to the construction, we also include the one's complement of each of these codewords, except for the all-ones word, which is already in $C_1$. This gives $|C_2| = 15$. Therefore this code has rate vector $R = (0.571, 0.558)$ and is point 2 on Figure 3.18. The two codebooks are shown in Table 3.4.

C1         C2
0000000    0000000
1101000    0000001
0110100    0000010
1011100    0000100
1110010    0001000
0011010    0010000
1000110    0100000
0101110    1000000
1010001    1111110
0111001    1111101
1100101    1111011
0001101    1110111
0100011    1101111
1001011    1011111
0010111    0111111
1111111

Table 3.4. Rate R = (0.571, 0.558) uniquely decodeable code for the two-user binary adder channel.

Coebergh van den Braak and van Tilborg have described another construction for the 2-BAC [14]. They construct code pairs at sum rates above 1.29, but only marginally. The highest rate code presented has $n = 7$, $R = (0.512, 0.793)$, giving sum rate 1.30565. This is shown as point 3 in Figure 3.18. Apart from codes with high sum rate, they also construct many codes achieving points very close to the single-user constraint (i.e. with rate for one of the users very close to 1, and a non-zero rate for the other user).

Kasami and Lin also investigated codes for the noisy BAC, and in [68] give bounds on the achievable rates for certain block coding schemes for the noisy channel. In [69] they present a reduced complexity decoding scheme, whose complexity however still grows exponentially with the code length. Van Tilborg gives a further upper bound on the sum rate for the two-user

BAC, where $C_1$ is linear [136]. Kasami et al. [70] have used graph theoretic approaches to improve upon lower bounds for the Kasami–Lin codes. They relate the code design problem to the independent set problem of graph theory, and use the Turán theorem, which gives a lower bound on the independence number in terms of the numbers of vertices and edges of the graph. It is interesting however to note that if $R_1 > 0.5$, the best code pairs require both codes to be non-linear [160].

K-User Binary Adder Channel

Chang and Weldon [18] have constructed codes using an iterative method for the K-BAC. Their construction for the noiseless case (they also present a similar construction for the noisy channel) is based on a linearly independent difference matrix $d \in \{-1, 0, 1\}^{K \times n}$. User $k$ is assigned two codewords, $c_{k,1}$ and $c_{k,2}$, obtained from $d_k$, the $k$th row of $d$, in the following way.

\[ \left( (c_{k,1})_l,\ (c_{k,2})_l \right) = \begin{cases} (0, 0) & \text{if } (d_k)_l = 0 \\ (1, 0) & \text{if } (d_k)_l = 1 \\ (0, 1) & \text{if } (d_k)_l = -1 \end{cases} \]

The iteration is on the matrix $d$. Let $d_0 = [1]$. Then

\[ d_j = \begin{bmatrix} d_{j-1} & d_{j-1} \\ d_{j-1} & -d_{j-1} \\ I_{2^{j-1}} & 0_{2^{j-1}} \end{bmatrix} \]

defines a $K = (j + 2) 2^{j-1}$ user code of length $2^j$, where $I_{2^{j-1}}$ and $0_{2^{j-1}}$ are the $2^{j-1} \times 2^{j-1}$ identity and all-zero matrices respectively.

Example 3.11 (Chang-Weldon Construction). Let us consider the Chang-Weldon construction for 3 users. We generate the difference matrix

\[ d_1 = \begin{bmatrix} 1 & 1 \\ 1 & -1 \\ 1 & 0 \end{bmatrix} \]

This gives the codebooks $C_1 = \{11, 00\}$, $C_2 = \{10, 01\}$, $C_3 = \{10, 00\}$. The code has rate vector $R = \left( \frac{1}{2}, \frac{1}{2}, \frac{1}{2} \right)$.

This construction is interesting because of the following theorem, which says that these codes are asymptotically optimal in a certain sense.

Theorem 3.13. The Chang-Weldon construction is asymptotically good, in the sense that as the number of users increases, the ratio of the sum code rate to the sum capacity approaches unity:

\[ \lim_{K \to \infty} \frac{R_{\mathrm{sum}}(K)}{C_{\mathrm{sum}}} = 1. \]

Although the ratio of the code rate to the sum capacity approaches one, the absolute difference increases without bound, with the logarithm of $j$.

Ferguson [41] generalizes these codes by replacing the identity and all-zero matrices in the iteration of (3.5.1) with arbitrary matrices $A$, $B$ with entries from $\{-1, 0, 1\}$ such that the modulo 2 reduction of the sum $A + B$ is an invertible matrix. Ferguson also determines the size of the equivalence classes of the Chang–Weldon codes.

Hughes and Cooper [61] have investigated codes for the K-user binary adder channel, where the users have different rates. They modify the Chang-Weldon construction [18] to distribute the codewords among as few users as possible. The main result is that K-user codes may be constructed for any rate point within the polytope derived from the capacity region by subtracting 1.09 bits/channel use from each constraint. In particular they construct a family of K-user codes with sum rate $R_{\mathrm{sum}}(K) \ge C_{\mathrm{sum}} - 0.547$.

Frame-Asynchronous Binary Adder Channel

Block codes for the frame-asynchronous two-user BAC have been considered by Deaett and Wolf [27, 165]. One such simple code assigns two codewords of length $n_1$ to user 1: the all-ones word $11 \ldots 1$ and the all-zeros word $00 \ldots 0$. The codebook for the second user is all binary $n_2$-tuples such that the first symbol is 0, and the word does not contain $n_1$ consecutive ones. The maximum


rate pair is achieved for $n_1 = 3$: $R = (0.33, 0.878)$. This is point 6 on the graph. Plotnik [98] considers the code construction problem for the frame-asynchronous channel, with random access to the channel.

Example 3.12 (Deaett-Wolf Construction). Let us now construct a code for the two-user frame-asynchronous channel, with $n_1 = 2$ and $n_2 = 4$. To user 1 we assign the codebook $C_1 = \{00, 11\}$. According to the construction, we assign to user 2 the codebook $\{0000, 0001, 0010, 0100, 0101\}$.

K Users, N-ary Orthogonal Signal Channel

Chang and Wolf [19] have constructed codes for a generalization of the K-BAC to larger input alphabets. In particular they consider a channel in which each user may, at each symbol interval, activate one of $N$ orthogonal frequencies, $\{f_1, f_2, \ldots, f_N\}$. Of course, we can consider the use of any set of orthogonal signals, but for convenience, we shall refer to frequencies. The receiver observes the output of each frequency. Depending upon what type of observation is made of each frequency, two channels are defined: the channel with intensity information, and the channel without intensity information. For the former case, the number of users transmitting on each frequency is available to the receiver. The latter case provides only a binary output of active/not active for each frequency.

Three code constructions are given for the channel without intensity information. All of these constructions are characterized by the use of unique frequencies, or frequency patterns, as markers for each user. This has the effect of transforming the system into a frequency division multi-access system. The first construction, for $K < N$, assigns two codewords of length one to each user, $C_k = \{f_1, f_{k+1}\}$. This construction has sum rate $N - 1$. The remaining constructions are for the $K = 2$ channel, for which the reader is referred to the original work. Wilhelmsson and Zigangirov construct codes for this channel with polynomial decoding complexity [162].
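The first construction for the channel without intensity information, $C_k = \{f_1, f_{k+1}\}$, can be sketched as follows. This is an illustrative model only: frequencies are represented by their integer indices, the channel output is taken to be the set of active frequencies, and the bit-to-frequency mapping (0 selects $f_1$, 1 selects $f_{k+1}$) is an assumption made here for concreteness. With $K = N - 1$ users, each sending one bit per symbol, the sum rate is $N - 1$ bits per symbol.

```python
from itertools import product

def transmit(bits):
    """User k (1-based) activates f_1 for bit 0 and f_{k+1} for bit 1.
    Without intensity information, only the set of active frequencies is seen."""
    return frozenset(1 if bit == 0 else k + 1
                     for k, bit in enumerate(bits, start=1))

def decode(active, num_users):
    """User k sent a 1 exactly when its private marker frequency f_{k+1} is active."""
    return tuple(1 if (k + 1) in active else 0
                 for k in range(1, num_users + 1))

N = 4          # number of orthogonal frequencies (illustrative value)
K = N - 1      # largest number of users the construction supports

all_messages = list(product((0, 1), repeat=K))
outputs = {transmit(m) for m in all_messages}

# Unique decodeability: 2^K message tuples give 2^K distinct outputs.
print(len(outputs) == 2 ** K)
# Every message tuple is recovered from the set of active frequencies.
print(all(decode(transmit(m), K) == m for m in all_messages))
```

The decoder never needs to know how many users chose $f_1$, which is why the construction works without intensity information.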
The construction for the channel with intensity information is a generalization of the approach of Chang and Weldon [18]. As for the Chang–Weldon codes, their construction is asymptotically optimal if one increases $K \to \infty$ while fixing $N$. For further results concerning coding for the K-BAC, the interested reader is referred to [17]. Mathys considers the case when random access to the channel is allowed [86].

Binary Switching Channel

Vanroose [144] considers coding for the two-user binary switching MAC, which is a counterpart to the two-user BAC. Each user transmits symbols from $\{0, 1\}$, but the output is $Y = x_1 / x_2$. Division by 0 results in the $\infty$ symbol. This is of interest, as it is the only other ternary output, binary input MAC form. The capacity region is determined, and is found to touch the total


cooperation line. In addition, the capacity region is also shown to be the zero-error capacity region.

3.5.2 Convolutional and Trellis Codes

The subject of designing trellis codes for multiple-access channels has received considerably less attention than that of block codes. Work has focused entirely on the binary adder channel.

Peterson and Costello have investigated convolutional codes for the two-user binary adder channel [96]. They introduce the concept of a combined two-user trellis (see Figure 3.19), and define a distance measure, the L-distance, between any two channel output sequences. They go on to prove several results. Among these are conditions for unique decodeability and conditions for catastrophicity. In a rather striking theorem, they show that no convolutional code pair for the two-user BAC exists at a sum rate greater than 1, which could have been achieved anyway with no cooperation.

Figure 3.19 shows a combined trellis for a two-user uniquely decodeable code. The branch labels on the trellis are of the form $u_1 u_2 \mid v_1 v_2$. If user 1 and user 2 input $u_1$ and $u_2$ respectively to their separate convolutional encoder inputs, the channel output is the symbol $v_1$ followed by the symbol $v_2$. The state labels on the combined trellis are of the form $s_1 s_2$, where $s_k$ is the state of user $k$'s individual code. Decoding takes place by simply applying the Viterbi decoding algorithm [152] to the combined trellis, using L-distance as the metric. This code achieves the upper bound set for convolutional codes on this channel, and is shown as point 4 in Figure 3.18.

Chevillat has investigated trellis coding for the K-BAC [20]. He finds convolutional code pairs with large $d_{L,\mathrm{free}}$ and gives a two-user non-linear trellis code for the BAC, found by computer search. This code is shown in Figure 3.20. It possesses sum rate 1.25, and is point 5 on the comparison figure.
Peterson and Costello compute bounds on error probability and free distance for arbitrary two-user multiple-access channels [95]. Sorace gives an algebraic random coding performance bound [130].

3.6 Superposition and Layering

We shall now see that the vertices of achievable rate regions possess special properties that provide a "single-user" interpretation of the capacity region. First, let us formally describe these vertices, or extreme points of the achievable rate region. Vertices are rate points $R$ that have (after a possible re-indexing of the users) elements with the following form⁶:

⁶ See [57] for the one-to-one correspondence between such points and vertices of the rate region.

[Figure 3.19: three panels titled "Single user trellis for user 1", "Combined trellis" and "Single user trellis for user 2". Single-user branches are labeled input | output; combined-trellis branches are labeled $u_1 u_2 \mid v_1 v_2$, with states 00, 01, 10 and 11.]

Fig. 3.19. Combined 2 user trellis for the BAC

[Figure 3.20: two panels, "User 1" and "User 2", each a two-state trellis whose branches are labeled with sets of 4-bit words.]

Fig. 3.20. Two user nonlinear trellis code for the BAC

\[ R_i = I\left( X_i ; Y \mid X_{\{1, 2, \ldots, i-1\}} \right), \tag{3.18} \]

that is, $R$ looks something like

\[ R = \left( I(X_1; Y),\ I(X_2; Y \mid X_1),\ \ldots,\ I(X_K; Y \mid X_1, X_2, \ldots, X_{K-1}) \right). \]

Note that by definition of the achievable rate region (3.1), such points are on the boundary of the region. In Section 3.3.1, we saw that the vertices of the achievable rate region (in that case coincident with the capacity region) were


achievable by interference cancellation. This property holds for all vertices, and is in fact due to the form of the vertices (3.18). At a vertex, the users may be ordered, such that the maximum rate is given by the mutual information between each user and the output, conditioned only on knowledge of “previous” users, where previous is defined by the ordering.
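As a concrete illustration of the vertex form (3.18), the following sketch evaluates $I(X_1; Y)$ and $I(X_2; Y \mid X_1)$ for the noiseless two-user binary adder channel. The choice of channel and of independent, uniformly distributed inputs is an assumption made here for the example; the helper `entropy_of` is likewise an illustrative name. Under these assumptions the vertex works out to $(0.5, 1.0)$, with sum rate 1.5.

```python
from itertools import product
from math import log2

# Joint distribution of (x1, x2, y) for the noiseless two-user BAC,
# ASSUMING independent, uniformly distributed binary inputs.
joint = {(x1, x2, x1 + x2): 0.25 for x1, x2 in product((0, 1), repeat=2)}

def entropy_of(indices):
    """Entropy in bits of the marginal of `joint` over the given coordinates."""
    marginal = {}
    for outcome, prob in joint.items():
        key = tuple(outcome[i] for i in indices)
        marginal[key] = marginal.get(key, 0.0) + prob
    return -sum(p * log2(p) for p in marginal.values() if p > 0)

X1, X2, Y = 0, 1, 2  # coordinate positions within each outcome tuple

# I(X1;Y) = H(X1) + H(Y) - H(X1,Y): user 1 is decoded first,
# treating user 2 as noise.
rate_1 = entropy_of([X1]) + entropy_of([Y]) - entropy_of([X1, Y])

# I(X2;Y|X1) = H(X1,X2) + H(X1,Y) - H(X1) - H(X1,X2,Y): user 2 is decoded
# with user 1's data known.
rate_2 = (entropy_of([X1, X2]) + entropy_of([X1, Y])
          - entropy_of([X1]) - entropy_of([X1, X2, Y]))

print(rate_1, rate_2, rate_1 + rate_2)
```

Swapping the roles of the two users gives the other vertex, $(1.0, 0.5)$, under the same assumptions.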

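[Figure 3.21: block diagram. Sources 1, 2 and 3 feed Encoders 1, 2 and 3, whose outputs $X_1$, $X_2$, $X_3$ are summed with noise in a linear multiple-access channel to give $Y$. Decoder 1 recovers $X_1$ at rate $I(X_1; Y)$ in the presence of $X_2$, $X_3$ interference; Decoder 2 recovers $X_2$ at rate $I(X_2; Y \mid X_1)$ in the presence of $X_3$ interference; Decoder 3 recovers $X_3$ at rate $I(X_3; Y \mid X_1, X_2)$ with no interference.]

Fig. 3.21. Successive cancellation approach to achieve vertex of capacity region.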

With reference to Figure 3.21, the achievability of the vertices is proved using a successive decoding argument as follows. The first user treats all other users as noise, and sees a discrete memoryless channel. By the standard random-coding argument, this user may transmit at rate $R_1 \le I(X_1; Y)$ with arbitrarily low error probability, which is the rate point required by (3.18). We now assume that we have used such a coding scheme for this user, and that we know perfectly the transmitted data, which is now available for subsequent users. The second user now treats the remaining users, $k > 2$, as noise,


and once again from the single-user random coding argument, may transmit at $R_2 \le I(X_2; Y \mid X_1)$, recalling that the users transmit independently. We continue this argument down the chain for all users, each transmitting using single-user codes which are decoded one by one, using knowledge of all previous users. For linear channels, such as the binary adder channel or the Gaussian multiple-access channel, the previous users' data may be incorporated into the decoding process by subtracting it from the channel output, hence the name interference cancellation.

A common objection to such schemes is the "error propagation" argument, whereby errors in one decoding step lead to even more errors in the next. However in proving the achievability of these rate points by successive cancellation, we do not suffer from this problem, as the single-user coding theorem guarantees the existence of codes with vanishing error probabilities. In fact it is possible to bound the probability that an error is made anywhere in the step-by-step process by a function which tends to zero exponentially with the codeword length. Error propagation is only a concern for a practical implementation of the scheme.

In order to properly understand the effect of asynchronism in the next section, we first need to better understand the role of the convex hull operation in Theorem 3.1. This convex hull operation is due to the idea of time-sharing. Given that the two rate vectors $R_1$ and $R_2$ are achievable, every point on the line connecting $R_1$ and $R_2$ is also achievable, simply by using the codebook corresponding to the point $R_1$ for $\lambda n$ symbols and that corresponding to $R_2$ for the remaining $(1 - \lambda) n$, $0 \le \lambda \le 1$. In general, we can achieve any additional point that is the convex combination of any number of achievable points.

Note that in order to implement this time-sharing scheme, a common time reference must be available, in order for the users to agree upon when to change transmission strategies. Carathéodory's theorem [32] states that every point in the convex closure of a connected compact set $A$ is contained within at least one simplex⁷ which takes its vertices from $A$. This implies that every point within the capacity region for a $K$ user channel may be achieved by time-sharing between at most $K + 1$ vertices, requiring each user to have access to $K + 1$ codebooks. We now see that we may achieve any point in the capacity region by time-sharing between at most $K + 1$ successive cancellation schemes, each with $K$ cancellation steps.
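The time-sharing argument amounts to a convex combination of rate vectors together with an agreed split of the $n$ channel uses. A minimal sketch follows; the two vertices used, $(0.5, 1.0)$ and $(1.0, 0.5)$, are assumed example values (those of a two-user binary adder channel with uniform inputs), and the function names are illustrative.

```python
def time_share(rate_points, fractions):
    """Rate vector achieved by using the code for rate_points[i]
    during a fraction fractions[i] of the block."""
    if any(f < 0 for f in fractions) or abs(sum(fractions) - 1.0) > 1e-12:
        raise ValueError("fractions must be non-negative and sum to 1")
    num_users = len(rate_points[0])
    return tuple(sum(f * point[u] for f, point in zip(fractions, rate_points))
                 for u in range(num_users))

def schedule(n, fractions):
    """Number of symbols allotted to each codebook; the last takes the remainder.
    All users must agree on these boundaries, hence the common time reference."""
    counts = [int(f * n) for f in fractions[:-1]]
    counts.append(n - sum(counts))
    return counts

vertex_a = (0.5, 1.0)   # assumed: user 1 decoded first
vertex_b = (1.0, 0.5)   # assumed: user 2 decoded first

point = time_share([vertex_a, vertex_b], [0.25, 0.75])
print(point)                           # (0.875, 0.625)
print(schedule(1000, [0.25, 0.75]))    # [250, 750]
```

Every point produced this way lies on the line segment joining the two vertices, so the sum rate stays at 1.5 for these example values.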

⁷ A simplex in $d$-dimensional space is a polytope with $d + 1$ affinely independent vertices, e.g. a tetrahedron for 3-dimensional Euclidean space.

3.7 Feedback

In this section, we will consider some results for the multiple-access channel under the assumption of perfect feedback, by which we mean that the encoder for each user has available all previous channel outputs. Figure 3.22 shows a two-user channel with feedback.

[Figure 3.22: block diagram. Source 1 and Source 2 feed Encoder 1 and Encoder 2, whose outputs enter the Channel; the channel output is fed back to both encoders.]

Fig. 3.22. Two-user MAC with perfect feedback.

It is a somewhat surprising result of single-user information theory that noiseless feedback from the receiver to the transmitter does not increase the capacity of a memoryless channel, a fact proved by Shannon in 1956 [124]. Feedback does however increase the capacity of channels with memory, by aiding the transmitter in predicting future noise, using the time-correlation properties of the channel.

It is therefore not altogether surprising that feedback can increase the set of achievable rates for the multiple-access channel. Such feedback essentially enables the users to cooperate in their transmissions to some degree. We shall also see that the use of feedback can simplify the coding schemes required for transmission.

As a motivation, let us see how we can use feedback in a simple way to increase the set of achievable rates. The following example is due to Gaarder and Wolf [48], and was historically the first example of feedback increasing the set of achievable rates.

Example 3.13 (Gaarder-Wolf Feedback Scheme). Consider the two-user binary adder channel, $X_1, X_2 \in \{0, 1\}$, $Y = X_1 + X_2 \in \{0, 1, 2\}$. At the output, $Y = 1$ is the only ambiguous symbol. Call this output an erasure. Transmission will take place in two stages.

Stage 1: In the first stage, each user transmits at rate 1. At the output of the channel, the decoder will not be able to successfully decode, since there are erasures. However each user observes the channel output, which combined with the knowledge of its own transmitted data, allows the deduction of the transmitted data of the other user.

Stage 2: The users can now cooperate to re-transmit those bits of one user, say user 1, which suffered erasure. This can be done at a rate of $\log_2 3$, since both users know from stage 1 what must be transmitted. The receiver can easily determine the second user's erased bits from those of the first.
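The two stages can be simulated directly. The sketch below is illustrative only: it draws uniformly random data for both users, and it idealises stage 2 by assuming that the cooperative retransmission is error free and conveys exactly $\log_2 3$ bits per channel use. All function names are hypothetical.

```python
import random
from math import ceil, log2

def stage_one(n, seed=0):
    """Both users send n uncoded bits (rate 1) over the noiseless BAC."""
    rng = random.Random(seed)
    x1 = [rng.randint(0, 1) for _ in range(n)]
    x2 = [rng.randint(0, 1) for _ in range(n)]
    y = [a + b for a, b in zip(x1, x2)]
    return x1, x2, y

def other_users_bits(y, own_bits):
    """With feedback, a user subtracts its own bits to learn the other user's."""
    return [yi - xi for yi, xi in zip(y, own_bits)]

def erased_positions(y):
    """Positions where the receiver sees the ambiguous output Y = 1."""
    return [i for i, yi in enumerate(y) if yi == 1]

def stage_two_length(num_erasures):
    """Channel uses needed for one bit per erasure at log2(3) bits per use
    (ASSUMED ideal cooperative code)."""
    return ceil(num_erasures / log2(3))

def receiver_decode(y, resolved_x1):
    """resolved_x1 maps each erased position to user 1's bit from stage 2."""
    x1_hat, x2_hat = [], []
    for i, yi in enumerate(y):
        if yi == 1:                      # erasure: exactly one user sent a 1
            bit = resolved_x1[i]
            x1_hat.append(bit)
            x2_hat.append(1 - bit)
        else:                            # Y = 0 or Y = 2: both bits equal Y/2
            x1_hat.append(yi // 2)
            x2_hat.append(yi // 2)
    return x1_hat, x2_hat

n = 10000
x1, x2, y = stage_one(n)
erased = erased_positions(y)
resolved = {i: x1[i] for i in erased}    # stage 2 assumed error free
x1_hat, x2_hat = receiver_decode(y, resolved)
print(len(erased) / n, stage_two_length(len(erased)))
```

Roughly half of the stage 1 symbols are erased, which is the quantity that determines how long stage 2 must be.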


What rates can be achieved using this method? Consider a block of $n$ transmissions. Let the first stage use $\lambda n$ symbols. With probability exponentially approaching 1 with $n$, there will be $\frac{1}{2} \lambda n$ erasures in the first stage. For each erasure, we must provide 1 bit of information in stage 2, i.e. we must have

\[ (1 - \lambda)\, n \log_2 3 = \frac{1}{2} \lambda n. \]

Solving for $\lambda$, which is the rate for each user, we find

\[ R_1 = R_2 = \lambda = \frac{2 \log_2 3}{1 + 2 \log_2 3}. \]

The resulting sum rate 1.52 exceeds the sum constraint of 1.5 for the channel with no feedback.

This two-stage approach can be thought of in the following way. In the first stage, the users transmit at the highest possible rate to each other via the feedback link. The receiver however cannot fully decode, but may be able to partially decode, e.g. with a list decoder. During the second stage, the users cooperate to resolve any ambiguity at the receiver. For the case of list decoding, they would transmit the index of the correct codeword in the list. This list-decoding method was in fact used to show the following achievable rate region for the two-user discrete memoryless MAC, where either one, or both users can observe the channel output.

Theorem 3.14 (Cover-Leung Achievable Rate Region). The following is an achievable rate region for the $K = 2$ discrete memoryless multiple-access channel $(\mathcal{X}_K;\ p(y \mid x_K);\ \mathcal{Y})$ with perfect feedback to one, or both users.

\[ \mathcal{R}\left[ p_{U,X_1,X_2,Y}(u, x_1, x_2, y) \right] = \left\{ R : \begin{array}{l} 0 \le R_1 \le I(X_1; Y \mid X_2, U) \\ 0 \le R_2 \le I(X_2; Y \mid X_1, U) \\ 0 \le R_1 + R_2 \le I(X_1, X_2; Y) \end{array} \right\}, \]

where the joint distribution $p_{U,X_1,X_2,Y}(u, x_1, x_2, y)$ is of the form

\[ p_{U,X_1,X_2,Y}(u, x_1, x_2, y) = p_U(u)\, p_{X_1|U}(x_1 \mid u)\, p_{X_2|U}(x_2 \mid u)\, p_{Y|X_1,X_2}(y \mid x_1, x_2) \tag{3.19} \]

and $U \in \mathcal{U}$ is an arbitrary random variable, with $|\mathcal{U}| \le \min\{ |\mathcal{X}_1|\, |\mathcal{X}_2| + 1,\ |\mathcal{Y}| + 2 \}$.

The achievability of this region was originally shown in [23], for the case of feedback to both users. It was shown to also be achievable for feedback to only one user in [164].


For a certain class of channels, the achievable rate region of Theorem 3.14 coincides with the capacity region. The converse required to show the optimality of the Cover-Leung region for these channels was proved by Willems in [163].

Theorem 3.15 (Optimality of Cover-Leung Region). Consider the class of two-user discrete memoryless multiple-access channels in which at least one input is a deterministic function of the output and the other input, i.e. either $H(X_1 \mid Y, X_2) = 0$ or $H(X_2 \mid Y, X_1) = 0$. For channels within this class, the rate region of Theorem 3.14 is the capacity region, for the case of perfect feedback to one or both users.

The binary adder channel is a member of the class of channels described in Theorem 3.15.

Example 3.14 (Binary Adder Channel with Feedback). The capacity region for the binary adder channel with feedback is shown in Figure 3.23. Also shown is the total cooperation line, $R_1 + R_2 = \log_2 3 = 1.58496$. The maximum sum rate achievable with feedback is at the equal rate point, $(R_1, R_2) = (0.7911, 0.7911)$, giving a sum value of 1.5822, slightly less than that for total cooperation. For the binary adder channel, as the number of users increases, this difference vanishes, and with feedback, the total cooperation rate may be achieved at the equal rate point.

We now describe a simple scheme due to Kraemer [74], which achieves the capacity region for the two-user BAC (and in fact any channel for which $H(X_1 \mid X_2, Y) = 0$). The system diagram is shown in Figure 3.24. $X_1$, $X_2$ generate new information such that $\Pr(X_1 = 0) = \Pr(X_2 = 0) = p$. The random variable $V \in \{0, 1\}$, which is common to both transmitters, is generated from the feedback signal in the following way. The encoder for $V$ takes as its input the feedback output $Y$, via a mapping which outputs 1 if $Y = 1$, and 0 otherwise (i.e. the mapping is an erasure detector). This feedback signal is encoded, using an identical random block code for each user, at a rate $R_V$.
[Figure 3.23: plot of the feedback capacity region in the $(R_1, R_2)$ plane.]

Fig. 3.23. Capacity region for the two-user binary adder channel with feedback.

Each user transmits the modulo-2 sum $X_k \oplus V$. The cooperative random variable $V$ serves to resolve the ambiguity about the erased $Y = 1$ channel outputs. In order for the scheme to work, note that for each $Y = 1$ erasure, $V$ must successfully transmit one bit of information to the receiver, i.e. we must have

\[ R_V \ge \Pr(Y = 1), \tag{3.20} \]

with arbitrarily low error probability for $V$. Under these conditions, the rate point achieved will be $(R_1, R_2) = (h(p), h(p))$. Let us now calculate these rates. $V$ sees the binary symmetric erasure channel shown in Figure 3.25. The capacity of this channel can be found as follows,

\[ R_V \le \max_{\Pr(V=0)} I(V; Y) = \left[ (1 - p)^2 + p^2 \right] \left[ 1 - h\!\left( \frac{p^2}{(1 - p)^2 + p^2} \right) \right], \tag{3.21} \]

for $\Pr(V = 0) = \frac{1}{2}$. Note also that

\[ \Pr(Y = 1) = 2p(1 - p). \tag{3.22} \]
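Setting (3.20) to equality and inserting (3.21) and (3.22) gives a single equation in $p$ with no closed-form solution, but it is easily solved numerically. The following is a minimal sketch using bisection on the interval $(0, \frac{1}{2})$, where the left-hand side (3.21) decreases and the right-hand side (3.22) increases with $p$. The function names are illustrative and $h(\cdot)$ is the binary entropy function in bits.

```python
from math import log2

def h(q):
    """Binary entropy function in bits."""
    if q <= 0.0 or q >= 1.0:
        return 0.0
    return -q * log2(q) - (1 - q) * log2(1 - q)

def rv_max(p):
    """Right-hand side of (3.21): capacity of the channel seen by V."""
    unerased = (1 - p) ** 2 + p ** 2
    return unerased * (1 - h(p ** 2 / unerased))

def erasure_prob(p):
    """Equation (3.22): Pr(Y = 1)."""
    return 2 * p * (1 - p)

def solve_p(lo=1e-9, hi=0.5, iterations=100):
    """Bisection for rv_max(p) = erasure_prob(p) on (0, 1/2)."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if rv_max(mid) > erasure_prob(mid):
            lo = mid      # resolution capacity still exceeds erasure rate
        else:
            hi = mid
    return (lo + hi) / 2

p_star = solve_p()
print(round(p_star, 4), round(h(p_star), 4))
```

The value of $h(p)$ at the solution is the per-user rate of the scheme.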

Substituting the maximum value for $R_V$ from (3.21) and the erasure probability (3.22) into (3.20) and solving for $p$, we find $p = 0.23766$, hence $(R_1, R_2) = (0.7911, 0.7911)$, which is on the boundary of the capacity region.

This scheme is a special case of the Cover-Leung list decoding scheme. For each received codeword of length $n$, we have a certain number of erasures, $e$. The decoder for $Y$ generates a list with $2^e$ entries, consisting of each possible codeword, assuming all combinations of values for the erased symbols. However superimposed upon this "fresh" information is the resolution information $V$, which can be thought of as the index into the list (for the previous codeword).

Example 3.15 (Gaussian Multiple-Access Channel with Feedback). The capacity region for the white Gaussian noise multiple-access channel with feedback was found by Ozarow [93]. Consider the two-user white Gaussian multiple-access channel, in which for $k = 1, 2$, each source $X_k \in \mathbb{R}$ is subject to an average power constraint $E\left[ X_k^2 \right] \le P_k$. The channel output is given by

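[Figure 3.24: block diagram. In each of Encoder 1 and Encoder 2, the fed-back channel output passes through a map to produce $V$, which is added modulo 2 to the source output before transmission as $X_1$ or $X_2$; the channel adds $X_1$ and $X_2$.]

Fig. 3.24. Simple feedback scheme for the binary adder channel.

[Figure 3.25: transition diagram from $V \in \{0, 1\}$ to $Y \in \{0, 1, 2\}$. $V = 0$ goes to $Y = 0$ with probability $(1 - p)^2$, to $Y = 1$ with probability $2p(1 - p)$ and to $Y = 2$ with probability $p^2$; $V = 1$ goes to $Y = 2$ with probability $(1 - p)^2$, to $Y = 1$ with probability $2p(1 - p)$ and to $Y = 0$ with probability $p^2$. Hence $\Pr(Y = 1) = 2p(1 - p)$.]

Fig. 3.25. Channel seen by V.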

\[ Y = X_1 + X_2 + z, \]

where $z \sim \mathcal{N}(0, \sigma^2)$. The capacity region for this channel under the assumption of perfect feedback to both users is given by

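\[ C = \bigcup_{0 \le \rho \le 1} \left\{ (R_1, R_2) : \begin{array}{l} 0 \le R_1 \le \dfrac{1}{2} \log\left( 1 + \dfrac{P_1}{\sigma^2} (1 - \rho^2) \right), \\[2ex] 0 \le R_2 \le \dfrac{1}{2} \log\left( 1 + \dfrac{P_2}{\sigma^2} (1 - \rho^2) \right), \\[2ex] 0 \le R_1 + R_2 \le \dfrac{1}{2} \log\left( 1 + \dfrac{P_1 + P_2 + 2\rho \sqrt{P_1 P_2}}{\sigma^2} \right) \end{array} \right\}. \]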

The feedback capacity region (for the case $P_1 = P_2 = \sigma^2 = 1$) is shown in Figure 3.26. Also shown for reference is the non-feedback region.

[Figure 3.26: plot of the feedback and non-feedback capacity regions in the $(R_1, R_2)$ plane.]

Fig. 3.26. Capacity region for the two-user GMAC channel with feedback.

In the case of non-white noise, the capacity region is upper-bounded by the non-feedback region, scaled by a factor of two [92].
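The three constraints of the white-noise feedback capacity region of Example 3.15 are simple to evaluate for a given correlation parameter $\rho$. The sketch below does so and tests membership of a rate pair by scanning $\rho$ over a grid. Two assumptions are made here that are not in the text: logarithms are taken to base 2, so rates are in bits per channel use, and the union over $\rho$ is approximated by a finite grid. The function names are illustrative.

```python
from math import log2, sqrt

def feedback_bounds(p1, p2, noise_var, rho):
    """The three bounds of the feedback region for one value of rho, in bits."""
    r1 = 0.5 * log2(1 + p1 * (1 - rho ** 2) / noise_var)
    r2 = 0.5 * log2(1 + p2 * (1 - rho ** 2) / noise_var)
    r_sum = 0.5 * log2(1 + (p1 + p2 + 2 * rho * sqrt(p1 * p2)) / noise_var)
    return r1, r2, r_sum

def in_feedback_region(r1, r2, p1, p2, noise_var, steps=1000):
    """True if (r1, r2) satisfies all three bounds for some rho on a grid in [0, 1]."""
    for i in range(steps + 1):
        b1, b2, b_sum = feedback_bounds(p1, p2, noise_var, i / steps)
        if r1 <= b1 and r2 <= b2 and r1 + r2 <= b_sum:
            return True
    return False

# rho = 0 recovers the non-feedback constraints for P1 = P2 = sigma^2 = 1.
no_feedback = feedback_bounds(1.0, 1.0, 1.0, 0.0)
print([round(b, 4) for b in no_feedback])        # [0.5, 0.5, 0.7925]

# (0.42, 0.42) violates the non-feedback sum constraint but lies in the
# feedback region, for example with rho = 0.2.
print(in_feedback_region(0.42, 0.42, 1.0, 1.0, 1.0))
```

Increasing $\rho$ trades individual rate for sum rate: the single-user bounds shrink while the sum bound grows.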

3.8 Asynchronous Channels

In the preceding discussion of the capacity region, two types of synchronism were assumed. The first was symbol synchronism: the users strictly align their symbol epochs to common time boundaries. The second type of synchronism was frame synchronism, in which the users align their codewords to common time boundaries. We shall see that loss of either type of synchronism changes the region of achievable rates.


Frame-Asynchronism

First, we shall consider the case in which symbol synchronism is maintained, but frame synchronism is not. It has been shown in [100] and [63] that the capacity region for this channel is given by the following theorem.

Theorem 3.16. The capacity region of the symbol synchronous, frame asynchronous memoryless multiple-access channel $(\mathcal{X}_K;\ p(y \mid x_K);\ \mathcal{Y})$ is the closure of

\[ C\left[ p(y \mid x_K) \right] = \bigcup_{\pi(x_K)} R\left[ \pi(x_K),\ p(y \mid x_K) \right], \]

where $R[\cdot]$ is defined in (3.1) and $\pi(x_K)$ is the family of product distributions on the sources.

In other words, the loss of frame synchronism removes only those extra rate points included by the closure of the convex hull operation which are not already included in the union of achievable rate regions. Intuitively, we can see that if frame synchronism is lost, the users cannot coordinate their transmissions to time-share between two achievable points, and thus we cannot apply the convex hull operation. The proof of Theorem 3.16 however still requires that the remaining "union" points are shown to be achievable without frame synchronism.

For many channels, the union region is already convex, for example the binary adder channel and the Gaussian multiple-access channel. For such channels, the removal of frame synchronism has no effect upon the capacity region. We shall see however that there are channels where this is not true.

An interesting question is: does the loss of frame synchronization destroy our single-user interpretation of the capacity region, which relied on a globally known time reference, as discussed in Section 3.6? The answer is no, but we must slightly change our single-user strategy [57].

Complete Asynchronism

In modeling the completely asynchronous case, a new concept must be introduced. Whereas the synchronous and frame-asynchronous channels were discrete time, the completely asynchronous channel must be modeled as a continuous time multiple-access channel⁸. The general form of the capacity region for the completely asynchronous channel is at present unknown. We now present two examples for which the capacity region is known, namely the completely asynchronous Gaussian MAC [148] and the collision channel without feedback [84].

⁸ It is tempting to try to model the channel as discrete time, with smaller time increments, but this is really the same as the symbol synchronous case.


Example 3.16 (Completely Asynchronous Gaussian MAC). The completely asynchronous two-user Gaussian multiple-access channel has been studied by Verd´ u [148]. The model differs a little from the synchronous GMAC model, since we now have a continuous time waveform channel. Each user k transmits a length n codeword (xk [1], xk [2], . . . , xk [n]) ∈ Rnk , at a rate of one codeword symbol every T seconds, by sending the linear modulation of a fixed “signature” waveform sk (t). The signature waveforms are zero outside the interval [0, T ]. The channel output is therefore represented as n n



y(t) = x1 [i]s1 (t − iT − τ1 ) + x2 [i]s2 (t − iT − τ2 ) + z(t), i=1

i=1

where z(t) is white Gaussian noise with power spectral density σ², and the delays τ1, τ2 ∈ [0, T) introduce the symbol asynchronism. It is necessary to assume that the receiver knows these delays, but they are unknown to the transmitters. We must also apply an energy constraint,

$$\frac{1}{n} \sum_{i=1}^{n} x_k^2[i] \le w_k.$$

The capacity region is found by considering an “equivalent” discrete time channel, which has the same capacity as the continuous time one. This can be done if the outputs of the discrete time channel are sufficient statistics for the original channel. It can be shown that the outputs of filters matched to s1(t) and s2(t), sampled at iT + τ1 and iT + τ2 respectively, are indeed such sufficient statistics. The capacity region is given by

$$
\mathcal{C} = \bigcup \left\{ (R_1, R_2) :
\begin{aligned}
R_1 &\le \frac{1}{4\pi} \int_{-\pi}^{\pi} \log\left(1 + \frac{S_1(\omega)}{\sigma^2}\right) d\omega, \\
R_2 &\le \frac{1}{4\pi} \int_{-\pi}^{\pi} \log\left(1 + \frac{S_2(\omega)}{\sigma^2}\right) d\omega, \\
R_1 + R_2 &\le \inf_{\rho_{12},\rho_{21} \in \Gamma} \frac{1}{4\pi} \int_{-\pi}^{\pi} \log\left(1 + \frac{S_1(\omega) + S_2(\omega)}{\sigma^2} + \frac{S_1(\omega) S_2(\omega)}{\sigma^4} \left[1 - \rho_{12}^2 - \rho_{21}^2 - 2\rho_{12}\rho_{21}\cos\omega\right]\right) d\omega
\end{aligned}
\right\},
$$

where inputs to the channel are stationary Gaussian processes with power spectral densities S1(ω) and S2(ω), with Sk(ω) ≥ 0, ω ∈ [−π, π], and the union is over all processes conforming to the energy constraint

$$\frac{1}{2\pi} \int_{-\pi}^{\pi} S_k(\omega)\, d\omega = w_k.$$

The ρij terms are the cross correlations between the signature waveforms (assuming τ1 ≤ τ2):

$$\rho_{12} = \int_0^T s_1(t)\, s_2(t + \tau_1 - \tau_2)\, dt, \qquad \rho_{21} = \int_0^T s_1(t)\, s_2(t + T + \tau_1 - \tau_2)\, dt.$$

The infimum in the last constraint is over Γ = {(ρ12, ρ21)}, which is the set of possible cross correlations, given the signature waveforms. This represents the fact that the users do not know their relative time offsets, and hence do not know their cross correlations. The capacity region therefore depends upon the cross-correlation between the users’ signature waveforms. It is interesting to see what happens if the users are assigned the same waveform. In this case, Γ contains (0, 1) and 1 − ρ12² − ρ21² − 2ρ12ρ21 cos ω = 0, which is the minimum value that it can take. It is now easy to see that the resulting capacity region is exactly the same as that of the synchronous channel of Example 3.4⁹. If the waveforms are however different, the resulting capacity region is a larger polytope, with rounded corners. The rounded corners are due to the fact that there is no combination of S1(ω) and S2(ω) which simultaneously maximizes all constraints. Figure 3.27 shows the capacity regions for an equal power asynchronous channel. Region A is for the completely correlated waveform case. Region B is for waveforms that are orthogonal when τ1 = τ2. Note that in either case, the capacity region is still convex.

⁹ If it is not so easy to see, consider each user as a white Gaussian process, which together with (3.23) gives Sk(ω) = wk.

Fig. 3.27. Capacity region for symbol-asynchronous two-user Gaussian multiple-access channel. (Axes: R1, R2; regions A and B.)

Example 3.17 (Collision Channel without Feedback). The collision channel without feedback was proposed by Massey and Mathys [84]. This channel has a number of distinct features, which make it an interesting comparison to the channels already described. In particular, it attempts to model random accessing to the channel. The channel is described as follows. User k sends a packet of fixed duration T, with some probability, pk. Users’ transmissions are not synchronized in any way, and there exists no feedback path to the users, so they can never determine the degree of asynchronism, whereas the receiver can. At the receiver, a collision is said to have occurred if two or more packets overlap by any nonzero time duration. Any packet involved in such a collision is destroyed, and its information lost. In the absence of a collision, packets are assumed to be received successfully, and all the information contained is retrieved. It is shown in [84] that the capacity region and the zero-error¹⁰ capacity region coincide, and this region further coincides with the corresponding regions obtained if slot (frame) synchronism is allowed.

¹⁰ By zero-error it is meant that there exists a coding scheme such that P̄e = 0. See [124].

The outer boundary of the capacity region is given by

$$R_k \le p_k \prod_{\substack{j=1 \\ j \ne k}}^{K} (1 - p_j),$$

where pj ≥ 0 and $\sum_{j=1}^{K} p_j = 1$. This is shown for two users in Figure 3.28. This region is not convex, and in fact the convexity of its first orthant complement was proved by Post in [103]. The symmetric capacity (the maximum sum rate achievable with every user at the same rate) of this channel approaches 1/e for large systems, which is equal to the maximum throughput of slotted ALOHA for an infinite number of identical users [13].
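The boundary expression above is easy to evaluate numerically. The following is a minimal sketch, not taken from the text: the function names `collision_rates` and `symmetric_sum_rate` are illustrative, and the symmetric operating point is assumed to be pj = 1/K for every user, which is the only equal-rate choice satisfying the constraint that the pj sum to one.

```python
import math

def collision_rates(p):
    """Boundary rates R_k = p_k * prod_{j != k} (1 - p_j) for probabilities p."""
    rates = []
    for k, pk in enumerate(p):
        prod = 1.0
        for j, pj in enumerate(p):
            if j != k:
                prod *= 1.0 - pj
        rates.append(pk * prod)
    return rates

def symmetric_sum_rate(num_users):
    """Sum rate when every user transmits with probability 1/K (assumed symmetric point)."""
    return sum(collision_rates([1.0 / num_users] * num_users))

two_user_point = collision_rates([0.5, 0.5])   # one boundary point for K = 2
large_system = symmetric_sum_rate(1000)        # compare with 1/e
limit = math.exp(-1)
```

For pj = 1/K the sum rate equals (1 − 1/K)^(K−1), which is why the value for a large number of users is close to 1/e.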

Fig. 3.28. Capacity region for two-user collision channel without feedback. (Axes: R1, R2.)

4 Multiuser Detection

4.1 Introduction In this chapter we explore basic detection techniques for linear multiple-access channels. These techniques exploit the structure of the interference signals and do not assume the interference to be uncorrelated Gaussian noise as is the case in conventional correlation detection. They make use of the fact that the interference signal is information bearing. We will make a distinction between multiuser detection which deals with (uncoded) systems and essentially demodulates the transmitted signals without regard to any time correlation among the data symbols (coding), and multiuser decoding, which explicitly includes data dependencies in the form of forward error control coding (FEC). Multiuser decoding is treated in detail in Chapter 6, but these decoding methods are built on the detection techniques discussed here. Multiuser detection was first proposed for CDMA system in [121, 142, 147]. In [147] multiuser detection was shown to be able to handle the debilitating effect of different received power levels, the so-called near-far problem. This problem occurs when the signal of a close-by user drowns the signals of users further away which are received with less power. With conventional correlation reception, CDMA signals are very susceptible to the near-far problem, see Section 4.5.1. In cellular networks [134, 135], power control assures that all the users’ received powers are kept approximately equal. Since no joint detection is attempted, the near-far problem can only be avoided by carefully monitoring and adjusting the transmission power levels of the transmitting users. The complexity of power control resides in the network. Multiuser detection, however, can alleviate or eliminate this network complexity by translating it into computational receiver complexity. Apart from the promise of higher spectral and power system efficiencies, it is this capability to eliminate the near-far problem which makes multiuser detection so interesting. 
Given the inexorable tendencies of Moore’s law, however, multiuser detection and decoding have come close to being practical and economically interesting. Avoiding unnecessary network complexity and locating it in the receiver instead has the potential to make future networks far more resource efficient.

Figure 4.1 lists some of the classes of multiuser detectors (and decoders) discussed in this book, loosely ordered by computational complexity and performance (not to scale). At the bottom of the diagram we find the conventional correlation detector, which does not use explicit joint detection principles and treats the signals of the interfering users as additive uncorrelated noise. It is a most simple receiver, based on conventional point-to-point transmission techniques, and affords adequate performance as long as the number of users is limited, usually to a fraction of the processing gain of the system [97, 154]. This kind of receiver is based on low-complexity terminals, but requires a sophisticated network system which provides control functions such as power control.

Fig. 4.1. Classification of multiuser detection and decoding methods. (Axes: complexity versus performance; columns: uncoded systems and coded systems. Labels recovered from the figure: optimal detection and optimal decoding (Viterbi algorithm, APP algorithm), exponential complexity; iterative decoders (turbo decoders), linear front end decoders, statistical interference cancellation (decorrelator, LMMSE, multistage) and cancellation/bounded search (tree search, sphere detector, list detection), polynomial complexity; these classes are marked near-far resistant. Correlation detection (IS-95, cdma2000, 3GPP) is marked not near-far resistant.)

At the top of the diagram of Figure 4.1, the most complex receivers perform a maximum-likelihood (ML) estimate of the data sequences of all the users. This receiver is near-far resistant, and forms a benchmark for ideal performance. Its computational complexity makes it largely uninteresting as a practical detector. We will discuss optimal detection in detail in Section 4.2. We also note that optimal, or ML-decoding, is not necessary to approach


the capacity limit of the multiuser channel, since in the proof of Shannon’s capacity theorems (Chapter 3) an ML detector does not have to be assumed. Between the extremes, we find a number of detectors, which all share the property that they are near-far resistant, or nearly so. We have somewhat arbitrarily grouped them into three classes. The first class are the statistical interference cancelers. The decorrelator and minimum mean-square error detectors discussed in Sections 4.4.2 and 4.4.5, are the main representatives of this class. The term statistical interference cancellation refers to the fact that these receivers do not attempt to perform actual interference signal cancellation, but use statistical properties of the received and interfering signals in order to provide improved performance. These methods work surprisingly well with complexities much lower than that for optimal detection, primarily due to the fact that for the purpose of statistical cancellation, complete receivers for the interfering users are not needed. The next class of multiuser detectors are the actual interference cancelers. Many of these detectors were originally of ad-hoc design. Typically they either decode users via some form of successive cancellation, or they are approximations to the optimum detector. Systems which successively cancel users’ signals start by power-ordering the users, then subtract the influence of stronger users from the received signal and proceed recursively to decode subsequent users. This works well as long as the detection of the different users’ transmitted symbols is correct. In fact, assuming error-free detection at each stage, we show in Section 5.2.2 that such successive cancellation receivers can achieve the capacity of the multiple-access channel. Without accurate decoding, however, such receivers suffer from potentially debilitating error propagation. 
In Chapter 5 on implementation aspects of multiuser detectors, we show that such interference cancelers can be viewed as variants of iterative implementations of the statistical interference cancelers, making cancellation techniques the most important practical detection and decoding concept for multiuser systems. The interference cancelers approximate the optimal detector. Among other approximations to the optimal detector are limited search algorithms such as branch-and-bound tree search algorithms as well as the sphere decoder. They typically suffer from losing the correct signal hypothesis, but provide excellent performance for a given computational effort.

More recently, principles of iterative decoding have successfully been applied to the problem of joint decoding of CDMA systems. Iterative decoding has become popular with the comet-like ascent of turbo and low-density parity-check codes [12, 81, 120] in recent years. Their application to linear multiple-access channels has led to very efficient iterative decoders. We will treat such iterative decoding systems in Chapter 6.

All of the CDMA multiuser detection methods are based on the particular channel model which arises from CDMA transmission, i.e. on the canonical linear-algebraic channel from Section 2.3

$$\mathbf{r} = \mathbf{S}\mathbf{A}\mathbf{d} + \mathbf{z}. \tag{4.1}$$

Recall that S is the modulation matrix whose columns are the spreading sequences of the different users and different transmission time intervals. In the sequel we will set the amplitude matrix A = I in some of the derivations for convenience. We do this only where A does not play an important role and its inclusion is a straightforward exercise. Note that the linearity of (4.1) makes many of the multiuser detectors possible, and indeed feasible complexity-wise.
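To make the model (4.1) concrete, the following is a minimal sketch of a real-valued, single-symbol-interval channel followed by correlation (matched filter) detection. It is an illustration under stated assumptions rather than anything from the text: the function names, the two length-4 spreading sequences and the amplitude values are all chosen for the example. With equal amplitudes the correlation receiver decides correctly; when the second user is received four times stronger, the first user's decision flips, which is the near-far problem described in Section 4.1.

```python
def matched_filter_outputs(S, amplitudes, d, noise):
    """Form r = S A d + z and return (r, y) with y = S^T r (real-valued case)."""
    chips = len(S)        # rows of S: chip index
    users = len(S[0])     # columns of S: one spreading sequence per user
    r = [sum(S[c][k] * amplitudes[k] * d[k] for k in range(users)) + noise[c]
         for c in range(chips)]
    y = [sum(S[c][k] * r[c] for c in range(chips)) for k in range(users)]
    return r, y

def sign(value):
    return 1 if value >= 0 else -1

# Two unit-energy, non-orthogonal spreading sequences (cross-correlation 0.5).
S = [[0.5, 0.5],
     [0.5, 0.5],
     [0.5, 0.5],
     [0.5, -0.5]]
d = [+1, -1]              # transmitted symbols of users 1 and 2
no_noise = [0.0] * 4      # noise switched off to isolate the interference effect

_, y_equal = matched_filter_outputs(S, [1.0, 1.0], d, no_noise)
_, y_nearfar = matched_filter_outputs(S, [1.0, 4.0], d, no_noise)

decisions_equal = [sign(v) for v in y_equal]      # correlation receiver, equal powers
decisions_nearfar = [sign(v) for v in y_nearfar]  # correlation receiver, strong user 2
```

Noise is set to zero here on purpose, so that the decision error in the second case is caused by multiple-access interference alone.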

4.2 Optimal Detection

A joint detector considers all available information, either in an optimal or an approximate way. It therefore must jointly decode all the accessing users’ data streams as shown in Figure 4.2, which is a simplified rendition of Figure 2.9. Brute-force optimal joint detection typically translates into a large complexity of the joint detector as we will show below. If the different users employ FEC codes, the complexity of the decoder sharply increases unless cancellation or iterative decoders are used. At any rate, for full exploitation of the CDMA channel capacity the view taken in Figure 4.2 needs to be adopted.

Fig. 4.2. A joint detector considers all available information. (Block diagram: several sources, each followed by an encoder/modulator, map information u to data d; the transmitted signals and noise enter a joint detector for all users, which outputs the estimates û or d̂.)

4.2.1 Jointly Optimal Detection

We will now turn our attention to the derivation of the optimal detector for the CDMA channel [121, 142, 147], which is applicable in general to channels of the form (4.1). By “jointly optimal” we mean the detector which produces the maximum-likelihood estimate of the transmitted uncoded symbols d, and, consequently, minimizes the (vector) error probability Pr(d̂ ≠ d) for the uncoded symbols. The received vector of noise samples z is (mostly) caused by thermal receiver noise, and hence is well described by independent Gaussian noise samples with variance σ² = N0/2, where N0 is the one-sided noise power spectral density¹. Armed with these assumptions, we expressed the conditional probability of r given d by a multi-variate Gaussian distribution in Section 2.5.4, and derived the ML estimate as

$$\hat{\mathbf{d}}_{\mathrm{ML}} = \arg\min_{\mathbf{d} \in \mathcal{D}^{Kn}} \|\mathbf{r} - \mathbf{S}\mathbf{d}\|_2^2. \tag{4.2}$$

¹ For an introductory discussion of Gaussian noise see e.g. [49, 104, 166].

Note that in the synchronous case this minimization can be carried out independently over each symbol period, which, ironically, does not reduce the complexity of the algorithm, as we will see. We may furthermore neglect the term r∗r in (4.2) and write

$$\hat{\mathbf{d}}_{\mathrm{ML}} = \arg\max_{\mathbf{d} \in \mathcal{D}^{Kn}} \left(2\mathbf{d}^*\mathbf{y} - \mathbf{d}^*\mathbf{S}^*\mathbf{S}\mathbf{d}\right). \tag{4.3}$$

Note that the vector y = S∗r has dimension Kn and its j-th entry, given by (S∗r)[j] = s∗j r, is the correlation of the received vector with the spreading sequence of the k-th user at symbol interval i, where i = ⌈j/K⌉ and k = j − (i − 1)K. (This correlation can also be computed as the output of a filter matched to sj and sampled at time $T_i^{(k)} = jT + \tau_k$. That is, S∗r is the output of a bank of filters matched to the spreading sequences of the K users.) The receiver of (4.3) is shown in Figure 4.3, reproduced from Figure 2.6 for convenience. If the spreading sequences which make up the channel matrix S are orthogonal, S∗S = I, and the channel symbols d are uncorrelated and have constant energy, (4.3) reduces to

$$\hat{\mathbf{d}}_{\mathrm{ML}} = \arg\max_{\mathbf{d}} \left(\mathbf{d}^*\mathbf{y}\right), \tag{4.4}$$

which is easily solved by d̂ML = sgn(y). This is the correlation receiver, which performs no joint detection. In general, however, the term d∗S∗Sd needs to be considered. This term is the source of the complexity of the joint detector, but also of its performance advantage.

Fig. 4.3. Matched filter bank serving as a front-end for an optimal multiuser CDMA detector. (Block diagram: the symbols d1[i], …, dK[i] pass through baseband modulators with spreading sequences s1[i], …, sK[i] and amplitudes √P1, …, √PK; their sum plus the noise z[i] gives r[i], which the matched filter front end correlates with s1[i], …, sK[i] to produce y1[i], …, yK[i] for the optimal detector.)

The spreading sequences sj = sk[i] may be time-varying, i.e. depend not only on k but also on the index i. This is the case for time-varying CDMA which uses sequences for each of the users with periods much longer than L [135]. This is also referred to as random spreading. The other alternative is to use time-invariant CDMA, i.e. sk[i] = sk, where identical, length-L, spreading sequences are used for all symbol transmissions. There are advantages

and disadvantages to both systems; however, random CDMA is becoming the more popular variant in practical applications [134, 135]. The system model of Figure 4.3 and equation (4.3) is equally applicable to both alternatives as well as synchronous and asynchronous transmission, as discussed in Chapter 2. Since optimal detection based on y is possible, the outputs of the correlation detector provide what is known as a sufficient statistic (see Definition A.5), and, while making hard decisions on y is suboptimal in general, the bank of correlators or matched filters serves as a front-end of the optimal detector in Figure 4.3. No information is lost due to the linear correlation operations.

We now show that optimal detection can be performed by a finite-state machine with $2^{K-1}$ states [147], more specifically by a trellis decoder [120]. The observation is based on realizing that the term d∗S∗Sd in (4.3) has band-diagonal form with K − 1 off-diagonal terms and can be recursively evaluated by formulating the problem as a finite-state system that evolves through the row index. The algorithm is identical to the trellis decoding algorithms discussed in [120, Chapter 6], with only minor differences in how the branch metrics are generated. The operation that causes the computational complexity of the optimal detector is the evaluation of the quadratic form d∗S∗Sd. The correlation matrix R = S∗S of the spreading sequences has dimensions Kn × Kn and its l, m-th entry is given by (Section 2.5.2)

$$R_{lm} = \mathbf{s}_l^* \mathbf{s}_m = \mathbf{s}_k^*[i]\, \mathbf{s}_{k'}[i']. \tag{4.5}$$

Recall that the subscripts k and k′ refer to users, and the arguments i and i′ to time intervals. Note that the indexing l, m of the spreading sequences is in the order in which they appear in S, since the distinction between users and time units is irrelevant to the optimal detector, but technically sl = sk[⌊l/K⌋]; k = l mod K. Figure 4.4 shows the structure of the cross-correlation matrix R for three asynchronous users. The fact that R is band-diagonal with width 2K − 1 allows us to evaluate d∗Rd with a trellis decoder with $2^{K-1}$ states as follows. We must evaluate (4.3) for every possible sequence d, that is, we need to calculate a sequence metric

$$\lambda(\mathbf{d}) = 2\mathbf{d}^*\mathbf{y} - \mathbf{d}^*\mathbf{S}^*\mathbf{S}\mathbf{d} = 2\sum_{i=1}^{Kn} d_i y_i - \sum_{i=1}^{Kn}\sum_{j=1}^{Kn} d_i R_{ij} d_j \tag{4.6}$$

and then select

$$\hat{\mathbf{d}}_{\mathrm{ML}} = \arg\max_{\mathbf{d} \in \mathcal{D}^{Kn}} \left(\lambda(\mathbf{d})\right). \tag{4.7}$$

Fig. 4.4. Illustration of the correlation matrix R for three asynchronous users. (Rows and columns are indexed by symbol interval i, i′ and user k, k′; the nonzero entries, such as R(l−2)l, R(l−1)l and Rll, lie in a band around the diagonal.)

To do this, let us define the partial metric at time l − 1 as

$$\lambda_{l-1}(\mathbf{d}) = 2\sum_{i=1}^{l-1} d_i y_i - \sum_{i=1}^{l-1}\sum_{j=1}^{l-1} d_i R_{ij} d_j, \tag{4.8}$$

and write the entire sequence metric in recursive form as

$$\lambda_l(\mathbf{d}) = \lambda_{l-1}(\mathbf{d}) + 2 d_l y_l - \sum_{i=1}^{l-1} 2 d_i R_{il} d_l - R_{ll} |d_l|^2, \tag{4.9}$$

where we have used the crucial fact that R is symmetric. The key observation, first made by Ungerböck in [140] in the context of inter-symbol interference channel equalization, is to note that λl(d) depends only on di, i ≤ l, and not on “future” symbols di, i > l (see Figure 4.4). Furthermore, since we are assuming binary modulation, dk ∈ {+1, −1}, we may neglect all terms |dl|² in the metrics since their contributions are identical for all metrics. We now may modify our partial metrics to

$$\lambda_l(\mathbf{d}) = \lambda_{l-1}(\mathbf{d}) + \underbrace{2 d_l y_l}_{t_0} - \underbrace{\sum_{i=1}^{l-1} 2 d_i R_{il} d_l}_{\text{term 1}} \tag{4.10}$$

$$\phantom{\lambda_l(\mathbf{d})} = \lambda_{l-1}(\mathbf{d}) + b_l, \tag{4.11}$$

where bl, implicitly defined above, plays the role of the branch metric in a trellis decoder [120]. Figure 4.5 illustrates the recursive nature of the computation of λl(d). At each time index l, the previous metric λl−1(d) is updated by two terms, t0 and term 1. The first term t0 depends only on the received signal at time l, i.e., yl, and the data symbol at time l. Term 1 depends on K − 1 previous data symbols di; i = l − K, · · · , l, which will need to be stored. The incremental terms in (4.11) are represented by the wedge-like slice highlighted in Figure 4.5. At the next time interval, a new such slice is added, until the entire quadratic form is computed. Note that it makes no difference if the system is synchronous or asynchronous. In a synchronous system, the triangular components are simply zero, as discussed in Section 2.5.2; however, at its widest point in the matrix, there are still K − 1 symbols which need to be stored to compute the increment.

From Figure 4.5 it is evident that the branch metric bl can be calculated requiring only the K − 1 most recent values (dl−1, . . . , dl−K) and the present value dl from the symbol vector d. We therefore define a decoder state sl−1 = (dl−1, . . . , dl−K) and note that there are a total of $2^{K-1}$ such states. The sequence metrics λl(d) can now be calculated for all d by a trellis decoder as illustrated in Figure 4.6 for three users, where each state has two branches leaving, corresponding to the two possible values of dl, and two branches entering, corresponding to the two possible values of dl−K.

Fig. 4.5. Illustration of the recursive computation of the quadratic form in (4.11). (The slice added at time l consists of t0 and term 1.)

Fig. 4.6. Illustration of a section of the CDMA trellis used by the optimal decoder, shown for three interfering users, i.e. K = 3, causing 8 states. Illustrated is the merger at state s, where each of the paths arrives with the metric (4.12). (State at l: (dl, dl−1, dl−2); the merging paths arrive from states q1 and q2.)

Since we will be working with state metrics we rewrite (4.11) as

$$\lambda_l(s_l) = \lambda_{l-1}(q_{l-1}) + b_l(q_{l-1} \to s_l), \tag{4.12}$$

where sl ranges over all possible $2^{K-1}$ states at time l, and ql−1 is a state one time unit earlier in the trellis for which a connection to sl exists. There are two such states, since there are two paths merging at each state sl. Of these two paths we may eliminate the partial path with the lesser partial metric without compromising (4.7). The resulting optimal detection algorithm has long been known in the error control community as the “Viterbi” algorithm [45, 77, 120, 152]. The ML algorithm, or Viterbi algorithm which computes the complete sequence metrics, for optimal decoding of correlated signal sets is described in Algorithm 4.1.

Algorithm 4.1 (Viterbi decoder for optimal detection).

Step 1: Initialize each of the $S = 2^{K-1}$ states s of the detector with a metric m0(s) = −∞ and survivor d̂(s) = (). Initialize the starting state of the encoder, state s = (0, · · · , 0), with the metric m0(0) = 0. Let l = 1.

Step 2: Calculate the branch metric

$$b_l = d_l y_l - \sum_{j=1}^{K-1} d_{l-j} R_{(l-j)l} d_l, \tag{4.13}$$

for each branch stemming from each state s at time l − 1 for each extension dl.

Step 3: (Add-Compare-Select) For each state s at time l form the sum ml−1(q) + bl for both previous states q which connect to s, and select the larger to become the new state metric, i.e. ml(s) = maxq (ml−1(q) + bl). Retain the path with the largest metric as survivor for state s, and append the symbol dl on the surviving branch to the state survivor, d̂(s) = (d̂(q), d̂l).

Step 4: If l < Kn, let l = l + 1 and go to Step 2.

Step 5: Output the survivor d̂(sm) = (d1, . . . , dKn)(sm) corresponding to the state sm = arg maxs (mKn(s)) which maximizes mKn(sm) as the maximum-likelihood estimate of the transmitted sequence.

The proof in Step 3 above that the merging partial path with the lesser metric can be eliminated without discarding the maximum likelihood solution is standard and can be found, e.g. in [77, 120]. This fact is referred to as the Theorem of Irrelevance.
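The add-compare-select recursion of Algorithm 4.1 can be sketched in a few lines and checked against an exhaustive evaluation of the sequence metric (4.6). This is a hedged illustration, not the book's implementation: it assumes real-valued binary symbols, a correlation matrix that is zero beyond the (K − 1)-th off-diagonal, and it keeps the factor 2 of (4.11) in the branch metric, which scales (4.13) without changing the decision. The names `viterbi_ml`, `brute_force_ml`, `sequence_metric` and `band_matrix` are made up for the example.

```python
import itertools

def sequence_metric(d, y, R):
    """Sequence metric (4.6): 2 d^T y - d^T R d."""
    n = len(d)
    return (2 * sum(d[i] * y[i] for i in range(n))
            - sum(d[i] * R[i][j] * d[j] for i in range(n) for j in range(n)))

def brute_force_ml(y, R):
    """Exhaustive search over all 2^N binary sequences."""
    return list(max(itertools.product((+1, -1), repeat=len(y)),
                    key=lambda d: sequence_metric(d, y, R)))

def viterbi_ml(y, R, K):
    """Trellis search; a state holds the K-1 most recent symbols, newest first.
    A zero entry marks a position before the start, as in the all-zero start state."""
    survivors = {(0,) * (K - 1): (0.0, [])}
    for l in range(len(y)):
        merged = {}
        for state, (metric, path) in survivors.items():
            # Interference from stored symbols d_{l-j}, j = 1..K-1.
            past = sum(state[j - 1] * R[l - j][l] for j in range(1, K) if l - j >= 0)
            for d in (+1, -1):
                branch = 2 * d * y[l] - 2 * d * past
                nxt = ((d,) + state)[:K - 1]
                cand = (metric + branch, path + [d])
                # Add-compare-select: keep the better of the merging paths.
                if nxt not in merged or cand[0] > merged[nxt][0]:
                    merged[nxt] = cand
        survivors = merged
    return max(survivors.values(), key=lambda item: item[0])[1]

def band_matrix(n, offdiag):
    """Symmetric band matrix, unit diagonal, offdiag[m-1] on the m-th off-diagonal."""
    R = [[0.0] * n for _ in range(n)]
    for i in range(n):
        R[i][i] = 1.0
        for m, value in enumerate(offdiag, start=1):
            if i + m < n:
                R[i][i + m] = R[i + m][i] = value
    return R

R = band_matrix(6, [0.3, -0.1])          # two off-diagonals, so K = 3
d_sent = [+1, -1, -1, +1, +1, -1]
y = [sum(R[i][j] * d_sent[j] for j in range(6)) for i in range(6)]  # noiseless y = R d
d_viterbi = viterbi_ml(y, R, K=3)
d_brute = brute_force_ml(y, R)
```

The dictionary of survivors never holds more than one path per state, so the work per symbol is proportional to the number of states rather than to the number of sequences.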

In an asynchronous system with large frame length n it is not necessary to wait until l = Kn before the decisions on the decoded sequence can be made. We may modify the algorithm and obtain a fixed-delay decoder by adding Step 4b and changing Step 5 above as follows:

Step 4b: If l ≥ nt, where nt is a delay taken large enough, typically around 5K, called the truncation length, output dl−nt as the estimated symbol at time l − nt from the survivor d̂(s) = (d1, . . . , dl)(s) with the largest partial metric ml+1(s). If l < Kn, let l = l + 1 and go to Step 2.

Step 5: Output the remaining estimated symbols dl; Kn − nt < l ≤ Kn from the survivor d̂(sm), sm = arg maxs (mKn(s)).

4.2.2 Individually Optimal Detection: APP Detection

While we derived an optimal decoder for the entire set of transmitted symbols, there is no guarantee that the output symbols d̂j in (4.3) are optimal for any given user k. They may even be poor estimates for some of the users. Individually optimal detection calculates

$$\hat{d}_k[i] = \arg\max_{d \in \mathcal{D}} \Pr(d_k[i] = d \mid \mathbf{y}) \tag{4.14}$$

as the marginalization

$$\hat{d}_k[i] = \arg\max_{d \in \mathcal{D}} \sum_{\substack{\mathbf{d} \in \mathcal{D}^{Kn} \\ \mathbf{d}: d_k[i] = d}} \Pr(\mathbf{y} \mid \mathbf{d}) \Pr(\mathbf{d}) \tag{4.15}$$

as shown in Section 2.5.5. The a priori probabilities Pr(d) are identically and uniformly distributed, meaning that no prior information about d is available, unless iterative decoding systems make repeated use of (4.15), in which case Pr(d) is computed by external soft-decision decoders, as is the practice with turbo decoding systems [120]. The marginalization sum in (4.15) grows exponentially with the number of users. It can, however, be calculated relatively efficiently by a bi-directional trellis search algorithm, the BCJR or APP algorithm [9, 120], which operates in the same trellis as the Viterbi algorithm discussed. The state space complexity, however, is still exponential in K − 1, and exact calculation of (4.15) is rarely an option.

The purpose of the bi-directional a posteriori probability (APP) algorithm is the calculation of (4.14), by carrying out the marginalization (4.15) in an efficient manner using the trellis description of the correlation between symbols. To accomplish this, the algorithm first calculates the probability that the trellis model of the CDMA system traversed a specific transition, i.e. the algorithm computes Pr[sl−1 = q, sl = s|y], where sl is the state at time l, and sl−1 is the state at time l − 1. The algorithm computes this probability as the product of three terms, i.e.

$$\Pr[s_{l-1} = q, s_l = s \mid \mathbf{y}] = \frac{1}{\Pr(\mathbf{y})} \Pr[s_{l-1} = q, s_l = s, \mathbf{y}] \propto \alpha_{l-1}(q)\, \gamma_l(s, q)\, \beta_l(s). \tag{4.16}$$

The α-values are internal variables of the algorithm and are computed by a forward recursion through the CDMA trellis

$$\alpha_{l-1}(q) = \sum_{\text{states } p} \alpha_{l-2}(p)\, \gamma_{l-1}(q, p). \tag{4.17}$$

This forward recursion evaluates α-values at time l − 1 from previously calculated α-values at time l − 2, and the sum is over all states p at time l − 2 that connect with state q at time l − 1. The α values are initiated as α0(0) = 1, α0(s ≠ 0) = 0. This enforces the boundary condition that the encoder starts in state s = (0, · · · , 0). The β-values are calculated by an analogous procedure, called the backward recursion

$$\beta_l(s) = \sum_{\text{states } t} \beta_{l+1}(t)\, \gamma_{l+1}(t, s), \tag{4.18}$$

which is initialized as βKn(0) = 1, βKn(s ≠ 0) = 0 to enforce the terminating condition of the trellis representing the CDMA system. The sum is over all states t at time l + 1 to which state s at time l connects. The forward and backward recursions are illustrated in Figure 4.7.

Fig. 4.7. Illustration of the forward and backward recursion of the APP algorithm for individually optimal detection. (Around the transition from state q to state s at time l, the forward recursion runs left to right from states p1, p2, and the backward recursion runs right to left from states t1, t2.)
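The recursions (4.17) and (4.18) can be sketched for a generic trellis in which the transition weights γ are simply given as inputs. The code below is an illustration under assumptions, not the book's algorithm listing: states are numbered 0, 1, …, the trellis starts and ends in state 0 as in the boundary conditions above, and `gamma[l][q][s]` stores the weight of the transition from state q at time l to state s at time l + 1 (the text writes the new state first, γ(s, q)). A brute-force enumeration of all paths is included as a check.

```python
import itertools

def forward_backward(gamma, num_states):
    """Transition posteriors proportional to alpha * gamma * beta, as in (4.16)."""
    n = len(gamma)
    alpha = [[0.0] * num_states for _ in range(n + 1)]
    beta = [[0.0] * num_states for _ in range(n + 1)]
    alpha[0][0] = 1.0      # trellis starts in state 0
    beta[n][0] = 1.0       # trellis terminates in state 0
    for l in range(1, n + 1):                       # forward recursion (4.17)
        for s in range(num_states):
            alpha[l][s] = sum(alpha[l - 1][q] * gamma[l - 1][q][s]
                              for q in range(num_states))
    for l in range(n - 1, -1, -1):                  # backward recursion (4.18)
        for q in range(num_states):
            beta[l][q] = sum(gamma[l][q][t] * beta[l + 1][t]
                             for t in range(num_states))
    total = alpha[n][0]                             # plays the role of Pr(y)
    return [[[alpha[l][q] * gamma[l][q][s] * beta[l + 1][s] / total
              for s in range(num_states)]
             for q in range(num_states)]
            for l in range(n)]

def brute_force_posteriors(gamma, num_states):
    """Same quantities obtained by enumerating every path from state 0 to state 0."""
    n = len(gamma)
    post = [[[0.0] * num_states for _ in range(num_states)] for _ in range(n)]
    total = 0.0
    for middle in itertools.product(range(num_states), repeat=n - 1):
        path = (0,) + middle + (0,)
        weight = 1.0
        for l in range(n):
            weight *= gamma[l][path[l]][path[l + 1]]
        total += weight
        for l in range(n):
            post[l][path[l]][path[l + 1]] += weight
    return [[[v / total for v in row] for row in step] for step in post]

# Hypothetical transition weights for a two-state, three-step trellis.
gamma = [[[0.6, 0.4], [0.3, 0.7]],
         [[0.2, 0.8], [0.5, 0.5]],
         [[0.9, 0.1], [0.3, 0.7]]]
posteriors = forward_backward(gamma, 2)
reference = brute_force_posteriors(gamma, 2)
```

The enumeration visits every path, whereas the two recursions visit every transition once in each direction, which is the source of the efficiency gain mentioned in the text.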

The γ values are conditional transition probabilities, and are the inputs to the algorithm. In order to compute the γl(s, q) values, recall that the algorithm needs to compute (4.15), where we now focus on the sequence d that carries the trellis path through the states q and s as illustrated in Figure 4.7. The probability Pr(d) breaks into a product of individual probabilities, and the one affecting the transition in question is Pr[sl = q|sl+1 = s] = Pr(dl). The term

$$\Pr(d_l \mid \mathbf{y}) \propto \exp\left(2\mathbf{d}^*\mathbf{y} - \mathbf{d}^*\mathbf{S}^*\mathbf{S}\mathbf{d}\right) \tag{4.19}$$

can be broken into three terms for the path through (q, s), and decomposed using the partial sequence metric formulation (4.12) from the preceding section into

$$\Pr(s_{l-1} = q, s_l = s \mid \mathbf{y}) \propto \underbrace{\exp\left(\lambda_{l-1}(q)\right)}_{\alpha_{l-1}(q)}\; \underbrace{\exp\left(b_l(q \to s)\right)}_{\gamma_l(s,q)}\; \underbrace{\exp\Biggl(\sum_{j=l+1}^{Kn} b_j\Biggr)}_{\beta_l(s)} \tag{4.20}$$

which are the three factors from (4.16). From (4.20) we see that

$$\gamma_l(s, q) = \exp\left(b_l(q \to s)\right) \Pr(d_l) \tag{4.21}$$

$$\phantom{\gamma_l(s, q)} = \exp\Biggl(d_l y_l - \sum_{j=1}^{K-1} d_{l-j} R_{(l-j)l} d_l\Biggr) \Pr(d_l). \tag{4.22}$$

This factor Pr(dl) can be used to account for a priori probability information on the user data d in iterative systems. The a posteriori symbol probabilities Pr(dk[i] = d|y) can now be calculated from the a posteriori transition probabilities (4.16) or (4.20) by summing over all transitions corresponding to dl = 1, and, separately, by summing over all transitions corresponding to dl = −1 (with dl = dk[i]). A formal description of this algorithm can be found in [120], where a rigorous derivation is presented. This derivation was first given by Bahl et al. [9].

4.2.3 Performance Bounds – The Minimum Distance

Determining the performance of a general trellis decoder analytically is a known difficult problem, and one is often satisfied with finding bounds or approximations. The minimum squared Euclidean distance between two possible sequences (signal points) can serve as a measure of performance for an optimal detector. It is defined as

$$d_{\min}^2 = \min_{\substack{\mathbf{d}', \mathbf{d} \in \mathcal{D}^{Kn} \\ \mathbf{d} \ne \mathbf{d}'}} \|\mathbf{S}(\mathbf{d} - \mathbf{d}')\|_2^2. \tag{4.23}$$

A coarse approximation of the error performance for small noise values can be obtained as (see [104, Page 256])

$$P_e \approx A_{d_{\min}}\, Q\left(\sqrt{\frac{d_{\min}^2}{2N_0}}\right), \tag{4.24}$$

where $Q\bigl(\sqrt{d_{\min}^2/(2N_0)}\bigr)$ is simply the probability of error between two equally likely signals with squared Euclidean distance (energy) $d_{\min}^2$ in additive white Gaussian noise of complex variance N0, and $A_{d_{\min}}$ is the number of such minimum distance neighbors, which is known as the multiplicity of the minimum distance signal pair.

Unfortunately, the calculation of $d_{\min}^2$ is computationally intensive via (4.23) except for small values of K, since the search space grows exponentially with K. In fact, the calculation of (4.23) is known to be NP-complete [149]. However, this does not necessarily mean that the calculation of $d_{\min}^2$ is impossible [118]. We need to calculate

$$d_{\min}^2 = \min_{\substack{\mathbf{d}', \mathbf{d} \in \mathcal{D}^{Kn} \\ \mathbf{d} \ne \mathbf{d}'}} (\mathbf{d} - \mathbf{d}')^* \mathbf{A}\mathbf{S}^*\mathbf{S}\mathbf{A} (\mathbf{d} - \mathbf{d}') \tag{4.25}$$

where we have re-introduced the matrix A of amplitudes. If S is not rank deficient, S∗S is positive definite, and there exists a unique lower-triangular², non-singular matrix F of dimension Kn × Kn, such that S∗S = F∗F. This is known as the Cholesky decomposition [55]. Using this decomposition and (4.25) we obtain

$$d_{\min}^2 = \min_{\substack{\mathbf{d}', \mathbf{d} \in \mathcal{D}^{Kn} \\ \mathbf{d} \ne \mathbf{d}'}} \|\mathbf{F}\mathbf{A}(\mathbf{d} - \mathbf{d}')\|_2^2. \tag{4.26}$$

Now let εj = dj − d′j for j = 1, . . . , Kn. Then equation (4.26) can be rewritten as the sum of squares given by

$$d_{\min}^2 = \min_{\substack{\mathbf{d}', \mathbf{d} \in \mathcal{D}^{Kn} \\ \mathbf{d} \ne \mathbf{d}'}} \sum_{l=1}^{Kn} \delta_l^2, \tag{4.27}$$

where, due to the lower triangular nature of F,

where, due to the lower triangular nature of F, 2

Sometimes the Cholesky factorization 3 yields an upper-triangular matrix F, but 2 1 6 1 7 7 is a permutation matrix, is lower-triangular the matrix PF, where P = 6 5 4 · 1 and also complies with the decomposition, i.e. R = (PF)∗ PF = F∗ F.

4.2 Optimal Detection



δl2 = ⎝

l

j=1

⎞2

Pj Flj εj ⎠ .

111

(4.28)

Since $\delta_l^2$ depends only on $\varepsilon_1, \ldots, \varepsilon_l$, we can use a bounded tree search to evaluate the minimum value of (4.27). This branch-and-bound algorithm starts at a root node and branches into two new nodes at each level. The nodes at level $l$ are labeled with the symbol differences $(\varepsilon_1, \ldots, \varepsilon_l)$ leading to them, and the node weight is $\Delta_l^2 = \sum_{j=1}^{l} \delta_j^2$. The branch connecting the two nodes $(\varepsilon_1, \ldots, \varepsilon_l)$ and $(\varepsilon_1, \ldots, \varepsilon_{l+1})$ is labeled by $\delta_{l+1}^2$. The key observation now is that only a small part of this tree needs to be explored. Because $\delta_l^2$ is non-negative, the node weights can only increase, and most nodes can quickly be discarded from further consideration once their weight exceeds some threshold. This threshold can initially be chosen to be an estimate of the minimum distance. For instance, from the single-user bound, achieved by orthogonal spreading sequences, we know that $d_{\min}^2 \ge 2 \min(P_j)$. If we are interested in the minimum distance of a specific symbol $k$, we modify (4.27) to

\[ d_{\min,k}^2 = \min_{\substack{\mathbf{d}', \mathbf{d} \\ d_k \neq d'_k}} \sum_{l=1}^{Kn} \delta_l^2 . \tag{4.29} \]

Algorithm 4.2, illustrated in Figure 4.8, finds the minimum distance of user $k$. The algorithm is an adaptation of the T-algorithm [120] to this search problem. This basic concept of a tree search, combined with branch-and-bound methods, forms the basis for a large number of decoding algorithms, such as the sphere decoders, to be discussed in Chapter 5.

Algorithm 4.2 (Finding $d_{\min}^2$ of a CDMA Signal Set).

Step 1: Initialize $l = 1$ and activate the root node, denoted by $()$, at level 0 of the search tree with $\Delta_0^2 = 0$.

Step 2: Compute the node weight $\Delta_l^2 = \Delta_{l-1}^2 + \delta_l^2$ for all extensions from active nodes at level $l - 1$.

Step 3: Deactivate nodes whose weight exceeds the preset threshold.

Step 4: Let $l = l + 1$, and if $l < Kn$ go to Step 2, otherwise stop and output $d_{\min}^2 = \min_{\text{active nodes}} \Delta_{Kn}^2$.

Note that since nodes whose weight increases above the threshold are dropped, the number of branches searched by this algorithm is significantly smaller than the total number of tree branches of the full tree ($3^{Kn}$).
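The search of Algorithm 4.2 can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the implementation used in [118]: it treats the real-valued synchronous case, assumes antipodal symbols so that each symbol difference takes one of the three values $-2, 0, +2$ (consistent with the $3^{Kn}$ branch count quoted above), and takes the pruning threshold as a user-supplied argument. The function names `lower_factor`, `dmin2_tree_search` and `dmin2_brute_force` are hypothetical helpers introduced here for illustration.

```python
import math
from itertools import product


def lower_factor(R):
    """Lower-triangular F with F^T F = R, for a real symmetric positive definite R."""
    n = len(R)
    # Ordinary Cholesky (L L^T) of the index-reversed matrix P R P.
    Rr = [[R[n - 1 - i][n - 1 - j] for j in range(n)] for i in range(n)]
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = Rr[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    # F = P L^T P is lower triangular and satisfies F^T F = R.
    return [[L[n - 1 - j][n - 1 - i] for j in range(n)] for i in range(n)]


def dmin2_tree_search(R, P, threshold):
    """Bounded breadth-first tree search for d_min^2 (real, synchronous case)."""
    n = len(R)
    F = lower_factor(R)
    amp = [math.sqrt(p) for p in P]
    active = [((), 0.0)]  # (symbol differences so far, node weight)
    for l in range(n):
        survivors = []
        for eps, weight in active:
            for e in (-2, 0, 2):  # assumed values of d_l - d'_l
                path = eps + (e,)
                delta = sum(amp[j] * F[l][j] * path[j] for j in range(l + 1))
                new_weight = weight + delta * delta
                if new_weight <= threshold + 1e-12:  # prune nodes above the threshold
                    survivors.append((path, new_weight))
        active = survivors
    # The all-zero path corresponds to d = d' and is excluded.
    weights = [w for eps, w in active if any(eps)]
    return min(weights) if weights else None


def dmin2_brute_force(R, P):
    """Exhaustive evaluation of (4.25), for comparison on small examples only."""
    n = len(R)
    amp = [math.sqrt(p) for p in P]
    best = math.inf
    for eps in product((-2, 0, 2), repeat=n):
        if any(eps):
            v = [amp[j] * eps[j] for j in range(n)]
            q = sum(v[i] * R[i][j] * v[j] for i in range(n) for j in range(n))
            best = min(best, q)
    return best
```

In this sketch a call such as `dmin2_tree_search(R, P, 4 * min(P))` uses the distance of a single-symbol error with unit-energy signatures as the threshold; that choice is an assumption of the example, and any valid upper estimate of the minimum distance would serve.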

[Figure 4.8: search tree with branch labels $\delta_1^2, \ldots, \delta_6^2$ along the depth of the tree; the horizontal extent is the width of the tree.]

Fig. 4.8. Bounded tree search.

However, the problem (4.28) is still NP-complete, as shown in [118]. The key lies in the fact that the above algorithm finds $d_{\min}^2$ very efficiently in most cases. In fact, Schlegel and Lei [118] performed searches for a synchronous ($n = 1$) CDMA system with length-31 random spreading sequences, and found the empirical distribution of the minimum distances shown in Figure 4.9. The figure also shows the width of the search tree at each depth for the worst cases found in both search experiments. The maximum width of the active tree, 48,479 for 31 users, is significantly less than the total number of possible tree branches, which is $3^{31} \approx 6.2 \times 10^{14}$.

[Figure 4.9: left panel, relative frequency of signature waveforms versus $d_{\min}$; right panel, width of the search tree (logarithmic scale) versus tree depth $l$, each for 31 users and 20 users.]

Fig. 4.9. Histograms of the distribution of the minimum distances of a CDMA system with length-31 random spreading sequences, for $K = 31$ (dashed lines) and $K = 20$ users (solid lines), and maximum width of the search tree.

4.3 Sub-Exponential Complexity Signature Sequences

While optimal detection is, in general, NP-hard, meaning that an exact optimal detector, whether jointly or individually optimal, must expend a computational effort which is exponential in the number of users, we have seen that there exist clever search algorithms which obtain “near-optimal” results with much reduced complexity. It is difficult to gauge precisely the complexity of such algorithms as a function of the number of users $K$, and many different claims can be found in the literature. However, there exist specific signature sets which have a provably lower complexity, no matter how large the set. For example, if the cross-correlations of the signature set are all non-positive, then there exists an optimal detection algorithm with a complexity of order $O(K^3)$ [113]. Another set for which sub-exponential optimal detection is possible will be presented here. We assume that the cross-correlation values are all equal (or

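maybe nearly so), and given by $\rho$. Such a class of signature sequences has been used as a benchmark [88] for general CDMA systems, and includes signature sequence sets of practical interest, such as synchronous CDMA systems using cyclically shifted m-sequences [139]. Given our assumption of equal cross-correlations, i.e.

\[ R_{ij} = \begin{cases} 1, & i = j \\ \rho, & i \neq j \end{cases} , \tag{4.30} \]

we can rewrite (4.3) as (note that we are focusing on a synchronous example with $n = 1$)

\[ 2\mathbf{d}^*\mathbf{y} - \mathbf{d}^*\mathbf{S}^*\mathbf{S}\mathbf{d} = 2\sum_{i=1}^{K} d_i y_i - \sum_{i=1}^{K}\sum_{j=1}^{K} d_i d_j R_{ij} \tag{4.31} \]

\[ \phantom{2\mathbf{d}^*\mathbf{y} - \mathbf{d}^*\mathbf{S}^*\mathbf{S}\mathbf{d}} = 2\sum_{i=1}^{K} d_i y_i - \rho\sum_{i=1}^{K}\sum_{j=1}^{K} d_i d_j - K(1 - \rho). \tag{4.32} \]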

The key observation is that $\sum_{i=1}^{K}\sum_{j=1}^{K} d_i d_j$ depends only on the number of negative (or positive) elements in $\mathbf{d}$, and not on their arrangement. Therefore, let $N(\mathbf{d})$ be the number of elements in $\mathbf{d}$ which are negative, i.e.

\[ N(\mathbf{d}) = \frac{1}{2}\Bigl( K - \sum_{i=1}^{K} d_i \Bigr). \tag{4.33} \]

With this definition, further define the functions

\[ T_1(N(\mathbf{d})) = \rho\sum_{i=1}^{K}\sum_{j=1}^{K} d_i d_j = \rho\,(2N(\mathbf{d}) - K)^2 \tag{4.34} \]

\[ T_2(\mathbf{d}) = -2\sum_{i=1}^{K} d_i y_i . \tag{4.35} \]

Now, since we can ignore the constant term $K(1 - \rho)$, we see that maximizing (4.31) is equivalent to

\[ \hat{\mathbf{d}} = \arg\min_{\mathbf{d} \in \mathcal{D}^{K}} \bigl( T_1(N(\mathbf{d})) + T_2(\mathbf{d}) \bigr). \tag{4.36} \]

Upon inspection, the following observations about the functions $T_1$ and $T_2$ can be made. $T_1(N(\mathbf{d}))$ is convex and possesses a unique extreme point at $N(\mathbf{d}) = K/2$, which is a minimum for $\rho > 0$, and a maximum for $\rho < 0$. The second term, $T_2(\mathbf{d})$, is minimized by $\mathbf{d} = \operatorname{sgn}(\mathbf{y})$, and although it depends on the arrangement of the negative elements in $\mathbf{d}$, we may form a lower bound that depends only on the number of negative terms that appear. For $N(\mathbf{d}) = 0, 1, \ldots, K$ define

\[ \sigma(N(\mathbf{d})) = \begin{cases} -2\displaystyle\sum_{i=1}^{K} y_i \operatorname{sgn}(y_i), & N(\mathbf{d}) = \eta \\[2ex] \sigma(\eta) + 4\displaystyle\sum_{i=\eta+1}^{N(\mathbf{d})} |y_{\pi(i)}|, & N(\mathbf{d}) > \eta \\[2ex] \sigma(\eta) + 4\displaystyle\sum_{i=N(\mathbf{d})+1}^{\eta} |y_{\pi(i)}|, & N(\mathbf{d}) < \eta \end{cases} \tag{4.37} \]

where we have introduced the notational simplification $\eta = N(\operatorname{sgn}(\mathbf{y}))$, and $\pi$ is a permutation among the $y_i$, such that $y_{\pi(1)} \le y_{\pi(2)} \le \cdots \le y_{\pi(K)}$. Now, we conclude that $\sigma(N(\mathbf{d}))$ is convex with a minimum at $\eta$, and it is furthermore clear that

\[ T_2(\mathbf{d}) \ge \sigma(N(\mathbf{d})). \tag{4.38} \]

Equality is achieved in (4.38) if the elements of $\mathbf{d}$ are arranged such that $d_{\pi(i)} = -1$ for $i = 1, 2, \ldots, N(\mathbf{d})$, and the remaining $K - N(\mathbf{d})$ elements are $+1$.

Algorithm 4.3 (Optimal Decoding for Equal Cross-Correlations).

Step 1: Let $\eta = N(\operatorname{sgn}(\mathbf{y}))$.

Step 2: Let $\pi$ be the permutation of the indices $1, 2, \ldots, K$ such that $y_{\pi(i)}$ is non-decreasing with $i$.

Step 3: Calculate the functions $T_1$ and $\sigma$ according to (4.34) and (4.37). Also, find

\[ j = \arg\min_{m = 1, \ldots, K} \bigl( T_1(m) + \sigma(m) \bigr). \tag{4.39} \]

$j$ is the number of negative elements in the minimizing vector $\hat{\mathbf{d}}$.

Step 4: Output the vector $\hat{\mathbf{d}}$ given by

\[ \hat{d}_{\pi(i)} = \begin{cases} -1, & i = 1, \ldots, j \\ +1, & i = j + 1, \ldots, K. \end{cases} \tag{4.40} \]

The optimality of Algorithm 4.3 is a direct consequence of minimizing the lower bound (4.38) by the permutation $\pi$, and the fact that the output vector meets this bound with equality. The complexity of this detection algorithm is dominated by the sorting operation in Step 2, which in the worst case is of order $O(K \log K)$. However, in many cases a full sort and search will not be required. Schlegel and Grant [117] generalize this decoding algorithm to the case where blocks of signature sequences have a fixed cross-correlation, and show that the algorithm is exponential only in the number of unique cross-correlation values.

We have seen that in many cases optimal detection algorithms with sub-exponential complexity can be found, or that approximate algorithms can obtain solutions with error probabilities close to those of optimal detectors. Nonetheless, a general practical solution to optimal decoding of CDMA is both elusive and unnecessary, as we will see in subsequent sections and chapters.
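The steps of Algorithm 4.3 can be sketched as follows. This is a minimal illustration under the assumptions of the text (synchronous system, binary symbols, all cross-correlations equal to `rho`). Two choices are made here for the example and are not taken from the source: the search over the number of negative symbols also includes $m = 0$ (the all $+1$ vector), and the bound $\sigma(m)$ is evaluated through a running prefix sum of the sorted matched filter outputs rather than through the case distinction in (4.37). The names `detect_equal_crosscorr`, `ml_metric` and `detect_brute_force` are hypothetical.

```python
from itertools import product


def detect_equal_crosscorr(y, rho):
    """Sketch of Algorithm 4.3: optimal +/-1 decisions when R_ij = rho for i != j."""
    K = len(y)
    order = sorted(range(K), key=lambda i: y[i])  # permutation pi, y non-decreasing
    total = sum(y)
    best_m, best_cost, prefix = 0, None, 0.0
    for m in range(K + 1):  # m = candidate number of negative symbols
        if m > 0:
            prefix += y[order[m - 1]]
        t1 = rho * (2 * m - K) ** 2         # T1 from (4.34)
        sigma = 4.0 * prefix - 2.0 * total  # smallest T2 with m negative symbols
        if best_cost is None or t1 + sigma < best_cost:
            best_m, best_cost = m, t1 + sigma
    d_hat = [1] * K
    for i in order[:best_m]:  # the best_m smallest outputs are decided as -1
        d_hat[i] = -1
    return d_hat


def ml_metric(d, y, rho):
    """2 d^T y - d^T R d for the equal cross-correlation matrix R of (4.30)."""
    K, s = len(d), sum(d)
    return 2.0 * sum(di * yi for di, yi in zip(d, y)) - (K + rho * (s * s - K))


def detect_brute_force(y, rho):
    """Exhaustive search over all 2^K vectors, for checking small cases only."""
    return max(product((-1, 1), repeat=len(y)), key=lambda d: ml_metric(d, y, rho))
```

The sort dominates the cost of `detect_equal_crosscorr`, while the subsequent loop over `m` is linear in $K$; the brute-force routine is included only so that the two can be compared on small examples.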

4.4 Signal Layering

Preprocessing of the received signal $\mathbf{r}$ has two functions. A preprocessor can act as a multiuser detector and suppress multiple-access interference, as is the case in the linear filter receivers we discuss in subsequent sections. A preprocessor, however, also acts by conditioning the channel of a single user to generate an “improved” channel for that user. This viewpoint is illustrated in Figure 4.10, where a linear preprocessor is used to create a single-user channel for a given user.

[Figure 4.10: block diagram. Several sources, each followed by an encoder/modulator, feed the multiple-access channel together with a noise source; linear preprocessing of the channel output produces the single-user channel seen by the decoder.]

Fig. 4.10. Linear preprocessing used to condition the channel for a given user (shaded).

This single-user channel has an information theoretic capacity which we will calculate for a number of popular preprocessors. From the results of conventional error control coding it is well known that this single-user capacity can be approached closely by powerful forward error control coding systems [120], that is, single-user capacity achieving codes can be used to approach the “layered capacity”. Calculation of these layered capacities therefore allows us to determine how much capacity is lost by converting the multiple-access channel into parallel single-user channels. This process shall be called channel layering. It can be very effective in reducing the complexity of the joint decoding problem, and, as we will see, in many cases can achieve a performance which rivals or equals that of a full joint detector.

Figure 4.11 shows the layered information theoretic capacities for several cases as a function of the signal-to-noise ratio $E_b/N_0$. More precisely, the curves are Shannon bounds which relate spectral efficiency in bits/Hz (y-axis) to power efficiency (x-axis). The Shannon bound for an AWGN channel is quickly computed from the capacity formula, i.e. from $C_{\mathrm{AWGN}} = \frac{1}{2}\log_2(1 + 2P/N_0)$, we obtain

\[ \frac{E_b}{N_0} \ge \frac{2^{2C_{\mathrm{AWGN}}} - 1}{2C_{\mathrm{AWGN}}} , \tag{4.41} \]

by realizing that at the maximum rate $E_b C_{\mathrm{AWGN}} = P$. Equation (4.41) is the well-known Shannon bound for the AWGN channel.
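As a small numerical illustration of (4.41), the following sketch evaluates the minimum $E_b/N_0$ in dB for a given capacity in bits per dimension. The function name `awgn_shannon_bound_db` is hypothetical, and `math.expm1` is used only to keep the evaluation accurate for very small capacities.

```python
import math


def awgn_shannon_bound_db(capacity):
    """Minimum Eb/N0 in dB according to (4.41), capacity in bits/dimension."""
    # (2^(2C) - 1) / (2C), written with expm1 for accuracy when C is small.
    ratio = math.expm1(2.0 * capacity * math.log(2.0)) / (2.0 * capacity)
    return 10.0 * math.log10(ratio)
```

For example, a capacity of 0.5 bits/dimension gives a ratio of exactly one, i.e. 0 dB, and as the capacity tends to zero the bound approaches $10\log_{10}(\ln 2)$, roughly $-1.59$ dB.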

The Shannon bounds calculated in Figure 4.11 are for random CDMA systems. The CDMA channel capacity was calculated in Chapter 3, where we showed that the loss w.r.t. the AWGN capacity is minimal and vanishing for large loads $\beta \to \infty$, or large $P/N_0 \to \infty$. The two filters used are the minimum mean-square error (MMSE) filter discussed in Section 4.4.5, which produces the best linear layered channel, and matched filtering, which represents conventional correlation detection. It is important to note that the capacity bounds are calculated for equal received powers of all the different users.

[Figure 4.11: capacity in bits/dimension (logarithmic scale, 0.1 to 10) versus $E_b/N_0$ in dB ($-2$ to 16). Curves: AWGN capacity; CDMA with $\beta = 2$ and $\beta = 0.5$; MMSE with $\beta = 0.5$ and $\beta = 2$; matched filter with $\beta = 2$ and $\beta = 0.5$.]

Fig. 4.11. Information theoretic capacities of various preprocessing filters.

The figure (see footnote 3) distinguishes between two system loads, defined as $\beta = K/L$, i.e. the ratio of (active) users to available dimensions $L$, the latter also known as the processing gain. For lower loads, $\beta < 0.5$, the layered capacities of linear filters with equal received power are virtually equivalent to the optimal system capacity, and very little is lost by layering. In contrast, for large loads $\beta > 1$, layering via linear filters becomes inefficient, and significant extra capacity can be gained by going to more complex joint receivers.

Footnote 3: The range of $E_b/N_0$ is chosen to represent typical values in wireless communications systems. Note that $E_b/N_0$ values in excess of 16 dB are rare, since they would imply constellations with 256 points per complex symbol, or larger.

Some interesting conclusions can be drawn from these results. First, if the signal-to-noise ratio (SNR) is small, the system is mostly fighting additive noise, and multiuser detection has only a limited effect on capacity and its complexity may not be warranted. Also, in the low SNR regime, a loss of about 1 dB is incurred by using random spreading codes versus Welch-bound equivalent codes. Furthermore, it becomes evident that not only are linear filters not capable of extracting the channel’s capacity at higher loads, but their performance actually degrades below the capacity of a more lightly loaded system. Both of these effects are due to the fact that the linear filter is suppressing signal space dimensions in the absence of more detailed information about the transmitted signal. This will be explored later in this chapter, as well as the influence of unequal powers on the system capacity.

4.4.1 Correlation Detection – Matched Filtering

We have encountered the correlation or matched filter bank as a front-end of the optimal detector in Section 4.2. If we make simple sign decisions on the received matched filter output vector $\mathbf{y}$, we have the conventional correlation receiver. Interference in the correlation receiver is only suppressed by the processing gain of the spreading sequences used. This suppression is not ideal unless the sequences are orthogonal, and, on average, each user in the system contributes an amount of interference which is approximately equal to its power divided by the processing gain [154]. More precisely, the detection process is given by

\[ \hat{\mathbf{d}} = \operatorname{sgn}(\mathbf{y}). \tag{4.42} \]

For a given user $k$ this means

\[ \hat{d}_k = \operatorname{sgn}\Bigl( d_k + \sum_{j \neq k} d_j R_{kj} + z_k \Bigr). \tag{4.43} \]

Since $R_{kj}$ is the product of two spreading sequences according to (4.5), its variance is straightforward to calculate if we assume that the chips of these spreading sequences are chosen randomly and with equal probability to be $\pm 1/\sqrt{L}$. The interference term in (4.43) is then the sum of a (large) number of random variables with bounded variance, which allows us to apply the central limit theorem and replace

\[ I_k = \sum_{j \neq k} d_j R_{kj} + z_k \tag{4.44} \]

by a Gaussian noise source with zero mean and variance

\[ \operatorname{var}(I_k) = \frac{K - 1}{L} P + \sigma^2 \;\xrightarrow{K, L \to \infty}\; \beta P + \sigma^2 . \tag{4.45} \]
The channel for a single user now simply looks like an AWGN channel with noise variance (4.45), and therefore with capacity [166]

\[ C_{\mathrm{MF}} = \frac{1}{2}\log_2\Bigl( 1 + \frac{P}{\beta P + \sigma^2} \Bigr) \quad \text{bits/use}. \tag{4.46} \]

The capacity per dimension of the overall system is $K$ times this, divided by the processing gain, i.e.

\[ C_{\mathrm{MF}} = \frac{\beta}{2}\log_2\Bigl( 1 + \frac{P}{\beta P + \sigma^2} \Bigr) \quad \text{bits/dimension}. \tag{4.47} \]

Substituting the limiting-rate relation $E_b C_{\mathrm{MF}}/\beta = P$ into the equation above gives the Shannon bound curve shown in Figure 4.11. The deleterious effect of unequal received power can also easily be seen from (4.43). A user whose signal arrives at the receiver $\kappa$ times stronger will show up as $\kappa$ virtual users. This can quickly lead to a significant loss in system capacity.

4.4.2 Decorrelation

Decorrelation has intuitive appeal since it is a linear filtering operation which completely eliminates the multiple-access interference, and one is tempted to conclude that its performance should therefore be close to optimal. We will however see that this is not the case, since in the process of eliminating the multiple-access interference significant and detrimental enhancement of the additive channel noise can occur. This limits what is theoretically possible by decorrelation, and this limitation is especially severe for large system loads. We start by considering the matched filter outputs, which were calculated as

\[ \mathbf{y} = \mathbf{S}^*\mathbf{r} = \mathbf{S}^*\mathbf{S}\mathbf{A}\mathbf{d} + \mathbf{S}^*\mathbf{z}. \tag{4.48} \]

If the spreading sequences in $\mathbf{S}$ are linearly independent, the correlation matrix $\mathbf{R} = \mathbf{S}^*\mathbf{S}$ has an inverse, which can be applied to $\mathbf{y}$ to obtain

\[ \hat{\mathbf{d}} = \mathbf{R}^{-1}\mathbf{y} = \mathbf{A}\mathbf{d} + \mathbf{R}^{-1}\mathbf{S}^*\mathbf{z}. \tag{4.49} \]

From (4.49) we see that $\hat{\mathbf{d}}$ is an interference-free estimate of $\mathbf{d}$. The matrix $\mathbf{H}^{\dagger} = \mathbf{R}^{-1}\mathbf{S}^*$ is the pseudo-inverse of $\mathbf{S}$ [55, 131] and we define the decorrelator as

Definition 4.1 (Decorrelator). The decorrelator detector outputs $\mathbf{d}_{\mathrm{DEC}} = \operatorname{sgn}\bigl( \mathbf{R}^{-1}\mathbf{S}^*\mathbf{r} \bigr)$.

The decorrelating detector, or the decorrelator for short, has been studied in [65, 79, 80]. While the pseudo-inverse $\mathbf{H}^{\dagger}$ always exists, a complete separation of the interfering users is only possible whenever $\mathbf{R}^{-1}$ exists, which is the case whenever $\mathbf{S}$ has full column rank, that is, whenever all the users employ linearly independent spreading sequences. This assumption is reasonable, since if the signals of two or more users form a linearly dependent set, they become inseparable by linear operations, and the detection of the users’ data sequences becomes much more complicated. In fact, sets of linearly dependent users would remain correlated after decorrelation, and joint detection would have to be applied to these subsets.

An interesting derivation of the decorrelator proceeds as follows. Let

\[ \mathbf{d}_{\mathrm{DEC}} = \operatorname{sgn}\Bigl( \arg\min_{\mathbf{d}} \left\| \mathbf{r} - \mathbf{S}\mathbf{A}\mathbf{d} \right\|^2 \Bigr), \tag{4.50} \]

where $\mathbf{d} \in \mathbb{R}^{Kn}$, in contrast to (4.2) where $\mathbf{d}$ was restricted to be from the set $\{+1, -1\}^{Kn}$. This relaxation of the search space turns an NP-complete discrete minimization problem into a continuous quadratic minimization problem. We define $\mathbf{v} = \mathbf{A}\mathbf{d}$ and minimize (4.50) by taking partial derivatives, i.e.

\[ \frac{\partial}{\partial \mathbf{v}} \left\| \mathbf{r} - \mathbf{S}\mathbf{v} \right\|^2 = -2\mathbf{S}^*\mathbf{r} + 2\mathbf{S}^*\mathbf{S}\mathbf{v} = 0 \;\Rightarrow\; \hat{\mathbf{v}} = (\mathbf{S}^*\mathbf{S})^{-1}\mathbf{S}^*\mathbf{r}, \tag{4.51} \]

obtaining the same solution as (4.49). If the amplitudes $\mathbf{A}$ of the transmitted symbols are unknown, then we have just proven that $\hat{\mathbf{v}}$ is the maximum likelihood estimate of $\mathbf{v}$.

4.4.3 Error Probabilities and Geometry

Let us calculate the probability of error for the decorrelator. Assume that $d_k = -1$. For binary modulation, an error occurs if $\hat{d}_{\mathrm{DEC},k} = 1$, and its probability is given by

\[ \Pr\bigl( \hat{d}_{\mathrm{DEC},k} = 1 \,\big|\, d_k = -1 \bigr) = \Pr\bigl( (\mathbf{R}^{-1}\mathbf{S}^*\mathbf{z})_k > \sqrt{P_k} \bigr). \tag{4.52} \]

The quantity $\eta_k = (\mathbf{R}^{-1}\mathbf{S}^*\mathbf{z})_k$ is a Gaussian random variable since it is the sum of weighted Gaussian components. Its expectation and variance are easily computed as

\[ \mathrm{E}[\eta_k] = 0; \tag{4.53} \]

\[ \operatorname{var}(\eta_k) = \mathrm{E}\bigl[ \mathbf{R}^{-1}\mathbf{S}^*\mathbf{z}\mathbf{z}^*\mathbf{S}\mathbf{R}^{-1} \bigr]_{kk} = \sigma^2 \bigl[ \mathbf{R}^{-1} \bigr]_{kk} . \tag{4.54} \]

With this, the error probability is given by the standard Gaussian integral

\[ \Pr\bigl( \hat{d}_{\mathrm{DEC},k} = 1 \,\big|\, d_k = -1 \bigr) = Q\Biggl( \sqrt{\frac{P_k}{\sigma^2 [\mathbf{R}^{-1}]_{kk}}} \Biggr), \tag{4.55} \]

where $P_k$ is the energy per symbol of user $k$. As can be seen from (4.55), the only difference between antipodal signaling in AWGN and the error probability of the decorrelator is the noise enhancement factor $[\mathbf{R}^{-1}]_{kk}$, which can be shown to be always larger than or equal to unity, that is, the decorrelator will always increase the noise and the bit error probability. Furthermore, the performance of user $k$ is independent of the powers of any of the other users, since multiple-access interference is completely eliminated. The actual error rate depends very strongly on the signature sequences, but as $P/\sigma^2 \to \infty$, the error probability $\bar{P}_e \to 0$. This has been referred to as near-far resistance. Furthermore, as the minimum eigenvalue of $\mathbf{R}$, $\lambda_{\min} \to 0$, the noise enhancement coefficient $[\mathbf{R}^{-1}]_{kk} \to \infty$, as can be seen by writing the inverse in terms of the spectral decomposition of $\mathbf{R}$. Hence, any two spreading sequences with a high cross-correlation will degrade the performance of the decorrelator substantially.

Since $\sigma^2 [\mathbf{R}^{-1}]_{kk} \ge \sigma^2$, there is an inherent loss of performance suffered by decorrelation w.r.t. interference-free transmission. For example, consider a synchronous two-user system with

\[ \mathbf{R} = \begin{bmatrix} 1 & \rho & & & \\ \rho & 1 & & & \\ & & \ddots & & \\ & & & 1 & \rho \\ & & & \rho & 1 \end{bmatrix}, \tag{4.56} \]

for which the noise variances can be calculated as follows: $\mathrm{E}[\boldsymbol{\eta}\boldsymbol{\eta}^*] = \sigma^2 \mathbf{R}^{-1}$, and from there $\sigma^2 [\mathbf{R}^{-1}]_{kk} = \sigma^2/(1 - \rho^2)$. Hence, a correlation of 50% ($\rho = 0.5$) implies a loss of 1.25 dB in signal-to-noise ratio. If we do not ignore the noise correlation, the users could share information about the noise and better performance would be possible using an optimal detector. Figure 4.12 shows a geometrical perspective of the decorrelator.
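The two-user example can be checked numerically with a short sketch. It assumes the $2 \times 2$ correlation block of (4.56), for which $[\mathbf{R}^{-1}]_{kk} = 1/(1 - \rho^2)$, and evaluates the SNR loss in dB together with the error probability (4.55). The helper names are hypothetical; the Gaussian tail function is expressed through the complementary error function of the Python standard library.

```python
import math


def q_function(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def noise_enhancement_two_user(rho):
    """[R^-1]_kk for the two-user correlation block [[1, rho], [rho, 1]]."""
    return 1.0 / (1.0 - rho * rho)


def decorrelator_loss_db(rho):
    """SNR loss of the decorrelator relative to interference-free transmission."""
    return 10.0 * math.log10(noise_enhancement_two_user(rho))


def decorrelator_error_prob(P, sigma2, rho):
    """Error probability (4.55) for the two-user example."""
    return q_function(math.sqrt(P / (sigma2 * noise_enhancement_two_user(rho))))
```

For $\rho = 0.5$ the loss evaluates to $10\log_{10}(4/3) \approx 1.25$ dB, in agreement with the figure quoted above, and for $\rho = 0$ the error probability reduces to that of antipodal signaling in AWGN.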
Let us start with the decorrelator output as the pseudo-inverse of the spreading matrix (using a synchronous system for illustration) multiplying the received chip-sampled vector

\[ \hat{\mathbf{d}}_{\mathrm{DEC}} = \mathbf{R}^{-1}\mathbf{S}^*\mathbf{r} = \mathbf{H}^{\dagger}\mathbf{r}. \tag{4.57} \]

The rows $j$ of the pseudo-inverse $\mathbf{H}^{\dagger}$ are orthogonal to the spreading sequences $i \neq j$ in $\mathbf{S}$, as can be seen from

\[ \begin{bmatrix} \mathbf{h}_1^* \\ \vdots \\ \mathbf{h}_K^* \end{bmatrix} \bigl[ \mathbf{s}_1, \cdots, \mathbf{s}_K \bigr] = (\mathbf{S}^*\mathbf{S})^{-1}\mathbf{S}^*\mathbf{S} = \mathbf{I}. \tag{4.58} \]

The vector $\mathbf{v}_j$ in Figure 4.12 is the projection of $\mathbf{s}_j$ onto $\mathbf{h}_j$, which is calculated as

\[ \mathbf{v}_j = (\mathbf{h}_j^*\mathbf{s}_j)\frac{\mathbf{h}_j}{\|\mathbf{h}_j\|^2} = \hat{d}_{\mathrm{DEC},j}\frac{\mathbf{h}_j}{\|\mathbf{h}_j\|^2} . \tag{4.59} \]

It must be orthogonal to all $\mathbf{s}_i$, $\forall i \neq j$, from (4.58). The length of the projection vector equals $\hat{d}_{\mathrm{DEC},j}$ multiplied by $1/\|\mathbf{h}_j\|$. This dependence of the projected length on $\mathbf{h}_j$ and $\hat{d}_{\mathrm{DEC},j}$ is unimportant for symbol-by-symbol detection, but is very important for multiuser decoders using FEC coding. There the metrics need to be adjusted to this scaling factor (see Section 6.2.1).

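[Figure 4.12: the signature $\mathbf{s}_j$, its projection $\mathbf{v}_j$ onto $\mathbf{h}_j$, the complementary projection $\bar{\mathbf{v}}_j$, and the subspaces $\operatorname{span}\{\mathbf{s}_k\}_{k \neq j}$ and $\operatorname{span}\{\mathbf{s}_k\}^{\perp}_{k \neq j}$.]

Fig. 4.12. Geometry of the decorrelator.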

4.4.4 The Decorrelator with Random Spreading Codes

As discussed above, the actual performance of the decorrelator depends heavily on the set of spreading sequences used, and few general conclusions can be drawn. If we consider the use of random spreading sequences, however, very precise results exist. As derived in (4.57) ff., the output of the decorrelator for a given user $k$ consists of a signal power component $P_k (\mathbf{h}_k^*\mathbf{s}_k)^2$ and a noise component $\sigma^2 \mathbf{s}_k^*\mathbf{s}_k$, giving a signal-to-noise ratio of the $k$-th user of

\[ \mathrm{SNR}_k = \frac{P_k (\mathbf{h}_k^*\mathbf{s}_k)^2}{\sigma^2 \mathbf{s}_k^*\mathbf{s}_k} . \tag{4.60} \]

We will show that for large systems the following theorem holds:

Theorem 4.1 (Decorrelator Performance in Random CDMA). If $K \to \infty$ and $L \to \infty$ such that the load $\beta = K/L$ is constant, then

\[ \mathrm{SNR}_k \;\xrightarrow{K, L \to \infty}\; \frac{P_k}{\sigma^2}\,\frac{L - K + 1}{L} . \tag{4.61} \]

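Proof. Assume that $\|\mathbf{h}_k\| = 1$ and let $\mathbf{Q} = [\mathbf{q}_1, \cdots, \mathbf{q}_{K-1}]$ be a basis for $\operatorname{span}\{\mathbf{s}_k\}_{k \neq j}$ of dimension $K - 1$. Since $\mathbf{s}_k$ is random, i.e. each component is randomly chosen,

\[ \mathrm{E}[\mathbf{q}_l^*\mathbf{s}_k] = 0 \tag{4.62} \]

and

\[ \mathrm{E}[\mathbf{q}_l^*\mathbf{s}_k\mathbf{s}_k^*\mathbf{q}_l] = \frac{\mathbf{q}_l \mathbf{I} \mathbf{q}_l^*}{L} = \frac{1}{L} . \tag{4.63} \]

That is, the length of the average projection onto an arbitrary basis vector is $1/L$, irrespective of whether the system is synchronous or not. The average total length of the projection onto $\operatorname{span}\{\mathbf{s}_k\}_{k \neq j}$ is therefore (see Figure 4.12)

\[ \mathrm{E}\bigl[ \bar{\mathbf{v}}_j^*\bar{\mathbf{v}}_j \bigr] = \mathrm{E}\Biggl[ \sum_{\substack{l=1 \\ l \neq k}}^{K} \sum_{\substack{m=1 \\ m \neq k}}^{K} \mathbf{q}_l^*\mathbf{s}_k\mathbf{s}_k^*\mathbf{q}_m \Biggr] = \frac{1}{L}\sum_{l=1}^{K-1} \mathrm{E}[\mathbf{q}_l \mathbf{I} \mathbf{q}_l^*] = \frac{K - 1}{L} \tag{4.64} \]

and, since $\bar{\mathbf{v}}_j^*\bar{\mathbf{v}}_j + \mathbf{v}_j^*\mathbf{v}_j = \mathbf{s}_i\mathbf{s}_i^* = 1$,

\[ \mathrm{E}\bigl[ \mathbf{v}_j^*\mathbf{v}_j \bigr] = \frac{L - K + 1}{L} . \tag{4.65} \]

It remains to show that $\operatorname{var}(\mathbf{v}_j^*\mathbf{v}_j) \to 0$ as $L \to \infty$, which is accomplished in a similar fashion.

From this theorem we can calculate the information theoretic capacity of the decorrelator layered single-user channel using random CDMA. Equation (4.61) states that the symbol energy, or the signal-to-noise ratio, of an arbitrary user $k$ is reduced by the factor $1 - \beta + 1/L$ with respect to interference-free transmission. At the same time, the system has $K$ channels in $L$ dimensions. Therefore, the decorrelator (system) capacity is given as

\[ C_{\mathrm{DEC}} = \frac{\beta}{2}\log\Bigl( 1 + 2\frac{E_b}{N_0}\Bigl( 1 - \beta + \frac{1}{L} \Bigr) \Bigr) \quad \text{bits/dimension}. \tag{4.66} \]

We recall the Shannon bound for an AWGN channel (4.41), and, from (4.66), we can analogously derive the Shannon bound for the decorrelator layered channel as

\[ \frac{E_b}{N_0} \ge \frac{2^{2C/\beta} - 1}{2C(1 - \beta)} , \tag{4.67} \]

which is shown in comparison to the AWGN Shannon bound of (4.41) in Figure 4.13 for a few different system loads $\beta$.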

[Figure 4.13: decorrelator per-user capacity $C_{\mathrm{DEC}}$ (logarithmic scale, 0.1 to 10) versus $E_b/N_0$ in dB ($-2$ to 14). Curves: AWGN capacity and decorrelator capacities for $\beta = 0.25$, $0.5$, $0.75$ and $0.9$.]

Fig. 4.13. Shannon bounds for the AWGN channel and the random CDMA decorrelator-layered channel. Compare with Figure 4.11.

As can be seen from Figure 4.13, system loads in the range of $0.5 \le \beta \le 0.75$ are most efficient. For larger loads the energy loss of the decorrelator is too big, and for smaller loads the spectrum utilization is inefficient.

4.4.5 Minimum Mean-Square Error (MMSE) Filter

The MMSE detector is an estimation theory concept transplanted to the field of (multiuser) detection. The MMSE filter minimizes the variance at the output of the filter, taking into account both channel noise and interference, but ignoring the data structure. As long as this structure is disregarded, the MMSE filter provides the best possible preprocessing. Since the residual noise is asymptotically Gaussian [101], the minimum variance filter is also the best overall minimum variance estimator, and maximizes the capacity of the resulting layered channels.

The MMSE filter minimizes the squared error between the transmitted signals and the output of the filter. It is found as

\[ \mathbf{M} = \arg\min_{\mathbf{M}} \mathrm{E}\bigl[ \left\| \mathbf{d} - \mathbf{M}\mathbf{r} \right\|^2 \bigr]. \tag{4.68} \]

Carrying out this minimization is standard [60], and makes use of the orthogonality principle:

\[ \frac{\partial}{\partial \mathbf{M}} \mathrm{E}\bigl[ \left\| \mathbf{d} - \mathbf{M}\mathbf{r} \right\|^2 \bigr] = -2\,\mathrm{E}\bigl[ (\mathbf{d} - \mathbf{M}\mathbf{r})\mathbf{r}^* \bigr] = 0. \tag{4.69} \]

Evaluating the various terms we obtain

\[ \begin{aligned} \mathrm{E}[\mathbf{d}\mathbf{r}^*] &= \mathbf{M}\,\mathrm{E}[\mathbf{r}\mathbf{r}^*] \\ \mathbf{A}\mathbf{S}^* &= \mathbf{M}\bigl( \mathbf{S}\mathbf{A}\mathbf{A}^*\mathbf{S}^* + \sigma^2\mathbf{I} \bigr) \\ \mathbf{M} &= \mathbf{A}\mathbf{S}^*\bigl( \mathbf{S}\mathbf{W}\mathbf{S}^* + \sigma^2\mathbf{I} \bigr)^{-1} . \end{aligned} \tag{4.70} \]

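See also Section A.4.3. Using the matrix inversion lemma (see footnote 4), an alternate form of the MMSE filter (4.70) can be found as follows:

\[ \begin{aligned} \mathbf{M} &= \mathbf{A}\mathbf{S}^*\bigl( \underbrace{\mathbf{S}}_{B}\,\underbrace{\mathbf{W}}_{C}\,\underbrace{\mathbf{S}^*}_{D} + \underbrace{\sigma^2\mathbf{I}}_{A} \bigr)^{-1} \\ &= \frac{1}{\sigma^2}\mathbf{A}\mathbf{S}^* - \frac{1}{\sigma^2}\mathbf{A}\mathbf{S}^*\mathbf{S}\Bigl( \mathbf{S}^*\frac{1}{\sigma^2}\mathbf{S} + \mathbf{W}^{-1} \Bigr)^{-1}\mathbf{S}^*\frac{1}{\sigma^2} \\ &= \frac{1}{\sigma^2}\mathbf{A}\Bigl( \mathbf{I} - \mathbf{R}\bigl( \mathbf{R} + \sigma^2\mathbf{W}^{-1} \bigr)^{-1} \Bigr)\mathbf{S}^* \\ &= \mathbf{A}^{*-1}\bigl( \mathbf{R} + \sigma^2\mathbf{W}^{-1} \bigr)^{-1}\mathbf{S}^* . \end{aligned} \tag{4.71} \]

Scaling the filter output by $\mathbf{A}^{*-1}$ does not affect hard decisions, hence we propose the following

Definition 4.2 (Minimum Mean-Square Error Detector). The Minimum Mean-Square Error (MMSE) detector outputs $\mathbf{d}_{\mathrm{MMSE}} = \operatorname{sgn}\Bigl( \bigl( \mathbf{R} + \sigma^2\mathbf{W}^{-1} \bigr)^{-1}\mathbf{S}^*\mathbf{r} \Bigr)$.

Footnote 4: Given as $(A + BCD)^{-1} = A^{-1} - A^{-1}B\bigl( DA^{-1}B + C^{-1} \bigr)^{-1}DA^{-1}$.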

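The MMSE detector ignores the data structure of the interference, and takes into account only its signal structure. In this class of receivers, the MMSE is optimal, and the capacity of the layered channel is maximized. The MMSE filter application to CDMA joint detection was first proposed in [171], and then further developed in [107] and [82].

4.4.6 Error Performance of the MMSE

The error probability of the MMSE receiver can be calculated analogously to the decorrelator. Again, assume $d_k = -1$. An error occurs if $d_{\mathrm{MMSE},k} = 1$ (for BPSK), and the error probability is calculated as

\[ \Pr\bigl( d_{\mathrm{MMSE},k} = 1 \,\big|\, d_k = -1 \bigr) = \Pr\Bigl( \bigl[ \bigl( \mathbf{R} + \sigma^2\mathbf{W}^{-1} \bigr)^{-1}\mathbf{y} \bigr]_k > 0 \Bigr). \tag{4.72} \]

Introducing the abbreviation $\mathbf{T} = \bigl( \mathbf{R} + \sigma^2\mathbf{W}^{-1} \bigr)^{-1}$, we continue as follows:

\[ \begin{aligned} \Pr\bigl( d_{\mathrm{MMSE},k} = 1 \,\big|\, d_k = -1 \bigr) &= \Pr\bigl( (\mathbf{T}(\mathbf{S}^*\mathbf{R}\mathbf{d} + \mathbf{S}^*\mathbf{z}))_k > 0 \bigr) \\ &= \Pr\bigl( (\mathbf{T}\mathbf{S}^*\mathbf{R}\mathbf{d})_k + (\mathbf{T}\mathbf{S}^*\mathbf{z})_k > 0 \bigr). \end{aligned} \tag{4.73} \]

It is straightforward to show that

\[ \mathrm{E}\bigl[ (\mathbf{T}\mathbf{S}^*\mathbf{z})_k \bigr] = 0; \qquad \operatorname{var}\bigl( (\mathbf{T}\mathbf{S}^*\mathbf{z})_k \bigr) = \mathrm{E}\bigl[ \mathbf{T}\mathbf{S}^*\mathbf{z}\mathbf{z}^*\mathbf{S}\mathbf{T} \bigr]_{kk} = \sigma^2 [\mathbf{T}\mathbf{R}\mathbf{T}]_{kk} \tag{4.74} \]

and that the error probability of user $k$ is therefore given by

\[ \Pr\bigl( d_{\mathrm{MMSE},k} = 1 \,\big|\, d_k = -1 \bigr) = 2^{1-K} \sum_{\substack{\mathbf{d} \\ (d_k = -1)}} Q\Biggl( \frac{(\mathbf{T}\mathbf{S}^*\mathbf{R}\mathbf{d})_k}{\sqrt{\sigma^2 [\mathbf{T}\mathbf{R}\mathbf{T}]_{kk}}} \Biggr). \tag{4.75} \]

The error formula assuming $d_k = 1$ is completely analogous. From (4.75) we can draw a few conclusions. First, the error probability depends on the powers of the interfering users, unlike for the decorrelator. The error probability also depends strongly on the spreading sequences $\mathbf{S}$ which are used, but, as $\sigma^2 \to 0$, the error probability $P_e \to 0$.

It is very informative to express the error of the MMSE in terms of the spectral decomposition of the correlation matrix. Evaluating the mean square error of the MMSE receiver we obtain

\[ \begin{aligned} \mathrm{E}\bigl[ \left\| \mathbf{d} - \mathbf{M}\mathbf{r} \right\|^2 \bigr] &= \operatorname{tr}\Bigl( \mathbf{I} - \mathbf{A}^*\mathbf{S}^*\bigl( \mathbf{S}\mathbf{W}\mathbf{S}^* + \sigma^2\mathbf{I} \bigr)^{-1}\mathbf{S}\mathbf{A} \Bigr) \\ &= K - \operatorname{tr}\Bigl( \mathbf{S}\mathbf{W}\mathbf{S}^*\bigl( \mathbf{S}\mathbf{W}\mathbf{S}^* + \sigma^2\mathbf{I} \bigr)^{-1} \Bigr) \\ &= K - \sum_{i=1}^{L} \frac{\lambda_i}{\lambda_i + \sigma^2} = \sum_{i=1}^{K} \frac{\sigma^2}{\lambda_i + \sigma^2} , \end{aligned} \tag{4.76} \]

where the eigenvalues $\lambda_i$ are the $K$ eigenvalues of $\mathbf{S}\mathbf{W}\mathbf{S}^*$. This particular expression will be useful in the next section.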

4.4.7 The MMSE Receiver with Random Spreading Codes

As for the decorrelator, little can be said in general for any given specific set of spreading sequences. However, for the class of random spreading sequences, the results that are obtained are very precise, and in the limit of large systems these results are no longer random, but become deterministic, as the variances of the random quantities involved vanish. It is this property, and the widespread acceptance of random spreading codes as practical alternatives, which make them popular both for theory and application. First we note from (4.76) that the sum of the mean square errors of all $K$ users equals

\[ \sum_{k=1}^{K} \mathrm{MMSE}_k = \sum_{i=1}^{K} \frac{\sigma^2}{\lambda_i + \sigma^2} . \tag{4.77} \]

In a random system $\mathrm{MMSE}_k$ will be equal for all $k$ if $P_k = P_{k'}$, $\forall k, k'$, and

\[ \mathrm{MMSE}_k = \frac{1}{K}\sum_{i=1}^{K} \frac{1}{\frac{P}{\sigma^2}\lambda_i + 1} \;\xrightarrow{K, L \to \infty}\; \int_0^{\infty} \frac{1}{\frac{P}{\sigma^2}\lambda + 1} f_{\lambda}(\lambda)\, d\lambda , \tag{4.78} \]

where $f_{\lambda}(\lambda)$ is the limiting eigenvalue distribution of $\mathbf{S}\mathbf{S}^*/K$. We have come across this distribution in Chapter 3, Theorem 3.5. It was calculated by Bai and Yin [10] as

\[ f_{\lambda}(x) = \bigl[ 1 - \beta^{-1} \bigr]^{+}\delta(x) + \frac{\sqrt{[x - a(\beta)]^{+}\,[b(\beta) - x]^{+}}}{2\pi\beta x} \]

\[ a(\beta) = \bigl( \sqrt{\beta} - 1 \bigr)^2 ; \qquad b(\beta) = \bigl( \sqrt{\beta} + 1 \bigr)^2 ; \qquad [z]^{+} = \max(0, z). \]

Verdú and Shamai [150] have shown that there exists the following closed-form formula for the mean square error of the arbitrary user $k$ in the case of synchronous CDMA:

\[ \mathrm{MMSE}_k = \Bigl[ 1 + \frac{P}{\sigma^2} - \frac{1}{4}\mathcal{F}\Bigl( \frac{P}{\sigma^2}, \beta \Bigr) \Bigr]^{-1} , \tag{4.79} \]

where the function $\mathcal{F}(x, z)$ is defined as

\[ \mathcal{F}(x, z) = \Bigl( \sqrt{x\bigl( 1 + \sqrt{z} \bigr)^2 + 1} - \sqrt{x\bigl( 1 - \sqrt{z} \bigr)^2 + 1} \Bigr)^2 . \tag{4.80} \]

The signal-to-noise ratio and the mean square error of the MMSE receiver are related by

\[ \mathrm{MMSE}_k = \frac{1}{1 + \mathrm{SNR}_k} \tag{4.81} \]

and, conversely,

\[ \mathrm{SNR}_k = \frac{1}{\mathrm{MMSE}_k} - 1 . \tag{4.82} \]

Combining (4.82) and (4.79) we arrive at the following theorem.

Theorem 4.2 (MMSE Performance in Random CDMA). If $K \to \infty$ and $L \to \infty$ such that the load $\beta = K/L$ is constant, then, for synchronous CDMA,

\[ \mathrm{SNR}_k \;\xrightarrow{K, L \to \infty}\; \frac{P}{\sigma^2} - \frac{1}{4}\mathcal{F}\Bigl( \frac{P}{\sigma^2}, \beta \Bigr). \tag{4.83} \]
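The large-system expressions can be compared numerically with a brief sketch. It evaluates (4.83) with the function of (4.80), and sets it against the matched filter value $P/(\beta P + \sigma^2)$ implied by (4.45) and the large-system decorrelator value $(P/\sigma^2)(1 - \beta)$ implied by (4.61). Equal received powers are assumed throughout, `snr0` stands for $P/\sigma^2$, and the function names are hypothetical.

```python
import math


def f_function(x, z):
    """The function F(x, z) of (4.80)."""
    a = math.sqrt(x * (1.0 + math.sqrt(z)) ** 2 + 1.0)
    b = math.sqrt(x * (1.0 - math.sqrt(z)) ** 2 + 1.0)
    return (a - b) ** 2


def mmse_snr(snr0, beta):
    """Large-system output SNR of the MMSE filter, equation (4.83)."""
    return snr0 - 0.25 * f_function(snr0, beta)


def matched_filter_snr(snr0, beta):
    """Large-system output SNR of the matched filter, from (4.45)."""
    return snr0 / (beta * snr0 + 1.0)


def decorrelator_snr(snr0, beta):
    """Large-system output SNR of the decorrelator, from (4.61); needs beta < 1."""
    return snr0 * (1.0 - beta)
```

For instance, with $P/\sigma^2 = 10$ and $\beta = 0.5$ the three filters give roughly 5.74 (MMSE), 5.0 (decorrelator) and 1.67 (matched filter), which illustrates that the MMSE filter produces the best of the three linear layered channels.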

Since the residual noise of the MMSE receiver is Gaussian, the capacity of the corresponding single-user channel is given as a straightforward application of the capacity formula (4.41), where the Gaussian noise SNR is replaced by (4.83), and, using $P = R E_b$ in order to normalize the equations, we obtain

\[ C_{\mathrm{MMSE}} = \frac{\beta}{2}\log\Bigl( 1 + 2\frac{R E_b}{N_0} - \frac{1}{4}\mathcal{F}\Bigl( 2\frac{R E_b}{N_0}, \beta \Bigr) \Bigr). \tag{4.84} \]

Setting $\beta R = C_{\mathrm{MMSE}}$ leads to an implicit equation for the maximum spectral efficiency of the MMSE receiver, which is plotted in Figure 4.14 below for a few load values.

Observation of these capacity curves reveals some interesting properties of the respective filters. The MMSE receiver is most efficient in the low signal-to-noise ratio regime, where thermal noise dominates the channel impairments. Here, however, the simple matched filter performs almost as well. As the signal-to-noise ratio improves, the differences between decorrelator and MMSE receivers diminish. Like the decorrelator, albeit not as severely, the efficiency of the MMSE receiver diminishes as the system load is increased in the high signal-to-noise ratio regime.

Like the decorrelator, the MMSE receiver requires the inversion of a $K \times K$ correlation matrix at each symbol period. This represents an enormous complexity burden and is not desirable for actual system implementations. We will present lower complexity iterative approximations for both of these preprocessors in the next chapter.

4.4.8 Whitening Filters

The whitening filter approach represents an intermediate stage between filtering and cancellation receivers. It is based on a matrix decomposition method, namely the Cholesky decomposition [55], which leads to this detector structure. The symmetric matrix $\mathbf{R}$ allows the decomposition $\mathbf{R} = \mathbf{F}^*\mathbf{F}$, where $\mathbf{F}$ is lower-triangular, as already encountered in Section 4.2.3.
The Cholesky decomposition can be accomplished in $O(n^3/6)$ operations, where $n$ is the size of the matrix [55], and therefore has a complexity which is comparable to that of inverting the matrix. Furthermore, if $\mathbf{R}$ is invertible, so are $\mathbf{F}$ and $\mathbf{F}^*$. We now pre-multiply the correlator output (4.48) by $\mathbf{F}^{*-1}$ and obtain

[Figure 4.14: capacity in bits/dimension (logarithmic scale, 0.1 to 10) versus $E_b/N_0$ in dB ($-2$ to 16). Curves: AWGN capacity; MMSE capacities for $\beta = 0.5$, $1$ and $2$; decorrelator for $\beta = 0.5$.]

Fig. 4.14. Shannon bounds for the AWGN channel and the MMSE layered single-user channel for random CDMA. Compare with Figures 4.11 and 4.13.

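\[ \mathbf{y}_{\mathrm{WF}} = \mathbf{F}^{*-1}\mathbf{y} = \mathbf{F}\mathbf{A}\mathbf{d} + \mathbf{z}_{\mathrm{WF}} . \tag{4.85} \]

The correlation of the filtered noise vector $\mathbf{z}_{\mathrm{WF}}$ is easily calculated as

\[ \mathrm{E}\bigl[ \mathbf{z}_{\mathrm{WF}}\mathbf{z}_{\mathrm{WF}}^* \bigr] = \mathbf{F}^{*-1}\mathbf{S}^*\,\mathrm{E}[\mathbf{z}\mathbf{z}^*]\,\mathbf{S}\mathbf{F}^{-1} = \sigma^2\mathbf{I}, \tag{4.86} \]

that is, the noise samples $\mathbf{z}_{\mathrm{WF}}$ are white (uncorrelated) with common variance $\sigma^2$ per dimension. This property has also given the filter $\mathbf{F}^{*-1}$ the name noise whitening filter. In a sense $\mathbf{F}^{*-1}$ accomplishes the complement of the decorrelator, which whitens the data. The benefit of $\mathbf{y}_{\mathrm{WF}}$ is that, since $\mathbf{F}$ is lower triangular,

\[ \mathbf{y}_{\mathrm{WF}} = \begin{bmatrix} y_{\mathrm{WF},1} \\ y_{\mathrm{WF},2} \\ \vdots \\ y_{\mathrm{WF},Kn} \end{bmatrix} = \begin{bmatrix} F_{11}\sqrt{P_1}\,d_1 \\ F_{21}\sqrt{P_1}\,d_1 + F_{22}\sqrt{P_2}\,d_2 \\ \vdots \\ \sum_{i=1}^{Kn} F_{Kn,i}\sqrt{P_i}\,d_i \end{bmatrix} + \mathbf{z}_{\mathrm{WF}} \tag{4.87} \]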

130

4 Multiuser Detection

and a successive interference cancellation detector suggests itself which starts with dˆ1 = sgn(yWF,1 ) and proceeds to decode successively ⎛ ⎞ i−1

dˆi = sgn ⎝yWF,i − Fij Pj dˆj ⎠ , (4.88) j=1

as shown in Figure 4.15. Since F is lower triangular with width K (see Section 4.4.9) even in the asynchronous case ⎛ ⎞ K

Fi(i−j) Pi−j dˆi−j ⎠ , (4.89) dˆi = sgn ⎝yWF,i − j=1

and we never need to cancel more than $K-1$ symbols. However, in contrast to the decorrelator, and as for the MMSE receiver, this decoder requires the amplitudes $\mathbf{A}$ of all the users in order to function properly, a complication which must be handled by the estimation part of the receiver.

Since $\mathbf{F}$ is lower triangular, only previous decisions are required at each level, forming the input to the $i$-th decision device. If all previous $i-1$ decisions are correct, i.e. $\hat{d}_j = d_j$, $j = 1, \ldots, i-1$, the input to the $i$-th decision device is given by

$$y_{\mathrm{WF},i} - \sum_{j=1}^{i-1} F_{ij}\sqrt{P_j}\,d_j = F_{ii}\sqrt{P_i}\,d_i + z_{\mathrm{WF},i}, \tag{4.90}$$

and $y_{\mathrm{WF},i}$ is free of interference from the other users' symbols. The discrete signal-to-noise ratio at the input of the decision device can now be calculated as

$$\mathrm{SNR}_i = \frac{F_{ii}^2 P_i}{\sigma^2}. \tag{4.91}$$

Since we are starting with $i = 1$, it is important to rearrange the users such that they are ordered with decreasing $\mathrm{SNR}_i$, or equivalently with decreasing $F_{ii}^2 P_i$. This is to minimize error propagation, since, if one of the decisions $\hat{d}_i$ is erroneous, equation (4.91) no longer applies and we have an instance of error propagation typical for decision-feedback systems.

It is interesting to note that the filter $\mathbf{F}^{*-1}$ maximizes $\mathrm{SNR}_i$ for each symbol $d[i]$, given correct previous decisions, as originally shown by [30, 31], and restated in

Theorem 4.3 (Local Optimality of the Whitening Filter). Among all causal decision feedback detectors, $\mathbf{F}^{*-1}$ maximizes $\mathrm{SNR}_i$ for all $i$, given correct previous decisions $\hat{d}_j = d_j$, $j = 1, \ldots, i-1$.


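[Figure 4.15: block diagram. The partial decorrelator $\mathbf{F}^{*-1}$ feeds a chain of sgn( ) decision devices producing $\hat{d}_1, \hat{d}_2, \ldots, \hat{d}_K$; earlier decisions are weighted by $F_{21}$, ..., $F_{K1}$, $F_{K2}$ and subtracted before the later decisions.]

Fig. 4.15. The partial decorrelating feedback detector uses a whitened matrix filter as a linear processor, followed by successive cancellation.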
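The detector of Figure 4.15 can be sketched numerically as follows. This is only an illustration under stated assumptions: a synchronous single-symbol model, real Gaussian spreading sequences normalized to unit energy, BPSK symbols, and users simply listed in order of decreasing power rather than sorted by $F_{ii}^2 P_i$. The function names `whitening_factor` and `sic_detect` are hypothetical. Since the text uses the convention $\mathbf{R} = \mathbf{F}^*\mathbf{F}$ with $\mathbf{F}$ lower triangular, while `numpy.linalg.cholesky` returns a lower-triangular $\mathbf{L}$ with $\mathbf{R} = \mathbf{L}\mathbf{L}^*$, the sketch factors the index-reversed matrix and reverses the result back.

```python
import numpy as np

def whitening_factor(R):
    """Return lower-triangular F with R = F^H F (the convention of the text)."""
    # Cholesky of the index-reversed matrix gives J R J = L L^H (J = reversal).
    # Reversing back yields the upper-triangular factor F^H = J L J.
    L = np.linalg.cholesky(R[::-1, ::-1])
    return L[::-1, ::-1].conj().T

def sic_detect(y, F, P):
    """Successive cancellation as in (4.88), applied to the whitened output."""
    y_wf = np.linalg.solve(F.conj().T, y)      # y_WF = F^{-*} y, eq. (4.85)
    amp = np.sqrt(P)
    d_hat = np.zeros(len(y))
    for i in range(len(y)):
        # subtract the contribution of the symbols already decided
        interference = F[i, :i] @ (amp[:i] * d_hat[:i])
        d_hat[i] = 1.0 if (y_wf[i] - interference).real >= 0 else -1.0
    return d_hat

# Assumed toy scenario (not from the text): N chips, K users, noise std sigma.
rng = np.random.default_rng(0)
N, K, sigma = 16, 4, 0.1
S = rng.standard_normal((N, K))
S /= np.linalg.norm(S, axis=0)                 # unit-energy spreading sequences
P = np.array([4.0, 2.0, 1.0, 0.5])             # powers, listed in decreasing order
d = rng.choice([-1.0, 1.0], size=K)            # BPSK symbols

r = S @ (np.sqrt(P) * d) + sigma * rng.standard_normal(N)
y = S.T @ r                                    # correlator output
R = S.T @ S                                    # correlation matrix
F = whitening_factor(R)
d_hat = sic_detect(y, F, P)
print("sent:    ", d)
print("detected:", d_hat)
```

In the noise-free case the loop reproduces (4.90) exactly: each stage sees $F_{ii}\sqrt{P_i}\,d_i$ with $F_{ii} > 0$, so every decision is correct.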

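Proof. Assume $\mathbf{M}$ is any filter such that $\mathbf{M}\mathbf{y} = \mathbf{T}'\mathbf{A}\mathbf{d} + \mathbf{z}'$, where $\mathbf{T}'$ is lower triangular. We may write $\mathbf{T}' = \mathbf{T}\mathbf{F}$, where $\mathbf{T}$ must necessarily also be lower triangular. But $E[\mathbf{z}'\mathbf{z}'^*] = \sigma^2\mathbf{T}\mathbf{T}^*$, and

$$\mathrm{SNR}_i = \frac{t_{ii}^2 F_{ii}^2 P_i}{\sigma^2 \sum_{j=1}^{i} t_{ij}^2},$$

which is maximized for all $i$ by setting $\mathbf{T} = \mathbf{I}$.

Using optimal error control codes at each stage before cancellation, user #1 could operate at a maximum error-free rate per dimension of $R_1 = \frac{1}{2}\log\left(1 + \frac{F_{11}^2 P_1}{\sigma^2}\right)$, and after cancellation the $k$-th decoder could operate at $R_k = \frac{1}{2}\log\left(1 + \frac{F_{kk}^2 P_k}{\sigma^2}\right)$, yielding the maximum sum rate

$$R_{\mathrm{sum}} = \sum_{k=1}^{K} R_k = \sum_{k=1}^{K} \frac{1}{2}\log\left(1 + \frac{F_{kk}^2 P_k}{\sigma^2}\right). \tag{4.92}$$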
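As a small worked instance of (4.91) and (4.92), consider two users with correlation $\rho$ between their spreading sequences. The sketch below is illustrative only: the values of $\rho$, the powers, and $\sigma^2$ are made up, the logarithm is taken to base 2 (an assumption, so that rates come out in bits per dimension), and `stage_rates` is a hypothetical helper name. The factor $\mathbf{F}$ is written out in closed form and satisfies $\mathbf{F}^*\mathbf{F} = \mathbf{R}$.

```python
import numpy as np

def stage_rates(F, P, sigma2):
    """Per-stage rates R_k = 0.5*log2(1 + F_kk^2 P_k / sigma^2), cf. (4.92)."""
    snr = np.abs(np.diag(F)) ** 2 * P / sigma2   # SNR_k from (4.91)
    return 0.5 * np.log2(1.0 + snr)

rho = 0.5                                        # assumed cross-correlation
R = np.array([[1.0, rho],
              [rho, 1.0]])
# Lower-triangular factor with F^T F = R (closed form for the 2x2 case).
F = np.array([[np.sqrt(1.0 - rho**2), 0.0],
              [rho,                   1.0]])

P = np.array([1.0, 1.0])                         # assumed equal powers
sigma2 = 0.1                                     # assumed noise variance
rates = stage_rates(F, P, sigma2)
print("stage rates:", rates, "sum rate:", rates.sum())
```

The first user to be detected sees the reduced signal-to-noise ratio $(1-\rho^2)P_1/\sigma^2$, whereas the last user, after cancellation, sees the full $P_2/\sigma^2$.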

Since $\mathbf{F}$ is triangular, $\det(\mathbf{F}) = \prod_{k=1}^{K} F_{kk}$, and hence $\det(\mathbf{F}^*\mathbf{W}\mathbf{F}) = \prod_{k=1}^{K} F_{kk}^2 P_k$. Since, furthermore, $\mathbf{F}^*\mathbf{W}\mathbf{F} = \mathbf{S}^*\mathbf{W}\mathbf{S}$, we obtain from (4.92)

$$\frac{1}{2}\log\det\left(\mathbf{I} + \frac{\mathbf{S}^*\mathbf{W}\mathbf{S}}{\sigma^2}\right) = C_{\mathrm{sum}}, \tag{4.93}$$

which is the information theoretic capacity of the CDMA channel as derived in Chapter 3. Whitened filtering with cancellation therefore theoretically suffers no performance loss. It is a capacity-achieving serial cancellation method. Note, furthermore, that power ordering is no longer required if each level is using capacity-achieving coding. We will later show in Chapter 6 that a bank


of MMSE filters with a successive cancellation structure like that in Figure 4.15 can also achieve the capacity of the CDMA channel.

Despite this, there are a number of difficulties associated with partial decorrelation. Not only is exact knowledge of the power levels $P_k$ required, but the transmission rates have to be tailored according to (4.92), and that requires accurate backward communication with the transmitters, i.e. feedback. The decoding delay may also become a problem, since each stage has to wait until the previous stage has completed decoding of a particular symbol before cancellation can occur.

4.4.9 Whitening Filter for the Asynchronous Channel

For asynchronous CDMA and continuous transmission the frame length $n$ may become quite large. As a consequence, the filter decomposition $\mathbf{R} = \mathbf{F}^*\mathbf{F}$ via a straightforward Cholesky decomposition becomes too complex. Since the different symbol periods do not decouple as in the synchronous case, we need some way of calculating this decomposition efficiently, exploiting the fact that most entries in the asynchronous correlation matrix $\mathbf{R}$ are zero. We will make use of the fact that the matrix $\mathbf{R}$ has the special form of a band-diagonal matrix of width $2K-1$, i.e.

$$\mathbf{R} = \begin{bmatrix} \mathbf{R}_0[0] & \mathbf{R}_1^*[1] & & & \\ \mathbf{R}_1[1] & \mathbf{R}_1[0] & \mathbf{R}_2^*[1] & & \\ & \mathbf{R}_2[1] & \mathbf{R}_2[0] & \mathbf{R}_3^*[1] & \\ & & \mathbf{R}_3[1] & \mathbf{R}_3[0] & \ddots \\ & & & \ddots & \ddots \end{bmatrix}. \tag{4.94}$$
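A quick numerical check may help to see why the band structure is worth exploiting. The sketch below is an illustration under assumptions, not the efficient block algorithm developed in this section: it builds a synthetic positive-definite matrix with $2K-1$ nonzero diagonals (a stand-in for an asynchronous correlation matrix, not derived from actual spreading sequences), computes a lower-triangular $\mathbf{F}$ with $\mathbf{F}^*\mathbf{F} = \mathbf{R}$ by a full-size Cholesky factorization, and confirms that $\mathbf{F}$ has only $K$ nonzero diagonals. The helper name `whitening_factor` is hypothetical.

```python
import numpy as np

def whitening_factor(R):
    """Return lower-triangular F with R = F^H F."""
    # Factor the index-reversed matrix, J R J = L L^H, then reverse back.
    L = np.linalg.cholesky(R[::-1, ::-1])
    return L[::-1, ::-1].conj().T

rng = np.random.default_rng(1)
n, K = 12, 3                                    # assumed matrix size and band parameter

# B keeps only the main diagonal and K-1 subdiagonals, so B^T B + I is
# positive definite with 2K-1 nonzero diagonals (synthetic stand-in for R).
B = np.tril(np.triu(rng.standard_normal((n, n)), -(K - 1)))
R = B.T @ B + np.eye(n)

F = whitening_factor(R)

i, j = np.indices(F.shape)
outside_band = (j > i) | (i - j >= K)           # everything except K lower diagonals
print("largest |F| entry outside the band:", np.abs(F[outside_band]).max())
```

Because the factor inherits the band of $\mathbf{R}$, only a narrow strip of entries ever needs to be computed or stored, which is what the block recursion for the asynchronous channel takes advantage of.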

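It can be shown then ([55], or by considering $\mathbf{F}^*\mathbf{F} = \mathbf{R}$) that $\mathbf{F}$ is lower triangular with width $K$, or in block form

$$\mathbf{F} = \begin{bmatrix} \mathbf{F}_0[0] & & & & \\ \mathbf{F}_0[1] & \mathbf{F}_1[0] & & & \\ & \mathbf{F}_1[1] & \mathbf{F}_2[0] & & \\ & & \mathbf{F}_2[1] & \mathbf{F}_3[0] & \\ & & & \ddots & \ddots \end{bmatrix}, \tag{4.95}$$

where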

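(i) $\mathbf{F}_i[0]$ is lower triangular, and
(ii) $\mathbf{F}_i[1]$ is strictly upper triangular.

Expanding $\mathbf{R} = \mathbf{F}^*\mathbf{F}$ we obtain the following set of equations for the individual blocks of $\mathbf{F}$:

$$\mathbf{R}_{n-1}[0] = \mathbf{F}_{n-1}^*[0]\,\mathbf{F}_{n-1}[0]$$

0k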

i=1 i